Olly Betts | 6 Jan 04:47

Re: NearPostList and get_wdf

On Mon, Dec 29, 2008 at 02:09:14PM +0100, Yann ROBIN wrote:
> On Mon, Dec 29, 2008 at 1:50 PM, Richard Boulton
> <richard <at> lemurconsulting.com> wrote:
> > I'm not sure that modifying the wdf is really the way to go about this - it
> > seems to me that you might do better to use a custom weight class, which
> > factored in the frequencies of the individual terms, as well as their
> > proximity.

You have to choose a weight class for the whole query - it can't be
different for different subqueries.  So I'm not sure how this would
work.

A sane approach would probably be in NewNearPostList::get_weight() to
multiply the weight returned by the AND query's get_weight() method by a
non-negative factor which varies depending how close the terms are -
largest when they're together, much smaller when they are far apart.

This will be slower to run than the current NearPostList though as it
can't stop working on a document when it finds a match within the window
size - instead it has to check all the positional data for each document
matching the AND query to find the closest match.
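
To make that extra cost concrete, here is a minimal Python sketch (not Xapian code; `smallest_span` and its input layout are hypothetical). It assumes each term of the AND query supplies a non-empty, sorted list of positions for the document, and it walks all of them to find the narrowest window containing every term:

```python
import heapq


def smallest_span(position_lists):
    """Return the width of the narrowest window holding one position per term.

    position_lists: one non-empty, sorted list of positions per query term.
    """
    # Heap entries are (position, term index, index into that term's list).
    heap = [(plist[0], t, 0) for t, plist in enumerate(position_lists)]
    heapq.heapify(heap)
    current_max = max(entry[0] for entry in heap)
    best = current_max - heap[0][0]
    while True:
        pos, t, i = heapq.heappop(heap)
        best = min(best, current_max - pos)
        if i + 1 == len(position_lists[t]):
            # This term has no later position, so no narrower window exists.
            return best
        nxt = position_lists[t][i + 1]
        current_max = max(current_max, nxt)
        heapq.heappush(heap, (nxt, t, i + 1))
```

Unlike a window-size check, this cannot return as soon as one acceptable match is seen; it only stops once some term's position list is exhausted.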

This factor needs to have a known upper bound, which you multiply
get_maxweight() and recalc_maxweight() from the AND query by.
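
A rough Python sketch of that idea follows. The decay formula, the names `proximity_factor`, `near_weight` and `near_maxweight`, and the choice of 1.0 as the bound are all assumptions made for illustration; none of this is Xapian's API:

```python
MAX_FACTOR = 1.0  # known upper bound on the proximity factor (assumed value)


def proximity_factor(span, n_terms):
    """Map the narrowest span containing all terms to a factor in (0, MAX_FACTOR]."""
    # n_terms adjacent terms give span == n_terms - 1, i.e. zero slack.
    slack = max(0, span - (n_terms - 1))
    return MAX_FACTOR / (1.0 + slack)


def near_weight(and_weight, span, n_terms):
    """Weight for one document: the AND query's weight scaled by proximity."""
    return and_weight * proximity_factor(span, n_terms)


def near_maxweight(and_maxweight):
    """Upper bound on near_weight(): the AND query's bound times MAX_FACTOR."""
    return and_maxweight * MAX_FACTOR
```

Because the factor never exceeds `MAX_FACTOR`, the scaled maximum weight remains a valid bound on any per-document weight, which is the property the matcher relies on.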

> > Feel free to open a feature request ticket, describing the feature that you
> > would like to exist.  OP_NEAR as it is currently implemented is behaving as
> > intended, though.
> 
> The ticket was more for the get_wdf not being called, I don't think this was

David Sainty | 15 Jan 02:28

Xapian core build failure under gcc 2.95

Hi,

Under gcc 2.95 Xapian fails to build like so:

 g++ -DHAVE_CONFIG_H -I. -I./common -I./include -I/home/dsainty/not-backed-up/pkgsrc/textproc/xapian/work/.buildlink/include -Wall -W -Wredundant-decls -Wpointer-arith -Wcast-qual -Wcast-align -Wno-long-long -Wformat-security -fno-gnu-keywords -Wundef -O2 -c queryparser/queryparser_internal.cc -Wp,-MD,queryparser/.deps/queryparser_internal.TPlo  -fPIC -DPIC -o queryparser/.libs/queryparser_internal.o
/data/home/olly/tmp/xapian-svn-snapshot/tags/1.0.10/xapian/xapian-core/queryparser/queryparser.lemony:25: queryparser_internal.h: No such file or directory
/data/home/olly/tmp/xapian-svn-snapshot/tags/1.0.10/xapian/xapian-core/queryparser/queryparser.lemony:31: queryparser_token.h: No such file or directory
*** Error code 1

The problem seems to be that the build system is relying on the compiler 
to imply -Iqueryparser when the source file is 
queryparser/queryparser_internal.cc.  Modern gcc makes this implication, 
gcc 2.95 doesn't.

queryparser/Makefile.mk has an almost-solution there already, but it's 
conditionally disabled.

  if VPATH_BUILD
  # We need this so that generated sources can find non-generated headers in a

David Sainty | 15 Jan 04:53

Re: Xapian core build failure under gcc 2.95

David Sainty wrote:
> Hi,
>
> Under gcc 2.95 Xapian fails to build like so:

I can confirm that the attached patch fixes the build under gcc 2.95 
(after an automake).

Cheers,

Dave
--- queryparser/Makefile.mk.orig	Tue Dec 23 20:55:31 2008
+++ queryparser/Makefile.mk	Thu Jan 15 16:30:10 2009
@@ -2,14 +2,9 @@
 # We need this so that generated sources can find non-generated headers in a
 # VPATH build from SVN.
 INCLUDES += -I$(top_srcdir)/queryparser
+endif

-if MAINTAINER_MODE
-# We need this because otherwise, if depcomp is being used (as it will be for a
-# build with gcc-2.95), depcomp will be unable to find queryparser_token.h.
-# This may be a bug in depcomp, but it certainly happens with automake-1.10.
 INCLUDES += -I$(top_builddir)/queryparser
-endif
-endif

 noinst_HEADERS +=\

Olly Betts | 15 Jan 12:39

Re: Xapian core build failure under gcc 2.95

On Thu, Jan 15, 2009 at 04:53:52PM +1300, David Sainty wrote:
> David Sainty wrote:
> >Under gcc 2.95 Xapian fails to build like so:
> 
> I can confirm that the attached patch fixes the build under gcc 2.95 
> (after an automake).

Thanks for the patch.

But it seems there's something odd going on, as other subdirectories
also include headers from the same directory without an explicit -I.
The files here are generated, but that shouldn't make a difference as
they are shipped in the tarball and it appears you're building from the
1.0.10 source tarball.

Perhaps the issue is the "#line" directives with full paths in
queryparser_internal.cc - if GCC 2.95 resolves header includes relative
to the filename given by "#line" then that would cause this problem.

Could you try:

perl -pi -e 's/^#line.*//' queryparser/queryparser_internal.cc

And then building without your patch.
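
For anyone without perl to hand, the same experiment can be sketched in Python. The function name `blank_line_directives` is hypothetical; the behaviour mirrors the one-liner above, which empties each "#line" line but keeps its newline, so the file's line count is unchanged:

```python
import re

# Matches a whole "#line ..." directive, but not the newline that ends it.
LINE_DIRECTIVE = re.compile(r"^#line.*", re.MULTILINE)


def blank_line_directives(source):
    """Replace every '#line ...' directive in source with an empty line."""
    return LINE_DIRECTIVE.sub("", source)
```

Applied to the contents of queryparser/queryparser_internal.cc and written back, this leaves the compiler with no file name other than the one given on the command line.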

(Unfortunately I no longer have access to GCC 2.95 to test this myself).

Another question - what's the reason for using GCC 2.95?

We came quite close to dropping support for GCC < 3 a while back (but

David Sainty | 16 Jan 01:02

Re: Xapian core build failure under gcc 2.95

Olly Betts wrote:
> On Thu, Jan 15, 2009 at 04:53:52PM +1300, David Sainty wrote:
>   
>> David Sainty wrote:
>>     
>>> Under gcc 2.95 Xapian fails to build like so:
>>>       
>> I can confirm that the attached patch fixes the build under gcc 2.95 
>> (after an automake).
>>     
>
> Thanks for the patch.
>
> But it seems there's something odd going on, as other subdirectories
> also include headers from the same directory without an explicit -I.
> The files here are generated, but that shouldn't make a difference as
> they are shipped in the tarball and it appears you're building from the
> 1.0.10 source tarball.
>
> Perhaps the issue is the "#line" directives with full paths in
> queryparser_internal.cc - if GCC 2.95 resolves header includes relative
> to the filename given by "#line" then that would cause this problem.
>
> Could you try:
>
> perl -pi -e 's/^#line.*//' queryparser/queryparser_internal.cc
>   

Huh, mighty good guessing! Yeah, that also fixes the build (without the 
patch).

towel moist | 17 Jan 00:34

chert vs flint vs lucene

Hi,

What's the main difference between chert and flint?  What about vs Lucene?

I am mainly asking about data structure (lexicon, posting list, document data), what's in memory, what's on disk, hash vs b-tree and reasons behind them.

Any pointer is appreciated.

Thanks!
Crystal

_______________________________________________
Xapian-devel mailing list
Xapian-devel <at> lists.xapian.org
http://lists.xapian.org/mailman/listinfo/xapian-devel
Olly Betts | 19 Jan 09:17

Re: Xapian core build failure under gcc 2.95

On Fri, Jan 16, 2009 at 01:02:22PM +1300, David Sainty wrote:
> I don't think gcc's behaviour here is universally true of all compilers 
> (implying the source directory as an include path entry), but in saying 
> that I'm not sure of a counterexample either.

I've successfully compiled Xapian with quite a few different compilers
without encountering problems with this (at least GCC, Intel, Sun, HP,
SGI).

My understanding is that #include with "" implicitly adds the source
directory to the search path (whereas #include with <> doesn't).

I'm reluctant to start coding around behaviour which compilers *might*
have, as that's a very open-ended list.

But if anyone has actual evidence of a compiler which doesn't behave this
way, we probably need to explicitly add -I options for several other
subdirectories which rely on this behaviour.

> Obviously 2.95(.4) has the required behaviour in some form, but is
> confused by the #line lines.

My guess is that this is because it uses a separate preprocessor and
relies on "#line" in the preprocessor output to tell the compiler the
filename of the source file.  GCC 2.95 is adding the source directory
to the search path but is confused as to what the source directory
is.
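
That guess can be modelled with a small Python sketch. It is only an illustration of the lookup order for quoted includes (directory of the current file first, then the -I directories), not GCC's actual implementation; `resolve_quoted_include` and the `existing_files` set standing in for the file system are hypothetical:

```python
import posixpath


def resolve_quoted_include(header, presumed_file, include_dirs, existing_files):
    """Return the path a '#include "header"' would resolve to, or None.

    presumed_file:  the file the compiler believes it is compiling, which a
                    "#line" directive can change.
    include_dirs:   directories given with -I, searched in order.
    existing_files: set of paths that exist (stands in for the file system).
    """
    # The directory of the (presumed) current file is searched first.
    search_path = [posixpath.dirname(presumed_file) or "."] + list(include_dirs)
    for directory in search_path:
        candidate = posixpath.normpath(posixpath.join(directory, header))
        if candidate in existing_files:
            return candidate
    return None


files = {"queryparser/queryparser_internal.cc", "queryparser/queryparser_internal.h"}
dirs = [".", "common", "include"]
lemony = ("/data/home/olly/tmp/xapian-svn-snapshot/tags/1.0.10/xapian/"
          "xapian-core/queryparser/queryparser.lemony")

# Presumed file is the real source file: the header is found next to it.
found = resolve_quoted_include(
    "queryparser_internal.h", "queryparser/queryparser_internal.cc", dirs, files)

# Presumed file comes from "#line" and names a directory absent on this machine.
missing = resolve_quoted_include("queryparser_internal.h", lemony, dirs, files)

# An explicit -Iqueryparser makes the lookup succeed again.
patched = resolve_quoted_include(
    "queryparser_internal.h", lemony, dirs + ["queryparser"], files)
```

In this model the second lookup fails just as in the reported error, and adding the explicit include directory hides the problem without fixing the misleading file name.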

This is the patch I've actually applied to trunk, which fixes up "#line"
directives rather than nuking them (depcomp parses preprocessor output
for "#line" so we no longer need that workaround):

http://trac.xapian.org/changeset/11823/trunk/xapian-core/queryparser/Makefile.mk?format=diff&new=11823
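
The changeset itself is not reproduced in this message, so the following Python sketch only illustrates the general idea of fixing up a "#line" directive instead of deleting it. The target form (a path relative to the top of the source tree) and the name `relativise_line_directives` are assumptions:

```python
import re

# Matches: #line <number> "<any leading directories>/queryparser/<file>"
LINE_WITH_PATH = re.compile(
    r'^(#line \d+ ")(?:[^"]*/)?(queryparser/[^"/]+")', re.MULTILINE)


def relativise_line_directives(source):
    """Drop leading directories from '#line' paths under queryparser/."""
    return LINE_WITH_PATH.sub(r"\1\2", source)
```

Keeping the directives means diagnostics still point at queryparser.lemony, while the path no longer depends on the machine that generated the file.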

I'd be grateful if you could try this (it should apply to 1.0.10 cleanly).
If it works I'll backport it for 1.0.11.

> Yeah, it's in use on some old systems that are long long overdue for 
> updates. A separate project is working on that, but it's not a variable 
> I can control. Essentially it's the usual reasons - the more important 
> the server the harder it is to regularly maintain it :)

OK.  I guess that's a minor argument for keeping GCC 2.95 support,
though it's problematic that we aren't regularly testing it.  At some
point we're going to have to just start telling people to upgrade.

Cheers,
    Olly
Olly Betts | 19 Jan 12:33

Re: chert vs flint vs lucene

On Fri, Jan 16, 2009 at 03:34:11PM -0800, towel moist wrote:
> What's the main difference between chert and flint?  What about vs Lucene?

Flint is documented here:

http://trac.xapian.org/wiki/FlintBackend

The user-visible changes in Chert are covered here (to find them,
search for "chert backend:"):

http://trac.xapian.org/browser/trunk/xapian-core/NEWS

That should be complete at present, but there are likely to be further
changes before chert is declared "finished".

I don't know of any comparisons with Lucene's low level details.

Cheers,
    Olly
David Sainty | 21 Jan 03:59

Re: Xapian core build failure under gcc 2.95

Hi Olly,

> This is the patch I've actually applied to trunk, which fixes up "#line"
> directives rather than nuking them (depcomp parses preprocessor output
> for "#line" so we no longer need that workaround):
>
> http://trac.xapian.org/changeset/11823/trunk/xapian-core/queryparser/Makefile.mk?format=diff&new=11823
>
> I'd be grateful if you could try this (it should apply to 1.0.10 cleanly).
> If it works I'll backport it for 1.0.11.
>   

It took me a while to get the maintainer mode tools together to get the 
patch to have an effect. I've tried a fresh build in maintainer mode 
with and without this patch. The bad news is that it builds with both :) 
The reason is that in maintainer mode the #line entries are valid anyway 
(oddly I don't get the fully qualified paths [without the patch] that 
you get when building the tarball - something different about how you 
kick off the build).

I think your patch is a good fix (and gets rid of your machine's build 
paths in the distribution :). But my test isn't perfect - though you do 
at least know that the build works after the with-patch manipulation. If 
you want to verify for sure I guess you need to build 
queryparser_internal.cc as you would for a distribution tar, and then I 
can do a non-maintainer build in this environment with that file and see 
if it completes.

Cheers,

Dave
Olly Betts | 21 Jan 05:22

Re: Xapian core build failure under gcc 2.95

On Wed, Jan 21, 2009 at 03:59:08PM +1300, David Sainty wrote:
> It took me a while to get the maintainer mode tools together to get the 
> patch to have an effect. I've tried a fresh build in maintainer mode 
> with and without this patch. The bad news is that it builds with both :) 
> The reason is that in maintainer mode the #line entries are valid anyway 
> (oddly I don't get the fully qualified paths [without the patch] that 
> you get when building the tarball - something different about how you 
> kick off the build).

Ah, sorry about this.  I should have thought things through more.

The script which builds releases and snapshots does them with builddir
!= srcdir (sometimes called a VPATH build) - that's the difference
you're seeing.  With builddir = srcdir you used to get stuff like
this (with an extra harmless './'):

#line 1234 "./queryparser/queryparser.lemony"

Thanks for your efforts though - much appreciated, and they do give me
extra confidence in the fix.

> I think your patch is a good fix (and gets rid of your machine's build 
> paths in the distribution :).

Yes, that's certainly an improvement (though there are other places
where these leak in still).

> But my test isn't perfect - though you do 
> at least know that the build works after the with-patch manipulation. If 
> you want to verify for sure I guess you need to build 
> queryparser_internal.cc as you would for a distribution tar, and then I 
> can do a non-maintainer build in this environment with that file and see 
> if it completes.

The simplest test would be to just use the version from an SVN trunk
snapshot - I've extracted one here:

http://oligarchy.co.uk/xapian/patches/queryparser_internal.cc

Cheers,
    Olly
