Daniel Ménard | 8 Nov 18:26

QueryParser : some remarks

Hi to all,

First, I would like to say a big thank you for the work which was done 
on my 'wish bug' to allow mapping one field to multiple prefixes 
(http://www.xapian.org/cgi-bin/bugzilla/show_bug.cgi?id=93).
That's great!

I have upgraded to 1.0.4 and I am revisiting my code, replacing the PHP 
query parser I wrote with Xapian's.

Everything works well, but I have some remarks:

1. Adding a stopper to the query parser can make Apache hang under 
Windows (using the PHP bindings)
I already reported this problem in the past, see thread:
http://thread.gmane.org/gmane.comp.search.xapian.general/4599/focus=1198
but I did not file a bug report and it was never addressed.

It is not critical for me, as I have a workaround (store the stopper in 
a global variable or property so it is not destroyed too early, see the 
thread above for details), but it would be nice if we could finally 
address it...

2. Wildcards: no limits?
It seems that there is no limit on the number of terms a wildcard will 
generate: the query "a*" will generate a huge query OR'ing all the terms 
which start with an 'a', which will take a lot of resources and time to 
execute (this is a problem: a malicious user can exploit this to deny 
access to others).


Kevin Duraj | 8 Nov 20:12

Xapian Search Websites Listings

Xapian Search Websites Listings,

I came across the Xapian Search Websites listings for Xapian search engines.
 http://xapian.org/users.php

Can you please add the MyHealthcare.com search engine to the section: Search Websites
MyHealthcare.com uses Xapian to crawl and search 50 million web sites
on a single 1U server.

MyHealthcare.com
Url: http://myhealthcare.com
General web search engine with 50 million websites.

Thank you,
Kevin Duraj

Olly Betts | 10 Nov 13:35

Re: QueryParser : some remarks

On Thu, Nov 08, 2007 at 06:26:53PM +0100, Daniel Ménard wrote:
> 1. Adding a stopper to the query parser can make Apache hang under 
> Windows (using the PHP bindings)
> I already reported this problem in the past, see thread:
> http://thread.gmane.org/gmane.comp.search.xapian.general/4599/focus=1198
> but I did not file a bug report and it was never addressed.

It's not been forgotten - the best way to fix this would be to have a
reference counting mechanism for Stopper (and other similar classes
like MatchDecider) but we can't use the same technique we use for most
of the API classes as it doesn't allow subclassing.  Bug#186 is about
this:

http://www.xapian.org/cgi-bin/bugzilla/show_bug.cgi?id=186

But that's bound to involve ABI changes, so it's something for Xapian 1.1.

> 2. Wildcards: no limits?
> It seems that there is no limit on the number of terms a wildcard will 
> generate: the query "a*" will generate a huge query OR'ing all the terms 
> which start with an 'a', which will take a lot of resources and time to 
> execute (this is a problem: a malicious user can exploit this to deny 
> access to others).
> 
> In my old parser, I had two independent limits:
> - minimum number of chars before the '*' (e.g. 3 would allow abs* but 
> not ab*)
> - maximum number of terms a wildcard can expand to (e.g. 100: abs* is 
> allowed if there are fewer than 100 terms, else an exception is raised)
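
The two limits quoted above could be sketched roughly as follows. This is only an illustration in Python, not code from the old PHP parser or from Xapian: `expand_wildcard`, `WildcardLimitError` and the parameter names are hypothetical, and the term list is a plain Python iterable standing in for the index. Following the wording above, the sketch rejects an expansion once it reaches `max_terms` (so only fewer than `max_terms` terms are allowed).

```python
class WildcardLimitError(ValueError):
    """Raised when a wildcard is rejected by one of the two limits."""


def expand_wildcard(pattern, all_terms, min_prefix_len=3, max_terms=100):
    """Expand 'prefix*' against an iterable of index terms (hypothetical helper)."""
    if not pattern.endswith("*"):
        raise ValueError("pattern must end with '*'")
    prefix = pattern[:-1]

    # Limit 1: minimum number of characters before the '*'.
    if len(prefix) < min_prefix_len:
        raise WildcardLimitError(
            "need at least %d characters before '*'" % min_prefix_len
        )

    # Limit 2: give up as soon as the expansion reaches max_terms,
    # instead of first building the whole (possibly huge) list.
    matches = []
    for term in all_terms:
        if term.startswith(prefix):
            matches.append(term)
            if len(matches) >= max_terms:
                raise WildcardLimitError(
                    "wildcard expands to %d or more terms" % max_terms
                )
    return matches
```

With the defaults, `expand_wildcard("abs*", terms)` is accepted while `expand_wildcard("ab*", terms)` is rejected before any term is examined. Stopping early in the loop keeps the cost of a rejected wildcard bounded.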


Charlie Hull | 28 Nov 13:50

Problem compiling SVN-HEAD

Hi guys,

Quick question - I'm getting errors compiling the tests; of the form:

.\apitest.cc(46) : fatal error C1083: Cannot open include file: 
'api_anydb.h': No such file or directory
api_anydb.cc
.\api_anydb.cc(26) : fatal error C1083: Cannot open include file: 
'api_anydb.h':  No such file or directory

These files don't seem to exist. Any hints?

Cheers

Charlie

Richard Boulton | 28 Nov 13:56

Re: Problem compiling SVN-HEAD

Charlie Hull wrote:
> Hi guys,
> 
> Quick question - I'm getting errors compiling the tests; of the form:
> 
> .\apitest.cc(46) : fatal error C1083: Cannot open include file: 
> 'api_anydb.h': No such file or directory
> api_anydb.cc
> .\api_anydb.cc(26) : fatal error C1083: Cannot open include file: 
> 'api_anydb.h':  No such file or directory
> 
> These files don't seem to exist. Any hints?

You need to run the Perl script "tests/collate-apitest" to generate these.

-- 
Richard

Rusty Conover | 14 Nov 01:38

Problem indexing text with spelling enabled in Perl

Hi All,

I'm using TermGenerator::index_text() on version 1.0.4 with 
FLAG_SPELLING turned on, because the new spelling suggestion stuff 
seems awesome, but I'm getting a segv.

(gdb) bt
#0  0xb7ae153c in Xapian::WritableDatabase::add_spelling (this=0xa553988, word=@0xbff97724, freqinc=1)
     at ./include/xapian/base.h:154
#1  0xb7becf47 in Xapian::TermGenerator::Internal::index_text (this=0xa553970,
     itor={p = 0xab2d69f " North Face Windwall 1 Jacket boys", end = 0xab2d6c1 "", seqlen = 1},
     weight=3, prefix=@0xbff977ac, with_positions=true)
     at queryparser/termgenerator_internal.cc:207
#2  0xb7bebf0c in Xapian::TermGenerator::index_text (this=0x9b12d68, itor=@0xbff9779c, weight=3, prefix=@0xbff977ac)
     at queryparser/termgenerator.cc:90
#3  0xb7c6b6d6 in XS_Search__Xapian__TermGenerator_index_text ()
     from /usr/local/lib/perl5/site_perl/5.8.8/i686-linux/auto/Search/Xapian/Xapian.so
#4  0x080acc85 in Perl_pp_entersub ()
#5  0x080ab9ae in Perl_runops_standard ()
#6  0x08060ed0 in perl_run ()
#7  0x0805dc19 in main ()

It seems like there's a ref counting problem with the term, but it's 
in C++ templates and I can't quite figure out what's wrong.
