Olly Betts | 1 Aug 11:16 2009

Re: refined search

On Fri, Jul 31, 2009 at 03:13:24PM -0700, Marc Fromm wrote:
> Once an initial search displays the results is there a way to do a
> search on just those documents included in the initial search?

If the new query is "new_query" and the old one "old_query", you can
just do:

new_query = Xapian::Query(Xapian::Query::OP_FILTER, new_query, old_query);

The data for old_query will be cached, so this is pretty efficient.
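
To make the semantics concrete, here is a minimal sketch in plain C++ that models what OP_FILTER does, without calling Xapian itself: it behaves like AND, but only the left-hand subquery contributes weights. The `MatchWeights` alias and the `filter_matches` function are hypothetical names used for illustration only; they are not part of the Xapian API.

```cpp
#include <map>
#include <set>

// Hypothetical model type (not a Xapian class): document id -> weight.
using MatchWeights = std::map<unsigned, double>;

// Model of OP_FILTER: a document matched by the left subquery is kept only
// if the right subquery matches it too; weights come from the left side only.
MatchWeights filter_matches(const MatchWeights& left,
                            const std::set<unsigned>& right) {
    MatchWeights result;
    for (const auto& entry : left) {
        if (right.count(entry.first) != 0) {
            result.insert(entry);
        }
    }
    return result;
}
```

In this model, `left` plays the role of new_query and `right` the role of old_query, so the refined results are ranked purely by how well they match the new query.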

Cheers,
    Olly
Rafi | 1 Aug 11:41 2009

something is wrong with matched estimated

Hello All,

I've installed Xapian search on my site www.bazarek.pl; it's quick and
nice ;), but I have a problem with get_matches_estimated. When I search
for two words, I get 39 results on the first page, 62 on the next page,
and 65 on the third, which is a little strange. You can see a live example here:

page 1: http://bazarek.pl/searchx.php?q=srebrne+cyrkonie
page 2: http://bazarek.pl/searchx.php?q=srebrne+cyrkonie&s=25
page 3: http://bazarek.pl/searchx.php?q=srebrne+cyrkonie&s=49

What is wrong?
-- 
Best regards,
 Rafi                          mailto:webdeveloper <at> poczta.onet.pl
Olly Betts | 1 Aug 12:24 2009

Re: something is wrong with matched estimated

On Sat, Aug 01, 2009 at 11:41:46AM +0200, Rafi wrote:
> I've installed Xapian search on my site www.bazarek.pl; it's quick and
> nice ;), but I have a problem with get_matches_estimated. When I search
> for two words, I get 39 results on the first page, 62 on the next page,
> and 65 on the third, which is a little strange. You can see a live example here:

http://trac.xapian.org/wiki/FAQ/MoreAccurateEstimates

Cheers,
    Olly
Olly Betts | 1 Aug 13:48 2009

Re: Multiple Databases on the same port

On Fri, Jul 31, 2009 at 11:43:50AM -0400, Eddie Drapkin wrote:
> The suggestion to add a database ID or a site ID
> field in the databases doesn't seem too practical, as we have a
> type-ahead search and even that piece of overhead could turn out to be
> more costly than we'd expect, not to mention making the entire search
> process harder to manage.

Yes, doesn't seem a good approach if you care about performance.

> As far as running a xapian-tcpsrv instance
> per database, that's a management and security nightmare, not to
> mention putting a very real limit on how far we can scale out.

It's not ideal for a large number of databases, but it's the only way
to achieve this with the remote backend as things stand.

> It's not really practical to switch search engines right now, not that
> we're not happy with Xapian, but I do know that Sphinx has the
> capability of serving on a single port and having multiple "indexes."
> Is there another, alternative, way to set up our datasets so that
> they're uniquely searchable without being in separate databases or
> having to search based on DatabaseID all the time?

I don't see one.  If you put them all in one database, you need a way
to filter out just the ones you want...

I guess you could serve the databases over NFS or similar to the search
boxes, and then just open them without the remote backend.

Another approach is to do the searching on the box with database

Olly Betts | 2 Aug 14:53 2009

Re: xapian-compact 'undefined symbol' error

On Fri, Jul 31, 2009 at 03:31:36PM -0400, Paul Williamson wrote:
> yes, we are using 1.0.13, with libxapian.so.15.6.4.  So, what are our
> options, will upgrading to 1.0.14 fix this, or do we need to wait until
> 1.0.15?

This issue will only manifest if you use xapian-compact and libxapian
*from different releases*.  Ideally, any two 1.0.x versions (except
1.0.0) should be compatible in this way, but it turns out they aren't
always.

Before I cause mass panic (or at least mass unease), for user code the
ABI is fully compatible so you can just upgrade Xapian and everything
should work.  The mistake involves symbols not accessible via the public
API headers, but which xapian-compact uses.

So just make sure you're using xapian-compact and libxapian from the
same release.  Assuming you're on Linux (or a platform with the same
versioning scheme), then libxapian.so.15.6.4 is from 1.0.13.  To see
what version your xapian-compact is from try:

xapian-compact --version

My guess is you have 1.0.13 installed with prefix /usr and an older
release with prefix /usr/local and /usr/local/bin is before /usr/bin
on your path.

Hopefully 1.0.15's library will restore compatibility with older
versions of xapian-compact, but it is more complex than I'd hoped as the
class changed size and layout a few releases ago.


Rolf Koehling | 4 Aug 22:51 2009

Re: omindex hangs while scanning (update)

Hi Olly,

I have been busy the last few days but finally applied both of your
patches, and the new version works just fine. Many thanks for your help.

Kind regards
Rolf Köhling.

P.S. As this discussion is listed in your mailing list, could you please
be so kind as to remove the "Elke" from the messages? Many thanks in advance.

Olly Betts schrieb:
> There's no need to cc: me on mailing list replies.
>
> On Sun, Jul 05, 2009 at 11:57:56PM +0200, Elke + Rolf Koehling wrote:
>   
>> My problem occurs when running omindex.exe under Windows on HTML files
>> that have the Windows line terminator (CR/LF). After converting them to
>> the Unix line terminator (LF), the software ran fine. The bug is in the
>> file loadfile.cc:
>>
>>     while (n) {
>>         int c = read(fd, blk, min(n, sizeof(blk)));
>>         cout << "### read " << c << endl;
>>         if (c < 0) {
>>
>> The problem is the read() call which returns for my file of 592 bytes

Jason Chapin | 5 Aug 18:50 2009

Trouble understanding MultiValueSorter with PHP bindings

Hi all,

Using the PHP5 bindings, I have been trying to sort results using 
MultiValueSorter:
                ...
        $enquire = new XapianEnquire($database);
        $enquire->set_query($query);
        $sorter = new XapianMultiValueSorter();
        $sorter->add(2);
        $enquire->set_sort_by_key($sorter, false);
        $matches = $enquire->get_mset((int)$startAt, (int)$perPage);
        ...

The results always come back sorted by ascending docId. My understanding of
$sorter->add(2) is to sort the results on the third field of the
document (field 0 being the docId and field 3 being the second field
defined in the scriptindex file used to build the database). Any advice
appreciated.

Regards,
Jason

-- 
////////////////////////////////////
Jason Chapin
jason <at> roasted.org
David Sauve | 11 Aug 19:04 2009

xapian-haystack - A Xapian backend for Django Haystack

Hi Everyone,

Apologies if this is inappropriate for this group, but I wanted to introduce
myself and to announce the beta release of xapian-haystack, a backend for
Django Haystack.

The code can be found here: http://github.com/notanumber/xapian-haystack/

I'd love to hear any comments, concerns, questions, etc, about my
implementation.  It's been a pleasure working with Xapian.

David
Jonathan Yu | 12 Aug 15:41 2009

Test POD without testing POD Coverage (Was: RT#48221)

Hi:

I'm re-reporting this bug here since Olly Betts informs me that I
filed the bug in the wrong place.

There's currently no way to just test the POD, without testing POD coverage.

The following patch to t/03podcoverage.t allows this:
-plan skip_all => 'set TEST_POD to enable this test' unless $ENV{TEST_POD};
+plan skip_all => 'set TEST_POD_COVERAGE to enable this test' unless $ENV{TEST_POD_COVERAGE};

We currently apply this patch in the Debian package for this module,
because there are lots of coverage issues, but we'd still like to test
the POD on our build daemons.

Cheers,

Jonathan
Olly Betts | 13 Aug 05:53 2009

Re: Trouble understanding MultiValueSorter with PHP bindings

On Wed, Aug 05, 2009 at 12:50:38PM -0400, Jason Chapin wrote:
> My understanding of 
> $sorter->add(2) is to sort the results on the third field of the 
> document (field 0 being the docId and field 3 being the second field 
> defined in the scriptindex file used to build the database).

No, that's incorrect.  The argument of Sorter::add() is the number of a
Document value slot.

With the API, you add these with Document::add_value().  With
scriptindex, you can add these with value=SLOT or valuenumeric=SLOT.
See:

http://xapian.org/docs/omega/scriptindex.html
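
One reason the choice between value=SLOT and valuenumeric=SLOT matters for sorting: values are compared as strings, so plain decimal text does not sort in numeric order. The sketch below is plain C++ with no Xapian calls. It shows the problem, then uses zero-padding as a stand-in for a sortable encoding. `pad_number` and `sort_as_strings` are hypothetical helpers, and zero-padding is an assumption made for illustration, not the encoding that valuenumeric actually stores.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical helper: left-pad the decimal text of n with zeros up to
// `width` characters, so that string order agrees with numeric order.
std::string pad_number(unsigned long n, std::size_t width) {
    std::string digits = std::to_string(n);
    if (digits.size() >= width) {
        return digits;
    }
    return std::string(width - digits.size(), '0') + digits;
}

// Sort values the way string-valued sort keys are compared: character by
// character, not by numeric magnitude.
std::vector<std::string> sort_as_strings(std::vector<std::string> values) {
    std::sort(values.begin(), values.end());
    return values;
}
```

Sorting "9", "10", "100" as strings puts "9" last, while the padded forms "009", "010", "100" come out in numeric order.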

As for sorting by docid as well, note that this only makes sense as
the "least significant" ordering (since no two documents have the same
document id, sorting by docid then <anything> means that <anything>
never affects the ordering).

The docid is always used as the final decider anyway, but you can change
whether you want ascending, descending, or you don't care (in which case
you get whatever is most efficient).  In the API that is
Enquire::set_docid_order().  Doesn't seem to be accessible from
OmegaScript at present (but would be easy to add).
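
The ordering argument above can be sketched in plain C++ (again a model, not Xapian code). The `Match` struct and both functions are hypothetical names for illustration. The first function sorts by a key and uses the docid only as the final decider, in either direction; the second puts the docid first, where the key can never influence the result because docids are unique.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical model of a matched document: its id and a single sort key.
struct Match {
    unsigned docid;
    std::string key;
};

// Order by key first; the docid only decides between documents whose keys
// are equal, in ascending or descending order as requested.
std::vector<Match> sort_by_key_then_docid(std::vector<Match> matches,
                                          bool docid_ascending) {
    std::sort(matches.begin(), matches.end(),
              [docid_ascending](const Match& a, const Match& b) {
                  if (a.key != b.key) {
                      return a.key < b.key;
                  }
                  return docid_ascending ? a.docid < b.docid
                                         : a.docid > b.docid;
              });
    return matches;
}

// Order by docid first: since no two documents share a docid, the key
// comparison below is never reached for distinct documents.
std::vector<Match> sort_by_docid_then_key(std::vector<Match> matches) {
    std::sort(matches.begin(), matches.end(),
              [](const Match& a, const Match& b) {
                  if (a.docid != b.docid) {
                      return a.docid < b.docid;
                  }
                  return a.key < b.key;
              });
    return matches;
}
```

With the docid as the least significant key, changing its direction only reorders documents that tie on the key, which is the situation Enquire::set_docid_order() addresses.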

There's more discussion of sorting here:

http://xapian.org/docs/sorting
