J5 | 30 Sep 00:44

is EXACT_MATCH working?

Hello,

We are using Xappy to create indexes for package searching in Fedora.
Right now the results are a bit skewed due to freetext searches simply
matching the number of times a term shows up.  I want to fix this
using exact matching on the package name so that if an exact match is
found we return that as the top result.  This does not seem to work.
If I do this and remove all of the other matching fields, we always get
an empty result:

iconn.add_field_action('exact_name', xappy.FieldActions.INDEX_EXACT)
iconn.add_field_action('exact_name', xappy.FieldActions.STORE_CONTENT)
doc.fields.append(xappy.Field('exact_name', 'dbus',  weight=100.0))
.
.
.

then searching for 'dbus' using xapian should return that match but we
get an empty set:

query = qp.parse_query('dbus')
enquire.set_query(query)
matches = enquire.get_mset(0, 10)
count = matches.get_matches_estimated()
print count

> 0

How do we get count working?  BTW we are using xappy for indexing
because it presents a nice interface but xapian is simple enough on
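
A possible explanation, sketched under assumptions and not checked against
this index: xappy stores each field's terms under a field-specific prefix,
so a stock Xapian QueryParser that has not been told about that prefix
looks for the bare term 'dbus' and finds nothing. The toy model below uses
a plain dict in place of a Xapian database, and the prefix "XA" is invented
for illustration.

```python
# Toy model of a term index: term string -> set of document ids.
# The "XA" prefix is made up for illustration; xappy picks its own.
EXACT_NAME_PREFIX = "XA"


def index_exact(index, docid, value, prefix=EXACT_NAME_PREFIX):
    """Store the value as one prefixed term, as an exact-match field would."""
    index.setdefault(prefix + value, set()).add(docid)


def lookup(index, text, prefix=""):
    """Return the ids of documents containing the term prefix + text."""
    return index.get(prefix + text, set())


index = {}
index_exact(index, 1, "dbus")

# Roughly what a bare parse_query('dbus') amounts to: no prefix applied.
unprefixed = lookup(index, "dbus")
# A query aimed at the field's prefix finds the document.
prefixed = lookup(index, "dbus", EXACT_NAME_PREFIX)
```

With the real libraries the equivalent would be to register the field's
prefix on the parser (Xapian's QueryParser.add_boolean_prefix()) or to
build the query through xappy's SearchConnection. The prefix xappy actually
assigned has to be read from the index, not guessed.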

Bruno Rezende | 6 May 20:02

Fix for fieldmapping

Hi,

I've created a patch for a problem in fieldmapping:

http://code.google.com/p/xappy/issues/detail?id=37 (FieldMappings is
generating wrong prefixes)

Can someone review it? What is the right procedure for asking for
patch reviews?

--

-- 
You received this message because you are subscribed to the Google Groups "xappy-discuss" group.
To post to this group, send email to xappy-discuss@...
To unsubscribe from this group, send email to xappy-discuss+unsubscribe@...
For more options, visit this group at http://groups.google.com/group/xappy-discuss?hl=en.

Bruno Rezende | 11 Mar 16:05

Applying Multiple Caches support

Hi,

I've added support for applying multiple caches to a xappy index here:

http://code.google.com/p/xappy/issues/detail?id=36

It still lacks some tests related to document removal and update, but
I think it is in shape for a review. If someone can take a look, I'd
be grateful.

regards,
Bruno


Bruno Rezende | 17 Feb 14:16

Multiple Caches and replace document

Hi,

Suppose I have an index and multiple caches that can be applied to
it. The cache to use is chosen at search time. In this scenario, how
would incremental indexing be affected? I'm looking at the
IndexerConnection.replace method and have seen this:

       if self._index.get_metadata('_xappy_hascache'):
           if store_only:
               # Remove any cached items from the cache - the document is no
               # longer wanted in search results.
               self._remove_cached_items(id, xapid)
           else:
               # Copy any cached query items over to the new document.
               olddoc, olddocid = self._get_xapdoc(id, xapid)
               if olddoc is not None:
                   for value in olddoc.values():
                       if value.num < self._cache_manager_slot_start:
                           continue
                       xapdoc.add_value(value.num, value.value)

We can remove documents from our multiple caches, but I don't know
what I should do when the document is modified.
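
Read literally, the quoted branch does this when a document is replaced
and store_only is false: every value slot at or above the cache manager's
start slot is copied from the old document to the new one, so the cached
data survives the update. A toy model of that copy, with plain dicts in
place of Xapian documents and a made-up start slot, could look like this:

```python
# Stand-ins for Xapian documents: value slot number -> stored string.
CACHE_SLOT_START = 10000  # hypothetical; xappy keeps its own start slot


def carry_over_cache_values(old_values, new_values,
                            slot_start=CACHE_SLOT_START):
    """Copy only the cache-owned slots (>= slot_start) from old to new."""
    merged = dict(new_values)
    for slot, value in old_values.items():
        if slot < slot_start:
            continue  # ordinary field value; the new document has its own
        merged[slot] = value
    return merged


old_doc = {0: "old title", 10000: "cached-item-a", 10001: "cached-item-b"}
new_doc = {0: "new title"}
replaced = carry_over_cache_values(old_doc, new_doc)
```

With several caches, one thing to check (an assumption here, not something
the quoted code settles) is whether each cache owns its own slot range; if
so, the same copy would have to cover every range in use.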


Bruno Rezende | 3 Feb 13:40

replace document slow?

(it seems I'm having problems emailing xappy-discuss, so sorry if this
message is sent twice)

Hi,

I'm doing some incremental updates to a Xapian database using the xappy
API. The changes to the documents are minimal, just adding/removing
some terms. What I'm doing is something like:

1. get the documents from a search connection
2. change the terms with ProcessedDocument.add_term /
ProcessedDocument.remove_term
3. call IndexerConnection.replace(changed_doc)

I'm getting an average of ~200 items/sec. If instead of using the
document returned by search connection I get the document from the
indexer connection and continue using replace(doc), I see no real
gain.

I tried this too:

1. get the documents from a search connection
2. get each document from indexer connection
3. change the terms with ProcessedDocument.add_term /
ProcessedDocument.remove_term
4. see if the changes would be applied to the index, without calling
IndexerConnection.replace(changed_doc)

With this approach the rate rose to ~1900 items/sec, but then the
changes were not applied to the db.

Stéphane Klein | 3 Nov 10:40

How can I get "Xapian::MSetIterator::get_collapse_count()" C++ function in xappy ?

Hi,

When I use:

result = conn.search(q, 0, 10, collapse='category')

I can do:

result.matches_estimated

Now, how can I get the C++ function
"Xapian::MSetIterator::get_collapse_count()" in xappy?
(More information about this function:
http://xapian.org/docs/apidoc/html/classXapian_1_1MSetIterator.html#5a0c5216cb505912318e6a552725e3af
)
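
For reference, what that number means: when results are collapsed on a
key, Xapian keeps the best-ranked document per key value, and
get_collapse_count() reports how many other documents with the same key
were dropped (Xapian documents the figure as a lower bound). The toy model
below counts exactly and works on plain dicts; it is not the xappy API.

```python
from collections import OrderedDict


def collapse(ranked_docs, key):
    """Keep the first (best-ranked) doc per collapse key; count the rest.

    ranked_docs: list of dicts, already sorted best-first.
    Returns a list of (doc, collapse_count) pairs in rank order.
    """
    kept = OrderedDict()  # collapse key -> [doc, number of docs dropped]
    for doc in ranked_docs:
        k = doc[key]
        if k in kept:
            kept[k][1] += 1
        else:
            kept[k] = [doc, 0]
    return [(doc, count) for doc, count in kept.values()]


docs = [
    {"id": 1, "category": "a"},
    {"id": 2, "category": "b"},
    {"id": 3, "category": "a"},
    {"id": 4, "category": "a"},
]
result = collapse(docs, "category")
```

In the Xapian Python bindings each MSet item carries this value as
collapse_count; whether xappy's search results expose it is the open
question here.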

Thanks for your help,
Stéphane


Stéphane Klein | 3 Nov 10:03

Is it possible to create a xappy field with actions : INDEX_FREETEXT and COLLAPSE ?

Hi,

My short question: is it possible to create a xappy field with both the
INDEX_FREETEXT and COLLAPSE actions?

Something like this:

    conn.add_field_action(
        'laboratory_name',
        xappy.FieldActions.INDEX_FREETEXT || xappy.FieldActions.COLLAPSE,
        weight=10,
        language=lang,
        spell=True
    )

I've looked in the xappy documentation and source code, and found nothing
on this subject.
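
The first message in this archive attaches two actions to one field by
calling add_field_action twice (INDEX_EXACT, then STORE_CONTENT), so the
likely pattern is one call per action, not constants combined with an
operator. The sketch below uses a stand-in class and placeholder constants
so that it runs without xappy; whether xappy accepts this particular pair
on a single field is not confirmed here.

```python
class RecordingIndexer(object):
    """Hypothetical stand-in mimicking the call shape of add_field_action."""

    def __init__(self):
        self.actions = {}  # field name -> list of (action, keyword args)

    def add_field_action(self, fieldname, action, **kwargs):
        self.actions.setdefault(fieldname, []).append((action, kwargs))


# Placeholder constants; the real ones live in xappy.FieldActions.
INDEX_FREETEXT = "INDEX_FREETEXT"
COLLAPSE = "COLLAPSE"

conn = RecordingIndexer()
# One call per action, each with only the options that action takes.
# "en" stands in for the poster's `lang` variable.
conn.add_field_action("laboratory_name", INDEX_FREETEXT,
                      weight=10, language="en", spell=True)
conn.add_field_action("laboratory_name", COLLAPSE)
```

The free-text options (weight, language, spell) go only on the
INDEX_FREETEXT call; the second call carries no options in this sketch.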

Thanks for your help,
Stephane


Yannick Gingras | 4 Aug 04:49

Patch for Highlighter.makeSample


Hello Xappy devs,
  please find attached a patch for `Highlighter.makeSample()`.  The
previous algorithm had a strong tendency to strip search terms from
the sample, and it very often put double dots ("..") where it
should have used an ellipsis ("...").  The new algorithm is not
perfect: it will still strip the search terms if they appear at the
end of a long sentence and the sample size is very small, but
that should not happen very often in real life.  The complexity is
unchanged.

The `block-class` patch should be applied before the
`better-sample-production` one.

Cheers,

-- 
Yannick Gingras
http://ygingras.net
http://montrealpython.org -- lead organizer
# HG changeset patch
# Parent 87844d4d20ef0909234da5958979d50180c4e54c
converted matched blocks to class to make the sample selection algorithm more self-documenting

diff -r 87844d4d20ef xappy/highlight.py
--- a/xappy/highlight.py	Tue Aug 03 17:22:49 2010 -0400
+++ b/xappy/highlight.py	Tue Aug 03 21:11:23 2010 -0400
@@ -73,6 +73,20 @@

charliejuggler | 31 Mar 12:29

Updating Xapian

Hi,

Could we get a new version of the merged Xapian packages generated?
I've submitted a patch to the Xapian bugtracker with improved Windows
build files, as the current Xapian packages don't build on Windows.
I'm trying to release a new version of Flax Basic.

Cheers

Charlie


Dominic LoBue | 9 Mar 18:48

Massive memory leak

Hello,

I am seeing a massive memory leak in xappy, but I'm having trouble
pinning it down exactly.

Some background: I'm using xappy to index my email and store useful
headers for quick access.

In the program I'm developing I will open a new search connection,
perform a query, copy all the header information into a custom
container class, and then close the search connection. I have found
that as I keep performing these operations my program continues to use
more and more RAM, and never releases anything.

Here's a really simple example that makes the problem obvious:
import pdb
import xappy
from overwatch import xapidx
from databasics import msg_factory
sconn = xappy.SearchConnection(xapidx)
r = sconn.search(sconn.query_all(), 0, 99999999, checkatleast=-1, sortby='-sent')
r = map(msg_factory, r)
del r
del sconn
pdb.set_trace()

msg_factory is just a factory function that returns a named tuple
holding all the header information from the ProcessedDocument it gets.