Richard Boulton | 3 Dec 13:17 2007

Re: Document clustering module?

☼ 林永忠 ☼ (Yung-chung Lin) wrote:
> I have made a new class for calculating document similarity. Please
> review it. Maybe the class should be an internal one, since this will
> only be called by Xapian::Cluster in my plan.

I'm going to be taking a look at this shortly.  If you have any more 
changes since the version you sent to the list (on 18/09/07), sending 
them would be helpful.  Don't worry if you don't, though - I've just 
done a quick review of the last version sent to the list, and it's 
definitely a good start on this.

-- 
Richard
Olly Betts | 8 Dec 13:02 2007

Re: Problem indexing text with spelling enabled in Perl

Please don't Cc: me when mailing the lists - I do read the lists, and it
won't get you a reply any quicker.

On Tue, Nov 13, 2007 at 05:38:43PM -0700, Rusty Conover wrote:
> I'm using the TermGenerator::index_text() on version 1.0.4 with the  
> FLAG_SPELLING turned on, because the new spelling suggestion stuff  
> seems awesome, but I'm getting a segv.

Hmm, TermGenerator::set_flags() doesn't appear to be wrapped for Perl
yet, so how are you doing this?  Have you patched Search::Xapian?  If
so, please contribute the patch!

> (gdb) bt
> #0  0xb7ae153c in Xapian::WritableDatabase::add_spelling
> (this=0xa553988, word=@0xbff97724, freqinc=1) at ./include/xapian/
> base.h:154
> #1  0xb7becf47 in Xapian::TermGenerator::Internal::index_text
> (this=0xa553970, itor=
>       {p = 0xab2d69f " North Face Windwall 1 Jacket boys", end =
> 0xab2d6c1 "", seqlen = 1}, weight=3, prefix=@0xbff977ac,
> with_positions=true)
>     at queryparser/termgenerator_internal.cc:207
> #2  0xb7bebf0c in Xapian::TermGenerator::index_text (this=0x9b12d68,
> itor=@0xbff9779c, weight=3, prefix=@0xbff977ac) at queryparser/
> termgenerator.cc:90
> #3  0xb7c6b6d6 in XS_Search__Xapian__TermGenerator_index_text ()  
> from /usr/local/lib/perl5/site_perl/5.8.8/i686-linux/auto/Search/ 
> Xapian/Xapian.so
> #4  0x080acc85 in Perl_pp_entersub ()
> #5  0x080ab9ae in Perl_runops_standard ()

Daniel Brumbaugh Keeney | 11 Dec 17:49 2007

ruby bindings documentation

I am interested in helping with the documentation for the Ruby
bindings of Xapian. Very little documentation is currently
available in RDoc, Ruby's standard documentation system. I'm not
familiar with Xapian, so I would like to do this in part to learn
the API. I'm also not familiar with C++; however, I believe that with
the examples I can figure out how the C++ documentation applies to Ruby.
Basically:

Is anyone working on this?
Is there more Ruby-specific documentation beyond what is found (and
linked to) on <http://www.xapian.org/docs/bindings/ruby/>?
Who do I ask if I can't understand the C++ documentation?
Where can I find the source of the current RDoc?

Thank you,
Daniel Brumbaugh Keeney
Olly Betts | 11 Dec 18:42 2007

Re: ruby bindings documentation

On Tue, Dec 11, 2007 at 10:49:46AM -0600, Daniel Brumbaugh Keeney wrote:
> I am interested in helping with the documentation for the Ruby
> bindings of Xapian.

Great!

> Very little documentation is currently
> available in RDoc, Ruby's standard documentation system. I'm not
> familiar with Xapian, so I would like to do this in part to learn
> the API. I'm also not familiar with C++; however, I believe that with
> the examples I can figure out how the C++ documentation applies to Ruby.
> Basically:
> 
> Is anyone working on this?

Not that I'm aware of.

> Is there more Ruby-specific documentation beyond what is found (and
> linked to) on <http://www.xapian.org/docs/bindings/ruby/>?

That's it currently.

> Who do I ask if I can't understand the C++ documentation?

Ask here, or #xapian on freenode.

> Where can I find the source of the current RDoc?

It's generated from ruby/xapian.rb in xapian-bindings:


Daniel Brumbaugh Keeney | 11 Dec 20:16 2007

Re: ruby bindings documentation

On Dec 11, 2007 11:42 AM, Olly Betts <olly <at> survex.com> wrote:
> I'd suggest studying how we do this for Python.  Essentially, the C++
> headers have documentation comments which we collate into the C++ API
> documentation using doxygen (http://www.doxygen.org/).  Doxygen can
> also output XML, so we do this, and then parse it using a script to
> generate Python docstrings:

I expect I'll want to start with the generated XML, run some
XSLT/Ruby, and then modify it by hand. Where can I find that XML? I
don't really want to download all of Xapian's source just to get it.
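
The pipeline described above (Doxygen XML in, per-method documentation text out) can be sketched roughly as follows. This is only an illustration, not the script Xapian actually uses: the sample XML is a hand-written fragment shaped like Doxygen's XML output, the descriptions in it are made up for the example, and the helper names extract_briefs and to_rdoc_comment are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample, shaped like Doxygen's XML output for one class.
# The description strings are illustrative, not copied from Xapian.
SAMPLE_XML = """\
<doxygen>
  <compounddef kind="class">
    <compoundname>Xapian::TermGenerator</compoundname>
    <sectiondef kind="public-func">
      <memberdef kind="function">
        <name>index_text</name>
        <briefdescription><para>Index some text.</para></briefdescription>
      </memberdef>
      <memberdef kind="function">
        <name>set_database</name>
        <briefdescription><para>Set the database for spelling data.</para></briefdescription>
      </memberdef>
    </sectiondef>
  </compounddef>
</doxygen>
"""


def extract_briefs(xml_text):
    """Return {(class_name, member_name): brief_text} for documented functions."""
    root = ET.fromstring(xml_text)
    briefs = {}
    for compound in root.iter("compounddef"):
        class_name = compound.findtext("compoundname", default="")
        for member in compound.iter("memberdef"):
            if member.get("kind") != "function":
                continue
            name = member.findtext("name", default="")
            brief = member.find("briefdescription")
            if brief is None:
                continue
            # itertext() flattens nested markup (e.g. <ref> inside <para>);
            # split/join collapses runs of whitespace.
            text = " ".join("".join(brief.itertext()).split())
            if text:
                briefs[(class_name, name)] = text
    return briefs


def to_rdoc_comment(text):
    """Format one description as a '#'-prefixed comment line."""
    return "# " + text


if __name__ == "__main__":
    for (cls, name), text in sorted(extract_briefs(SAMPLE_XML).items()):
        print(f"{cls}#{name}")
        print(to_rdoc_comment(text))
```

The output of a step like this would still need editing by hand, since C++ signatures and types do not map one-to-one onto the Ruby API.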

Daniel Brumbaugh Keeney
Rusty Conover | 17 Dec 07:57 2007

Crashes with spelling enabled and perl.

Hi Guys,

Here's a simple test case that causes a segfault with the perl bindings patched to enable spelling correction:

use strict;
use warnings;
use Search::Xapian;
my $db = Search::Xapian::WritableDatabase->new("test.db", Search::Xapian::DB_CREATE_OR_OPEN);
if (!defined($db)) {
    die("Failed to open xapian_database: $!");
}
my $indexer = Search::Xapian::TermGenerator->new();
$indexer->set_flags(Search::Xapian::FLAG_SPELLING);
my $document = Search::Xapian::Document->new();
$indexer->set_document($document);
$indexer->index_text(lc('test'), 1);
$db->add_document($document);
undef $db;  

Here's the patch to enable spelling against Search-Xapian-1.0.4.0:


Here's the backtrace against 1.0.4:

Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread -1208211776 (LWP 27230)]
0x001c9bbc in Xapian::WritableDatabase::add_spelling (this=0x9d77de0, word=@0xbf846fa0, freqinc=1) at ./include/xapian/base.h:154
154         return dest;
Current language:  auto; currently c++
(gdb) bt
#0  0x001c9bbc in Xapian::WritableDatabase::add_spelling (this=0x9d77de0, word=@0xbf846fa0, freqinc=1) at ./include/xapian/base.h:154
#1  0x0032608a in Xapian::TermGenerator::Internal::index_text (this=0x9d77dc8, itor={p = 0x0, end = 0x9cb5db8 "", seqlen = 0}, weight=1,
    prefix=@0xbf84703c, with_positions=true) at queryparser/termgenerator_internal.cc:207
#2  0x0032506c in Xapian::TermGenerator::index_text (this=0x9c94cd0, itor=@0xbf84702c, weight=1, prefix=@0xbf84703c) at queryparser/termgenerator.cc:90
#3  0x0017100e in XS_Search__Xapian__TermGenerator_index_text (my_perl=0x9c78008, cv=0x9d7f7e8) at /usr/local/include/xapian/termgenerator.h:115
#4  0x00c3142d in Perl_pp_entersub () from /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so
#5  0x00c2a88f in Perl_runops_standard () from /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so
#6  0x00bd010e in perl_run () from /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so
#7  0x0804921e in main ()

Any help would be great as I'm excited to get spelling correction working.

Thanks,

Rusty
--
Rusty Conover
InfoGears Inc.



_______________________________________________
Xapian-devel mailing list
Xapian-devel <at> lists.xapian.org
http://lists.xapian.org/mailman/listinfo/xapian-devel
Olly Betts | 18 Dec 02:46 2007

Re: Crashes with spelling enabled and perl.

On Sun, Dec 16, 2007 at 11:57:15PM -0700, Rusty Conover wrote:
> my $db = Search::Xapian::WritableDatabase->new("test.db",  
> Search::Xapian::DB_CREATE_OR_OPEN);
> if (!defined($db)) {
>     die("Failed to open xapian_database: $!");
>   }
> my $indexer = Search::Xapian::TermGenerator->new();
> $indexer->set_flags(Search::Xapian::FLAG_SPELLING);

You need to add this line here so that the TermGenerator object knows
which database to add spellings to:

$indexer->set_database($db);

But a SEGV is undesirable.  I'll see if I can fix that.

Cheers,
    Olly
Rick Olson | 28 Dec 23:12 2007

Build Error in trunk (omega)

Hello,

There is a build error caused by what looks to be a typo in
xapian-applications/omega/omega.cc

My checkout is about 15 minutes old, so it's possible it has been caught 
since then.

omega.cc: In function ‘int main(int, char**)’:
omega.cc:204: error: ‘pretty_tery’ was not declared in this scope
make[3]: *** [omega.o] Error 1
make[3]: Leaving directory 
`/home/rolson/projects/xapian/xapian/xapian-applications/omega'
make[2]: *** [all-recursive] Error 1
make[2]: Leaving directory 
`/home/rolson/projects/xapian/xapian/xapian-applications/omega'
make[1]: *** [all] Error 2
make[1]: Leaving directory 
`/home/rolson/projects/xapian/xapian/xapian-applications/omega'
make: *** [all] Error 2

The following patch should fix the typo (unless it's been fixed already, 
or there really is a pretty_tery function somewhere).

--- /tmp/omega.cc.orig  2007-12-28 14:04:40.000000000 -0800
+++ xapian-applications/omega/omega.cc  2007-12-28 14:04:52.000000000 -0800
@@ -201,7 +201,7 @@
             for (Xapian::ESetIterator i = eset.begin(); i != eset.end(); i++) {
                 if ((*i).empty()) continue;
                 if (!query_string.empty()) query_string += ' ';
-               query_string += pretty_tery(*i);
+               query_string += pretty_term(*i);
             }
         }
      }

Cheers,

Rick
Richard Boulton | 29 Dec 06:59 2007

Re: Build Error in trunk (omega)

Rick Olson wrote:
> Hello,
> 
> There is a build error caused by what looks to be a typo in
> xapian-applications/omega/omega.cc
> 
> My checkout is about 15 minutes old, so it's possible it has been caught 
> since then.
> 
> 
> omega.cc: In function ‘int main(int, char**)’:
> omega.cc:204: error: ‘pretty_tery’ was not declared in this scope
> make[3]: *** [omega.o] Error 1
> make[3]: Leaving directory 
> `/home/rolson/projects/xapian/xapian/xapian-applications/omega'
> make[2]: *** [all-recursive] Error 1
> make[2]: Leaving directory 
> `/home/rolson/projects/xapian/xapian/xapian-applications/omega'
> make[1]: *** [all] Error 2
> make[1]: Leaving directory 
> `/home/rolson/projects/xapian/xapian/xapian-applications/omega'
> make: *** [all] Error 2

That's very odd; my checkout has "query_string += pretty_term(*i);" as 
line 204, and according to svn blame, that line hasn't changed since SVN 
revision 4948 (on October 16th 2003).  Are you sure it's not a local 
modification, or perhaps you are working on a branch?

-- 
Richard
Rick Olson | 29 Dec 11:38 2007

Re: Build Error in trunk (omega)

Richard Boulton wrote:
> Rick Olson wrote:
>> Hello,
>>
>> There is a build error caused by what looks to be a typo in
>> xapian-applications/omega/omega.cc
>>
>> My checkout is about 15 minutes old, so it's possible it has been 
>> caught since then.
>>
>>
>> omega.cc: In function ‘int main(int, char**)’:
>> omega.cc:204: error: ‘pretty_tery’ was not declared in this scope
>> make[3]: *** [omega.o] Error 1
>> make[3]: Leaving directory 
>> `/home/rolson/projects/xapian/xapian/xapian-applications/omega'
>> make[2]: *** [all-recursive] Error 1
>> make[2]: Leaving directory 
>> `/home/rolson/projects/xapian/xapian/xapian-applications/omega'
>> make[1]: *** [all] Error 2
>> make[1]: Leaving directory 
>> `/home/rolson/projects/xapian/xapian/xapian-applications/omega'
>> make: *** [all] Error 2
>
> That's very odd; my checkout has "query_string += pretty_term(*i);" as 
> line 204, and according to svn blame, that line hasn't changed since 
> SVN revision 4948 (on October 16th 2003).  Are you sure it's not a 
> local modification, or perhaps you are working on a branch?
>

Hmm, it was a fresh checkout of trunk... but who knows.  I probably 
should have double checked first :p  svn blame does have it correct.

--
Rick
