Dirk Rothe | 1 Dec 08:42 2009

Re: latin stemmer -> java

On Tue, 01 Dec 2009 00:25:23 +0100, Richard Boulton
<richard <at> tartarus.org> wrote:

> 2009/11/30 Dirk Rothe <d.rothe <at> semantics.de>:
>> Thanks, but I've run into another problem now. In the last distribution[1]
>> the snowball binary was embedded and I simply used it. This time it's
>> missing, so I tried to build it myself with a simple 'make', but that
>> failed with:
>
> Actually, that failure happens after it's created the snowball binary.

Ah, I hadn't seen that.

> However, I've fixed the problem now, I believe: I'd just forgotten to
> add the libstemmer_c.in file to the distribution tarball.

Thanks, snowball compiles now.

I've recreated the .java class, but it's still failing. I'm using "javac
1.6.0_17" (Windows), the latest SDK from Sun. There are fewer error
messages now, so things have improved. Here's the full output:

--------------------------------
compile-core:
     [javac] Compiling 10 source files to D:\pylucene\pylucene-2.9.1-1\lucene-java-2.9.1\build\contrib\snowball\classes\java
     [javac] D:\pylucene\pylucene-2.9.1-1\lucene-java-2.9.1\contrib\snowball\src\java\org\tartarus\snowball\ext\LatinStemmer.java:261:

Richard Boulton | 1 Dec 10:51 2009

Re: latin stemmer -> java

2009/12/1 Dirk Rothe <d.rothe <at> semantics.de>:
> I've recreated the .java class, but it's still failing. I'm using "javac
> 1.6.0_17" (Windows), the latest SDK from Sun. There are fewer error
> messages now, so things have improved. Here's the full output:

I'm sorry, I have no idea why it's still failing for you. It's now
working fine for me, so I suspect it has something to do with your
Lucene build environment. I suggest comparing the SnowballProgram.java
file in Lucene (probably in
D:\pylucene\pylucene-2.9.1-1\lucene-java-2.9.1\contrib\snowball\src\java\org\tartarus\snowball\)
with the one in snowball. The most likely cause of these errors is that
the Lucene developers have either changed this file or are using an old
version of it.
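
The comparison suggested above could be scripted. What follows is only a
sketch in Python: the function name diff_sources and the two one-line
sample declarations are made up for illustration and are not copied from
either SnowballProgram.java. In practice you would read both files from
disk and pass in their text.

```python
import difflib


def diff_sources(lucene_text, snowball_text):
    """Return unified-diff lines between two versions of a source file."""
    return list(
        difflib.unified_diff(
            lucene_text.splitlines(),
            snowball_text.splitlines(),
            fromfile="lucene/SnowballProgram.java",
            tofile="snowball/SnowballProgram.java",
            lineterm="",
        )
    )


# Hypothetical one-line stand-ins for the contents of the two files.
lucene_version = "protected StringBuffer current;"
snowball_version = "protected StringBuilder current;"

for line in diff_sources(lucene_version, snowball_version):
    print(line)
```

An empty result means the two copies are identical, in which case the
cause of the compile errors probably lies elsewhere.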

-- 
Richard

Dirk Rothe | 1 Dec 11:26 2009

Re: latin stemmer -> java

On Tue, 01 Dec 2009 10:51:30 +0100, Richard Boulton
<richard <at> tartarus.org> wrote:

> 2009/12/1 Dirk Rothe <d.rothe <at> semantics.de>:
>> I've recreated the .java class, but it's still failing. I'm using "javac
>> 1.6.0_17" (Windows), the latest SDK from Sun. There are fewer error
>> messages now, so things have improved. Here's the full output:
>
> I'm sorry, I have no idea why it's still failing for you. It's now
> working fine for me, so I suspect it has something to do with your
> Lucene build environment. I suggest comparing the SnowballProgram.java
> file in Lucene (probably in
> D:\pylucene\pylucene-2.9.1-1\lucene-java-2.9.1\contrib\snowball\src\java\org\tartarus\snowball\)
> with the one in snowball. The most likely cause of these errors is that
> the Lucene developers have either changed this file or are using an old
> version of it.

Yep, they are using a modified version[1]. The main difference seems to be
the use of StringBuffer instead of StringBuilder. After changing two
constructors (S_verb_form, S_noun_form) to StringBuffer(), the Latin
stemmer compiles happily. I guess I have to proceed with the *real*
integration now :)

Thanks for the fast help!

--dirk
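
If the class has to be regenerated often, the manual change described
above could be automated. This is a rough sketch in Python, not part of
snowball or Lucene: the helper name use_string_buffer and the sample
declaration are hypothetical, and the sketch swaps every StringBuilder
type reference, not just the two constructors mentioned above.

```python
import re

# Match the class name only as a whole identifier, so that names which
# merely contain "StringBuilder" are left alone.
_STRING_BUILDER = re.compile(r"\bStringBuilder\b")


def use_string_buffer(java_source):
    """Replace every StringBuilder type reference with StringBuffer."""
    return _STRING_BUILDER.sub("StringBuffer", java_source)


# Hypothetical declaration, in the style a generated stemmer might contain.
generated = "private StringBuilder S_noun_form = new StringBuilder();"
print(use_string_buffer(generated))
```

A blanket textual replacement like this is only reasonable for generated
code that you can inspect afterwards; it does nothing about any other
differences between the two SnowballProgram.java versions.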

[1] http://svn.apache.org/viewvc/lucene/java/tags/lucene_2_9_1/contrib/snowball/src/java/org/tartarus/snowball/SnowballProgram.java?view=markup

Roberto Mirizzi | 10 Dec 19:01 2009

Project for Italian Stemmer

Italian Stemmer in PHP:
http://www.phpclasses.org/browse/package/3731.html

and Italian Stemmer as Drupal Module:
http://drupal.org/project/italianstemmer

by Roberto Mirizzi
http://sisinflab.poliba.it/mirizzi/

Marek Piorkowski | 11 Dec 16:28 2009

C# implementation?

Dear all,
I was wondering if there is any C# implementation of the stemmers. Has this ever been considered?
Many thanks for any information.
Marek Piorkowski
_______________________________________________
Snowball-discuss mailing list
Snowball-discuss <at> lists.tartarus.org
http://lists.tartarus.org/mailman/listinfo/snowball-discuss

Richard Boulton | 11 Dec 17:12 2009

Re: C# implementation?

2009/12/11 Marek Piorkowski <marekpe <at> yahoo.com>:
> Dear all,
> I was wondering if there is any C# implementation of the stemmers. Has this
> ever been considered?

There isn't a C# backend for the Snowball compiler that I know of, and
I don't recall hearing of anyone considering one. C# is similar
enough to Java that it would probably be quite easy to create one from
the Java backend. There may be C# implementations of some of the
individual stemmers available, but I can't help you there either, I'm
afraid.

-- 
Richard

Martin Porter | 11 Dec 18:26 2009

-nisse ending in German stemmer


Wolfgang Klinger pointed out in October that the German stemmer reduces
Ku"rbisse (pumpkins) to Ku"rbiss, not Ku"rbis, and I promised to
investigate.

In the sample German vocabulary there is a collection of words ending
-isse (or -issen or -isses). Among these, about 70% actually have the
ending -nisse, while 30% have -isse without the preceding n. For those
with the -nisse ending, stemming to -nis is always correct. For those
with the -isse ending and no preceding n, stemming to -is is wrong in
all but a couple of cases, one being the word Ku"rbisse again.

So I've put in a new rule, to reduce -nisse (and -nissen and -nisses) to
-nis. 

Thanks to Wolfgang for this pointer.

However, Ku"rbisse still stems to Ku"rbiss after this change, because its
-isse ending has no preceding n.
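
To make the new rule concrete, here is a rough sketch of it in Python.
It models only the -nisse/-nissen/-nisses reduction described above, not
the German stemmer as a whole, and it ignores any positional conditions
the real stemmer may apply. The function name reduce_nisse and the
sample words are chosen purely for illustration.

```python
# The three endings covered by the rule; each is reduced to "nis".
NIS_ENDINGS = ("nissen", "nisses", "nisse")


def reduce_nisse(word):
    """Reduce a trailing -nisse, -nissen or -nisses to -nis."""
    for ending in NIS_ENDINGS:
        if word.endswith(ending):
            return word[: len(word) - len(ending)] + "nis"
    # No n before -isse (as in "kürbisse"): the rule does not apply.
    return word


for sample in ("ergebnisse", "ergebnissen", "ergebnisses", "kürbisse"):
    print(sample, "->", reduce_nisse(sample))
```

The last sample comes back unchanged from this rule, which is why it is
still left to the existing rules of the stemmer.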

Martin

Martin Porter | 11 Dec 18:31 2009

French stopword list


Some months ago, Jean-Christophe Deschamps pointed out some errors in
the French stopword list and suggested some additional words for
inclusion. The list has now been corrected, and most of his suggestions
added.

Martin

Bradley Grainger | 11 Dec 19:09 2009

Re: C# implementation?

> I was wondering if there is any C# implementation of the stemmers. Has
> this ever been considered?

I submitted a Snowball-to-C# compiler about two years ago:
http://article.gmane.org/gmane.comp.search.snowball/916

Bradley

Enzo Lombardi | 11 Dec 19:09 2009

Re: C# implementation?

Lucene.Net has a nice implementation of the snowball stemmer in C#.
-e

