Stefan Trcek | 1 Dec 2009 10:52

Re: What does "out of order" mean?

On Monday 30 November 2009 18:42:50 Michael McCandless wrote:
> I was able to apply that git patch just fine -- so I think it'll
> work?

Good to hear it is that simple.
The attached patch completes the task.
It touches two files, so if it applies cleanly as well, I'm confident.

Stefan

> On Mon, Nov 30, 2009 at 12:22 PM, Stefan Trcek <wzzelfzzel <at> abas.de> wrote:
> > On Monday 30 November 2009 14:24:20 Michael McCandless wrote:
> >> I agree, it's silly we label things like TopDocs/TopFieldDocs as
> >> expert -- they are no longer for "low level" APIs (or, perhaps
> >> since we've removed the "high level" API (= Hits), what remains
> >> should no longer be considered low level).
> >>
> >> Do you wanna cough up a patch to correct these?
diff --git a/src/java/org/apache/lucene/search/TopDocs.java b/src/java/org/apache/lucene/search/TopDocs.java
index 0f098e1..06ee171 100644
--- a/src/java/org/apache/lucene/search/TopDocs.java
+++ b/src/java/org/apache/lucene/search/TopDocs.java
@@ -20,7 +20,7 @@ package org.apache.lucene.search;
 /**
  * @see Searcher#search(Query,Filter,int) */
 public class TopDocs implements java.io.Serializable {
-  /** Expert: The total number of hits for the query.

Michael McCandless | 1 Dec 2009 11:07

Re: What does "out of order" mean?

OK -- none of IndexSearcher's search methods needed tweaking?  Just
TopDocs/TopFieldDocs?

Mike

On Tue, Dec 1, 2009 at 4:52 AM, Stefan Trcek <wzzelfzzel <at> abas.de> wrote:
> On Monday 30 November 2009 18:42:50 Michael McCandless wrote:
>> I was able to apply that git patch just fine -- so I think it'll
>> work?
>
> Good to hear it works that simple.
> This patch completes the task.
> It is a "two file" patch, so if this will work too, I'm confident.
>
> Stefan
>
>> On Mon, Nov 30, 2009 at 12:22 PM, Stefan Trcek <wzzelfzzel <at> abas.de>
> wrote:
>> > On Monday 30 November 2009 14:24:20 Michael McCandless wrote:
>> >> I agree, it's silly we label things like TopDocs/TopFieldDocs as
>> >> expert -- they are no longer for "low level" APIs (or, perhaps
>> >> since we've removed the "high level" API (= Hits), what remains
>> >> should no longer be considered low level).
>> >>
>> >> Do you wanna cough up a patch to correct these?
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe <at> lucene.apache.org
> For additional commands, e-mail: java-user-help <at> lucene.apache.org

Stefan Trcek | 1 Dec 2009 11:31

Re: What does "out of order" mean?

On Tuesday 01 December 2009 11:07:41 Michael McCandless wrote:
> OK -- none of IndexSearcher's search methods needed tweaking?  Just
> TopDocs/TopFieldDocs?

Yes. These two methods in Searcher are sufficient:

TopDocs Searcher.search(Query query, Filter filter, int n)
TopFieldDocs Searcher.search(Query query, Filter filter, int n, Sort sort)

Stefan

Michael McCandless | 1 Dec 2009 12:33

Re: What does "out of order" mean?

Super, thanks.  I'll commit your patch, fixing javadocs for
TopDocs/TopFieldDocs.

Mike

On Tue, Dec 1, 2009 at 5:31 AM, Stefan Trcek <wzzelfzzel <at> abas.de> wrote:
> On Tuesday 01 December 2009 11:07:41 Michael McCandless wrote:
>> OK -- none of IndexSearcher's search methods needed tweaking?  Just
>> TopDocs/TopFieldDocs?
>
> Yes, you can use these methods in Searcher, they are sufficient:
>
> TopDocs Searcher.search(Query query, Filter filter, int n)
> TopFieldDocs Searcher.search(Query query, Filter filter, int n, Sort
> sort)
>
> Stefan
Michael McCandless | 1 Dec 2009 14:15

Re: What does "out of order" mean?

OK, I committed this, plus further removals of "expert" from TopDocs, to
trunk (future 3.1) and to the 2.9 and 3.0 branches.

Thanks!

Mike

Stefan Trcek | 1 Dec 2009 15:11

Re: What does "out of order" mean?

On Monday 30 November 2009 18:51:34 Nick Burch wrote:
> On Mon, Nov 30, 2009 at 12:22 PM, Stefan Trcek <wzzelfzzel <at> abas.de> wrote:
> > I would, but I did not manage to get the svn repo some months ago.
> > I have to ask the sysadmin to open a door through the firewall for
> > any svn repo. Gave up due to
> >
> > $ nmap -p3690 svn.apache.org
> >     PORT     STATE    SERVICE
> >     3690/tcp filtered unknown
>
> Apache svn doesn't use the svnserve protocol, it uses plain old HTTP
> (or HTTPS for committers), so you only need port 80 access, and that
> should be open everywhere.
>
> You can get the svn url, and the appropriate commandline, from:
>  	http://lucene.apache.org/java/docs/developer-resources.html

Thanks, I got it and it works.
I just talked to a colleague who is familiar with svn: the patch formats
of git and svn seem to be the same. As I've never used svn but use git
regularly, I can stay with git.

Stefan

Robert Muir | 1 Dec 2009 19:43

Re: LowerCaseFilter fails one letter (I) of Turkish alphabet

Hi Ahmet,

After thinking about what Shai brought up, I changed my mind: it is not good
enough that Collation is our only way to solve this, because you might want
Turkish stemming too, and right now there is no way for the included Snowball
Turkish stemmer to work.
I really do not like this.

So as much as I want to reduce clutter and not have lots of filters for
problems that can be solved in a general way with Unicode, I think this is
one case where the best solution for now would be to have a Turkish-specific
lowercase filter...

I don't think we have to use String for this either; we can just apply rules
to the two uppercase I's, and lowercase everything else.
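
The rule described above can be sketched in plain Java as follows. This is
only an illustration under stated assumptions: the names
TurkishLowerCaseSketch and turkishLowerCase are hypothetical and are not
Lucene API, the sketch works on a String for brevity (a real TokenFilter
would presumably operate on the term buffer instead), and it ignores the
case of an uppercase I followed by a combining dot above.

```java
public class TurkishLowerCaseSketch {

    // Hypothetical helper, not part of Lucene: applies the Turkish mapping
    // to the two uppercase I's and the default Unicode lowercase mapping to
    // every other code point.
    static String turkishLowerCase(String input) {
        StringBuilder out = new StringBuilder(input.length());
        for (int i = 0; i < input.length(); ) {
            int cp = input.codePointAt(i);
            i += Character.charCount(cp);
            if (cp == 0x0049) {
                // LATIN CAPITAL LETTER I -> LATIN SMALL LETTER DOTLESS I
                out.append((char) 0x0131);
            } else if (cp == 0x0130) {
                // LATIN CAPITAL LETTER I WITH DOT ABOVE -> plain 'i'
                out.append('i');
            } else {
                out.appendCodePoint(Character.toLowerCase(cp));
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // The locale-independent mapping turns 'I' into 'i', which is the
        // behaviour reported as wrong for Turkish text.
        System.out.println(Character.toLowerCase('I') == 'i');
        // The sketch maps 'I' to the dotless i (U+0131) instead.
        System.out.println(turkishLowerCase("I").equals("\u0131"));
    }
}
```

Handling only the two special code points keeps everything else on the
general Unicode path, which matches the goal of not duplicating more
lowercasing logic than necessary.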

Will you open an issue?

On Mon, Nov 30, 2009 at 2:00 PM, AHMET ARSLAN <iorixxx <at> yahoo.com> wrote:

> In the Turkish alphabet, the lowercase of I is not i; it is LATIN SMALL LETTER
> DOTLESS I. LowerCaseFilter, which uses Character.toLowerCase(), gets just
> that character wrong.
>
> http://java.sun.com/javase/6/docs/api/java/lang/String.html#toLowerCase()
>
> I am not sure whether it is worth adding a new TokenFilter for the Turkish
> language. I see that GreekLowerCaseFilter and RussianLowerCaseFilter already
> exist. It would be nice to see a TurkishLowerCaseFilter in Lucene.
>

AHMET ARSLAN | 1 Dec 2009 21:35

Re: LowerCaseFilter fails one letter (I) of Turkish alphabet

> Hi Ahmet,
>
> After thinking about what Shai brought up, I changed my mind and think it is
> not good enough that we only have Collation as a way to solve this.
> Because you might want turkish stemming too, and right now there is no way
> for the included snowball turkish stemmer to work.
> I really do not like this.
>
> So as much as I want to reduce clutter and not have lots of filters that can
> be solved in a general way with unicode, I think this is one case
> where the best solution for now would be to have a turkish-specific
> lowercasefilter...
>
> I don't think we have to use String for this either, we can just apply rules
> to the two uppercase I's, and lowercase everything else.
>
> Will you open an issue?

Of course. Here it is: https://issues.apache.org/jira/browse/LUCENE-2102
I can add more test cases if you want.
Thank you for your interest.

Otis Gospodnetic | 1 Dec 2009 23:09

Re: Need help regarding implementation of autosuggest using jquery

Hi,

Have a look at http://www.sematext.com/products/autocomplete/index.html

It handles Chinese and large volumes of data.

 Otis
--
Sematext -- http://sematext.com/ -- Solr - Lucene - Nutch

----- Original Message ----
> From: fulin tang <tangfulin <at> gmail.com>
> To: java-user <at> lucene.apache.org
> Sent: Thu, November 26, 2009 9:10:41 PM
> Subject: Re: Need help regarding implementation of autosuggest using jquery
> 
> By the way, we search Chinese words, so a trie does not look ideal
> for us either.
> 
> 
> 2009/11/27 fulin tang :
> > We have the same needs in our music search, and we found this is not a
> > good approach for performance reasons.
> >
> > Does anyone have experience implementing autosuggestion in a heavy
> > production environment?
> > Any suggestions?
> >
> >
> > 2009/11/26 Anshum :

Weiwei Wang | 1 Dec 2009 23:23

Re: Need help regarding implementation of autosuggest using jquery

Hi, dudes,

I finished a search suggestion module a few months ago.

The approach is as follows:
1. Log all your search keywords, or retrieve all the segmented terms that
can be searched
2. Index all the keywords and/or terms using N-grams
3. Search this index with the same analyzer, based on the user's input
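
The three steps above can be sketched in plain Java without Lucene. This is
a minimal in-memory illustration, assuming edge N-grams (prefixes) with a
minimum length of 2 and a maximum of 5; the class EdgeNGramSuggestSketch and
its methods are hypothetical names, not the code of the project mentioned in
this thread and not the Lucene EdgeNGramTokenFilter itself.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

public class EdgeNGramSuggestSketch {
    private final int minGram;
    private final int maxGram;
    // Maps each edge N-gram (prefix) to the keywords that start with it.
    private final Map<String, Set<String>> index = new HashMap<>();

    EdgeNGramSuggestSketch(int minGram, int maxGram) {
        this.minGram = minGram;
        this.maxGram = maxGram;
    }

    // Returns the prefixes of term whose length lies in [minGram, maxGram].
    static List<String> edgeNGrams(String term, int minGram, int maxGram) {
        List<String> grams = new ArrayList<>();
        int upper = Math.min(maxGram, term.length());
        for (int len = minGram; len <= upper; len++) {
            grams.add(term.substring(0, len));
        }
        return grams;
    }

    // Steps 1 and 2: take a logged keyword and index it under its N-grams.
    void add(String keyword) {
        for (String gram : edgeNGrams(keyword, minGram, maxGram)) {
            index.computeIfAbsent(gram, k -> new TreeSet<>()).add(keyword);
        }
    }

    // Step 3: look the user input up with the same N-gram limits, then keep
    // only candidates that really start with the full input.
    List<String> suggest(String input) {
        if (input.length() < minGram) {
            return Collections.emptyList();
        }
        String key = input.substring(0, Math.min(input.length(), maxGram));
        List<String> result = new ArrayList<>();
        for (String candidate : index.getOrDefault(key, Collections.<String>emptySet())) {
            if (candidate.startsWith(input)) {
                result.add(candidate);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        EdgeNGramSuggestSketch suggester = new EdgeNGramSuggestSketch(2, 5);
        suggester.add("lucene");
        suggester.add("lucid");
        suggester.add("solr");
        System.out.println(suggester.suggest("lu"));
    }
}
```

Capping the gram length bounds the index size per keyword; the final
startsWith check covers inputs longer than the largest indexed gram. The
sketch slices by Java char, so text outside the Basic Multilingual Plane
would need code-point-aware slicing.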

You can find an example in Lucene's contrib modules (spellchecker), or you
can find the source code of our project here:
http://code.google.com/p/askrosa/source/browse/trunk/RosaCrawler/src/autocomplete/AutoCompleter.java
This code is not really up to date: the parameters of the EdgeNGramTokenFilter
should be 2 and 5, or some other values that are neither too small nor too
large. I recommend you read something about N-grams (don't worry, they are
very easy to understand).

You are free to modify the code and redeploy it.

Hope you will enjoy the search tools you are building.

On Wed, Dec 2, 2009 at 6:09 AM, Otis Gospodnetic <otis_gospodnetic <at> yahoo.com> wrote:

> Hi,
>
> Have a look at http://www.sematext.com/products/autocomplete/index.html
>
> It handles Chinese and large volumes of data.