Shah, Vineel | 2 Jan 17:08 2003

RE: custom scoring api questions

I tried this out and it works perfectly. I'm loving Lucene! Thanks Doug!

-----Original Message-----
From: Doug Cutting [mailto:cutting <at> lucene.com]
Sent: Tuesday, December 31, 2002 12:40 PM
To: Lucene Developers List
Subject: Re: custom scoring api questions

Shah, Vineel wrote:
> Here's what I'm trying to do:
> A query that looks for "java unix windows" in the "keywords" field of an index.
> 
> If the document has "java unix", the score is .66..., regardless of any other factor. I want 1.0 for all
> three, .33... for just one, and no hit for none.

This is easy to do with the Similarity API.  Just define all of the 
methods to return 1.0, except queryNorm(), which, for your purposes, 
should return the inverse of the value passed.

I've attached a demonstration that implements the scoring you desire.

Doug
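
The attachment is not preserved in this archive. As a rough illustration of the arithmetic Doug describes, here is a minimal stand-alone sketch in plain Java. It is not his demonstration and it does not use Lucene's classes: the class name `FlatScoringSketch`, the `score` helper, and the way the factors are multiplied together are assumptions made for illustration only.

```java
import java.util.Set;

/** Stand-alone model of the scoring arithmetic; hypothetical, not Lucene code. */
public class FlatScoringSketch {

    // Every factor is pinned to 1.0 so that term frequency, term rarity,
    // field length and coordination cannot influence the score.
    static float tf(float freq) { return 1.0f; }
    static float idf(int docFreq, int numDocs) { return 1.0f; }
    static float lengthNorm(int numTokens) { return 1.0f; }
    static float coord(int overlap, int maxOverlap) { return 1.0f; }

    // The one exception: return the inverse of the value passed in.
    static float queryNorm(float sumOfSquaredWeights) {
        return 1.0f / sumOfSquaredWeights;
    }

    /** Assumed, simplified combination of the factors for a flat OR-query. */
    static float score(String[] queryTerms, Set<String> docTerms) {
        // Each query term has weight 1.0, so the sum of squares equals the
        // number of query terms and queryNorm becomes 1 / termCount.
        float sumOfSquaredWeights = 0.0f;
        for (String term : queryTerms) {
            float weight = idf(1, 1);
            sumOfSquaredWeights += weight * weight;
        }
        float norm = queryNorm(sumOfSquaredWeights);

        // Every matching term contributes the same normalized amount.
        int matched = 0;
        float total = 0.0f;
        for (String term : queryTerms) {
            if (docTerms.contains(term)) {
                matched++;
                total += tf(1.0f) * idf(1, 1) * norm * lengthNorm(1);
            }
        }
        return total * coord(matched, queryTerms.length);
    }
}
```

With the three-term query from the original question, a document containing "java" and "unix" gets 1/3 + 1/3, roughly 0.67; a document with all three terms gets about 1.0, and a document with none of them scores 0, which corresponds to no hit.
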
Shah, Vineel | 2 Jan 17:19 2003

RE: custom scoring api questions

It's not a bad thought. We're using Oracle and usually would just do a SQL query and let Oracle indices take
care of the searching. However:

1. One of the fields is a clob with >16k of text per entry. We're using Oracle's Context, which has proven
unreliable and slow.
2. We have to search on a normalized data structure. Each parent row may have 10-100 child rows. There may be
up to 200,000 parent rows. There may be 60 query terms to look for in the child rows. In my inherited
codebase, the query does 60 joins against the child table for each parent. Needless to
say, the web page times out before the search is done. Our users are understandably frustrated.

And so, it seems worthwhile to use a separate search engine and sync the database contents to it. I looked at
Lucene because it is open source, in Java, low overhead, and fast. So far, I'm extremely pleased with the results!

vineel

-----Original Message-----
From: Leo Galambos [mailto:galambos <at> com-os2.ms.mff.cuni.cz]
Sent: Tuesday, December 31, 2002 5:16 PM
To: Lucene Developers List
Subject: Re: custom scoring api questions

On Mon, 30 Dec 2002, Shah, Vineel wrote:

> I've been developing a search function with Lucene for a couple of weeks
> (it's wonderful!) I've run into a snag-- the way I need to calculate
> scores seems to have nothing to do with Lucene's scoring paradigm. I
> think this is because I'm doing a database-oriented search instead of a
> document-oriented one.

Isn't it better to use RDBMS with B+? I am not sure if a fulltext module

Leo Galambos | 2 Jan 22:59 2003

RE: custom scoring api questions

> It's not a bad thought. We're using Oracle and usually would just do a
> SQL query and let Oracle indices take care of the searching. However:
> 
> 1. One of the fields is a clob with >16k of text per entry. We're using
> Oracle's Context, which has proven unreliable and slow.

Right, you would parse it into tokens. That can be done by many open-source 
projects (e.g. mnoGoSearch) that use an RDBMS as a backend. It would be a 
faster solution for you, I think.

> search on a normalized data structure. Each parent row may have 10-100
> child rows. There may be up to 200,000 parent rows. There may be 60 query
> terms to look for in the child rows. In my inherited codebase, the query
> does 60 joins against the child table for each parent. Needless to say,
> the web page times out before the search is done. Our users are

It sounds like a bug in DB design IMHO.

-g-
Lukas Zapletal | 3 Jan 11:11 2003

Directory implementation in a ZIP file via HTTP (read-only)

Dears

I made a new class, a new implementation of a Directory as a ZIP file via 
HTTP.

This is very useful for people who need to use Lucene in (unsigned) applets.

The problem is that the source is tested only with Java 2 1.4.1; I don't use 
Java 1 any more. Can anybody test it?

How or where can I send java source?

-- 
Lukas Zapletal
http://www.tanecni-olomouc.cz/lzap
lzap <at> root.cz
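
Lukas's source is not included in this archive. To illustrate the general idea only, a read-only store of named files backed by a ZIP stream can be sketched roughly as below. This is not his class and it does not implement Lucene's Directory: the class name `ZipBackedFiles` and its method names are hypothetical, and reading the whole archive into memory up front is an assumption about one workable design.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.InputStream;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;

/** Hypothetical read-only file store that loads every ZIP entry into memory. */
public class ZipBackedFiles {
    private final Map<String, byte[]> files = new LinkedHashMap<>();

    /** Reads all entries from the given ZIP stream, then closes it. */
    public ZipBackedFiles(InputStream zipStream) throws IOException {
        try (ZipInputStream zin = new ZipInputStream(zipStream)) {
            byte[] buffer = new byte[4096];
            ZipEntry entry;
            while ((entry = zin.getNextEntry()) != null) {
                if (entry.isDirectory()) {
                    continue; // only regular files are kept
                }
                ByteArrayOutputStream out = new ByteArrayOutputStream();
                int n;
                while ((n = zin.read(buffer)) != -1) {
                    out.write(buffer, 0, n);
                }
                files.put(entry.getName(), out.toByteArray());
            }
        }
    }

    /** Names of all files found in the archive, in archive order. */
    public String[] list() {
        return files.keySet().toArray(new String[0]);
    }

    public boolean fileExists(String name) {
        return files.containsKey(name);
    }

    public long fileLength(String name) throws FileNotFoundException {
        return bytesOf(name).length;
    }

    /** Opens a fresh in-memory stream over the named file. */
    public InputStream openFile(String name) throws FileNotFoundException {
        return new ByteArrayInputStream(bytesOf(name));
    }

    private byte[] bytesOf(String name) throws FileNotFoundException {
        byte[] data = files.get(name);
        if (data == null) {
            throw new FileNotFoundException(name);
        }
        return data;
    }
}
```

The constructor accepts any `InputStream`, so the HTTP side could be as simple as passing the stream returned by `java.net.URL.openStream()` for wherever the archive is served; the store has no write methods, which matches the read-only use described above.
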
Andrew C. Oliver | 4 Jan 03:03 2003

http://jakarta.apache.org/lucene/docs/luceneplan.html?

Does anyone mind if I delete this: 
http://jakarta.apache.org/lucene/docs/luceneplan.html?  It's been far 
surpassed.  After I saw LARM and the rest of the sandbox taking off so 
nicely without me, I kinda gave it up and figured I'd just come back and 
write examples and guides for LARM in time.

If you're using this and don't want it deleted let me know now, or 
forever hold your peace.

Thanks,

-Andy
Otis Gospodnetic | 4 Jan 07:34 2003

http://jakarta.apache.org/lucene/docs/luceneplan.html?

May I suggest that you only remove a link to it and leave
luceneplan.xml in the repository for a bit longer?
I've been meaning to do that, too.

Thanks,
Otis

--- "Andrew C. Oliver" <acoliver <at> apache.org> wrote:
> Does anyone mind if I delete this: 
> http://jakarta.apache.org/lucene/docs/luceneplan.html?  It's been far 
> surpassed.  After I saw LARM and the rest of the sandbox taking off so 
> nicely without me, I kinda gave it up and figured I'd just come back and 
> write examples and guides for LARM in time.
> 
> If you're using this and don't want it deleted let me know now, or 
> forever hold your peace.
> 
> Thanks,
> 
> -Andy

acoliver | 4 Jan 15:20 2003

cvs commit: jakarta-lucene/xdocs/stylesheets project.xml

acoliver    2003/01/04 06:20:38

  Modified:    xdocs/stylesheets project.xml
  Log:
  removed plan link

  Revision  Changes    Path
  1.17      +0 -4      jakarta-lucene/xdocs/stylesheets/project.xml

  Index: project.xml
  ===================================================================
  RCS file: /home/cvs/jakarta-lucene/xdocs/stylesheets/project.xml,v
  retrieving revision 1.16
  retrieving revision 1.17
  diff -u -r1.16 -r1.17
  --- project.xml	4 Dec 2002 05:56:32 -0000	1.16
  +++ project.xml	4 Jan 2003 14:20:38 -0000	1.17
  @@ -32,10 +32,6 @@
           <item name="Lucene Sandbox"    href="/lucene-sandbox/"/>
       </menu>

  -    <menu name="Plans">
  -        <item name="Application Extensions"           href="/luceneplan.html"/>
  -    </menu>
  -
       <menu name="Download">
           <item name="Binaries"           href="/site/binindex.html"/>
           <item name="Source Code"        href="/site/sourceindex.html"/>
Otis Gospodnetic | 4 Jan 17:23 2003

Re: Directory implementation in a ZIP file via HTTP (read-only)

Hello Lukas,

I suggest you just attach your class (or a set of classes zipped) to a
message and send it to this mailing list, lucene-dev.

I know a few people have inquired about the ability to search a
zipped/jarred index.  Hello Erik :)

Thanks,
Otis

--- Lukas Zapletal <lzap <at> root.cz> wrote:
> Dears
> 
> I made a new class, a new implementation of a Directory as a ZIP file
> via HTTP.
> 
> This is very useful for people who need to use Lucene in (unsigned)
> applets.
> 
> The problem is that the source is tested only with Java 2 1.4.1; I don't
> use Java 1 any more. Can anybody test it?
> 
> How or where can I send java source?
> 
> -- 
> Lukas Zapletal

otis | 4 Jan 17:29 2003

cvs commit: jakarta-lucene/docs/lucene-sandbox/larm overview.html

otis        2003/01/04 08:29:08

  Modified:    xdocs    powered.xml
               docs     benchmarks.html contributions.html demo.html
                        demo2.html demo3.html demo4.html fileformats.html
                        gettingstarted.html index.html luceneplan.html
                        powered.html queryparsersyntax.html resources.html
                        todo.html whoweare.html
               docs/lucene-sandbox index.html
               docs/lucene-sandbox/indyo tutorial.html
               docs/lucene-sandbox/larm overview.html
  Log:
  - Updated (no kidding).

  Revision  Changes    Path
  1.15      +1 -0      jakarta-lucene/xdocs/powered.xml

  Index: powered.xml
  ===================================================================
  RCS file: /home/cvs/jakarta-lucene/xdocs/powered.xml,v
  retrieving revision 1.14
  retrieving revision 1.15
  diff -u -r1.14 -r1.15
  --- powered.xml	19 Sep 2002 12:34:23 -0000	1.14
  +++ powered.xml	4 Jan 2003 16:29:07 -0000	1.15
  @@ -26,6 +26,7 @@
   <li><a href="http://www.rockynewsgroup.org/">RockyNewsgroup.org</a></li>
   <li><a href="http://scarab.tigris.org/">Scarab Issue Tracking</a></li>
   <li><a href="http://yazd.yasna.com/">Yazd Discussion Forum Software</a></li>
  +<li><a href="http://guests.evectors.it/zoe/">Zoe</a></li>

otis | 4 Jan 17:38 2003

cvs commit: jakarta-lucene/src/java/org/apache/lucene/analysis LowerCaseFilter.java

otis        2003/01/04 08:38:39

  Modified:    src/java/org/apache/lucene/analysis LowerCaseFilter.java
  Log:
  - Import stmt, @version + CVS Id tag.

  Revision  Changes    Path
  1.3       +7 -2      jakarta-lucene/src/java/org/apache/lucene/analysis/LowerCaseFilter.java

  Index: LowerCaseFilter.java
  ===================================================================
  RCS file: /home/cvs/jakarta-lucene/src/java/org/apache/lucene/analysis/LowerCaseFilter.java,v
  retrieving revision 1.2
  retrieving revision 1.3
  diff -u -r1.2 -r1.3
  --- LowerCaseFilter.java	9 Dec 2002 19:02:20 -0000	1.2
  +++ LowerCaseFilter.java	4 Jan 2003 16:38:39 -0000	1.3
  @@ -54,14 +54,19 @@
    * <http://www.apache.org/>.
    */

  -/** Normalizes token text to lower case. */
  +import java.io.IOException;

  +/**
  + * Normalizes token text to lower case.
  + *
  + * @version $Id$
  + */
   public final class LowerCaseFilter extends TokenFilter {