Seeta Somagani | 1 Jul 01:06 2006

RE: Null field values

Hi Erick,

The missing fields are effectively primary keys, and they exist in
all the documents (including those returned in my search
results) when I browse the index using Luke. The field
names match exactly, including case. I never get the three
field values, no matter what I'm searching for.

Thanks,
Seeta

-----Original Message-----
From: Erick Erickson [mailto:erickerickson <at> gmail.com] 
Sent: Friday, June 30, 2006 6:55 PM
To: java-user <at> lucene.apache.org
Subject: Re: Null field values

There is no requirement that every document contain values for every field.
Doc A could have fields z, y, x, and Doc B could have fields x, w, v. So,
when you say "some of the values are being returned as null", do you mean
that you *never* get any values for some field or you get values for a field
for some, but not all documents?

You might try using Luke to look at the specific document that's giving you
problems and see if it has values for all fields. You can enter the

markharw00d | 1 Jul 01:10 2006

Re: Any existing query types that support equivalent of "-not interested" ?


>Maybe this:
>
>SpanNotQuery(interested, SpanNearQuery(not,interested))
>
>with a SpanTermQuery for each term?
>  
>

Thanks, Paul. This is working well for me and I can happily use multiple 
SpanTermQueries embedded in a SpanOrQuery in place of each of the single 
words in your example.

SpanNotQuery(
	SpanOrQuery(interested, curious, ...),
	SpanNearQuery(
		SpanOrQuery(not, wasnt, isnt, ...),
		SpanOrQuery(interested, curious, ...)
	)
)
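
For readers without an index at hand, here is a rough plain-Java sketch of what the nested query above is meant to compute. It does not use Lucene at all: the class name, the word lists and the `slop` parameter are assumptions made for illustration, and it assumes the near clause is ordered (negation word first). An "interest" word is kept only if no negation word occurs within `slop` intervening tokens before it.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class SpanNotSketch {
    // Hypothetical word lists standing in for the two SpanOrQuery clauses.
    static final Set<String> INTEREST = new HashSet<>(Arrays.asList("interested", "curious"));
    static final Set<String> NEGATION = new HashSet<>(Arrays.asList("not", "wasnt", "isnt"));

    /**
     * Returns the token positions of interest words that are NOT preceded by a
     * negation word with at most `slop` other tokens in between.
     */
    static List<Integer> unnegatedPositions(String[] tokens, int slop) {
        List<Integer> hits = new ArrayList<>();
        for (int p = 0; p < tokens.length; p++) {
            if (!INTEREST.contains(tokens[p])) {
                continue;
            }
            boolean negated = false;
            // Look back over the slop + 1 positions directly before p.
            for (int q = Math.max(0, p - slop - 1); q < p; q++) {
                if (NEGATION.contains(tokens[q])) {
                    negated = true;
                    break;
                }
            }
            if (!negated) {
                hits.add(p);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        String[] doc = "i am not interested but she is curious".split(" ");
        System.out.println(unnegatedPositions(doc, 0));
    }
}
```

In a real index the same effect comes from the span queries themselves; this only illustrates the include/exclude logic on a single tokenized document.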

	
	
		
Dominik Bruhn | 1 Jul 01:10 2006

Sorting & SQL-Database

Hi,
I use Lucene to index a SQL table which contains three fields: an index (id) field,
the text to search in, and another field. When adding a Lucene document I let
Lucene index the search field and also store the id along with it in the
Lucene index.

Upon searching I collect all ids and join them into a Java string with commas in
between to issue a SQL query like this one:

SELECT id,addfield FROM table WHERE id IN ([LUCENERESULT]);

where LUCENERESULT is something like 2,3,19,3,5.

This works fine but has one problem: the Lucene search result is ordered by
relevance, so the id list is also ordered by relevance. But the result of
the SQL query is sorted by id, which destroys the relevance ordering.

Does anybody know a workaround?

Thanks
--

-- 
Dominik Bruhn
mailto: dominik <at> dbruhn.de
http://www.dbruhn.de
Amit | 1 Jul 09:37 2006

RE: how Boolean query work internally in lucene

Hi All,

I would like to know how Lucene processes a Boolean query internally.

As far as I understand:

   Say I search for "java apache".
   Note: assume I want documents that contain both words, and I
constructed a Boolean query for that (i.e. +java +apache).

   Please correct me if I am wrong about how Lucene processes this query.
   First it finds all documents for java, then for apache, and after that
it takes the intersection of these two sets. Is that right?
   If so, I would like to know where Lucene takes the intersection and how
it processes the query.

 I would appreciate any response.

  Thanks in advance.

Thanks and Regards,
Amit
karl wettin | 1 Jul 10:09 2006

Re: Sorting & SQL-Database

On Sat, 2006-07-01 at 01:10 +0200, Dominik Bruhn wrote:

> SELECT id,addfield FROM table WHERE id IN ([LUCENERESULT]);
> 
> Where LUCENERESULT is like 2,3,19,3,5.
> 
> This works fine but got one problem: The Search-Result of Lucene is order by 
> relevance and so the id-list is also sorted by relevance. But the result of 
> the SQL-Query is sorted by the id which destroys the relevance-sorting.
> 
> Does anybody know a work-arround?

This is really a question you should ask in the forum of your RDBMS. You
could always execute multiple SQL-queries within the same statement
without too much loss. But I'm certain there is a way to enforce the
order as you specified it in the WHERE-clause.
Aleksander M. Stensby | 1 Jul 10:27 2006

Re: Sorting & SQL-Database

Well, in many database systems, if you don't specify a
sort, you happen to get the results sorted by id or in the order the rows
were inserted into the db, but SQL guarantees no particular order without
an ORDER BY.

The quickest way for you is to work around your query.
Instead of doing one query, just do separate queries with equals. This would
produce a bit of overhead if you have your database on an external server,
but then again, equality lookups on a key are among the quickest queries to
run in SQL. Since the ID is unique (I hope so), it should be no problem at all.
... WHERE id = 2;
... WHERE id = 3;
... WHERE id = 19;
and so on would give you the correct relevance order.
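
As an alternative to one query per id, the single IN (...) query can be kept and the rows put back into relevance order on the Java side. The following is only a sketch under assumptions: the `Row` class, the class name and the method name are made up for illustration, and the rows are taken to be already read from the result set. Duplicate ids in the Lucene list (such as the repeated 3 in the example) are kept at their first position only.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;

class RelevanceOrder {
    /** Hypothetical holder for the two selected columns (id, addfield). */
    static final class Row {
        final int id;
        final String addfield;

        Row(int id, String addfield) {
            this.id = id;
            this.addfield = addfield;
        }
    }

    /**
     * Re-orders rows (in whatever order the database returned them) so that
     * they follow rankedIds, the ids in Lucene's relevance order.
     * Ids without a matching row are skipped.
     */
    static List<Row> inRelevanceOrder(List<Integer> rankedIds, List<Row> rows) {
        Map<Integer, Row> byId = new HashMap<>();
        for (Row row : rows) {
            byId.put(row.id, row);
        }
        List<Row> ordered = new ArrayList<>();
        // LinkedHashSet drops repeated ids while preserving first-seen order.
        for (Integer id : new LinkedHashSet<>(rankedIds)) {
            Row row = byId.get(id);
            if (row != null) {
                ordered.add(row);
            }
        }
        return ordered;
    }

    public static void main(String[] args) {
        List<Row> fromDb = Arrays.asList(
                new Row(2, "a"), new Row(3, "b"), new Row(5, "c"), new Row(19, "d"));
        for (Row row : inRelevanceOrder(Arrays.asList(2, 3, 19, 3, 5), fromDb)) {
            System.out.println(row.id + " " + row.addfield);
        }
    }
}
```

This costs one round trip to the database plus a hash lookup per id, which may matter when the database sits on an external server.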

On Sat, 01 Jul 2006 10:09:01 +0200, karl wettin <kalle <at> snigel.net> wrote:

> On Sat, 2006-07-01 at 01:10 +0200, Dominik Bruhn wrote:
>
>> SELECT id,addfield FROM table WHERE id IN ([LUCENERESULT]);
>>
>> Where LUCENERESULT is like 2,3,19,3,5.
>>
>> This works fine but got one problem: The Search-Result of Lucene is order by
>> relevance and so the id-list is also sorted by relevance. But the result of
>> the SQL-Query is sorted by the id which destroys the relevance-sorting.
>>
>> Does anybody know a work-arround?
>

Paul Elschot | 1 Jul 10:37 2006

Re: Any existing query types that support equivalent of "-not interested" ?

On Saturday 01 July 2006 01:10, markharw00d wrote:
> 
> >Maybe this:
> >
> >SpanNotQuery(interested, SpanNearQuery(not,interested))
> >
> >with a SpanTermQuery for each term?
> >  
> >
> 
> Thanks, Paul. This is working well for me and I can happily use multiple 
> SpanTermQueries embedded in a SpanOrQuery in place of each of the single 
> words in your example.

I've never tried it myself, so it's good to hear that it actually works...

> 
> SpanNotQuery(
> 	SpanOrQuery(interested,curious...) 
> 	SpanNearQuery(
> 		SpanOrQuery(not,wasnt,isnt,...)
> 		SpanOrQuery(interested,curious...)
> 		)
> 	)

How about something like this to get rid of the duplicates in there:

SpanNotNearQuery(includeSpanQuery, excludeSpanQuery, distance, ordered)

?

Paul Elschot | 1 Jul 10:51 2006

Re: how Boolean query work internally in lucene

On Saturday 01 July 2006 09:37, Amit wrote:
> Hi All,
> 
> I just want to know how the lucene processes the Boolean query internally??
> 
> As per my knowledge:
> 
>    if I search for "java apache".

This is a PhraseQuery internally in Lucene.

>    Note: let consider i want documents that contents both words and i
> constructed boolean query for that (i.e. +java +apache).
> 
>    Please let me clear if i wrong how lucene process this query?
>    First it search for all document for java then for apache and after that
> it take the intersection of these to sets. is it?
>    if so where i want to know where lucene take the intersection and how
> process the query??

It's a mix of determining the documents per term and merging.
Both sets of documents are traversed in the order of the internal document
number (which is the order of addition to the index).
To determine the next matching document, each set of documents
advances to the document number at or after the current document of the
other set, until both sets are at the same document number.

The intersection is taken in ConjunctionScorer and the sets are
represented by a TermScorer, both viewable in this package:
http://svn.apache.org/viewvc/lucene/java/trunk/src/java/org/apache/lucene/search/
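
The advancing described above can be sketched in plain Java over two ascending arrays of document numbers. This is not the ConjunctionScorer source, just an illustration of the idea; the class and method names are made up, and the linear scan in `advance` stands in for whatever skipping the real scorers do.

```java
import java.util.ArrayList;
import java.util.List;

class LeapfrogIntersection {
    /**
     * Intersects two ascending lists of document numbers by letting each side
     * skip ahead to the other side's current document until they agree.
     */
    static List<Integer> intersect(int[] a, int[] b) {
        List<Integer> matches = new ArrayList<>();
        int i = 0;
        int j = 0;
        while (i < a.length && j < b.length) {
            if (a[i] < b[j]) {
                i = advance(a, i, b[j]);   // a catches up to b's current doc
            } else if (b[j] < a[i]) {
                j = advance(b, j, a[i]);   // b catches up to a's current doc
            } else {
                matches.add(a[i]);         // both sides are on the same doc
                i++;
                j++;
            }
        }
        return matches;
    }

    /** Returns the first index >= from whose doc number is >= target, or docs.length. */
    static int advance(int[] docs, int from, int target) {
        int k = from;
        while (k < docs.length && docs[k] < target) {
            k++;
        }
        return k;
    }

    public static void main(String[] args) {
        int[] java = {1, 4, 7, 9, 12};
        int[] apache = {2, 4, 9, 10, 12, 15};
        System.out.println(intersect(java, apache)); // prints [4, 9, 12]
    }
}
```

Neither full set is materialized first; a match is emitted as soon as both sides land on the same document number.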

Amit | 1 Jul 13:24 2006

RE: how Boolean query work internally in lucene

Thanks Paul for quick reply.

regards,
Amit

-----Original Message-----
From: Paul Elschot [mailto:paul.elschot <at> xs4all.nl]
Sent: Saturday, July 01, 2006 2:22 PM
To: java-user <at> lucene.apache.org
Subject: Re: how Boolean query work internally in lucene

On Saturday 01 July 2006 09:37, Amit wrote:
> Hi All,
>
> I just want to know how the lucene processes the Boolean query internally??
>
> As per my knowledge:
>
>    if I search for "java apache".

This is a PhraseQuery internally in Lucene.

>    Note: let consider i want documents that contents both words and i
> constructed boolean query for that (i.e. +java +apache).
>
>    Please let me clear if i wrong how lucene process this query?
>    First it search for all document for java then for apache and after that
> it take the intersection of these to sets. is it?