1 Sep 2008 04:27
[jira] Created: (LUCENE-1372) Proposal: introduce more sensible sorting when a doc has multiple values for a term
Paul Cowan (JIRA) <jira <at> apache.org>
2008-09-01 02:27:44 GMT
Proposal: introduce more sensible sorting when a doc has multiple values for a term
-----------------------------------------------------------------------------------
Key: LUCENE-1372
URL: https://issues.apache.org/jira/browse/LUCENE-1372
Project: Lucene - Java
Issue Type: Improvement
Components: Search
Affects Versions: 2.3.2
Reporter: Paul Cowan
Priority: Minor
At the moment, FieldCacheImpl produces somewhat disconcerting results when sorting on a field for which
multiple values exist for one document. For example, imagine a field "fruit" which is added to a document
multiple times, with the values as follows:
doc 1: {"apple"}
doc 2: {"banana"}
doc 3: {"apple", "banana"}
doc 4: {"apple", "zebra"}
If one sorts on the field "fruit", the loop in FieldCacheImpl.stringsIndexCache.createValue() (and
similarly for the other methods in the various FieldCacheImpl caches) does the following:
while (termDocs.next()) {
  retArray[termDocs.doc()] = t;
}
which means that we loop over the terms in their natural order and, for each one, overwrite retArray[doc]
with that term for every document containing it. Effectively, this overwriting means that a string sort in
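
To make the overwriting concrete, here is a minimal standalone sketch that mimics the loop using the "fruit" example. It does not use Lucene: the class OverwriteSketch, the methods buildCache and fruitPostings, and the TreeMap standing in for the term enumeration are all hypothetical, chosen only for illustration. The one assumption carried over from the description is that terms are visited in their natural order and each one overwrites retArray[doc].

```java
import java.util.Map;
import java.util.SortedMap;
import java.util.TreeMap;

public class OverwriteSketch {

    // Hypothetical stand-in for the cache-building loop: terms are visited in
    // sorted order, and each term overwrites the slot of every doc containing it.
    static String[] buildCache(SortedMap<String, int[]> postings, int maxDoc) {
        String[] retArray = new String[maxDoc];
        for (Map.Entry<String, int[]> entry : postings.entrySet()) {
            String t = entry.getKey();
            for (int doc : entry.getValue()) {
                retArray[doc] = t; // a later term replaces any earlier one
            }
        }
        return retArray;
    }

    // The example from the issue, inverted into term -> doc ids.
    static SortedMap<String, int[]> fruitPostings() {
        SortedMap<String, int[]> postings = new TreeMap<>();
        postings.put("apple", new int[] {1, 3, 4});
        postings.put("banana", new int[] {2, 3});
        postings.put("zebra", new int[] {4});
        return postings;
    }

    public static void main(String[] args) {
        // Slot 0 is unused so that indexes match the doc numbers in the example.
        String[] cache = buildCache(fruitPostings(), 5);
        for (int doc = 1; doc < cache.length; doc++) {
            System.out.println("doc " + doc + " -> " + cache[doc]);
        }
    }
}
```

Under these assumptions the sketch prints apple, banana, banana, zebra for docs 1 to 4: each slot ends up holding the last term, in term order, that the document contains.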