1 Jul 2009 02:40
Re: Order of fields within a Document in Lucene 2.4+
Chris Hostetter <hossman_lucene <at> fucit.org>
2009-07-01 00:40:11 GMT
Hmmm... i'm not an expert on the internals of indexing, and i don't use FieldSelectors much, but this seems like a pretty big bug to me ... or at the very least: a change in behavior that completely eliminates the value of LOAD_AND_BREAK.

https://issues.apache.org/jira/browse/LUCENE-1727

: The Lucene FAQ says...
:
: What is the order of fields returned by Document.fields()?
: * Fields are returned in the same order they were added to the document.
: (now getFields() as fields is deprecated)
:
: However I think this may no longer be the case in 2.4
:
: We are indexing documents in a specific order so that we can LOAD_AND_BREAK out of our FieldSelector as early as possible.
: i.e. we have typically 50 indexed fields for a document, but when we are loading results with .doc(), we know we only need 4 of them.
:
: So, our code ensures that these are added to the index first - and once the 4th field is loaded we break out of the selector.
:
: This speeds us up by an order of magnitude.
:
: However, we are finding that our field selector is processing fields in alphabetical order, not order of addition. This means that we'd have to rename our fields to 'aaa..' in order to guarantee they'd be ...
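
To make the cost of the reordering concrete, here is a small standalone sketch of the pattern described in the quoted message. It does not use Lucene itself: the `Result` enum only mirrors the names of Lucene's FieldSelectorResult values, and the selector class, the field names and the 4-of-50 layout are assumptions made up for illustration. It simply counts how many stored fields a selector has to look at before it can break, first in order of addition and then in alphabetical order.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

public class LoadAndBreakSketch {
    // Stand-in for Lucene's FieldSelectorResult (only the values needed here).
    enum Result { LOAD, LOAD_AND_BREAK, NO_LOAD }

    // Hypothetical field names: the 4 fields the search results actually need.
    static final Set<String> WANTED =
            new LinkedHashSet<>(Arrays.asList("id", "title", "url", "summary"));

    // Hypothetical selector: loads wanted fields, breaks once the last one is seen.
    static final class WantedFieldsSelector {
        private final Set<String> wanted;
        private int loaded = 0;

        WantedFieldsSelector(Set<String> wanted) { this.wanted = wanted; }

        Result accept(String fieldName) {
            if (!wanted.contains(fieldName)) return Result.NO_LOAD;
            loaded++;
            return loaded == wanted.size() ? Result.LOAD_AND_BREAK : Result.LOAD;
        }
    }

    // 50 fields in order of addition: the 4 wanted ones first, then 46 others.
    static List<String> additionOrder() {
        List<String> fields = new ArrayList<>(WANTED);
        for (int i = 0; i < 46; i++) {
            fields.add(String.format("attr%02d", i));
        }
        return fields;
    }

    // The same 50 fields, sorted by name.
    static List<String> alphabeticalOrder() {
        return new ArrayList<>(new TreeSet<>(additionOrder()));
    }

    // Counts how many fields are visited before the selector breaks.
    static int fieldsVisited(List<String> storedOrder, Set<String> wanted) {
        WantedFieldsSelector selector = new WantedFieldsSelector(wanted);
        int visited = 0;
        for (String name : storedOrder) {
            visited++;
            if (selector.accept(name) == Result.LOAD_AND_BREAK) break;
        }
        return visited;
    }

    public static void main(String[] args) {
        // prints "order of addition: 4"
        System.out.println("order of addition: " + fieldsVisited(additionOrder(), WANTED));
        // prints "alphabetical: 50"
        System.out.println("alphabetical: " + fieldsVisited(alphabeticalOrder(), WANTED));
    }
}
```

With these made-up names the wanted fields all sort after the "attr" fields, so the alphabetical case is the worst one: every field is visited before the break. That is why renaming the wanted fields to sort first ('aaa..') would be the only workaround if the order of addition is not preserved.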