1 Dec 2010 02:10
Re: deprecating Versions
Marvin Humphrey <marvin <at> rectangular.com>
2010-12-01 01:10:45 GMT
On Mon, Nov 29, 2010 at 05:34:27AM -0500, Robert Muir wrote:
> Is it somehow possible i could convince everyone that all the analyzers we
> provide are simply examples? This way we could really make this a bit more
> reasonable and clean up a lot of stuff.

I understand what you're getting at. We don't really expect people to fork an
analyzer code base, though -- so we need to draw a line between e.g. the code
that implements StopFilter and stoplist content. We want the low-level code to
be part of the library, but maybe we want stoplist content to be considered
example code.

> Seems like we really want to move towards a more declarative model where
> these are just config files... so only then it will ok for us to change them
> because they suddenly aren't suffixed with .java?!

Consider how this might work with e.g. RussianAnalyzer. The
declaratively-expressed sample analyzer config could contain a hard-coded list
of Russian stop words, and as this hard-coded stoplist would travel with the
index in a config file, there would be no index compatibility problems upon
upgrading Lucene. The stoplist in the sample config could change, even on
bugfix releases.

Config file syntax would potentially be affected by a Lucene upgrade, but that
doesn't affect index content and maintaining back compat is straightforward.

Things are more difficult with versioning e.g. stemmers, but I think the
stoplist example illustrates the potential of declarative analyzer
specification. Maybe specifying Version in a sample file and dispatching to
different revs of a Snowball stemmer is less painful than forcing a user to
figure out Version from API documentation?
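To make the stoplist idea concrete, here is a rough sketch of a stoplist that lives in a config file rather than in library code. It is only an illustration and deliberately avoids Lucene's own classes: the class name, the `analyzer.stopwords` key, and the properties-style syntax are all assumptions of mine, not an existing or proposed Lucene format. English words stand in for the Russian stoplist to keep the sample ASCII-only.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Locale;
import java.util.Properties;
import java.util.Set;

// Hypothetical sketch: the stoplist is data stored alongside the index,
// so changing the library's sample stoplist cannot alter an existing index.
final class DeclarativeAnalyzerSketch {

    // Invented config format; in practice this text would be read from a
    // file kept in the index directory.
    static final String SAMPLE_CONFIG =
        "analyzer.name = sample_english\n" +
        "analyzer.stopwords = the, a, of, and\n";

    // Parse the comma-separated stoplist out of the config text.
    static Set<String> loadStopwords(String configText) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(configText));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        Set<String> stopwords = new LinkedHashSet<>();
        String raw = props.getProperty("analyzer.stopwords", "");
        for (String word : raw.split(",")) {
            String trimmed = word.trim().toLowerCase(Locale.ROOT);
            if (!trimmed.isEmpty()) {
                stopwords.add(trimmed);
            }
        }
        return stopwords;
    }

    // Generic stop-word removal; this stays in the library and never needs
    // to know which words a particular index was built with.
    static List<String> filter(List<String> tokens, Set<String> stopwords) {
        List<String> kept = new ArrayList<>();
        for (String token : tokens) {
            if (!stopwords.contains(token.toLowerCase(Locale.ROOT))) {
                kept.add(token);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        Set<String> stopwords = loadStopwords(SAMPLE_CONFIG);
        List<String> tokens = Arrays.asList("the", "index", "of", "lucene");
        System.out.println(filter(tokens, stopwords));
    }
}
```

The point of the split is that `filter` is library code while the word list is per-index data: the same stoplist is used at index time and at search time for as long as the config file stays with the index, whatever the shipped sample says after an upgrade.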