1 Nov 2004 16:29
jaspq: dashed numerical values tokenized differently
Daniel Taurat <daniel.taurat <at> gaussvip.com>
2004-11-01 15:29:35 GMT
Hi,

I have just another stupid parser question: There seems to be a special handling of the dash sign "-" different from Lucene 1.2 at least in Lucene 1.4.RC3 StandardAnalyzer.

Examples (1.4RC3):

A document containing the string "dash-test" is matched by the following search expressions:

  dash test
  dash*
  dash-test

It is _not_ matched by the following search expressions:

  dash-*
  dash-t*

If the string after the dash consists of digits, the behavior is different. E.g., a document containing the string "dash-123" is matched by:

  dash*
  dash-*
  dash-123

It is not matched by:

  dash 123

Question: Is this, esp. the different behavior when parsing digits and characters, [...]
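
To make the pattern in the examples above concrete, here is a small Java sketch of one rule that would be consistent with them. It is only a toy model and not Lucene code: the class name `DashTokenSketch` and both methods are hypothetical. It rests on two assumptions: a dash-joined string containing a digit is kept as a single token while a purely alphabetic one is split at the dash, and a prefix expression such as `dash-*` is compared verbatim against the indexed tokens without being analyzed itself.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of the behavior described in the post; NOT Lucene's implementation.
final class DashTokenSketch {

    // Assumed rule: a dash-joined chunk that contains a digit stays one token;
    // any other chunk is split at each dash. Everything is lowercased.
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String chunk : text.toLowerCase().trim().split("\\s+")) {
            if (chunk.isEmpty()) {
                continue;
            }
            boolean hasDigit = chunk.chars().anyMatch(Character::isDigit);
            if (chunk.contains("-") && hasDigit) {
                tokens.add(chunk);
            } else {
                for (String part : chunk.split("-")) {
                    if (!part.isEmpty()) {
                        tokens.add(part);
                    }
                }
            }
        }
        return tokens;
    }

    // Assumed: the prefix is not tokenized, only compared against indexed tokens.
    static boolean prefixMatches(String prefix, String indexedText) {
        for (String token : tokenize(indexedText)) {
            if (token.startsWith(prefix)) {
                return true;
            }
        }
        return false;
    }
}
```

Under these assumptions, "dash-test" is indexed as the two tokens `dash` and `test`, so no token starts with `dash-`, whereas "dash-123" is indexed as the single token `dash-123`, which both `dash*` and `dash-*` can reach.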