Re: PostgreSQL full text search/OpenFTS ranking algorithm?
Oleg Bartunov <oleg@...>
2009-07-29 10:56:06 GMT
Robert,
there are two ranking functions in tsearch2. One is ts_rank_cd(), which uses
the "cover density" approach from the paper "Relevance ranking for one to three
term queries", mentioned by Neophytos. We added various normalizations; for the
details it is best to read the source code. The other is the older ts_rank()
function, which uses a statistical approach. Again, see the source code.
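To give a feel for the "cover" idea behind ts_rank_cd, here is a rough Python sketch. It is only an illustration, not what calc_rank_cd in tsrank.c does: the function names (minimal_covers, density_score) and the 1/length scoring are assumptions made for this example, and the weights and normalizations are left out entirely. A cover here means a shortest span of word positions that contains every query term.

```python
def minimal_covers(positions):
    """Find minimal covers in one document.

    positions maps each query term to the list of word positions where it
    occurs. Returns (start, end) spans that contain every term and do not
    contain a shorter span that also contains every term.
    """
    # Flatten to (position, term) pairs in document order.
    hits = sorted((p, t) for t, ps in positions.items() for p in ps)
    last_seen = {}      # most recent position of each term
    covers = []
    prev_start = None   # start of the last cover we recorded
    for pos, term in hits:
        last_seen[term] = pos
        if len(last_seen) < len(positions):
            continue    # some query term has not appeared yet
        start = min(last_seen.values())
        # The span ending here is minimal only if its start has moved right;
        # otherwise the previously recorded cover lies inside it.
        if prev_start is None or start > prev_start:
            covers.append((start, pos))
            prev_start = start
    return covers


def density_score(covers):
    """Toy score (an assumption): each cover contributes 1 / its length."""
    return sum(1.0 / (end - start + 1) for start, end in covers)


# Hypothetical document: "full" at positions 1 and 10, "text" at 2 and 20.
example = {"full": [1, 10], "text": [2, 20]}
example_covers = minimal_covers(example)
example_score = density_score(example_covers)
```

Short covers (the terms close together) dominate the score, which is why this kind of ranking fits queries where all terms must be present.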
Some notes:
1. ts_rank_cd is generally better for AND queries
2. ts_rank has some support for OR queries
3. both functions use only local (current document) information, so
they are very good for combining search results from several machines.
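Point 3 can be sketched as follows. Since each score depends only on its own document, result lists ranked on separate machines should be directly comparable, assuming every machine runs the same query with the same ranking function and the same normalization. The merge_ranked helper and the shard lists below are hypothetical, and the scores are invented for illustration.

```python
import heapq
from itertools import islice


def merge_ranked(shard_results, limit):
    """Merge per-machine result lists into one global top-`limit` list.

    shard_results is a list of lists of (score, doc_id) tuples, each list
    already sorted by score in descending order.
    """
    # heapq.merge walks the sorted inputs lazily; reverse=True tells it the
    # inputs are ordered from largest to smallest score.
    merged = heapq.merge(*shard_results, key=lambda hit: hit[0], reverse=True)
    return list(islice(merged, limit))


# Hypothetical results from two machines (scores are made up).
shard_a = [(0.9, "a1"), (0.4, "a2")]
shard_b = [(0.7, "b1"), (0.1, "b2")]
top3 = merge_ranked([shard_a, shard_b], 3)
```

No rescoring step is needed at merge time, which would not be the case for a ranking that depends on collection-wide statistics.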
I have several presentations on this:
http://www.sai.msu.su/~megera/postgres/talks/,
http://www.sai.msu.su/~megera/postgres/talks/fts-pgday-2007.pdf
On Wed, 29 Jul 2009, Neophytos Demetriou wrote:
> Dear Robert,
>
> I just got back from holidays. Attached you may find the paper behind the
> rank_cd ranking function in tsearch2 (this is the one I had mentioned in my
> previous reply). You may also want to check out the corresponding code in
> postgresql/src/backend/utils/adt/tsrank.c (calc_rank_cd and Cover).
>
> IIRC, the ranking functions in OpenFTS were different than the ones in
> tsearch2. I'm CC-ing Oleg Bartunov, who might have more information to
> share.