1 Sep 2008 15:14
Re: Google Ngram count
Miles Osborne <miles <at> inf.ed.ac.uk>
2008-09-01 13:14:21 GMT
I can't comment on the status of counts derived from Google's search page and there is no published statement on the relationship (if any) between the Ngram counts and any other counts.

That aside, as with any resource gathered from the Web, caveat emptor. I do know of people using the Ngram set within SMT language models to advantage, so it can be a useful resource.

Miles

2008/8/31 Christopher Brewster <C.Brewster <at> dcs.shef.ac.uk>:
> There was an extensive analysis by Jean Veronis a while ago showing that
> Google counts were invalid (long before the release of the ngram corpus).
> Is that analysis still valid?
> Does that mean we should not trust Google's ngram corpus at all?
>
> Christopher
>
> *****************************************************
> Department of Computer Science, University of Sheffield
> Regent Court, 211 Portobello Street
> Sheffield S1 4DP UNITED KINGDOM
> Web: http://www.dcs.shef.ac.uk/~kiffer/
> Tel: +44(0)114-22.21967 Fax: +44 (0)114-22.21810
> Skype: christopherbrewster
> SkypeIn (UK): +44 (20) 8144 0088
> SkypeIn (US): +1 (617) 381-4281
> *****************************************************
> Corruptissima re publica plurimae leges. Tacitus. Annals 3.27