[OSM-dev] Tokenization in Nominatim
philippe.david at allgoob.com
Tue Aug 16 09:12:44 BST 2011
I have found a potential bug in the way Nominatim creates tokens from names.
I can create a ticket on trac if you want, but I wanted your opinion first.
In Bordeaux, France (Europe), there is a point called "Barrière d'Ornano".
When I search for "Ornano, Bordeaux", I get no result, but when I search for
"d'Ornano, Bordeaux", it works. The correct way to tokenize in French would
be to split on the apostrophe after the "d".
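
To illustrate what I mean, here is a rough sketch in Python. It is not
Nominatim's actual code; the function name `tokenize` and the separator set
are assumptions made up for this example. It simply treats apostrophes like
spaces and commas when cutting a name into tokens:

```python
import re

# Hypothetical helper, not part of Nominatim: whitespace, commas and
# apostrophes (ASCII ' and typographic U+2019) all act as token separators.
_SEPARATORS = re.compile(r"[\s,'\u2019]+")

def tokenize(name):
    """Return the lowercase tokens of `name`, dropping empty pieces."""
    return [tok for tok in _SEPARATORS.split(name.lower()) if tok]

print(tokenize("Barrière d'Ornano"))   # ['barrière', 'd', 'ornano']
print(tokenize("Ornano, Bordeaux"))    # ['ornano', 'bordeaux']
```

With tokens cut this way, "ornano" would be indexed on its own, so the query
"Ornano, Bordeaux" would have something to match against.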
Translated into English, the point would be called "Gate of Ornano".
There is a good chance that people will type only the
proper name "Ornano" and not the full name.
If things were simple in French, we would write "Barrière de Ornano", but
there is a contraction here, so it becomes "Barrière d'Ornano".
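
A more careful variant would only strip the apostrophe when it follows a known
contracted particle, so that names from other languages are left alone. The
sketch below is Python for illustration only; `ELIDED` and `expand_elision`
are hypothetical names, and the particle list is an assumption, not an
exhaustive list:

```python
# Assumed list of French particles that contract before a vowel
# (de -> d', le/la -> l', que -> qu', ...). Illustrative, not exhaustive.
ELIDED = ("d", "l", "qu", "j", "m", "t", "s", "n", "c")

def expand_elision(word):
    """Return the word itself, plus the part after a contracted particle if any."""
    for apostrophe in ("'", "\u2019"):
        prefix, sep, rest = word.partition(apostrophe)
        if sep and rest and prefix.lower() in ELIDED:
            return [word, rest]
    return [word]

print(expand_elision("d'Ornano"))  # ["d'Ornano", 'Ornano']
print(expand_elision("O'Brien"))   # ["O'Brien"]
```

Indexing both variants would keep "d'Ornano" searchable as before while also
letting "Ornano" match.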