[OSM-dev] Tokenization in Nominatim

Philippe DAVID philippe.david at allgoob.com
Tue Aug 16 09:12:44 BST 2011


Hi developers,
I have found a potential bug in the way Nominatim creates tokens from names.
I can create a ticket on Trac if you want, but I wanted to get your opinion
first.
In Bordeaux, France (Europe), there is a point called "Barrière d'Ornano".
When I search for "Ornano, Bordeaux", I get no results, but when I search for
"d'Ornano, Bordeaux", it works. It seems the name is tokenized as "barrière"
and "d'ornano", so "Ornano" alone matches nothing. The correct way to tokenize
in French would be to split on the apostrophe after the "d".
Translated into English, the point would be called "Gate of Ornano".
There is a good chance that people will type only the
proper name "Ornano" and not the full name.
If things were simple in French, we would write "Barrière de Ornano", but
"de" is contracted before a vowel, so it becomes "Barrière d'Ornano".
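
To make the suggestion concrete, here is a minimal Python sketch of the
splitting rule. It is only an illustration and not Nominatim's actual code:
the function name `tokenize` and the `ELISIONS` list are made up for this
example, and the list of elided forms is an assumption that is probably
incomplete.

```python
# Hypothetical set of French elided forms (de -> d', le/la -> l', ...).
# Assumed for illustration; not taken from Nominatim.
ELISIONS = {"d", "l", "j", "m", "n", "s", "t", "c", "qu"}


def tokenize(name):
    """Split a name on whitespace; for words such as "d'Ornano", also
    emit the part after the apostrophe as a separate token."""
    tokens = []
    for word in name.lower().split():
        # Treat the typographic apostrophe like the ASCII one.
        word = word.replace("\u2019", "'")
        tokens.append(word)
        prefix, sep, rest = word.partition("'")
        # Only split when the prefix is a known elided form, so that
        # names like "O'Connor" are left untouched.
        if sep and prefix in ELISIONS and rest:
            tokens.append(rest)
    return tokens


print(tokenize("Barrière d'Ornano"))
```

The sketch keeps the full form "d'ornano" as a token and adds "ornano" next
to it, so both "Ornano, Bordeaux" and "d'Ornano, Bordeaux" could match.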

Cheers,
Philippe