[Geocoding] Regarding issue #967
Sarah Hoffmann
lonvia at denofr.de
Wed Apr 1 06:12:43 UTC 2020
Hi Rahul,
On Wed, Apr 01, 2020 at 05:36:00AM +0530, K Rahul Reddy wrote:
> For issue #967 <https://github.com/osm-search/Nominatim/issues/967>, these
> are some points I found so far:
>
> In Geocode.php lookup(),
>
> 1) The sNormQuery is made by using PHP's Transliterator.
>
> 2) The normalization method make_standard_name is used on phrases in line
> 630. This is an sql function which returns
> trim(public.gettokenstring(public.transliteration(name))).
>
> We need to replace %09-%0d characters in phrases. This can be done
> simply by adding
>
> $sPhrase = preg_replace('/[\x09-\x0d]/', ' ', $sPhrase);
>
> before normalization function is called.
>
> 3) The other solution would be to change the normalization (breaks the DB).
> The transliteration() function uses utfasciitable.h.
>
> Changing UTFASCIILOOKUP by replacing the elements at positions 9-13 with '2'
> does the job.
>
>
> I have tested both the ways, and both seem to work as expected. What should
> I do now?
Go for solution 3). It is true that it breaks the DB but only for places
that have characters %09-%0d in their name. That's basically data that is
broken in the OSM database already and should be fixed. Therefore it is
okay to make an exception to the rule not to change the normalization.
Cheers
Sarah