[Geocoding] support tab as space delimiter #967
Sarah Hoffmann
lonvia at denofr.de
Mon Mar 23 21:00:53 UTC 2020
On Sun, Mar 22, 2020 at 11:46:52AM +0000, Rahul Reddy wrote:
> Thanks for the reply!
>
> I took some time to understand the utfasciitable.h and nominatim.c in the path module/. The entries are already such that ASCII 9-14 will be converted to space character.
>
> But the resultant string does not contain the character. This also happens with other characters like @#+() etc. (these are irrelevant in search).
>
> I think the part
> '// assume lenngth 1, silently drop bogus characters'
> in nominatim.c is dropping these characters. Can anyone help me with this?
Are you sure that it ends up there? I would expect it to hit the
first branch, 'if ( ((*sourcedata & 0x80) == 0)', as these should be
normal ASCII characters in the 0-127 range.
That last case is only hit when the input is not valid UTF-8.
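To illustrate, here is a minimal sketch of how the high bits of a lead byte decide which branch is taken. The helper name utf8_seq_len is hypothetical and not taken from nominatim.c; only the first test mirrors the condition quoted above, the rest is the standard UTF-8 lead-byte layout.

```c
/* Hypothetical helper (not from nominatim.c): returns the sequence length
 * implied by a UTF-8 lead byte, or 0 if the byte cannot start a sequence. */
static int utf8_seq_len(unsigned char c)
{
    if ((c & 0x80) == 0x00) return 1; /* 0xxxxxxx: plain ASCII, 0-127 */
    if ((c & 0xE0) == 0xC0) return 2; /* 110xxxxx: 2-byte sequence */
    if ((c & 0xF0) == 0xE0) return 3; /* 1110xxxx: 3-byte sequence */
    if ((c & 0xF8) == 0xF0) return 4; /* 11110xxx: 4-byte sequence */
    return 0; /* continuation byte or invalid lead byte */
}
```

Under this scheme a tab (0x09) or '@' (0x40) is handled by the first test, and only bytes that cannot start a sequence, such as a stray 0x80, reach the fallback.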
The juicy part where characters are skipped comes further below.
Look out for 'if (*(asciilookup + *wchardata) > 0)'.
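As a rough sketch of that pattern: a lookup yields a replacement character, and anything that maps to 0 is skipped. The function names toy_lookup and transliterate and the mapping itself are assumptions made for illustration; the real table lives in utfasciitable.h and is not reproduced here.

```c
#include <stddef.h>

/* Hypothetical stand-in for the lookup table: returns the replacement
 * character, or 0 when the input character should be skipped. */
static char toy_lookup(unsigned char c)
{
    if (c == '\t' || c == ' ') return ' ';           /* whitespace becomes a space */
    if (c >= 'a' && c <= 'z') return (char)c;        /* keep lowercase letters */
    if (c >= 'A' && c <= 'Z') return (char)(c - 'A' + 'a'); /* fold to lowercase */
    if (c >= '0' && c <= '9') return (char)c;        /* keep digits */
    return 0;                                        /* everything else is skipped */
}

/* Copies `in` to `out`, keeping only characters whose lookup entry is > 0.
 * Always NUL-terminates when outsize > 0; returns the number of characters written. */
static size_t transliterate(const char *in, char *out, size_t outsize)
{
    size_t n = 0;
    if (outsize == 0)
        return 0;
    for (; *in != '\0' && n + 1 < outsize; ++in) {
        char mapped = toy_lookup((unsigned char)*in);
        if (mapped > 0)          /* same shape as the check quoted above */
            out[n++] = mapped;
    }
    out[n] = '\0';
    return n;
}
```

With a table like this, a character disappears from the result only if its entry is 0, so that entry is the first thing to check for a character that goes missing.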
Sarah
>
> PS: While trying to understand the table, I fixed issue #886<https://github.com/openstreetmap/Nominatim/issues/886>. I'll write test cases and send a PR for that.