[Geocoding] Support to stop words?

Vitor George vitor.george at aparabolica.com.br
Mon Apr 10 16:30:48 UTC 2017


Hi Sarah,

Thank you very much for the response.

Another related question: is this abbreviation wiki page being used as
a data source in Nominatim?

https://wiki.openstreetmap.org/wiki/Name_finder:Abbreviations

The text says it is, but I couldn't find any reference to it in the docs
in the GitHub repository.

Thanks,
Vitor


On Wed, Apr 5, 2017 at 4:42 PM, Sarah Hoffmann <lonvia at denofr.de> wrote:

> Hi,
>
> On Wed, Apr 05, 2017 at 08:37:52AM -0300, Vitor George wrote:
> > When searching "Biblioteca Prestes Maia" [1] the best result should be
> > "Biblioteca Prefeito Prestes Maia" [2], instead the results are other
> > *bibliotecas* (libraries) with different names. This affects very
> > much the usability of Nominatim in Portuguese.
> >
> > Is there support for stop words? Could Nominatim be using PostgreSQL full
> > text search?
>
> Nominatim has limited support for searching for partial words and also
> for a few stop words. However, the latter is very difficult to implement
> in a system that has to work with arbitrary languages. As it happens,
> search in your example trips over stop words. What happens is this:
>
> 'en' is marked as a stop word in Nominatim because it is 'and' in
> some languages. Stop words are handled in Nominatim by removing them
> completely from search terms and queries. Nominatim can also handle
> so-called special phrases which are used for POI search. These allow
> you to enter phrases like 'restaurant near Trafalgar Square' or
> 'supermarket in Berlin'. One of the Spanish special phrases is
> 'biblioteca en'. Because of the stop word deletion that gets shortened
> to 'biblioteca'. So, one of the interpretations of your search
> query above becomes 'find me a library in Prestes Maia'. And those are
> the results you see.
>
> To resolve this, the stop word handling in Nominatim needs to be
> changed to not unconditionally delete stop words but leave them
> in where they are essential. This is unfortunately not a simple
> change and requires rewriting some of the fundamentals of how
> query normalisation works. Until it's done I'm rather reluctant
> to add new stop words or special terms.
>
> Kind regards
>
> Sarah
>
>
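
For readers of the archive, the stop-word behaviour described in the quoted
message can be illustrated with a small sketch. It is only a toy model and not
Nominatim's actual code or data: the stop-word list, the special phrase
'biblioteca en', and the function names strip_stop_words and interpretations
are all assumptions made up for this example.

```python
# Hypothetical stop-word list and special phrases, chosen only to mirror the
# example in the quoted message; this is not Nominatim's real data or code.
STOP_WORDS = {"en"}
SPECIAL_PHRASES = {"biblioteca en": "library"}  # phrase -> POI class


def strip_stop_words(text):
    """Lower-case the text and drop every stop word unconditionally."""
    return " ".join(w for w in text.lower().split() if w not in STOP_WORDS)


def interpretations(query):
    """Return the ways a query can be read after stop-word deletion."""
    normalised = strip_stop_words(query)
    # Reading 1: the whole query is treated as a name.
    readings = [("name", normalised)]
    for phrase, poi_class in SPECIAL_PHRASES.items():
        key = strip_stop_words(phrase)  # 'biblioteca en' -> 'biblioteca'
        if normalised.startswith(key + " "):
            # Reading 2: a POI class followed by the place to search in.
            readings.append(("poi:" + poi_class, normalised[len(key) + 1:]))
    return readings


print(interpretations("Biblioteca Prestes Maia"))
# [('name', 'biblioteca prestes maia'), ('poi:library', 'prestes maia')]
```

Because the stop word is removed from both the query and the special phrase,
the shortened phrase 'biblioteca' matches the first word of the library's own
name, which produces the second reading ('a library in Prestes Maia').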