[OSM-talk] Search results quality (and some testing on Elasticsearch)

José Juan Montes jjmontes at gmail.com
Fri May 29 02:19:54 UTC 2020


Hi all,

This is my first message to the list so I take the opportunity to say hello
to all and thanks to the community for the awesome software, data, and
organisation.

Now to the point. At the ES community, we've been discussing how difficult
it is to obtain useful results from OSM. Too often, results are odd or
surprising: the ordering pushes better results down, and sometimes obvious
matches are missed entirely... Specifically, we are referring to the search
engine on the OSM front page, and other Nominatim-based services.

After some analysis, issues seem related to:

- stop words usage (prepositions, articles...)
- result scoring and ordering (a perfect match placed below distant,
unrelated results)
- word matching when there are accents (tildes) or non-ASCII chars
- synonyms / ignoring for some categories and common nouns (street /
road...)
- lack of autocompletion (helps users find a result when they don't
quite know the exact term)
- lack of cross-language search (e.g. in regions with several official
languages, people mix street names and road types between languages)
- support for typos
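
To make a few of these points concrete, here is a minimal Python sketch
(standard library only) of the kind of normalization involved: accent
folding, stop-word removal, synonym mapping for road types, and a tolerant
comparison for typos. The stop-word and synonym lists and all function
names are illustrative assumptions; this is not how Nominatim works.

```python
import unicodedata
from difflib import SequenceMatcher

# Illustrative, hand-picked lists (assumptions, Spanish-flavoured).
STOP_WORDS = {"de", "del", "la", "el", "los", "las"}
SYNONYMS = {"c": "calle", "c/": "calle", "avda": "avenida", "av": "avenida"}


def fold_accents(text):
    """Strip combining marks so that 'José' and 'Jose' compare equal."""
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def normalize(query):
    """Fold accents, lowercase, expand abbreviations, drop stop words."""
    tokens = fold_accents(query).lower().replace(".", " ").split()
    tokens = [SYNONYMS.get(t, t) for t in tokens]
    return [t for t in tokens if t not in STOP_WORDS]


def similarity(a, b):
    """Return a 0..1 score that tolerates small typos after normalization."""
    left = " ".join(normalize(a))
    right = " ".join(normalize(b))
    return SequenceMatcher(None, left, right).ratio()


print(normalize("Avda. de la Constitución"))
print(similarity("Calle Mayor", "Cale Mayor"))
```

Note that this folding also turns "ñ" into "n", which may or may not be
desirable depending on the language.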

Part of the problem is that every language requires particular
considerations, which affects most of the points above. So in my view, a
suitable solution would need good i18n support from the bottom up.

We think that other (language) communities may be hitting the same
issues, judging by the GitHub issues. I list some references at the bottom,
but they don't seem to get much attention.

Ultimately, the technology stack Nominatim is built upon is not state of
the art. I have done a quick test with Elasticsearch and a simple default
installation with naive data loading already produces decent results. I
later found that alternative search engines exist, for example "Pelias",
which are implemented on top of newer technologies, and their demo seems to
work fine...
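
For illustration only, here is a sketch of how some of the issues listed
earlier could map onto Elasticsearch analysis features, written as plain
Python dicts so it runs without a cluster. This is not the configuration
used in the quick test mentioned above (which relied on defaults); the
field name "name", the analyzer and filter names, and the synonym entries
are all assumptions.

```python
import json

# Hypothetical index body: field "name" and all custom names are assumptions.
INDEX_BODY = {
    "settings": {
        "analysis": {
            "filter": {
                # Built-in Spanish stop-word list.
                "es_stop": {"type": "stop", "stopwords": "_spanish_"},
                # Hand-written road-type synonyms (illustrative).
                "road_synonyms": {
                    "type": "synonym",
                    "synonyms": ["calle, c", "avenida, avda, av"],
                },
            },
            "analyzer": {
                "place_name": {
                    "type": "custom",
                    "tokenizer": "standard",
                    # asciifolding makes "José" match "Jose".
                    "filter": ["lowercase", "asciifolding",
                               "road_synonyms", "es_stop"],
                }
            },
        }
    },
    "mappings": {
        "properties": {"name": {"type": "text", "analyzer": "place_name"}}
    },
}


def fuzzy_name_query(text):
    """Build a match query that tolerates small typos in the search text."""
    return {"query": {"match": {"name": {"query": text,
                                         "fuzziness": "AUTO"}}}}


print(json.dumps(fuzzy_name_query("cale mayor"), indent=2))
```

The two dicts would be sent as the request bodies when creating the index
and when searching it, respectively.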

Has any alternative to the current geocoder been tested? What would it take
for this to be improved? If alternatives exist, can the search engine on
the front page be changed? Or could options be provided so users can choose
their preferred search engine, maybe even from specialized local/themed
search providers? Perhaps something like that would pave the way for
alternative search software and services, and foster innovation.

Cheers!

Refs:

- https://github.com/osm-search/Nominatim/issues/1811
- https://github.com/osm-search/Nominatim/issues/333
- https://github.com/osm-search/Nominatim/issues/1208
- https://wiki.openstreetmap.org/wiki/Search_engines
- source code of my tests:
https://github.com/jjmontesl/cubetl/tree/master/examples/osm


Jose Juan Montes