[OSM-talk] Nominatim data and code updates

Brian Quinion openstreetmap at brian.quinion.co.uk
Sun Aug 26 14:49:13 BST 2012


tl;dr: nominatim is updated (data and code); add wikipedia tags and
label/admin_centre relation members to improve search quality.

As some of you may have noticed, Sarah (lonvia) has been busy updating
the openstreetmap nominatim instance.  It should now be fully up to
date and back running live updates, so if anyone still finds that
data they have added is not included, please let us know by email
on the geocoding mailing list, via trac (select the nominatim
component) or on either the #osm or #osm-nominatim irc channels.

In the process of updating the data we also took the opportunity to
release some code changes.  Probably the most visible changes in this
update are modifications to how the address hierarchy is calculated, a
new technique for calculating importance using wikipedia articles and
a new system for deduplicating place and admin areas.  As a result of
these changes nominatim now supports a few extra tags that were
previously ignored:

'wikipedia' tag [1] and its variants. Adding this gives nominatim a
far better value for how important a place is, which helps with
the ordering.  For instance, a search for "statue of liberty" now
consistently returns the correct one [2] first rather than the one on
a traffic island in the UK [3].  If you find a place where the
ordering of results is still bad, consider adding suitable wikipedia
links to the osm elements to help improve scoring.
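
As a rough illustration of what "the 'wikipedia' tag and its variants"
look like on an element, here is a small Python sketch that pulls
(language, title) pairs out of a tag dictionary.  It assumes the
wikipedia=<lang>:<title> and wikipedia:<lang>=<title> forms described
on the wiki page [1]; the function name and the handling of odd values
are my own assumptions, and this is not how nominatim itself reads the
tags or computes importance.

```python
from typing import Dict, List, Optional, Tuple


def parse_wikipedia_tags(tags: Dict[str, str]) -> List[Tuple[Optional[str], str]]:
    """Collect (language, title) pairs from wikipedia-style tags.

    Hypothetical helper for illustration only; not part of nominatim.
    """
    links: List[Tuple[Optional[str], str]] = []
    for key, value in tags.items():
        if key == "wikipedia":
            # Assumption: full URLs are skipped rather than interpreted.
            if value.startswith(("http://", "https://")):
                continue
            lang, sep, title = value.partition(":")
            if sep and title:
                links.append((lang, title))
            else:
                # No language prefix: keep the title, language unknown.
                links.append((None, value))
        elif key.startswith("wikipedia:"):
            # Variant form: the language code is part of the key.
            links.append((key.split(":", 1)[1], value))
    return links


example_tags = {
    "name": "Statue of Liberty",
    "wikipedia": "en:Statue of Liberty",
    "wikipedia:fr": "Statue de la Liberté",
}
wikipedia_links = parse_wikipedia_tags(example_tags)
```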

The new version of nominatim also supports both 'label' and
'admin_centre' relation members for boundary relations [4] which
allows it to reduce the amount of duplicated data returned and produce
a more consistent result set.  The 'label' member will be merged (name
and tags); relation tags win if there is a conflict.  The
'admin_centre' member will be merged only if the names and 'rank'
(effectively admin_level / place=*) match; apart from that condition
the same rules apply.  In both cases the node will also be used as the
centre point of the polygon, i.e. the location that the map will
centre on, and this is returned in preference to the geometric centre
of the polygon.  If no 'admin_centre' or 'label' members are present
the code will try to guess by looking for a node with the right admin
level, name and approximate location.  Obviously explicit tagging is
far better and more accurate.
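
The merge rules above can be sketched in a few lines of Python.  This
is only a sketch of the rules as described in this mail, not the
actual nominatim code: the function names, the integer 'rank' values
and the example tags are all made up for illustration.

```python
from typing import Dict

Tags = Dict[str, str]


def merge_tags(relation_tags: Tags, node_tags: Tags) -> Tags:
    """Combine both tag sets; relation tags win on conflict."""
    merged = dict(node_tags)
    merged.update(relation_tags)
    return merged


def merge_member(role: str, relation_tags: Tags, relation_rank: int,
                 node_tags: Tags, node_rank: int) -> Tags:
    """Apply the 'label' / 'admin_centre' rules to one member node.

    'rank' is a hypothetical integer standing in for the value derived
    from admin_level / place=*.
    """
    if role == "label":
        # 'label' members are always merged.
        return merge_tags(relation_tags, node_tags)
    if role == "admin_centre":
        # 'admin_centre' members are merged only if name and rank match.
        same_name = relation_tags.get("name") == node_tags.get("name")
        if same_name and relation_rank == node_rank:
            return merge_tags(relation_tags, node_tags)
    # Any other role, or a non-matching admin_centre: relation unchanged.
    return dict(relation_tags)


# Made-up example data.
relation = {"name": "Exampleton", "boundary": "administrative",
            "population": "1000"}
node = {"name": "Exampleton", "place": "town", "population": "900",
        "wikipedia": "en:Exampleton"}

label_result = merge_member("label", relation, 16, node, 16)
centre_mismatch = merge_member("admin_centre", relation, 8, node, 16)
```

With this data, label_result keeps population=1000 from the relation
and picks up the wikipedia tag from the node, while centre_mismatch is
just the relation tags because the ranks differ.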

Thanks to all those who helped with this update, in particular Sarah
Hoffmann who did by far the majority of the work!

--
 Brian

[1] http://wiki.openstreetmap.org/wiki/Key:wikipedia
[2] http://www.openstreetmap.org/browse/way/32965412
[3] http://www.openstreetmap.org/browse/node/355219404
[4] http://wiki.openstreetmap.org/wiki/Relation:boundary#Relation_members


