[OSM-talk] Fixing wiki* -> brand:wiki*

Yuri Astrakhan yuriastrakhan at gmail.com
Wed Sep 27 16:14:36 UTC 2017


I think we should re-start with the definition of the problems we are
(hopefully) trying to solve, or else we might end up too far in the
existential realm, which is fun to discuss, but should be left for another
thread.

* Problem #1:  In my analysis of OSM data, wikipedia tags quickly go stale
because they use Wikipedia page titles, and titles are constantly renamed
or deleted. Worse, old titles get reused for new meanings.  This is a
fundamental problem with all Wikipedia tags, such as wikipedia,
brand:wikipedia, operator:wikipedia, etc., and it needs solving. The
solution does not need to be perfect, it just needs to be better than what
we have.
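
To make the staleness concrete, here is a minimal sketch of a toy wiki in
which every page has a permanent ID and a changeable title. The page IDs,
titles, and the resolve_by_title helper are all made up for illustration and
are not taken from real Wikipedia or Wikidata data. It only shows the failure
mode: after a rename plus title reuse, a title-based tag silently points at a
different page, while an ID-based tag still points at the original one.

```python
def resolve_by_title(pages, title):
    """Return the ID of the page that currently holds this title, or None."""
    for page_id, current_title in pages.items():
        if current_title == title:
            return page_id
    return None


# Hypothetical snapshot at the time the OSM feature was tagged.
before = {"ID-1": "Springfield Museum"}

# Hypothetical later snapshot: the original page was renamed, and the old
# title was reused for a page about a different subject.
after = {
    "ID-1": "Springfield Museum of Art",
    "ID-2": "Springfield Museum",
}

tag_title = "Springfield Museum"  # what a title-based tag stores
tag_id = "ID-1"                   # what an ID-based tag would store

resolved_before = resolve_by_title(before, tag_title)
resolved_after = resolve_by_title(after, tag_title)

# The title now resolves to a different page than the mapper intended.
title_went_stale = resolved_before != resolved_after

# The ID still refers to the same page, whatever its current title is.
id_still_valid = tag_id in after
```

IDs are not immune to change either (pages can be merged or deleted), but
they are not renamed or reused in this way, which fits the "better than what
we have" bar.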

* Problem #2: the *meaning* of the "wikipedia" tag is ambiguous, and
therefore cannot be processed easily. The top three meanings I have seen
are:
  a) This WP article is about this OSM feature (a so-called 1:1 match, e.g.
city, famous building, ...)
  b) This WP article is about some aspect of this OSM feature, like its
brand, tree species, or subject of the sculpture
  c) Only a part of this WP article is about this OSM feature, e.g. a WP
list of museums in the area contains a description of this museum.

* Problem #3: data consumers need cleaner, more machine-processable data.
A text label is much more error-prone than an ID:  McDonalds vs mcdonalds
vs McDonald's vs ..., so having "brand=mcdonalds" results in many errors.
Note that even if the OSM default map skin handles some of these variants
correctly, each data consumer has to re-implement that logic, so the more
ambiguous something is, the more likely it will result in errors and data
omissions.
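
As a rough sketch of the difference, the snippet below groups a few
hypothetical features first by their free-text brand label and then by a
brand:wikidata ID. The feature list and the normalize helper are invented
for illustration, and Q38076 is assumed to be the Wikidata item for
McDonald's, so treat it as a placeholder.

```python
from collections import Counter

# Hypothetical features; Q38076 is assumed to be McDonald's on Wikidata.
features = [
    {"brand": "McDonalds", "brand:wikidata": "Q38076"},
    {"brand": "mcdonalds", "brand:wikidata": "Q38076"},
    {"brand": "McDonald's", "brand:wikidata": "Q38076"},
]

# Grouping by the raw label splits one brand into three buckets.
by_label = Counter(f["brand"] for f in features)

# Grouping by ID needs no cleanup at all.
by_id = Counter(f["brand:wikidata"] for f in features)


def normalize(label):
    """Ad-hoc cleanup every consumer would otherwise have to reinvent:
    lowercase the label and drop anything that is not a letter or digit."""
    return "".join(ch for ch in label.lower() if ch.isalnum())


by_normalized = Counter(normalize(f["brand"]) for f in features)
```

The normalize step happens to work for these three spellings, but it is
exactly the kind of logic that each consumer would write slightly
differently, and it does nothing for misspellings or translated names.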

The brand:wikidata discussion is about #1, #2b, and #3.

Are we in agreement that these are problems, or do you think none of them
need solving?