[OSM-talk] Adding wikidata tags to the remaining objects with only wikipedia tag

Yuri Astrakhan yuriastrakhan at gmail.com
Wed Sep 20 15:54:39 UTC 2017

Such an awesome discussion, thanks!

* https://www.wikidata.org/wiki/Special:GoToLinkedPage can already be used
to open a Wikipedia page when you only have a Wikidata ID.  It even accepts
a list of wiki sites. For example, this link automatically opens the wiki
page for Q3669 in the first available language ("pt" in this case)
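
A minimal sketch of building such a link in Python. The `site` and `itemid`
query parameters and the comma-separated site list are my understanding of
how the special page is called, so treat them as assumptions. The function
name `linked_page_url` and the example site list are made up for
illustration.

```python
from urllib.parse import urlencode

BASE = "https://www.wikidata.org/wiki/Special:GoToLinkedPage"


def linked_page_url(item_id, sites):
    """Build a GoToLinkedPage URL for one Wikidata ID and a list of wiki sites."""
    # Assumption: sites are passed as one comma-separated value and tried in order.
    # safe="," keeps the commas readable instead of percent-encoding them.
    query = urlencode({"site": ",".join(sites), "itemid": item_id}, safe=",")
    return f"{BASE}?{query}"


# Hypothetical site list; the first site that has an article should win.
print(linked_page_url("Q3669", ["enwiki", "ptwiki"]))
```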


* Sarah, thanks for the heads up about Nominatim using Wikipedia tags.  I
recently added page popularity (pageviews) to the OSM+Wikidata service.
Another metric is the number of Wikipedia articles in different languages
per topic (sitelinks count).  Together, they can be used to calculate
relative weights.
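
One possible way to combine the two metrics, purely as an illustration:
log-scale each count so that a few very popular pages do not dominate, then
take a weighted sum. The function name, the 50/50 split and the log scaling
are my assumptions here, not what Nominatim or the OSM+Wikidata service
actually does, and the sample numbers are made up.

```python
import math


def relative_weight(pageviews, sitelinks, alpha=0.5):
    """Hypothetical importance score from pageviews and sitelinks count."""
    if pageviews < 0 or sitelinks < 0:
        raise ValueError("counts must be non-negative")
    # log1p maps a zero count to zero and dampens very large counts.
    return alpha * math.log1p(pageviews) + (1 - alpha) * math.log1p(sitelinks)


# Made-up candidates: name -> (pageviews, sitelinks count).
candidates = {"A": (120000, 150), "B": (300, 4), "C": (0, 1)}
ranked = sorted(candidates, key=lambda k: relative_weight(*candidates[k]), reverse=True)
print(ranked)
```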

* I am a bit radical, but not enough to propose we get rid of wikipedia tags
just yet.  They sometimes provide a good indication of the original
intent.  Once Wikidata is used in all the tooling, we may revisit, but not
until then.  But yes, wikipedia tags are very unstable, especially when
articles get renamed because multiple places have identical names, leaving
the old link pointing at a disambiguation page. So in general, they often go
stale and become less useful without any indication.

* Oleksiy, OSM can use any data from Wikidata because of the public domain
dedication (CC0), but the reverse depends on whether the OSM contributor
agreed to dedicate their edits to the public domain. Without that, OSM data
is licensed under ODbL, and cannot be copied into Wikidata. We should make
it easier to detect which pieces of OSM data are in the PD.  I do like your
USB analogy :) About names - you will be surprised to discover that MB and
other places are actively pursuing Wikidata integration because WD tends to
have a huge names list, possibly bigger than OSM's own?

* Christoph, a very valid point in general. Do you have any statistics on
how often multiple meanings per osm object is a problem? In my experience,
this is very rare, but hard to say without numbers.  For the case of the
island being both a country and a land feature, I think it would benefit
OSM to actually have two objects with the same geometry - e.g. two
relations containing the same way(s).  One relation would treat it as an
admin boundary, with all the related tags, the other - as a land feature.
Data consumers would treat them separately. Conflating tags related to both
concepts into one object is not very good.  In more general terms, you
usually have three cases:
-- 1:1 (most common imo)
-- one osm obj being a part of a larger page (e.g. a list of churches). I
don't think wikidata/wikipedia tag is appropriate in this case, as that
page is not about this specific object, but about a class of similar
objects. We could use listed-on:wp, or partof:wp, or some other tag.
-- Your case - multiple concepts for the same object. Use either a
semicolon separated list of wd ids, or (better) - create multiple relations
to describe multiple concepts.
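
For the semicolon-separated variant, a data consumer would have to split and
validate the value. A small sketch of that, assuming IDs look like "Q"
followed by digits; the function name `parse_wikidata_tag` is hypothetical.

```python
import re

# Wikidata item IDs: "Q" followed by a positive integer.
QID = re.compile(r"Q[1-9][0-9]*")


def parse_wikidata_tag(value):
    """Split a semicolon-separated wikidata tag value into valid and invalid parts."""
    valid, invalid = [], []
    for part in value.split(";"):
        part = part.strip()
        if not part:
            continue  # ignore empty pieces such as a trailing semicolon
        (valid if QID.fullmatch(part) else invalid).append(part)
    return valid, invalid


print(parse_wikidata_tag("Q42; Q3669;bogus"))
```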

* Frederik, that small personal attack is uncalled for. I exposed
a lot of existing bad data; I did not add it. And I created complex tooling to
help everyone resolve it as a community, rather than try to tackle all of
it by myself.  A system for fixing problems is always better than one
person doing it by hand, and later retiring because the challenge is too
great.  Also, a corresponding wikidata tag is not bad data - it is simply a
copy of the existing Wikipedia tag, making it easier for tools and humans
to find and fix problems. As for your last email - fetching *corresponding*
wikidata items is not an error - it's a duplicate of existing information.
That information might be incomplete, but that's a separate issue.

* Lester, I'm not sure I understood your Douglas Adams example, PM me and
let's try to figure it out. It might have to do with the ranking of each
statement.

See also:
Feature request for any lang fallback: