[OSM-talk] Semi-auto converting Wikipedia -> Wikidata tags

Yuri Astrakhan yuriastrakhan at gmail.com
Fri Nov 25 21:55:08 UTC 2016


Hi, I am exploring ways to make more educational maps in Wikipedia. For
example, this graph shows all US state governors. It works by querying
Wikidata for the governors' info, and drawing state overlays using OSM
relations tagged with the Wikidata IDs.

https://www.mediawiki.org/wiki/Help:Extension:Kartographer#GeoShapes_via_Wikidata_Query

This new technology should (hopefully) enhance location- and
politics-related articles. It relies on Wikidata-tagged objects in OSM,
so the more objects are tagged, the more interesting maps the community
can create. While the top levels (countries, states) are already tagged,
smaller areas tend to have just the Wikipedia tag. I have been adding the
matching Wikidata tag to many admin-level relations using JOSM's "Fetch
Wikidata ID" command (Wikipedia plugin). This works well most of the time,
but not always. For example, in England there are Administrative and
Ceremonial (historical) parishes. Both would be tagged with the same
Wikipedia tag because both concepts are described in the same article, yet
the matching Wikidata ID usually covers just one of them (usually the
ceremonial one), not the administrative one. I plan to do the following:

* Going from admin_level 1..10+, for all locations that have a Wikipedia
tag but no Wikidata tag, add the matching Wikidata ID using the Wikipedia
plugin's "Fetch Wikidata ID" command. At the moment, the plugin does not
automatically resolve Wikipedia page redirects (when a page has been
renamed), so I often have to do that by hand.
* Once all areas are tagged, I would like to ensure that Wikidata and OSM
are in sync by checking that the Wikidata tags actually point to admin
areas, and that the tree structures in OSM and in Wikidata match. E.g. this
query shows the tree structure of Wikidata. If anyone has any CC0 sources
of the admin structure of the countries, please message me.

https://www.wikidata.org/w/index.php?title=User:Yurik/Admin_regions
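
For the first step above, the lookup could be approximated outside JOSM
roughly as follows. This is only a sketch and not the plugin's actual code:
it assumes the standard MediaWiki API (action=query with prop=pageprops,
plus the redirects flag so that renamed pages are followed on the server
side), and all function names here are made up for illustration.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen


def split_wikipedia_tag(tag):
    """Split an OSM wikipedia tag such as 'en:Texas' into (lang, title)."""
    lang, sep, title = tag.partition(":")
    if not sep or not lang or not title:
        raise ValueError(f"not a lang:title wikipedia tag: {tag!r}")
    return lang, title


def build_query_url(tag):
    """Build a MediaWiki API URL asking for the page's Wikidata item."""
    lang, title = split_wikipedia_tag(tag)
    params = {
        "action": "query",
        "prop": "pageprops",
        "ppprop": "wikibase_item",
        "redirects": 1,  # follow redirects left behind by page renames
        "titles": title,
        "format": "json",
        "formatversion": 2,  # 'pages' is returned as a list
    }
    return f"https://{lang}.wikipedia.org/w/api.php?{urlencode(params)}"


def extract_wikidata_id(response):
    """Return the Wikidata ID from a parsed API response, or None."""
    for page in response.get("query", {}).get("pages", []):
        qid = page.get("pageprops", {}).get("wikibase_item")
        if qid:
            return qid
    return None


def fetch_wikidata_id(tag):
    """Query the live API (network access needed); hypothetical helper."""
    request = Request(
        build_query_url(tag),
        headers={"User-Agent": "wikipedia-to-wikidata-sketch/0.1"},
    )
    with urlopen(request) as resp:
        return extract_wikidata_id(json.load(resp))
```

A call such as fetch_wikidata_id("en:Texas") would return the Wikidata ID
linked to that article, or None if the page is missing or has no item. The
result still needs a human check for cases like the parishes above.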

To clarify - I am NOT adding Wikidata IDs by some magical GPS coordinate
resolution or name matching. I am simply converting existing Wikipedia
tags into Wikidata tags, because there is always a 1-to-1 match between a
Wikipedia article and its Wikidata item, and adding a Wikidata tag ensures
that even if the WP article is renamed or deleted, at least the Wikidata
tag stays valid. Adding a WD tag that describes the ceremonial parish
rather than the admin district is "incrementally beneficial": it is still
relevant, since it points to the right Wikipedia article, and it makes it
easier to later repoint the tag to the admin district via semi-automated
validation (spreadsheet/text checks) or by checking for dups.
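
The check for dups could start as a simple grouping of OSM objects by
their wikidata tag. A rough sketch only: the input format (pairs of OSM
object id and wikidata value, e.g. exported to a spreadsheet or text file),
the function name, and the sample IDs are all placeholders for
illustration.

```python
from collections import defaultdict


def find_duplicate_wikidata(rows):
    """Return {wikidata_id: [osm_ids]} for IDs used by more than one object."""
    by_qid = defaultdict(list)
    for osm_id, qid in rows:
        if qid:  # skip objects that have no wikidata tag yet
            by_qid[qid].append(osm_id)
    return {qid: ids for qid, ids in by_qid.items() if len(ids) > 1}


# Placeholder data, not real OSM relations or Wikidata items.
rows = [
    ("relation/1", "Q100"),
    ("relation/2", "Q200"),
    ("relation/3", "Q100"),
    ("relation/4", ""),
]
print(find_duplicate_wikidata(rows))
```

Two admin areas sharing one Wikidata ID would be a likely candidate for
the ceremonial/administrative mix-up described above and worth a manual
look.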

Thanks!