[OSM-talk] Fixing wikipedia/wikidata tags

Yuri Astrakhan yuriastrakhan at gmail.com
Tue Feb 7 09:26:38 UTC 2017


Oleksiy, we should continue doing the on-the-ground scouting, but that is
not what I am talking about. I am talking about the tens of thousands of
errors that OSM already contains, and a way to find them. Having good
cameras with GPS does not help with this.

I already found 1000+ OSM objects with incorrect wikipedia tags. For
example, http://www.openstreetmap.org/relation/3715384 :
wikipedia = https://en.wikipedia.org/wiki/Oron -- a disambiguation page
that lists all the meanings of the word Oron
wikidata = https://www.wikidata.org/wiki/Q407935

Thanks to the wikidata tag, I can catch these errors. The Wikidata item
shows instance of = Disambiguation page, meaning that the page is NOT about
the place in Nigeria.  I could of course simply delete the wrong tags, but
I would prefer to fix them one by one. Also, the disambiguation pages are
just the tip of the iceberg. There are many other types of imprecise
references, such as links to list pages.
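
A minimal sketch of this kind of check, in Python. It is an illustration
under stated assumptions, not the script I actually run: the function names
and the sample data are made up for the example, and the entity JSON is a
hand-built fragment in the shape Wikidata's Special:EntityData endpoint
returns. P31 is "instance of", Q4167410 is "Wikimedia disambiguation page",
and Q13406463 is "Wikimedia list article".

```python
# Types that an OSM object's wikidata tag should not point to.
BAD_TYPES = {
    "Q4167410": "disambiguation page",
    "Q13406463": "list article",
}


def instance_of(entity):
    """Return the set of item IDs found in an entity's P31 statements."""
    ids = set()
    for statement in entity.get("claims", {}).get("P31", []):
        snak = statement.get("mainsnak", {})
        # Skip "novalue" / "somevalue" snaks, which carry no datavalue.
        if snak.get("snaktype") == "value":
            ids.add(snak["datavalue"]["value"]["id"])
    return ids


def find_suspect_objects(osm_objects, entities):
    """List (osm_id, wikidata_id, reason) for objects tagged with a bad type.

    osm_objects: iterable of (osm_id, wikidata_id) pairs.
    entities: dict mapping a wikidata_id to its entity JSON.
    """
    suspects = []
    for osm_id, qid in osm_objects:
        entity = entities.get(qid)
        if entity is None:
            continue  # no data for this item; nothing to check
        for bad in sorted(instance_of(entity) & set(BAD_TYPES)):
            suspects.append((osm_id, qid, BAD_TYPES[bad]))
    return suspects


def _p31(target):
    """Build a minimal P31 statement pointing at `target` (sample data only)."""
    return {"mainsnak": {"snaktype": "value",
                         "datavalue": {"value": {"id": target}}}}


# Hand-built sample: Q407935 is the disambiguation item from the example
# above; Q999999999 and node/1 are hypothetical placeholders (Q515 = city).
SAMPLE_ENTITIES = {
    "Q407935": {"claims": {"P31": [_p31("Q4167410")]}},
    "Q999999999": {"claims": {"P31": [_p31("Q515")]}},
}
SAMPLE_OBJECTS = [("relation/3715384", "Q407935"), ("node/1", "Q999999999")]

suspects = find_suspect_objects(SAMPLE_OBJECTS, SAMPLE_ENTITIES)
print(suspects)
```

Keeping the bad types in a plain dictionary makes it easy to flag further
kinds of imprecise targets later without touching the checking logic.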

Let's discuss our tagging approaches and guidelines as listed in
https://www.mediawiki.org/wiki/User:Yurik/Wikidata_OSM_questions

On Tue, Feb 7, 2017 at 3:24 AM Oleksiy Muzalyev <oleksiy.muzalyev at bluewin.ch>
wrote:

> Good morning Yuri,
>
> On Saturday I added the Wikidata tag to the monument [1] of Mikhail
> Bakunin [2] in Bern. In fact, I had added also the monument itself on the
> map. I searched for it for quite some time at Bremgartenfriedhof, as there
> was a typing error in the English Wikipedia article concerning the box
> number (it is corrected already).
>
> I also added some ground and aerial photos of the monument with GPS
> coordinates to the Wikimedia category, published the GPS trace to OSM,
> and filmed a short video in English:
>
> https://commons.wikimedia.org/wiki/File:Bakunin_Monument_Bern_EN.webm
> https://youtu.be/GCGdnFf8BDY
>
> and the same video in Russian:
>
> https://commons.wikimedia.org/wiki/File:Bakunin_Monument_Bern_RU.webm
> https://youtu.be/REjGTkJYKwU
>
> Quality on Youtube is better, as I could not figure out yet how to convert
> a video to the WEBM format without some quality loss.
>
> I mean that, in addition to validating by scripts, legwork also has
> potential. In this respect, it would be helpful if we had a Wikipedia &
> Wikidata layer on the OSM map, with an option to see Wikidata items without
> an image and Wikipedia articles in different languages, so a human may see,
> analyze, and visit an object on the ground to clarify the situation. At
> this point, I would not dare to correct an OSM-Wikipedia inconsistency
> without first visiting, recording a GPS trace, and filming it. So in my
> opinion it should be on a map, in addition to a list.
>
> Some new hardware tools have become affordable: precise GPS/GLONASS
> trackers, video cameras with stabilized gimbals for ground and aerial
> filming, directional microphones. The photo cameras themselves have also
> become better. A human armed with these new tools can do a lot of useful
> work at a location, though it may take some time until we learn how to
> employ these tools effectively.
>
> [1] http://www.openstreetmap.org/node/4665613556#map=19/46.95039/7.42234
> [2] https://en.wikipedia.org/wiki/Mikhail_Bakunin
>
> With best regards,
> Oleksiy
>
>
> On 07.02.17 03:06, Yuri Astrakhan wrote:
>
> TLDR: researching ways to validate wikipedia and wikidata tags, wrote a
> script to cross-check OSM and Wikidata, found many incorrect disambig
> references, would love to start community discussion on best guidelines
> going forward.
>
>
> I have been analyzing the quality of OSM's wikipedia and wikidata tags by
> cross-checking data using both OSM tags and Wikidata.  My first goal is to
> fix "disambiguation" references - when an OSM object links to a Wikipedia
> disambiguation page, instead of the real location page. I have already
> fixed about 200 objects, but there are 800+ relations left, and I
> could really use some help.  I don't think it's possible to add them to
> MapRoulette just yet.
> https://www.mediawiki.org/wiki/User:Yurik/OSM_disambigs
>
> While fixing wd/wp tagging issues, I have been putting together a list of
> open questions on how we want to improve wikipedia and wikidata tags in
> general, and create some guidelines. Let's discuss them on the talk page?
> https://www.mediawiki.org/wiki/User:Yurik/Wikidata_OSM_questions
>
> Lastly, if you have any suggestions on different ways to validate data
> using the mixture of Wikidata and OSM, let me know.  At the moment I have a
> list of all types of OSM objects' wikidata IDs, and mark the bad ones with
> a value. If the "instance of" of an OSM object's wikidata item is one of
> the bad types, my script puts that OSM object into a separate list that I
> can analyze.
> The list of types is here - sort by the second column:
>
> https://commons.wikimedia.org/wiki/Data:Sandbox/Yurik/OSM_object_instanceofs.tab
> Feel free to modify the second value of any row to indicate that those
> objects should be fixed.
>
>
> _______________________________________________
> talk mailing list
> talk at openstreetmap.org
> https://lists.openstreetmap.org/listinfo/talk