[OSM-talk] Adding wikidata tags to the remaining objects with only wikipedia tag

Yuri Astrakhan yuriastrakhan at gmail.com
Tue Sep 26 02:40:36 UTC 2017


Since this thread had not received any new discussion in the past 4 days, I
assumed all points were answered and proceeded as planned, per mechanical
edit policy. Yet, after I had added all the nodes and moved on to
relations, I was blocked by Andy Townsend with the following message.
I believe Andy is acting in the best interest of the project, yet might have
missed or misread this discussion.  Also, the block is such that I am no
longer able to even reply on the changesets to the raised questions, so
moving it here.  I believe I acted in good faith according to the
mechanical edit policy - discussed with the community, and proceeded.

A few interesting semi-relevant statistics so far:  the number of
discovered links to disambig pages is now back to over 800, even without
the 100k+ untagged ways. And there are almost 38,000 OSM objects where the
wikipedia tag does not correspond with the wikidata tag. The number is very high, but
fixing them should be semi-automated, as most of them are redirects. TBD.
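
To illustrate why most of these mismatches should be fixable semi-automatically, here is a minimal sketch of the check, not the actual OSM+Wikidata tool. It assumes two hypothetical lookup tables built beforehand (for example from dumps): `sitelinks`, mapping a (language, title) pair to a Wikidata ID, and `redirects`, mapping a (language, title) pair to the title it redirects to. The function name and return labels are made up for this example.

```python
from typing import Dict, Optional, Tuple

Key = Tuple[str, str]  # (language code, article title)


def normalize(wikipedia_tag: str) -> Key:
    """Split an OSM wikipedia tag such as 'en:Some Title' into (lang, title)."""
    lang, _, title = wikipedia_tag.partition(":")
    return lang.strip(), title.strip().replace("_", " ")


def classify(wikipedia_tag: str,
             wikidata_tag: str,
             sitelinks: Dict[Key, str],
             redirects: Dict[Key, str]) -> str:
    """Classify how an object's wikipedia tag relates to its wikidata tag.

    Both lookup tables are hypothetical inputs assumed for this sketch.
    Returns one of: 'match', 'redirect', 'mismatch', 'unknown'.
    """
    key = normalize(wikipedia_tag)

    # The article links directly to the tagged Wikidata ID.
    if sitelinks.get(key) == wikidata_tag:
        return "match"

    # The article was renamed: following the redirect reaches the tagged ID,
    # so the wikipedia tag could be updated without human judgement.
    target: Optional[str] = redirects.get(key)
    if target is not None and sitelinks.get((key[0], target)) == wikidata_tag:
        return "redirect"

    # Nothing is known about this title at all.
    if key not in sitelinks and target is None:
        return "unknown"

    # The title resolves, but to a different ID: needs a human look.
    return "mismatch"
```

Only the objects classified as "redirect" would be candidates for an automated fix; everything reported as "mismatch" or "unknown" would still need manual review.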

Here's Andy's message, with my inlined replies. I think that almost all of
the raised points have been raised and answered in our previous discussion,
but I feel it is my responsibility to present them again.

> You're conducting an import of known bad data (your own changeset comments
> say "Further cleanup will be done using...").
>

Per previous description, the existing data is already bad, and I am simply
making it possible to identify it, after discussing it on this thread.


> You are wilfully ignoring the feedback that you're receiving now and have
> received in the past. A lot of issues have been raised about the quality of
> your edits - see
> http://resultmaps.neis-one.org/osm-discussion-comments?uid=339581 . In
> many cases you seem to agree that you're adding rubbish, and yet you
> continue.
>
> You seem to be suggesting (in
> https://lists.openstreetmap.org/pipermail/talk/2017-September/078767.html
> ) that "the community" clean up your mess. This is not the way that
> OpenStreetMap works - if an individual is adding data to it (especially
> large quantities of data) then it is their responsibility to ensure that
> the data that they are adding is valid, or at least as valid as the data
> that is already there.
>

Again, no, I am identifying rubbish, not introducing it, and I am very
actively replying to every comment I receive.  This is not "my data" - the
data is already in OSM in the form of the incorrect wikipedia tags. This
action is identical to what iD editor does - it *automatically* adds
corresponding wikidata ID, without any additional checks, and without many
users even being aware of it.  The way to solve the quality of this data is
to analyze it with the OSM+Wikidata tool I have built, to see the
mismatches.  Since there are tens (hundreds?) of thousands of issues
already in the database, it is clearly impossible for one person to fix them.
The available choices are:  me doing it by hand and fixing a handful, or
making it possible to find problems, so everyone can fix them (per Andy
Mabbett's explanation).

> Please go back and reread some of your previous replies on
> http://resultmaps.neis-one.org/osm-discussion-comments?uid=339581 .
> Things like "I will mostly work on high level objects (admin level <= 6)"
> suggests that you are at the very least being disingenuous in your dealings
> with the OSM community.
>

This was written a long time ago, before this effort was even started, and
before I had built the tools (OSM+Wikidata) to let the community find issues.
Back then I had to do everything myself, and since it was clearly
impossible, I stopped after fixing the vast majority of the uncovered
issues by hand.


> Please stop this mechanical edit now and instead spend your time
> addressing the issues that have been raised.
>

I believe I have answered this numerous times above and in previous
conversations.  I cannot address tens of thousands of issues I *find*, I
can only help the community see them, and do my part in fixing them.  Without
this effort, all the bad data in the form of incorrect wikipedia tags will
still be there, quickly rotting away with every wikipedia page rename.

P.S.  An interesting point was brought up by Andy in the later online chat:

>
> in the case of https://www.openstreetmap.org/changeset/43749373 the
> errors were explicitly introduced by you.  The links from OSM to wikipedia
> were correct, the thing (probably a bot) creating the wikidata from
> wikipedia didn't understand the breadth of what the wikipedia article
> represented, and you incorrectly linked from OSM to the wikidata article.
>

Andy, a Wikidata ID is not correct or incorrect -- it is simply a number
assigned to a Wikipedia article.  That entry may have other statements,
which themselves may be incorrect. Adding a Wikidata ID locks the Wikipedia
tag in place, to keep it from going stale - in case that page is renamed,
or in case a disambig is created in its place.  In some cases, the concept
presented in a Wikipedia page is too big for a single Wikidata entry, so
someone may create additional entries.  Adding a Wikidata ID to an OSM object
is not incorrect - it might simply be not precise enough, and can be
improved with my analysis tools. Not having a Wikidata ID is far worse than
having a less precise one, because it cannot be easily analysed and worked
on.

With utmost sincerity,
Yuri.