[OSM-talk] Adding wikidata tags to the remaining objects with only wikipedia tag

Andy Townsend ajt1047 at gmail.com
Tue Sep 26 23:17:18 UTC 2017


On 26/09/2017 03:40, Yuri Astrakhan wrote:
> ... I have been blocked by Andy Townsend with the following message.
>
Let's begin at the beginning - this was a "0-hour block" - you weren't 
prevented from using the API for _any_ period of time, merely forced to 
read this message first.  This was a last resort - many other attempts 
at communication have been made over at least the last 10 months (since 
November 2016 - https://www.openstreetmap.org/changeset/43749373 ).  The 
issues that I raised back then are still true today - see 
https://lists.openstreetmap.org/pipermail/talk/2017-September/078780.html 
for more details.  It makes no sense to mechanically copy a wikidata 
value to OSM when the wikidata object expresses only part of the sense 
of the wikipedia page.

Simple example in case things are still not clear:

1) Imagine there are two objects in OSM - a village and an admin area 
containing that village.
2) Wikipedia only has one page for a "village and an admin area"
3) The wikidata page (probably created by a bot) is only for the village
4) Linking the OSM admin area to the wikidata page for the village is an 
error.

This is the sort of thing that you've been doing again and again for the 
last 10 months.
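
As a rough sketch of how the case above could be flagged for human review rather than edited mechanically: the Python below groups OSM objects by their wikipedia tag and reports any page that more than one object points at. The object IDs, the place names and the `shared_wikipedia_targets` helper are all made up for illustration; this is not a description of any existing tool.

```python
from collections import defaultdict


def shared_wikipedia_targets(objects):
    """Return wikipedia tag values that are used by more than one OSM object.

    Each hit is a candidate for the "village and admin area" situation:
    one wikipedia page, several OSM objects, so a single wikidata ID
    cannot safely be copied onto all of them without a human checking.
    """
    by_title = defaultdict(list)
    for obj in objects:
        title = obj["tags"].get("wikipedia")
        if title:
            by_title[title].append(obj["id"])
    return {title: ids for title, ids in by_title.items() if len(ids) > 1}


# Hypothetical sample data, not real OSM objects.
objects = [
    {"id": "node/1", "tags": {"place": "village", "wikipedia": "en:Exampleton"}},
    {"id": "relation/2", "tags": {"boundary": "administrative", "wikipedia": "en:Exampleton"}},
    {"id": "node/3", "tags": {"place": "village", "wikipedia": "en:Otherville"}},
]

needs_review = shared_wikipedia_targets(objects)
```

Here `needs_review` contains only "en:Exampleton", with both the village node and the admin relation listed against it. Deciding which of the two (if either) matches a given wikidata item still needs someone who knows the place.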

> A few interesting semi-relevant statistics so far:  the number of 
> discovered links to disambig pages is now back to over 800, even 
> without 100k+ untagged ways. And there are almost 38,000 osm objects 
> where wikipedia tag does not correspond with wikidata tag. The number 
> is very high, but fixing them should be semi-automated, as most of 
> them are redirects. TBD.

There are lots of possibilities here.  Maybe the OSM object shouldn't 
have a wikipedia entry at all.  Maybe it's significantly changed since 
the link was added, and should be changed.  It needs someone with 
real-world knowledge of the OSM object to update the links - anything 
else is just guessing, and has no place in OSM.

If by "semi-automated" you mean a human-centric approach like Kort, 
MapRoulette, StreetComplete et al then fine - but that's not been your 
approach so far.

>
> Here's Andy's message, with my inlined replies. I think that almost 
> all of the raised points have been raised and answered in our previous 
> discussion, but I feel it is my responsibility to present them again.
>
>     You're conducting an import of known bad data (your own changeset
>     comments say "Further cleanup will be done using...").
>
>
> Per previous description, the existing data is already bad, and I am 
> simply making it possible to identify it, after discussing it on this 
> thread.
No, that is untrue.  See e.g. 
https://www.openstreetmap.org/changeset/52002597 .


>
>     You are wilfully ignoring the feedback that you're receiving now
>     and have received in the past. A lot of issues have been raised
>     about the quality of your edits - see
>     http://resultmaps.neis-one.org/osm-discussion-comments?uid=339581
>     . In many cases you seem to agree that you're adding rubbish, and
>     yet you continue.
>
>     You seem to be suggesting (in
>     https://lists.openstreetmap.org/pipermail/talk/2017-September/078767.html
>     ) that "the community" clean up your mess. This is not the way
>     that OpenStreetMap works - if an individual is adding data to it
>     (especially large quantities of data) then it is their
>     responsibility to ensure that the data that they are adding is
>     valid, or at least as valid as the data that is already there.
>
>
> Again, no, I am identifying rubbish, not introducing it, and I am very 
> actively replying to every comment I receive.
You are not actually _resolving_ any of the problems that people are 
finding with the edits that you are making.  See for example 
https://www.openstreetmap.org/changeset/52341792 .  In that example 
someone says that you added a wikidata tag in error.  You agree that you 
added it in error (and in fact a whole category of the tags that you've 
added is in error - I've commented on a couple more within the last 
hour).  You have not done anything to resolve this error that you have 
introduced into OSM.

Going further back, in your replies to changeset comments you've said 
things like "I have already stopped changing any objects except the 
admin levels regions 1-6" https://openstreetmap.org/changeset/43775555 
but have carried on regardless.  Mappers have repeatedly asked you to 
use geographically smaller changesets 
https://openstreetmap.org/changeset/44078387 
https://openstreetmap.org/changeset/44090685 
https://openstreetmap.org/changeset/44203236 and yet you continue 
regardless.

Either you're incompetent in the changes you're making or you're lying 
to us; in neither case should you be continuing to edit as you have been 
doing.

> ... The way to solve the quality of this data is to analyze it with 
> the OSM+Wikidata tool I have built,
... or with something else that doesn't require OSM to be mechanically 
edited by you first.  As has already been said 
(https://lists.openstreetmap.org/pipermail/talk/2017-September/078867.html) 
this is utter nonsense.  If you need help resolving links between 
wikipedia and wikidata then get someone from the wikipedia/wikidata 
community to help you - don't dump a whole pile of stuff into OSM and 
expect us to resolve the errors that this reveals (like the village / 
admin area example above, where the error was basically at the stage 
where wikidata was created from wikipedia).

> to see the mismatches.  Since there are tens (hundreds?) of thousands 
> of issues already in the database, it is clearly impossible to fix it 
> by one person. The available choices are:  me doing it by hand, and 
> fixing a handful, or make it possible to find problems, so everyone 
> can fix them. (per Andy Mabbett explanation)
I'm perfectly happy to find and fix problems in OSM.  What I'm not happy 
to do is to find and fix errors between wikipedia and wikidata, and I 
very much object to your assumption that you can simply throw your 
wikidata problems at OSM and have us fix them.  It should be perfectly 
possible to navigate from wikipedia to wikidata so if you really need to 
get a wikidata entry there's no need to add wikidata to OSM at all.  In 
fact, since wikidata tags aren't human-readable there's no way that 
mappers can verify that they are still correct - something that has 
already been pointed out at 
https://lists.openstreetmap.org/pipermail/talk/2017-September/078750.html .
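
A minimal sketch of that navigation, assuming the data consumer starts from an OSM wikipedia tag in the usual "lang:Title" form: the MediaWiki API's `pageprops` query exposes the linked wikidata item as `wikibase_item`. The two function names are invented for this example, and the sample response is hand-written to show the shape of the reply rather than captured from a live request.

```python
import json
from urllib.parse import urlencode


def pageprops_url(wikipedia_tag):
    """Build a MediaWiki API URL asking for the wikidata item of a page.

    `wikipedia_tag` is an OSM-style value such as "en:Douglas Adams".
    """
    lang, sep, title = wikipedia_tag.partition(":")
    if not sep or not lang or not title:
        raise ValueError(f"expected 'lang:Title', got {wikipedia_tag!r}")
    params = {
        "action": "query",
        "prop": "pageprops",
        "ppprop": "wikibase_item",
        "redirects": 1,  # follow page renames instead of failing on them
        "titles": title,
        "format": "json",
    }
    return f"https://{lang}.wikipedia.org/w/api.php?{urlencode(params)}"


def wikidata_id_from_response(body):
    """Extract the wikidata ID from the API's JSON reply, or None if absent."""
    pages = json.loads(body).get("query", {}).get("pages", {})
    for page in pages.values():
        item = page.get("pageprops", {}).get("wikibase_item")
        if item:
            return item
    return None


# Hand-written sample reply (illustrative page ID), not a recorded response.
sample = (
    '{"query": {"pages": {"8091": {"pageid": 8091, "title": "Douglas Adams",'
    ' "pageprops": {"wikibase_item": "Q42"}}}}}'
)
url = pageprops_url("en:Douglas Adams")
item = wikidata_id_from_response(sample)
```

Fetching `url` and passing the body to `wikidata_id_from_response` gives a consumer the wikidata ID on demand, without that ID ever being stored in OSM.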

>
> I believe i have answered this numerous times above and in previous 
> conversations.  I cannot address tens of thousands of issues i *find*,
I think you need to explain what these errors are.  You do not need to 
change any OSM data in order to do that.  A missing wikidata tag in OSM 
is _not_ an error.  A wikidata tag added by you that links something in 
OSM to the wrong thing in wikidata (as per 
https://www.openstreetmap.org/changeset/43749373 et al) _is_ an error.

It's true that some things in OSM have wikipedia links that should really 
be "brand:wikipedia" or "operator:wikipedia" links.   That's not a big 
deal in the scheme of things since often the class of object is one that 
would not normally have a wikipedia page at all and whatever's consuming 
the data (human or machine) can probably filter out the rubbish.  What 
does not help, as you did in 
https://openstreetmap.org/changeset/52008692 , is adding an incorrect 
wikidata link to something that is unlikely to have one (in this example 
a bus stop).  This is just an example found by selecting changesets at 
random; I'm sure there are many more.

> I can only help community see them, and do my part in fixing them.  
> Without this effort, all the bad data in the form of incorrect 
> wikipedia tags will still be there, quickly rotting away with every 
> wikipedia page rename.

Let's be honest - data starts getting out of date as soon as it is added 
to OSM, and wikipedia / wikidata links are no different. Anything (human 
or software) that encounters a broken or meaningless wikipedia page can 
just ignore it and move on.  If we're worried about fixing stuff in OSM, 
let's concentrate on something worthwhile first - survey your local area 
or fix up some TIGER "residential" roads.  If you have personal 
knowledge of objects in OSM then by all means add wikipedia links to 
them if you think that it's useful - but adding wikidata links to OSM as 
well adds no real value to OSM and actually introduces errors if you 
link to the wrong thing, as you have been doing.

>
> P.S.  An interesting point was brought by Andy in the later online chat:
>
>
>     in the case of https://www.openstreetmap.org/changeset/43749373
>     the errors were explicitly introduced by you. The links from OSM
>     to wikipedia were correct, the thing (probably a bot) creating the
>     wikidata from wikipedia didn't understand the breadth of what the
>     wikipedia article represented, and you incorrectly linked from OSM
>     to the wikidata article.
>
>
> Andy, Wikidata ID is not correct or incorrect -- it is simply a number 
> assigned to a Wikipedia article. That number may have other 
> statements, which themselves may be incorrect. Adding Wikidata ID 
> locks that Wikipedia tag in place, to keep it from going stale - in 
> case that page is renamed, and in case a disambig is created in its 
> place.  In some cases, the concept presented in Wikipedia page is too 
> big for a single Wikidata entry, so someone may create additional 
> entries.  Adding Wikidata ID to an OSM object is not incorrect - it 
> might simply be not precise enough, and can be improved with my 
> analysis tools. Not having wikidata ID is far worse, than having a 
> less precise one, because it cannot be easily analysed and worked on.
That's simply rubbish.  Tags on an OSM object describe it in the real 
world.  They should be verifiable.  Whether an OSM object has a wikidata 
tag on it is essentially irrelevant as far as OSM is concerned - it's 
just a primary key into an external database. External data consumers 
might find the data in that database useful, but they can also get to it 
via wikipedia tags (which, being human-readable, are more likely to be 
maintained), so it's really not a big deal.

Regards,

Andy


