[OSM-talk] Could we just pause any wikidata edits for a month or two?

Yuri Astrakhan yuriastrakhan at gmail.com
Wed Oct 18 04:14:34 UTC 2017


Lester, I agree with you that Wikidata should not contain an object for
everything that OSM may have.  I don't believe there should be an entry for
every McDonald's on the planet, or for every artist's work that someone may
decide to include in OSM.  But that's up to Wikidata contributors.  Let's
instead talk about practical usages of our data.

Here is a wonderful site I saw at a conference a few days ago.  It lets you
plan your trip based on the places you are interested in.  You can
visualise all sorts of places - cultural, religious, hotels, bars -
anything, and plot your course.  And it uses Wikidata, images from Commons,
and Wikipedia text itself to describe the places.  The authors spoke at
length about how Wikidata tags in OSM have helped them build it, and the
difficulty they had with all sorts of "data voodoo" to figure things out.
For example, they often correlate OSM & Wikidata locations by proximity,
and try to guess if it's a match. They have done an outstanding job making
sense of our data, but I think we could have made their job a lot easier
with our communal data curation capabilities, and also helped others who
may have similar needs.

https://opentripmap.com/en/#14/40.7355/-73.9806
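
To make the "correlate by proximity and guess" idea concrete, here is a
rough sketch of what such matching might look like. This is only my
illustration, not the site's actual code: the function names, the 100 m
threshold, and the input format (dicts of id -> (lat, lon)) are all
assumptions.

```python
import math

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two lat/lon points."""
    r = 6371000.0  # mean Earth radius in metres
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * r * math.asin(math.sqrt(a))

def guess_matches(osm_objects, wikidata_items, max_dist_m=100.0):
    """Pair each OSM object with its nearest Wikidata item within max_dist_m.

    Both inputs are hypothetical dicts of id -> (lat, lon).
    Returns a list of (osm_id, wikidata_id, distance_m) guesses.
    """
    matches = []
    for osm_id, (olat, olon) in osm_objects.items():
        best = None
        for qid, (wlat, wlon) in wikidata_items.items():
            d = haversine_m(olat, olon, wlat, wlon)
            if d <= max_dist_m and (best is None or d < best[1]):
                best = (qid, d)
        if best is not None:
            matches.append((osm_id, best[0], best[1]))
    return matches
```

Note that the result is still only a guess: two different objects can
easily sit within the threshold of each other, which is exactly why an
explicit, curated link is more useful to a consumer than this.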

You do raise an important point about 1:1 vs part of vs ...  In order to be
useful in data processing by a 3rd party, data needs to answer a simple
question:  does the linked Wikidata/Wikipedia entry represent this whole
object, or is it simply related to it in some way?  Here, the 1:1 is meant
somewhat loosely - there are some cases when things don't align perfectly,
but that's a separate topic.

If the wiki* page is about that object, the consumer may choose to use
multilingual names, show a portion of Wikipedia articles in the user's
language, use Wikidata statements, and show images from Commons.

If the wiki* page is only *related* somehow to the object, no such automatic
usage is possible. The link is still very valuable for the editors of the
map, but not as much to the data consumers.  Examples include a wiki article
that has just a section about this work of art, a wiki page that is a list
of all churches in the area, or a page that describes a class of these
objects (e.g. a brand) but not this object itself.  Moreover, I suspect our
favorite tools like Nominatim would also be misled if they relied on wiki*
links that relate to the object but are not about the object itself. After
all, if the object is well known, it would probably have its own wiki page,
or at least a Wikidata entry.

> Some translations are completely different articles?

I'm not sure what you meant here. I have heard of rare cases when unrelated
Wikipedia articles are connected to each other, but usually those get fixed
as soon as someone notices.


> The problem I still see is that many of the items I am looking to link to
> are elements of an article rather than the whole article, such as the
> location of the works of a particular artist. At some point in the
> future wikidata may well have a complete index of QID's for every
> artist's work, but currently I don't have the time to add wikidata
> entries where they don't exist, so a link to the artists wikipedia
> article which may or may not actually list this particular work is
> second best and in many cases there is not even an english version :(
>

Sure, let's just add it as a different tag, not wikipedia/wikidata. We could
call it related:wikidata or related:wikipedia:en, or subdivide it even
further. Note that here, unlike the main wikipedia tag, the
related:wikipedia:en might not be the same as wikidata. Moreover, I would
argue that here we should use related:wikipedia:xx format with the language
code, because the article content is likely to differ between languages.
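
As a sketch of how a data consumer could treat the two kinds of links
differently: the related:wikidata and related:wikipedia:xx keys are only
the names suggested in this message, not established tags, and the
function name and sample values are made up for illustration.

```python
def split_wiki_links(tags):
    """Split an OSM tag dict into 'about' links and 'related' links.

    'about'   - wikidata / wikipedia: the page describes this very object,
                so names, article text and images can be used automatically.
    'related' - the proposed related:* keys: useful to mappers, but a
                consumer should not assume the page describes this object.
    """
    about, related = {}, {}
    for key, value in tags.items():
        if key in ("wikidata", "wikipedia"):
            about[key] = value
        elif key == "related:wikidata" or key.startswith("related:wikipedia:"):
            related[key] = value
    return about, related
```

A consumer would then pull labels or images only from the first dict,
and at most show the second one as a "see also" link.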


> Some bot then modifying that link out of context is not helpful and
> while the idea of 'nobot' flags may seem a solution, it's just adding
> another layer of complexity which potentially needs to exist for EVERY
> tag on EVERY object. Something I don't think should be allowed!
>

Agree - I think a bot injecting wikipedia/wikidata tags based on some
heuristics, e.g. "has the same object class and is nearby", is not very
good and is error-prone. This could be a human-curated process, e.g. ask the
user to help decide which Wikipedia article this object represents, and
offer some likely candidates, but it shouldn't be automatic.  I think
Mapbox was working on something like that?