[OSM-talk] Adding wikidata tags to the remaining objects with only wikipedia tag

Yuri Astrakhan yuriastrakhan at gmail.com
Tue Sep 26 17:50:20 UTC 2017


Yves, yes, they are external IDs. But so are Wikipedia titles.  Visually
inspecting a Wikipedia title does not give you any way to verify its
correctness - you have to look in the external data source (WP).  As for
entering by hand - you shouldn't enter Wikipedia articles by hand either
- you should copy/paste the title from the article, or use the autocomplete
field in iD.  So in reality, these two things are nearly the same.  On the
other hand, modern tools rely on an internet connection, which means that an
ID can be shown as text in the user's language, together with other metadata
from Wikidata.  The concept of "internal" vs "external" is not as relevant now
as it was in the past...  (there is only one data - the internet :))
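
As a rough illustration of showing an ID as text in the user's language, here
is a minimal sketch. It assumes the tool has already fetched the entity record
and that its "labels" part maps a language code to a {"language", "value"}
pair. SAMPLE_ENTITY is hand-written for this example, and display_label is a
hypothetical helper, not part of any existing editor.

```python
from typing import Sequence

# Hand-written sample shaped like the "labels" part of a Wikidata entity
# record (language code -> {"language", "value"}). Illustrative data only.
SAMPLE_ENTITY = {
    "id": "Q64",
    "labels": {
        "en": {"language": "en", "value": "Berlin"},
        "ru": {"language": "ru", "value": "Берлин"},
        "ja": {"language": "ja", "value": "ベルリン"},
    },
}


def display_label(entity: dict, preferred_langs: Sequence[str]) -> str:
    """Return the label in the first preferred language that has one.

    Falls back to the raw ID when none of the preferred languages is
    available, so the user always sees something.
    """
    labels = entity.get("labels", {})
    for lang in preferred_langs:
        if lang in labels:
            return labels[lang]["value"]
    return entity["id"]


if __name__ == "__main__":
    # A user whose preferences are Russian first, then English.
    print(display_label(SAMPLE_ENTITY, ["ru", "en"]))
```

The fallback to the bare ID is a design choice for the sketch: an unreadable
ID is still better than an empty field.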

On Tue, Sep 26, 2017 at 1:43 PM, Yves <yvecai at gmail.com> wrote:

> I think that the underlying issue in wikidata tags is that they are
> external IDs. Not human readable, they cannot be entered 'by hand' nor
> verified on the ground.
> Once you accept them in OSM, you can't really complain about bots.
>
> Yves (who still thinks such UIDs are only needed for the lack of good query
> tools).
>
>
>
> On 26 September 2017 19:08:33 GMT+02:00, Yuri Astrakhan <
> yuriastrakhan at gmail.com> wrote:
>>
>> > p.s. OSM is a community project, not a programmers project, it's about
>>> > people, not software :-)
>>>
>> It's both.  OSM is first and foremost a community, but the result of
>> our effort is a machine-readable database.  We are not creating an
>> encyclopedia that will be casually flipped through by humans. We produce
>> data that gets interpreted by software, so that it can render maps and be
>> searchable.  For example, if every person uses their own tag names and ways
>> to record things, the data will have nearly zero value.  We must agree on
>> conventions so that software can understand our results - which is exactly
>> what we have been doing on wiki and in email channels. Any tag and value
>> that cannot be recognized and processed by software is effectively ignored.
>>
>>
>>>   Totally agree. If some script can automatically add new tag
>>> (wikidata) without any actual WORK needed, then it is pointless,
>>> anybody can run an auto-update script.
>>
>>   When ordinary (non geek) mappers do ACTUAL WORK - add wikipedia
>>> data, they add wikipedia link, not wikidata "stuff".
>>>
>>
>> While sand castles may look nice, they don't last very long. When
>> ordinary people add just the Wikipedia article, that link quickly gets
>> stale and becomes irrelevant and often incorrect. Wikipedia article
>> titles are not stable. They get renamed all the time - I have already
>> found tens of thousands of such cases in OSM.  Often, the community
>> renames WP articles because there is more than one meaning, and then
>> creates a new article with the same name in its place - a disambig page.
>> There is no
>> easy way to analyse wikipedia links for content - you cannot easily
>> determine if the wikipedia article is about a person, a country, or a
>> house, which makes it impossible to check for correctness.
>>
>> When I spend half an hour of my time researching which WP article is best
>> for an object, I do not want that effort to be wasted just because someone
>> else puts a disambig page in its place, and I have to redo all my work.
>>
>>   When data consumers want to get a link to corresponding wikipedia
>>> article, doing that with wikipedia[:xx] tags is straightforward. Doing
>>> the same with wikidata requires additional pointless and time
>>> consuming abrakadabra.
>>>
>>
>> No, you clearly haven't worked with any data consumers recently. Data
>> consumers want Wikidata, much more than wikipedia tags - please talk to
>> them. Wikidata gives you the list of wikipedia articles in all available
>> languages, it lets you get multi-lingual names when they are not specified
>> in OSM, it allows much more intelligent searches based on types of objects,
>> it allows quality controls.  The abrakadabra is exactly what one has to do
>> when parsing non-standardized data.
>>
>>>
>>>   Validation of wikipedia tag values can and IS already done using osm
>>> data versus wikipedia-geolocated data extracts/dumps.
>>>
>> Sure, it can be done via dump parsing, but it is much more complicated than
>> querying.  Would you rather use Overpass turbo to do a quick search for
>> some weird thing that you noticed, or download and parse the dump?  Most
>> people would rather do the former. Here is the same thing - you *could* do
>> validation via a dump, but that barrier of entry is so high, most people
>> wouldn't.  With the new OSM+Wikidata tool, which is already getting
>> hundreds of thousands of requests (!!!), it is possible to get just the data
>> you need, and fix the problems that have always been present, but hidden.
>> And all that is possible because of a single tag.
>>
>
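
To make the quoted points about data consumers and quality control a bit more
concrete, here is a minimal sketch. It assumes the consumer already has the
"sitelinks" part of a Wikidata entity record (site ID -> {"site", "title"}).
SAMPLE_SITELINKS is hand-written, both function names are hypothetical, and
the NON_WIKIPEDIA_SITES exclusion list is an assumption that is not
exhaustive.

```python
# Hand-written sample shaped like the "sitelinks" part of a Wikidata entity
# record. Illustrative data only.
SAMPLE_SITELINKS = {
    "enwiki": {"site": "enwiki", "title": "Berlin"},
    "ruwiki": {"site": "ruwiki", "title": "Берлин"},
    "commonswiki": {"site": "commonswiki", "title": "Berlin"},
}

# Site IDs that end in "wiki" but are not Wikipedia language editions.
# Assumption: this short list is enough for the sketch; it is not complete.
NON_WIKIPEDIA_SITES = {"commonswiki", "specieswiki", "metawiki",
                       "wikidatawiki", "mediawikiwiki", "sourceswiki"}


def wikipedia_tags_from_sitelinks(sitelinks: dict) -> dict:
    """Map language code -> OSM-style 'lang:Title' value for each Wikipedia."""
    result = {}
    for site, link in sitelinks.items():
        if site.endswith("wiki") and site not in NON_WIKIPEDIA_SITES:
            # Site IDs use underscores where language codes use hyphens.
            lang = site[:-len("wiki")].replace("_", "-")
            result[lang] = f"{lang}:{link['title']}"
    return result


def wikipedia_tag_matches(tag_value: str, sitelinks: dict) -> bool:
    """Check that a 'lang:Title' wikipedia tag agrees with the sitelinks."""
    lang, sep, title = tag_value.partition(":")
    if not sep:
        return False  # no language prefix, nothing to compare against
    site = lang.replace("-", "_") + "wiki"
    link = sitelinks.get(site)
    if link is None:
        return False
    # Treat underscores and spaces in titles as equivalent.
    return link["title"] == title.replace("_", " ").strip()


if __name__ == "__main__":
    print(wikipedia_tags_from_sitelinks(SAMPLE_SITELINKS))
    print(wikipedia_tag_matches("en:Berlin", SAMPLE_SITELINKS))
```

A mismatch from the second function does not say which tag is wrong; it only
flags the object for a human to look at.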