[OSM-talk] Adding wikidata tags to the remaining objects with only wikipedia tag

Yuri Astrakhan yuriastrakhan at gmail.com
Tue Sep 26 17:08:33 UTC 2017


>
> > p.s. OSM is a community project, not a programmers project, it's about
> > people, not software :-)
>
It's both.  OSM is first and foremost a community, but the result of
our effort is a machine-readable database.  We are not creating an
encyclopedia that will be casually flipped through by humans. We produce
data that gets interpreted by software, so that it can render maps and be
searchable.  For example, if every person uses their own tag names and ways
to record things, the data will have nearly zero value.  We must agree on
conventions so that software can understand our results - which is exactly
what we have been doing on wiki and in email channels. Any tag and value
that cannot be recognized and processed by software is effectively ignored.


>   Totally agree. If some script can automatically add new tag
> (wikidata) without any actual WORK needed, then it is pointless,
> anybody can run an auto-update script.

>   When ordinary (non geek) mappers do ACTUAL WORK - add wikipedia
> data, they add wikipedia link, not wikidata "stuff".
>

While sand castles may look nice, they don't last very long. When ordinary
people add just the Wikipedia article, that link quickly gets stale and
becomes irrelevant and often incorrect. Wikipedia article titles are not
stable. They get renamed all the time - I have already found tens of
thousands of such links in OSM.  Often the community renames a WP article
because its title has more than one meaning, and then creates a new article
with the same name in its place - a disambig page.  There is no easy way to analyse
wikipedia links for content - you cannot easily determine if the wikipedia
article is about a person, a country, or a house, which makes it impossible
to check for correctness.

When I spend half an hour of my time researching which WP article is best
for an object, I do not want that effort to be wasted just because someone
else puts a disambig page in its place, and I have to redo all my work.

>   When data consumers want to get a link to corresponding wikipedia
> article, doing that with wikipedia[:xx] tags is straightforward. Doing
> the same with wikidata requires additional pointless and time
> consuming abrakadabra.
>

No, you clearly haven't worked with any data consumers recently. Data
consumers want Wikidata, much more than wikipedia tags - please talk to
them. Wikidata gives you the list of wikipedia articles in all available
languages, it lets you get multi-lingual names when they are not specified
in OSM, it allows much more intelligent searches based on types of objects,
it allows quality controls.  The abrakadabra is exactly what one has to do
when parsing non-standardized data.
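
To make the point about articles and names concrete, here is a minimal
sketch of what a consumer can read from a single Wikidata entity. It
assumes the entity JSON has the usual "labels" and "sitelinks" layout of a
Wikidata entity record. The sample record is hand-written for illustration
(it is not a real API response), and the function names are my own, not
part of any library.

```python
# Site ids that end in "wiki" but are not Wikipedia language editions.
NON_WIKIPEDIA_SITES = {
    "commonswiki", "specieswiki", "metawiki",
    "wikidatawiki", "mediawikiwiki", "sourceswiki",
}

# Hand-written sample in the shape of a Wikidata entity record (illustrative).
SAMPLE_ENTITY = {
    "id": "Q64",
    "labels": {
        "en": {"language": "en", "value": "Berlin"},
        "de": {"language": "de", "value": "Berlin"},
        "ru": {"language": "ru", "value": "Берлин"},
    },
    "sitelinks": {
        "enwiki": {"site": "enwiki", "title": "Berlin"},
        "dewiki": {"site": "dewiki", "title": "Berlin"},
        "ruwiki": {"site": "ruwiki", "title": "Берлин"},
        "commonswiki": {"site": "commonswiki", "title": "Berlin"},
        "enwikivoyage": {"site": "enwikivoyage", "title": "Berlin"},
    },
}


def wikipedia_titles(entity):
    """Map language code -> Wikipedia article title for one entity."""
    titles = {}
    for site, link in entity.get("sitelinks", {}).items():
        if site.endswith("wiki") and site not in NON_WIKIPEDIA_SITES:
            # "enwiki" -> "en", "zh_min_nanwiki" -> "zh-min-nan"
            lang = site[: -len("wiki")].replace("_", "-")
            titles[lang] = link["title"]
    return titles


def names(entity):
    """Map language code -> label, usable where OSM has no name:xx tag."""
    return {lang: label["value"]
            for lang, label in entity.get("labels", {}).items()}


def osm_wikipedia_tag(entity, lang):
    """Build an OSM-style wikipedia tag value ("lang:Title"), or None."""
    title = wikipedia_titles(entity).get(lang)
    return f"{lang}:{title}" if title is not None else None


if __name__ == "__main__":
    print(wikipedia_titles(SAMPLE_ENTITY))
    print(names(SAMPLE_ENTITY))
    print(osm_wikipedia_tag(SAMPLE_ENTITY, "de"))
```

Going from the wikidata tag to a wikipedia link in any language is one
dictionary lookup. Going the other way, from a single wikipedia title to
every other language, is not possible without Wikidata.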

>
>   Validation of wikipedia tag values can and IS already done using osm
> data versus wikipedia-geolocated data extracts/dumps.
>
Sure, it can be done via dump parsing, but that is much more complicated than
querying.  Would you rather use Overpass turbo to do a quick search for
some weird thing that you noticed, or download and parse the dump?  Most
people would rather do the former. It is the same here - you *could* do
validation via a dump, but the barrier to entry is so high that most people
wouldn't.  With the new OSM+Wikidata tool, which is already getting
hundreds of thousands of requests (!!!), it is possible to get just the data
you need, and fix the problems that have always been present, but hidden.
And all that is possible because of a single tag.
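
As an illustration of the kind of check that single tag makes cheap, here
is a rough sketch that flags OSM objects whose wikidata tag points at a
disambiguation page. It assumes the standard Wikidata claim layout, where
P31 is "instance of" and Q4167410 is "Wikimedia disambiguation page". The
OSM ids, the second entity id, and the helper names are made up for the
example, and this is not how the OSM+Wikidata tool works internally.

```python
DISAMBIGUATION_PAGE = "Q4167410"  # Wikidata: "Wikimedia disambiguation page"


def instance_of(entity):
    """Return the set of item ids in the entity's P31 (instance of) claims."""
    ids = set()
    for statement in entity.get("claims", {}).get("P31", []):
        snak = statement.get("mainsnak", {})
        # "novalue" / "somevalue" snaks carry no datavalue, so skip them.
        if snak.get("snaktype") == "value":
            ids.add(snak["datavalue"]["value"]["id"])
    return ids


def is_disambiguation(entity):
    return DISAMBIGUATION_PAGE in instance_of(entity)


def make_entity(qid, classes):
    """Build a minimal hand-written entity record (illustration only)."""
    return {
        "id": qid,
        "claims": {
            "P31": [
                {"mainsnak": {"snaktype": "value", "property": "P31",
                              "datavalue": {"value": {"id": c}}}}
                for c in classes
            ]
        },
    }


# Hypothetical data: Q64 is a city (Q515); Q999999999 is a made-up id
# standing in for a disambiguation page.
ENTITIES = {
    "Q64": make_entity("Q64", ["Q515"]),
    "Q999999999": make_entity("Q999999999", [DISAMBIGUATION_PAGE]),
}

# Hypothetical (OSM object, wikidata tag value) pairs.
OSM_OBJECTS = [("node/1", "Q64"), ("node/2", "Q999999999")]


def flag_disambiguations(osm_objects, entities):
    """Return the OSM ids whose wikidata tag is a disambiguation page."""
    return [osm_id for osm_id, qid in osm_objects
            if qid in entities and is_disambiguation(entities[qid])]


if __name__ == "__main__":
    print(flag_disambiguations(OSM_OBJECTS, ENTITIES))
```

The same test against a bare wikipedia title would need the article text
itself to be fetched and interpreted.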