[OSM-talk] Separating all metadata from coordinates in OSM into a wikibase instance (Was: Re: Roadmap for deprecation of name tags in OSM)

pangoSE pangose at riseup.net
Sun Aug 9 12:29:58 UTC 2020


The discussion below spawned the following idea of migrating the whole tags system instead.

Remember that we OSM contributors are in the "business" of generating high-quality geodata AND metadata, ideally with references for every single change or statement and ideally linked to other data sources about the physical world.

The geodata handling in OSM is IMO superb. 

Unfortunately the metadata is handled less well: it is not linked data with unique ids, and individual statements cannot be referenced.

I really dislike the current OSM tags system. It's IMO not the best way to handle metadata in 2020 and should be fixed as soon as possible.
I therefore suggest we create a wikibase instance called OSMData and migrate all our tags into that system.

Of course this is also a big change which has to be considered carefully.

I believe linked data is the only sane way to go forward when it comes to metadata.

What I'm suggesting here is to migrate from our own special purpose legacy tags system very specific to osm to a more standardized linked data storage model based on unique identifiers instead.

This would make it much easier for others to handle our non-geographic data and to create validations on the languages used, the tag combinations used, etc. It could simply link to Wikidata when needed, e.g. for labels of countries, cities, etc.
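
To make the idea more concrete, here is a minimal Python sketch of how flat tags might be turned into individually identified statements, plus one simple validation on language codes. Everything in it is an assumption for illustration only: the `Statement` class, the `S` id prefix, and the helper names are hypothetical and are not part of OSM or Wikibase.

```python
from dataclasses import dataclass, field
from itertools import count
from typing import Dict, List, Optional, Set

_ids = count(1)  # hypothetical id source; a real store would assign ids itself


@dataclass
class Statement:
    """One individually referenceable claim about an object (hypothetical model)."""
    prop: str                          # e.g. "name" or "amenity"
    value: str
    language: Optional[str] = None     # set only for "name:<lang>" tags
    references: List[str] = field(default_factory=list)
    statement_id: str = field(default_factory=lambda: f"S{next(_ids)}")


def tags_to_statements(tags: Dict[str, str]) -> List[Statement]:
    """Convert flat OSM-style tags into statements with their own ids."""
    statements = []
    for key, value in tags.items():
        if key.startswith("name:"):
            # "name:en" becomes property "name" with language "en"
            prop, language = key.split(":", 1)
            statements.append(Statement(prop, value, language))
        else:
            statements.append(Statement(key, value))
    return statements


def invalid_languages(statements: List[Statement], allowed: Set[str]) -> List[Statement]:
    """Return the statements whose language code is not in the allowed set."""
    return [s for s in statements if s.language is not None and s.language not in allowed]


if __name__ == "__main__":
    tags = {"amenity": "hospital", "name": "Example Hospital",
            "name:en": "Example Hospital", "name:xx": "Typo Language"}
    for bad in invalid_languages(tags_to_statements(tags), {"en", "sv"}):
        print(bad.statement_id, bad.prop, bad.language, bad.value)
```

Because each statement carries its own id, a validator or a reference can point at one specific claim instead of at the whole object.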

We would also get unique ids for all OSM objects, which is very nice.

The js interface of wikibase is unfortunately quite horrible IMO, but I believe our editors can easily be adapted to update osmdata in the wikibase via the REST API so that most mappers won't have to visit the wikibase UI at all.

I also suggest that we carefully consider what license we want for this metadata. I personally suggest CC0.
All the geographic data in OSM would remain under the current license.

This protects all the hard labor of gathering and verifying coordinates and makes it easier to share metadata and for data consumers to integrate osmdata and wikidata (except the coordinates).

I have not thought a lot about the implications of this license change for the metadata, but it really makes no sense to restrict public metadata in general IMO. It would help Wikidata and the open data movement a lot, as you would be able to e.g. crosscheck the names of all hospitals in Kenya and add missing names in either project without worrying about license restrictions. Or list hospitals that appear in one project but are missing from the other. Or list differing names/labels and compare the references, etc.

None of the above examples are easy to do today, as the author of the brilliant https://github.com/EdwardBetts/osm-wikidata/ can surely testify. In a SPARQL linked data world these would be rather simple queries, crafted in a few minutes by the experienced SPARQL query writers we already have in our community.
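
As a rough illustration of the three cross-checks mentioned above (missing in either project, and differing names), here is a minimal Python sketch. It assumes both datasets have already been exported as plain dictionaries keyed by a shared identifier; the function name, the ids, and the hospital names are made up for the example and do not come from either project.

```python
from typing import Dict, List


def crosscheck(osm_names: Dict[str, str], wikidata_names: Dict[str, str]) -> Dict[str, List[str]]:
    """Compare two id -> name mappings and report the differences.

    Both arguments are hypothetical exports keyed by a shared id.
    """
    osm_ids = set(osm_names)
    wd_ids = set(wikidata_names)
    shared = osm_ids & wd_ids
    return {
        # objects known to Wikidata but absent from the OSM export
        "missing_in_osm": sorted(wd_ids - osm_ids),
        # objects known to OSM but absent from the Wikidata export
        "missing_in_wikidata": sorted(osm_ids - wd_ids),
        # objects present in both whose names do not match exactly
        "differing_names": sorted(i for i in shared if osm_names[i] != wikidata_names[i]),
    }


if __name__ == "__main__":
    # Made-up sample data, for illustration only
    osm = {"X1": "Example Hospital A", "X2": "Example Hospital B", "X3": "Example Clinic"}
    wikidata = {"X2": "Example Hospital B", "X3": "Example Clinic C", "X4": "Example Hospital D"}
    for check, ids in crosscheck(osm, wikidata).items():
        print(check, ids)
```

An exact string comparison is the simplest possible choice here; a real comparison would probably need to normalize case, whitespace and language variants before flagging a name as different.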

PS: The past introduction of Wikibase as an add-on to the OSM wiki is unfortunate, because it does not seem to have resulted in many benefits and may have given some OSM contributors a negative image of linked data. It's really very different from what I propose here.

pangoSE <pangose at riseup.net> wrote: (9 August 2020 13:05:54 CEST)
>These are valid concerns. See my response to James.
>If Wikimedia should become uncooperative we could easily set up our own
>wikibase installation. See https://www.wbstack.com/
>It takes a few minutes plus some configuration time.
>In fact this might be much better than forcing our data into wikidata
>which is very tied to education and does not accept all our objects
>that have names currently.
>In case we take this route I would recommend having another prefix than
>Q for our unique ids.
>Mateusz Konieczny via talk <talk at openstreetmap.org> wrote: (9 August
>2020 12:16:33 CEST)
>>or has downtime? or deletes data/items used by OSM? or bans OSM
>>or refuses to ban vandal/troll/harasser? or fails to ban them quickly?
>>Aug 9, 2020, 11:45 by james2432 at gmail.com:
>>> is there a contingency plan if wikipedia/wikimedia ceases to exist?
>>> On Sun., Aug. 9, 2020, 4:29 a.m. pangoSE, <pangose at riseup.net> wrote:
>>>> I suggest we create a roadmap for deprecating the storing and
>>>> updating of names in OSM for objects with a Wikidata tag.
>>>> The rationale is explained here:
>>>> https://josm.openstreetmap.de/ticket/19655
>>>> This of course affects the whole project and data consumers, as
>>>> every OSM user will have to become a Wikidata user as well to edit the
>>>> names or add name references (through the editors).
>>>> Substantial changes will have to be made:
>>>> * nominatim will need to support fetching names from wikidata
>>>> somehow. It could probably be done on the fly.
>>>> * openstreetmap.org <http://openstreetmap.org> will need to
>>>> fetch from wikidata when displaying any object.
>>>> * rendering the standard map will have to support fetching from
>>>> Wikidata
>>>> * all editors would have to fetch and enable editing of Wikidata
>>>> * maybe it no longer makes sense to have 2 separate logins? We
>>>> should unify the logging in as much as possible. Ideas are welcome on
>>>> how to do that. Perhaps retire signing up as OSM user on osm.org
>>>> <http://osm.org> and ask users to create a Wikimedia account
>>>> and log in with that?
>>>> I personally don't see any problems connecting Wikimedia and OSM
>>>> closer than the islands they are today.
>>>> As mentioned in the ticket above data consumers like Mapbox already
>>>> prefer Wikidata names. I'm guessing that's because they are simply
>>>> better quality, better modeled, better referenced and better protected
>>>> against vandalism.
>>>> WDYT?
>>>> Cheers
>>>> pangoSE
>>>> PS: I chose this list because this relates not only to tagging, but
>>>> to the wider ecosystem.
>>>> _______________________________________________
>>>> talk mailing list
>>>> talk at openstreetmap.org
>>>> https://lists.openstreetmap.org/listinfo/talk

