[Tagging] RFC: remove alphanumeric code visible in infoboxes at OSM Wiki linking to Wikidata

Minh Nguyen minh at nguyen.cincinnati.oh.us
Wed Jan 20 09:05:17 UTC 2021

At 13:20 on 2021-01-19, Mateusz Konieczny via Tagging wrote:
> Jan 19, 2021, 19:34 by minh at nguyen.cincinnati.oh.us:
>     Like OSM and its own tagging ontology, Wikidata is still a work in
>     progress. The fact that something isn't represented now doesn't mean
>     it can't represent it a minute from now. You've essentially
>     identified a number of instances of mistaken or unrefined tagging.
>     We know all about that in OSM! :-)
> My claim is that
> - (1) it is a fundamental issue, with a very large part of supposed 
> matches being useless or misleading

I'm not convinced that enough of the data items' "Wikidata concept" 
statements or Wikidata's "OSM tag or key" statements are useless that we 
should cast aspersions on both properties -- if only because these 
properties aren't necessarily used in ways that require the degree of 
semantic precision and binary correctness that has been suggested in 
this thread. Besides, there are plenty of tags for which there can be no 
daylight between Wikidata's definition and OSM's. For example, the 
consensus in [1] would be untenable without some way to associate 
denomination=apostolic_assembly_of_the_faith_in_christ_jesus with 
Q3625552 and so on.

To the extent that anyone has been misled by a Wikidata link in the 
infobox, it may be because the link is presented without explanation. A 
cramped infobox isn't an effective place to introduce a reader to an 
advanced concept or workflow, nor is it a junk drawer for everything 
that might be handy as a shortcut for advanced users. This is not 
limited to Template:Description. (Template:Place, I'm looking at you. [2])

> - (2) even if we fix mismatches, these links would not be useful for 
> infobox display

I don't disagree with you. I was commenting on the inevitable digression 
on this thread rather than the original RFC. Sorry for being unclear.

> - (3) we are anyway missing people willing to fix these invalid links in 
> OSM Wiki
> though I see that you improved
> https://www.wikidata.org/w/index.php?title=Q994972&action=history - 
> thanks!


The OSM Wiki needs more care and attention in general. Others have 
claimed that Wikidata could serve as a stopgap for the wiki's 
unfortunate gaps in translation. I'm sympathetic to that notion because 
relatively few of OSM's popular languages have made a dent in terms of 
translation coverage on the wiki. (Not sure we even have statistics on 
that, aside from the handful of languages privileged enough to have 
tracking categories.)

If the community finds the tradeoff of semantic imprecision to be 
anathema, then the alternative to a dependency on Wikidata is that the wiki 
needs to be more self-sufficient. Self-sufficiency requires a _lot_ more 
activity from a broader cross-section of the mapping community. Some 
outstanding proposals to upgrade the wiki [3][4] would facilitate that 
increased activity, but even some subscribers to this list are 
intimidated by the wiki's basic editing workflow. Of course, it would 
also help if standard parts of the wiki's pages, such as the infobox, 
were more self-explanatory or at least better documented, so that folks 
can more easily identify mistakes.

> If you are interested I can generate a big table listing OSM values/keys,
> their descriptions at the OSM wiki, and the descriptions of what was matched
> to them at Wikidata. The blatantly wrong ones are nearly all fixed, but many
> dubious ones remain and I am not sure what should be done with them.

If you can spare the time and effort, I'm sure the Wikidata proponents 
would appreciate your help identifying these problems. Getting these 
associations right will also benefit OSM, not just Wikidata. 
Correspondence tables are a great way to spot gaps in our tagging 
system, as I'm discovering trying to map U.S. traffic signs to OSM tags. [5]

> There are many like
> flag:type=military 
> https://wiki.openstreetmap.org/wiki/Tag:flag:type%3Dmilitary
> linking https://www.wikidata.org/wiki/Q602300
> "war flag - variant of a national flag for use by the nation's military 
> forces on land"
> with mismatch "on land" from Wikidata and
> "Military flags include war flags, naval flags, and flags representing 
> specific military forces or units."
> on OSM wiki
> Keep it? But it is actively misleading: incorrect just enough to not be 
> spotted by someone who followed the link, but enough to cause confusion, 
> especially for someone using it for translating or understanding tag meaning.
> Delete it? Then we would delete most of the current matches. There are some 
> exact matches, but they are fairly rare.

I solved this one by creating a new item to represent military flags in 
general. Wikidata actually needed this item anyways, because Wikimedia 
Commons was distinguishing between military flags and war flags, whereas 
Wikipedia was not. [6]

I don't think a handful of examples proves that the 2,191 "Wikidata 
concept" data item statements or the 2,959 "OSM tag or key" Wikidata 
statements are generally unreliable. (Though you happened to find a 
mistake I made in haste, so if anything, I'm unreliable!) Regardless, I 
think we can find agreement on the merits of the infobox link (or lack 
thereof) before even considering its reliability.

>     If there's no exact match for amenity=toilets, one can be created as
>     a replacement for Q813966 in the infobox and in the associated data
>     item. The "OSM tag or key" external ID alone will likely protect
>     that entity from deletion, based on Wikidata's notability
>     guidelines. It isn't so outlandish to imagine hundreds of OSM keys
>     and tags getting their own Wikidata items in the future for the same
>     reason.
> What would be the benefit over 
> https://wiki.openstreetmap.org/wiki/Data_items
> that are strict matches to OSM tags?

The data item would still link to a Wikidata item regardless, and that 
item would in turn link to the original, not-quite-fitting item.

Perhaps the question is why the data item needs a link to Wikidata at 
all if Wikidata is going to link back to the OSM Wiki anyways? In 
theory, reciprocal properties should not be necessary. However, in 
practice, some data consumers like iD will find themselves querying the 
OSM Wiki in the first instance, while others will query Wikidata 
directly; it's a bit more convenient to minimize database crosswalks 
where possible, and that allows the wiki to be marginally more 

> And putting there some relationships to Wikidata instances such as
> - narrower than Q813966
> - wider than Q602300
> - intersection of QX and QY
> - sum of QA, QB, QC
> - place where profession QT is practiced
> - access of QZ is specified by this tag
> - organization QP assigns this identifier
> etc
> rather than current "this is sort-of related, not specified how"

Another half-measure would be to add a qualifier on the data item's 
"Wikidata concept" statement to indicate that the "nature of statement" 
is "approximate" in some way. Wikidata is rife with qualifiers along 
these lines, and we could easily set up something similar for data 
items. But that nuance won't easily fit into the infobox.
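
To make the qualifier idea a bit more concrete, here is a minimal Python 
sketch. Everything in it is hypothetical: the class names, the `nature` 
field, and its allowed values are made up for illustration and do not 
reflect how data items or Wikidata actually store qualifiers. The only 
thing borrowed from this thread is the amenity=toilets / Q813966 pairing.

```python
from dataclasses import dataclass
from enum import Enum


class Nature(Enum):
    """Hypothetical values for a "nature of statement" qualifier."""
    EXACT = "exact"
    APPROXIMATE = "approximate"
    NARROWER = "narrower than"
    WIDER = "wider than"


@dataclass(frozen=True)
class WikidataConcept:
    """One "Wikidata concept" statement on a data item (illustrative only)."""
    osm_tag: str
    qid: str
    nature: Nature = Nature.EXACT


def describe(stmt: WikidataConcept) -> str:
    """Render the statement so that an inexact match is visible to the reader."""
    if stmt.nature is Nature.EXACT:
        return stmt.qid
    return f"{stmt.qid} ({stmt.nature.value})"


# The pairing discussed in this thread, flagged as an inexact match.
toilets = WikidataConcept("amenity=toilets", "Q813966", Nature.APPROXIMATE)
print(describe(toilets))
```

A data consumer could then choose to skip or flag anything that is not 
marked as exact, without the wiki having to drop the link altogether.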

>     I think you've at least made a pretty good case that the Wikidata
>     line in the infobox needs to be contextualized. For that matter, the
>     wiki could use a Help:Infobox page that explains each of the
>     sections in more detail.
> What do you mean by "contextualized"? (I am open to alternative proposals 
> for how to handle it, but I am not sure what you mean here)

My first thought was to simply fold the Wikidata link into the "Tools 
for this tag" list. [7] At least this sets the expectation that Wikidata 
is some kind of external helper tool. In the context of documentation 
about a specific tag, Wikidata is probably no more relevant than taginfo 
or overpass turbo anyways.

Secondly, if we point a new mapper to the wiki as a valuable resource 
that somehow gives them enough guidance to avoid nastygrams about 
tagging in changeset comments, then I'm not sure they'd necessarily know 
the distinction between "Implies", "Requires", and "Useful combination"; 
what taginfo is; what the taginfo numbers mean; or what the "IN" link 
does. We've managed to squeeze in a few links and tooltips, but those 
things are hardly discoverable or comprehensive. The Wikidata link is 
just one more usability papercut.

Contextualization means explicitly clarifying the relationship between 
the link and the overall page and pointing out any major caveats. It may 
be too much to expect of such a small amount of real estate. That, to 
me, would be a stronger argument for removing the link than any 
hairsplitting about washrooms.

A "Help:Key and value description pages" article aimed at readers would 
complement the editor-focused page you recently created. [8] 
Wikipedia's help page for taxoboxes could be a helpful starting point. 
[9] Personally, I tend to dive into things before reading documentation, 
but some people are more comfortable interacting with a website that 
provides that level of detail.

[2] But seriously: 
[3] 3 proposals at 
et seq.
[4] https://github.com/openstreetmap/operations/labels/service%3Awiki
[5] https://wiki.openstreetmap.org/wiki/MUTCD
[6] https://www.wikidata.org/wiki/Q104911496
[7] https://wiki.openstreetmap.org/wiki/Special:Diff/2094793
[9] https://en.wikipedia.org/wiki/Wikipedia:How_to_read_a_taxobox

minh at nguyen.cincinnati.oh.us
