[Osmf-talk] Tagging standards and meaning of words

grin grin-osm at drop.grin.hu
Fri Oct 21 22:46:44 UTC 2022


On Fri, 21 Oct 2022 16:42:00 +0300
Jaak Laineste <jaak at nutiteq.com> wrote:

> > For example there is no key for Hungarian cukrászda, which is not a confectionery, though that is the key used for such objects. Shall I create a new key? Maybe I should.
> > 
> > Does café mean something which works like a café (in England?) or something which is CALLED a café locally?  
> 
> Maybe you should semantically, but pragmatically you also need to think of all the data “renderers”; these would never get cukrászda, or whatever it is called in the 1000+ other known written languages of the world. Find the nearest English term with the most logical icon on your favorite map.

On the contrary, the problem is that there is no such thing in England or in the US, while Europe is full of them, called Konditorei or the like.

It is a problem when we take into account that OSM was created on that island, using its point of view of the world, and that we try to find similar things where the key WORDS do NOT describe the objects but the key MEANING (which we define somehow) may.


However, cukrászda was just an example of how we use natural-language words as labels defining a theory (a type of object), more or less loosely, and often we cannot even know how loosely. Let me show you an example of what I mean here.


There is a wonderful project called Wikidata, which is the database magic behind Wikipedia. It is a database of objects, or theories. Every entry is strictly one single, well-defined thing. (In theory at least; let's suppose for a moment that they really are.)

So there are a dozen entries for Hungary. Why? Because Hungary is not the same country as the Republic of Hungary (different name), nor the same as the People's Republic of Hungary (different name and different governance), which in turn differs from the Kingdom of Hungary (different area), or actually from half a dozen countries all called "Kingdom of Hungary" at various times in history, with different governance, area, borders, and whatnot.

These distinct objects do not have a name; they have a "label", an object identifier like Q1470101 or Q16056854, so there is no clash between the NAME ("word") and the actual OBJECT ("theory") behind the name.

Outside Wikidata there MAY be a conflict when someone looks for "Kingdom of Hungary", since they have to actually think about it, and strictly define what they have in mind, then look up the specific object identifier, but from then on the meaning of the identifier is very specific. 
For instance, "right" is Q14565199 (as in direction), while "right" is also Q2386606 (as a legal position allowing a person to require a certain thing from another person or other persons).
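
The difference between looking something up by word and by identifier can be sketched roughly as follows. This is only an illustration: the ENTITIES table and both function names are made up for this example and are not the Wikidata API; the two identifiers are the ones quoted above.

```python
# Hypothetical in-memory stand-in for an identifier-keyed store.
# Not the Wikidata API; it holds only the two entries mentioned above.
ENTITIES = {
    "Q14565199": {"label": "right", "description": "direction"},
    "Q2386606": {"label": "right", "description": "legal position"},
}


def find_by_label(label):
    """Return every identifier carrying this word; the result may be ambiguous."""
    return sorted(qid for qid, entry in ENTITIES.items() if entry["label"] == label)


def describe(qid):
    """Look up exactly one object by its identifier."""
    return ENTITIES[qid]["description"]


# The word alone matches more than one object ...
candidates = find_by_label("right")

# ... while an identifier, once chosen, points at a single one.
meaning = describe("Q2386606")
```

The human still has to pick among the candidates once, but after that the identifier carries the meaning and the word no longer matters.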


Back to reality: even Wikidata has a hard time handling humans. Take my fine example of confectionery, where WD actually mixes up confectioneries in Britain with those in the USA, and with pastry shops and Konditorei/cukrászda as well. It would take a multiple-month worldwide debate to actually sort this out there. I'm not ready for that yet. :-) But that's a practical problem of the theory, and in the case of WD it can be fixed, and I'll try to fix it someday.


So, let me go back to discussing Frederik's opinion: I believe it is a problem if the word "cafe" is a key, and some people get emotional when they run a pub or a pastry shop or a cukrászda and they're tagged as "cafe"; in the same way, someone may get offended when they're tagged as "pastry" when their name is actually "World Best Café" but they fall under the international "pastry" tag definition.

I do not think it works well when, instead of defining what KEYS mean, we let people guess at the specific meaning of an English word, and also let them guess at how strictly it means what it means, and at what the limits of similarity are between an idealised thing-called-the-key and the real-life object to be tagged.

I do not think forest paths are "with a very bad smoothness": I walk them easily, there is nothing bad about them; yet they shall be tagged as such based on the definition of smoothness=*, and rightly so, since there is no point in subjectively tagging smoothness for every possible mode of transport. So, in reality, "very_bad" does not mean the path is very bad; it means that the path has class-6 smoothness, which we have defined, and we use the word "very_bad" to label that. Getting angry at OSM because it calls a perfectly good path "very_bad" would feel pretty confusing to me.
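
A rough sketch of what "the value is a label for a definition" could look like in code. Everything here is hypothetical: the ValueDefinition class, the SMOOTHNESS table and the interpret function are invented for illustration, and the definition text merely restates the "class 6" wording from this message rather than quoting the OSM wiki.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ValueDefinition:
    label: str       # the English-looking word used as the tag value
    definition: str  # what was agreed and documented for that value


# Hypothetical registry with a single entry; the definition text is an
# assumption taken from the message above, not from the OSM wiki.
SMOOTHNESS = {
    "very_bad": ValueDefinition(
        label="very_bad",
        definition="smoothness class 6, as defined for smoothness=*",
    ),
}


def interpret(value: str) -> str:
    """Resolve a tag value through its definition, never through the word itself."""
    return SMOOTHNESS[value].definition


result = interpret("very_bad")
```

The everyday sense of "very bad" is never consulted; a consumer of the data only sees the definition the label points to.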

Peter


