[Tagging] Can OSM become a geospacial database?

Michael Patrick geodesy99 at gmail.com
Fri Dec 7 04:30:17 UTC 2018

> you are right that there are dictionaries about this stuff, but you will
> have to have a basic idea in order to make use of them, particularly if
> English is not your native language, you might look up terms that you are
> familiar with, and might not be aware that you are missing another relevant
> term, or that the term is not a translation with more or less the same
> meaning but only loosely connected.

Trade in goods and services is international - most countries publish their
classifications in their official language(s) and script. I looked at
Mozambique at random; its classification is 520 pages of tables similar to
the ISCO's. The ones that don't publish their own probably default to the
ISCO standard, or, like North Korea, don't play well with others.

(Sample extract ... some of the fonts for the Asian languages didn't copy)
Ireland - CSO Standard Employment Status Classification
Israel - הסיווג האחיד של משלחי יד (SCO)
Italy -  Classificazione delle professioni (CP 2011)
Jamaica - Jamaica Standard Occupational Classification (JSOC 1991)
Japan - (JSOC)
Korea, Republic of -
Latvia - Profesiju klasifikators (PK)
Lithuania -  Lietuvos profesijų klasifikatorius (LPK)
Malaysia -  Malaysia Standard Classification of Occupations (MASCO 2008)
Maldives - International Standard Classification of Occupations (ISCO)
Mauritius - National Standard Classification of Occupations (NASCO-08)
Micronesia, Federated States of - International Standard Classification of
Occupations (ISCO)
Mozambique - Classificação das Profissões de Moçambique revisão 2 (CPM Rev2)
Nauru - International Standard Classification of Occupations (ISCO)
Netherlands - Standaard Beroepenclassificatie 1992 (SBC 1992)
New Zealand - Australian and New Zealand Standard Classification of
Occupations (ANZSCO)
Norway - Standard for yrkesklassifisering (STYRK-08)
Palestine - (PSCO)
Panama - Clasificación Nacional de Ocupaciones (CNO 2010)
Paraguay - Clasificación Paraguaya de Ocupaciones (CPO)
Philippines - 2011 Philippine Standard Occupational Classification (PSOC)
Poland - Klasyfikacja zawodów i specjalności na potrzeby rynku pracy (KZiS)
Portugal - Classificação Portuguesa das Profissões (CPP/2010)
Qatar - Occupations
Romania - Clasificarea Ocupatiilor din Romania (COR)
Russian Federation - Общероссийский классификатор занятий (ОКЗ),
Общероссийский классификатор профессий рабочих, должностей служащих и
тарифных разрядов (ОКПДТР)
Sao Tome and Principe - Classificação das Profissões (CNP)
Serbia - Jedinstvena nomenklatura zanimanja / Klasifikacija zanimanja (JNZ /
Singapore - Singapore Standard Occupational Classification (SSOC 2010)
Slovakia -  Štatistická klasifikácia zamestnaní (ISCO-08)
Spain - Clasificación Nacional de Ocupaciones (CNO-11)
Suriname - International Standard Classification of Occupations (ISCO-08)
Sweden - Standard för svensk yrkesklassificering (SSYK)

> Language is reflecting reality, and if the way construction work is
> organized is different in different countries, also the terms describing
> the workers will not be matchable.

Yes, there are similarities and differences. The Japanese top-level
categories make a clear distinction between wooden structures and other
structures, the one I guessed at in my previous post; it is plainly visible
there. The sub-items are what provide the ability to match. That is why
there are 'crosswalks': documents or tables that describe how to translate,
compare, or move between standards, or how to convert skills or content
from one discipline to another. If you don't want to use the local
classification, there is an international one
<http://www.ilo.org/public/english/bureau/stat/isco/>, and eventually
someone can reconcile the local one with it.
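
As a rough illustration of what a crosswalk amounts to, here is a minimal
sketch in Python. It is not taken from any ILO or O*NET tooling. The only
code that appears in this thread is the O*NET one (47-2181.00); the scheme
names, the ISCO-08 codes and the 'LOCAL-JP' entry are placeholders I made up
to show the shape of the table.

```python
# Hypothetical crosswalk table: (source scheme, source code) maps to a list of
# (target scheme, target code) pairs. One source entry may map to several
# targets when the schemes draw their boundaries differently.
# All codes except the O*NET one are illustrative placeholders.
CROSSWALK = {
    ("ONET", "47-2181.00"): [("ISCO-08", "7121")],
    ("LOCAL-JP", "wood-roof"): [("ISCO-08", "7121"), ("ISCO-08", "7115")],
}


def translate(scheme, code, target):
    """Return every code in `target` that (scheme, code) is mapped to."""
    matches = CROSSWALK.get((scheme, code), [])
    return [c for (s, c) in matches if s == target]


print(translate("ONET", "47-2181.00", "ISCO-08"))
print(translate("LOCAL-JP", "wood-roof", "ISCO-08"))
```

An unknown entry simply returns an empty list, which is the signal that
someone still has to reconcile that local term.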

> Additionally we are not going to add every single term that describes a
> profession as a tag, because there are synonyms and overlap. We usually try
> to create (not too) coarse classes/groups and use subtags to distinguish
> minor differences.

Classification schemes are always a Goldilocks Problem
<https://en.oxforddictionaries.com/definition/goldilocks> - if it is too
simple, it will grow awkwardly as it encounters edge cases and exceptions;
if it is too complex, it will be ignored or, even worse, applied incorrectly.
Usually you will see only three levels, and maybe a sparse fourth. The 'not
too' qualifier mentioned is the issue. The trick is to define the
hierarchy, and then assign the term to the proper level in the hierarchy. I
would submit that domain experts coming to agreement worldwide over decades
have had a better opportunity to refine these aspects than any small set of
individuals in separate language communities (for all I know, the Germans
may have already solved this).

> FWIW, with regard to dictionaries, in the case of the misleading roofer
> description, it was copied exactly from the English wikipedia article on
> roofers, which is in itself not consistent there (mixes carpentry and
> roofing in the article).

My text was *not* copied from Wikipedia - I only work from original
authoritative source documents, in this case the U.S. Department of Labor.
The example I gave was an *extract* of an *intermediate* level; at the
'leaf' level (click on the Details tab
<https://www.onetonline.org/link/details/47-2181.00>) it goes down to the
specific tools used, e.g. "*Hoists* - Hydraulic swing beam hoists; Power
hoists; Shingle ladder hoists; Trolley track hoist". Of course Wikipedia
might be inconsistent; I occasionally use a Wikipedia reference if it
accurately gives a more general description than a more precise technical

That is a fairly extreme hierarchy, but needed for its intended use -
supporting crosswalks.
Employment-->Industry (Construction)-->Occupation (Roofers)-->Details
(Tools)-->Hoists (Power, Swing, Trolley, etc.) ... most of the utility for
ordinary people is in the top three levels. If I were asked to send a wood
building 'roofer' to Japan, I would send a Log Peeler / Tile Setter instead,
based on the tools and materials.
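
To make the chain above concrete, it can be written as a simple path, one
value per level. This is only a sketch: the level names 'Root' and 'Item'
are labels I am assuming for the first and last positions, which the chain
itself does not name.

```python
# Level names for each position in the path. "Root" and "Item" are assumed
# labels; the middle three come from the chain in this message.
LEVELS = ["Root", "Industry", "Occupation", "Details", "Item"]

# The roofer example from this message, one value per level.
ROOFER = ["Employment", "Construction", "Roofers", "Tools", "Hoists"]


def label(path):
    """Pair each value in the path with the name of its level."""
    return list(zip(LEVELS, path))


def top(path, depth=3):
    """Keep only the first `depth` levels, the part most users need."""
    return path[:depth]


print(label(ROOFER))
print(top(ROOFER))
```

The deeper levels stay available for crosswalk work, while everyday use
can stop at `top(...)`.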

> Additionally we are not going to add every single term that describes a
> profession as a tag, because there are synonyms and overlap.

There is no need for that. In my work with cross-national data, I reference
the source hierarchy (and the crosswalk, if I am integrating with someone
else's work), fill in the two top levels, then add mine at the appropriate
level. On the rare occasion when there is a significant difference, I will
invoke the fourth level - the next person using my classification is then
automatically alerted to why I used the wildly unexpected 'bark peeler'
instead of 'roofer'. I don't list *all* the tools and materials, just the
two *categories* that made the distinction. If it is critical, I might add
the specific tool name (5th level).
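
A record following that habit might look like the sketch below. The class
and field names are hypothetical, chosen only for illustration, and the
sample values are my reading of the 'bark peeler' example, not data from any
published classification.

```python
from dataclasses import dataclass, field


@dataclass
class ClassifiedEntry:
    """One classified term plus a pointer to the hierarchy it hangs from."""
    source: str                 # which published hierarchy was referenced
    top_levels: tuple           # the two top levels copied from that source
    term: str                   # the term added at the appropriate level
    # Optional fourth level: only the categories that made the difference.
    distinguishing: list = field(default_factory=list)

    def is_flagged(self):
        """True when the entry carries a fourth-level explanation."""
        return bool(self.distinguishing)


entry = ClassifiedEntry(
    source="O*NET 47-2181.00",
    top_levels=("Construction", "Roofers"),
    term="bark peeler",
    distinguishing=["tools", "materials"],
)
print(entry.is_flagged(), entry.distinguishing)
```

The point of the optional field is that most entries leave it empty, so a
non-empty one stands out to the next person.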

It is actually beneficial not to add everything. Over time, the entries
actually in the data can be scanned and dichotomous keys
<https://www.education.com/science-fair/article/dichotomous-key/> created,
where by asking a set of yes/no questions, one can get to an already used
entry. I've seen school children get to exact species identification doing
this, with no knowledge of the classification system itself. So things
actually become easier for new users in the beginning.
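
A dichotomous key is small enough to sketch directly. In the Python below,
each node is either a final entry (a string) or a question with a yes branch
and a no branch. The questions and entries are invented for illustration;
a real key would be built from the entries actually present in the data.

```python
# A node is either a string (an existing entry) or a tuple of
# (question, branch_if_yes, branch_if_no). Questions here are made up.
KEY = (
    "Does the work involve covering roofs?",
    (
        "Is the covering material wood or bark?",
        "bark peeler",
        "roofer",
    ),
    "carpenter",
)


def identify(node, answer):
    """Walk the key, calling answer(question) -> bool, until an entry is hit."""
    while not isinstance(node, str):
        question, if_yes, if_no = node
        node = if_yes if answer(question) else if_no
    return node


# Answers supplied by someone who knows the work but not the classification.
replies = {
    "Does the work involve covering roofs?": True,
    "Is the covering material wood or bark?": False,
}
print(identify(KEY, lambda q: replies[q]))
```

The user only ever answers yes or no; the hierarchy stays hidden behind
the questions.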

There seems to be a perception that standard classifications discourage
diversity, but the opposite is true. As new concepts appear, they can be
preserved rather than force-fit into the existing biases of previous
categories - because there is a way to do that without disturbing the bulk
of the existing base.

Michael Patrick