[Tagging] Identifying language regions

Imre Samu pella.samu at gmail.com
Tue Apr 24 16:56:43 UTC 2018


> The main problem multilingual map effort is trying to solve is how to
> calculate the language of the "name" tag.

As I understand it, we need some simple metadata about the current mapping
rules [ https://wiki.openstreetmap.org/wiki/Multilingual_names ].
We could then use it for:
- Multilingual maps
- OSM editors: checking/validating character sets and unusual characters
- "Localization of name suggestion":
https://github.com/osmlab/name-suggestion-index/issues/11
- Other QA tools (OSMCha?)

My biggest problem is the "on the ground" rule:
  "The "on the ground" rule remains the method of determining the
  appropriate value for the name tag."

https://wiki.osmfoundation.org/wiki/Working_Group_Minutes/DWG_2014-06-05_Special_Crimea

But reusing this metadata for QA rules is not always simple:
- "Béla Bartók square in Paris. The “ó” is not valid in French."
See more: https://wiesmann.codiferes.net/wordpress/?p=15187
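
To make the Béla Bartók case concrete, here is a minimal sketch of such a
character-set check. The function name and the alphabets are my own
assumptions for illustration, not an agreed list; a real table would need
review by local communities.

```python
import unicodedata

# Assumed, illustrative alphabets only (not an authoritative list).
BASE_LATIN = set("abcdefghijklmnopqrstuvwxyz")
EXTRA_LETTERS = {
    "fr": set("àâæçéèêëîïôœùûüÿ"),
    "hu": set("áéíóöőúüű"),
}

def unexpected_chars(name, lang):
    """Return the letters in `name` that are outside the assumed alphabet of `lang`."""
    allowed = BASE_LATIN | EXTRA_LETTERS.get(lang, set())
    # Normalize to NFC so that "o" + combining accent is compared as "ó".
    text = unicodedata.normalize("NFC", name).lower()
    return sorted({ch for ch in text if ch.isalpha() and ch not in allowed})

print(unexpected_chars("Béla Bartók", "fr"))
print(unexpected_chars("Béla Bartók", "hu"))
```

Because of the "on the ground" rule, a QA tool should probably treat a
non-empty result as a hint for a human to look at, not as an error.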


*My pragmatic solution*

In my mind, these are two separate problems:
- designing metadata that covers every case (see
https://wiki.openstreetmap.org/wiki/Multilingual_names , for example Hong
Kong)
- storing the metadata [ as an OSM tag; in the OSM Wiki; or on GitHub (
https://github.com/osmlab/.... ) ]


First, we can create simple metadata keyed by the Wikidata IDs of the
OSM admin areas,

for example a simple table mapping Wikidata ID (OSM admin area) to
primary/secondary language code:

name_en,       Wikidata, primaryOnTheGroundLang, secondaryOnTheGroundLang
Aruba,         Q21203,   nl,
Afghanistan,   Q889,     ps,
Angola,        Q916,     pt,
Anguilla,      Q25228,   en,
Albania,       Q222,     sq,
Åland Islands, Q5689,    sv,
..
Crimea,        Q7835,    ru,                     uk
Russia,        Q159,     ru,
Ukraine,       Q212,     uk,
...

- If areas overlap (e.g. "Crimea"), the smaller area has the higher
priority.
- We can join this metadata with OSM data via the Wikidata IDs, which then
gives us the polygons.
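
A rough sketch of how a consumer could apply the table together with the
"smaller area wins" rule. It assumes we already know which admin areas
contain a feature (e.g. from a point-in-polygon query, not shown here).
`AdminArea`, `LangRule`, `area_km2` and `name_language` are hypothetical
names, and the area values are approximate, for illustration only.

```python
from typing import List, NamedTuple, Optional

class LangRule(NamedTuple):
    primary: str
    secondary: Optional[str] = None

# A few rows from the table above, keyed by Wikidata ID.
LANG_BY_WIKIDATA = {
    "Q7835": LangRule("ru", "uk"),  # Crimea
    "Q159": LangRule("ru"),         # Russia
    "Q212": LangRule("uk"),         # Ukraine
    "Q5689": LangRule("sv"),        # Åland Islands
}

class AdminArea(NamedTuple):
    wikidata: str
    area_km2: float  # hypothetical field, used only to rank overlapping areas

def name_language(containing_areas: List[AdminArea]) -> Optional[LangRule]:
    """Return the rule of the smallest containing area that has a table entry."""
    known = [a for a in containing_areas if a.wikidata in LANG_BY_WIKIDATA]
    if not known:
        return None  # no metadata: the caller has to fall back to something else
    smallest = min(known, key=lambda a: a.area_km2)
    return LANG_BY_WIKIDATA[smallest.wikidata]

# A feature that lies inside both overlapping areas:
areas = [AdminArea("Q212", 603_500.0), AdminArea("Q7835", 27_000.0)]
print(name_language(areas))
```

Ranking by area is only one way to express the priority; an explicit
priority column in the table would also work.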






2018-04-24 15:58 GMT+02:00 Yuri Astrakhan <yuriastrakhan at gmail.com>:

> The main problem multilingual map effort is trying to solve is how to
> calculate the language of the "name" tag.  Without it, name tag becomes
> nearly useless.  For example:
>
> * An Italian user viewing a feature in China with two tags: "name" and
> "name:fr".   In this case, "name:fr" tag is preferred because "name" is
> likely to be in Chinese - not great for an Italian speaker.
> * Same tags, but the feature is in Italy -- now "name" tag is the better
> choice because the name is actually in the same language as the reader.
>
> Without knowing the language of the "name" tag, we cannot use it as part
> of the "script matching" - give preference to languages that use the same
> script as the reader, even if the language is different.
>
> On Tue, Apr 24, 2018 at 12:29 PM, Andy Townsend <ajt1047 at gmail.com> wrote:
>
>> On 24/04/2018 09:11, Rory McCann wrote:
>>
>>> Ireland has 2 official languages (Irish first & then English), but only
>>> ~2% of the population speak Irish daily. There are some legally defined
>>> regions of Ireland which are supposed to be "Irish speaking areas", but
>>> even there Irish is a minority language. So how should that be tagged?
>>> (Some day we'll get around to mapping the Gaeltachtaí)
>>>
>>
>> Ireland's pretty much a "best case" for this as it does have defined
>> language regions for Irish.  Most places don't.
>>
>>
>>> If you want to know the language in a multi-lingual area, why not look
>>> at the name, and name:XX tags. If the name value is the same as a name:Z
>>> then Z is the language.
>>>
>>
>> That won't always work.  You can probably guess the example I'm going to
>> pick next - https://www.openstreetmap.org/node/52241235 :)
>>
>> For those unaware, the story there is summarised at
>> https://en.wikipedia.org/wiki/An_Daingean#Name .  It's a while since
>> I've been there; not sure how much of a "cause celebre" it is currently.
>> I've certainly heard people on RTE refer to it as "Dingle / An Daingean"
>> (that's the English name and the commonly used Irish name but not the
>> official Irish name...).
>>
>> Best Regards,
>> Andy
>>
>>
>>
>>
>>
>>
>> _______________________________________________
>> Tagging mailing list
>> Tagging at openstreetmap.org
>> https://lists.openstreetmap.org/listinfo/tagging
>>
>