[Tagging] Nonbreakable spaces in name tags

Matej Lieskovský lieskovsky.matej at gmail.com
Fri Jan 26 15:47:29 UTC 2018


In Czech, a nonbreakable space should follow any single-letter
preposition or conjunction, as well as academic and military titles.
A nonbreakable space is also required in some common abbreviations,
between a number and its unit, and around some punctuation marks.

I noticed that some Overpass queries were not returning some elements
- that is how I found out that we actually have a rather large number
of nonbreakable spaces in the data.
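
To illustrate the kind of mismatch I mean, here is a minimal Python
sketch. It assumes the consumer does a plain exact string comparison,
which is my guess at what went wrong and not a statement about how
Overpass works internally. The helper names are made up for this
example.

```python
# U+00A0 NO-BREAK SPACE looks like a normal space but is a different character.
NBSP = "\u00a0"


def has_nbsp(value: str) -> bool:
    """Return True if the tag value contains a nonbreakable space."""
    return NBSP in value


def normalize_spaces(value: str) -> str:
    """Replace every nonbreakable space with an ordinary space (U+0020)."""
    return value.replace(NBSP, " ")


# Hypothetical tag value as stored, and the same name typed on a keyboard.
tagged = "V\u00a0jámě"
typed = "V jámě"

print(has_nbsp(tagged))                   # the stored value contains U+00A0
print(tagged == typed)                    # exact comparison fails
print(normalize_spaces(tagged) == typed)  # comparison succeeds after normalizing
```

A consumer that normalizes both sides before comparing would not be
affected; one that compares raw strings would silently miss the element.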

Nonbreakable spaces are currently quite troublesome: not all
consumers actually use Unicode collation, the character is invisible
in JOSM, and it is not exactly easy to input. Also, the chance that we
convince all contributors to use it correctly is exactly zero. Since
this could also be seen as "tagging for the renderer", there are many
calls for a mass removal.

On the other hand, there is software that actually handles Unicode
collation well, and having the nonbreakable spaces in the data makes
the correct rendering of names an order of magnitude easier. Leaving
this up to the renderer sounds logical, but imagine forcing every
renderer to figure out what language any given name is in and then
running the appropriate subprogram to fill in the nonbreakable spaces.
This could require semantic analysis, because a nonbreakable space is
needed after the "V" in "V jámě" (preposition), before the "V" in
"Jiří V." (Roman-numeral ordinal), and after the "V." in "V. Špidla"
(abbreviated given name; and yes, there are cases where the
abbreviation should be used).
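
To show why a simple rule is not enough, here is a rough Python sketch
of what a naive renderer-side pass might look like. The list of
single-letter words (a, i, k, o, s, u, v, z) and the function name are
my assumptions for illustration only; this is not taken from any
existing renderer.

```python
import re

NBSP = "\u00a0"

# Assumption: Czech single-letter prepositions and conjunctions.
SINGLE_LETTER_WORDS = "aikosuvz"

# A standalone single letter, followed by an ordinary space and another word.
_PATTERN = re.compile(
    r"(?<!\w)([" + SINGLE_LETTER_WORDS + r"]) (?=\w)",
    re.IGNORECASE,
)


def naive_bind(name: str) -> str:
    """Replace the space after a single-letter word with a nonbreakable space."""
    return _PATTERN.sub(lambda m: m.group(1) + NBSP, name)


for name in ("V jámě", "Jiří V.", "V. Špidla"):
    # repr() makes the otherwise invisible U+00A0 show up as '\xa0'.
    print(repr(naive_bind(name)))
```

This rule only gets the preposition case right. It leaves "Jiří V." and
"V. Špidla" untouched, even though the first needs a nonbreakable space
before the "V" and the second needs one after the "V.". Telling those
cases apart is where the semantic analysis comes in.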

Nonbreakable spaces are strange: you cannot reliably tell whether they
are used on the ground (though in some cases you can), official
documents often ignore them (leaving them up to the automated systems
in office software, so they do occur sometimes), and the rules
governing them are older than computers, so asking whether they are a
rule or a character is... dubious.

And yes, we do have really long names of things. Names of POIs named
after people are a common use case.

Matej

On 26 January 2018 at 16:11, marc marc <marc_marc_irc at hotmail.com> wrote:
> On 26. 01. 18 at 15:48, Matej Lieskovský wrote:
>> Several Slavic languages have rather formal rules about line breaks.
>
> It depends on whether it is a grammar rule or a "char".
> In French, there is a rule for how to split a word at the end of a line.
> Since it's a grammar rule, I don't see any point in adding a character
> between syllables to describe it. It's up to the renderer
> to know when it can do it, if people want this feature.
> I know nothing about your language, but I feel it looks like the same case.
> If my understanding is correct, I am in favour of not putting
> this "nonbreakable" information into a value and moving it to the app
> code that needs it (which one? do you have values so long that they
> need to be broken into several lines?)
>
> Regards,
> Marc


