[Tagging] Various alt_name values?

moltonel 3x Combo moltonel at gmail.com
Thu Nov 27 13:44:39 UTC 2014


On 27/11/2014, Martin Koppenhoefer <dieterdreist at gmail.com> wrote:
> 2014-11-27 12:59 GMT+01:00 moltonel 3x Combo <moltonel at gmail.com>:
>> Either the
>> data user forgets to do the split, or he does it when it wasn't the
>> mapper's intent, or literal semicolons in the data get in the way.
>
> Yes, formally introducing the semicolon practice will force data consumers
> to look for it, or they will get problems in those cases (not that the
> current situation is much different).

Yes, the current semicolon practice is fine as-is. It's the idea of
formalising it and generalizing it which I find troubling.

> If we were to implement such a
> formalization we would surely cater for encoding of actual semicolons in the
> names/values, e.g. by defining a double semicolon as escape (this should work
> well, as we don't expect null values in our value lists, do we?).

For fun use cases you can always look at the lanes tags. Yes, they do
have empty values.
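
A minimal Python sketch of that clash, assuming a hypothetical rule
where ";;" encodes a literal semicolon. The sample value is made up
for illustration and is not real lanes data:

```python
# Hypothetical value. Under a "';;' means a literal semicolon" rule it
# is a single value; read as a plain list it is three values, the
# middle one empty. Both readings are legal, so the data is ambiguous.
value = "left;;right"

as_list = value.split(";")
as_escaped = [value.replace(";;", ";")]

print(as_list)     # ['left', '', 'right']
print(as_escaped)  # ['left;right']
```

Nothing in the string itself tells a reader which interpretation the
mapper intended.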

Sure, we can devise an encoding that works for all use cases. We can
even (maybe) manage to keep it elegant. And I'd be very happy to use
it in an environment where I have decent control over readers and
writers.

But OSM is not such an environment. You can document and provide
example code all you want, somebody will not read the docs. Somebody
will introduce a bug when converting the Python example to Erlang.
Somebody will see at a glance that it is a CSV format, and implement a
naive split(). Somebody will use Level0 instead of iD/JOSM and make a
typo.
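
To make the naive split() failure concrete, here is a rough Python
sketch. It assumes the hypothetical "';;' is a literal semicolon"
escape discussed above; split_escaped and the sample names are
invented for illustration, not an existing API or real data:

```python
def split_escaped(value):
    """Split on single ';', reading ';;' as a literal ';' (assumed rule)."""
    parts, current, i = [], "", 0
    while i < len(value):
        if value[i] == ";":
            if i + 1 < len(value) and value[i + 1] == ";":
                current += ";"      # escaped semicolon, keep it
                i += 2
                continue
            parts.append(current)   # real separator, close the value
            current = ""
        else:
            current += value[i]
        i += 1
    parts.append(current)
    return parts

value = "Rock;;Roll Cafe;The Cafe"   # made-up alt_name value
careful = split_escaped(value)
naive = value.split(";")

print(careful)  # ['Rock;Roll Cafe', 'The Cafe']
print(naive)    # ['Rock', '', 'Roll Cafe', 'The Cafe']
```

The naive reader does not fail: it silently returns four wrong names
instead of two right ones.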

The numbered keys scheme has the advantage that if any of this
happens, users get missing values instead of corrupted ones.
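
A small Python sketch of that failure mode. The "alt_name_N" key
pattern and the tag values are assumptions made for the example, not
a documented scheme:

```python
# Hypothetical tags on one object; key names and values are made up.
tags = {
    "name": "The Cafe",
    "alt_name": "Rock;Roll Cafe",   # literal semicolon, no escaping
    "alt_name_1": "Cafe R&R",
}

# A consumer unaware of numbered keys reads only the plain key.
unaware = [tags["alt_name"]]

# A consumer that supports the scheme also collects the numbered keys.
aware = [
    v for k, v in sorted(tags.items())
    if k == "alt_name" or k.startswith("alt_name_")
]

print(unaware)  # ['Rock;Roll Cafe']
print(aware)    # ['Rock;Roll Cafe', 'Cafe R&R']
```

The unaware consumer misses a name, but every name it does return is
intact.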

The data model change has the advantage that editors and consumers
that do not support it will fail immediately instead of silently
making a mess.


