[Tagging] Values in namespaces/prefixes/suffixes Considered Harmful - Or: Stop over-namespacing and prefix-fooling

Sun Jan 6 23:27:58 UTC 2019

Hi,

After an answer by Rtfm/ti-lo on the Namespace wiki talk page [1]
(thanks!), I have to add some thoughts, especially regarding multiple
values!

Initially I thought it was just about namespaces, or rather what I'm
calling "prefix-fooling" or pseudo-boolean-namespaces. "Pseudo" because
users start extending yes/no with other values, like yes/no/maybe.

My main concern with prefix-fooling or pseudo-boolean-namespaces is
not directed against namespaces in principle. But this "new-style"
tagging has a strong tendency to fragment OSM and to lose any sense of
reusability. I've listed several problems in my earlier mail (quoted
below). See e.g. the "Shop subtags proposal" [2] as a negative example,
because it's key fragmentation in the form
"<anyshop_or_service>:<WhateverValue>=yes/no". Looking at the example
in [2]:

Now I realize that there are three fundamental questions behind this
"new-style" tagging:

First, it's about how to describe objects which carry two 'top-level'
tags, like shop=bicycle and shop=motorcycle (see the rationale in [2]).
We have had this issue with businesses for a long time: how to combine
e.g. restaurant, bar, hotel, bakery etc. => IMO I'd handle these as
separate POIs as long as possible.

Second, it's about grouping subtags: namespaces are an attempt at
this, but that is more or less redundant with curated presets.

The third and, to me, most important issue is about handling multiple
values [5]! Multiple values are undoubtedly a data modeling
requirement. They have been handled so far, nolens volens, by the
semi-colon value separator [6], and now there is a trend towards
pseudo-boolean namespaces. Admittedly, processing semi-colon separated
fields is complex and only little software can process them. I suspect
the reason is that multiple values aren't handled by programming
languages out of the box (whereas databases like Postgres support them
not only as data types but also in queries).
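
To make the comparison concrete, here is a minimal Python sketch of
reading the same information from both tagging styles. The helper
names, the plain "service" key and the sample tags are assumptions made
up for illustration (only the "service:bicycle:*" keys and the value
range "retail,repair,second_hand" come from this thread), and the
sketch ignores any escaping of literal semi-colons.

```python
# Hypothetical helpers for illustration; not taken from any OSM library.

def split_values(raw: str) -> list[str]:
    """Split a semi-colon separated tag value, trimming spaces and dropping empties."""
    return [part.strip() for part in raw.split(";") if part.strip()]


def namespace_values(tags: dict[str, str], prefix: str) -> list[str]:
    """Collect the suffixes of all '<prefix>:<value>=yes' keys, sorted by name."""
    head = prefix + ":"
    return sorted(
        key[len(head):]
        for key, value in tags.items()
        if key.startswith(head) and value == "yes"
    )


# Assumed sample objects: the same facts in both styles.
semicolon_style = {"shop": "bicycle", "service": "retail;repair"}
namespace_style = {
    "shop": "bicycle",
    "service:bicycle:retail": "yes",
    "service:bicycle:repair": "yes",
    "service:bicycle:rental": "no",
}

ALLOWED = {"retail", "repair", "second_hand"}

services_a = split_values(semicolon_style["service"])
services_b = namespace_values(namespace_style, "service:bicycle")

# With values kept in the value, a range check is one set operation.
unknown = set(services_a) - ALLOWED

print(services_a, services_b, unknown)
```

In the semi-colon style the allowed range is checked against the value
itself; in the namespace style the code first has to scan all keys of
the object for a known prefix before it can check anything.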

Just recently the iD Editor maintainer added more multiCombo functions
(like [3]) and preset keys (like "service:vehicle" [4]). Both are OK
per se, but the latter preset was undocumented on the Wiki, and
obviously the iD Editor maintainer prefers namespaces over semicolons
for handling multiple values - and both issues seem to be completely
undiscussed!

=> So I urgently propose to discuss and sort out this multiple-values
issue, i.e. "semi-colon vs. pseudo-boolean-namespaces"!

:Stefan

P.S. I'm now tending to accept new-style boolean-namespaces - but only
under certain conditions. These conditions start with the usual ones
(see the proposal process) complemented perhaps by the following:
* Is it just about grouping? If yes, abstain from namespaces, look
at presets instead and document it in the wiki.
* Can the proposed key be re-used with/by other objects? If yes,
abstain from (over-specific) namespaces and try to choose a more
common or generic and simple key-value tag.
* Is the proposed key namespace in Mixed Case? If yes, this is a
strong smell indicating that it's a value. So a more common or
generic and simple key should be chosen.
* (See also the six "consequences of prefix-fooling" in my earlier
mail, quoted below).
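
The Mixed Case condition can be checked mechanically. A minimal sketch
in Python follows; the function names and the pattern for "normal" key
components (lowercase letters, digits, underscore) are assumptions for
illustration, not an agreed OSM rule. Language-suffixed keys such as
name:zh-Hant would also be flagged, so this is a hint for a human
reviewer, not a validator.

```python
import re

# Assumed convention: a key component is lowercase ASCII letters, digits, underscores.
_COMPONENT = re.compile(r"[a-z0-9_]+")


def suspicious_components(key: str) -> list[str]:
    """Return the ':'-separated parts of a key that look like values (e.g. Mixed Case)."""
    return [part for part in key.split(":") if not _COMPONENT.fullmatch(part)]


def looks_like_boolean_namespace(key: str, value: str) -> bool:
    """True if the tag has the shape '<prefix>:<something>=yes/no'."""
    return ":" in key and value in {"yes", "no"}


for key, value in [
    ("service:bicycle:repair", "yes"),
    ("shop:Whatever_Brand", "yes"),
    ("shop", "bicycle"),
]:
    print(key, suspicious_components(key), looks_like_boolean_namespace(key, value))
```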

[1] https://wiki.openstreetmap.org/wiki/Talk:Namespace#Over-namespacing_and_Prefix-fooling
[2] https://wiki.openstreetmap.org/wiki/Proposed_features/shop_subtags
[3] https://github.com/openstreetmap/iD/issues/5291
[4] https://wiki.openstreetmap.org/wiki/Key:service:vehicle
[5] https://wiki.openstreetmap.org/wiki/Multiple_values
[6] https://wiki.openstreetmap.org/wiki/Semi-colon_value_separator

On Sat, 5 Jan 2019 at 16:42, Markus <selfishseahorse at gmail.com> wrote:
>
> On Thu, 27 Dec 2018 at 02:05, Stefan Keller <sfkeller at gmail.com> wrote:
> >
> > It's really turning processing of key-values (or key-value pairs KVP,
> > entity-attribute-values EAV, dictionaries, associative arrays, map
> > collections, hash stores/hstores) ad absurdum. In addition to the
> > troubles of over-namespacing mentioned above there are the following
> > consequences of prefix-fooling - among others (sticking to the example
> > "service:bicycle:retail=yes;service:bicycle:repair=yes;"):
> >
> > * Existing code to validate and cleanup values is in vain: One can't
> > check with usual functions if a value is in range
> > "retail,repair,second_hand".
> > * Existing code to match is in vain too: Prefix-fooled keys pretend to
> > have mixed cases (which they shouldn't).
> > * Worse, users still extend "yes/no" values to arbitrary values (which
> > again makes processing unnecessarily complicated).
> > * Even worse, users are encouraged to invent new sparsely used keys
> > (which we can't prevent, but it's less harmful in the values).
> > * Source code is flooded by boolean expressions (which would else be a
> > single function) and need to be predefined in the code (instead of
> > being put in values).
> > * Values in namespaces/prefixes/suffixes are hard or impossible to
> > search, match, count or group in computer languages, including SQL.
>
> I'm a bit late but thank you, Stefan, for your explanation!
>
> Regards, Markus