[Tagging] Extended Conditions - response to votes

Peter Wendorff wendorff at uni-paderborn.de
Thu Jul 5 10:04:11 BST 2012


Hi Eckhart.
Yes, it's possible to do it that way, and yes, it's even possible to 
implement variable keys in the tools data consumers use,
but let's look at how they work now and what would change if this 
kind of variable key were introduced:

Most tools "consuming" osm data filters the data, using only a known 
subset of tags.
Mapnik, as the most prominent rendering engine, usually fetches a known 
subset of tags into dedicated table columns to get fast access to them, 
optionally throwing everything else into an hstore column, which acts 
as a kind of inner table.
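
As an illustration of that split (a sketch only; the column list is 
invented, this is not the actual osm2pgsql or Mapnik code):

  KNOWN_COLUMNS = {"highway", "name", "ref", "maxspeed"}  # fixed, known in advance

  def split_tags(tags):
      # the known subset goes into dedicated columns, the rest into the
      # catch-all (hstore-like) store
      columns = {k: v for k, v in tags.items() if k in KNOWN_COLUMNS}
      rest = {k: v for k, v in tags.items() if k not in KNOWN_COLUMNS}
      return columns, rest

A variable key like maxspeed:(weight>1.5t) would always end up in the 
catch-all store, out of reach of the fast column access.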

Let's consider a routing engine that has to load OSM data to build the 
routing graph. Here the data parsing itself probably isn't the most 
expensive part of the whole process, but nevertheless: currently a fixed 
string comparison is used; with variable keys a substring comparison 
would be necessary to fetch the necessary tags.

Any software that stores tags in a key-value store would have to scan 
all keys instead of doing a simple lookup.
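
Roughly, the difference looks like this (just a sketch in Python; the 
tag names and values are made up for illustration):

  tags = {"highway": "residential",
          "maxspeed": "50",
          "maxspeed:(weight>1.5t)": "30"}

  # today: one constant-key lookup per tag the consumer cares about
  maxspeed = tags.get("maxspeed")

  # with variable keys: every key of every object has to be inspected
  # to find the relevant ones
  conditional = {k: v for k, v in tags.items() if k.startswith("maxspeed:")}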

Some more directed responses after quotes:

On 04.07.2012 20:36, Eckhart Wörner wrote:
> 1. "constant keys is a fundamental rule in OSM"
> [...]
> Now one may argue that this rule has been established just by not using
> "variable keys" until now.
I think it's partly written into the tools: I don't know of any tool 
that actually parses or uses the subkey syntax semantically, even though 
it has become established as one common way to structure our flat 
tagging scheme. No editor allows folding of "namespaces" yet, and as far 
as I know this meta-tagging scheme is not implemented anywhere as a 
general rule, although it's a step in a similar direction (not the same: 
namespaces do NOT directly introduce variable keys, they group keys 
together in the first place).
>   This argument is flawed in two ways:
> First of all, there is plenty of precedent: the (old) TMC scheme uses variable
> keys extensively,
As you note yourself: that's the old TMC scheme, which has been 
criticized very often because it's not maintainable. The new TMC scheme 
uses a fixed set of keys, and IMHO increases maintainability by doing so.
> the source base key uses variable keys (source:<key> has
> <key> as variable),
Right, and I would like to argue that it's a special case (which is not 
a good argument, I fear)... I don't think I can give good arguments for 
that...
> to my knowledge OpenSeaMap uses them for lights, etc.
It uses a numeric variable for lights as far as I can see, but at least 
not arbitrary variables following a complex scheme.
> Also, the argument is broken: not having done something in the past is not an
> argument by itself for not doing something in the future.
That's indeed true.
> However, let's have a look at some alternatives:
> a. Use maxspeed:condition<X>=<condition>, maxspeed:valueX=<value> (or
> something similar), where X is a number: first of all, the key is also a
> variable, which violates the constant key "rule" the same way. Second, one
> could easily argue that from a semantic POV this is even worse, since only the
> combination of keys gives a key its meaning.
+1
> b. Merge all conditions and all possible values into one value, i.e.
> maxspeed=<complex expression> or access=<complex expression>
> First of all, "complex" expressions are actually complex, people inevitably
> will get them wrong.
Well, but is it really easier to create a concise set of tags, dividing 
the one very complex value into several semi-complex values with 
several semi-complex keys?
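
For comparison, the two shapes look roughly like this (the syntax on 
both sides is only illustrative, not a concrete proposal):

  # (b) everything merged into one complex value:
  merged = {"maxspeed": "50; 30 if weight>1.5t"}

  # the proposal: one semi-complex key per condition:
  split = {"maxspeed": "50",
           "maxspeed:(weight>1.5t)": "30"}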
> More importantly, one can quite easily exceed the 255
> byte limit for tag values.
+1
> 2. "[the new tagging scheme is] unusable for data consumers (or not without
> huge impact on performances)"
> The tagging scheme is pretty simple to implement; just match all tag keys
> against /^access(:.*)/, sort all matching tags according to their specificness,
> and skip less specific tags accordingly.
Your assumption here is that the data consumer matches all keys against 
a regular expression instead of comparing them against a set of 
constants. That may be much more expensive.
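
To make the difference concrete (a sketch, not the code from [1]; the 
key set in the second function is invented):

  import re

  ACCESS_RE = re.compile(r"^access(:.*)?$")  # the kind of pattern suggested above

  def relevant_keys_regex(tags):
      # every key of every object goes through the regex engine
      return [k for k in tags if ACCESS_RE.match(k)]

  def relevant_keys_constant(tags):
      # today: a membership test against a small fixed set of known keys
      known = {"access", "access:foot", "access:bicycle"}
      return [k for k in tags if k in known]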
> Here [1] is a simple implementation
> that evaluates maxspeed in less than 100 SLOC,
100 SLOC is a lot if you take into account that it's not fault 
tolerant: change the last key in your example to 
maxspeed:(weight>1.5t) and your interpreter fails to tolerate the t as a 
default unit; adding a unit to the maxspeed value (e.g. 100mph) fails, too.

As this has to be taken into account even in the KEY parser, the regular 
expression gets more complicated again. And it's no argument that 
everybody should use km/h as the unit for tagging, because we tag what's 
on the ground, and 60mph (96.56...km/h) is different from 100km/h - 
but tagging a maxspeed of 60mph is okay.
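
Just to show the kind of unit handling the parser would need on both 
the key and the value side (a rough sketch; the function names and 
conversion shortcuts are mine):

  def parse_speed(value):
      # return km/h; "60mph" must not be read as 60 km/h
      v = value.strip().lower().replace(" ", "")
      if v.endswith("mph"):
          return float(v[:-3]) * 1.609344
      if v.endswith("km/h"):
          v = v[:-4]
      return float(v)

  def parse_weight(value):
      # return tonnes; "1.5t" and "1.5" both mean 1.5 tonnes
      v = value.strip().lower()
      return float(v[:-1]) if v.endswith("t") else float(v)

  parse_speed("60mph")    # ~96.56, not 100
  parse_weight("1.5t")    # 1.5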

regards
Peter


