[Tagging] Extended Conditions - response to votes

Thu Jul 5 14:40:16 BST 2012

Hi Peter,

On Thursday, 5 July 2012, 11:04:11, Peter Wendorff wrote:
> Let's consider a routing engine that has to load osm data to built the 
> routing graph. Here the data parsing itself probably isn't the most 
> expensive part of the whole process, but nevertheless: currently a fixed 
> string comparison is used; with variable keys a substring comparison 
> would be necessary to fetch the necessary tags.
> Every software that stores tags in key-value stores would have to parse 
> over all keys instead of a simple lookup.

The only software affected by this is routing preprocessors, and I highly doubt you could even spot the difference.
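To illustrate what that extra work amounts to, here is a minimal sketch of the key handling in a preprocessor. It is not taken from any real routing engine: the function name split_tags is made up for this mail, and the key syntax is assumed to be the one from the proposal, e.g. maxspeed:(weight>7.5). The base key is still a plain lookup; the conditional keys cost one pass over the handful of tags on a way.

```python
def split_tags(tags, base="maxspeed"):
    """Split one way's tags into the base value and its conditional variants.

    `tags` is a plain dict of key/value strings. The "base:(condition)" key
    syntax is an assumption taken from the examples in this thread.
    """
    prefix = base + ":("
    base_value = tags.get(base)  # fixed-key dictionary lookup, as today
    conditional = {}
    for key, value in tags.items():  # one pass over this way's tags
        if key.startswith(prefix) and key.endswith(")"):
            # Strip "base:(" and the trailing ")" to get the bare condition.
            conditional[key[len(prefix):-1]] = value
    return base_value, conditional
```

Since this runs once per way while the routing graph is built, nothing changes for the lookups done at routing time.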

> >   This argument is flawed in two ways:
> > First of all, there is plenty of precedent: the (old) TMC scheme uses variable
> > keys extensively,
> As you note yourself: that's the old TMC scheme, that has been in 
> critics very often because it's not maintainable. The new TMC scheme 
> uses a fixed set of keys and IMHO increases maintainability by doing this.

The old TMC scheme is not maintainable because it a) uses relations extensively, and b) makes no attempt at simplifying tagging. But yeah, that's a bit off-topic.

> > b. Merge all conditions and all possible values into one value, i.e.
> > maxspeed=<complex expression> or access=<complex expression>
> > First of all, "complex" expressions are actually complex, people inevitably
> > will get them wrong.
> well, but is it really more easy to create a concise set of tags 
> dividing the one very complex value to several semi-complex values using 
> several semi-complex keys?

First of all, the values are not semi-complex; they are dead simple, since they come from exactly the same range as the values of their base keys:
maxspeed=100
maxspeed:(weight>7.5)=70
Second, since the system is based on specialization, it maps naturally onto signs, implicit restrictions, etc. In the above example, the maxspeed key reflects a country-specific speed limit on rural roads, while maxspeed:(weight>7.5) reflects a sign.
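The two tags above could be evaluated roughly like this. This is only a sketch, not the code from [1]: the function name effective_maxspeed, the vehicle dict, and the restriction to conditions of the form "name operator number" are assumptions made for this example. Taking the lowest limit when several conditions match is also just a choice made here, not part of the proposal.

```python
import operator
import re

# Comparison operators accepted in a condition (an assumption of this sketch).
_OPS = {"<": operator.lt, ">": operator.gt,
        "<=": operator.le, ">=": operator.ge, "=": operator.eq}
# Matches conditions such as "weight>7.5": name, operator, plain number.
_COND = re.compile(r"^\s*(\w+)\s*(<=|>=|<|>|=)\s*(\d+(?:\.\d+)?)\s*$")
_PREFIX = "maxspeed:("


def effective_maxspeed(tags, vehicle):
    """Return the limit that applies to `vehicle`, e.g. {"weight": 12.0}."""
    matching = []
    for key, value in tags.items():
        if not (key.startswith(_PREFIX) and key.endswith(")")):
            continue
        m = _COND.match(key[len(_PREFIX):-1])
        if m is None:
            continue  # condition this sketch does not understand: skip it
        name, op, threshold = m.groups()
        if name in vehicle and _OPS[op](vehicle[name], float(threshold)):
            matching.append(float(value))
    # A matching specialization overrides the base key; otherwise fall back.
    return min(matching) if matching else float(tags["maxspeed"])
```

With the tags from the example, a 12 t lorry gets 70 and a 3.5 t van falls back to the base value of 100.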

> > Here [1] is a simple implementation
> > that evaluates maxspeed in less than 100 SLOC,
> 100 SLOC are a lot, if you take into account, that it's not fault 
> tolerant: changing the last key in your example to 
> maxspeed:(weight>1.5t) your interpreter fails to tolerate the t as a 
> default unit, adding a unit to the maxspeed-value (e.g. 100mph) fails, too.

Please bear in mind that this is just proof-of-concept code.
Also note that the code contains both the preprocessing step *and* the final evaluation, starting from raw key/value pairs. The actual evaluation happens in evaluateConditionStructure() and is 25 LOC.
Unit parsing and conversion should take place in the preprocessing, not in the final evaluation.
Also, parsing units, dates, and so on has to happen regardless of the tagging scheme (unless you want to limit yourself to access=yes/no).
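As a rough sketch of what such a preprocessing step might look like for speeds: the function name to_kmh and the small unit table are assumptions made for this example, and a value without a unit is assumed to be in km/h. The same normalization would be applied to every maxspeed value, whether it comes from the base key or from a conditional one.

```python
import re

# A number, optionally followed by a unit, with or without a space between.
_SPEED = re.compile(r"^\s*(\d+(?:\.\d+)?)\s*([a-z/]*)\s*$", re.IGNORECASE)
# Conversion factors to km/h; a missing unit is assumed to mean km/h.
_TO_KMH = {"": 1.0, "km/h": 1.0, "kmh": 1.0, "mph": 1.609344}


def to_kmh(raw):
    """Normalize a raw speed value such as "100" or "60 mph" to km/h."""
    m = _SPEED.match(raw)
    if m is None:
        raise ValueError("not a speed value: %r" % raw)
    number, unit = m.groups()
    factor = _TO_KMH.get(unit.lower())
    if factor is None:
        raise ValueError("unknown speed unit: %r" % unit)
    return float(number) * factor
```

After this step the evaluation only ever sees plain numbers, so it does not need to know about units at all.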

> As this has to be taken into account even in the KEY parser, the regular 
> expression get's more complicated again, and it's not an argument that 
> everybody should use km/h as a unit for tagging, because we tag what's 
> on the ground, and 60mp/h (96,56...km/h) are different from 100km/h - 
> but tagging a maxspeed of 60mph is okay.

You seem to read too much into my "regular expressions". Those are just implementation details.
(Also, just for the record: regular expressions are not slow. They are among the fastest tools we have; it's only building the state machine that is slow.)

Eckhart