[OSM-talk] Tagging schema

Lester Caine lester at lsces.co.uk
Mon Oct 5 11:43:28 BST 2009

David Earl wrote:
> I think it would be really helpful to bring together the tag definitions 
> into one place, *in the database and API itself*. I mean a complete 
> schema: the tags, their possible values, their descriptions (in multiple 
> languages), their equivalences both in other languages and synonyms, 
> their related tags (in essence properties of the main descriptive tag, 
> hence oneway=... with highway=...), deprecations and so on.
> And I think this gets changed as other objects in the database get 
> changed: freely but consciously. So if there is a new value for shop, it 
> is a conscious act to add that to the list of values for shop, and to 
> describe it, not just casually adding it as a tag value.


> Let me be quite clear again: this doesn't restrict anyone's freedom to 
> add new tags or values. Anyone can edit them just like the map data. It 
> does make it a little more work, but the value of doing so both to the 
> person making the change and the rest of the community is also increased:
Until such time as agreement is made to restrict some tags, there is 
nothing to stop free format text as at present, but having a list of 
agreed values - and presenting them in the correct local language - can 
only be a good thing?

> (a) the tag/value is publicised, not buried in the map data, so if it is 
> a good one, it is more likely to be adopted. For example, take 
> "landuse=orchard" discussed recently. I've tagged at least three areas 
> with landuse=orchard in the last 3 years. I just did it. Others may have 
> used land=orchard, whatever. However, it would only be obvious I'd done 
> this if the renderers knew about it or I'd made a song and dance about 
> it. With a central schema, it would automatically be possible for it to 
> appear on editor menus for example.
Using an agreed set of landuse tags and having on-line links to help 
relating to the values makes sense, and again they can be mapped 
internally to other languages.

> (b) if we choose to check data against this schema, spelling mistakes 
> would be eliminated (not in names and other naturally free form data, 
> obviously)
Behind the scenes local language tags can be added automatically.
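
A minimal sketch of the check described in (b), in Python, assuming the 
schema has already been loaded into a plain dict of agreed values per key. 
The SCHEMA contents and the check_tag helper are hypothetical and only 
there for illustration. Keys that are not listed (names and other free 
form data) are accepted unchecked.

```python
import difflib

# Hypothetical in-memory schema: key -> set of agreed values.
SCHEMA = {
    "landuse": {"orchard", "forest", "residential"},
    "barrier": {"gate", "bollard"},
}


def check_tag(key, value):
    """Return (ok, suggestions) for a key/value pair.

    Keys absent from the schema are treated as free form and accepted.
    """
    allowed = SCHEMA.get(key)
    if allowed is None or value in allowed:
        return True, []
    # Offer near matches so a typo can be corrected rather than just rejected.
    return False, difflib.get_close_matches(value, sorted(allowed), n=3)
```

An editor could call check_tag("landuse", "orchrad") before upload and 
show the suggestions, while still letting the user keep the new value.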

> (c) editor and consumer programs can all work off the same schema: 
> presets and menus of values are table driven and in sync, renderers know 
> the possible things they might want to render (not that they have to of 
> course) and can see automatically that highway=gate and barrier=gate are 
> the same thing (or indeed barriere=tor or barrière=porte).
My own preference internally would be for simple numeric tags - but then 
I work in 'real' relational databases where mapping appropriate text 
when displaying the user view is natural. XML is not really designed 
with language translation in mind :(
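
As a rough illustration of the numeric approach, here is a sketch in 
Python using the standard sqlite3 module. The table and column names 
(tag_value, tag_label) and the sample rows are assumptions made up for 
this example, not part of any existing OSM schema. The stored value is 
just an integer id, and the text shown to the user is looked up per 
language at display time.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with hypothetical tables and sample rows."""
    db = sqlite3.connect(":memory:")
    db.executescript("""
        CREATE TABLE tag_value (
            id INTEGER PRIMARY KEY,
            tag_key TEXT,
            canonical TEXT
        );
        CREATE TABLE tag_label (
            value_id INTEGER REFERENCES tag_value(id),
            lang TEXT,
            label TEXT,
            PRIMARY KEY (value_id, lang)
        );
        INSERT INTO tag_value VALUES (1, 'barrier', 'gate');
        INSERT INTO tag_label VALUES
            (1, 'en', 'gate'), (1, 'de', 'Tor'), (1, 'fr', 'porte');
    """)
    return db


def label_for(db, value_id, lang, fallback="en"):
    """Return the display text for a numeric value id.

    Prefers the requested language and falls back to `fallback`;
    returns None if neither label exists.
    """
    row = db.execute(
        "SELECT label FROM tag_label "
        "WHERE value_id = ? AND lang IN (?, ?) "
        "ORDER BY lang = ? DESC LIMIT 1",
        (value_id, lang, fallback, lang),
    ).fetchone()
    return row[0] if row else None
```

The map data would then only ever carry the id, and adding a language is 
a matter of adding rows to the label table.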

> (d) the meaning of newly introduced or changed tags goes along with 
> them, so that the intention is described to others. Editors can offer 
> help. Renderers can offer legends.
And rendering rules could be enhanced by being able to select the 
preferred elements that a particular map requires.

> Here's the kind of thing I had in mind:
> * Three new primitives, tagkey for describing the k part of tags, 
> tagvalue for the v part of tags and tagdescription separated off to 
> allow for multiple descriptions in multiple languages without having to 
> download all the data for languages you're not interested in. ("tagkey" 
> etc can be anything we want, don't get too hung up on the terminology, I 
> just use it for didactic purposes).
> In the following, the fields could be key/value pairs, i.e. tags 
> themselves, or separate named fields in the database depending on how 
> things need to be indexed. But allowing the schema to itself have tags 
> means it is extensible. Perhaps it can even be self-describing.
> tagkey
>    name = [tagkey]
>    type = text | scalar | real | integer | boolean | value
>           where...
>           text: any arbitrary string
>           scalar: a number possibly qualified by some units
>           real: a floating point number
>           integer: an integer
>           boolean: values such as 'yes', 'true', '1', 'no', 'false', '0'
>           value: a value chosen from among a specific set of strings
>                  documented by the tagvalue object
>    units = [semicolon separated list of possible units]
>    defaultunits = [one from the units list]
>    appliesto = [semicolon separated list of tagkey or tagkey=tagvalue]
>           indicates this tag is usually used as a property qualifying the
>           given tags
>    relevantto = area | node | way | relation
> tagvalue
>    name = [tagvalue]
>    appliesto = [tagkey]
>    relevantto = area | node | way | relation
>    photo[:N] = [url] <!-- allows for more than one photo, photo:1 etc -->
>    synonym = [tagkey or tagkey=tagvalue]
>    seealso = [tagkey or tagkey=tagvalue]
> tagdescription
>    lang = [languagecode]
>    appliesto = [tagkey or tagkey=tagvalue]
>    plus a description in that language (not a tag value)
> For example
>    <tagkey name='barrier' type='value' />
>    <tagvalue name='gate' appliesto='barrier' relevantto='node' />
>    <tagvalue name='bollard' appliesto='barrier' relevantto='node' />
>    <tagvalue name='bollards' appliesto='barrier' relevantto='node'
>     synonym='bollard'/>
>    <tagdescription lang='en' appliesto='barrier=bollard'>one or a series 
> of short posts for excluding or diverting motor vehicles from a road, 
> lawn, or the like</tagdescription>
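
To show how a consumer might use the flat form quoted above, here is a 
sketch in Python with the standard xml.etree.ElementTree module. The 
<schema> wrapper element and the load_schema and canonical helpers are 
assumptions added for the example, since the proposal does not say how 
the objects would be delivered. A bare synonym value such as 'bollard' is 
assumed to refer to the same key.

```python
import xml.etree.ElementTree as ET

# The quoted example, wrapped in a hypothetical <schema> root element
# so that it forms a single well-formed document.
SCHEMA_XML = """
<schema>
  <tagkey name='barrier' type='value' />
  <tagvalue name='gate' appliesto='barrier' relevantto='node' />
  <tagvalue name='bollard' appliesto='barrier' relevantto='node' />
  <tagvalue name='bollards' appliesto='barrier' relevantto='node'
   synonym='bollard'/>
  <tagdescription lang='en' appliesto='barrier=bollard'>one or a series
of short posts for excluding or diverting motor vehicles from a road,
lawn, or the like</tagdescription>
</schema>
"""


def load_schema(xml_text):
    """Return (values, synonyms, descriptions) built from the flat schema."""
    root = ET.fromstring(xml_text)
    values, synonyms, descriptions = set(), {}, {}
    for v in root.iter("tagvalue"):
        tag = (v.get("appliesto"), v.get("name"))
        values.add(tag)
        syn = v.get("synonym")
        if syn:
            # "key=value" names both parts; a bare value reuses this key.
            key, _, val = syn.rpartition("=")
            synonyms[tag] = (key or tag[0], val)
    for d in root.iter("tagdescription"):
        key, _, val = d.get("appliesto").partition("=")
        # Collapse the line breaks inside the description text.
        descriptions[(key, val, d.get("lang"))] = " ".join(d.text.split())
    return values, synonyms, descriptions


def canonical(tag, synonyms):
    """Follow synonym links until a tag with no synonym is reached."""
    seen = set()
    while tag in synonyms and tag not in seen:
        seen.add(tag)  # guards against accidental synonym loops
        tag = synonyms[tag]
    return tag
```

A renderer could run every (key, value) pair through canonical() and so 
treat barrier=bollards exactly like barrier=bollard.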

Geerr ... This is why I hate XML ... Everybody's version is right.
Rather than all these separate elements, tag values should form part of 
the tagkey object, and descriptions can be added at any level. I need to 
find the link to a good example, but

<tag name='barrier' type='value' relevantto='node'>
     <tagvalue name='gate' />
     <tagvalue name='bollard'/>
     <tagvalue name='bollards'/>
     <description lang='en'>one or a series
     of short posts for excluding or diverting motor vehicles from a
     road, lawn, or the like</description>
</tag>
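
For comparison, a sketch of reading the nested form, again in Python with 
xml.etree.ElementTree. The XML is embedded as a well-formed string with 
both closing tags present, and read_tag is a hypothetical helper name. 
Because the values and description sit inside the key element, one pass 
over a single element gives everything known about the key.

```python
import xml.etree.ElementTree as ET

# The nested example as a well-formed string.
NESTED_XML = """
<tag name='barrier' type='value' relevantto='node'>
     <tagvalue name='gate' />
     <tagvalue name='bollard'/>
     <tagvalue name='bollards'/>
     <description lang='en'>one or a series
     of short posts for excluding or diverting motor vehicles from a
     road, lawn, or the like</description>
</tag>
"""


def read_tag(xml_text):
    """Parse one nested <tag> element into a plain dict."""
    root = ET.fromstring(xml_text)
    return {
        "key": root.get("name"),
        "type": root.get("type"),
        "relevantto": root.get("relevantto"),
        # Child order is preserved, so values come back as listed.
        "values": [v.get("name") for v in root.findall("tagvalue")],
        # One description per language, with whitespace collapsed.
        "descriptions": {
            d.get("lang"): " ".join(d.text.split())
            for d in root.findall("description")
        },
    }
```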

But I suspect this is just a misunderstanding, as a schema needs to be 
defined in .xsd. 
http://www.cabinetoffice.gov.uk/media/260545/BuildingStructure.xml is a 
good example of a definition of a building with enumerated tag values. I 
was trying to find the one that goes with 'landuse' but I don't have 
time ... need to be on the road by 1 ...

Lester Caine - G8HFL
Contact - http://lsces.co.uk/wiki/?page=contact
L.S.Caine Electronic Services - http://lsces.co.uk
EnquirySolve - http://enquirysolve.com/
Model Engineers Digital Workshop - http://medw.co.uk//
Firebird - http://www.firebirdsql.org/index.php
