[OSM-dev] Restrict key names in order to retain reusability of OSM

Frederik Ramm frederik at remote.org
Tue Feb 12 01:23:21 GMT 2008


Hi,

> > I have just finished a converter from the OSM XML format to GML and I
> > *BOLDLY* suggest constraining the allowed characters of tags (=
> > key names) to the
> > following XML-related set:
> > 'aAbBcCdDeEfFgGhHiIjJkKlLmMnNoOpPqQrRsStTuUvVwWxXyYzZ_' in order to retain
> > reusability.

There is a reason why our data format has 

<tag k="keyname" v="value" />

instead of

<keyname>value</keyname>

and that reason is to allow characters in key names that would not be
legal in an XML element name. Or at least it seems like that could have
been a reason. I hope nobody's going to come forward and tell me it was
just decided over a few beers ;-)
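
To illustrate the difference between the two formats, here is a minimal
Python sketch using the standard library's xml.etree.ElementTree. The
key name is a made-up example, and the helper parses() is hypothetical,
not part of any OSM tool. It only shows that a key with a space, a slash
and a colon is fine as an attribute value but not as an element name.

```python
import xml.etree.ElementTree as ET

def parses(xml_text):
    """Return True if xml_text is well-formed XML, False otherwise."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False

# Hypothetical key containing a space, a slash and a colon.
key = "my key/with:odd chars"

# As an attribute value the key is just character data.
as_attribute = '<tag k="%s" v="value" />' % key

# As an element name the same key breaks well-formedness.
as_element = "<%s>value</%s>" % (key, key)

print(parses(as_attribute))  # True
print(parses(as_element))    # False
```

Characters such as <, & and " would still need escaping inside the
attribute value, but they remain representable there.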

> > After having looked at more than 100 MB of data we found in key names
> > characters like spaces, slashes, colons and even weirder ones. I don't
> > think this will take too much of users' freedom of choice...

Colons are officially sanctioned as namespace delimiters; people
creating their very own personal tags often use them to prefix the
actual tag with their user name.
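
A tool that wants to honour this convention could split a key at the
first colon. The following is only a sketch; the function name
split_namespace() and the sample keys are invented for illustration.

```python
def split_namespace(key):
    """Split 'prefix:rest' at the first colon.

    Returns (prefix, rest); prefix is None when the key has no colon.
    """
    prefix, sep, rest = key.partition(":")
    if not sep:
        return None, key
    return prefix, rest

print(split_namespace("frederik:note"))  # ('frederik', 'note')
print(split_namespace("highway"))        # (None, 'highway')
```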

But I'd agree with Dutch,

> If an OSM tool is unable to utilize UTF8 characters, then the tool 
> should be rewritten. It is a big step backwards, if we instead choose to 
> limit the characters available for use.

BTW, does having UTF-8 keys mean that a key may contain a null byte, or
is UTF-8 designed in a way that avoids that?
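
For what it is worth, UTF-8 only produces the byte 0x00 when encoding
the character U+0000 (NUL) itself; every byte of a multi-byte sequence
has its high bit set. XML 1.0 does not allow U+0000 at all, so a key
read from a well-formed OSM XML file should never contain one. A quick
Python check, with has_null_byte() and the sample keys being made up
for this example:

```python
def has_null_byte(key):
    """Return True if the UTF-8 encoding of key contains a 0x00 byte."""
    return b"\x00" in key.encode("utf-8")

# ASCII, Latin, CJK and a character outside the Basic Multilingual Plane.
samples = ["highway", "name:de", "Straße", "名前", "\U0001F600"]

print(any(has_null_byte(k) for k in samples))  # False
print(has_null_byte("bad\x00key"))             # True
```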

Bye
Frederik

-- 
Frederik Ramm  ##  eMail frederik at remote.org  ##  N49°00.09' E008°23.33'




