[OSM-dev] Find way with tag containing a carriage return x0D

Andy Allan gravitystorm at gmail.com
Fri Feb 11 13:33:53 GMT 2011

On Thu, Feb 10, 2011 at 12:54 PM, Robert Whittaker (OSM)
<robert.whittaker+osm at gmail.com> wrote:

> I also wonder if there are any other objects in the OSM database with
> control characters in their tags? It would be great if someone could
> do a report of such objects with links to edit them...

It's worth pointing out that 0xD (along with 0x9 and 0xA) is a valid
character in XML[1], and the XML character range is what we define[2]
as the range of characters acceptable for OSM tags. Hopefully no
control characters from the prohibited ranges are actually being
returned from the API/diffs/dumps, although I suspect some are still
lingering in the UTF-8 fields in the database; it's just that nobody
will ever notice[3].
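
For anyone wanting to produce the kind of report Robert asks for, here
is a rough Python sketch of the two checks involved. The character
class is the complement of the "Char" production in the XML 1.0 spec;
the function names are made up for illustration and are not part of
any OSM codebase.

```python
import re

# Complement of the XML 1.0 "Char" production:
#   #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
_NOT_XML_CHAR = re.compile(
    r"[^\x09\x0A\x0D\x20-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]"
)


def invalid_xml_chars(value):
    """Return (index, code point) pairs for characters XML 1.0 forbids."""
    return [(m.start(), ord(m.group())) for m in _NOT_XML_CHAR.finditer(value)]


def control_chars(value):
    """Return (index, code point) pairs for every C0 control character
    and DEL, including 0x9, 0xA and 0xD, which XML allows."""
    return [(i, ord(c)) for i, c in enumerate(value)
            if ord(c) < 0x20 or ord(c) == 0x7F]
```

A tag value ending in a carriage return passes the first check and is
only caught by the second, which is why a report of "odd" tags needs to
be stricter than XML validity.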


[1] http://www.w3.org/TR/REC-xml/#charsets
[2] Not that we've ever written that down anywhere, afaik
[3] Until someone starts making JSON outputs, and we remember that all
our character-range checking is implicit in our XML libraries rather
than explicit in the models...
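
To illustrate footnote [3], a small sketch using only the Python
standard library (an assumption for illustration; the OSM API itself
is not written in Python). JSON will happily round-trip a control
character that an XML parser refuses, so an explicit check would be
needed once the XML layer is no longer in the path. The element and
function names are hypothetical.

```python
import json
import xml.etree.ElementTree as ET
from xml.sax.saxutils import quoteattr


def survives_json(value):
    """True if the value round-trips unchanged through JSON."""
    return json.loads(json.dumps({"v": value}))["v"] == value


def parses_as_xml(value):
    """True if an XML parser accepts the value as an attribute."""
    # quoteattr escapes &, <, quotes, tab, CR and LF, but does not
    # reject characters outside the XML character range.
    doc = "<tag k='name' v=%s/>" % quoteattr(value)
    try:
        ET.fromstring(doc)
        return True
    except ET.ParseError:
        return False
```

With a value such as "a\x01b", survives_json returns True while
parses_as_xml returns False; a value ending in 0xD is accepted by both.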

