[OSM-dev] What subset of UTF-8 should the API accept?
Ævar Arnfjörð Bjarmason
avarab at gmail.com
Thu Jul 16 19:07:17 BST 2009
The OSM protocol specifies that it accepts UTF-8 data, but in reality
it only accepts the subset of UTF-8 that the XML parser being used
doesn't barf on. See:
http://lists.openstreetmap.org/pipermail/dev/2009-July/016165.html
This issue surfaces e.g. here:
http://trac.openstreetmap.org/ticket/2072
So what subset should the API specify? If it's to accept full UTF-8,
all the tools that parse the XML will have to learn to deal with
control characters.
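
To make the question concrete, one candidate subset is the set of
characters that XML 1.0 itself allows (the "Char" production: tab, LF,
CR, U+0020-U+D7FF, U+E000-U+FFFD and U+10000-U+10FFFF). The sketch
below is only an illustration of checking a string against that set;
the function names are made up for this example, and it is not what
the API or any existing tool currently does.

```python
import re

# Matches any character outside the XML 1.0 "Char" production:
#   #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
_XML10_FORBIDDEN = re.compile(
    r"[^\u0009\u000A\u000D\u0020-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]"
)


def xml10_forbidden_chars(text):
    """Return (index, code point) pairs for characters XML 1.0 disallows.

    Hypothetical helper for this example only.
    """
    return [(m.start(), ord(m.group())) for m in _XML10_FORBIDDEN.finditer(text)]


def is_xml10_safe(text):
    """True if every character in text may appear in an XML 1.0 document."""
    return _XML10_FORBIDDEN.search(text) is None
```

A tag value such as "Main St\x0b" would fail this check because of the
vertical tab, while ordinary non-ASCII text such as "Reykjavík" passes.
Whether the API should reject such input or strip the offending
characters is a separate decision.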