[OSM-dev] REST API encoding
Thomas Walraet
thomas at walraet.com
Wed Jul 12 17:17:59 BST 2006
Dean Earley wrote:
>>
>> Here is the reply for single segment :
>>> <tag k="name" v="tést"/>
> <SNIP>
>> For single way :
>>> <tag k="name" v="tést"/>
> <SNIP>
>> E9 is the hex code for é in ISO-8859-1
>> C3A9 is the hex code for é UTF-8
>
> Technically, these are neither.
> It happens to be hex encoded representation of the aforementioned encoding
> formats. Saying "it is 8859-1" or "it is utf-8" means it will be that
> format raw (not hex encoded).
I think I understand that. It means the server sends only ASCII
characters, so clients don't have to specify an encoding when reading
the stream from the server. (Currently JOSM uses an InputStreamReader
without an explicit encoding; I proposed a patch to Imi, but he wisely
didn't apply it. The applet uses an Apache component configured to read
an ISO-8859-1 stream.)
The server just has to encode strings from its internal representation
into numeric character references (the &#xXX; things), and clients have
to decode them.
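The round trip described above can be sketched as follows. This is only an illustration in Python, not the server's or JOSM's actual code: the helper name to_char_refs is hypothetical, and the tag element simply mirrors the replies quoted earlier. It also shows why the client's choice of encoding stops mattering once the payload is pure ASCII.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape


def to_char_refs(text: str) -> str:
    """Escape XML specials, then replace each non-ASCII character
    with a hexadecimal numeric character reference (hypothetical helper)."""
    escaped = escape(text, {'"': "&quot;"})
    return "".join(
        ch if ord(ch) < 128 else f"&#x{ord(ch):X};" for ch in escaped
    )


# "Server" side: the attribute value becomes pure ASCII.
encoded = to_char_refs("tést")
xml_bytes = f'<tag k="name" v="{encoded}"/>'.encode("ascii")

# Pure ASCII bytes decode to the same text under all three encodings,
# so the reader's configured encoding no longer matters.
decodings = {xml_bytes.decode(enc) for enc in ("ascii", "iso-8859-1", "utf-8")}

# "Client" side: any XML parser resolves the references on its own.
decoded = ET.fromstring(xml_bytes).get("v")
```

Here encoded holds the ASCII-only form and decoded holds the original string again, without the client ever naming an encoding.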
If this is correct, I will have to remove what I added to the REST page
(the part about using a <?xml version='1.0' encoding='UTF-8'?> header
for server responses).
For client requests, JOSM and the applet currently use UTF-8 encoding,
and it seems to work (except for ways' tags, which the server serves
back wrongly).
Is this behavior considered OK, or would it be better to switch to
&#xXX; references for characters outside ASCII?
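For reference, here is a small Python sketch of how a UTF-8 request can turn into two separate references such as C3 and A9. It assumes, purely for illustration, that the receiving side reads the UTF-8 bytes as ISO-8859-1 before writing character references; whether the server really does this for way tags is not established here.

```python
original = "tést"

# What a UTF-8 client puts on the wire: é becomes the two bytes C3 A9.
utf8_bytes = original.encode("utf-8")

# Assumed failure mode: the receiver treats each byte as one
# ISO-8859-1 character, so é turns into two characters.
misread = utf8_bytes.decode("iso-8859-1")

# Writing that misread string back as character references yields
# two references instead of one.
refs = "".join(c if ord(c) < 128 else f"&#x{ord(c):X};" for c in misread)

# The damage is reversible as long as no bytes were dropped.
repaired = misread.encode("iso-8859-1").decode("utf-8")
```

If this assumption holds, a single é in the request would come back as two references, while a correct reply would contain exactly one.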