[OSM-dev] planet.osm - fix

David Sheldon dave-osm at earth.li
Tue Aug 15 13:38:39 BST 2006


On Tue, Aug 15, 2006 at 02:11:34PM +0200, Michael Strecke wrote:
> <?xml version="1.0"?>
> <osm version="0.3" generator="OpenStreetMap server">
>   <way id="2837877" timestamp="2006-08-09 23:53:34">
>     <seg id="10134927"/>
>     <tag k="name" v="Genter Stra&#xDF;e"/>
>   </way>
> </osm>
> 
> Not UTF-8, but latin-1 encoding. :(

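This is not latin-1. It is a numeric character reference to the Unicode
code point U+00DF (see http://www.unicode.org/charts/PDF/U0080.pdf), and
XML specifies that a parser must treat it exactly like the character
itself, i.e. like the UTF-8 sequence for it in a UTF-8 document. This is
the safest solution, and I think it is exactly what should be generated
for difficult characters such as this.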
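
To illustrate the equivalence, here is a minimal Python sketch using the
standard library's xml.etree.ElementTree. The variable names are my own
and the one-element documents are cut down from the quoted example; this
is not how the OSM server or any planet.osm tool does it.

```python
import xml.etree.ElementTree as ET

# The tag as it appears in the dump, with a numeric character reference.
with_reference = '<tag k="name" v="Genter Stra&#xDF;e"/>'

# The same tag with the character written as raw UTF-8 bytes (0xC3 0x9F).
with_raw_utf8 = b'<tag k="name" v="Genter Stra\xc3\x9fe"/>'

value = ET.fromstring(with_reference).get("v")
literal_value = ET.fromstring(with_raw_utf8).get("v")

# Both spellings give the parser the same string, containing U+00DF.
print(value == literal_value)       # True
print(hex(ord(value[-2])))          # 0xdf
print(value.encode("utf-8"))        # b'Genter Stra\xc3\x9fe'
```

The reference form has the advantage that the file stays pure ASCII, so
it survives even if something mislabels the encoding along the way.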

>   <way id="2837877" timestamp="2006-08-09 23:53:34">
>     <seg id="10134927"/>
>     <tag k="name" v="Genter Stra&#xC3;&#x178;e"/>
>   </way>
> 
> Well, it starts like UTF-8, but what is 0x178 ? A bug.

No, it doesn't. UTF-8 would just have the two raw bytes (0xC3 0x9F), with
no &# or anything. This is very broken and should be fixed.
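
As for where 0x178 could come from: one plausible explanation (my
assumption, nothing in the dump confirms it) is that the UTF-8 bytes for
the character were read as Windows-1252 and each resulting character was
then written out as a reference. In that code page byte 0x9F is U+0178.
A short Python sketch of that hypothesis, with made-up variable names:

```python
# Hypothetical reconstruction of the double-encoding; not taken from
# the server code.
original = "\u00df"                       # LATIN SMALL LETTER SHARP S

utf8_bytes = original.encode("utf-8")     # the two bytes 0xC3 0x9F
misread = utf8_bytes.decode("cp1252")     # each byte taken as one character

# Write every character of the misread string as a hex reference.
refs = "".join(f"&#x{ord(ch):X};" for ch in misread)
print(refs)                               # &#xC3;&#x178;

# If this is what happened, the damage can be reversed for this value.
repaired = misread.encode("cp1252").decode("utf-8")
print(repaired == original)               # True
```

The output matches the broken tag above, which is at least consistent
with a charset mix-up somewhere before the references are generated.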

David
-- 
"You can't hand-over-hand up monomolecular wire; your fingers fall off."
                      -- Beauvoir, 'Count Zero'



