[OSM-dev] XML parsing, expat and UTF8

Andreas Brauchli linux at elementarea.net
Tue Dec 26 10:12:52 GMT 2006


> The expat documentation says it supports UTF8. ISTR that is the encoding which 
> OSM is using, so using expat will be safe to parse OSM data containing 
> non-English languages. Is this correct?
If it behaves as described in the documentation, there shouldn't be any
problems.

Try parsing this (the name should display as "Thun Süd", with a u-umlaut,
i.e. two dots over the u of "Sud"):

<?xml version="1.0" encoding="UTF-8"?>
<osm version="0.3" generator="OpenStreetMap server">
  <way id="728459" timestamp="2006-05-14 16:47:29">
    <seg id="2635641"/>
    <seg id="2635642"/>
    <seg id="2635643"/>
    <seg id="2635644"/>
    <seg id="2635645"/>
    <tag k="name" v="Thun Süd"/>
    <tag k="highway" v="motorway_link"/>
    <tag k="created_by" v="JOSM"/>
  </way>
</osm>
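
One way to run that check is sketched below. It assumes Python's
xml.parsers.expat binding, since the thread does not say which expat binding
is in use. The helper name collect_tags is made up for illustration, and the
sample is a shortened copy of the data above. The document is handed to expat
as UTF-8 bytes, and the name tag should come back as an 8-character string
rather than 9 raw bytes.

```python
import xml.parsers.expat

# Shortened copy of the sample above; \u00fc is the u-umlaut in "Süd".
# Encoding to bytes mimics reading the UTF-8 file from disk.
OSM_SAMPLE = """<?xml version="1.0" encoding="UTF-8"?>
<osm version="0.3" generator="OpenStreetMap server">
  <way id="728459" timestamp="2006-05-14 16:47:29">
    <seg id="2635641"/>
    <tag k="name" v="Thun S\u00fcd"/>
    <tag k="highway" v="motorway_link"/>
  </way>
</osm>
""".encode("utf-8")


def collect_tags(data: bytes) -> dict:
    """Parse OSM XML bytes with expat and return all <tag> k/v pairs."""
    tags = {}

    def start_element(name, attrs):
        # attrs is a dict of attribute name -> decoded str value
        if name == "tag":
            tags[attrs["k"]] = attrs["v"]

    parser = xml.parsers.expat.ParserCreate()
    parser.StartElementHandler = start_element
    parser.Parse(data, True)  # True marks this as the final chunk
    return tags


tags = collect_tags(OSM_SAMPLE)
print(tags["name"])
```

If the terminal itself is not set to UTF-8 the printed name may still look
wrong, so comparing the parsed string against the expected value is a more
reliable check than looking at the output.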

andreas




