[OSM-dev] Non-UTF-8 German Umlauts in planet.osm

Keith Sharp kms at passback.co.uk
Thu Mar 15 20:38:04 GMT 2007


On Thu, 2007-03-15 at 21:27 +0100, Jan-Benedict Glaw wrote:
> Hi!
> 
> Current planet.osm has a sharp-s in (probably) ISO-8859-1{,5}, which
> breaks the PostGIS import:
> 
> 
> jbglaw at nini:~/planet.osm$ bzcat planet-070314.osm.bz2 |./osm2pgsql/osm2pgsql -
> NOTICE:  CREATE TABLE / PRIMARY KEY will create implicit index "tmp_segments_pkey" for table "tmp_segments"
> NOTICE:  CREATE TABLE / PRIMARY KEY will create implicit index "tmp_nodes_pkey" for table "tmp_nodes"
> Processing: Node(8550k)
> Processing: Segment(8970k)
> Processing: Way(376k)-:73188476: parser error : Input is not proper UTF-8, indicate encoding !
> Bytes: 0xDF 0x65 0x22 0x20
>     <tag k="name" v="Volkersbrunner Stra?e" />
>                                         ^
> - : failed to parse
> 
> Any chance to report (and in case of tags: drop) non-UTF-8 stuff
> during planet.osm generation?

Use UTF8Sanitize from SVN to clean these up, and see this page on the
Wiki as well:

	http://wiki.openstreetmap.org/index.php/Utf8_errors
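
For illustration only, here is a minimal Python sketch of the general idea of reporting and repairing such bytes. It is not the UTF8Sanitize code from SVN and makes no claim about how that tool works. The function names repair_line and sanitize are hypothetical, and the sketch assumes that any line that fails to decode as UTF-8 is really ISO-8859-1, which matches the 0xDF sharp-s in the error above but may not hold for every bad tag.

```python
import sys


def repair_line(raw: bytes) -> bytes:
    """Return raw unchanged if it is valid UTF-8, else re-encode it."""
    try:
        raw.decode("utf-8")
        return raw
    except UnicodeDecodeError:
        # Assumption: the offending bytes are ISO-8859-1 (e.g. 0xDF = sharp-s).
        # Every byte value is valid ISO-8859-1, so this decode cannot fail.
        return raw.decode("iso-8859-1").encode("utf-8")


def sanitize(src, dst, log=sys.stderr):
    """Copy binary stream src to dst line by line, reporting repaired lines."""
    for lineno, raw in enumerate(src, start=1):
        fixed = repair_line(raw)
        if fixed != raw:
            print(f"line {lineno}: re-encoded non-UTF-8 bytes", file=log)
        dst.write(fixed)
```

To use it as a filter between bzcat and osm2pgsql, one would call sanitize(sys.stdin.buffer, sys.stdout.buffer). Note that a line mixing valid UTF-8 sequences with stray ISO-8859-1 bytes would have its valid sequences double-encoded by this approach, so the reported line numbers are worth checking by hand.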

Keith.




