[OSM-dev] Non-UTF-8 German Umlauts in planet.osm
Keith Sharp
kms at passback.co.uk
Thu Mar 15 20:38:04 GMT 2007
On Thu, 2007-03-15 at 21:27 +0100, Jan-Benedict Glaw wrote:
> Hi!
>
> Current planet.osm has a sharp-s in (probably) ISO-8859-1{,5}, which
> breaks the PostGIS import:
>
>
> jbglaw at nini:~/planet.osm$ bzcat planet-070314.osm.bz2 |./osm2pgsql/osm2pgsql -
> NOTICE: CREATE TABLE / PRIMARY KEY will create implicit index "tmp_segments_pkey" for table "tmp_segments"
> NOTICE: CREATE TABLE / PRIMARY KEY will create implicit index "tmp_nodes_pkey" for table "tmp_nodes"
> Processing: Node(8550k)
> Processing: Segment(8970k)
> Processing: Way(376k)-:73188476: parser error : Input is not proper UTF-8, indicate encoding !
> Bytes: 0xDF 0x65 0x22 0x20
> <tag k="name" v="Volkersbrunner Stra?e" />
> ^
> - : failed to parse
>
> Any chance to report (and in case of tags: drop) non-UTF-8 stuff
> during planet.osm generation?
Use UTF8Sanitize from SVN to clean these up, and see this page on the
Wiki as well:
http://wiki.openstreetmap.org/index.php/Utf8_errors
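
For illustration only, here is a minimal Python sketch of the kind of
clean-up involved. It is not the UTF8Sanitize code, and the names
sanitize, sanitize_lines and latin1_fallback are made up for this
example. It assumes that any byte which is not valid UTF-8 was meant as
ISO-8859-1. That holds for the 0xDF (sharp-s) in the error above, but it
is only a guess for other bytes: 0xA4, for example, differs between
ISO-8859-1 and ISO-8859-15.

```python
import codecs


def _latin1_fallback(err):
    """Error handler: reinterpret bytes that are invalid UTF-8 as ISO-8859-1."""
    if not isinstance(err, UnicodeDecodeError):
        raise err
    bad = err.object[err.start:err.end]
    # Return the replacement text and the position at which decoding resumes.
    return bad.decode("iso-8859-1"), err.end


# "latin1_fallback" is a hypothetical handler name chosen for this sketch.
codecs.register_error("latin1_fallback", _latin1_fallback)


def sanitize(raw: bytes) -> bytes:
    """Return raw re-encoded as valid UTF-8; valid input passes through unchanged."""
    return raw.decode("utf-8", errors="latin1_fallback").encode("utf-8")


def sanitize_lines(lines):
    """Yield a sanitized copy of each line from an iterable of bytes lines."""
    for line in lines:
        yield sanitize(line)
```

To use it as a filter between bzcat and osm2pgsql, something like
sys.stdout.buffer.writelines(sanitize_lines(sys.stdin.buffer)) in a small
script should do. Working line by line keeps memory use flat on a file
the size of planet.osm.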
Keith.