[OSM-dev] Problems parsing planet.osm with Perl XML::Parser

Dave osm at randomjunk.co.uk
Wed Nov 1 16:45:56 GMT 2006


> You've diagnosed this somewhat backwards, I'm afraid.  The file
> doesn't have a character-set declaration.  According to the XML spec,
> that means it's in utf8, and the garbage on line 587103 simply isn't
> valid utf8.  Several other lines have things that appear to be latin-1
> instead of utf8.
It seems to declare itself UTF-8, actually. It has the encoding specified in
the <?xml ?> line.
But whatever it claims, it clearly isn't UTF-8. At least bits of it aren't.
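
For anyone who wants to check this for themselves, here is a rough sketch (in Python rather than Perl, and not anything taken from the OSM tools) that reads the encoding named in the <?xml ?> line and then lists every line whose bytes don't decode as UTF-8. The names declared_encoding, invalid_utf8_lines and report are made up for this example, and it assumes the declaration sits on the first line of the file.

```python
import re

# Matches the encoding pseudo-attribute of an XML declaration, e.g.
# <?xml version="1.0" encoding="UTF-8"?>
_DECL_RE = re.compile(rb'^<\?xml[^>]*?\bencoding\s*=\s*["\']([A-Za-z0-9._-]+)["\']')

_UTF8_BOM = b"\xef\xbb\xbf"


def declared_encoding(first_line):
    """Return the encoding named in the XML declaration, or None if absent."""
    if first_line.startswith(_UTF8_BOM):
        first_line = first_line[len(_UTF8_BOM):]
    match = _DECL_RE.match(first_line)
    return match.group(1).decode("ascii") if match else None


def invalid_utf8_lines(lines):
    """Yield (line_number, raw_bytes) for each line that is not valid UTF-8.

    `lines` is any iterable of bytes objects, such as a file opened in
    binary mode. Line numbers start at 1.
    """
    for number, raw in enumerate(lines, start=1):
        try:
            raw.decode("utf-8")
        except UnicodeDecodeError:
            yield number, raw


def report(path):
    """Print the declared encoding and the offending lines of one file."""
    with open(path, "rb") as handle:
        print("declared encoding:", declared_encoding(handle.readline()))
        handle.seek(0)
        for number, raw in invalid_utf8_lines(handle):
            # Show only the start of the line; planet.osm lines can be long.
            print(number, raw[:60])
```

Calling report("planet.osm") should show the declared encoding first and then the line numbers that need fixing. The file is read in binary mode on purpose, so that nothing gets decoded (and nothing blows up) before each line has been checked.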






