[OSM-dev] UTF-8 problem with API 0.5 planet.osm and osmosis 0.16
brett at bretth.com
Sat Sep 22 02:18:38 BST 2007
Karl Newman wrote:
> I'm trying to use osmosis 0.16 to slice out a 1-degree square section
> of the 0.5 planet.osm dump (specifically the 070905 file listed on the
> Wiki page), but I'm getting a UTF-8 conversion error. Here's the
> command line I'm using:
> osmosis --read-xml-0.5 file=planet-api05-070905.osm --bounding-box-0.5
> left=-123 right=-122 top=46 bottom=45 --write-xml-0.5 file=dump.osm
> Here's the exception stack trace:
> Exception in thread "Thread-1-read-xml-0.5 "
> com.bretth.osmosis.core.OsmosisRuntimeException: Unable to read XML file.
> at java.lang.Thread.run(Unknown Source)
> Caused by:
> Invalid byte 2 of 3-byte UTF-8 sequence.
> com.sun.org.apache.xerces.internal.impl.io.UTF8Reader.read(Unknown Source)
> ... etc.
> I'm using osmosis on Windows and have written my own batch file
> wrapper which mostly duplicates the shell script functions. I have
> successfully used osmosis to read from the 0.5 api on
> openstreetmap.gryph.de <http://openstreetmap.gryph.de>, so I think
> osmosis is working correctly.
> Could it be a line-endings problem? Is there a known issue about
> UTF-8? As you can see, unfortunately the exception gives no line
> number in the source document, so it's impossible to nail it down.
> However, the exception happens almost immediately, so it must be
> occurring early in the file. I didn't see anything strange peeking at
> it with head.
> Thanks for your time.
> Karl Newman
It will take me a few hours to get back to this. I've checked in a new
version of osmosis that reports line number information when parse
errors occur, but so far it's only available in svn.
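
In the meantime, one rough way to find the offending bytes without
osmosis is to scan the file line by line and try to decode each line as
UTF-8. The sketch below is plain Python, not part of osmosis, and the
function name find_first_invalid_utf8 is made up for illustration. It
relies on the fact that the newline byte never occurs inside a
multi-byte UTF-8 sequence, so splitting on newlines in binary mode is
safe. It assumes the file has reasonably short lines, as planet dumps
normally do.

```python
def find_first_invalid_utf8(path):
    """Return (line_number, byte_column, bad_bytes) for the first line
    that is not valid UTF-8, or None if the whole file decodes cleanly.

    line_number is 1-based; byte_column is the 0-based byte offset of
    the invalid sequence within that line.
    """
    # Binary mode: no newline translation and no implicit decoding,
    # so the bytes are checked exactly as stored on disk.
    with open(path, "rb") as f:
        for line_number, raw in enumerate(f, start=1):
            try:
                raw.decode("utf-8")
            except UnicodeDecodeError as err:
                return line_number, err.start, raw[err.start:err.end]
    return None


if __name__ == "__main__":
    import sys

    result = find_first_invalid_utf8(sys.argv[1])
    if result is None:
        print("file is valid UTF-8")
    else:
        line_number, column, bad = result
        print(f"line {line_number}, byte column {column}: {bad!r}")
```

Run it with the planet file as the only argument. If it reports a line
near the start of the file, that would match the exception appearing
almost immediately; if it reports nothing, the file itself is probably
fine and the problem lies elsewhere.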