[OSM-dev] UTF-8 problem with API 0.5 planet.osm and osmosis 0.16

Brett Henderson brett at bretth.com
Sat Sep 22 02:18:38 BST 2007


Karl Newman wrote:
> I'm trying to use osmosis 0.16 to slice out a 1-degree square section 
> of the 0.5 planet.osm dump (specifically the 070905 file listed on the 
> Wiki page), but I'm getting a UTF-8 conversion error. Here's the 
> command line I'm using:
> osmosis --read-xml-0.5 file=planet-api05-070905.osm --bounding-box-0.5 left=-123 right=-122 top=46 bottom=45 --write-xml-0.5 file=dump.osm
> Here's the exception stack trace:
> Exception in thread "Thread-1-read-xml-0.5 " com.bretth.osmosis.core.OsmosisRuntimeException: Unable to read XML file.
>         at com.bretth.osmosis.core.xml.v0_5.XmlReader.run(XmlReader.java:107)
>         at java.lang.Thread.run(Unknown Source)
> Caused by: com.sun.org.apache.xerces.internal.impl.io.MalformedByteSequenceException: Invalid byte 2 of 3-byte UTF-8 sequence.
>         at com.sun.org.apache.xerces.internal.impl.io.UTF8Reader.invalidByte(Unknown Source)
>         at com.sun.org.apache.xerces.internal.impl.io.UTF8Reader.read(Unknown Source)
>         at com.sun.org.apache.xerces.internal.impl.XMLEntityScanner.load(Unknown Source)
> ... etc.
> I'm using osmosis on Windows and have written my own batch file 
> wrapper which mostly duplicates the shell script functions. I have 
> successfully used osmosis to read from the 0.5 api on 
> openstreetmap.gryph.de, so I think 
> osmosis is working correctly.
>
> Could it be a line-endings problem? Is there a known issue about 
> UTF-8? As you can see, unfortunately the exception gives no line 
> number in the source document, so it's impossible to nail it down. 
> However, the exception happens almost immediately, so it must be 
> occurring early in the file. I didn't see anything strange peeking at 
> it with head.
>
> Thanks for your time.
>
> Karl Newman
It will take me a few hours to get back to this.  I've checked in a new 
version of osmosis that reports line numbers when parse errors occur, 
but it's only in svn at the moment.
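
Until that version is released, one way to narrow the problem down is to 
scan the file directly for the first byte sequence that is not valid 
UTF-8. The following is a minimal sketch in Python, not part of osmosis. 
The function name find_first_invalid_utf8 and the chunk size are 
illustrative assumptions, and Python's strict UTF-8 decoder is only a 
stand-in for the Java parser, so the two may not reject exactly the same 
bytes.

```python
import codecs


def find_first_invalid_utf8(path, chunk_size=1 << 20):
    """Return (byte_offset, line_number) of the first invalid UTF-8
    sequence in the file, or None if the whole file decodes cleanly.

    Hypothetical helper for illustration; reads in chunks so that a
    planet-sized file does not have to fit in memory.
    """
    decoder = codecs.getincrementaldecoder("utf-8")()
    chunk_start = 0    # file offset of the first byte of the current chunk
    newlines_seen = 0  # newlines in all chunks processed so far
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            # Bytes of an unfinished multi-byte sequence carried over
            # from the previous chunk.
            pending = decoder.getstate()[0]
            try:
                # An empty chunk means end of file, so flush the decoder.
                decoder.decode(chunk, final=not chunk)
            except UnicodeDecodeError as err:
                # err.start indexes into pending + chunk.
                offset = chunk_start - len(pending) + err.start
                line = newlines_seen + err.object[:err.start].count(b"\n") + 1
                return offset, line
            if not chunk:
                return None
            chunk_start += len(chunk)
            newlines_seen += chunk.count(b"\n")
```

Calling find_first_invalid_utf8("planet-api05-070905.osm") would return 
the byte offset and 1-based line number of the first bad sequence, which 
can then be inspected with a hex viewer.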
