[OSM-dev] Problem with 2010/03/10 planet...
Jon Burgess
jburgess777 at googlemail.com
Thu Mar 11 23:33:03 GMT 2010
On Thu, 2010-03-11 at 17:05 -0600, Jeffrey Ollie wrote:
> Getting the following traceback trying to extract some data from the
> March 10th planet file. I'm using osmosis 0.34.
>
> org.openstreetmap.osmosis.core.OsmosisRuntimeException: Unable to parse xml file
> /dev/stdin. publicId=(null), systemId=(null), lineNumber=529642199,
> columnNumber=27.
> at org.openstreetmap.osmosis.core.xml.v0_6.XmlReader.run(XmlReader.java:113)
> at java.lang.Thread.run(Thread.java:636)
> Caused by: org.xml.sax.SAXParseException: Character reference ""
> is an invalid XML character.
It looks like this was caused by a change made by Frederick back in
r19176. The planet dump code used to turn all characters less than 32
into '?' instead of creating these character sequences. I guess he
didn't read the bit of the XML spec which says that all characters <32
are invalid except for tab / newline / carriage return[1]. It makes no
difference whether they appear as literal characters or as numeric
character references, they are still not allowed, e.g.
$ echo "<test>&#24;</test>" | xmllint - -noout
-:1: parser error : xmlParseCharRef: invalid xmlChar value 24
<test>&#24;</test>
I have committed a change which should resolve this for future dumps
in r20430 but someone needs to compile and update the copy on the
server.
Jon
1: http://www.w3.org/TR/REC-xml/#NT-Char