[OSM-dev] issue with geofabrik europe update

Julien Fastré julien at fastre.info
Wed May 16 12:43:01 UTC 2018


Hi,

We ran into a strange issue with a Europe diff update from Geofabrik: the
diff file is not valid XML.

The affected file:

http://download.geofabrik.de/europe-updates/000/001/872.osc.gz

We get the following error when parsing this file with osmosis:


```
org.openstreetmap.osmosis.core.OsmosisRuntimeException: Unable to parse
xml file 872.osc.  publicId=(null), systemId=(null), lineNumber=972870,
columnNumber=158.
	at
org.openstreetmap.osmosis.xml.v0_6.XmlChangeReader.run(XmlChangeReader.java:114)
	at java.lang.Thread.run(Thread.java:748)
Caused by: org.xml.sax.SAXParseException: Element type "tag" must be
followed by either attribute specifications, ">" or "/>".
	at
org.apache.xerces.util.ErrorHandlerWrapper.createSAXParseException(Unknown
Source)
	at org.apache.xerces.util.ErrorHandlerWrapper.fatalError(Unknown Source)
	at org.apache.xerces.impl.XMLErrorReporter.reportError(Unknown Source)
	at org.apache.xerces.impl.XMLErrorReporter.reportError(Unknown Source)
	at org.apache.xerces.impl.XMLErrorReporter.reportError(Unknown Source)
	at org.apache.xerces.impl.XMLScanner.reportFatalError(Unknown Source)
	at
org.apache.xerces.impl.XMLDocumentFragmentScannerImpl.scanStartElement(Unknown
Source)
	at
org.apache.xerces.impl.XMLDocumentFragmentScannerImpl$FragmentContentDispatcher.dispatch(Unknown
Source)
	at
org.apache.xerces.impl.XMLDocumentFragmentScannerImpl.scanDocument(Unknown
Source)
	at org.apache.xerces.parsers.XML11Configuration.parse(Unknown Source)
	at org.apache.xerces.parsers.XML11Configuration.parse(Unknown Source)
	at org.apache.xerces.parsers.XMLParser.parse(Unknown Source)
	at org.apache.xerces.parsers.AbstractSAXParser.parse(Unknown Source)
	at org.apache.xerces.jaxp.SAXParserImpl$JAXPSAXParser.parse(Unknown Source)
	at org.apache.xerces.jaxp.SAXParserImpl.parse(Unknown Source)
	at javax.xml.parsers.SAXParser.parse(SAXParser.java:189)
	at
org.openstreetmap.osmosis.xml.v0_6.XmlChangeReader.run(XmlChangeReader.java:109)
	... 1 more
```

The error comes from line 972870, which is indeed not a valid XML tag:

> <tag k="source:boundary" v="http://ka.wikipedia.org/w/index.php?title=%E1%83%A4%E1%83%90%E1%83%98%E1%83%9A%E1%83%98:Tbilisi_Admin_Map.jpg&fileuid="0" user="" changeset="0" timestamp=20080104164655"/>

Removing this line (`sed '972870d' 872.osc > 872.fixed.osc`) lets
osmosis parse the file.
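
For anyone who wants to check a diff before handing it to osmosis, here is
a minimal Python sketch of the same idea. It uses the standard-library
expat parser rather than Xerces, so the reported column and message may
differ from the trace above. The function names are made up for
illustration, and dropping the line discards that tag rather than
repairing it.

```python
import gzip
import xml.parsers.expat


def first_xml_error(stream):
    """Return (line, column, message) for the first well-formedness error
    found in a binary stream, or None if the whole document parses."""
    parser = xml.parsers.expat.ParserCreate()
    try:
        parser.ParseFile(stream)
    except xml.parsers.expat.ExpatError as err:
        # err.lineno is 1-based; err.offset is the 0-based column.
        message = xml.parsers.expat.ErrorString(err.code)
        return err.lineno, err.offset, message
    return None


def first_xml_error_in_gz(path):
    """Same check for a gzip-compressed file such as 872.osc.gz."""
    with gzip.open(path, "rb") as stream:
        return first_xml_error(stream)


def drop_line(text, lineno):
    """Return text without its 1-based line `lineno`, like sed 'Nd'."""
    lines = text.splitlines(keepends=True)
    del lines[lineno - 1]
    return "".join(lines)
```

Calling `first_xml_error_in_gz("872.osc.gz")` on the affected file should
point at the broken line, and `drop_line` can then be applied to the
decompressed text before re-running the check.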

I wonder whether we were the only ones affected and, if not, how you
managed to process this diff without errors.

Thanks,
Julien


