[OSM-dev] Osmosis data corruption on Debian Jessie/Testing

Jochen Topf jochen at remote.org
Fri Mar 6 12:50:39 UTC 2015


I think the bug is important and subtle enough that we should make sure it
doesn't resurface, either by detecting the runtime or by the check you
describe. At the very least we should put the check into a unit test, so
that people who run the tests on their platform after building can be
confident their setup is not affected.
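
To make the idea concrete, here is a rough sketch of what such a check
could look like. It is not Osmosis code: the class and method names are
made up, and I am assuming that a single non-BMP character in an attribute
value (here U+1D11E, a musical symbol) is enough to exercise the fault. The
reliable reproduction remains the node file mentioned below. It uses only
the standard JAXP SAX API, so it tests whichever parser the runtime picks.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical self-test, not part of Osmosis.
final class XmlSelfTest {
    // U+1D11E MUSICAL SYMBOL G CLEF lies outside the BMP, so Java stores
    // it as a surrogate pair. Assumed (not verified) to trigger the bug.
    static final String PROBE =
        "a" + new String(Character.toChars(0x1D11E)) + "b";

    // Parses a one-element document with whatever SAX parser the runtime
    // selects and returns the value of its "v" attribute.
    static String parseAttribute(String value) {
        String xml = "<?xml version=\"1.0\"?><tag k=\"name\" v=\""
            + value + "\"/>";
        final String[] seen = new String[1];
        try {
            SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
            parser.parse(new InputSource(new StringReader(xml)),
                new DefaultHandler() {
                    @Override
                    public void startElement(String uri, String localName,
                            String qName, Attributes attrs) {
                        seen[0] = attrs.getValue("v");
                    }
                });
        } catch (Exception e) {
            throw new IllegalStateException("XML self-test could not run", e);
        }
        return seen[0];
    }

    // True if the probe string survives parsing unchanged.
    static boolean parserPreservesProbe() {
        return PROBE.equals(parseAttribute(PROBE));
    }

    public static void main(String[] args) {
        // Report which parser implementation was actually selected.
        System.out.println("SAX factory: "
            + SAXParserFactory.newInstance().getClass().getName());
        if (!parserPreservesProbe()) {
            System.err.println("XML parser mangles non-BMP characters.");
            System.exit(1);
        }
        System.out.println("XML parser self-test passed.");
    }
}
```

The same parserPreservesProbe() call could back a unit test or run once
before any XML task starts. Comparing parsed output against known input
avoids having to recognise parser implementations by class name, which is
the part Brett expects to be brittle.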

Jochen

On Thu, Mar 05, 2015 at 05:25:12PM +1100, Brett Henderson wrote:
> Date: Thu, 5 Mar 2015 17:25:12 +1100
> From: Brett Henderson <brett at bretth.com>
> To: Jochen Topf <jochen at remote.org>
> Cc: OSM-Dev Openstreetmap <dev at openstreetmap.org>
> Subject: Re: [OSM-dev] Osmosis data corruption on Debian Jessie/Testing
> 
> I suspect that attempting to detect the underlying XML runtime would be
> brittle.  Another option might be to embed that bit of data in Osmosis
> itself and do a self test before attempting to execute any XML tasks.
> 
> I'm surprised that this is still an issue in standard Java.  I tried
> raising tickets against Sun Java before it moved under Oracle but never got
> a response.  I gave up, embedded Xerces in the main Osmosis distribution,
> and then forgot about it.
> 
> On 5 March 2015 at 10:16, Jochen Topf <jochen at remote.org> wrote:
> 
> > Hi!
> >
> > Just spent a few hours debugging this problem: The way Osmosis is packaged
> > on Debian Jessie seems to be wrong. It doesn't use the Xerces XML parser
> > but seems to fall back to Java's default XML parser, which mangles
> > Unicode characters.
> >
> > This can lead to data corruption (and has for me today) when using Osmosis
> > for planet updates etc.
> >
> > You can test whether this bug is on your system, too: Download the XML
> > for this node: http://www.openstreetmap.org/node/3382756758. Then run
> > it through osmosis:
> >
> >     osmosis --rx 3382756758.osm --wx out.osm
> >
> > Compare the two files: if your Osmosis is broken, you'll see the musical
> > notation character doubled in the output file. The fix is simple: Add
> > a line "load /usr/share/java/xercesImpl.jar" to /etc/osmosis/plexus.conf.
> > As I understand this, it tells Java to load Xerces replacing the built-in
> > XML parser.
> >
> > I have opened a bug with Debian.
> >
> > Arguably Osmosis should somehow detect when Xerces isn't found and return
> > an error instead of using a different implementation. But I don't know
> > enough about Java to say whether that's possible.
> >
> > Jochen
> > --
> > Jochen Topf  jochen at remote.org  http://www.jochentopf.com/
> > +49-173-7019282
> >
> > _______________________________________________
> > dev mailing list
> > dev at openstreetmap.org
> > https://lists.openstreetmap.org/listinfo/dev
> >

-- 
Jochen Topf  jochen at remote.org  http://www.jochentopf.com/  +49-173-7019282


