[OSM-dev] Osmosis data corruption on Debian Jessie/Testing
Jochen Topf
jochen at remote.org
Mon Nov 2 06:14:22 UTC 2015
As I mentioned below:
> > > > You can test whether this bug is on your system, too: Download the XML
> > > > for this node: http://www.openstreetmap.org/node/3382756758. Then run
> > > > it through osmosis:
> > > >
> > > > osmosis --rx 3382756758.osm --wx out.osm
> > > >
> > > > Compare the two files, you'll see the musical notation character
> > > > doubling in the second case when your Osmosis is broken.
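
If a self-contained check is more convenient than downloading the node,
something along these lines might serve as a starting point. It is only a
sketch under assumptions: U+1D11E (MUSICAL SYMBOL G CLEF) stands in for the
affected character, the class and method names are made up for illustration,
and it assumes the corruption shows up when the default JAXP parser reads an
attribute value. A document this small may not trigger the bug on every
affected runtime, so a passing result proves little.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical self test, not part of Osmosis.
class XmlSelfTest {
    // U+1D11E lies outside the Basic Multilingual Plane, so Java stores it
    // as a surrogate pair (two chars). Assumption: it is representative of
    // the character that gets mangled in the node mentioned in this thread.
    static final String SAMPLE = new String(Character.toChars(0x1D11E));

    // Parses a tiny OSM-like document with whatever SAX parser JAXP selects
    // and returns the tag value exactly as the parser reported it.
    static String parsedValue() {
        String xml = "<?xml version='1.0' encoding='UTF-8'?>"
                + "<osm><node id='1' lat='0' lon='0'>"
                + "<tag k='name' v='" + SAMPLE + "'/></node></osm>";
        final StringBuilder seen = new StringBuilder();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes atts) {
                if ("tag".equals(qName)) {
                    seen.append(atts.getValue("v"));
                }
            }
        };
        try {
            SAXParserFactory.newInstance().newSAXParser().parse(
                    new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)),
                    handler);
        } catch (Exception e) {
            throw new IllegalStateException("XML self test could not run", e);
        }
        return seen.toString();
    }

    // True when the value survives parsing unchanged (no doubling, no loss).
    static boolean parserPreservesSample() {
        return SAMPLE.equals(parsedValue());
    }

    public static void main(String[] args) {
        System.out.println(parserPreservesSample()
                ? "XML parser kept the character intact"
                : "XML parser mangled the character");
    }
}
```

Running this on the same Java runtime and classpath that Osmosis uses is the
point; on a different classpath it may pick up a different parser.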
On Mon, Nov 02, 2015 at 05:34:14 +0000, Brett Henderson wrote:
> Sorry, I'm terrible at checking this list. 6 months isn't ideal. Does
> anybody have an XML snippet that I could use for such a test?
>
> On Fri, 6 Mar 2015 at 23:50 Jochen Topf <jochen at remote.org> wrote:
>
> > I think the bug is important and subtle enough that we should make sure it
> > doesn't resurface again. Either by detecting the runtime or by the check
> > you describe. At least we should put the check into a unit test, so that
> > people who run the tests on their platform after building can be safe.
> >
> > Jochen
> >
> > On Thu, Mar 05, 2015 at 05:25:12PM +1100, Brett Henderson wrote:
> > > Date: Thu, 5 Mar 2015 17:25:12 +1100
> > > From: Brett Henderson <brett at bretth.com>
> > > To: Jochen Topf <jochen at remote.org>
> > > Cc: OSM-Dev Openstreetmap <dev at openstreetmap.org>
> > > Subject: Re: [OSM-dev] Osmosis data corruption on Debian Jessie/Testing
> > >
> > > I suspect that attempting to detect the underlying XML runtime would be
> > > brittle. Another option might be to embed that bit of data in Osmosis
> > > itself and do a self test before attempting to execute any XML tasks.
> > >
> > > I'm surprised that this is still an issue in standard Java. I tried
> > > raising tickets against Sun Java before it moved under Oracle but never
> > > got a response. I gave up, embedded Xerces in the main Osmosis
> > > distribution, and then forgot about it.
> > >
> > > On 5 March 2015 at 10:16, Jochen Topf <jochen at remote.org> wrote:
> > >
> > > > Hi!
> > > >
> > > > Just spent a few hours debugging this problem: The way Osmosis is
> > > > packaged on Debian Jessie seems to be wrong. It doesn't use the Xerces
> > > > XML parser but seems to fall back to the Java default XML parser, which
> > > > mangles Unicode characters.
> > > >
> > > > This can lead to data corruption (and has for me today) when using
> > > > Osmosis for planet updates etc.
> > > >
> > > > You can test whether this bug is on your system, too: Download the XML
> > > > for this node: http://www.openstreetmap.org/node/3382756758. Then run
> > > > it through osmosis:
> > > >
> > > > osmosis --rx 3382756758.osm --wx out.osm
> > > >
> > > > Compare the two files, you'll see the musical notation character
> > > > doubling in the second case when your Osmosis is broken. The fix is
> > > > simple: Add a line "load /usr/share/java/xercesImpl.jar" to
> > > > /etc/osmosis/plexus.conf. As I understand this, it tells Java to load
> > > > Xerces replacing the built-in XML parser.
> > > >
> > > > I have opened a bug with Debian.
> > > >
> > > > Arguably Osmosis should somehow detect when Xerces isn't found and
> > > > return an error instead of using a different implementation. But I
> > > > don't know enough about Java to say whether that's possible.
> > > >
> > > > Jochen
> > > > --
> > > > Jochen Topf jochen at remote.org http://www.jochentopf.com/
> > > > +49-173-7019282
> > > >
> >
> > --
> > Jochen Topf jochen at remote.org http://www.jochentopf.com/
> > +49-173-7019282
> >
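
On detecting a missing Xerces, as raised in the quoted messages: one possible
check is to look at which SAXParserFactory implementation JAXP hands out. A
rough sketch, assuming standalone Xerces-J registers its classes under
org.apache.xerces while the parser bundled with the JDK lives under
com.sun.org.apache.xerces.internal. The class and method names are
hypothetical and nothing here is taken from Osmosis. As Brett says above, a
check like this is brittle, so it would complement a data-based self test
rather than replace it.

```java
import javax.xml.parsers.SAXParserFactory;

// Hypothetical startup check, not part of Osmosis.
class XercesCheck {
    // Assumption: standalone Xerces-J uses this package prefix, while the
    // JDK's built-in copy uses com.sun.org.apache.xerces.internal.
    static final String XERCES_PREFIX = "org.apache.xerces.";

    // True when the given factory class name belongs to standalone Xerces.
    static boolean isStandaloneXerces(String factoryClassName) {
        return factoryClassName.startsWith(XERCES_PREFIX);
    }

    // Class name of the SAX parser factory JAXP selects on this classpath.
    static String activeFactoryClassName() {
        return SAXParserFactory.newInstance().getClass().getName();
    }

    public static void main(String[] args) {
        String name = activeFactoryClassName();
        if (!isStandaloneXerces(name)) {
            // Fail loudly instead of silently using another implementation.
            System.err.println("XML parser is " + name + ", not standalone Xerces");
            System.exit(1);
        }
        System.out.println("Using " + name);
    }
}
```

Whether a hard exit is right, or only a warning, is a design choice; exiting
matches the "return an error" suggestion in the original report.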
--
Jochen Topf jochen at remote.org http://www.jochentopf.com/ +49-351-31778688