[OSM-dev] osmosis bz2 performance
brett at bretth.com
Wed Feb 6 13:44:21 GMT 2008
The Apache bz2 implementation was the only one I could find; if
there's an alternative, I'd like to hear about it. The Apache
implementation is pure Java, and I suspect that is the main difference.
The gzip implementation uses the Inflater and Deflater classes, which
have a bunch of native methods. I see the bz2 support in osmosis as a
convenience aid, but when I'm processing large files I use the native
bzip2 command line tools and pipe the data into osmosis. For example:
bzcat planet.bz2 | osmosis --rx /dev/stdin --wn
The hourly and minute changesets on planet.openstreetmap.org are all
written to gz due to the performance improvement. The daily changesets
are bz2 files but are created using the command line bzip2.
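To illustrate the gzip point above, here is a minimal sketch (not osmosis code) of a gzip round trip using only the JDK. GZIPOutputStream and GZIPInputStream are built on Deflater and Inflater, which is where the native methods come in. The class name GzipRoundTrip and the sample payload are hypothetical, chosen only for this example.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper for illustration only; not part of osmosis.
public class GzipRoundTrip {

    // Compress a byte array. GZIPOutputStream delegates to java.util.zip.Deflater.
    static byte[] compress(byte[] raw) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(buffer)) {
            gz.write(raw);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    // Decompress a gzip byte array. GZIPInputStream delegates to java.util.zip.Inflater.
    static byte[] decompress(byte[] packed) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPInputStream gz = new GZIPInputStream(new ByteArrayInputStream(packed))) {
            byte[] chunk = new byte[8192];
            int n;
            while ((n = gz.read(chunk)) != -1) {
                out.write(chunk, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        // Made-up sample payload, standing in for real OSM XML.
        byte[] raw = "<osm></osm>".getBytes(StandardCharsets.UTF_8);
        byte[] restored = decompress(compress(raw));
        System.out.println(Arrays.equals(raw, restored));
    }
}
```

This only shows which JDK classes sit on the gzip path; it says nothing about actual throughput, which you'd have to measure on a real planet file.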
Stefan Baebler wrote:
> It seems that apache's bz2 implementation that is used in Osmosis is
> very slow compared to the gz implementation. Could it simply be due to
> Java or are other bz2 implementations in Java better?