[OSM-dev] osmosis bz2 performance

Brett Henderson brett at bretth.com
Wed Feb 6 13:44:21 GMT 2008


Hi Stefan,

The Apache bz2 implementation was the only one I could find; if 
there's an alternative, I'd like to hear about it.  The Apache 
implementation is pure Java, and I suspect that is the main difference.  
The gzip implementation uses the Inflater and Deflater classes, which 
have a bunch of native methods.  I see the bz2 support in osmosis as a 
convenience, but when I'm processing large files I use the native 
bzip2 command line tools and pipe the data into osmosis.  For example:

bzcat planet.bz2 | osmosis --rx /dev/stdin --wn

The hourly and minute changesets on planet.openstreetmap.org are all 
written as gz files because of the performance improvement.  The daily 
changesets are bz2 files, but they are created using the command line 
bzip2.
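
To illustrate the gzip point above, here is a minimal sketch of a gzip 
round trip using only the JDK's java.util.zip streams, which sit on top 
of the Inflater and Deflater classes and their native methods.  The 
class name GzipRoundTrip, its helper methods and the sample XML string 
are made up for illustration; this is not how osmosis itself is 
written.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper class, not part of osmosis.
public class GzipRoundTrip {

    // Compress a byte array with the JDK gzip stream (Deflater underneath).
    static byte[] compress(byte[] raw) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(buffer)) {
            gz.write(raw);
        }
        // Closing the gzip stream has written the trailer, so the data is complete.
        return buffer.toByteArray();
    }

    // Decompress gzip data with the JDK gzip stream (Inflater underneath).
    static byte[] decompress(byte[] packed) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPInputStream gz =
                new GZIPInputStream(new ByteArrayInputStream(packed))) {
            byte[] chunk = new byte[8192];
            int n;
            // read() returns -1 at end of stream.
            while ((n = gz.read(chunk)) != -1) {
                out.write(chunk, 0, n);
            }
        }
        return out.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        // Sample payload, made up for this example.
        byte[] xml = "<osm></osm>".getBytes(StandardCharsets.UTF_8);
        byte[] restored = decompress(compress(xml));
        System.out.println(new String(restored, StandardCharsets.UTF_8));
    }
}
```

The same stream wrapping works for files by swapping the byte array 
streams for FileInputStream and FileOutputStream.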

Brett

Stefan Baebler wrote:
>
> It seems that apache's bz2 implementation that is used in Osmosis is
> very slow compared to the gz implementation. Could it simply be due to
> Java or are other bz2 implementations in Java better?
>   



