[OSM-dev] Binary Format, Performance, osm2pbf

Frederik Ramm frederik at remote.org
Fri Oct 15 14:56:49 BST 2010


A few weeks ago, when I announced that I plan to rely on the 
new binary format in the future, I caught some flak for claiming that an 
.osm.pbf is not only faster to produce, parse, and transmit than an 
.osm.bz2 but also faster to unpack.

Kai did some measurements with osmosis and found unpacking the .osm.pbf 
to be slower.

I have now done some tests and also included Stefan's pbf2osm (which is 
written in C). I used the current OSM file for the German state of 
Bavaria for testing:

File sizes (in bytes):

.osm       2462778377
.osm.pbf    134579577 (with compression=deflate and lossless)
.osm.bz2    223006298


Time to unpack to .osm:

from .osm.bz2 with bunzip2: 0m 57s (user: 0m 52s)
from .osm.pbf with pbf2osm: 0m 55s (user: 0m 41s)
from .osm.pbf with osmosis: 1m 11s (user: 1m 23s)

So Kai was right: decompression with osmosis is somewhat slower than 
bunzip2, but pbf2osm is slightly faster.
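
For a rough sense of proportion, the figures above can be turned into 
ratios with a few lines of Python. This is only a sketch: the variable 
and function names are made up for illustration, it assumes the first 
time in each row is elapsed wall-clock time, and "MB" here means 10^6 
bytes.

```python
# File sizes in bytes, copied from the list above (Bavaria extract).
sizes = {".osm": 2462778377, ".osm.pbf": 134579577, ".osm.bz2": 223006298}

# Elapsed seconds to unpack to plain .osm (assumed to be wall-clock time).
wall = {"bunzip2": 57, "pbf2osm": 55, "osmosis": 71}


def percent_of(part, whole):
    """Return part as a percentage of whole."""
    return 100.0 * part / whole


# How large is each compressed file relative to the raw XML?
for name in (".osm.pbf", ".osm.bz2"):
    print(f"{name}: {percent_of(sizes[name], sizes['.osm']):.1f}% of .osm")

# How large is the .osm.pbf relative to the .osm.bz2?
pbf_vs_bz2 = percent_of(sizes[".osm.pbf"], sizes[".osm.bz2"])
print(f".osm.pbf is {pbf_vs_bz2:.1f}% of the .osm.bz2 size")

# Raw XML produced per second by each tool.
for tool, secs in wall.items():
    print(f"{tool}: {sizes['.osm'] / secs / 1e6:.1f} MB/s of .osm written")
```

These numbers come from a single run on a single file, so they show 
the rough proportions only and are not a benchmark.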

And if we manage to build pbf2osm into processing tools like osm2pgsql 
(which I hear Stefan is working on), that will get rid of the XML 
parsing step altogether, and things will improve considerably again.

