[OSM-dev] Binary Format, Performance, osm2pbf
frederik at remote.org
Fri Oct 15 14:56:49 BST 2010
A few weeks ago, when I announced that I was planning to rely on the
new binary format in the future, I caught some flak for claiming that an
.osm.pbf was not only faster to produce, parse, and transmit than a .bz2,
but would also unpack faster.
Kai did some measurements with osmosis and found unpacking the .osm.pbf
to be slower.
I have now done some tests and also included Stefan's pbf2osm (which is
written in C). I used the current OSM file for the German state of
Bavaria for testing:
.osm.pbf size: 134579577 bytes (with compression=deflate and lossless)
from .osm.bz2 with bunzip2: 0m 57s (user: 0m 52s)
from .osm.pbf with pbf2osm: 0m 55s (user: 0m 41s)
from .osm.pbf with osmosis: 1m 11s (user: 1m 23s)
So Kai was right: decompression with osmosis is a bit slower than
bunzip2, but pbf2osm is faster.
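
To put the three measurements on a common scale, the short sketch below
converts the timings quoted above to seconds and expresses each tool
relative to bunzip2. It is only an illustration: the helper names
(to_seconds, relative_to_baseline) are made up for this example and are
not part of pbf2osm, osmosis, or any other tool mentioned here.

```python
# Timings copied from the measurements above: (wall-clock, user CPU).
timings = {
    "bunzip2": ("0m 57s", "0m 52s"),
    "pbf2osm": ("0m 55s", "0m 41s"),
    "osmosis": ("1m 11s", "1m 23s"),
}


def to_seconds(text):
    """Convert a string such as '1m 11s' to a number of seconds."""
    minutes, seconds = text.split()
    return int(minutes.rstrip("m")) * 60 + int(seconds.rstrip("s"))


def relative_to_baseline(timings, baseline="bunzip2"):
    """Return {tool: (wall ratio, user ratio)} relative to the baseline tool."""
    base_wall, base_user = (to_seconds(t) for t in timings[baseline])
    return {
        tool: (to_seconds(wall) / base_wall, to_seconds(user) / base_user)
        for tool, (wall, user) in timings.items()
    }


for tool, (wall_ratio, user_ratio) in relative_to_baseline(timings).items():
    print(f"{tool}: wall x{wall_ratio:.2f}, user x{user_ratio:.2f}")
# bunzip2: wall x1.00, user x1.00
# pbf2osm: wall x0.96, user x0.79
# osmosis: wall x1.25, user x1.60
```

In wall-clock terms pbf2osm is only slightly ahead of bunzip2, while the
gap in user CPU time is larger. The osmosis run shows more user time than
wall-clock time, which suggests it spread work over more than one core.
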
And if we manage to build pbf2osm into processing tools like osm2pgsql
(which I hear Stefan is working on), that will get rid of XML parsing
altogether, and things will again improve considerably.