[osmosis-dev] Fast PBF Reader
Brett Henderson
brett at bretth.com
Sun Jul 29 14:13:07 BST 2012
Hi All,
I've been playing with the PBF reader implementation in Osmosis to see if I
can improve its performance.
The nice thing about the PBF format is that the data stream is broken into
coarse chunks that can be processed using multiple threads without thread
synchronisation being a major overhead. I've just checked in a new
--read-pbf-fast implementation which does just that. It is a complete
rewrite of the existing PBF implementation (it was easier to rewrite than
to retrofit threading into the current implementation). From an end user's
perspective, it is similar to the existing --read-pbf task but has an
additional argument called "workers" which defines the number of worker
threads to use for processing. It defaults to 1, but increasing it to
match your number of cores gives a significant performance boost. On my
quad-core system (no hyper-threading) reading is 2-3 times faster with
workers=4 when just reading the file and discarding the contents.
Real-world usage with a longer pipeline will be less dramatic.
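
To illustrate the general pattern (this is not the Osmosis code, which is
Java), here is a minimal Python sketch. It assumes each chunk is an
independently zlib-compressed blob of newline-separated text; decode_chunk
and read_chunks are hypothetical names made up for the example. Chunks are
decoded on a pool of worker threads, and the results are handed back in the
original stream order.

```python
import zlib
from concurrent.futures import ThreadPoolExecutor


def decode_chunk(blob):
    """Hypothetical stand-in for decoding one chunk into a list of entities."""
    return zlib.decompress(blob).decode("utf-8").split("\n")


def read_chunks(blobs, workers=1):
    """Decode chunks on worker threads, yielding entities in input order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() distributes the chunks across the workers but returns the
        # results in submission order, so the output order is unchanged.
        for entities in pool.map(decode_chunk, blobs):
            for entity in entities:
                yield entity


if __name__ == "__main__":
    blobs = [zlib.compress("node {0}\nway {0}".format(i).encode("utf-8"))
             for i in range(3)]
    for entity in read_chunks(blobs, workers=4):
        print(entity)
```

Because the chunks share no state, the only coordination needed is putting
the results back in order, which is why synchronisation overhead stays small.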
My command line in testing looks like this:
osmosis --read-pbf-fast myfile.pbf workers=4 --b bufferCapacity=10000 --wn
A large buffer is very important. The task implementation uses the master
thread to split the input stream *and* send results to the sink. Using a
buffer is essential if you have any downstream tasks connected.
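
My reading of why the buffer matters: if the master thread calls the sink
directly, any time spent downstream is time it is not splitting the input,
so the workers can run dry. A bounded buffer drained by a separate thread
decouples the two. The Python sketch below shows that idea only; it is not
how the Osmosis --buffer task is implemented, and run_with_buffer is a
hypothetical name.

```python
import queue
import threading

_DONE = object()  # sentinel marking the end of the stream


def run_with_buffer(source, sink, capacity=10000):
    """Feed items from source to sink through a bounded buffer.

    The calling thread only enqueues; a separate thread runs the sink.
    Error handling is omitted to keep the sketch short.
    """
    buf = queue.Queue(maxsize=capacity)

    def drain():
        while True:
            item = buf.get()
            if item is _DONE:
                return
            sink(item)

    consumer = threading.Thread(target=drain)
    consumer.start()
    for item in source:
        buf.put(item)  # blocks only while the buffer is full
    buf.put(_DONE)
    consumer.join()


if __name__ == "__main__":
    received = []
    run_with_buffer(range(5), received.append, capacity=2)
    print(received)
```

With a larger capacity the producer blocks less often when the sink is
briefly slow, which is the reason for a value like 10000 in the command above.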
It's only available in Git for now, and is documented in the development
version of the Detailed Usage page:
http://wiki.openstreetmap.org/wiki/Osmosis/Detailed_Usage_0.41#--read-pbf-fast_.28--rbf.29
Brett