[OSM-dev] [osmosis-dev] Proposal for a multithreaded PBF reader

Andrew Byrd andrew at fastmail.net
Thu Jun 4 09:04:48 UTC 2015


Hello,

Can anyone share use cases where multi-threaded PBF reading significantly speeds up processing? I would generally expect PBF reading to be I/O-bound rather than CPU-bound, but I still need to take more accurate measurements.

Of course, processing the OSM data once the PBF is decoded can be quite CPU-intensive, but speeding that up would mean buffering the decoded data and parallelizing the downstream work (geometric operations, for example), not the reading itself.

I’d appreciate any data points and example use cases you might have, as I’m currently working on related tooling.

Andrew Byrd

> On 04 Jun 2015, at 05:57, Brett Henderson <brett at bretth.com> wrote:
> 
> On 30 April 2015 at 03:27, Paul Norman <penorman at mac.com> wrote:
> > On 4/29/2015 9:55 AM, Martijn van Exel wrote:
> > > If osmosis is the reference implementation, is there a reason why it
> > > doesn't seem to leverage this block structure to speed up reading? Or
> > > does it?
> > Osmosis has the --read-pbf-fast task which allows multiple worker threads.
> 
> That's right. I forget exactly how the PBF structure works off the top of my head, but the file is already split into blocks. The main --read-pbf-fast thread simply grabs the outer protobuf blocks from the file and distributes them to worker threads, which parse the OSM entities out of each block. After extraction, the entities within each block are passed to the downstream task in original file order. I'm not sure I see the need to modify the PBF file format.
>  
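
For anyone wanting to experiment with the pattern Brett describes (one thread hands out blocks, workers decode them, results go downstream in file order), here is a minimal Python sketch. It is not Osmosis code: the names decode_block and read_blocks_in_order and the window parameter are made up for illustration, and the "decoding" is only a zlib decompress standing in for real entity parsing. It assumes each block is an independent zlib-compressed blob.

```python
import zlib
from collections import deque
from concurrent.futures import ThreadPoolExecutor


def decode_block(raw):
    # Hypothetical stand-in for real work: only decompress the blob.
    # Parsing OSM entities out of the decompressed bytes is left out.
    return zlib.decompress(raw)


def read_blocks_in_order(blocks, workers=4, window=16):
    """Decode blocks on worker threads, yielding results in input order."""
    pending = deque()  # futures, kept in the order the blocks were read
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for raw in blocks:
            pending.append(pool.submit(decode_block, raw))
            # Cap the number of in-flight blocks so memory stays bounded.
            if len(pending) >= window:
                yield pending.popleft().result()
        # Drain whatever is still in flight, oldest first.
        while pending:
            yield pending.popleft().result()


if __name__ == "__main__":
    payloads = [("block %d " % i).encode() * 1000 for i in range(100)]
    compressed = [zlib.compress(p) for p in payloads]
    decoded = list(read_blocks_in_order(compressed))
    assert decoded == payloads
    print("decoded", len(decoded), "blocks in original order")
```

Results are always taken from the oldest pending future, so output order matches input order even when a later block finishes first. Timing this against a single-threaded loop over the same blocks is one rough way to check whether decoding or I/O dominates for a given file.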