[OSM-dev] [osmosis-dev] Proposal for a multithreaded PBF reader

François Battail francois.battail at sipibox.fr
Wed Apr 29 18:28:31 UTC 2015


On 29/04/2015 19:27, Paul Norman wrote:
> The real gains from the threading branch were not multi-threaded PBF reading, but more concurrency in the
> geometry processing and database parts.

Completely agree. Parsing the planet file *without* doing anything else
takes around 800 s (less than 15 min), while simply copying it takes
about 200 s, so at best we could gain a 4x speedup on parsing by using
tricky techniques (AIO, look-ahead, threading...). That gain is
negligible compared to the time needed to process OSM objects and call
libpq, even when using the binary format and prepared statements.
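
To make the arithmetic explicit, here is a small sketch in Python. The 800 s and 200 s figures are the ones quoted above; the total import time is NOT a measurement from this thread, it is a hypothetical parameter, and the function name is made up for illustration. The sketch also assumes parsing time adds directly to the total, which overstates the benefit if parsing already overlaps with database work.

```python
def parsing_gain(parse_s, copy_s, total_import_s):
    """Upper bound on what a perfect parser optimization could save.

    parse_s        -- time to parse the file doing nothing else (seconds)
    copy_s         -- time to merely copy the file, i.e. the I/O floor (seconds)
    total_import_s -- HYPOTHETICAL end-to-end import time (seconds)
    """
    speedup = parse_s / copy_s          # best-case parsing speedup
    saved_s = parse_s - copy_s          # seconds saved at best
    fraction = saved_s / total_import_s # share of the whole import
    return speedup, saved_s, fraction


# 800 s and 200 s come from the message; 36000 s (10 h) is an assumed total.
speedup, saved_s, fraction = parsing_gain(800, 200, 36000)
print(f"{speedup:.0f}x parsing speedup saves {saved_s} s, "
      f"{fraction:.1%} of an assumed {36000} s import")
```

Whatever the real total is, the saving is capped at 600 s, so it shrinks as a fraction as the database side takes longer.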

Maybe it could be of interest for some specific applications, but for
loading OSM data into a database there is no value in optimizing
parsing, as the database workers are mostly the limiting factor.

In my application, with 32 GB of memory (and 32 GB of swap), I have to
pause the parser because the queue is full while the database is still
working through the bulk load (without indexes).
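
The pause described above is ordinary backpressure from a bounded queue. The following is a minimal sketch in Python, not the code of the application mentioned here: the function name, the queue size and the sleep that stands in for a slow bulk insert are all assumptions chosen for illustration.

```python
import queue
import threading
import time


def run_pipeline(n_items, queue_size, db_delay):
    """Feed n_items from a fast 'parser' to one slow 'database' worker.

    Returns (items_loaded, max_queue_depth_seen).
    """
    q = queue.Queue(maxsize=queue_size)  # bounded: caps memory use
    loaded = []
    sentinel = object()                  # marks end of input

    def db_worker():
        while True:
            item = q.get()
            if item is sentinel:
                break
            time.sleep(db_delay)         # stand-in for a slow bulk insert
            loaded.append(item)

    worker = threading.Thread(target=db_worker)
    worker.start()

    max_depth = 0
    for i in range(n_items):
        q.put(i)                         # blocks while the queue is full,
                                         # i.e. the parser is paused here
        max_depth = max(max_depth, q.qsize())

    q.put(sentinel)
    worker.join()
    return len(loaded), max_depth


if __name__ == "__main__":
    print(run_pipeline(n_items=50, queue_size=4, db_delay=0.001))
```

Because the producer blocks in `put()`, a faster parser only means it spends more time waiting; total run time is set by the consumer.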

I've tried to optimize all stages as much as possible, even the parsing,
by using a custom allocation system. I don't see the point of optimizing
this part further, as the bottleneck is the database (and I don't want to
rewrite PostgreSQL, which is very good software!).

Best regards


