[OSM-dev] [osmosis-dev] Proposal for a multithreaded PBF reader
François Battail
francois.battail at sipibox.fr
Wed Apr 29 18:28:31 UTC 2015
On 29/04/2015 19:27, Paul Norman wrote:
> The real gains from the threading branch were not multi-threaded PBF reading, but more concurrency in the
> geometry processing and database parts.
Completely agree. Parsing Planet *without* doing anything else takes
around 800s (less than 15 min), and copying it takes something like
200s, so ideally we could gain a 4x speedup by using tricky things (AIO,
look-ahead, threading...). That is simply negligible compared to the
time needed to process OSM objects and invoke libpq, even when using
binary format and prepared statements.
Maybe for some specific applications it could be of interest, but for
integrating OSM data into a database there's no value in optimizing
parsing, as the database workers are mostly the limiting factor.
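
To put rough numbers on this, here is a small Python sketch of the
arithmetic. The 800s and 200s figures are the ones quoted above; the
total import time (TOTAL_S) is purely hypothetical, not a measurement,
and the helper name overall_speedup is made up for illustration. It
also assumes parsing runs serially with the rest of the import, which
is the most favourable case for optimizing the parser.

```python
def overall_speedup(total_s, parse_s, parse_floor_s):
    """Speedup of the whole import if only parsing gets faster.

    total_s       -- time of the full import (parse + process + database)
    parse_s       -- current parsing time
    parse_floor_s -- best parsing time one could hope for
    """
    saved = parse_s - parse_floor_s
    return total_s / (total_s - saved)


PARSE_S = 800.0          # from the message: parsing Planet, nothing else
COPY_S = 200.0           # from the message: plain copy of the file
TOTAL_S = 10 * 3600.0    # HYPOTHETICAL total import time (10 hours)

parse_speedup = PARSE_S / COPY_S
import_speedup = overall_speedup(TOTAL_S, PARSE_S, COPY_S)

print(f"parser alone: {parse_speedup:.1f}x")
print(f"whole import: {import_speedup:.3f}x")
```

Under that assumed total, a 4x faster parser saves at most 600s, which
is a gain of under 2% on the whole import. If the parser already
overlaps with the database workers, the gain is smaller still.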
In my application, with 32 GB of memory (and 32 GB of swap) I need to
pause the parser because the queue is full and I'm waiting for the
database to process the bulk loading (without indexes).
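
The "pause the parser when the queue is full" behaviour can be
sketched with a bounded queue. This is only a minimal Python
illustration, not my actual code: the names (parser, db_worker,
QUEUE_SIZE) and the sleep standing in for the bulk load are all
assumptions made for the example.

```python
import queue
import threading
import time

QUEUE_SIZE = 4      # hypothetical bound; a real one would be far larger
SENTINEL = None     # marks the end of the stream


def parser(out_q, n_items):
    """Fast producer: stands in for the PBF parser."""
    for i in range(n_items):
        # put() blocks while the queue is full, which pauses the parser
        # until the database worker has caught up.
        out_q.put(i)
    out_q.put(SENTINEL)


def db_worker(in_q, loaded):
    """Slow consumer: stands in for the database bulk load."""
    while True:
        item = in_q.get()
        if item is SENTINEL:
            break
        time.sleep(0.001)   # pretend the database is the slow part
        loaded.append(item)


def run(n_items=50):
    q = queue.Queue(maxsize=QUEUE_SIZE)
    loaded = []
    producer = threading.Thread(target=parser, args=(q, n_items))
    consumer = threading.Thread(target=db_worker, args=(q, loaded))
    producer.start()
    consumer.start()
    producer.join()
    consumer.join()
    return loaded


loaded = run()
print(len(loaded), "items loaded")
```

With this shape, the total run time is set by the consumer: making the
producer faster only means it spends more time blocked in put().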
I've tried to optimize all stages as much as possible, even the parsing,
by using a custom allocation system. I don't see the point of optimizing
this part further, as the bottleneck is the database (and I don't want
to rewrite PostgreSQL, which is very good software!).
Best regards