[OSM-dev] [osmosis-dev] Proposal for a multithreaded PBF reader

Martijn van Exel m at rtijn.org
Wed Apr 29 19:45:51 UTC 2015


This is assuming that you are feeding a database, which is not our use
case. But I realize it is for most people, which is why I wanted to
open this discussion in the first place.
Martijn
Martijn van Exel
skype: mvexel


On Wed, Apr 29, 2015 at 12:28 PM, François Battail
<francois.battail at sipibox.fr> wrote:
> On 29/04/2015 19:27, Paul Norman wrote:
>>
>> The real gains from the threading branch were not multi-threaded PBF
>> reading, but more concurrency in the
>> geometry processing and database parts.
>
>
> Completely agree. Parsing the Planet file *without* doing anything with the
> data takes around 800s (less than 15 min), and simply copying it takes
> something like 200s, so at best we could gain a 4x speedup from tricky
> techniques (AIO, look-ahead, threading...). That gain is negligible compared
> with the time needed to process OSM objects and call libpq, even when using
> the binary format and prepared statements.
>
> Maybe it could be of interest for some specific applications, but for
> integrating OSM data into a database there's no value in optimizing parsing,
> as the database workers are mostly the limiting factor.
>
> In my application, with 32 GB of memory (and 32 GB of swap), I have to pause
> the parser when the queue is full and wait for the database to finish the
> bulk load (without indexes).
>
> I've tried to optimize all stages as much as possible, even the parsing,
> where I use a custom allocation system. I don't see the point of optimizing
> this part any further, as the bottleneck is the database (and I don't want
> to rewrite PostgreSQL, which is very good software!).
>
> Best regards
>
> _______________________________________________
> dev mailing list
> dev at openstreetmap.org
> https://lists.openstreetmap.org/listinfo/dev
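
The speedup argument in the quoted message can be checked with a little arithmetic. The sketch below assumes the stages run one after another, which is a simplification, and the 8000s figure for the non-parsing work is purely hypothetical (the thread gives no such number). Only the 800s parse time and the 4x bound come from the message.

```python
def overall_speedup(parse_s, other_s, parse_speedup):
    """Whole-pipeline speedup when only the parsing stage gets faster.

    Assumes parsing and the remaining work (object processing, libpq calls)
    run strictly one after another; this is an illustrative simplification.
    """
    before = parse_s + other_s
    after = parse_s / parse_speedup + other_s
    return before / after


# Parsing alone: 800s reduced to the 200s copy time is the full 4x.
print(overall_speedup(800, 0, 4))

# Hypothetical: if the remaining work took 8000s (an assumed value, not a
# measurement), the same 4x parsing gain barely moves the total.
print(round(overall_speedup(800, 8000, 4), 3))
```

The larger the non-parsing share, the closer the overall speedup gets to 1, whatever the parser does.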

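Pausing the parser when the queue is full, as described in the quoted message, is the usual bounded-queue back-pressure pattern. A minimal sketch follows, using only the Python standard library. The names `run_pipeline` and `database_worker` are hypothetical, the "parser" is just an iterable, and the "database" is a list, so this shows the pattern and not the author's actual implementation.

```python
import queue
import threading


def run_pipeline(items, max_pending=4):
    """Push items through a bounded queue to a single consumer thread."""
    # Bounded queue: put() blocks once max_pending items are waiting.
    pending = queue.Queue(maxsize=max_pending)
    loaded = []
    sentinel = object()  # marks the end of the input

    def database_worker():
        # Stand-in for the bulk loader; a real worker would write to the
        # database here instead of appending to a list.
        while True:
            item = pending.get()
            if item is sentinel:
                break
            loaded.append(item)

    worker = threading.Thread(target=database_worker)
    worker.start()
    for item in items:     # stand-in for the PBF parser
        pending.put(item)  # blocks (pauses the parser) while the queue is full
    pending.put(sentinel)
    worker.join()
    return loaded


print(len(run_pipeline(range(1000))))
```

With this arrangement a faster producer simply spends more time blocked in `put()`, which is why speeding up parsing does not help once the consumer is the bottleneck.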

