[Tile-serving] Parallelizing more of osm2pgsql

Kai Krueger kakrueger at gmail.com
Tue Aug 27 02:29:04 UTC 2013


Hello everyone,

I would like to push this parallelization work on osm2pgsql along and
get it ready for merging into trunk.

I have now cleaned up the code further, and it passes all of the
regression tests in the current test suite, as well as a "visual
inspection" of a test import of Germany.

To understand the scaling issues Paul mentioned in his test below, I
have also done a set of tests, likewise on a cr1.8xlarge EC2 spot
instance with 244GB of RAM. As I only imported Germany (1.7GB pbf), in
order to be able to run more tests, I imported everything into a ram
disk, so disk I/O is essentially unlimited.

On the old single threaded code, I got the following values for the
first stage of processing:

Processing: Node(142090k 1406.8k/s) Way(21620k 37.09k/s) Relation(335190
686.86/s)  parse time: 1172s

Running the new code with --number-processes=1 results in a slight loss
of performance:

Processing: Node(142090k 1435.3k/s) Way(21620k 32.46k/s) Relation(335190
663.74/s)   parse time: 1270s --number-processes 1

But with an increasing number of processes, performance does improve
considerably over the single-process case:

Processing: Node(142090k 1420.9k/s) Way(21620k 55.58k/s) Relation(335190
1151.86/s)  parse time: 780s --number-processes 2
Processing: Node(142090k 1406.8k/s) Way(21620k 86.83k/s) Relation(335190
1831.64/s)  parse time: 533s  --number-processes 4
Processing: Node(142090k 1420.9k/s) Way(21620k 92.40k/s) Relation(335190
2280.20/s)  parse time: 481s --number-processes 8
Processing: Node(142090k 1449.9k/s) Way(21620k 77.49k/s) Relation(335190
2377.23/s)  parse time: 518s --number-processes 16

However, you can also see that Paul's finding that it doesn't really
scale beyond 4 processes holds up in my testing as well. Closer
analysis indicates that, for ways, it might really be bottlenecked on
the single-threaded pbf parser at a bit over 90k ways/s. I am still a
little surprised that the relations don't scale better, though. I
wonder whether using PostgreSQL 9.2, which supposedly has a number of
scalability enhancements, instead of 9.1 would have any effect.
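
For reference, here is a quick back-of-the-envelope Amdahl's-law
estimate derived from the parse times above. This is only a rough
sketch in Python; the "serial fraction" it prints lumps together
everything that doesn't parallelize (presumably mostly the pbf parser),
so treat the numbers as indicative only:

#!/usr/bin/env python
# Estimate the non-parallelizable share of stage-one processing from the
# Germany parse times reported above.
# Amdahl's law: T(N) = T1 * (s + (1 - s)/N), solved for s at each N.

times = {1: 1270, 2: 780, 4: 533, 8: 481, 16: 518}  # processes -> parse time [s]
t1 = times[1]

for n, t in sorted(times.items()):
    speedup = t1 / float(t)
    if n == 1:
        print("%2d process:  baseline (%ds)" % (n, t))
    else:
        s = (t / float(t1) - 1.0 / n) / (1.0 - 1.0 / n)
        print("%2d processes: speedup %.2fx, implied serial fraction %.0f%%"
              % (n, speedup, 100 * s))

This comes out at a lumped serial fraction of roughly a quarter to a
third, which would be consistent with the parser becoming the limiting
factor once about 4 worker processes are in use.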

Overall, though, a decrease in time from 1200s to 500s (for the first
stage of the import of Germany) is not too bad, so I do think it is
well worth working towards committing the code to osm2pgsql trunk.


The question remains though, how best to proceed with this?

Although the tests do all pass now, I would still like to see some
additional testing by others to verify that there are no issues before
committing it; thread race conditions are notoriously difficult to catch
in unit or integration testing. I would also like to know how these
scaling numbers are affected by different hardware.
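
To make that kind of testing a bit more systematic, the sort of
cross-check I have in mind is roughly the following: re-import the same
small extract with different --number-processes settings and diff the
resulting rendering tables against the single-process run. This is only
a sketch; the database name, extract file and sort key are placeholders
that would need adjusting to the local setup:

#!/usr/bin/env python
# Rough sketch: import a small extract repeatedly with varying
# --number-processes and check that the rendering tables come out
# identical to the single-process import.  DB and EXTRACT are
# placeholders; adjust connection details to your setup.
import subprocess

DB = "osm_test"              # hypothetical scratch database
EXTRACT = "extract.osm.pbf"  # any small extract
TABLES = ["planet_osm_point", "planet_osm_line",
          "planet_osm_polygon", "planet_osm_roads"]

def import_and_dump(nproc):
    subprocess.check_call(["osm2pgsql", "--create", "--slim",
                           "--number-processes", str(nproc),
                           "-d", DB, EXTRACT])
    dump = ""
    for table in TABLES:
        # Sort the rows so the comparison doesn't depend on insertion
        # order, which legitimately varies between parallel runs.
        dump += subprocess.check_output(
            ["psql", "-d", DB, "-A", "-t", "-c",
             "SELECT * FROM %s ORDER BY osm_id" % table]).decode()
    return dump

reference = import_and_dump(1)
for n in (2, 4, 8):
    if import_and_dump(n) == reference:
        print("--number-processes %d matches the single-process import" % n)
    else:
        print("MISMATCH with --number-processes %d" % n)

Of course this only catches nondeterministic output, not every possible
race, but repeated runs on different hardware would already increase
confidence quite a bit.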

A further question is how best to actually commit this. The commit
history of the threading branch is messy, not always in a logical
order, and broken at times, so bisecting through it is difficult at
best. I am therefore thinking of flattening the history into one or two
big patches. Any thoughts on what would be best here?

Kai



On 07/16/2013 04:49 PM, Paul Norman wrote:
>> From: Kai Krueger [mailto:kakrueger at gmail.com]
>> Sent: Sunday, July 14, 2013 4:14 PM
>> Subject: Re: [Tile-serving] Parallelizing more of osm2pgsql
>>
>> Hello everyone,
>>
>> If anyone is brave enough to try this out, or has a spare machine they 
>> could test it on, it would be great to get some additional results on 
>> different hardware to see how much this does or doesn't help. Also it 
>> seems some of the changes needed to make things thread-safe have 
>> reduced single-threaded efficiency somewhat. My guess would be that 
>> this is either due to lock operations, or because things are now no 
>> longer in a single transaction, as you can't share a transaction across threads.
> I ran some tests on a cr1.8xlarge EC2 spot instance (0.35 USD/hr spot), and
> the results are interesting.
>
> The instance has 244GB of RAM, 240GB of SSD as 2 120GB volumes, 2 E5-2670,
> which have a total of 16 cores (32 threads) at 2.6GHz, 3.3GHz turbo. Flags
> used were --slim --flat-nodes --drop -C 20000 --unlogged, postgres with
> fsync off and all options for speed over anything resembling integrity, with
> 4GB maintenance_work_mem. This machine is actually *slower* in single-core
> performance than Kai's laptop, which has better IPC and a higher turbo. The
> Geofabrik Europe extract was used for testing.
>
> I tried with both 32 and 8 processes. I had to up the postgresql connections
> for the former. 
>
> For the first stage: 
>
> 8: Processing: Node(987516k 1256.4k/s) Way(119852k 39.08k/s)
> Relation(1498360 2578.93/s)  parse time: 4434s
> 32: Processing: Node(987516k 1231.3k/s) Way(119852k 41.82k/s)
> Relation(1498360 2522.49/s)  parse time: 4262s
>
> Ways were at about 400% CPU, relations peaked at 600%. I hit 2.5k iops write
> on the SSD array.
>  
> We could be limited by the single-threaded PBF reader here.
>
> For the second stage:
>
> 8: Pending ways: 17k/s
> 32: Pending ways: 24k/s
>
> The indexing and clustering stages do benefit from higher core counts, and
> exceeded 5k iops write, being CPU bound.
>
> Overall:
> 8: 4h30m
> 32: 4h
> 32 without USE_TREE: 3h30m
>
> Conclusions: osm2pgsql is limited to effectively 4-8 cores, perhaps from a
> single-threaded task. Good for a desktop i7, but doesn't really make full
> use of a server CPU. The fastest machine I have access to for osm2pgsql
> would probably be my overclocked gaming desktop with a 4-core i5, no HT.
>
> I then tried non-slim mode. This used slightly more RAM than before, but at
> no point did all the RAM get used. The total time was 6:27:12.
>
> Conclusions: If you have enough ram for non-slim you have enough ram to
> cache your database reads and writes, and --slim --drop is faster than
> non-slim.
>
> Something for discussion: Do we want to optimize non-slim mode or drop it?
> Right now it's slower and has a very high RAM requirement, so it really
> doesn't make sense to keep it as-is.
>



