[OSM-dev] Speeding up Osm2pgsql through parallelization?
Kai Krueger
kakrueger at gmail.com
Sat Sep 17 17:43:47 BST 2011
Hartmut Holzgraefe wrote:
[...]
>
> For the sorting code to perform well work_mem is key though, while
> for the index recreation step maintenance_work_mem is needed.
> Currently the wiki guide on pgsql configuration suggests static
> values for both (and until recently did not mention work_mem at all).
> There's also an issue with osm2pgsql not really returning the
> cache memory to the operating system due to heap fragmentation.
>
> Better results for large imports (table sizes much larger than
> RAM size) could probably be achieved by:
>
> * making sure osm2pgsql properly returns the memory used for its
> cache to the operating system, for this i've got a working patch:
>
> https://github.com/hholzgra/osm2pgsql/tree/freeable_cache
OK, I have committed your patch and extended it to have a command line
option to fall back to the old behavior.
On Linux, allocating the node cache as one large chunk works well, as
the kernel overcommits memory and only allocates physical RAM for those
pages that are actually written to. So you can still specify a large
cache value and only use as much physical memory as is needed to cache
all the nodes. As it is hard to guess how much cache one needs,
particularly for diff imports, this is pretty helpful for not wasting
memory.
Other operating systems (Solaris? Mac OS X? Windows?) might behave
differently and actually reserve the full amount of memory, in which
case one might want to fall back to the old behavior, at least for diff
imports.
>
> * serializing the index creation and clustering steps
>
> running these in parallel makes sense where everything fits
> into caches in RAM so that things get CPU bound, but on large
> imports things will be IO bound anyway, and parallel index
> builds just lead to lower cache hit rates which causes even
> more IO load
I have committed this patch too, although I have changed it so that the
default remains to do the indexing in parallel, and the command line
switch changes the behavior to serial indexing.
In my preliminary benchmarks, doing the indexing in parallel was
slightly faster than doing it one table at a time. However, I only tried
it on small planet extracts (about 100 MB for the .osm.pbf file). I also
didn't experiment with the postgresql settings of work_mem and
maintenance_work_mem, which is potentially where the benefit of doing
things sequentially comes from, by making it possible to set those
values higher.
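For example, with serial index building a single backend could be given
most of the available RAM for each build; a rough sketch (the setting
value and the index definition are illustrative, not what osm2pgsql
actually issues):

  -- one index build at a time, each getting a large maintenance buffer
  SET maintenance_work_mem = '2GB';
  CREATE INDEX planet_osm_polygon_index
      ON planet_osm_polygon USING gist (way);
  RESET maintenance_work_mem;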
>
> * start with low default work_mem and maintenance_work_mem settings
> and raise them on a per statement level, so making the appropriate
> buffer for a given operation (work_mem for ORDER BY,
> maintenance_work_mem for CREATE INDEX) as large as possible and
> then shrinking it back to its default size afterwards
Is it possible to adjust work_mem and maintenance_work_mem at run time?
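Something like the following, perhaps, if they behave like ordinary
run-time settings? (The values and statements are only illustrative,
not what osm2pgsql issues.)

  -- raise the sort buffer for one expensive ORDER BY, then shrink it back
  SET work_mem = '512MB';                      -- affects only this session
  SELECT * FROM planet_osm_line ORDER BY way;  -- sort gets the larger buffer
  RESET work_mem;                              -- back to the config default

  -- or scoped to a single transaction
  BEGIN;
  SET LOCAL maintenance_work_mem = '1GB';      -- reverts at COMMIT/ROLLBACK
  REINDEX TABLE planet_osm_line;
  COMMIT;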
>
> Whether CLUSTER or the current approach is better/faster for our
> imports needs to be benchmarked, my personal bet would be that
> CLUSTER wins as our data distribution over time is not totally
> random, but this really needs to be tested out.
I haven't yet benchmarked the difference between CLUSTER and the current
sorting based method.
Has anyone else got numbers on this?
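For anyone who wants to benchmark it, the two strategies boil down to
roughly the following (hand-written sketches, with illustrative table
and index names, not the exact statements osm2pgsql issues):

  -- current approach: rewrite the table into sorted order via a copy;
  -- old table, sorted copy and sort spill coexist on disk (about 3x)
  CREATE TABLE planet_osm_line_tmp AS
      SELECT * FROM planet_osm_line ORDER BY way;
  DROP TABLE planet_osm_line;
  ALTER TABLE planet_osm_line_tmp RENAME TO planet_osm_line;

  -- CLUSTER: rewrite the table in the order of an existing index;
  -- only the old and new copies coexist (about 2x)
  CLUSTER planet_osm_line USING planet_osm_line_index;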
Kai
>
> (one additional advantage of CLUSTER would be that peak disk
> space usage during the operation would only be about two times
> the data size instead of three times with the current approach)
>