[OSM-dev] Speeding up Osm2pgsql through parallelization?

Frederik Ramm frederik at remote.org
Tue Sep 13 01:20:00 BST 2011


Kai,

   partial answer:

On 09/13/2011 02:07 AM, Kai Krueger wrote:
> 2) Currently all the (diff-) import is done in a single transaction.
> Therefore other db users (e.g. renderers) don't see any change until the
> full transaction is committed. In order to do things in parallel,
> however, there needs to be intermediary commits

[...]

> The question though is this valid? For the initial import this is
> probably not a problem as there won't be any db users concurrently until
> the import is complete. However, diff imports with concurrent rendering
> is a different matter. What will committing pending ways do to rendering?

Renderers use the geometry tables; the "pending" way is in the data
table where it will not usually be touched by renderers. So I don't see
a problem here. I am, however, not familiar with internal Postgres
processing, and I could imagine that there is a speed penalty in
committing pending ways as opposed to resetting the pending flag in the
same transaction where it was set.
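
To make the visibility part of the question concrete (not the speed
concern), here is a rough sketch. It uses Python's built-in sqlite3
module purely as a stand-in for Postgres, and the "ways" table with its
"pending" column is a made-up illustration, not the osm2pgsql schema.
Locking and performance differ between the two databases; the only point
shown is that a second connection sees a flag change once it has been
committed, and not before.

```python
import os
import sqlite3
import tempfile


def pending_flag_visibility():
    """Return the pending flag as seen by a second connection,
    before and after the writer commits."""
    path = os.path.join(tempfile.mkdtemp(), "demo.db")

    # Writer plays the role of the import process (hypothetical schema).
    writer = sqlite3.connect(path)
    writer.execute("CREATE TABLE ways (id INTEGER PRIMARY KEY, pending INTEGER)")
    writer.execute("INSERT INTO ways VALUES (1, 0)")
    writer.commit()

    # Reader plays the role of any other database user.
    reader = sqlite3.connect(path)

    # The UPDATE implicitly opens a transaction on the writer connection.
    writer.execute("UPDATE ways SET pending = 1 WHERE id = 1")
    before = reader.execute("SELECT pending FROM ways WHERE id = 1").fetchall()[0][0]

    # An intermediate commit makes the flag visible to other connections.
    writer.commit()
    after = reader.execute("SELECT pending FROM ways WHERE id = 1").fetchall()[0][0]

    reader.close()
    writer.close()
    return before, after


if __name__ == "__main__":
    print(pending_flag_visibility())
```

If renderers only ever read the geometry tables, a flag that becomes
visible early in the data table should not matter to them; whether the
extra commits cost time in Postgres is something this sketch cannot
answer.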

> 3) Currently the string cache is not thread safe. It is possible to
> disable it via a single preprocessor define and then parallelizing at
> least doesn't lead to crashes, but I assume it is there for a good
> reason. Presumably with a bit of work, it should be possible to get the
> string cache thread safe though as well. So assuming the other two
> points aren't show stoppers, this should be possible to fix.

Have you considered multiprocessing (i.e. fork) instead of
multithreading? Would this perhaps make these things go away elegantly?
Personally I abhor multithreading for the complexity it brings at
(usually) little gain compared to simply forking a few worker processes,
but of course YMMV, especially if you want tight communication between
workers.
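
A minimal sketch of what I mean by forking workers, written in Python
for brevity rather than in osm2pgsql's own code. The dict standing in
for the string cache, the function name run_workers and the sample tag
strings are all assumptions for illustration. The idea shown is that
each forked child gets its own private copy of the cache, so no locking
is needed; the price is that the children do not share cache entries
and must report back through a pipe (POSIX only).

```python
import os


def run_workers(chunks):
    """Fork one worker per chunk. Each child fills its own private copy
    of the cache and reports the resulting cache size through a pipe."""
    cache = {}  # stand-in for a string cache that is not thread safe
    children = []
    for chunk in chunks:
        read_fd, write_fd = os.pipe()
        pid = os.fork()
        if pid == 0:
            # Child: works on a copy of the parent's memory.
            os.close(read_fd)
            for s in chunk:
                cache.setdefault(s, len(cache))
            os.write(write_fd, str(len(cache)).encode())
            os.close(write_fd)
            os._exit(0)
        # Parent: keep the read end and remember the child.
        os.close(write_fd)
        children.append((pid, read_fd))

    sizes = []
    for pid, read_fd in children:
        data = os.read(read_fd, 64)
        os.close(read_fd)
        os.waitpid(pid, 0)
        sizes.append(int(data))

    # The parent's cache is untouched by anything the children did.
    return sizes, len(cache)


if __name__ == "__main__":
    print(run_workers([["highway", "name", "highway"], ["name", "ref"]]))
```

Each worker would also open its own database connection after the fork,
which ties in with the need for intermediate commits discussed above.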

Bye
Frederik

-- 
Frederik Ramm  ##  eMail frederik at remote.org  ##  N49°00'09" E008°23'33"


