[OSM-dev] osm2pgsql slow slim import

Kai Krueger kakrueger at gmail.com
Fri Dec 2 06:16:35 GMT 2011


Thanks for catching this regression. As my test database was always set to
fsync=off, I didn't notice this performance regression during the
development of the parallelisation work.

The problem is that, in order to allow multiple threads to work through the
pending ways and thus potentially speed this phase up if you are CPU bound
or have a RAID array, I had to break up the extended transaction in which
everything was previously encapsulated. This was necessary as transactions
can't be shared across connections and it could otherwise deadlock.
As a result, each pending way is now processed in a separate
transaction, and the transaction commit rate of PostgreSQL suddenly became
critical. If run with fsync on (and you don't have a RAID controller with
battery-backed cache), the commit rate can actually be incredibly low. For
example, according to pg_test_fsync, my laptop can only do 50 fsyncs/s,
and indeed, with fsync and synchronous_commit turned on, I got exactly those
50 pending ways per second processed.
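
To get a feel for what a commit-bound rate of 50 per second means, here is a
rough back-of-the-envelope sketch. The count of one million pending ways and
the batch size of 1000 are purely illustrative assumptions, and the model
ignores the actual processing time per way; it only counts commits.

```python
def estimated_seconds(pending_ways, commits_per_second, ways_per_transaction=1):
    """Rough time to get through pending ways when commits are the bottleneck."""
    if commits_per_second <= 0 or ways_per_transaction <= 0:
        raise ValueError("rates and batch sizes must be positive")
    # Ceiling division: a partially filled last transaction still needs a commit.
    transactions = -(-pending_ways // ways_per_transaction)
    return transactions / commits_per_second


# Hypothetical numbers: 1,000,000 pending ways at 50 commits/s.
one_way_per_commit = estimated_seconds(1_000_000, 50)
batched = estimated_seconds(1_000_000, 50, ways_per_transaction=1000)

print(f"one way per transaction: {one_way_per_commit / 3600:.1f} hours")
print(f"1000 ways per transaction: {batched:.0f} seconds of commit overhead")
```

With one commit per way, the commit rate alone dominates the run time, which
matches the behaviour described above.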

Anyway, as Frederik suggested, I have changed osm2pgsql to automatically set
the parameter synchronous_commit to off for the import session.
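
For anyone who wants the same behaviour in their own scripts, a minimal sketch
of a session-level setting follows. This is not osm2pgsql's actual code; the
helper name is made up, and conn is assumed to be any Python DB-API 2.0
connection to PostgreSQL that you have already opened yourself.

```python
def disable_synchronous_commit(conn):
    """Turn off synchronous_commit for the current session only.

    `conn` is assumed to be an already open DB-API 2.0 connection to
    PostgreSQL; this hypothetical helper does not create it.
    """
    cur = conn.cursor()
    try:
        # Session-level SET: affects only this connection, not postgresql.conf.
        cur.execute("SET synchronous_commit TO off")
    finally:
        cur.close()
    # SET is transactional, so commit in case the driver opened a transaction
    # implicitly; otherwise a later rollback would undo the setting.
    conn.commit()
```

Because the setting is per session, other clients of the same database keep
their normal durability guarantees.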

Doing this should be safe. Unlike turning off fsync, turning off
synchronous_commit cannot lead to a corrupted database. All that can happen
is that a few transactions that osm2pgsql thought were processed might get
lost on a database crash. During a full import that is "fine", as there is no
way to recover from a partial import anyway and one needs to start from
scratch. However, it should also be fine during diff imports, as it will
simply mean some pending ways that were processed did not get marked as done
and will be re-processed on the next diff import.
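
The diff import argument can be illustrated with a toy model. It is only a
sketch and does not reflect osm2pgsql's real data structures: the way IDs are
invented, and it assumes that processing the same pending way twice is
harmless.

```python
def run_import(pending, done, lost_commits=0):
    """Process every way not yet marked done and return the ways processed.

    `lost_commits` simulates a crash in which the "done" marks of the last
    N processed ways never reached disk.
    """
    processed = [way for way in pending if way not in done]
    keep = max(0, len(processed) - lost_commits)
    done.update(processed[:keep])  # only these marks survive the crash
    return processed


pending = [101, 102, 103, 104, 105]  # hypothetical way IDs
done = set()

first = run_import(pending, done, lost_commits=2)  # marks for 104, 105 are lost
second = run_import(pending, done)                 # the next diff import

print(first)   # all five ways were processed
print(second)  # only the two ways whose marks were lost are processed again
```

The crash costs some repeated work on the next run, but no way is skipped.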

Hopefully this fixes the performance regression introduced
previously.

Kai

P.S. There is still a potential for deadlock during diff imports with more
than one helper process, as the output tables are still written to in an
extended transaction. I still need to try and fix this, probably by breaking
that extended transaction into per-statement transactions, too.

--
View this message in context: http://gis.638310.n2.nabble.com/osm2pgsql-slow-slim-import-tp7044819p7053692.html