[OSM-dev] osm2pgsql slow on update import
jukka.rahkonen at latuviitta.fi
Sat May 7 07:06:04 BST 2011
Kai Krueger kirjoitti:
> Ian Dees wrote:
>> On subsequent updates osm2psgql does not have node information in memory
>> anymore, so it must request the node information from PostgreSQL. This
>> orders of magnitudes longer to do than a hit to memory.
> One possible additional problem is that osm2pqsql retrieves the node
> information one node at a time, rather than multiple nodes in a single sql
> query. Therefore if a way has 2000 nodes, it will need to fire off 2000
> queries to retrieve the nodes for that way. It also does not have any
> parallelism or asynchrony, resulting in them all being executed in a
> sequential blocking fashion.
> The easy solution to this would probably be to simply implement the
> pgsql_nodes_get_list function correctly, rather than as a simple for-loop
> wrapper around pgsql_nodes_get.
> A more complicated solution could potentially prefetch all nodes in a diff
> with a single query, before going through the normal processing. However,
> have not measured either solution to see how much they would improve
> performance and if they are worth the effort.
> However, looking at Yevaud trying to catch up with the 2 and a half days
> since the full planet import, it also appears that there are considerable
> amount of times when applying diffs are not I/O bound, suggesting there
> more bottlenecks.
Am I right in thinking that if there were suitable diffs available it
would be much faster to update directly the osm_point, osm_line and
osm_polygon tables without going through osm_points, osm_ways and
More information about the dev