[OSM-dev] osm2pgsql slow on update import

Jukka Rahkonen jukka.rahkonen at latuviitta.fi
Sat May 7 07:06:04 BST 2011


Kai Krueger kirjoitti:
>
> Ian Dees wrote:
>>
>> On subsequent updates osm2psgql does not have node information in memory
>> anymore, so it must request the node information from PostgreSQL. This
>> takes
>> orders of magnitudes longer to do than a hit to memory.
>>
> One possible additional problem is that osm2pqsql retrieves the node
> information one node at a time, rather than multiple nodes in a single sql
> query. Therefore if a way has 2000 nodes, it will need to fire off 2000
> SQL
> queries to retrieve the nodes for that way. It also does not have any
> parallelism or asynchrony, resulting in them all being executed in a
> sequential blocking fashion.
>
> The easy solution to this would probably be to simply implement the
> pgsql_nodes_get_list function correctly, rather than as a simple for-loop
> wrapper around pgsql_nodes_get.
>
> A more complicated solution could potentially prefetch all nodes in a diff
> with a single query, before going through the normal processing. However,
> I
> have not measured either solution to see how much they would improve
> performance and if they are worth the effort.
>
> However, looking at Yevaud trying to catch up with the 2 and a half days
> since the full planet import, it also appears that there are considerable
> amount of times when applying diffs are not I/O bound, suggesting there
> are
> more bottlenecks.

Am I right in thinking that if there were suitable diffs available it
would be much faster to update directly the osm_point, osm_line and
osm_polygon tables without going through osm_points, osm_ways and
osm_rels?

-Jukka Rahkonen-





More information about the dev mailing list