[OSM-dev] Proposed database improvement
nick at nickhill.co.uk
Thu Aug 31 10:26:53 BST 2006
I have been performing various tests on osm queries and found an
optimisation which will speed up the very queries which are generating
The following queries are taking around 9 seconds to complete:
SELECT id, node_a, node_b, tags FROM current_segments where (node_a IN
(<nodelist>) OR node_b IN (<nodelist>)) and visible = true
We have indexes on both node_a and node_b.
Each query will typically list 500 nodes, and be repeated as necessary
for a given area. a 500 node query takes 9 seconds to complete. On
investigation, the index for node b is not being used.
1. A compound index. This may not work given the selects are OR ed.
2. A union. This will generate duplicate lines unsorted
3. Two queries into a temporary table then select distinctrow
3 provides the exact bit-compatible output to the existing query at the
expense of a little more complexity. This is the one I have chosen,
although 2 may be adequate for the job, and could be faster still.
CREATE TEMPORARY TABLE
current_segments_temp LIKE current_segments; INSERT INTO
current_segments_temp SELECT * FROM current_segments where node_a IN
(<nodelist>) and visible = true; INSERT INTO current_segments_temp
SELECT * FROM current_segments where node_b IN
(<nodelist>) and visible = true;SELECT DISTINCTROW id, node_a, node_b,
tags FROM current_segments_temp ORDER BY id ASC'
Using this schema, we can further reduce API load by inserting the
output of all select statements for a given list of nodes into the one
temporary table in the single SQL connection, then performing distinctrow.
The new query takes 28 milliseconds to execute, compared to the existing
More information about the dev