[OSM-dev] Proposed database improvement

Thu Aug 31 11:54:35 BST 2006

Results

real    0m9.000s
user    0m0.006s
sys     0m0.003s

real    0m0.028s
user    0m0.006s
sys     0m0.001s

real    0m0.027s
user    0m0.008s
sys     0m0.000s

Script:
origquery="SELECT id, node_a, node_b, tags FROM current_segments where 
(node_a IN ($nodelist) OR node_b IN ($nodelist))  and visible = true"

nickquery="CREATE TEMPORARY TABLE current_segments_temp LIKE 
current_segments; \
INSERT INTO current_segments_temp SELECT * FROM current_segments where 
node_a IN ($nodelist) and visible = true; \
INSERT INTO current_segments_temp SELECT * FROM current_segments where 
node_b IN ($nodelist) and visible = true; \
SELECT DISTINCTROW id, node_a, node_b, tags FROM current_segments_temp 
ORDER BY id ASC"

davidquery="SELECT DISTINCTROW * FROM (
SELECT id, node_a, node_b, tags FROM current_segments where node_a IN 
($nodelist) and visible = true
UNION
SELECT id, node_a, node_b, tags FROM current_segments where node_b IN 
($nodelist) and visible = true
) as b ORDER BY id ASC"

time mysql -u$username -p$password -e "$origquery" openstreetmap >test1

time mysql -u$username -p$password -e "$nickquery" openstreetmap >test2

time mysql -u$username -p$password -e "$davidquery" openstreetmap >test3

In conclusion, David's suggestion of using a union is slightly faster 
than my idea of using a temporary table. (I am used to MySQL <4.1 so am 
not particularly familiar with sub-queries). This may or may not be the 
case when using larger numbers of selects, as my solution replicated 
indexes on the temporary table so could be more scalable.

David Sheldon wrote:
> On Thu, Aug 31, 2006 at 10:26:53AM +0100, Nick Hill wrote:
>> SELECT id, node_a, node_b, tags FROM current_segments where (node_a IN 
>> (<nodelist>) OR node_b IN (<nodelist>))  and visible = true
> 
>> 2. A union. This will generate duplicate lines unsorted
>> 3. Two queries into a temporary table then select distinctrow
> 
> How about a union and a distinctrow?
> 
> 
> SELECT DISTINCTROW * FROM (
>    SELECT id, node_a, node_b FROM current_segments where node_a IN 
> 	                   (<nodelist>) and visible = true
> UNION
>    SELECT id, node_a, node_b FROM current_segments where node_b IN 
> 	                   (<nodelist>) and visible = true
>    )
> 
> Of course this may be implemented internally using a temporary table,
> but we can leave that up to the query analyser.
> 
> Could you see how long that takes?
> 
> David