[OSM-dev] Potlatch update

Richard Fairhurst richard at systemeD.net
Tue May 8 20:04:47 BST 2007


Frederik Ramm wrote:

>> I guess the best option would be a query to 'select all those  
>> nodes within this area which aren't in a segment'
>
> I am unfamiliar with the data structure but assuming you have a table
> "nodes" and a table "segments", both with an "id", and "segments"
> having a "fromnode" and a "tonode" value

Yes, that's it.

> it would probably be done like this:
>
> SELECT nodes.id
> FROM nodes
> LEFT OUTER JOIN segments
> ON (segments.fromnode=nodes.id OR segments.tonode=nodes.id)
> WHERE ... put your node-in-bounding-box condition here ...
> AND segment.id IS NULL;
>
> This will first try to pair up all nodes with segments using them,
> and later use only those records where pairing didn't work out.

Ok, so for a typical bounding box (the Oxford area), my query is:

SELECT current_nodes.id
   FROM current_nodes
   LEFT OUTER JOIN current_segments
     ON (current_segments.node_a=current_nodes.id OR
         current_segments.node_b=current_nodes.id)
WHERE (latitude BETWEEN 51.7047726979662 AND 51.8310050584275)
    AND (longitude BETWEEN -1.37151000827839 AND -1.11118408054402)
    AND current_segments.id IS NULL
    AND current_nodes.visible=1

> Pray that the SQL engine is smart enough to apply the bounding box
> constraint first and not pair up all the nodes in the world ;-)

Unfortunately it isn't. :(

The query takes many, many minutes to complete. It runs at an almost  
passable speed if you just use one half of the 'ON', i.e. ON  
current_segments.node_a=current_nodes.id. But obviously this will  
then also return nodes which are only referenced in node_b of the  
segments table.

The output from EXPLAIN is:

+----+-------------+------------------+------- 
+------------------------------------------------------------- 
+---------------------------+---------+------+-------- 
+-------------------------+
| id | select_type | table            | type  |  
possible_keys                                               |  
key                       | key_len | ref  | rows   |  
Extra                   |
+----+-------------+------------------+------- 
+------------------------------------------------------------- 
+---------------------------+---------+------+-------- 
+-------------------------+
|  1 | SIMPLE      | current_nodes    | range |  
current_nodes_lat_lon_idx                                   |  
current_nodes_lat_lon_idx |      18 | NULL |  26044 | Using  
where             |
|  1 | SIMPLE      | current_segments | ALL   |  
current_segments_a_idx,current_segments_b_idx,nodes_a_and_b |  
NULL                      |    NULL | NULL | 850328 | Using where;  
Not exists |
+----+-------------+------------------+------- 
+------------------------------------------------------------- 
+---------------------------+---------+------+-------- 
+-------------------------+

i.e. there's no index being used on current_segments. I've added  
indexes on various columns until they're coming out of my ears, but  
still no joy.

cheers
Richard




More information about the dev mailing list