[OSM-dev] getnodes query optimization

Christopher Schmidt crschmidt at crschmidt.net
Fri Jun 2 15:37:30 BST 2006


http://trac.openstreetmap.org/browser/www.openstreetmap.org/ruby/api/osm/dao.rb#L587
says 

res = call_sql { "select id, latitude, longitude, visible, tags
from (select * from (select nodes.id, nodes.latitude, nodes.longitude,
nodes.visible, nodes.tags from nodes, nodes as a where a.latitude >
#{lat2}  and a.latitude < #{lat1}  and a.longitude > #{lon1} and
a.longitude < #{lon2} #{timeclause} and nodes.id = a.id order by
nodes.timestamp desc) as b group by id) as c where visible = true and
latitude > #{lat2}  and latitude < #{lat1}  and longitude > #{lon1} and
longitude < #{lon2}" }

This seems to have several unneccesary subselects. I'm not sure for the
exact reason for this, but I loaded data into a local OSM instance, and
ran the query, and got the same results as I would have with:

select nodes.id, nodes.latitude, nodes.longitude,
nodes.visible, nodes.tags from nodes, nodes as a where a.latitude >
#{lat2}  and a.latitude < #{lat1}  and a.longitude > #{lon1} and
a.longitude < #{lon2} #{timeclause}

(The 'order by nodes.timestamp' probably makes a difference, but most
apps should hopefully not depend on the order of nodes.)

It looks like this was legacy code from some other point in time when
things were done differently: can anyone check this out and see if this
needs to be this way? Perhaps this is just a problem with my small
dataset or something else I'm missing?

The reason I ask is that the former query is going to be *much* slower
than the latter in any but the most simplistic case: the former uses
both filesort and temporary tables, presumably because of the subselects
in some way, and both of those are known as big no-nos to database
optimization...

Thanks in advance,

-- 
Christopher Schmidt
Web Developer




More information about the dev mailing list