[OSM-dev] Spatial queries in Mysql

Tue Jul 21 11:40:29 BST 2009

> Does anyone have experience with Mysql and 2d-range-queries?
>
> I am looking for the optimal way of:
> a) selecting all nodes in a given bounding-box  and
> b) selecting all ways intersecting a given bounding-box.
>
> I am not bound to any existing schema.
>
> What works best?

I've started OSM3S with mySQL (and myIsam tables), but it had very poor 
performance for spatial queries with all the nodes from the planet.

For mySQL, I used an extra index containing the latitude rounded to an 
integer, then the longitude. Thus, a spatial query like the one below was 
aligned or nearly aligned with the index. This had the same performance than a 
quadtile index with latitude and longitude interleaved as binary numbers. The 
legacy system to which I compare uses a quadtile index.
I don't have exact test results, but for retrieving a bbox like 51.0<lat<52.0, 
7.0<lon<8.0 the legacy system takes 2 seconds while mySQL took more than two 
hours. Similar results have appeared for the ways table. The tests were 
performed on a Intel T9500 with 4 GB RAM, operated by Hardy Heron 32-bit.

Monitoring the process showed that mySQL was busy with the  disk all the time. 
So must probably, mySQL organises the data on the disk in a way that is 
inappropriate for retrieving large amounts of data. Unfortunately, there is no 
documentation how the data is arranged on the disk and no switch how to let it 
be organised in a more useful way.

So if you are intending to do spatial queries on a planet file sized amount of 
data, you should probably pass to a more performant system. PostgreSQL does 
not document its disk storage strategy either, so the same problem might or 
might not appear. Somebody has suggested the Cern database Root, but I have 
not tried this one. The source code of the mentioned legacy system is also 
available.

Cheers,
Roland