[OSM-dev] Spatial queries in Mysql
Roland Olbricht
roland.olbricht at gmx.de
Tue Jul 21 11:40:29 BST 2009
> Does anyone have experience with Mysql and 2d-range-queries?
>
> I am looking for the optimal way of:
> a) selecting all nodes in a given bounding-box and
> b) selecting all ways intersecting a given bounding-box.
>
> I am not bound to any existing schema.
>
> What works best?
I've started OSM3S with mySQL (and myIsam tables), but it had very poor
performance for spatial queries with all the nodes from the planet.
For mySQL, I used an extra index containing the latitude rounded to an
integer, then the longitude. Thus, a spatial query like the one below was
aligned or nearly aligned with the index. This had the same performance than a
quadtile index with latitude and longitude interleaved as binary numbers. The
legacy system to which I compare uses a quadtile index.
I don't have exact test results, but for retrieving a bbox like 51.0<lat<52.0,
7.0<lon<8.0 the legacy system takes 2 seconds while mySQL took more than two
hours. Similar results have appeared for the ways table. The tests were
performed on a Intel T9500 with 4 GB RAM, operated by Hardy Heron 32-bit.
Monitoring the process showed that mySQL was busy with the disk all the time.
So must probably, mySQL organises the data on the disk in a way that is
inappropriate for retrieving large amounts of data. Unfortunately, there is no
documentation how the data is arranged on the disk and no switch how to let it
be organised in a more useful way.
So if you are intending to do spatial queries on a planet file sized amount of
data, you should probably pass to a more performant system. PostgreSQL does
not document its disk storage strategy either, so the same problem might or
might not appear. Somebody has suggested the Cern database Root, but I have
not tried this one. The source code of the mentioned legacy system is also
available.
Cheers,
Roland
More information about the dev
mailing list