[OSM-talk] OGL, OSM, NASA

Tue Apr 25 17:25:02 BST 2006

Over the last couple of days, I have been trying the geometric data 
types in MySQL with different size data sets. I have also been on the 
MySQL list to try to make the database for OSM faster.

It seems that the existing database structure (and the tests I posted on 
osm-dev) use only one column of the index which narrows down for either 
a lat or lon range, not both. The other range is searched. Having that 
column in the index makes searching faster through having the index 
already in memory. This explains the sqrt(2) relationship between tile 
size and query time.

MySQL does handle Geographic data types, and it handles R-tree indexes 
for these types.

I have performed tests with point type geographic data type on MySQL 
(not postresql yet). The R-Tree spatial index type on one level is much 
more efficient. On another level, it is less efficient.

The R-tree index on MySQL (possibly also on postgresql) for a given area 
is distributed across the disc surface. Also, a point data type actually 
takes up 32 bytes in the table and 63 bytes in the index. Consequently, 
it is extremely memory inefficient, and look-ups are very slow unless 
the index is in memory, when they are very fast.

At this point, I am sure that the point data type implementation for 
MySQL is no good for GPX traces due to the huge amount of memory each 
point takes. The geometric data types may be OK for objects edited by 
humans, as this is likely to grow less quickly.

I don't know how efficient postgress is for these data types, or if an 
R-tree index can be created for a 4-byte integer type in postgress or 
MySQL.

 From my tests over the last few days, I tentatively conclude:
1) The geographic data types in MySQL are not suitable for GPX traces.
2) Assuming we can't yet set an R-tree index on a 2-column integer 
co-ordinate, then a partitioned database scheme using a single index key 
column would be the most efficient
3) The Geographic data types with an object representation moving from 
the database through the API to the application would likely be efficient.

Raphaël Jacquot wrote:
> SteveC wrote:
> 
>> * @ 25/04/06 02:47:15 PM sxpert at esitcom.org wrote:
>>
>>> as I figured that the osm server was slow for some reason, I set out 
>>> to create a replica that could be used as a temporary repository that 
>>> would be synced at night, allowing me to use josm much faster. so I 
>>> started writing a clone of the api...
>>>
>>> that's where I am right now...
>>
>>
>>
>> Oh cool, sorry for misunderstanding!
>>
>> We could do with help moving the db etc if you can work on it.
> 
> 
> sure...
> lemme finish the postgresql implementation :D
> 
>> have fun,
>>
>> SteveC steve at asklater.com http://www.asklater.com/steve/
>>
> 
> 
> _______________________________________________
> talk mailing list
> talk at openstreetmap.org
> http://lists.openstreetmap.org/cgi-bin/mailman/listinfo/talk
>