[OSM-dev] on-disk indexing of geodata

Fri Oct 17 16:53:41 BST 2008

On Fri, Oct 17, 2008 at 05:20:52PM +0200, Freek wrote:

>> But how do you do those splits, i.e. node inserts?
> Just pick a new block somewhere else where you have free space (for 
> example at the end of your mmap()'ed file), then split the pointers to 
> the child nodes evenly over the two blocks (old and new one) so they 
> are about half-full, and finally add a pointer to the new block in the 
> parent node (again splitting if necessary, etc.).
OK, so you just use pointers (=> index gets much larger) and often give 
up locality (even if you spread some free blocks across the index). 
Somehow I'd hoped for a silver bullet. ;)
For regular planet imports and hourly diffs, using the current structure 
+ invalidation bitmap for the bulk part and a pointer-based RTree for 
the updated objects might be an idea. Or just a second database with the 
current structure (=> easy to implement) for the updated objects - 
updates usually are small (compared to the whole data set), so an index 
rebuild should be fast.

>> R-Tree generation scales fine (still fits into memory),
>> 1D index sorting is O(2n)
> O(2n)? I guess you are using big O's for a different purpose then I do 
> normally ;-)
It's not strictly canonical use, yes. :)
It was intended to say that it's about twice as slow as if it would 
scale strictly linearly. While O(...) ignores constant factors (by 
definition), they do matter in practice. :)

CU Sascha

-- 
http://sascha.silbe.org/
http://www.infra-silbe.de/
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 481 bytes
Desc: Digital signature
URL: <http://lists.openstreetmap.org/pipermail/dev/attachments/20081017/a7d6f1fc/attachment.pgp>