[OSM-dev] on-disk indexing of geodata
sascha-ml-gis-osm-dev at silbe.org
Fri Oct 17 16:53:41 BST 2008
On Fri, Oct 17, 2008 at 05:20:52PM +0200, Freek wrote:
>> But how do you do those splits, i.e. node inserts?
> Just pick a new block somewhere else where you have free space (for
> example at the end of your mmap()'ed file), then split the pointers to
> the child nodes evenly over the two blocks (old and new one) so they
> are about half-full, and finally add a pointer to the new block in the
> parent node (again splitting if necessary, etc.).
OK, so you just use pointers (=> index gets much larger) and often give
up locality (even if you spread some free blocks across the index).
Somehow I'd hoped for a silver bullet. ;)
For regular planet imports and hourly diffs, using the current structure
+ invalidation bitmap for the bulk part and a pointer-based RTree for
the updated objects might be an idea. Or just a second database with the
current structure (=> easy to implement) for the updated objects -
updates usually are small (compared to the whole data set), so an index
rebuild should be fast.
>> R-Tree generation scales fine (still fits into memory),
>> 1D index sorting is O(2n)
> O(2n)? I guess you are using big O's for a different purpose then I do
> normally ;-)
It's not strictly canonical use, yes. :)
It was intended to say that it's about twice as slow as if it would
scale strictly linearly. While O(...) ignores constant factors (by
definition), they do matter in practice. :)
-------------- next part --------------
A non-text attachment was scrubbed...
Size: 481 bytes
Desc: Digital signature
More information about the dev