[OSM-dev] Splitting the planet into thousands of pieces in one pass.
nroets at gmail.com
Wed Dec 1 18:33:58 GMT 2010
On Wed, Dec 1, 2010 at 8:18 PM, Scott Crosby <scrosby at cs.rice.edu> wrote:
> On Wed, Dec 1, 2010 at 7:27 AM, Nic Roets <nroets at gmail.com> wrote:
> > Hello Scott,
> > How do you keep track of what bboxs each entity belongs to ?
> An Int2ShortMultiMap implemented by composing two underlying
> Int2ShortMap implementations with different space efficiency
> tradeoffs, a custom sparsearray
> implementation based on
> and one imported from a library of java collections specialized to
> primitive types ('fastutils') that uses a standard hashtable. The
> hybrid has a memory overhead of about 4 bytes per node output, storing
> 750m keys with 800m vals in 3.2gb of heap.
> The approach for generating an Int2ShortMultiMap from several
> Int2ShortMap's is by layering them. When put()ing keys, I store in
> Int2ShortMap, but if the key is already there, I try
> Int2ShortMap, ..... until I find one that is free. I create
> additional maps as needed. For dense maps, sparsehash is the more
> efficient implementation, and for non-dense maps at the higher layers,
> a hashtable is the more efficient implementation.
> To avoid checking each bbox for each point, used a simpler design than
> a quadtree or r-tree as andrzej suggests, some precomputing and binary
> searches at a cost of (sqrt(n)) instead of O(log n) for mostly
> > I'm not really asking a question, I'm just saying that I found a way to
> > reduce the memory requirement for that considerably. Instead of a bit per
> > bbox per entry, I store only 16 bits or 32 bits per entry. Here is the
> > source.
> Definitely. Can you explain how you track the information?
> You have a much more concise implememtation than mine, but I can't
> easily figure it out what it is doing from the source code.
Let's say there are 80 bboxes, then you can use a 80 bit number for each
entity to record which bboxes they fell into and which they didn't. That
2^80 space is extremely sparse. For my regular splits of 80 bboxes there are
typically only 1000 combinations that actually occur. So I store 1000
"younions" and then a 16-bit index per entity. And my bbox are as chaotic as
-------------- next part --------------
An HTML attachment was scrubbed...
More information about the dev