[OSM-dev] Proposal: Database accelerator and dirty tile marker based on simple algorithm.

Sat Sep 16 17:37:06 BST 2006

Hello Nigel

I remember reading your quadtree post.

You could, for example, use the two most significant bits to represent 
each quadrant, then the two next most significant bits to select  the 
next level quadrant and so on. With a 32 bit int, you can do this 16 
times, dividing the planet into 4Bn portions, much like the tiles I 
suggested.

I think using an int and bits scheme would be wildly more efficient than 
using characters.

Extending the idea I proposed, If each object was associated with a 
tile, each object would only need an additional single 32 bit int to 
locate to within 1cm. The algorithm to derive lat/lon would be extremely 
computationally simple and fast- a few steps of a shift register. In 
fact, faster than the transport of data that they will derive.

I haven't yet considered your quadtree proposal in enough depth to fully 
compare the two approaches, but there are many similarities.

My proposal could potentially be mimicked by MySQL by selecting partial 
indexing. For example, indexing only the first 16 bits of each lat and lon.

Nigel Magnay wrote:
> yes - use tiles, but I'd rather use quadtiles, as they have nicer 
> properties (particularly you can fetch the super-tile of 4 child tiles 
> incredibly easily. Calculating which tile a point is in is not very 
> hard; I've got a bit of java lying around, but it's easy enough to 
> understand.
> 
> Incidentally, though I didn't mention it at the time, this scheme is 
> particularly good for databases that support data file partitioning.
> 
> Here's the post I sent a while ago:
> 
>  > [sniiip]
>  > I'd be interested to hear from computer scientists about fancy 
> algorithms for
>  > spacially-sorting the data, doing indexes, etc. First person to mention
>  > Hilbert space gets a nerd of the month award.
>  >
> 
> Quadtrees. We need to be doing this in OSM in general, and it would
> make things much simpler.
> 
> To illustrate - divide the world into 4 quadrants. Call the top-left
> A, top-right B, bottom-left C, bottom-right D. Then for each quadrant,
> do the same. thus the grid is now
> 
> AA AB BA BB
> AC AD BC BD
> CA CB DA DB
> CC CD DC DD
> 
> each time you repeat this process (and you can do this as much as you
> like), you halve the area in each addressable tile. For around 13
> characters, you get (IIRC) 100m of resolution if you started with the
> entire globe (may be wrong, memory is hazy).
> 
> So, just thinking about nodes, we store along with each (long, lat)
> coordinate the quadtree address. Instead of querying, say,
> 
> select * from nodes where lat>? and lat<? and lon>? and lon<?
> 
> we just do (say)
> select * from nodes where quadtree='AAABDBA'
> 
> rather neatly, if you wanted to zoom out to the next level of detail, 
> you can do
> 
> select  * from nodes where quadtree like 'AAABDB%'
> 
> Which will get any node in AAABDBA, AAABDBB, AAABDBC and AAABDBD. This
> is a single index lookup, and is very fast.
> 
> This also solves another sticky problem. Consider segments (ways are
> just the same on a larger scale). You have a segment going from AB to
> BC. Using the current 'get nodes in area, get missing nodes in
> segments' method, if you fetch the area AD, you'll miss the fact that
> there is this segment going through it. By storing for each
> way/segment the list of tiles it intersects, the segments query is
> also simplified to 'select * from segments where quadtree like 'xyz%'.
> 
> This is particularly important for areas - for example, you could be
> viewing a tile in the middle of a forest whose edge does not actually
> intersect with your tile at all.
> 
> This makes the map query somewhat easy, just a sequence of
> select * from nodes where quadtile like ?
> select * from segments join segment_quadtile on
> segment.id=segment_quadtile
> .quadtile where segment_quadtile like ?
> etc., etc.
> 
> Using quadtree addresses is also useful for external applications. Say
> OSM had an RSS feed of changes that listed which quadtiles have been
> changed. Map renderers can thus use this information to invalidate
> their cache for just that tile, and re-render, without having to check
> for changes each time. I could 'register' for information if anything
> changed in a list of quadtile addresess (say, those covring oxford),
> and it would be very easy to calculate when something I would be
> interested in had happened.
> 
> This isn't very complicated and could be done relatively quickly. I
> have most of a java implementation buried on my harddrive somewhere...
> 
>