[OSM-dev] Optimizing map tile trees

Sat Aug 20 01:07:15 BST 2011

I am relaying this question by Michael Katz from help.openstreetmap.org:

----------------------- quote ------------------

I am building my own map tiles using the OSM data, with PostGIS and 
Mapnik (as explained, for instance, here).

I am going to be serving these tiles from my own server, so I have a lot 
of flexibility in how I process a request for a tile.

I have a few closely related questions:

(1) When I'm running the script to generate the tiles 
(generate_tiles_multiprocess.py, in my case), for a huge percentage of 
the output tiles it writes "Empty Tile" to the script output line. My 
understanding is that "Empty Tile" actually means "a tile of just one 
color". So it might be all blue in a water area, or all green in a park 
area, or all gray in some other area. My question is whether there is a 
deeper meaning to "Empty Tile", specifically whether it means that all 
sub-tiles generated at higher levels will also be empty? For instance, 
if I generate a level 16 tile that is a gray "Empty Tile", does it mean 
when I generate level 17, and I am generating the four level 17 tiles 
that correspond to that one level 16 tile, that those four tiles will 
also be gray "Empty Tile", and the same for the 16 sub-tiles at level 
18? Or, in contrast, is it possible for features that weren't visible in 
the level 16 tile to "appear" in that area at level 17 or level 18? 
Another way of saying it: Does "Empty Tile" mean that the tile is "empty 
all the way down"?

(2) Regardless of the answer to question to (1) above, suppose I have 
generated a full tree of tiles for a given area, all the way to level 
18. Suppose there is a tile on level 15 that is all gray, and it turns 
that the four corresponding tiles on level 16, the 16 corresponding 
tiles on level 17, and the 64 corresponding tiles on level 18, are also 
all gray. A natural optimization of the tree is just to delete those 
level 16, 17, and 18 tiles (84 total tiles) from the tree altogether. 
Then, when the server goes to serve a tile and it's not present in the 
tree, it knows to "look up" to the parent levels of the tree as far as 
necessary until it finds a tile, and it knows that that tile (the all 
gray tile in this case) is good to use for the originally requested 
tile. My questions are: what is that optimization called? Is it commonly 
used? Is it discussed somewhere?

(3) Finally, I notice that many of the output tiles are very simple. 
First of all there is the huge number of one-color tiles. But there are 
also many two-color tiles with a very simple shape, such as a single 
line dividing a blue area from a gray area. I see that the one-color 
tiles are 103 byte .png files. The simple two-color tiles are typically 
anywhere from 1.0k to 1.5k. So it seems like a natural compression 
technique for these would be to use just three bytes ( R, G, B ) for the 
one-color files, and a simple run-length encoding scheme for the 
two-color files. Of course, to make this really save space on the disk 
it would be necessary to combine lots of these small files into some 
kind of multi-file that would be cached in memory on the server. But 
anyway for both storage and transmittal (as well as helping reduce cache 
missing when actually serving the data) it seems like there are many 
optimizations like this that could be employed. Can you tell me if these 
optimizations make sense, and where I can find documentation of how 
others have implemented stuff like this?

-- 
Frederik Ramm  ##  eMail frederik at remote.org  ##  N49°00'09" E008°23'33"