[Tilesathome] t@h server next generation

Mon Jul 7 09:20:45 BST 2008

On Sun, Jul 06, 2008 at 03:07:58PM +0100, Kai Krueger wrote:
> The new server code looks good, so it would be great to see it go live at 
> some point.

Thanks. I am still busy finding and fixing bugs as well as adding required core functionality. I need to finish the "Legacy tile converter" first (implementing database blank lookups, for example), and I will be away at conferences from Aug 1st to Aug 18th. So I expect to take the new server code online shortly after that.

> Do you think it would make sense to add some more metadata to the file? 
> E.g. client version number

I had been thinking about it, as that would be useful. But given that I have a file version number, I can always modify and extend the tilesetfile format at a later point in time. Let's start simple first. Another option would be to store a user id next to each index entry (if people do not upload full tilesets, we might want to insert single tiles into an existing tileset, etc.).

> Another useful piece of information might be to have the min_z and max_z 
> values in the structure. Given that not all layers have the same number of 
> zoom levels, I guess one needs the information of how many tiles should be 
> in the tileset.

This is something I am not so sure about. Right now a tileset always contains 6 zoom levels and 1365 tiles, i.e. tilesets cover z0-5, z6-11 and z12-17. That keeps things simple, and I like it. The downside is that a lowzoom tileset is bigger than it currently is, so uploaders would have to do more work. On the other hand, if the server performs well, it might just stitch lowzoom tiles together itself. As we have the tiles available locally, this might not use many more resources than having a client download all the tiles and then processing the uploads again. But I haven't spent any effort on that yet.
Let's start with the simple structure and see how much space we waste once it's running. I would like to see it scale first, before we make it more complex.
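
The 1365 figure is simply a full 6-level tile pyramid (1 + 4 + 16 + 64 + 256 + 1024). A minimal sketch of that arithmetic and of how a tile maps to its tileset; the function names are hypothetical and not taken from the server code:

```python
TILESET_LEVELS = 6  # from the text: every tileset spans 6 zoom levels


def tiles_per_tileset(levels=TILESET_LEVELS):
    """Count the tiles in a full pyramid: 4**i tiles on relative level i."""
    return sum(4 ** i for i in range(levels))


def tileset_origin(z, x, y, levels=TILESET_LEVELS):
    """Return (base_z, base_x, base_y) of the tileset containing tile z/x/y."""
    base_z = (z // levels) * levels   # 0, 6 or 12 for z0-17
    shift = z - base_z                # zoom levels below the tileset root
    return base_z, x >> shift, y >> shift


print(tiles_per_tileset())
print(tileset_origin(14, 8193, 5000))
```

With these assumptions every tile from z0 to z17 falls into exactly one tileset rooted at z0, z6 or z12.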

> Will the caption / captionless layers cause any problems? Aren't those zoom 
> levels a bit strange with respect to the z0, z6 and z12 structure of the 
> tilesets?

Captionless is only z12, so we are wasting about 5.5 kB of unneeded header data per tileset (which makes a whopping 90 GB of metadata for the whole world at z12), true. However, I plan not to store empty z12 tilesets at all, but to have a "blank tileset db" for that (which only has entries for completely empty tilesets). And given that we currently waste tons of disk space on the millions and millions of files and thousands of directories (each of which uses up an unknown minimum file size), I would think that we would still waste much less space than we currently do.
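
A rough check of those numbers, assuming the header is an index with one 4-byte entry per tile (the entry size is an assumption; it is not stated in this mail, but it reproduces the 5.5 kB figure):

```python
TILES_PER_TILESET = 1365      # from the text
BYTES_PER_INDEX_ENTRY = 4     # assumption: one 4-byte offset per tile
Z12_TILESETS = 4 ** 12        # one tileset per z12 tile (4096 x 4096)

# A captionless tileset only uses its single z12 entry; the rest is unused.
unused_bytes = (TILES_PER_TILESET - 1) * BYTES_PER_INDEX_ENTRY
world_bytes = Z12_TILESETS * unused_bytes

print("unused header per tileset: %.1f kB" % (unused_bytes / 1000.0))
print("whole world at z12: %.1f GB" % (world_bytes / 1e9))
```

Under that assumption the waste comes to roughly 5.5 kB per tileset and a bit over 90 GB worldwide, before any empty tilesets are dropped.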

> Hmm, I think there is a tiny bug in the serve_tiles.py, it checks for the 
> offset being larger than 1 for special case tiles. Both sea and transparent 
> tiles would therefore try and load a tile with an invalid offset. Shouldn't 
> it be greater than 3?

True, that is a bug and it's fixed now. Thanks for spotting :-).
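
For illustration only, a sketch of the kind of check in question. It assumes that index values 0 to 3 are reserved markers for special tiles rather than real file positions; the constant and function names are hypothetical and do not come from serve_tiles.py:

```python
MAX_SPECIAL_OFFSET = 3  # assumption: values 0..3 mark special-case tiles


def is_stored_tile_buggy(offset):
    """Old check: treats marker values 2 and 3 as real file offsets."""
    return offset > 1


def is_stored_tile(offset):
    """Fixed check: only values above the marker range are file offsets."""
    return offset > MAX_SPECIAL_OFFSET


for offset in (0, 1, 2, 3, 5460):
    print(offset, is_stored_tile_buggy(offset), is_stored_tile(offset))
```

The two checks disagree only for the values 2 and 3, which is where the invalid reads would have come from.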

> It seems like the oceantiles file is a good start for a database of blank 
> tiles info. I have attached a patch (in case this is useful), that in the 
> case of not finding a tileset, looks up the info in the oceantiles file. 

Thanks for the patch. I appreciate your efforts, but I will not commit it. Let me explain:
We have all the blankness info for everything (except much of the captionless layer) in the blankness db, and I plan to run a tileset conversion from all the tiles we currently have to the new tilesetfile format. This conversion will look up the blankness db in case it does not find a file (see the code stub get_blank() in LegacyTileset.py), so we will have all the necessary blankness info without relying on oceantiles.dat. I hope that is OK with you; I am usually happy to accept patches.
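
The lookup order described here can be sketched as follows. This is not the get_blank() stub from LegacyTileset.py; the function name, the dict standing in for the blankness db, and the return values are all hypothetical:

```python
import os


def load_legacy_tile(path, tile_key, blank_db):
    """Return ('data', bytes), ('blank', kind) or ('missing', None).

    blank_db is a plain dict standing in for the blankness db (assumption).
    """
    # 1. Prefer the tile file on disk, if there is one.
    if os.path.exists(path):
        with open(path, "rb") as f:
            return ("data", f.read())
    # 2. Otherwise fall back to the blankness information.
    kind = blank_db.get(tile_key)
    if kind is not None:
        return ("blank", kind)
    # 3. Nothing known about this tile.
    return ("missing", None)


blank_db = {(12, 0, 0): "sea"}  # hypothetical entry
print(load_legacy_tile("/nonexistent/12_0_0.png", (12, 0, 0), blank_db))
print(load_legacy_tile("/nonexistent/12_1_1.png", (12, 1, 1), blank_db))
```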

> I was wondering what the plans for the transition period are. I have seen 
> that you have a converter script to take the existing tiles and convert 
> them to the one file per tileset. Are you planning on converting all of the 
> files during transition or, will the two types of tiles go side by side? 
> Adding a fall-back to the serve_tiles.py shouldn't be hard in case of 
> gradually transitioning.

A fallback would be possible, but a test conversion of tileset z0-5, x=0, y=0 (which contains 1365 tiles, as all the other tilesets do) took just 0.2 seconds to run. This will be a little slower once the blank tile lookups etc. are added, but it's still rather quick. So my initial plan would be to make one big conversion run and be done with it. I hate adding layers and layers of legacy fallbacks :-). So if we can start out with a clean code base, that would be great. If the conversion takes much longer than I expect, then some fallback will be necessary. But my priority is to keep tile serving as quick as possible, so I would rather not have it do other stuff on the side.

> That would be great. A short instruction of how to get all of this running 
> might be good as well.

I have decided on a working name now ("Tahngo") and created a stub in the wiki (http://wiki.openstreetmap.org/index.php/Tahngo). I plan to add documentation next to the code and expand the wiki page. For now, I have worked with mercurial, as I like that best, but I will of course check the code into the OSM svn.

spaetz