[OSM-dev] tiles at home disk usage
Frederik Ramm
frederik at remote.org
Wed May 2 10:54:53 BST 2007
Hi,
> Server code is all in SVN if people want to see what's
> behind-the-scenes.
> Tile metadata and access stats available to download if you want to
> model different storage strategies.
I think the general way to go is this:
* Store only tiles that have something on them as individual PNG files.
* For "empty" tiles, store only the land/sea information.
* If a tile is requested, return an individual PNG file if one is
there, or a generic "empty" tile, either blue or white, otherwise.
The questions remaining are:
1. How do we store blue-empty and white-empty information on the
server and deliver appropriate PNG files?
2. How do we determine whether a tile is blue-empty or white-empty?
Storing blue-empty or white-empty information does of course happen
in the database - we will need an extra column that can handle one of
three values: "normal tile", "empty sea tile", "empty land tile".
Retrieval of this information is possible in two ways:
1a. create symlinks in the directory structure linking to the empty
blue or empty white tiles where appropriate; no further work required
(enable symlinks in Apache; caution when overwriting tiles - delete
first, otherwise you overwrite the link destination). Drawback:
potentially huge number of symlinks in the file system (roughly 70% of
all entries).
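To illustrate the "delete first" caution in 1a, here is a minimal Python sketch. The function names and paths are hypothetical and not taken from the server code in SVN; it only shows that removing the old entry before writing keeps the shared empty tile intact.

```python
import os


def link_empty_tile(tile_path, empty_target):
    """Replace whatever sits at tile_path by a symlink to a shared empty tile."""
    # lexists() is true for regular files AND for symlinks (even dangling ones).
    if os.path.lexists(tile_path):
        os.remove(tile_path)
    os.symlink(empty_target, tile_path)


def store_real_tile(tile_path, data):
    """Write a real PNG, removing any existing entry first.

    Opening a symlink for writing would follow the link and overwrite the
    shared empty tile, so the old entry is deleted before the write.
    """
    if os.path.lexists(tile_path):
        os.remove(tile_path)
    with open(tile_path, "wb") as f:
        f.write(data)
```

With this ordering, a tile that used to be a symlink becomes a regular file again and the shared blue or white PNG is never touched.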
1b. use Apache's "ErrorDocument" directive to execute a CGI script
whenever a requested tile is not found. Have that script query the
database and return a "Location" header pointing either to the empty
blue or empty white tile. If the requested tile is on a level greater
than 12, and the database does not contain a record for it, answer
based on the information for the enclosing level-12 tile. Drawback:
potential performance problems (the old "too many mysql connections"
if someone surfs across the Atlantic at level 12).
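The lookup in 1b could go roughly like the following Python sketch. The `lookup` callable stands in for the database query, and the two URLs are made-up placeholders; the real script would be a CGI program emitting a "Location" header. The only non-obvious part is that the enclosing level-12 tile is found by shifting the x and y indices right by (zoom - 12) bits.

```python
# Assumed state labels and redirect targets, for illustration only.
EMPTY_SEA = "empty sea tile"
EMPTY_LAND = "empty land tile"
SEA_URL = "/empty/sea.png"    # hypothetical path
LAND_URL = "/empty/land.png"  # hypothetical path


def enclosing_tile(z, x, y, target_z=12):
    """Return the tile at target_z that contains tile (z, x, y); needs z >= target_z."""
    shift = z - target_z
    return target_z, x >> shift, y >> shift


def fallback_location(z, x, y, lookup):
    """Decide where to redirect a request for a tile missing on disk.

    lookup(z, x, y) returns the stored state for a tile, or None when the
    database has no record. Returns a URL, or None for a genuine 404.
    """
    state = lookup(z, x, y)
    if state is None and z > 12:
        # No record of its own: answer from the enclosing level-12 tile.
        state = lookup(*enclosing_tile(z, x, y))
    if state == EMPTY_SEA:
        return SEA_URL
    if state == EMPTY_LAND:
        return LAND_URL
    return None
```

For example, a level-14 tile at x=8191, y=5000 falls inside the level-12 tile x=2047, y=1250.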
We can even combine 1a and 1b, so that for some tiles near the coast
we provide symlinks (as they will be viewed often - fast access, less
strain on the server), and use the database lookup as a last resort.
Determining whether a tile is empty is something that can be done by
the entity creating the tile.
2a. tilesGen.pl, which creates all level-12 and higher tiles, already
detects empty-land tiles and uploads a 67-byte dummy PNG instead. It
should be improved to upload 69-byte dummy PNGs for empty-sea as
well, so that the server can recognize that it is an empty tile.
(Instead of communicating via byte sizes, an XML "meta data file"
could be envisaged, but that can also be done as a tidy-up step later.)
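On the server side, recognizing the dummy PNGs by size could be as simple as the Python sketch below. The 67 and 69 byte sizes are the ones proposed above; the function name and the state strings are assumptions for illustration.

```python
EMPTY_LAND_SIZE = 67  # size of the dummy PNG tilesGen.pl uploads for empty land
EMPTY_SEA_SIZE = 69   # proposed size of the dummy PNG for empty sea


def classify_upload(png_bytes):
    """Map an uploaded tile to one of the three database states by its size."""
    size = len(png_bytes)
    if size == EMPTY_LAND_SIZE:
        return "empty land tile"
    if size == EMPTY_SEA_SIZE:
        return "empty sea tile"
    return "normal tile"
```

This relies on no real tile ever being exactly 67 or 69 bytes long, which is part of why a proper meta data file would be the tidier long-term solution.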
2b. lowzoom.pl, which creates all level-11 and lower tiles, needs to
be beefed up to detect empty tiles as well, and upload appropriate
dummy PNGs. It would be desirable to have some sort of "database
access", so that lowzoom.pl could, before it commences download of
individual tiles for constructing lowzoom tiles, download the meta
information for the tiles it is about to process, and then only
download those tiles that carry information (not the empty-blue or
empty-white tiles).
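A rough Python sketch of how lowzoom.pl's download planning might look once meta information is available. The `meta` dictionary and the function names are hypothetical; tiles with no meta record are treated as normal tiles here so that nothing is skipped by mistake.

```python
def children(z, x, y):
    """The four sub-tiles of (z, x, y) on the next zoom level."""
    return [(z + 1, 2 * x + dx, 2 * y + dy) for dy in (0, 1) for dx in (0, 1)]


def plan_downloads(z, x, y, meta):
    """Split the sub-tiles into those to download and those known to be empty.

    meta maps (z, x, y) to "normal tile", "empty sea tile" or
    "empty land tile". Unknown tiles are assumed to be normal.
    """
    to_fetch = []
    known_empty = {}
    for child in children(z, x, y):
        state = meta.get(child, "normal tile")
        if state == "normal tile":
            to_fetch.append(child)
        else:
            known_empty[child] = state
    return to_fetch, known_empty
```

If `to_fetch` comes back empty and all four sub-tiles share the same empty state, the lowzoom tile itself is empty and only a dummy PNG needs uploading.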
The server side script that accepts uploads would have to be modified:
3a. for 67-byte or 69-byte tiles, delete the existing tile and
replace by a symlink according to 1a; possibly, if a mix of 1a and 1b
is to be run, apply some logic to determine whether a symlink should
be created.
When we implement these mechanisms, we need to clean up the tile
database and also generate some information:
4a. check the database for all existing empty blue tiles, delete the
PNGs (optionally replacing them by a symlink) and flag them "empty
sea" in the database.
4b. the same for existing empty white tiles (are there any?)
4c. determine all empty sea tiles from Martijn's level-12 index, and
add "empty sea" entries to the database for them if they do not exist
already. Also, add "empty sea" entries for all ocean tiles on lower
levels, as computed from existing data (a tile on level n is "empty
sea" if all its four sub-tiles in level n+1 are).
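The bottom-up rule in 4c can be sketched in Python as follows. The input is assumed to be a set of (x, y) indices of empty sea tiles on level 12, for instance as derived from the level-12 index; the function name and data layout are made up for illustration.

```python
def propagate_empty_sea(sea_tiles, from_zoom=12):
    """Compute empty sea tiles on all levels below from_zoom.

    sea_tiles is a set of (x, y) on level from_zoom. Returns a dict
    mapping each zoom level to its set of empty sea (x, y) tiles.
    A tile on level n is empty sea only if all four of its sub-tiles
    on level n+1 are.
    """
    current = set(sea_tiles)
    levels = {from_zoom: current}
    for z in range(from_zoom - 1, -1, -1):
        parents = set()
        for x, y in current:
            px, py = x >> 1, y >> 1
            if (px, py) in parents:
                continue  # already confirmed via a sibling
            subs = [(2 * px + dx, 2 * py + dy) for dx in (0, 1) for dy in (0, 1)]
            if all(s in current for s in subs):
                parents.add((px, py))
        levels[z] = parents
        current = parents
    return levels
```

Each level is computed only from the level directly above it, so the whole pass costs about one set lookup per sub-tile.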
Step 4a will free something like 7 GB of data currently used by empty
sea tiles. Step 4c will create an extra 10m records in the database
(currently containing about 27m records).
Any comments on this, or can I implement it?
Bye
Frederik
--
Frederik Ramm ## eMail frederik at remote.org ## N49°00.09' E008°23.33'