[Tile-serving] Pre-Rendering tiles for all of Germany
Josef Schugt
josef.schugt at benndorf.de
Wed Dec 27 21:41:47 UTC 2017
Hi there,
a few months ago our company's then system administrator, who among
other responsibilities had the task of bulk pre-generating custom-design
tiles, had the opportunity to start working at the German Federal Office
for Information Security, which he quite understandably took. When I
write 'tiles' I actually mean tiles, not metatiles: our customer
doesn't want the tiles to be generated on the fly.
Since then I have, little by little and with more or less success, come
to understand and improve the system, and I now wonder whether the
fundamental approach is actually the right one. For now let me detail
two quite different questions that arise.
ISSUE 1: HOW DATA IS IMPORTED
The region for which tiles are needed is Germany in its current borders.
The first issue that arises is the data import (we use data from
download.geofabrik.de, please see
http://download.geofabrik.de/europe/germany.html).
Memory-wise, the machine we use reaches its limits when generating
indexes after importing germany-latest.osm.pbf. This sounds odd at first,
but there is considerably more OSM data for Germany than for the Russian
Federation, even though Russia has about 50 times the area of Germany.
Germany accounts for about one twelfth of the data available for the
whole planet!
Is it possible to import the data federal-state-wise? I know that I can
use osmconvert to make sure that there are no collisions by using
something like
osm2pgsql ... --create ... first_region.pbf
osmconvert first_region.pbf -o=first_region.o5m
osmconvert second_region.pbf -o=second_region.o5m
osmconvert second_region.o5m --subtract first_region.o5m -o=second_region_cleaned.pbf
osm2pgsql ... --append ... second_region_cleaned.pbf
Obviously, that already gets involved with a dozen or so files (I don't
know the precise number; fewer files are needed than there are federal
states, and the server is unresponsive at the moment), but it is
feasible. I am, however, unsure whether this breaks features that stretch
across more than one of the regions these files represent. An alternative
approach would be to split the file for all of Germany into several
pieces (for instance by splitting the bounding box into m times n
rectangular parts), but I wonder whether that would run into memory
constraints as well.
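To make the "m times n rectangular parts" idea concrete, here is a minimal sketch in Python. The bounding box for Germany is a rough, assumed value and the helper names are hypothetical; the only external detail relied on is osmconvert's -b=left,bottom,right,top option. Whether features crossing a cut line survive intact depends on how the extract is made, which this sketch does not address.

```python
from typing import List, Tuple

# (min_lon, min_lat, max_lon, max_lat) in degrees
BBox = Tuple[float, float, float, float]

# Rough bounding box for Germany: an assumed, illustrative value.
GERMANY_BBOX: BBox = (5.8, 47.2, 15.1, 55.1)


def split_bbox(bbox: BBox, cols: int, rows: int) -> List[BBox]:
    """Split bbox into rows x cols rectangles.

    Rows are ordered north to south; within a row, parts run west to east.
    """
    min_lon, min_lat, max_lon, max_lat = bbox
    lon_step = (max_lon - min_lon) / cols
    lat_step = (max_lat - min_lat) / rows
    parts = []
    for r in range(rows):
        top = max_lat - r * lat_step
        bottom = max_lat - (r + 1) * lat_step
        for c in range(cols):
            left = min_lon + c * lon_step
            right = min_lon + (c + 1) * lon_step
            parts.append((left, bottom, right, top))
    return parts


def osmconvert_bbox_arg(part: BBox) -> str:
    """Format one rectangle as an osmconvert -b= argument."""
    return "-b=" + ",".join(f"{v:.4f}" for v in part)


if __name__ == "__main__":
    # Print one bounding-box argument per part of a 3 x 4 grid.
    for part in split_bbox(GERMANY_BBOX, cols=3, rows=4):
        print(osmconvert_bbox_arg(part))
```

Each printed argument could be passed to one osmconvert run on the Germany file to produce one extract per rectangle.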
ISSUE 2: HOW RENDERING IS PERFORMED
The rendering takes place using a modified version of
https://github.com/openstreetmap/mapnik-stylesheets/blob/master/generate_tiles.py
where the modifications are a set of regions that together cover all of
Germany and some extra output that feeds a web page for keeping an
eye on the progress: http://tileserver.benndorf.de/tile/progress/
The data used for the marker (click on it, it has additional information
- note that most up to date Firefox is fubar and won't display it) is
updated every 1000 tiles and the page automatically refreshes every 5
minutes. Each region is rendered at zoom levels 10 through 18, then the
program moves on to the next region (the rows are processed from North
to South, within each row the regions are processed from West to East).
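To get a feeling for how the work is distributed over zoom levels 10 through 18, the number of tiles per zoom level can be estimated from a bounding box. The sketch below uses the standard slippy-map tile formula; the bounding box is a rough, assumed value for Germany and the function names other than the formula itself are hypothetical.

```python
import math
from typing import Dict, Tuple

# (min_lon, min_lat, max_lon, max_lat) in degrees
BBox = Tuple[float, float, float, float]

# Rough bounding box for Germany: an assumed, illustrative value.
GERMANY_BBOX: BBox = (5.8, 47.2, 15.1, 55.1)


def deg2num(lat_deg: float, lon_deg: float, zoom: int) -> Tuple[int, int]:
    """Convert WGS84 coordinates to slippy-map tile numbers (x, y)."""
    lat_rad = math.radians(lat_deg)
    n = 2 ** zoom
    x = int((lon_deg + 180.0) / 360.0 * n)
    y = int((1.0 - math.asinh(math.tan(lat_rad)) / math.pi) / 2.0 * n)
    return x, y


def tile_count(bbox: BBox, zoom: int) -> int:
    """Number of tiles touching bbox at the given zoom level."""
    min_lon, min_lat, max_lon, max_lat = bbox
    x0, y0 = deg2num(max_lat, min_lon, zoom)  # north-west corner
    x1, y1 = deg2num(min_lat, max_lon, zoom)  # south-east corner
    return (x1 - x0 + 1) * (y1 - y0 + 1)


def tiles_per_zoom(bbox: BBox, zmin: int, zmax: int) -> Dict[int, int]:
    """Tile counts for every zoom level from zmin to zmax inclusive."""
    return {z: tile_count(bbox, z) for z in range(zmin, zmax + 1)}


if __name__ == "__main__":
    for zoom, count in tiles_per_zoom(GERMANY_BBOX, 10, 18).items():
        print(f"zoom {zoom}: {count} tiles")
```

Since each zoom level has roughly four times as many tiles as the one before, almost all of the rendering time for a region is spent on the last two or three levels.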
We (i.e. my boss and I) are under the impression that for such bulk
generation it should be faster to first generate all metatiles and then
cut the actual tiles from these. Or, put simply: use mod_tile and fetch
each and every tile, abusing Apache as the de facto rendering
engine. However, the machine we'd like to use for this purpose has
mod_tile operational for on-demand rendering of other tiles and I am not
sure if it is a good idea to have a second instance of mod_tile doing
lots of rendering in parallel.
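The metatile-first idea can be sketched as simple index arithmetic. The example below assumes metatiles of 8 x 8 tiles (which, to my knowledge, is mod_tile's default) and only computes which metatiles cover a range of tiles; it does not render anything, and the function names are hypothetical.

```python
from typing import List, Tuple

# Tiles per metatile edge; 8 is assumed here as mod_tile's default.
METATILE = 8


def metatile_origin(x: int, y: int, metatile: int = METATILE) -> Tuple[int, int]:
    """Top-left tile (x, y) of the metatile that contains tile (x, y)."""
    return (x - x % metatile, y - y % metatile)


def metatiles_for_range(
    x0: int, y0: int, x1: int, y1: int, metatile: int = METATILE
) -> List[Tuple[int, int]]:
    """Origins of all metatiles covering tiles x0..x1, y0..y1 (inclusive)."""
    origins = []
    for my in range(y0 - y0 % metatile, y1 + 1, metatile):
        for mx in range(x0 - x0 % metatile, x1 + 1, metatile):
            origins.append((mx, my))
    return origins


if __name__ == "__main__":
    # A hypothetical 100 x 100 tile range starting at tile (500, 300).
    x0, y0, x1, y1 = 500, 300, 599, 399
    tiles = (x1 - x0 + 1) * (y1 - y0 + 1)
    metas = len(metatiles_for_range(x0, y0, x1, y1))
    print(f"{tiles} tiles are covered by {metas} metatiles")
```

The point of the design is that one render call per metatile replaces up to 64 render calls per tile, at the cost of a separate step that cuts each metatile into individual tiles.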
Any substantial feedback is appreciated. If you have comments on the
granularity of the grid used for rendering all of Germany, feel free to
share them as well. I still have no feeling for what the optimal size of
a region for rendering is. To my understanding it should not be too
large, because then the data needed no longer fits into memory, and
not too small, because then too many features have to be fetched many
times, say autobahns that go on for hundreds of kilometres.
Many other questions have arisen over the last couple of months, but
many are of minor importance.
Those I mentioned are what I'd call mission-critical, as improving the
rendering efficiency goes beyond the obvious effect of making things
faster: it makes it possible to have data that is more up to date or,
alternatively, to reduce actual costs. I keep an eye on the machine's
temperature: when idle its cores are at about 30°C (86°F) and when
active they are at about 80°C (176°F). In other words, power consumption
goes way up (and currently remains there for about 3 weeks). That
electricity bill needs to be paid, and among the OECD countries only
Denmark has higher electricity costs than Germany. On average you pay
about twice as much as in neighbouring France and about 3 times as much
as in the USA.
Yours,
Josef 'Jupp' Schugt (nom de guerre: penpendede)