[OSM-talk] Tile caching (osm startpage)
Kai Krueger
kakrueger at gmail.com
Sun Mar 7 08:30:10 GMT 2010
Michal Migurski wrote:
>
> On Mar 5, 2010, at 11:34 AM, John Smith wrote:
>
>> On 6 March 2010 01:24, Bernhard zwischenbrugger <bz at datenkueche.com> wrote:
>>> Google Cache Time:
>>> Cache-Control: public, max-age=22222222 //feels like one month (I
>>> didn't calculate)
>>
>> I'd say it's a bad idea to specify a cache time; instead, there are
>> other caching mechanisms to tell if a tile has changed:
>>
>>> ETag: "d096ddafba32c0da609007e224530ccd"
>>
>> This way if a tile never changes you never need to refresh.
>
>
> For what it's worth, the current tile server does specify a cache time as well as an ETag.
>
> % curl -sI "http://tile.openstreetmap.org/14/2627/6331.png"
> HTTP/1.1 200 OK
> Date: Sun, 07 Mar 2010 02:19:30 GMT
> Server: Apache/2.2.8 (Ubuntu)
> ETag: "93087c5713c17d9939cac9e341fdd14c"
> Content-Length: 26595
> Cache-Control: max-age=1036
> Expires: Sun, 07 Mar 2010 02:36:46 GMT
> Content-Type: image/png
>
> 1,000 sec. max age there is a little over 15 minutes, though when I repeat this request I get expiry times all over the place, from a few minutes to many hours. What currently decides on the cache expiration time?
mod_tile, the Apache module used to serve the tiles, has a fairly
sophisticated mechanism to decide the expiry times, driven by a bunch
of heuristics. Since, with minutely rendering, we no longer have a
periodic update cycle, there is no really good way of setting the
expiry times, as one would need to guess when in the future a given
tile might change. As that is obviously not possible, we need to trade
off caching time (reducing server load and client-side latency) against
up-to-dateness, so as not to lose the benefits of the minutely updates.
The heuristics currently supported (and used) are the following.
First, it decides whether the tile is known to be "dirty", i.e.
outdated. If the tile server is overloaded, or the rendering takes
longer than 3 seconds, mod_tile will serve an old tile rather than wait
until the on-the-fly rendering finishes. (Again a trade-off between
client-side latency and up-to-dateness.) In that case, given that we
know the tile will soon change, the max-age cache parameter is set very
low: 15 minutes plus up to 7 minutes of random jitter.
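As a rough illustration of the stale-tile rule (this is not mod_tile's
actual code), here is a minimal Python sketch. The function name and the
choice of a uniform jitter distribution are assumptions; only the 15 minute
base and the 7 minute jitter window come from the description above.

```python
import random

STALE_BASE_S = 15 * 60    # 15 minutes, as described in the text
STALE_JITTER_S = 7 * 60   # up to 7 minutes of random jitter, as described


def stale_max_age(rng=random):
    """Return a max-age in seconds for a tile known to be dirty.

    Assumption: the jitter is drawn uniformly from [0, STALE_JITTER_S].
    """
    return STALE_BASE_S + rng.randint(0, STALE_JITTER_S)


if __name__ == "__main__":
    # Build the header value a server could send for a stale tile.
    print(f"Cache-Control: max-age={stale_max_age()}")
```

Under these assumptions the result always falls between 900 and 1320
seconds, i.e. 15 to 22 minutes.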
If the tile served is not stale, there are another three heuristics:
a zoom-level-based heuristic, a last-modified heuristic, and a known
planet update cycle, if one exists.
The zoom-level-based heuristic allows setting the minimum max-age caching
time depending on whether the tile served is a low-zoom, medium-zoom or
high-zoom tile. The idea behind this is that low-zoom tiles (even though
they are affected by all changes) don't appear to change much. Thus it
seems reasonable to allow clients to cache these much longer, as the
effect of a stale tile from cache is probably smaller.
The current setup of tile.osm.org, I think, doesn't use this heuristic
though, and sets the minimum max-age to 3 hours + up to 3 hours of random
jitter for all zoom levels, even though the minutely tile expiry doesn't
actually expire low-zoom tiles, which thus only change if a re-render is
manually requested. So I think it would be good to increase the caching
time for low-zoom tiles, as in the current setup it shouldn't affect
things negatively.
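To make the zoom-level idea concrete, here is a small Python sketch. The
zoom band boundaries (8 and 12) and the low- and medium-zoom minimums are
purely hypothetical values chosen for illustration; only the 3 hours +
3 hours of jitter figure is taken from the description above, and none of
this is mod_tile's real code or configuration.

```python
import random

HOUR_S = 3600


def min_max_age_for_zoom(zoom,
                         low_zoom_max=8,        # hypothetical band boundary
                         medium_zoom_max=12,    # hypothetical band boundary
                         low_min_s=24 * HOUR_S,     # hypothetical value
                         medium_min_s=6 * HOUR_S,   # hypothetical value
                         high_min_s=3 * HOUR_S):    # 3 hours, from the text
    """Return the minimum max-age in seconds for a tile at this zoom."""
    if zoom <= low_zoom_max:
        return low_min_s
    if zoom <= medium_zoom_max:
        return medium_min_s
    return high_min_s


def with_jitter(base_s, jitter_s, rng=random):
    """Add a random offset in [0, jitter_s] (uniform by assumption)."""
    return base_s + rng.randint(0, jitter_s)


if __name__ == "__main__":
    for zoom in (3, 10, 16):
        base = min_max_age_for_zoom(zoom)
        print(zoom, with_jitter(base, 3 * HOUR_S))
```

With every band set to 3 hours, this reduces to the single minimum
described for tile.osm.org.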
The last-modified heuristic tries to guess how likely it is for a tile
to change. E.g. a tile in the middle of the Pacific is probably not
going to change anytime soon, so it wouldn't matter to give it e.g. a
max-age of a week. A tile in, say, central Berlin is more likely to
change. So the heuristic guesses how likely a tile is to change in the
future based on how long it has been since it last changed. It then
scales max-age linearly with the time since last modification, with a
tunable slope parameter. As it is fairly unclear how well this heuristic
works, I believe the OSM tile server still has this at its default, i.e.
turned off completely.
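The linear scaling can be written down in a few lines of Python. This is
only a sketch: the function and parameter names are made up, and modelling
"turned off" as a slope of 0 is an assumption, not a statement about how
mod_tile represents its default.

```python
DAY_S = 24 * 3600


def last_modified_max_age(seconds_since_change, slope=0.0):
    """Scale max-age linearly with the time since the tile last changed.

    Assumption: slope == 0.0 stands for "heuristic disabled", so the
    function then contributes nothing to the final max-age.
    """
    return int(slope * max(0, seconds_since_change))


if __name__ == "__main__":
    # A tile untouched for 10 days, with a hypothetical slope of 0.5,
    # would be given 5 days; with the heuristic off it is given 0.
    print(last_modified_max_age(10 * DAY_S, slope=0.5))
    print(last_modified_max_age(10 * DAY_S))
```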
The last "heuristic" is the one based on planet update cycles. For those
servers that have a planet update cycle (i.e. not tile.osm.org), you
don't have to guess and can just set the expiry time to when the next
update cycle begins. This is the most efficient from a caching point of
view, but doesn't work with minutely updates.
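For a server with a fixed update cycle, the expiry is simply the time left
until the next cycle starts. A minimal Python sketch, assuming a weekly
cycle counted from the last import (the function name, the parameters and
the example dates are illustrative, not taken from mod_tile):

```python
from datetime import datetime, timedelta, timezone


def planet_cycle_max_age(now, last_import, cycle=timedelta(days=7)):
    """Return the seconds remaining until the next update cycle begins."""
    # Whole cycles elapsed since the last import, then step one further
    # to reach the start of the next cycle.
    cycles_done = (now - last_import) // cycle
    next_update = last_import + (cycles_done + 1) * cycle
    return int((next_update - now).total_seconds())


if __name__ == "__main__":
    last_import = datetime(2010, 3, 3, tzinfo=timezone.utc)  # example date
    now = datetime(2010, 3, 7, 8, 30, 10, tzinfo=timezone.utc)
    print(planet_cycle_max_age(now, last_import))
```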
The final max-age handed out by the server for clean tiles is then the
maximum of the three heuristics, capped at one week.
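Combining the three values is then a one-liner. The Python sketch below
uses made-up names; a heuristic that is switched off is assumed to
contribute 0 seconds.

```python
WEEK_S = 7 * 24 * 3600


def clean_tile_max_age(zoom_min_s, last_modified_s, planet_cycle_s):
    """Take the largest of the three heuristics, capped at one week."""
    return min(max(zoom_min_s, last_modified_s, planet_cycle_s), WEEK_S)


if __name__ == "__main__":
    # Only the zoom-based minimum of 3 hours is active.
    print(clean_tile_max_age(3 * 3600, 0, 0))
    # A very large last-modified value is capped at one week.
    print(clean_tile_max_age(3 * 3600, 2_000_000, 0))
```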
The random jitter is there mostly for servers with weekly update
cycles, so that cached tiles don't all expire at exactly the same time
and overwhelm the tile server.
As of a couple of hours ago, the mod_tile code also supports tile
expiry based on the hostname header, so it would theoretically be possible
to do something like cache.tile.osm.org handing out expiry headers of
e.g. a month. But it isn't clear how one would decide whom to send to a
hypothetical cache.tile and whom to the normal tile server. It is also
not clear what it would do to OSMF's own (currently still relatively
limited) caching, as it would then require the accelerator caches to
keep two copies of each tile, doubling the required resources. So I am
not sure if, or in what form, this would potentially happen, even though
I do think it is a good idea from the client perspective.
In short, the current tile.osm.org server basically hands out
expiry times of 15-22 minutes for stale tiles and 3-6 hours for clean
tiles, with a number of further parameters that could be tuned.
>
> The Phnom Penh issue all sounds like a job for a CDN like Akamai's or a caching proxy (i.e. squid-cache.org) closer to Cambodia. Bernhard, these are not difficult to set up for yourself if you are interested, and require little knowledge of the actual map.
Having a CDN would definitely help and would indeed probably be the
preferred option in this specific case. But it would require OSMF to have
hosting facilities in various countries. Great if it were possible, but
I am not sure it is at the moment.
For the past few days, there has been a trial to see how well a CDN /
caching proxy would work in our setup, with a.tile.osm.org redirecting
to a simple proxy server at a different hoster (although in London, too).
It is too early to say much yet, but it does seem like the cache hit
ratios are lower than I would have hoped, with only about 40-60% of
requests being served by the proxy without needing to contact the main
server.
(http://munin.openstreetmap.org/openstreetmap/konqi.openstreetmap.html#Squid
for reference)
We will need to see how this all pans out, but I would guess it will
depend on resources donated to osmf to make some of this happen and
ensure that the tile serving infrastructure can be expanded in the future.
Kai
>
> -mike.
>
> ----------------------------------------------------------------
> michal migurski- mike at stamen.com
> 415.558.1610
>
>
>
>
>