[OSM-dev] Questions/Ideas/Plans on OSM Infrastructure

Kai Krueger kakrueger at gmail.com
Thu Sep 17 13:19:10 BST 2009


On 22/07/28164 20:59, Dominik Bay wrote:
> Hi all,
>
> I'm coming up with a topic for discussion on how to save and
> serve OSM data for Slippy Maps, Mobile Devices and handling
> routing-requests.

I am not entirely sure what you are trying to achieve with this topic
and who you are targeting with it (OSM sysadmins? Potential donors? The
community and what it wants from OSM? Or are you planning on donating
hardware yourself?), but I will try to comment on some of the points
about the rendering.

>
> ====
>
> 2. Why do we need it?
> ---------------------
> Due to growing Software based on OpenStreetMap and OpenRouteService
> we need a more-realtimish behaviour of our data, to serve mobile
> Users with actual tiles and a fast route calculation.

Not sure what your definition of "more-realtimish" is, but I would say
our rendering is already approaching "near realtime". At least the main
mapnik layer should currently only have a lag of about 5 to 10 minutes
behind the main OSM database. During load spikes it could be more,
although with the new render server put in place a week or two ago, we
haven't really seen any times when the rendering couldn't keep up.
Furthermore, with the new diff replication mechanism that is starting
to be tried out, this lag may be reduced even further, to about a
minute. It may feel less realtimish to users, though, because of
client-side caching. Tile expiry for proxy servers is currently set to
something between 3 and 6 hours, so if you visited the map before the
edits, your browser may keep showing the old cached tiles for that
long. The tiles at home layer, I think, also aims for rendering
turn-around times of tens of minutes to a couple of hours, so it isn't
that far behind mapnik either.
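To make the caching point concrete, here is a minimal sketch of how an
expiry window of a few hours turns into HTTP caching headers. The
function name expiry_headers and the 3 hour value are assumptions for
illustration only; this is not a description of what mod_tile actually
sends.

```python
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime


def expiry_headers(now, ttl_hours):
    """Build caching headers for a tile that may be reused for ttl_hours.

    Hypothetical helper: `now` must be a timezone-aware UTC datetime.
    """
    ttl = timedelta(hours=ttl_hours)
    return {
        # Relative lifetime in seconds, as used by browsers and proxies.
        "Cache-Control": f"max-age={int(ttl.total_seconds())}",
        # Absolute expiry time in the HTTP date format.
        "Expires": format_datetime(now + ttl, usegmt=True),
    }


if __name__ == "__main__":
    sent_at = datetime(2009, 9, 17, 12, 19, 10, tzinfo=timezone.utc)
    print(expiry_headers(sent_at, 3))
```

A client that fetched the tile just before an edit would, under these
assumptions, be allowed to keep showing the old tile until the Expires
time, no matter how quickly the server re-renders it.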

> We also need to get tiles *fast* to our users, this is why they are
> called *Slippy* Maps ;-)
> (They are fast at the moment, thanks for your work on mod_tile, I'm
> impressed!)

Yes, the current server can quite happily sustain the full 100 Mbit/s
link that we currently have available, which was demonstrated again the
other day during the load spike caused by the TV program Quarks and Co.

The response time for a single tile should be anywhere between about
50 ms, if the tile is already cached and you have a good connection to
where OSM is hosted, and about 3 seconds, if the server tries to render
the tile on the fly, finds it too complex to finish in time, and times
out, returning a slightly older cached tile instead. That is pretty
fast as well.
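The behaviour described above (serve from cache if possible, otherwise
try to render within a time limit and fall back to the older tile) can
be sketched roughly as follows. This is only an illustration of the
idea, not mod_tile's code; serve_tile, stale_keys and the dict used as
a cache are all hypothetical names.

```python
import concurrent.futures


def serve_tile(key, cache, stale_keys, render, timeout_s=3.0):
    """Return (tile, status) where status is 'cached', 'rendered' or 'stale'."""
    tile = cache.get(key)
    if tile is not None and key not in stale_keys:
        return tile, "cached"  # fast path: tile is up to date

    # Render in a worker thread so we can stop waiting after timeout_s.
    pool = concurrent.futures.ThreadPoolExecutor(max_workers=1)
    future = pool.submit(render, key)
    try:
        fresh = future.result(timeout=timeout_s)
    except concurrent.futures.TimeoutError:
        if tile is None:
            raise  # nothing older to fall back to
        return tile, "stale"  # too slow: hand out the older tile
    finally:
        pool.shutdown(wait=False)  # do not block on a slow render

    cache[key] = fresh
    stale_keys.discard(key)
    return fresh, "rendered"
```

In this sketch a render that misses the deadline keeps running in the
background and its result is simply dropped; a real server would more
likely keep it so that the next request gets the fresh tile.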

So at the moment the problem with tile serving, if any, is a question of
hosting and OSM's increasing bandwidth requirements. During a typical
day, tile serving uses about 50 - 60 Mbit/s sustained, dropping to about
10 Mbit/s at night. (Tiles at home and the cycle map are not included.)
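As a rough feel for what those rates mean in volume, here is a small
back-of-the-envelope calculation. The split into 16 "day" hours and 8
"night" hours is purely an assumption for illustration, as is the
function name; only the rates come from the figures above.

```python
def daily_transfer_gb(day_mbit_s, night_mbit_s, day_hours=16):
    """Approximate data served per day in decimal gigabytes.

    day_hours is a hypothetical split of the 24 hours into day and night.
    """
    night_hours = 24 - day_hours
    # Megabits transferred over the whole day.
    megabits = (day_mbit_s * day_hours + night_mbit_s * night_hours) * 3600
    # 8 bits per byte, 1000 MB per GB.
    return megabits / 8 / 1000


if __name__ == "__main__":
    print(daily_transfer_gb(55, 10))    # typical day, mid-range rate
    print(daily_transfer_gb(100, 100))  # link saturated around the clock
```

Under these assumptions a typical day comes out at roughly 430 GB of
tiles, while a fully saturated 100 Mbit/s link would top out at about
1080 GB per day.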



>
> 3. A short description of what can be done
> ------------------------------------------
> We can make use of things like Anycast and Geo-aware Caching.
> This means a user connects to a Tile-Proxy which is near his
> ISP (routing-wise) and holds all tiles which are relevant for
> the users location (Europe for example) plus the Tiles which
> are requested often (big Cities).

Yes, that could theoretically be done, and there have been two
successful tests in the past using Wikimedia proxies. So software-wise
this could presumably be done. Software is not everything, though; for
this to work you also need the proxy caches and the hosting. So again,
the issue here, as far as I can see, is hosting. I am sure that if ISPs
or datacenters are willing to donate free, reliable hosting with
100 Mbit/s to 1000 Mbit/s network links, we could scale up our current
tile serving quite well. But for the moment, we are limited to the
resources we currently have.

>
> To get a better understanding of the next lines, feel free to open this
> picture:<http://eimann.etherkiller.de/nmz/osm.png>
> ->  Rendering
> ------------
> The goal is to get nearly-realtime rendering of all map-types and the
> option to easily do customized rendering for supporting events,
> Wikipedia, etc.
> This also enables us to support rendering of 3D Layers with very low delay.

As I said above, we basically already have near-realtime rendering of
the standard mapnik layer. Providing customizable map tiles should not,
I think, be the core focus of the OpenStreetMap project itself. For
that, companies and projects like Cloudmade, Geofabrik, or
Tiledrawer.com, which have sprung up around the OSM dataset, are much
better suited. In my opinion, with respect to rendering tiles, OSM
should provide a map of sufficiently high quality that the general
public is attracted to OpenStreetMap, which then hopefully lowers the
barrier to entry for editing and improving the data. But that goes
into the political question of what OSM stands for and what it wants to
offer.

>
> ->  Shared Storage
> -----------------
> Rendered tiles are stored for the Webservers to serve them to the
> Proxies on request.
> The Webservers can only *read* from the Storage, as there is no need for
> writing on it.
> Same for the Rendering-Farm. A Renderer fetches a dataset, renders it
> and saves it on the shared storage, together with a file which holds
> meta-data like country, city, data-time, render-time etc.
> This is done every five minutes, default value for expiring tiles is 30
> minutes and on requests it should be 5-10 Minutes, we need to check how
> this behaves and how much load we have, but it should be somewhere in
> this range.

This can be done and already is being done (apart from some of the
metadata, in particular which country a tile is in).

>
> ->  Webservers
> -------------
> Webservers answer proxy requests, they expire tiles on the proxies,
> serve tiles and can read additional meta-data to make decisions on
> expiring and serving old or new data.

Yup, that is pretty much what mod_tile does today.

>
> ->  Proxies
> ----------
> The proxy servers are located near the user, to *only* serve tiles and
> to help spreading the load.
> Imagine 5000 Users doing routing with AndNav2, travelling at
> 100km/h with different Zoomlevels on their map, this brings a lot of
> requests.
> Proxies use auto-expire for content but also honor expire-times served
> by the Webservers to push new content to the users.
> The proxy servers are located at various ISPs, all served out of the
> same /24, so we also have automatic failover + serving data nearly
> locally (routing-wise).

Donate the necessary reliable and secure hosting and servers and this
could be implemented, and as mentioned above, there have been tests
going down those lines. However, for some reason, we currently don't
have a proxy server in every city or at various ISPs. So in the
meantime we have to live with the infrastructure we have and slowly
expand the possibilities as the resources become available. But
remember that OSM is mainly a data project, so spending available
resources on ensuring that editing and improving the data runs smoothly
will presumably have priority. That said, there has just been a big
round of server updates
( http://wiki.openstreetmap.org/wiki/Servers/Upgrades ), including a
new tile server.

>
> =====
>
> So, this is what I've done so far, hardware specs and other stuff is
> currently under evaluation and I'm happy to get more details
> specifically on rendering and database stuff.
> (Still reading the Wiki on that part anyway)
> My specific question for rendering is, how does it scale?
> The more cores the better, or less cores but more speed? ;-)

Rendering scales pretty well with CPUs, assuming you have sufficient IO
bandwidth for the rendering database. The current tile server, Yevaud,
has 16 cores (8 real ones + hyperthreading), but you can find more
details about the servers at http://wiki.openstreetmap.org/wiki/Servers

But why exactly are you putting specs together? Are you planning on 
hosting your own rendering infrastructure or donating hardware to OSM?

>
> I'm curious on your input to deliver a better user experience at the end.

Yes, we are all interested in constantly improving the experience for
the end user. :-) And it is indeed constantly improving!

Kai

>
> Kind regards,
> Dominik
>
>




