[Talk-GB] Efficient processing of map data for rendering (BOINC).

Matt Amos zerebubuth at gmail.com
Sat Jan 24 16:22:37 GMT 2009


On Sat, Jan 24, 2009 at 10:38 AM, Steve Hill <steve at nexusuk.org> wrote:
> On Fri, 23 Jan 2009, Matt Amos wrote:
>
>> this might be helpful
>> http://svn.openstreetmap.org/applications/utils/export/tile_expiry/
>
> Yes, I had a look at that script, but it only expires tiles with nodes on
> them, which I think is rather too simplistic.  The readme says that it is
> unusual for the gap between nodes to be larger than a tile, but in my
> experience this just isn't true at all.

in my experience it is unusual for the gap between nodes to be larger
than a meta-tile, which is all we really care about, since we re-render
meta-tiles, not single tiles. it does happen, of course, but at high
zoom levels in areas of low detail (i.e. not very interesting places).

i find that browser caching issues tend to produce far more problems,
but i don't know enough about the arcana of HTTP caching to try to
fix them ;-)

> So my idea was to work on the postgis objects themselves during import. This
> should have some advantages:
> 1. We don't need to duplicate any work in translating OSM objects into the
> objects that are actually rendered - osm2pgsql already does this and we
> don't have to know or care how.

to be fair, this is a pretty simple transformation - a single call to
proj4. of course, re-using osm2pgsql has an advantage if you're
rendering in several different projections.
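that single proj4 call is just lon/lat to spherical mercator (the projection the standard slippy map uses). as a rough pure-python equivalent of what proj4 computes for EPSG:900913 (a sketch, not osm2pgsql's actual code):

```python
import math

R = 6378137.0  # spherical mercator earth radius, metres

def lonlat_to_mercator(lon, lat):
    """Approximately what the proj4 call does for spherical mercator."""
    x = math.radians(lon) * R
    y = math.log(math.tan(math.pi / 4.0 + math.radians(lat) / 2.0)) * R
    return x, y
```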

> 2. We don't need to duplicate work parsing the OSM XML file - this should
> give some speed improvements.

the script in SVN has no trouble expiring tiles in less than a minute,
which is all that is required. measuring the load on the server shows
that the overhead is negligible.

> 3. There should be a reduced number of database lookups because the only
> extra things we need to look up in the database are the postgis objects that
> are being deleted.

the only database lookups needed are those to fetch the old positions
of nodes and ways when an object is modified. due to the non-local
algorithms mapnik uses to place text, i couldn't see a way to optimise
this down to a smaller part of the way. :-(

> The plan is to have osm2pgsql insert a list of dirty tiles for the maximum
> zoom level into a postgres table.  I wrote a script that goes through each
> zoom level, starting at the maximum and working back to 0.  Each zoom level
> has a minimum age associated with it and when the tile has been dirty for
> that long it is deleted and the coordinates for the tile at zoom-1 are
> inserted into the table.  The idea being that low-zoom tiles change more
> frequently than high-zoom tiles, but are less interesting and more effort to
> render so shouldn't be re-rendered immediately.

this is tied into the mapnik style. for example, changes to
residential roads do not need to be propagated above the zoom level at
which residential roads are rendered. it would be interesting to
extract this information automatically from the style file. even more
interesting to try to diff two styles and expire tiles based on the
differences...
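the propagation scheme described above could be sketched roughly like this (hypothetical ages and an in-memory dict standing in for the postgres table):

```python
# hypothetical policy: lower zooms wait longer before re-rendering,
# so a dirty z=18 tile expires after 60s, z=0 after ~19 minutes
MIN_AGE = {z: 60 * (19 - z) for z in range(19)}

def propagate(dirty, now):
    """dirty maps (zoom, x, y) -> time the tile was first marked dirty.

    Tiles dirty for longer than their zoom's minimum age are dropped
    (i.e. re-rendered) and the parent tile at zoom-1 is marked dirty.
    """
    for (z, x, y), since in list(dirty.items()):
        if z > 0 and now - since >= MIN_AGE[z]:
            del dirty[(z, x, y)]
            # parent tile covers a 2x2 block of children
            dirty.setdefault((z - 1, x // 2, y // 2), now)
    return dirty
```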

>> this is one case where one big raid array is much better than many
>> distributed disks.
>
> I was wondering if anyone had done any tests on the speed of a database that
> is distributed over a cluster of servers.  I would imagine that there would
> be speed improvements, but I'm not sure what the overhead is like for
> actually working out which server contains the data you're after.

it would be interesting to try this with a geographical distribution
of both databases and rendering requests. i agree that the front-end
(i.e. load-balancing) server would add quite a lot of complexity,
especially if the rendering+diff load is highly geographically
localised.

it might be possible to get a similar speed-up with lower complexity
by partitioning the tables, especially if suitable partition
boundaries could be found which are crossed by very few ways.
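as a sketch of what that partitioning would involve (hypothetical longitude bands; the awkward cases are exactly the ways whose nodes straddle a boundary):

```python
# hypothetical partition boundaries: three longitude bands
BANDS = (-180.0, -30.0, 60.0, 180.0)

def partition_for(lon):
    """Index of the longitude-band partition containing a point."""
    for i in range(len(BANDS) - 1):
        if BANDS[i] <= lon < BANDS[i + 1]:
            return i
    return len(BANDS) - 2  # lon == 180.0 falls in the last band

def crosses_boundary(way_lons):
    """True if a way's nodes span more than one partition."""
    return len({partition_for(lon) for lon in way_lons}) > 1
```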

> Another possible solution is to have a number of completely independent
> rendering machines with their own copy of the database and just round-robin
> the rendering requests between them.  This is obviously not something that
> could be done with BOINC or similar - not many people would want to dedicate
> 60GB of their hard drive to the OSM postgis database. :) But it could be
> done with a cluster of dedicated servers.

i think this could be implemented quite quickly using existing
load-balancing software. the only problem would be clustering tile
requests which come from the same meta-tile, to avoid all the servers
in the cluster pointlessly rendering the same meta-tile
simultaneously.
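one way to get that clustering is to hash the meta-tile key so every tile of a meta-tile routes to the same backend. a minimal sketch (hypothetical server names; a real deployment would do this in the load balancer itself):

```python
import hashlib

SERVERS = ["render1", "render2", "render3"]  # hypothetical backends
METATILE = 8

def backend_for(zoom, x, y):
    """Route all tiles of one meta-tile to the same render server."""
    key = "%d/%d/%d" % (zoom, x // METATILE, y // METATILE)
    digest = hashlib.sha1(key.encode()).hexdigest()
    return SERVERS[int(digest, 16) % len(SERVERS)]
```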

> However, I would be really interested to see just how much load there would
> be on the rendering servers if tiles were rendered on-demand only if they
> hadn't been rendered before or if they have really become dirty since the
> last render.  It just may be that there is no need to chuck lots of hardware
> at the problem if tile expiry is done well.

i totally agree. i've had a server re-rendering *all* the minutely
updated meta-tiles (8-core Opteron, 16 GB RAM, 3x1 TB Spinpoint drives
in RAID0) in less than a minute. the load was pretty high (average
around 4). by applying a little more cleverness to the problem it
should be possible to reduce that much further.

cheers,

matt
