[OSM-talk] Mapnik tileset coherency issues
Jon Burgess
jburgess777 at googlemail.com
Fri Oct 19 00:07:50 BST 2007
On Thu, 2007-10-18 at 17:53 -0400, Christopher Schmidt wrote:
> On Thu, Oct 18, 2007 at 05:38:30PM -0400, Andrew MacKinnon wrote:
> > On 10/18/07, Tom Hughes <tom at compton.nu> wrote:
> > > > I'm not convinced by the need for a queue for the high zoom tiles - I
> > > > think we should either render straight away or not at all. I think the
> > > > probability of any individual high level tile being re-visited is quite low.
> > >
> > > Umm.... Without the queue lots of people would just get blank space
> > > on the map. At peak times there is no way we can render all the needed
> > > tiles on demand and if they're not queued then they will still not be
> > > there the next time somebody looks at that area.
> >
> > http://labs.metacarta.com/osm/, an alternative OSM slippy map, seems
> > to be able to render tiles in real time with quite acceptable
> > performance.
>
> 1. Super-fast machine donated to a university by Intel.
> 2. Super-low usage in comparison to OSM -- that site serves something
> like 1000 tiles a day compared to OSM's millions!
> 3. If a lot of people view it immediately after I've viewed the cache,
> it causes tons of problems -- the load average got to 120 before I
> had to kill Apache eventually on Wednesday morning.
>
> It's nice to be able to have it as a tool for visualizing the most up to
> date data for developers, but it's not a tenable solution for OSM's main
> map.
>
> > perhaps rendering tiles in real time at off-peak times, and putting
> > them into a queue at peak times.
>
> That's already the case -- there just *are* no off-peak times anymore.
The server is fairly quiet between 00:00 - 08:00 UTC but for the other
16 hours of the day there is a fairly constant load.
Our present tile rendering setup has started to show scalability issues
over the past few months and things are getting worse as the number of
tiles continues to rise.
The number of tiles currently queued for rendering is...
mysql> select count(1) from tiles where dirty_t='true';
+----------+
| count(1) |
+----------+
|   283285 |
+----------+
1 row in set (10 min 27.40 sec)
If you look at that final line you'll notice it took 10 minutes just to
count those dirty tiles (and there is an index on the dirty_t column).
The machine does not have enough RAM to keep the entire index in memory
while it is also serving and rendering tiles.
I would try counting all the tiles in the DB but I can't wait for that
query to finish at the moment. When I looked last week we had 8M tiles.
I don't keep a precise record of the tile counts but I think it was
around 2M only 6 months ago. The current size of the DB is around 20GB.
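To put those rough figures in perspective, here is a back-of-envelope
sketch. It assumes the growth is exponential and that 20GB means 20 * 10^9
bytes; both are assumptions for illustration, and the per-tile figure
includes whatever metadata and index overhead the DB carries.

```python
import math

# Approximate figures quoted above; none of these are precise measurements.
tiles_now = 8_000_000        # tile count last week
tiles_then = 2_000_000       # rough tile count six months ago
months_elapsed = 6
db_bytes = 20 * 10**9        # assumes decimal gigabytes

# Overall growth factor over the period.
growth_factor = tiles_now / tiles_then

# Doubling time, assuming exponential growth (an assumption, not a measurement).
doubling_months = months_elapsed / math.log2(growth_factor)

# Average DB footprint per tile, including metadata and index overhead.
bytes_per_tile = db_bytes / tiles_now

print(f"growth: {growth_factor:.1f}x in {months_elapsed} months")
print(f"doubling time: about {doubling_months:.1f} months")
print(f"average footprint: about {bytes_per_tile:.0f} bytes per tile")
```

If that trend held, the tile count would double roughly every three
months, which is why the index no longer fits in memory.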
The tile machine renders somewhere in the region of 20k to 30k tiles per
hour. It'll therefore take all night to catch up with the current 280k
backlog. At the same time new tiles will be marked dirty so chances are
it will be busy all day tomorrow too.
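The catch-up estimate can be sketched as below. The function name and
the new_dirty_per_hour parameter are hypothetical; I have not measured
how fast tiles are being marked dirty, so the inflow value in the last
line is only a placeholder.

```python
def hours_to_clear(backlog, tiles_per_hour, new_dirty_per_hour=0):
    """Estimate hours needed to drain a render backlog.

    new_dirty_per_hour is a hypothetical inflow of newly dirtied tiles;
    if it meets or exceeds the render rate the backlog never clears.
    """
    net_rate = tiles_per_hour - new_dirty_per_hour
    if net_rate <= 0:
        return float("inf")
    return backlog / net_rate


backlog = 283285  # dirty tile count from the query above

# Render rate range quoted above: 20k to 30k tiles per hour.
slow = hours_to_clear(backlog, 20_000)
fast = hours_to_clear(backlog, 30_000)
print(f"no new dirty tiles: {fast:.1f} to {slow:.1f} hours")

# Placeholder inflow of 10k dirty tiles per hour (an assumed value).
with_inflow = hours_to_clear(backlog, 20_000, new_dirty_per_hour=10_000)
print(f"with assumed inflow: {with_inflow:.1f} hours at the slow rate")
```

Even with no new dirty tiles at all, the backlog takes somewhere between
9 and 14 hours to clear at the quoted render rates.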
It is fairly clear that the existing scheme of storing the tiles and
metadata in a MySQL DB does not scale well enough. We need to
develop a next-generation rendering system to replace the current
setup.
I've started along one path by working on 'mod_tile', which is a C-based
tile rendering manager that integrates into Apache as a module. It
is more intelligent about which tiles to render and when (the code is in
SVN if you want to try it). It needs further work, though, before we can
deploy it.
There have been several good ideas in this thread and I won't reply to
them all individually but I'll certainly try to remember them as I
continue working on the mod_tile development.
Jon