[Tilesathome] t@h server performance

Christopher Schmidt crschmidt at metacarta.com
Sat Dec 22 14:43:24 GMT 2007


So, t@h uses a database to store blank tiles. This database is the
primary thing that slows down tileset processing at this point. The
database contents look like this:

mysql> select z,layer,count(*) from tiles_blank group by z,layer;
+----+-------+----------+
| z  | layer | count(*) |
+----+-------+----------+
|  0 |     1 |        1 | 
|  2 |     1 |        3 | 
|  3 |     1 |        9 | 
|  4 |     1 |       38 | 
|  5 |     1 |      156 | 
|  6 |     1 |      503 | 
|  7 |     1 |     1340 | 
|  8 |     1 |      819 | 
|  9 |     1 |     3453 | 
| 10 |     1 |    10040 | 
| 11 |     1 |    29195 | 
| 12 |     1 |   153465 | 
| 12 |     3 |   192977 | 
| 12 |     5 |        2 | 
| 13 |     1 |   202329 | 
| 13 |     3 |   206492 | 
| 14 |     1 |   625637 | 
| 14 |     3 |   616067 | 
| 15 |     1 |  2017676 | 
| 15 |     3 |  1810050 | 
| 16 |     1 |  6378318 | 
| 16 |     3 |  5060867 | 
| 17 |     1 | 18109393 | 
+----+-------+----------+
23 rows in set (5 min 0.66 sec)

In total, there are about 35 million rows. Since low-content tilesets
may result in many hundreds of selects / "REPLACE INTO"s, this process
is the primary slow part of the tileset processing.
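
As a sanity check on that total, the counts can be summed directly. This
is only a small sketch: the `counts` list is the count(*) column copied
by hand from the query output above, nothing more.

```python
# count(*) column copied from the tiles_blank query output above,
# in the same row order (z0 layer 1 first, z17 layer 1 last).
counts = [
    1, 3, 9, 38, 156, 503, 1340, 819, 3453, 10040, 29195,
    153465, 192977, 2, 202329, 206492, 625637, 616067,
    2017676, 1810050, 6378318, 5060867, 18109393,
]

total = sum(counts)
z17_share = counts[-1] / total  # fraction of all rows that sit at z17

print(len(counts), "rows,", total, "blank tiles in total")
print(f"z17 accounts for {z17_share:.0%} of the table")
```

The sum comes to 35418830, with roughly half of the rows at z17 alone.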

Over the several months that the server has been running on Hypercube,
the number of tilesets that the server can process per hour has dropped
significantly due to the growth in size of this table. Additionally, you
can see from the munin graphs:

http://munin.openstreetmap.org/openstreetmap/tah.openstreetmap.html

that when the size per tileset is smaller, the processing rate is much
lower:

http://munin.openstreetmap.org/openstreetmap/tah.openstreetmap-tah_bytes.html

You can see here a huge difference between the processing of mostly full
tilesets -- the big peaks are generally when processing tilesets
requested via the changed tiles script -- versus processing of mostly
empty tilesets from the low priority queue.

There are a couple of problems here:
 1. Tilesets are processed in the order they are uploaded -- so when the
    tileset queue is full of low priority requests, these are much
    slower to get through the processing queue, meaning that even
    though the fuller tilesets are there, people still have to wait
    a long time for them.
 2. This doesn't seem sustainable: I understand we're getting through
    the tile queues... but the number of blank tiles stored in the DB
    is only going to continue to increase, and as it increases, the rate
    at which SQL statements can be processed is simply going to go down.

In the short term, making the processing take priority into account --
so that higher priority tilesets are processed before lower priority
tilesets -- seems important to me, since that's the thing that affects
people the most. If they have to wait 3 hours for all the low priority
crap to clear through in order to see the tiles they just uploaded at
priority 1, that's clearly a bad feedback loop.
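
To make the idea concrete, here is a minimal sketch of priority-aware
ordering. It is not the server's actual code: the `UploadQueue` class and
the tileset names are hypothetical, and I'm assuming that a lower number
means a higher priority (so priority 1 goes first). Uploads with the same
priority keep their upload order.

```python
import heapq
import itertools


class UploadQueue:
    """Hypothetical in-memory stand-in for the server's upload queue."""

    def __init__(self):
        self._heap = []
        self._order = itertools.count()  # tie-breaker: upload order

    def add(self, priority, tileset):
        # Assumption: lower number = higher priority (1 beats 3).
        heapq.heappush(self._heap, (priority, next(self._order), tileset))

    def pop(self):
        # Returns the highest-priority tileset; oldest first within a priority.
        _priority, _seq, tileset = heapq.heappop(self._heap)
        return tileset

    def __len__(self):
        return len(self._heap)


queue = UploadQueue()
queue.add(3, "low-priority-a")
queue.add(3, "low-priority-b")
queue.add(1, "user-upload")      # arrives third, but should be handled first
queue.add(3, "low-priority-c")

processed = [queue.pop() for _ in range(len(queue))]
print(processed)
```

With this ordering the priority 1 upload is processed first even though
three low priority tilesets were queued around it.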

In the long term, we need to come up with some more efficient way of
storing blank tiles than the current database. I don't know what this
means: perhaps it's running under some db format (InnoDB?) that we can
stop using in favor of a less robust MySQL table type? Perhaps we can
explore some other mechanism of storing this information? Perhaps the
code really just does too many selects/inserts and can be cleaned up? I
don't know the code well enough to say -- but I have straced the
processes and assured myself that the slowdowns we are seeing are simply
the result of the much larger blank tile db, and that we need to do
something about it if we want t@h performance to increase.
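
If it does turn out that the code issues too many individual statements,
one possible cleanup is to write all the blank tiles of a tileset in a
single batch. The sketch below only illustrates that idea and is not
taken from the t@h code: it uses Python's sqlite3 module as a stand-in
for MySQL, the x and y columns are my assumption (only z and layer are
visible in the query above), and `store_blank_tiles` is a made-up name.

```python
import sqlite3

# sqlite3 stands in for MySQL here. The x and y columns are assumed;
# only z and layer appear in the query output earlier in this mail.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tiles_blank ("
    " x INTEGER, y INTEGER, z INTEGER, layer INTEGER,"
    " PRIMARY KEY (x, y, z, layer))"
)


def store_blank_tiles(conn, tiles):
    """Write all blank tiles of one tileset in a single transaction."""
    with conn:  # commits once at the end, rolls back on error
        conn.executemany(
            "REPLACE INTO tiles_blank (x, y, z, layer) VALUES (?, ?, ?, ?)",
            tiles,
        )


# Hypothetical tileset: one z12 tile and its four z13 children.
tiles = [(2048, 1362, 12, 1)]
tiles += [
    (2 * 2048 + dx, 2 * 1362 + dy, 13, 1)
    for dx in (0, 1)
    for dy in (0, 1)
]

store_blank_tiles(conn, tiles)
store_blank_tiles(conn, tiles)  # a second upload replaces rows, no duplicates

row_count = conn.execute("SELECT count(*) FROM tiles_blank").fetchone()[0]
print(row_count)
```

Whether batching like this helps on the real server would need measuring;
it reduces the number of round trips and commits, not the size of the
table itself.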

For the record, I ran the cleanblanktiles script yesterday, so the blank
tile db is as clean as that code makes it. I don't know if there are
further optimizations that can be made there -- perhaps someone else can
comment -- but I've done everything I know how to do.
   
Regards,
-- 
Christopher Schmidt
MetaCarta



