[Tilesathome] Observations after having played around with the t@h server
John Bäckstrand
sandos at sandos.se
Fri Jul 27 07:45:21 BST 2007
Ok, I wanted to play around with the server and try to see where the
bottlenecks lay with regard to queue processing. What I did was
install the t@h server as per
http://wiki.openstreetmap.org/index.php/Tiles%40home/Server_install_guide.
I then rendered a few randomly selected tiles in Sweden, dominated by
water and empty land, since that machine is very slow at rendering
cities. I ended up with 10 zipfiles containing tiles; it turned
out 4 were maplint ones and 6 seem to be regular tiles, of which 1 is
empty (it has a 69-byte file in it). (I don't know where the missing
maplint one is, or why one of the zipfiles is missing tiles...)
I then made a test.sh which empties the t@h database, recreates the
tables, copies the zipfiles into the Queue for processing, runs
/Upload/Run/index.php and prints a few statistics:
http://rafb.net/p/4D3AyD77.html
SLOW means we insert metadata for every tile; FAST means we do it only
for z12. "x meta" is the number of operations on the tiles_meta table,
and "x blank" is the number of blanks we want to delete.
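Roughly, the difference between the two paths looks like this (a Python
sketch of my own, the real code is PHP, and the function name is made up):

```python
def meta_operations(tiles, fast):
    """Return the tiles that get a tiles_meta insert.

    tiles: list of (z, x, y) tuples from one uploaded tileset.
    fast:  if True, only the z12 base tile is recorded;
           otherwise every tile gets its own metadata row.
    """
    if fast:
        return [t for t in tiles if t[0] == 12]
    return list(tiles)
```

A complete z12 tileset down to z17 holds 1 + 4 + 16 + 64 + 256 + 1024 =
1365 tiles, so the FAST path does 1 meta operation where SLOW does 1365.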
So, a few observations:
First, maplint tiles are never "fast", because the check in
lib/checkupload.inc compares against the number of tiles for z17, which
maplint doesn't contain at the moment; it only goes to z16. Maplint
tilesets will therefore always be flagged as incomplete. These example
sets have very few meta-inserts anyway (0-32 for my example zips), so
making them take the FAST path would not really make a difference, I
think, compared to the 341 blank deletes.
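To illustrate the mismatch (again a Python sketch with invented names, not
the actual checkupload.inc code, and I'm assuming the check simply compares
the tile count against the full z12-z17 pyramid): a layer-aware check would
expect the pyramid only down to the layer's own maximum zoom.

```python
def is_complete(tile_count, max_zoom=17):
    """Check a tileset for completeness down to max_zoom.

    A z12 tileset contains 4**(z-12) tiles at each zoom level z,
    so the expected total is the sum over z = 12 .. max_zoom.
    """
    expected = sum(4 ** (z - 12) for z in range(12, max_zoom + 1))
    return tile_count == expected
```

A full maplint tileset stops at z16, i.e. 341 tiles, so checking it against
the z17 total of 1365 always flags it incomplete; passing max_zoom=16 for
maplint would fix that.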
Secondly, there is a lot of processing done for tiles_blank, especially
for maplint, since those tiles will ideally be blank! I have a
hashing-based optimization almost done, but I don't think it will make a
huge difference; see below.
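The idea behind the hashing optimization, sketched in Python (this is my
own illustration of the approach, not the actual patch): precompute the
digests of the canonical blank tiles once, then classify each uploaded
tile with a single hash instead of a byte comparison or DB lookup per tile.

```python
import hashlib

def blank_index(blank_pngs):
    """Map sha1(png bytes) -> blank kind for the canonical blank tiles.

    blank_pngs: dict like {"sea": <png bytes>, "land": <png bytes>}.
    """
    return {hashlib.sha1(data).hexdigest(): kind
            for kind, data in blank_pngs.items()}

def classify(tile_bytes, index):
    """Return "sea"/"land" for a known blank tile, or None if non-blank."""
    return index.get(hashlib.sha1(tile_bytes).hexdigest())
```
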
Thirdly, on my laptop it's not as if the _entire_ DB access is the
low-hanging fruit. Admittedly, my disk is very slow. Removing all DB
access altogether takes the time down to 10 seconds from 19. That's
still "only" one tileset per second. Initially I stored all tiles on
disk, and the time for 10 zips was ~65 seconds; moving to a RAM disk
brought this down to 30 seconds.
I think we might have to rethink how this is all done. The best approach
I can see is to let the client know, when it starts working, what is
actually on the server (per tile: whether it is blank land or sea, and,
for regular tiles, the hash of the last png? Hashes require clients to
produce bit-perfect identical output with regard to each other;
otherwise I guess we could use a clientVersion plus the latest-modified
date of the API data, which would also let a new client re-render tiles
from the same underlying data), and have the client pack _only_ the
tiles it _knows_ have changed. This should, most of the time, remove
much of the strain on the server, and the server would actually be doing
_useful_ work all the time.
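The proposed protocol could look something like this (a hypothetical
Python sketch; the manifest format, names and signatures are all made up):
the server hands the client a per-tile manifest, the client renders, and
then uploads only the tiles whose state differs from the manifest.

```python
import hashlib

def tiles_to_upload(manifest, rendered):
    """Pick the tiles the client should actually pack and upload.

    manifest: {(z, x, y): "land" | "sea" | <sha1 hex of last png>}
              as downloaded from the server before rendering.
    rendered: {(z, x, y): ("land", None) | ("sea", None)
               | ("tile", <png bytes>)} produced by the client.
    """
    changed = []
    for key, (kind, data) in rendered.items():
        if kind in ("land", "sea"):
            state = kind
        else:
            state = hashlib.sha1(data).hexdigest()
        if manifest.get(key) != state:
            changed.append(key)
    return changed
```

With bit-perfect renderers the hash comparison works directly; with the
clientVersion + API-modified-date variant, the manifest values would be
those version/date pairs instead of hashes, but the diffing logic stays
the same.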
---
John Bäckstrand