[Tilesathome] MySQL optimization (hashing idea)

John Bäckstrand sandos at sandos.se
Thu Jul 26 11:50:57 BST 2007


Sebastian Spaeth wrote:
> Following a talk with TomH (to whom the credit has to go), I replaced 
> the mySQL statements that update the tiles meta data and the blank tiles 
> information.
> 
> The gist of the change is to use:
> "INSERT INTO `tiles_blank` (%s) values (%s) ON DUPLICATE KEY UPDATE... "
> rather than blindly do "replace into `tiles_blank` (%s) values (%s);"
> 
> The exact diff can be seen here:
> http://trac.openstreetmap.org/changeset?new=sites%2Fother%2FtilesAtHome%2FUpload%2FRun%2Findex.php%403748&old=sites%2Fother%2FtilesAtHome%2FUpload%2FRun%2Findex.php%403745
> 
> This should speed up tiles updating. Next on my list is to move the 
> mySQL database files to an idle disk and improve the munin stats for 
> processed tiles.
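For anyone wanting to see the difference between the two statements quoted above, here is a minimal sketch. It uses SQLite as a stand-in (SQLite spells the upsert "ON CONFLICT ... DO UPDATE" where MySQL spells it "ON DUPLICATE KEY UPDATE"), and the column names are invented, not the real t@h schema:

```python
import sqlite3

# Hypothetical miniature of the tiles_blank table; columns are illustrative.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE tiles_blank (x INTEGER, y INTEGER, z INTEGER, "
            "type INTEGER, PRIMARY KEY (x, y, z))")
con.execute("INSERT INTO tiles_blank VALUES (1, 2, 12, 1)")

# REPLACE deletes the old row and inserts a fresh one -- two write
# operations plus index maintenance, even when nothing actually changed.
con.execute("REPLACE INTO tiles_blank VALUES (1, 2, 12, 1)")

# An upsert updates the existing row in place instead.
con.execute("INSERT INTO tiles_blank VALUES (1, 2, 12, 2) "
            "ON CONFLICT(x, y, z) DO UPDATE SET type = excluded.type")

row = con.execute("SELECT type FROM tiles_blank "
                  "WHERE x = 1 AND y = 2 AND z = 12").fetchone()
print(row[0])  # the upsert changed type in place
```

The win is that the upsert avoids the delete-then-insert churn of REPLACE on rows that already exist, which is the common case when re-uploading tiles.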

Just to document it here, I mentioned a hashing idea to avoid the 
tiles_blank query altogether when nothing will be updated: we build a 
hash of a representation of the tile statuses (land, sea, regular?) in 
the tileset, and store this hash as a field on every z12 tileset.

The next time we want to update tiles_blank, we check this hash first, 
and if it matches it is extremely likely we don't actually need to 
update anything. This keeps the "seaness" intact, but possibly not the 
other metadata. I started implementing it this morning; the problem is I 
don't have any connectivity on my train ride anymore, so it's going slowly.
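Roughly what I have in mind, as a sketch (the status representation and field names here are made up for illustration; the real code would hash whatever canonical form we settle on):

```python
import hashlib

def tileset_hash(statuses):
    """Hash a z12 tileset's per-tile statuses (e.g. 'land'/'sea'/'regular').

    The serialization must be canonical -- here, sorted by tile
    coordinate -- so the same tileset always yields the same hash.
    """
    blob = ";".join(f"{x},{y}:{s}" for (x, y), s in sorted(statuses.items()))
    return hashlib.md5(blob.encode()).hexdigest()

# stored_hash would normally live in a column on the z12 tileset row.
stored_hash = tileset_hash({(0, 0): "sea", (0, 1): "land"})

# On the next upload, compare before touching tiles_blank at all.
incoming = {(0, 0): "sea", (0, 1): "land"}
if tileset_hash(incoming) == stored_hash:
    pass  # skip the query -- contents almost certainly unchanged
```

"Extremely likely" rather than certain because of hash collisions, but with a 128-bit digest an accidental match is not a practical concern.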


I was also planning to set up a benchmarking framework with a few 
"standard" tilesets and sequences of uploads, to see how various changes 
affect overall performance across reasonably composed upload sequences. 
(I will have to guess at which sorts of uploads are common.)
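The harness could be quite small; something along these lines, where the upload handler and sequence contents are placeholders for replayed t@h uploads:

```python
import time

def benchmark(upload_handler, sequences, repeats=3):
    """Time an upload handler against canned upload sequences.

    Returns the best-of-N wall time per sequence, which damps noise
    from other activity on the machine.
    """
    results = {}
    for name, uploads in sequences.items():
        results[name] = min(_timed(upload_handler, uploads)
                            for _ in range(repeats))
    return results

def _timed(handler, uploads):
    start = time.perf_counter()
    for upload in uploads:
        handler(upload)
    return time.perf_counter() - start

# Example: a do-nothing handler over two invented sequences.
timings = benchmark(lambda u: None,
                    {"mostly-sea": list(range(100)),
                     "mixed": list(range(50))})
```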

---
John Bäckstrand



