[Tilesathome] Proposal: New T at H Server structure

Sun Jun 1 09:00:22 BST 2008

Hi There,

I have been thinking quite a bit lately about how to speed up the tiles at home
upload, and I've come up with a completely new structure for the server system.

As stated here, the main bottleneck at the moment is the disk I/O on the
main tile storage. There is no trivial and cheap way to fix this problem
with the current structure.

OK, here is my plan.

The T at H server (software) will be split into 3 parts.

1. Database (MySQL)
2. Reflector (Apache or Squid)
3. TileServer (plain Apache)

Each of the three can exist multiple times.

The Database holds a table in which the location of each tile is stored.
There is one primary database (read/write) and there can be n slave
databases (read-only) for performance and distribution.
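
A minimal sketch of what such a location table could look like. The table
name, the column names and the example host names are assumptions for
illustration only, and sqlite3 stands in for MySQL so that the sketch runs
on its own (in MySQL the upsert would be written as REPLACE INTO or
INSERT ... ON DUPLICATE KEY UPDATE).

```python
import sqlite3

# Hypothetical schema: one row per tile, pointing at the TileServer and
# the directory on that server which currently hold the tile.
SCHEMA = """
CREATE TABLE tile_location (
    layer   TEXT    NOT NULL,
    z       INTEGER NOT NULL,
    x       INTEGER NOT NULL,
    y       INTEGER NOT NULL,
    server  TEXT    NOT NULL,
    path    TEXT    NOT NULL,
    PRIMARY KEY (layer, z, x, y)
)
"""


def open_db():
    """Create an in-memory database with the location table."""
    db = sqlite3.connect(":memory:")
    db.execute(SCHEMA)
    return db


def set_location(db, layer, z, x, y, server, path):
    """Record where a tile lives (this write would go to the primary)."""
    db.execute(
        "INSERT OR REPLACE INTO tile_location VALUES (?, ?, ?, ?, ?, ?)",
        (layer, z, x, y, server, path),
    )
    db.commit()


def get_location(db, layer, z, x, y):
    """Return (server, path) for a tile, or None if the tile is unknown.

    This read could be answered by any read-only slave.
    """
    return db.execute(
        "SELECT server, path FROM tile_location"
        " WHERE layer = ? AND z = ? AND x = ? AND y = ?",
        (layer, z, x, y),
    ).fetchone()
```

Because the tile coordinates form the primary key, re-uploading a tile simply
replaces its row, which is what leaves the old file unreferenced.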

There is at least one Reflector (more can be used to distribute load) and
at least one TileServer (more can be used for load distribution).

The reflector redirects Tile Requests to the TileServer.

1. A client requests a tile from T at H

One of the Reflector servers (chosen by round robin or by some
intelligent code in the OSM JavaScript code) gets the request, looks the
tile up in the Database, and returns the location of the tile via a
redirect to the client, which then connects to the indicated TileServer
and gets the tile there.
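
The Reflector step could be sketched roughly as follows. The request path
layout (/layer/z/x/y.png), the layout of the redirect target and the
function name reflect are assumptions, and a plain dict stands in for the
database lookup.

```python
def reflect(request_path, locations):
    """Return (status, headers) for a tile request.

    locations is a stand-in for the database: it maps (layer, z, x, y)
    to (TileServer base URL, directory on that server).
    """
    parts = request_path.strip("/").split("/")
    if len(parts) != 4 or not parts[3].endswith(".png"):
        return 400, {}
    try:
        layer = parts[0]
        z, x, y = int(parts[1]), int(parts[2]), int(parts[3][:-4])
    except ValueError:
        return 400, {}

    hit = locations.get((layer, z, x, y))
    if hit is None:
        return 404, {}

    server, directory = hit
    url = "%s/%s/%s/%d/%d/%d.png" % (server, directory, layer, z, x, y)
    # The client follows the redirect and fetches the tile from the
    # TileServer itself, so the Reflector never touches tile data.
    return 302, {"Location": url}
```

The Reflector only does one lookup and sends one small response per request,
which is why it can stay cheap even when the tile storage is busy.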

2. A T at H client uploads a tile package

The T at H client requests an upload from a web service on one of the
Database servers. The web service responds with an upload cookie and the
address of one of the TileServers that is able to store more tiles.
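
A rough sketch of that web service call. The function name request_upload,
the free-space bookkeeping and the "most free space wins" rule are
assumptions; the proposal only says that the chosen TileServer must be able
to store more tiles.

```python
import secrets


def request_upload(tile_servers, needed_bytes):
    """Pick a TileServer for an upload and issue an upload cookie.

    tile_servers maps a TileServer base URL to its free space in bytes
    (hypothetical bookkeeping kept next to the database).
    Returns (cookie, server), or None if no server can take the upload.
    """
    candidates = [s for s, free in tile_servers.items() if free >= needed_bytes]
    if not candidates:
        return None
    # Assumed policy: prefer the server with the most free space.
    server = max(candidates, key=lambda s: tile_servers[s])
    # Random token the TileServer can later use to recognise the upload.
    cookie = secrets.token_hex(16)
    return cookie, server
```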

The T at H client starts uploading its tiles to this server and goes on
with the next render request.

The TileServer extracts the tiles into a new directory on the server and
updates the location of the uploaded tiles in the main Database.
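
On the TileServer side, this could look roughly like the sketch below. It
assumes the package is a zip file of .png tiles, and update_location is a
hypothetical callback standing in for the write to the main Database.

```python
import os
import tempfile
import zipfile


def store_upload(zip_path, storage_root, update_location):
    """Extract an uploaded package into a fresh directory.

    update_location(tile_name, directory) is called once per stored tile
    and stands in for the update of the main Database.
    Returns the newly created directory.
    """
    # mkdtemp always creates a new, unique directory for this upload.
    directory = tempfile.mkdtemp(prefix="upload-", dir=storage_root)
    with zipfile.ZipFile(zip_path) as package:
        for name in package.namelist():
            # Accept only plain .png file names, so a package cannot
            # place files outside its own directory.
            if os.path.basename(name) != name or not name.endswith(".png"):
                continue
            package.extract(name, directory)
            update_location(name, directory)
    return directory
```

Since every upload lands in its own directory, existing tiles are never
overwritten in place; the old copy just stops being referenced.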

Each TileServer runs a garbage collection at some interval to find
tiles that are no longer referenced in the Database and delete them.
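
A minimal sketch of such a garbage collection pass. The function name
collect_garbage is an assumption, and the set referenced stands in for the
result of a database query listing the tiles this TileServer should hold.

```python
import os


def collect_garbage(storage_root, referenced):
    """Delete files under storage_root that are not in referenced.

    referenced is a set of paths relative to storage_root, written with
    forward slashes. Returns the sorted list of deleted relative paths.
    """
    deleted = []
    for dirpath, _dirnames, filenames in os.walk(storage_root):
        for filename in filenames:
            full = os.path.join(dirpath, filename)
            rel = os.path.relpath(full, storage_root).replace(os.sep, "/")
            if rel not in referenced:
                os.remove(full)
                deleted.append(rel)
    return sorted(deleted)
```

A real implementation would probably also have to skip files that were
uploaded very recently, so that a tile whose location has not yet reached
the Database is not deleted by mistake.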

Migration:

The migration from the running system to the new system can be done
without service interruption. The database is first filled with the
tile locations of the old system. As tiles get uploaded to the new
TileServer, traffic will be redirected to the new TileServer. At some
point we can simply shut down the old system. The same goes for the T at H
clients.
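
The first migration step could be sketched like this, assuming the old
system keeps its tiles as .png files in a directory tree (an assumption
about the old storage, as is the function name). Every existing tile gets
a row that points at the old server.

```python
import os


def seed_from_old_system(old_root, old_server):
    """Yield (relative tile path, server) for every tile in the old storage.

    The rows would be inserted into the location table, so that the
    Reflector keeps sending requests to the old system until a newer
    upload replaces the row.
    """
    for dirpath, _dirnames, filenames in os.walk(old_root):
        for filename in sorted(filenames):
            if filename.endswith(".png"):
                full = os.path.join(dirpath, filename)
                rel = os.path.relpath(full, old_root).replace(os.sep, "/")
                yield rel, old_server
```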

Positives:
- More distributed
- No single point of failure
- More scalable (simply add more servers to the system)
- People can donate disk space and traffic in smaller parts (given that
the needed bandwidth is there; no TileServer at Home :-) ).
- Simple migration
- No change in frontend code (the URLs used for tiles will not change)

Negatives:
- More complexity
- Minor changes for T at H clients

So please think this proposal over and give me some comments.

And yes, I will contribute to designing and coding this system. At the
moment I'm doing the database design for the main database. The
reflector code will be next, and the upload code and garbage collection
will be last, as they are the most complex parts.

Regards
	Estartu

-- 
-------------------------------------------------
Gerhard Schmidt       | E-Mail: schmidt at ze.tum.de
TU-München	      |
WWW & Online Services |
Tel: 089/289-25270    |
Fax: 089/289-25257    | PGP public key on request