[OSM-dev] Database Schema (was Re: [OSM-talk] Oh dear - planet has duplicate id's)
frederik at remote.org
Sat Jun 23 23:09:22 BST 2007
> I've had a look through the schema and have a series of proposals to
> clean things up, which (on a table by table basis) look like:
It's great to see something happen here; I believe a lot of
performance can be squeezed out of mysql if it is done right.
I know you're just tying up loose ends and cleaning up, and you're
planning enhancements for later, but I cannot resist the temptation to
voice, once again, my two major enhancement wishes:
Move the individual "versioning" tags (whodunnit + when) per object to a
"change group" table. This table contains change group id, user id,
timestamp, client software used, and optional comment. All objects then
just refer to the change group id by which they were created (saving a
lot of space). Enhance the API so that all CRUD requests may include an
optional change group id, and add a "create change group" call that
returns a new change group id. (Possibly also "close change group",
dunno.) Initially, using change groups could be optional (one could be
created on-the-fly if an uploaded object didn't have one), but as
clients take up the idea, they should soon become required.
This will make it so much easier to spot problems, display good version
histories (including the user's comment, which could be sent when
creating the change group - like an SVN commit message), and also allow
selective rollbacks if things go awry.
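To make the idea concrete, here is a minimal sketch of what the change
group table and its use could look like - all table and column names
(change_groups, nodes, etc.) are invented for illustration, not taken
from the actual OSM schema:

```python
import sqlite3

# Sketch only: a change group stores whodunnit/when/client/comment once,
# and every object just carries a change_group_id foreign key.
db = sqlite3.connect(":memory:")
db.executescript("""
CREATE TABLE change_groups (
    id        INTEGER PRIMARY KEY,
    user_id   INTEGER NOT NULL,
    created   TEXT    NOT NULL,   -- timestamp
    client    TEXT,               -- client software used
    comment   TEXT                -- optional, like an SVN commit message
);
CREATE TABLE nodes (
    id              INTEGER PRIMARY KEY,
    lat             REAL NOT NULL,
    lon             REAL NOT NULL,
    change_group_id INTEGER NOT NULL REFERENCES change_groups(id)
);
""")

# One change group shared by many objects: user, timestamp and comment
# are stored once instead of per object.
db.execute("INSERT INTO change_groups VALUES "
           "(1, 42, '2007-06-23T23:09:22', 'JOSM', 'fixed motorway junction')")
db.execute("INSERT INTO nodes VALUES (1001, 49.0015, 8.3888, 1)")
db.execute("INSERT INTO nodes VALUES (1002, 49.0020, 8.3890, 1)")

# The version history for any object can then show the commit-style comment.
row = db.execute("""
    SELECT c.comment
    FROM nodes n JOIN change_groups c ON n.change_group_id = c.id
    WHERE n.id = 1001
""").fetchone()
print(row[0])  # prints "fixed motorway junction"
```

A rollback then becomes "delete/revert everything with this
change_group_id", which is exactly the selective-rollback case above.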
Devise a way for CRUD messages to be passed to other servers in
real time (or near real time), based on a "subscription list" that may
(at a later stage perhaps) also be filtered. Set up one other server
initially that receives all these messages and uses them to keep an
identical copy of the "current" tables. Start serving all "read"
requests from that database instance, and also select a small number of
off-site mirror servers to receive a "full feed" (they, in turn, may
sub-distribute data of course).
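The trickle-down idea can be sketched in a few lines - the message
format and class names here are made up for illustration; the real feed
would of course go over the network rather than in-process:

```python
# Sketch of a "trickle-down" tree: each server applies CRUD messages to
# its local copy of the "current" tables, then forwards them to its own
# subscribers, who may sub-distribute in turn.

class MirrorNode:
    def __init__(self):
        self.objects = {}      # local copy of the current data
        self.subscribers = []  # downstream mirrors

    def subscribe(self, other):
        self.subscribers.append(other)

    def apply(self, msg):
        op, oid = msg["op"], msg["id"]
        if op in ("create", "update"):
            self.objects[oid] = msg["data"]
        elif op == "delete":
            self.objects.pop(oid, None)
        # pass the message down the tree
        for sub in self.subscribers:
            sub.apply(msg)

master = MirrorNode()          # the write master
mirror = MirrorNode()          # serves all read requests
leaf = MirrorNode()            # e.g. somebody's live mapnik renderer
master.subscribe(mirror)
mirror.subscribe(leaf)

master.apply({"op": "create", "id": 1, "data": {"lat": 49.0, "lon": 8.39}})
master.apply({"op": "delete", "id": 1})
print(leaf.objects)  # prints {} - the delete trickled down two levels
```

The point of the tree shape is that the master only ever talks to a
handful of direct subscribers, no matter how many leaves exist.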
This will take a lot of strain off the central database, which becomes
the write-only master; distribution of current data through an
arbitrary-size "trickle-down" tree will make data available where it is
needed, creating really cool opportunities like live mapnik rendering -
and best of all: *Everybody* who wants it can set up their own live
mapnik rendering engine, since you'll always find a node in the tree to
connect to.
Frederik Ramm ## eMail frederik at remote.org ## N49°00.09' E008°23.33'