[osmosis-dev] Replication Changes
Brett Henderson
brett at bretth.com
Sat Nov 21 00:03:17 GMT 2009
On Fri, Nov 20, 2009 at 5:15 PM, Lars Francke <lars.francke at gmail.com> wrote:
> Once again: Thanks for all your work on this!
> After taking a stab at it myself I certainly have a new appreciation
> for what you've done.
>
Hehe, I started doing this in 2006 I think and thought it'd be done and
dusted in a few months. 3 years later and I'm still doing it ...
>
> > The "history" diffs are in the process of being generated and are well
> > through 2008 as we speak. These are effectively daily diffs but aren't
> > getting deleted on a rolling window basis. This is effectively creating
> a
> > full history dump of the database. This has been in the wings for a
> while,
> > but only possible now that there is some more disk space available.
> These
> > are still timestamp based extracts due to transaction id queries being
> > useless for historical queries. As a result of the use of timestamps,
> these
> > will be run with a large delay to avoid missing data. I'll probably set
> > this delay to 1 day to be safe, but perhaps a couple of hours would be
> > enough.
>
> The first few years' worth of history diffs have been created using
> the "old" Osmosis version. So is it possible that they are missing a
> few transactions too? (As a result of the "off-by-one" bug.)
>
The off-by-one was in the transaction id calculation code, which is only used
for the minute-replicate diffs. The history diffs are generated using the
older-style timestamp range queries, which shouldn't have the same problem.
But there could certainly be bugs in the history diffs, so let me know if you
see anything.
>
> > Moving away from a file-based distribution approach has serious
> > implications for reliability in the face of server and network
> > outages, cacheability, bandwidth consumption, and server resource
> > usage. As a result, the existing approach is likely to represent the
> > state of the art in the near to medium future. We need to stabilise
> > the existing features before attempting new ones :-)
>
> I thought about replication over PubSubHubbub, which should take care
> of bandwidth, cacheability, server resource usage (with fat pings) and
> a few other problems. But I've done no work on it yet or even thought
> it through. It just seemed like a fitting concept for the type (or
> _one_ of the types) of replication we need.
> The MusicBrainz project is facing much the same problem as we are and
> they're using a very similar solution
> (http://musicbrainz.org/doc/Replication_Mechanics).
>
The MusicBrainz replication scheme (on first read) sounds pretty similar to
what we're doing now. In other words, the client/slave tracks which
sequence number it has reached and downloads replication files until it
reaches the current point. One difference is that they're replicating
between identical schemas whereas Osmosis is more general, but the idea seems
to be the same.
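To make that concrete, the pull-based catch-up on the client side boils down
to something like the sketch below (in Java; the base URL, file names and
local state file are invented for illustration, not the real published
layout):

// Minimal sketch of a pull-based catch-up loop: track the last applied
// sequence locally and download numbered diff files until caught up.
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.net.URL;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class CatchUpClient {
    // Hypothetical location of the sequence-numbered replication files.
    private static final String BASE_URL = "http://example.org/replication/";

    public static void main(String[] args) throws Exception {
        Path stateFile = Paths.get("local.sequence");
        long localSeq = Long.parseLong(Files.readAllLines(stateFile).get(0).trim());
        long serverSeq = fetchServerSequence();

        while (localSeq < serverSeq) {
            long next = localSeq + 1;
            // Download the next diff and apply it to the local data store.
            try (InputStream diff = new URL(BASE_URL + next + ".osc.gz").openStream()) {
                applyDiff(diff);
            }
            // Record progress only after the diff has been applied.
            localSeq = next;
            Files.write(stateFile, Long.toString(localSeq).getBytes());
        }
    }

    private static long fetchServerSequence() throws IOException {
        // Read the server's "current sequence" marker (illustrative file name).
        try (BufferedReader r = new BufferedReader(
                new InputStreamReader(new URL(BASE_URL + "state.txt").openStream()))) {
            return Long.parseLong(r.readLine().trim());
        }
    }

    private static void applyDiff(InputStream diff) {
        // Placeholder: feed the change stream into the local database import.
    }
}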
As for publish/subscribe mechanisms, I'm less sure. There are a few things I
wish to achieve in order to maximise fault tolerance and promote loose
coupling between systems:
1. Zero administration per client on the server side. In other words, I
don't wish to have to perform setup per client on the server. A possible
exception is authentication.
2. Zero state managed per client on the server side. To maximise
scalability and minimise administration, I'd rather all per-client state be
managed on the client side.
3. Clients must be able to re-sync after a network, client or server outage
without gaps in the data. To do this they need to be able to ask the
server to start sending data from a specific point, with the server limiting
how far back it will allow.
Point 3 is the most problematic from a pub/sub perspective because most
pub/sub mechanisms have a single server publishing updates and
already-subscribed clients receiving them. It is hard for a client to re-sync
from a known point if it has missed updates. I'd rather the server not have
to know which updates each client has received, and track that on the client
side instead.
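Roughly, I'd picture the client side looking something like the sketch below,
with the last committed sequence persisted locally and supplied on every
(re)connect so the server never has to remember anything about the client.
The host, port and "START-FROM" request line are purely illustrative:

// Sketch of keeping all replication state on the client side (goals 1-3).
import java.io.PrintWriter;
import java.net.Socket;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class ClientStateExample {
    public static void main(String[] args) throws Exception {
        Path stateFile = Paths.get("committed.sequence");
        long lastCommitted = Files.exists(stateFile)
                ? Long.parseLong(Files.readAllLines(stateFile).get(0).trim())
                : -1L;

        // Hypothetical replication endpoint.
        try (Socket socket = new Socket("replication.example.org", 8080);
             PrintWriter out = new PrintWriter(socket.getOutputStream(), true)) {

            // Ask the server to resume from the point this client last committed.
            out.println("START-FROM " + (lastCommitted + 1));

            // ... then receive diffs, commit each one to the local store, and
            // only afterwards write its sequence number back to stateFile so a
            // crash merely causes that sequence to be requested again.
        }
    }
}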
It may be necessary to write a server app from scratch. It could run
regular extracts from the db (e.g. every 10 seconds or so) but not publish
them publicly. The existing minute-replicate process could switch to
consuming these extracts and roll them into minute chunks. The server app
would be multi-threaded, with a master thread retrieving updates from the db
and notifying client-specific threads when new data is available. Each
client-specific thread would begin sending data from the point the client
requests when it first connects, then push subsequent updates to the client
as it is notified of each new extract. The client would be responsible for
tracking which sequence it had successfully received and committed to its
output data store. The server would wrap all diffs in a replication XML
structure providing the timestamp the change represents and the current
replication sequence number.
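In rough Java terms, the threading shape might look like the sketch below
(all class and method names are invented for illustration, and the XML
wrapping is only hinted at in a comment):

// One master thread polls the db and fans notifications out to per-client
// sender threads; each sender streams from its own position and never holds
// up the master.
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

public class ReplicationServer {

    // Per-client sender: wakes up whenever the master announces a new extract.
    static class ClientSender implements Runnable {
        private final long startSequence;
        private long latestAvailable;

        ClientSender(long startSequence) {
            this.startSequence = startSequence;
            this.latestAvailable = startSequence - 1;
        }

        synchronized void notifyNewExtract(long sequence) {
            latestAvailable = sequence;
            notify();
        }

        @Override
        public void run() {
            long next = startSequence;
            try {
                while (true) {
                    synchronized (this) {
                        while (next > latestAvailable) {
                            wait(); // block until the master signals a new extract
                        }
                    }
                    // Send outside the lock, so the master never waits on client I/O.
                    sendToClient(next);
                    next++;
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }

        private void sendToClient(long sequence) {
            // Placeholder: stream the diff for this sequence, wrapped in an XML
            // envelope carrying the sequence number and the timestamp it represents.
        }
    }

    private final List<ClientSender> clients = new CopyOnWriteArrayList<ClientSender>();

    void register(ClientSender sender) {
        clients.add(sender);
        new Thread(sender).start();
    }

    // Master loop: the only code touching the production database.
    void masterLoop() throws InterruptedException {
        long sequence = 0;
        while (true) {
            extractFromDatabase(sequence);    // e.g. every 10 seconds
            for (ClientSender c : clients) {
                c.notifyNewExtract(sequence); // a quick notification, no client I/O
            }
            sequence++;
            Thread.sleep(10000);
        }
    }

    private void extractFromDatabase(long sequence) {
        // Placeholder: run the extract query and stage the result for senders.
    }
}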
With the above approach, the master thread would be the only process
accessing the production db, and it would never block based on client
activity. Clients could begin processing from any point (within limits),
allowing them to load their database to a point with planet+day+hour diffs
and then continue from that point. If clients lost connectivity for some
time, they could resume where they left off, unless they'd fallen outside the
maximum re-sync window, in which case they'd have to catch up via normal
diffs. The client threads could be placed in a pool that caps the number of
clients at a sensible level. Connections could require a user ID if
necessary to limit the number of consumers using this mechanism. If large
numbers of consumers started using this, it would be possible to cascade
these replication systems in a hierarchy for scalability.
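The admission side might be as simple as the sketch below (the pool size,
window size and names are illustrative only):

// A bounded pool caps concurrent clients; a re-sync window check turns away
// clients that have fallen too far behind.
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.SynchronousQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class ConnectionAdmission {
    private static final int MAX_CLIENTS = 50;          // assumed sensible limit
    private static final long MAX_RESYNC_WINDOW = 8640; // roughly a day of 10-second extracts

    // Rejects new tasks outright once MAX_CLIENTS senders are running.
    private final ThreadPoolExecutor clientPool = new ThreadPoolExecutor(
            MAX_CLIENTS, MAX_CLIENTS, 0L, TimeUnit.MILLISECONDS,
            new SynchronousQueue<Runnable>(), new ThreadPoolExecutor.AbortPolicy());

    /**
     * Accepts a client only if its requested start point is still inside the
     * retention window; otherwise it must first catch up via the normal diffs.
     */
    boolean admit(long requestedSequence, long currentSequence, Runnable senderTask) {
        if (currentSequence - requestedSequence > MAX_RESYNC_WINDOW) {
            return false; // too far behind: catch up with planet+day+hour diffs first
        }
        try {
            clientPool.submit(senderTask);
            return true;
        } catch (RejectedExecutionException e) {
            return false; // pool is full
        }
    }
}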
I don't think code complexity would be terribly high. The most difficult
part is the threading aspect of the server, but it would be simpler than
Osmosis itself in many respects because the threads are mostly independent.
Having said all that, I do have a tendency to reinvent that which is already
available, so perhaps this has already been solved elsewhere :-)
One fairly major consideration is what impact this type of system would have
on OSM infrastructure. It is likely to be more error-prone and require more
maintenance than the existing approach. The push mechanism and the
elimination of network "chattiness" also make it uncacheable, which has
implications for bandwidth consumption.
Anyway, that's a dump of my thoughts. I certainly won't implement anything
in the near future so feel free to have a play and see what you can come up
with.
Brett