[OSM-dev] New Replication Functionality

Mon May 26 14:13:37 BST 2008

Hi All,

I've just released osmosis 0.27 which adds some additional support for 
downloading and applying changesets from the planet server.

As you may or may not be aware, there are currently three sets of
changesets available on the planet server: daily, hourly and minute.

I have added two new tasks to osmosis for consuming these files.  These are:
--read-change-interval - This provides the ability to download a set of 
changes, merge them into a single changeset and pass the data to the 
next task in the pipeline.  It maintains a working directory where it 
stores its configuration file and tracks the latest downloaded timestamp.
--read-change-interval-init - This is necessary to initialise the 
working directory used by the --read-change-interval task.

I'll illustrate using an (untested) example.  This example will keep an 
xml file containing a small bounding box up-to-date.
1. Create the working directory with a timestamp initialised to the 
start of today.
$ osmosis --read-change-interval-init "myworkingdirectory" \
    "2008-05-26_00:00:00"

2. Edit the config file.
The configuration file will contain suitable defaults for downloading 
the hourly changes and hopefully won't require any modifications.

3. Extract the area of the world that you're interested in, using a
planet that is newer than the start of today (the original source must
contain data later than the timestamp you initialised the working
directory with).
$ osmosis --rx planet.osm.bz2 --bb left=0 top=10 right=10 bottom=0 \
    --wx myextract.osm

4. Apply the latest changes to the extract, produce an updated extract,
and rename the new file to the old name.  The error code returned by
osmosis should be checked before running the mv.
$ osmosis --read-change-interval "myworkingdirectory" --rx myextract.osm \
    --ac --bb left=0 top=10 right=10 bottom=0 --wx myextract-updated.osm
$ mv myextract-updated.osm myextract.osm
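
Step 4 could be wrapped in a small shell function so that the old
extract is only replaced when osmosis exits successfully.  This is an
untested sketch: the function name update_extract and the variables
OSMOSIS, WORKDIR, EXTRACT and UPDATED are my own illustrative names, not
part of osmosis, and it assumes osmosis returns a non-zero exit code on
failure.

```shell
#!/bin/sh
# Sketch only: all names below are illustrative, not part of osmosis.
OSMOSIS="${OSMOSIS:-osmosis}"              # command used to run osmosis
WORKDIR="${WORKDIR:-myworkingdirectory}"   # directory created in step 1
EXTRACT="${EXTRACT:-myextract.osm}"        # extract produced in step 3
UPDATED="${EXTRACT%.osm}-updated.osm"      # temporary output file

update_extract() {
    # Download and merge the pending changes, apply them to the
    # extract, re-crop to the bounding box and write a new file.
    "$OSMOSIS" --read-change-interval "$WORKDIR" \
        --rx "$EXTRACT" \
        --ac \
        --bb left=0 top=10 right=10 bottom=0 \
        --wx "$UPDATED"
    status=$?

    if [ "$status" -ne 0 ]; then
        # Assumed failure signal: non-zero exit code.  Keep the old
        # extract and discard any partial output.
        echo "osmosis failed with exit code $status; keeping $EXTRACT" >&2
        rm -f "$UPDATED"
        return "$status"
    fi

    # Success: replace the old extract with the updated one.
    mv "$UPDATED" "$EXTRACT"
}
```

Calling update_extract from the directory containing the extract and
the working directory performs one update.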

Some limitations exist:
Currently only the hourly and minute changesets are supported, because
the daily changesets use bzip2 encoding.  I should fix this soon, but it
will take a little rework in the task design, which currently hard-codes
gzip encoding.
There is no limit on the number of files downloaded.  I should add a
configuration option to limit the number of files (and subsequent
processing threads) processed in a single invocation.

I hope people find it useful.  Any feedback appreciated.

Cheers,
Brett