[OSM-dev] Daily Diff Merging
Brett Henderson
brett at bretth.com
Sat Oct 20 06:38:36 BST 2007
I've timed applying the latest daily changeset to produce a new daily
snapshot.
time osmosis --read-xml file=snapshot-20071019.osm --read-xml-change
file=2007-10-19--10-20-00\:01\:02.osc --apply-change --write-xml
file=snapshot-20071020.osm
real 63m7.739s
user 46m45.776s
sys 8m14.367s
I'm not sure I can improve this speed much further: the merge process
involves several threads communicating via shared buffers, which adds
synchronisation overhead. Just over an hour to produce a new planet
isn't too bad, though. Of course, this doesn't include compression,
which is the real performance bottleneck.
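For anyone who wants to compare the timing figures directly, here is a
rough shell sketch that converts them to seconds and works out how much
of the wall-clock time was spent on CPU. The helper name to_seconds is
made up for this example, and the result is only indicative: a CPU share
below 100% is consistent with threads waiting (on I/O or on each other),
but it doesn't show where the time actually goes.

```shell
#!/bin/sh
# Hypothetical helper: convert a `time` value such as 63m7.739s to seconds.
to_seconds() {
    echo "$1" | awk -F'[ms]' '{ printf "%.3f\n", $1 * 60 + $2 }'
}

# Figures copied from the `time` output above.
real=$(to_seconds 63m7.739s)
user=$(to_seconds 46m45.776s)
sys=$(to_seconds 8m14.367s)

# CPU time (user + sys) as a share of wall-clock time.
awk -v r="$real" -v u="$user" -v s="$sys" \
    'BEGIN { printf "cpu %.3fs of %.3fs wall (%.1f%%)\n", u + s, r, (u + s) / r * 100 }'
```

With the numbers above this prints "cpu 3300.143s of 3787.739s wall (87.1%)".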
Unless further problems are discovered, this is ready for use by anybody
who wants up-to-date, consistent daily snapshots of OSM data.
Osmosis can also apply these changesets directly to a MySQL database,
avoiding the need to create new planet files. I haven't tested the
performance of that with recent data, though.
Several people have already started to play with this, but if anybody
has any questions or issues, please shout.
Documentation is available at:
http://wiki.openstreetmap.org/index.php/Osmosis
Any documentation improvements or additional examples are welcome.
Some possible ways to take this further:
* Create scripts to automate downloading daily changesets and patching
local planet files or databases.
* Integrate into the mapnik infrastructure to provide more up-to-date maps.
* Integrate into the tiles at home infrastructure to improve the
scalability of rendering.
* Decrease the time interval between changesets to provide more
up-to-date data.
* Update the planet creation process to include user information, so
that more complete database replicas can be created.
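As a starting point for the first idea, here is a minimal sketch of a
shell function that assembles the osmosis command line for a base
snapshot and any number of change files. It only uses the tasks and
arguments shown elsewhere in this mail; the function name
build_osmosis_cmd is made up for this example, downloading is not
covered, and the command is printed rather than run so it can be checked
first. It assumes file names without spaces.

```shell
#!/bin/sh
# Hypothetical helper: print an osmosis command that applies each change
# file, in the order given, to a base snapshot.
# Usage: build_osmosis_cmd BASE.osm OUT.osm CHANGE.osc[.bz2]...
build_osmosis_cmd() {
    base=$1
    out=$2
    shift 2
    cmd="osmosis --read-xml file=$base"
    for change in "$@"; do
        cmd="$cmd --read-xml-change file=$change"
        # Compressed diffs need the compression method stated explicitly.
        case $change in
            *.bz2) cmd="$cmd compressionMethod=bzip2" ;;
        esac
        cmd="$cmd --apply-change"
    done
    echo "$cmd --write-xml file=$out"
}

build_osmosis_cmd snapshot-20071019.osm snapshot-20071020.osm \
    '2007-10-19--10-20-00:01:02.osc'
```

Passing several change files produces one --read-xml-change and
--apply-change pair per file, so a few days of diffs can be applied in a
single pass over the planet.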
All thoughts welcome.
Cheers,
Brett
Brett Henderson wrote:
> The latest osmosis v0.20 appears to be working correctly.
>
> I used planetdiff to patch up to the latest planet-071017.osm, which
> is 17,917,842,715 bytes.
> I then downloaded the last three daily diffs.
> 2007-10-16--10-17-00:01:01.osc.bz2
> 2007-10-17--10-18-00:01:02.osc.bz2
> 2007-10-18--10-19-00:01:02.osc.bz2
>
> I applied the last three daily diffs as follows:
> osmosis --read-xml file=planet-071017.osm --read-xml-change
> file=2007-10-16--10-17-00\:01\:01.osc.bz2 compressionMethod=bzip2
> --apply-change --read-xml-change
> file=2007-10-17--10-18-00\:01\:02.osc.bz2 compressionMethod=bzip2
> --apply-change --read-xml-change
> file=2007-10-18--10-19-00\:01\:02.osc.bz2 compressionMethod=bzip2
> --apply-change --write-xml file=snapshot-20071019.osm
>
> I now have a snapshot-20071019.osm file that is 18,406,415,738 bytes.
> The dataset is growing fast!