[OSM-dev] Planet diff's revisited
Sebastian Spaeth
Sebastian at SSpaeth.de
Thu Jul 26 07:03:13 BST 2007
Jon Burgess wrote:
> the planetdiff I wrote works at the OSM object level but it does this
> entirely on a streaming model using the libxml2 SAX style parser. It is
> pretty fast. For every node/segment/way which is modified the XML
> contains the complete set of elements and attributes for the object.
> i.e. if a single <seg id=..\> is added to a way then you get both the
> old and new complete way appearing in the diff.
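To make the object-level behaviour concrete, here is a minimal Python sketch. It is not Jon's implementation: his tool streams through libxml2's SAX parser, while this version loads both files into memory for brevity. The function names and the tuple-based change records are assumptions made for illustration only.

```python
import xml.etree.ElementTree as ET

# Top-level OSM object types mentioned in the thread.
OBJECT_TAGS = ("node", "segment", "way")


def signature(elem):
    """Return a comparable, order-stable description of a complete object."""
    return (
        elem.tag,
        tuple(sorted(elem.attrib.items())),
        tuple(signature(child) for child in elem),
    )


def load_objects(xml_text):
    """Map (tag, id) to the signature of each complete object in a planet."""
    root = ET.fromstring(xml_text)
    return {
        (elem.tag, elem.get("id")): signature(elem)
        for elem in root
        if elem.tag in OBJECT_TAGS
    }


def object_diff(old_xml, new_xml):
    """List (action, key, old_object, new_object) for every changed object.

    A modified object is reported with both its complete old and complete
    new form, even if only one child element changed.
    """
    old = load_objects(old_xml)
    new = load_objects(new_xml)
    changes = []
    for key in sorted(old.keys() | new.keys()):
        before, after = old.get(key), new.get(key)
        if before == after:
            continue  # unchanged objects do not appear in the diff
        if before is None:
            changes.append(("add", key, None, after))
        elif after is None:
            changes.append(("delete", key, before, None))
        else:
            changes.append(("modify", key, before, after))
    return changes
```

Adding one <seg> to a way yields a single "modify" record carrying both complete versions of that way, which is why such a diff is larger than a line-based one but simple to apply.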
I manually created the diff from last week to this week using Jon's
planetdiff and put it on planet.osm (if anybody is interested).
Compressed with bzip2 it is a 25MB file, and it took about 60 minutes to
generate (with the server busy with other stuff at the same time). It
would be easy to auto-generate such a diff each week if there is
interest. In that case I would probably drop the 7z format to save disk
space rather than offer both archive types.
The latest complete planet.osm should of course always be available for
download. What would people say if I used that diff tool to drop some of
the older complete planets (say, every other one for now) and provided
the diffs for them instead?
You would need the planetpatch tool from Jon to recreate a complete
planet using diffs:
> $ planetpatch planet-<foo>.osm.bz2 diff-<bar> | osm2pgsql -
I won't be able to do daily diffs though, as I would still have to work
from the weekly planet dump Steve generates. I wonder if you could have
some MySQL queries (WHERE modified_ago < 24h) which would enable you to
do "diff dumps". But I am not going to look into that myself.
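As a rough illustration of the "modified in the last 24h" idea, here is a small Python sketch. Everything in it is an assumption: the table name `nodes`, the `timestamp` column, and the use of SQLite in place of the real MySQL database are chosen only to keep the example self-contained.

```python
import sqlite3
from datetime import datetime, timedelta


def make_demo_db(rows):
    """Create an in-memory database with a hypothetical `nodes` table.

    `rows` is a list of (id, timestamp) pairs, with timestamps stored as
    'YYYY-MM-DD HH:MM:SS' text so that string order equals time order.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE nodes (id INTEGER PRIMARY KEY, timestamp TEXT)")
    conn.executemany("INSERT INTO nodes VALUES (?, ?)", rows)
    return conn


def recently_modified(conn, now, window=timedelta(hours=24)):
    """Return ids of nodes whose timestamp falls within `window` before `now`."""
    cutoff = (now - window).isoformat(sep=" ", timespec="seconds")
    cursor = conn.execute(
        "SELECT id FROM nodes WHERE timestamp >= ? ORDER BY id", (cutoff,)
    )
    return [row[0] for row in cursor]
```

A query like this only sees rows that still exist, so how deletions would show up in a "diff dump" depends on how the real schema records them.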
What do people think? Would diffs be useful? Should we start providing
them? In addition to full planets or rather instead of some of the older
ones?
Thanks to Jon for creating the planetdiff tool though, in any case. It
is a nice piece of work (although the UTF8 sanitizer could probably be
removed now that we have proper UTF8 data).
spaetz