[OSM-dev] Planet diff's revisited

Sebastian Spaeth Sebastian at SSpaeth.de
Thu Jul 26 07:03:13 BST 2007


Jon Burgess wrote:
> the planetdiff I wrote works at the OSM object level but it does this
> entirely on a streaming model using the libxml2 SAX style parser. It is
> pretty fast. For every node/segment/way which is modified the XML
> contains the complete set of elements and attributes for the object.
> i.e. if a single <seg id=..\> is added to a way then you get both the
> old and new complete way appearing in the diff. 
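
For anyone who has not looked at the tool, here is a rough sketch of what
an object-level streaming diff like the one described above could look
like. This is not Jon's implementation. The (kind, id, attrs) tuples, the
'-'/'+' markers and the name object_diff are made up for illustration,
and it assumes both inputs are sorted by (kind, id) so that only one
object per side is in memory at a time.

```python
from typing import Dict, Iterable, Iterator, Tuple

# Hypothetical object shape: (kind, id, attrs), where attrs holds the
# complete set of attributes and children of the object.
Obj = Tuple[str, int, Dict]


def object_diff(old: Iterable[Obj], new: Iterable[Obj]) -> Iterator[Tuple[str, Obj]]:
    """Yield ('-', obj) for removed/old versions and ('+', obj) for added/new ones.

    Assumes both streams are sorted by (kind, id). A modified object is
    emitted twice: the complete old version, then the complete new version.
    """
    old_it, new_it = iter(old), iter(new)
    o, n = next(old_it, None), next(new_it, None)
    while o is not None or n is not None:
        if n is None or (o is not None and o[:2] < n[:2]):
            yield ("-", o)  # only in the old planet: deleted
            o = next(old_it, None)
        elif o is None or n[:2] < o[:2]:
            yield ("+", n)  # only in the new planet: created
            n = next(new_it, None)
        else:
            if o[2] != n[2]:  # same object, different content: modified
                yield ("-", o)
                yield ("+", n)
            o, n = next(old_it, None), next(new_it, None)
```

Note that adding one segment to a way produces two entries here, the full
old way and the full new way, which matches the behaviour Jon describes.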

I manually created the diff from last week to this week using Jon's
planetdiff and put it on planet.osm (if anybody is interested).
Compressed with bzip2, it is a 25MB file, and it took about 60 minutes
to generate (with the server being busy with other stuff at the same
time). It would be easy to auto-generate such a diff each week if there
is interest. In that case I would probably drop the 7z format to save
disk space rather than offering both archive types.

The latest complete planet.osm should of course always be available for 
download. What would people say if I used that diff tool to drop some of 
the older complete planets (say, every other for now) and provide the 
diffs for them instead?

You would need the planetpatch tool from Jon to recreate a complete 
planet using diffs:
> $ planetpatch planet-<foo>.osm.bz2 diff-<bar> | osm2pgsql -
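
To illustrate the idea of rebuilding a planet from an older planet plus a
diff, here is a small sketch. It says nothing about how planetpatch works
internally. It assumes a made-up diff format, a list of ('-', obj) and
('+', obj) entries over (kind, id, attrs) tuples, and assumes the base
planet is sorted by (kind, id). The name apply_diff is hypothetical.

```python
import heapq
from typing import Dict, Iterable, Iterator, Tuple

Obj = Tuple[str, int, Dict]  # hypothetical shape: (kind, id, attrs)


def apply_diff(base: Iterable[Obj], diff: Iterable[Tuple[str, Obj]]) -> Iterator[Obj]:
    """Stream the patched planet: base minus '-' entries plus '+' entries.

    The diff is assumed to be small enough to hold in memory, while the
    base planet is streamed. Output stays sorted by (kind, id).
    """
    removed = set()
    added = {}
    for op, obj in diff:
        key = obj[:2]
        if op == "-":
            removed.add(key)
        elif op == "+":
            added[key] = obj
        else:
            raise ValueError(f"unknown diff operation: {op!r}")

    # Drop every object the diff removes (or replaces) from the base stream.
    kept = (obj for obj in base if obj[:2] not in removed)
    new_objs = sorted(added.values(), key=lambda o: o[:2])
    # Merge the two sorted streams without loading the base into memory.
    yield from heapq.merge(kept, new_objs, key=lambda o: o[:2])
```

A chain of weekly diffs would be applied the same way, one after the
other, starting from the nearest complete planet that is still kept.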

I won't be able to do daily diffs though, as I would still have to work
from the weekly planet dump Steve generates. I wonder if one could use
some MySQL queries (WHERE modified_ago < 24h) to produce "diff dumps"
directly from the database. But I am not going to look into that myself.
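
Just to make the idea concrete, the query could look roughly like the
sketch below. Everything in it is an assumption: the table name nodes,
its columns and the timestamp format are invented, and it uses Python's
sqlite3 with an in-memory database as a stand-in for the real MySQL
server so that it runs on its own.

```python
import sqlite3
from datetime import datetime, timedelta


def modified_since(conn: sqlite3.Connection, cutoff: datetime):
    """Return rows of the (hypothetical) nodes table changed at or after cutoff."""
    cur = conn.execute(
        "SELECT id, lat, lon, timestamp FROM nodes "
        "WHERE timestamp >= ? ORDER BY id",
        (cutoff.isoformat(sep=" "),),
    )
    return cur.fetchall()


# Stand-in database with an invented schema and a few sample rows.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE nodes (id INTEGER PRIMARY KEY, lat REAL, lon REAL, timestamp TEXT)"
)
conn.executemany(
    "INSERT INTO nodes VALUES (?, ?, ?, ?)",
    [
        (1, 51.5, -0.1, "2007-07-20 12:00:00"),
        (2, 48.1, 11.6, "2007-07-25 18:30:00"),
        (3, 52.5, 13.4, "2007-07-26 06:45:00"),
    ],
)

now = datetime(2007, 7, 26, 7, 0, 0)
recent = modified_since(conn, now - timedelta(hours=24))
print([row[0] for row in recent])
```

One caveat: a query like this only sees rows that still exist, so
deletions would have to be tracked some other way unless the schema
keeps deleted objects around.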


What do people think? Would diffs be useful? Should we start providing
them? In addition to full planets, or rather instead of some of the
older ones?

In any case, thanks to Jon for creating the planetdiff tool. It is a
nice piece of work (although the UTF8 sanitizer could probably be
removed now that we have proper UTF8 data).

spaetz



