[OSM-dev] Planet diff's revisited

Jon Burgess jburgess777 at googlemail.com
Wed Jul 25 20:22:09 BST 2007


On Wed, 2007-07-25 at 14:18 +0200, Thomas Walraet wrote:
> Brett Henderson wrote:
> > 
> > Osmosis can also generate diff files (actually object changes, not line 
> > changes) but I suspect Jon's diff tool will be a lot quicker if it works 
> > at a raw text file level rather than an OSM entity level.  Then again, 
> > his use of an xml delta file extension suggests otherwise.
> 
> I think it's better to have a diff at OSM entity level.
> 

The planetdiff I wrote works at the OSM object level, but it does this
entirely on a streaming model using the libxml2 SAX-style parser. It is
pretty fast. For every node/segment/way that is modified, the diff XML
contains the complete set of elements and attributes for the object,
i.e. if a single <seg id=../> is added to a way then both the old and
the new complete way appear in the diff.
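
To make the object-level streaming idea concrete, here is a rough sketch
in Python. It is not the planetdiff code (that is C on top of libxml2's
SAX interface); it swaps in the standard library's
xml.etree.ElementTree.iterparse instead. It assumes both files list
objects in the same order (nodes, then segments, then ways, each by
ascending id), which is an assumption of this sketch. The names
iter_objects, canonical and diff_objects are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Assumed ordering of object types within a planet file.
KIND_RANK = {"node": 0, "segment": 1, "way": 2}


def canonical(elem):
    """Turn an element into a comparable tuple: tag, sorted attributes, children in order."""
    return (
        elem.tag,
        tuple(sorted(elem.attrib.items())),
        tuple(canonical(child) for child in elem),
    )


def iter_objects(source):
    """Stream (sort_key, canonical_object) for each top-level node/segment/way."""
    depth = 0
    root = None
    for event, elem in ET.iterparse(source, events=("start", "end")):
        if event == "start":
            if root is None:
                root = elem
            depth += 1
            continue
        depth -= 1
        # depth == 1 here means a direct child of the root element just closed.
        if depth == 1:
            if elem.tag in KIND_RANK:
                yield (KIND_RANK[elem.tag], int(elem.get("id"))), canonical(elem)
            root.clear()  # drop finished objects so memory use stays flat


def diff_objects(old_source, new_source):
    """Merge two sorted object streams, yielding (change, old_object, new_object)."""
    old_iter, new_iter = iter_objects(old_source), iter_objects(new_source)
    old, new = next(old_iter, None), next(new_iter, None)
    while old is not None or new is not None:
        if new is None or (old is not None and old[0] < new[0]):
            yield ("delete", old[1], None)
            old = next(old_iter, None)
        elif old is None or new[0] < old[0]:
            yield ("add", None, new[1])
            new = next(new_iter, None)
        else:
            # Same type and id: report the complete old and new object if they differ.
            if old[1] != new[1]:
                yield ("modify", old[1], new[1])
            old, new = next(old_iter, None), next(new_iter, None)
```

Because a modified object is reported as a complete old/new pair, a way
that gains one <seg> shows up as two full ways, as described above.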


> Such a diff may be used by a script to modify a base populated with 
> osm2pgsql (instead of importing the whole DB each week).

I've considered this for a while and it isn't possible right now. The
problem is that the osm2pgsql conversion is lossy and the resulting SQL
tables do not contain enough information to be able to apply the diff. 

One fix for this would be to include extra tables so that the conversion
is no longer lossy. This extra data would never be used by the mapnik
renderer though, and would slow down the import. The extra tables would
be more or less equivalent to using the DB middle layer in the current
code (try uncommenting "//mid = &mid_pgsql;" in osm2pgsql.c and
removing the "mid = &mid_ram;" line).

The best compromise I can come up with right now is to be able to import
a planet.osm + diff file simultaneously. This would effectively perform
the planetpatch process on the 2 files while running osm2pgsql. In fact
you can already do this today like ...

$ planetpatch planet-<foo>.osm.bz2 diff-<bar> | osm2pgsql -

There probably wouldn't be too much benefit adding this into osm2pgsql
itself. If we did go down this path then I could consider adding support
for applying multiple daily diffs on top of a weekly planet dump, e.g.

$ planetpatch weekly.osm.bz2 thurs.osm.bz2 fri.osm.bz2 ... 


> It could also be used easily to know which tiles need to be redrawn 
> since last planet.

Yes. This was another of the motivating factors behind writing the diff
tool. You've got to be careful though since the diff alone does not tell
you everything. e.g. If we add a pre-existing segment to a way then
only the way may be in the diff. You've got no information in the diff
about the nodes of the segment, nor of the position of those nodes. This
is very similar to the issue faced by osm2pgsql importing a diff. You
really need the previous planet dump as a reference. 
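
A small Python sketch of why the reference planet is needed: to find the
tiles touched by a changed way, each segment id has to be resolved to its
nodes and then to coordinates, and those lookups can only come from the
previous dump. The ref_segments and ref_nodes dictionaries and the
function names are hypothetical stand-ins for an index built from that
dump. The tile numbering is the usual slippy-map scheme. As a
simplification it only marks the tiles that contain the nodes, not every
tile a segment crosses.

```python
import math


def tile_for(lat, lon, zoom):
    """Slippy-map tile (x, y) containing the given position at this zoom level."""
    n = 2 ** zoom
    x = int((lon + 180.0) / 360.0 * n)
    y = int((1.0 - math.asinh(math.tan(math.radians(lat))) / math.pi) / 2.0 * n)
    return x, y


def dirty_tiles(way_seg_ids, ref_segments, ref_nodes, zoom):
    """Tiles containing the nodes of a changed way.

    way_seg_ids  -- segment ids listed by the way in the diff
    ref_segments -- hypothetical index from the previous planet: segment id -> (from_node, to_node)
    ref_nodes    -- hypothetical index from the previous planet: node id -> (lat, lon)
    """
    tiles = set()
    for seg_id in way_seg_ids:
        # The diff may not contain this segment at all, so look it up in the reference.
        for node_id in ref_segments[seg_id]:
            lat, lon = ref_nodes[node_id]
            tiles.add(tile_for(lat, lon, zoom))
    return tiles
```

With only the diff in hand, the ref_segments and ref_nodes lookups would
fail for any segment or node that did not itself change.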

	Jon







