[OSM-dev] Representing differences between sets of OSM data

Brett Henderson brett at bretth.com
Mon Aug 20 01:23:59 BST 2007


I'll have to check out the planetdiff format as well (I looked a while 
ago but forget the details).  If it contains a superset of the 
information I need then I may be able to write an input task for osmosis 
that reads a planetdiff file.  Once the data is inside osmosis, the file 
format is unimportant and the data can either be processed directly or 
written out to the osmosis format.

In other words, the file format both tools use may not matter too much 
if end applications have a way of converting.  Of course that assumes 
the destination tools will consume the osmosis format, which may or may 
not be appropriate :-)

Do you have a simple example of planetdiff data (i.e. a small extract of 
data)?

There are a couple of potential problem cases we'll have to consider:

1. When a record is deleted, Osmosis either gets the deletion timestamp 
from the database or, when deriving a diff from two planet files, uses 
the current time.  This time is then used when creating delete records 
in the destination database.  If we had version numbers on the node and 
segment tables this wouldn't be such a major issue, but currently the 
timestamp is the only way of ordering records in these tables.  In other 
words, I can't use the last modification time for the delete because the 
delete must be later than the modification, and I can't use the current 
time when creating delete records in the database because that causes 
problems if subsequent re-creations use an earlier timestamp.
2. The database allows records to be deleted and then re-created with 
the same id.  I need to look into this more but I suspect Osmosis will 
detect the re-creation as a modify.  Such a change will apply to a 
database with no problems but may cause problems when applied to a 
planet file.
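
To make the two cases concrete, here is a rough Python sketch.  It is not 
Osmosis code: the snapshot shape (id -> (timestamp, payload)) and the 
function names are made up for illustration.  It derives changes by 
comparing two snapshots by id, falls back to the current time for 
deletes, and shows how a delete followed by a re-creation with the same 
id comes out as a single modify.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical snapshot shape: entity id -> (timestamp, payload).
Snapshot = Dict[int, Tuple[int, str]]
Change = Tuple[str, int, int]  # (action, id, timestamp)


def derive_changes(old: Snapshot, new: Snapshot, now: int) -> List[Change]:
    """Compare two snapshots by id and return changes ordered by id."""
    changes: List[Change] = []
    for entity_id in sorted(old.keys() | new.keys()):
        if entity_id not in old:
            changes.append(("create", entity_id, new[entity_id][0]))
        elif entity_id not in new:
            # A snapshot carries no deletion time, so fall back to `now`.
            changes.append(("delete", entity_id, now))
        elif old[entity_id] != new[entity_id]:
            changes.append(("modify", entity_id, new[entity_id][0]))
    return changes


def delete_time_is_consistent(last_modified: int, deleted_at: int,
                              recreated_at: Optional[int]) -> bool:
    """Case 1: a delete must sort after the last modification and before
    any later re-creation when timestamps are the only ordering."""
    if deleted_at <= last_modified:
        return False
    return recreated_at is None or deleted_at < recreated_at


old = {1: (100, "node A"), 2: (100, "node B")}
# Between the snapshots, id 2 was deleted and re-created with the same id.
new = {1: (100, "node A"), 2: (250, "node B2")}
changes = derive_changes(old, new, now=300)
print(changes)
```

With only the two snapshots to go on, the intermediate delete of id 2 is 
invisible, so the only change reported is a modify.  The second function 
shows why stamping a delete with the current time (300 here) breaks the 
ordering if the re-creation carries an earlier timestamp (250).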

The above may have no relevance to planetdiff but they're things that 
I'll have to keep in mind.

Jon Burgess wrote:
> Yes the planetdiff includes both old and new objects. This is very
> similar to the traditional unified diff format which lists both added and
> removed code blocks. This was a natural fit with the way the diff
> algorithm operates and allows a high degree of confidence that we're
> applying the diff on top of the correct file. 
>
> In theory it would also allow a "-R" mode to remove a diff from a planet
> dump and generate the previous one (I've not implemented this though).
>
> I'll have to look a little more at the osmosis diffs. It would be good
> to converge on a common format.
>
> 	Jon
>   
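
The property described in the quoted message (a diff that carries both 
old and new objects) can be sketched as follows.  This is only an 
illustration under assumed names and an assumed entry shape 
(id, old value, new value); it is not the planetdiff implementation or 
its file format.  Because each entry records the old value, the apply 
step can check that it is working on the expected base, and swapping old 
and new gives the "-R" behaviour.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical diff entry: (id, old_value, new_value); None means "absent".
Entry = Tuple[int, Optional[str], Optional[str]]


def apply_diff(data: Dict[int, str], diff: List[Entry],
               reverse: bool = False) -> Dict[int, str]:
    """Apply a diff to a copy of `data`, verifying each old value first."""
    result = dict(data)
    # In reverse mode, undo the entries in the opposite order.
    entries = reversed(diff) if reverse else diff
    for entity_id, old, new in entries:
        if reverse:
            old, new = new, old
        found = result.get(entity_id)
        if found != old:
            raise ValueError(
                f"id {entity_id}: expected {old!r}, found {found!r}")
        if new is None:
            result.pop(entity_id, None)
        else:
            result[entity_id] = new
    return result


planet_v1 = {1: "a", 2: "b"}
diff = [(2, "b", "B"), (3, None, "c"), (1, "a", None)]

planet_v2 = apply_diff(planet_v1, diff)
restored = apply_diff(planet_v2, diff, reverse=True)
print(planet_v2, restored)
```

Applying the same diff to a file it was not generated from raises an 
error at the first mismatching entry instead of silently producing a 
corrupt result.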





