[OSM-dev] Osmosis, Changesets, Diffs (replicate) and general questions

Wed Oct 28 21:42:38 GMT 2009

Lars Francke wrote:

As Fred has already pointed out, it's created with a separate tool.
>>> 3)
>>> * My initial import of OSMdoc data is done using a custom program.
>>> * The following data adjustments are done using the diff-files, the
>>> database (in a state as it was _before_ the diff), osmosis and an
>>> osmosis plugin
>>> * The initial import of OSM data is done using osmosis (--write-pgsql)
>>>
>>> As the minute-replicate diffs overlap (at least they used to do) the
>>> planet.osm dump it is best (..at least it used to be ;-) ) to create a
>>> "consistent" dump using the planet.osm and gradually applying diffs
>>> (using --apply-change) as there is no harm in applying changes that
>>> are already present in the planet.osm. What is the best way to do
>>> this? I hope I made myself clear, I don't know how to explain it
>>> better.
>>>
>>>       
>> I'm not sure I fully understand.  As you've stated, take a planet dump
>> and apply some overlapping diffs in sequence until you reach a point
>> after the planet creation completion time.  Just note that the new
>> minute replication diffs may not work with the --apply-change task just
>> yet because the new replication diffs may contain multiple versions of a
>> single entity.  The replication diffs are still experimental and have
>> not been tested with all osmosis functionality.
>>     
>
> I think you understood correctly. I guess I'll have to check the
> apply-change task then. To see if it works with multiple versions or
> what would have to be changed for it to work.
>   
We have two choices here:
1. Fix all existing tasks working with changes to support full history 
changes.
2. Create a new task that can convert full history changes into simpler 
delta changes.

I lean towards number 2, although it shifts some burden onto the end 
user in order to reduce dev effort.  For example, a --simplify-change 
task could be written that accepts a (sorted) full history (i.e. 
replication) change stream and produces a simplified change stream with 
only a single change per entity.

If an entity had multiple changes, those multiple changes would be 
collapsed into a single change.  This is not quite as simple as taking 
the most recent change.  For example, if a change stream contained node 
1 with a create change for version 1, and node 1 with a modify change 
for version 2, it would be necessary to write a single create change for 
node 1 with version 2 in the output.
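
To make the collapsing rule concrete, here is a rough Python sketch. It
is not osmosis code: the Change type and the simplify_changes function
are made up for illustration, and it assumes the input is already sorted
by entity type, id and version. The handling of an entity that is
created and then deleted within the same stream is my own assumption
(the final delete is kept), as that case isn't covered above.

```python
from dataclasses import dataclass
from itertools import groupby


@dataclass(frozen=True)
class Change:
    """Hypothetical stand-in for one change record in a change stream."""
    action: str       # "create", "modify" or "delete"
    entity_type: str  # "node", "way" or "relation"
    entity_id: int
    version: int


def simplify_changes(changes):
    """Collapse a sorted full-history change stream to one change per entity.

    Assumes all changes for one entity are adjacent and ordered by version.
    """
    simplified = []
    for _, group in groupby(changes, key=lambda c: (c.entity_type, c.entity_id)):
        group = list(group)
        first, last = group[0], group[-1]
        if first.action == "create" and last.action != "delete":
            # The entity did not exist before this stream, so the combined
            # change is still a create, carrying the newest version.
            action = "create"
        else:
            # Otherwise keep the action of the newest change (assumption:
            # a create followed by a delete is emitted as a delete).
            action = last.action
        simplified.append(
            Change(action, last.entity_type, last.entity_id, last.version)
        )
    return simplified
```

Feeding it a create of node 1 at version 1 followed by a modify of node
1 at version 2 gives a single create of node 1 at version 2, matching
the example above.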

If this task were created, the existing --apply-change could be used as 
is, which would simplify the code somewhat.

You could then invoke osmosis something like this:

  osmosis --read-replication-interval --simplify-change \
    --read-xml planet-old.osm --apply-change --write-xml planet-new.osm