[OSM-dev] Osmosis, Changesets, Diffs (replicate) and general questions

Brett Henderson brett at bretth.com
Wed Oct 28 06:44:04 GMT 2009


Lars Francke wrote:
> Hi!
>
> I'd like to set up a new consistent database of the OSM data for
> OSMdoc. I've been out of the loop for a while so I've got a few
> questions.
>
> 1)
> If I understood it correctly there is a new type of diffs - the
> minute-replicate - which are guaranteed to contain every change but
> they are not guaranteed to be generated exactly once a minute so this
> seems to be the way to go if one wants to have the best synchronized
> data there is. I also found out that there is an undocumented feature
> in osmosis to work with these diffs (--rri and --rrii) just like the
> "old" diffs.
> Is this correct so far?
>   
Yes.
> Are there any plans to generate such diffs on an hourly or daily basis?
>   
Yes.  I have a new task for merging multiple replication files into 
time-based intervals, but it's not tested or deployed yet.  I'm starved 
for time at the moment (just returned from leave, starting new project, 
moving house, etc), so I'm not sure when that will happen.
> 2)
> It seems as if none of the diffs contain the changesets. I may have
> missed something here but the only way to get these seems to be the
> weekly dump of all changesets or by using the API? As I use the
> changeset tags for OSMdoc I'd be very interested why this is the case
> and if there are any plans to change this.
>   
I'd like to include full changeset information in diffs but it's not 
trivial.  I'm not sure if I'll ever get to this personally.  I'd love to 
see somebody take it on though.
> 3)
> * My initial import of OSMdoc data is done using a custom program.
> * The following data adjustments are done using the diff-files, the
> database (in a state as it was _before_ the diff), osmosis and an
> osmosis plugin
> * The initial import of OSM data is done using osmosis (--write-pgsql)
>
> As the minute-replicate diffs overlap (at least they used to do) the
> planet.osm dump it is best (..at least it used to be ;-) ) to create a
> "consistent" dump using the planet.osm and gradually applying diffs
> (using --apply-change) as there is no harm in applying changes that
> are already present in the planet.osm. What is the best way to do
> this? I hope I made myself clear, I don't know how to explain it
> better.
>   
I'm not sure I fully understand.  As you've stated, take a planet dump 
and apply some overlapping diffs in sequence until you reach a point 
after the planet creation completion time.  Just note that the new 
minute replication diffs may not work with the --apply-change task just 
yet because the new replication diffs may contain multiple versions of a 
single entity.  The replication diffs are still experimental and not yet 
tested with all osmosis functionality.
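
To illustrate why multiple versions of a single entity matter, here is a 
minimal Python sketch that collapses a change list so each entity appears 
once, keeping only its highest version.  The Change record and the 
keep_latest_versions helper are hypothetical names made up for this 
example; this is not how osmosis represents or processes changes 
internally.

```python
from typing import NamedTuple


class Change(NamedTuple):
    """Hypothetical, simplified stand-in for one entry in a diff."""
    action: str     # "create", "modify" or "delete"
    kind: str       # "node", "way" or "relation"
    entity_id: int
    version: int


def keep_latest_versions(changes):
    """Return one change per (kind, entity_id): the highest version seen."""
    latest = {}
    for change in changes:
        key = (change.kind, change.entity_id)
        if key not in latest or change.version > latest[key].version:
            latest[key] = change
    # dicts preserve insertion order, so entities stay in first-seen order.
    return list(latest.values())


# A diff in which node 1 appears three times.
changes = [
    Change("create", "node", 1, 1),
    Change("modify", "node", 1, 2),
    Change("modify", "way", 7, 3),
    Change("delete", "node", 1, 3),
]
collapsed = keep_latest_versions(changes)
for change in collapsed:
    print(change)
```

Note that collapsing can turn a create followed by a modify into a lone 
modify, so whatever applies the result would have to accept a modify for 
an entity it has not seen yet.  Whether that holds depends on the 
consumer and is an assumption here.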
> To put it in other words: When I start using my osmosis plugin to
> alter the OSMdoc data using the diffs I need to be sure that both
> datasets work on the same data.
>
> 4)
> What is the order of execution of osmosis tasks?
> I'd like to do something like: osmosis --rri ... -p "plugin.class"
> --tee-change outputCount=2 --osmdoc-plugin --write-pgsql-change but my
> osmdoc-plugin needs to access the data in the db as it was before the
> diff (i.e. before the --write-pgsql-change). Is this guaranteed or do
> I need to split this in two osmosis runs?
>   
You should be okay.  The --write-pgsql-change task makes all updates 
within a transaction and commits changes upon successful completion.  If 
--osmdoc-plugin is accessing the same database it presumably opens a 
separate transaction in a separate connection which should see the 
existing committed data.  I assume that the --osmdoc-plugin is only 
accessing the database in a read-only fashion, though; otherwise you'll 
probably encounter deadlocks.
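
The visibility rule described above (a second connection sees only 
committed data until the writer commits) can be sketched in a few lines 
of Python.  This is an illustration only: it uses the sqlite3 module from 
the standard library as a stand-in for PostgreSQL so it runs anywhere, 
and the nodes table is invented for the example.  Locking details differ 
between the two databases, so treat it as a demonstration of the idea 
rather than of osmosis or PostgreSQL behaviour.

```python
import os
import sqlite3
import tempfile

# Two connections to the same database file: one plays the role of the
# writing task, the other the read-only plugin.
db_path = os.path.join(tempfile.mkdtemp(), "demo.db")

writer = sqlite3.connect(db_path)
writer.execute("CREATE TABLE nodes (id INTEGER PRIMARY KEY, version INTEGER)")
writer.execute("INSERT INTO nodes VALUES (1, 1)")
writer.commit()

reader = sqlite3.connect(db_path)
query = "SELECT version FROM nodes WHERE id = 1"

# The writer updates the row inside a transaction that is still open.
writer.execute("UPDATE nodes SET version = 2 WHERE id = 1")

# The reader still sees the last committed state.
before_commit = reader.execute(query).fetchall()

# Only after the commit does the new version become visible.
writer.commit()
after_commit = reader.execute(query).fetchall()

print("before commit:", before_commit)
print("after commit: ", after_commit)

reader.close()
writer.close()
```

If the reader also wrote to the same rows, the two connections could 
block each other, which is the reason for keeping the plugin read-only.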
> Thanks a lot for bearing with me. I hope someone can answer some of my
> questions :)
>   
Hope that helps :-)





