[OSM-dev] Diff Upload - possible task for osmosis, bulk uploader?

Thu Nov 8 21:57:52 GMT 2007

Martijn van Oosterhout wrote:
> On Nov 8, 2007 2:13 PM, Frederik Ramm <frederik at remote.org> wrote:
>   
>> I could have osmosis make a diff between the original downloaded data
>> and my augmented data base, but I cannot upload that to the API. To
>> upload such a diff, I would need functionality currently implemented in
>> bulk uploaders (and JOSM), because the numerical IDs assigned to all new
>> objects in my local installation need to be re-mapped according to the IDs
>> that the API gives out during upload. I would somehow have to make sure
>> that only the new IDs are rewritten, and all references to previously
>> existing IDs must of course stay unchanged.
>>     
>
> Off the top of my head it should Just Work (TM). The changefile
> indicates that the node is to be created and it will be remapped. I
> haven't actually tested it though.
>
> Have a nice day,
>   
Whether this works will depend very much on the behaviour of the bulk 
upload script.  Osmosis should generate a suitable diff with the 
relevant entities marked create, modify or delete as appropriate.  The 
potential issue is that all ids will be positive and will be set to the 
values assigned by the offline database.  This isn't necessarily an 
issue, but the bulk upload script must read the id in the change file, 
upload the entity as a create, remember the newly assigned id, and 
substitute the new id in any subsequent elements that refer to the old 
one (e.g. a way referring to a node).
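
To make the substitution step concrete, here is a rough Python sketch.  It 
is not how the existing bulk uploader works (I don't know that).  The 
record layout and the create_on_server() callback are hypothetical 
stand-ins for whatever the upload script really uses.  The point is only 
that the lookup is keyed on (type, offline id), so it works the same for 
positive and negative ids, and references to ids not created in this run 
are left alone.

```python
# Sketch only: the change-record layout and create_on_server() are
# hypothetical stand-ins, not the real bulk uploader or the OSM API.

def upload_creates(changes, create_on_server):
    """Upload 'create' entities in file order, rewriting references to
    ids that were reassigned earlier in the same run.

    changes: iterable of dicts with keys 'action', 'type', 'id', 'refs',
             where 'refs' is a list of (type, id) pairs.
    create_on_server: callable(type, refs) -> id assigned by the server.
    Returns the mapping (type, offline id) -> server id.
    """
    id_map = {}
    for entity in changes:
        if entity["action"] != "create":
            continue
        # Only references to entities created in this run are rewritten;
        # references to previously existing ids pass through unchanged.
        refs = [(t, id_map.get((t, r), r)) for t, r in entity["refs"]]
        new_id = create_on_server(entity["type"], refs)
        id_map[(entity["type"], entity["id"])] = new_id
    return id_map
```

Note this only works if every created entity appears in the file before 
anything that refers to it, which is the ordering question raised below.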

I don't know how the current bulk uploader works.  Does it only perform 
substitution on negative ids, or can it substitute positive values as well?

The other thing that may cause complications is nested relations.  It 
may be necessary to scan a change file in multiple passes: one to 
determine all the substitutions that need to take place, and another to 
perform the actual upload.  In fact some special ordering of the input 
file may 
be required to sort it correctly so that ids can be assigned before 
being used elsewhere.  The current osmosis --sort-change task with 
type="seekable" will attempt to sort a file so it can be applied to a 
database but it doesn't take into account these complications arising 
from id substitution because it was only designed to replicate changes 
from master to slave, not merge changes between two masters.  I don't 
know how easy this problem is to solve.  Presumably the current API 
makes it possible to create two relations, each referencing the other (a 
circular dependency).  If both need id substitution, there is no way of 
sorting the input file to make this work correctly.  It would be nice if 
the API could prevent this scenario from occurring, but I suspect it 
doesn't do so currently.  In practice this is very unlikely to occur, so 
it is probably not a huge deal.
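
The ordering problem for nested relations is essentially a topological 
sort.  As a rough illustration (this is not what --sort-change does, and 
the input layout is made up for the example), the Python sketch below 
orders newly created relations so that each one comes after the new 
relations it references, and reports the circular case instead of 
producing an order.  It assumes Python 3.9 or later for the standard 
graphlib module.

```python
from graphlib import CycleError, TopologicalSorter

def creation_order(new_relations):
    """Order newly created relations so members are created first.

    new_relations: dict mapping offline relation id -> list of member
                   relation ids (hypothetical layout for this sketch).
    Returns a list of offline ids in a safe creation order, or None if
    the new relations reference each other in a cycle.
    """
    # Members that are not newly created already have server ids, so
    # they impose no ordering constraint and are dropped from the graph.
    graph = {
        rid: [m for m in members if m in new_relations]
        for rid, members in new_relations.items()
    }
    try:
        return list(TopologicalSorter(graph).static_order())
    except CycleError:
        # Two or more new relations depend on each other; no ordering
        # of the file lets every id be assigned before it is used.
        return None
```

In the circular case an uploader would need some other strategy, for 
example creating one relation without the offending member and modifying 
it afterwards, but that is beyond what a sort can fix.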

I like the relation concept, but it has thrown a few tricky problems 
into the mix ;-)

In summary, if the bulk uploader can perform id substitution on positive 
ids then everything should work without modification so long as any 
nested relations have the "inner" relation created first (with a lower id).

Cheers,
Brett