[OSM-dev] 0.6 bulk uploader

Frederik Ramm frederik at remote.org
Thu Jan 22 01:33:23 GMT 2009


Hi,

Shaun McDonald wrote:
> It would be best if the bulk_import.py script was updated for 0.6. As  
> everything needs to be wrapped into a changeset, it makes the bulk  
> upload more complex than before.

Yes and no... if you're talking about uploads that are small enough to fit 
into one diff upload (i.e. not something like a TIGER county ;-) then 
bulk uploading should become trivial, because you don't even have to keep 
track of the object IDs; you just throw your diff at the server and 
that's it. Such a bulk upload could basically be handled by a shell 
script that has three lwp-request calls.

Hm, I see that each object in the diff must explicitly reference the 
changeset ID... so that would probably add one "sed" call to the shell 
script ;-)
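
A minimal sketch of that extra "sed" step, written in Python rather than 
shell. It assumes the diff is an osmChange document whose action blocks 
(create, modify, delete) directly contain the objects; the function name 
stamp_changeset is made up for illustration, and the HTTP calls that 
create the changeset, upload the diff and close the changeset are not shown.

```python
import xml.etree.ElementTree as ET


def stamp_changeset(osc_xml, changeset_id):
    """Return the osmChange document with every object tagged with changeset_id.

    Hypothetical helper: assumes the layout
    <osmChange><create|modify|delete><node|way|relation .../></...></osmChange>.
    """
    root = ET.fromstring(osc_xml)
    for action in root:          # <create>, <modify>, <delete>
        for obj in action:       # <node>, <way>, <relation>
            obj.set("changeset", str(changeset_id))
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    sample = (
        '<osmChange version="0.6">'
        '<create><node id="-1" lat="49.0" lon="8.4"/></create>'
        '</osmChange>'
    )
    # 1234 stands in for whatever ID the changeset-create call returned.
    print(stamp_changeset(sample, 1234))
```

Parsing the XML instead of using a regular expression means an existing 
changeset attribute is overwritten rather than duplicated.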

BTW: It seems that we're not currently imposing an upper limit on the 
number of changes in a diff upload, is that true? If so, we should 
perhaps add such a limit, because the transactionality of diff uploads 
would otherwise make it too easy for the thoughtless script writer to 
mess up our database... The only thing I'm unsure about is whether we should 
simply abort after "n" cycles in the DiffReader.commit method (easy to 
implement, but by the time we abort the database has already been 
unnecessarily loaded), or whether there is perhaps a way to make this 
depend on the size (in bytes) of the upload, so that it could easily be 
checked before even starting to process it?
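
The two options can be contrasted in a small sketch. This is Python for 
illustration only, not the server's actual code; the limits, the exception 
class and both function names are assumptions. The first check looks only 
at the byte length and runs before any parsing; the second counts objects 
while reading and aborts once the limit is exceeded, by which point work 
has already been done.

```python
import io
import xml.etree.ElementTree as ET

MAX_UPLOAD_BYTES = 1_000_000   # hypothetical byte limit
MAX_CHANGES = 10_000           # hypothetical limit on objects per diff


class UploadTooLarge(Exception):
    """Raised when a diff upload exceeds one of the assumed limits."""


def check_size(body, max_bytes=MAX_UPLOAD_BYTES):
    # Cheap: needs only the length of the request body, no parsing.
    if len(body) > max_bytes:
        raise UploadTooLarge("upload is %d bytes, limit is %d" % (len(body), max_bytes))


def count_changes(body, max_changes=MAX_CHANGES):
    # More precise, but the abort only happens after reading part of the diff.
    count = 0
    for _event, elem in ET.iterparse(io.BytesIO(body), events=("end",)):
        if elem.tag in ("node", "way", "relation"):
            count += 1
            if count > max_changes:
                raise UploadTooLarge("more than %d changes in one diff" % max_changes)
    return count
```

A byte limit is only a rough proxy for the number of changes, since a 
heavily tagged object takes far more bytes than a bare node.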

Bye
Frederik

-- 
Frederik Ramm  ##  eMail frederik at remote.org  ##  N49°00'09" E008°23'33"



