[OSM-dev] 0.6 bulk uploader
Frederik Ramm
frederik at remote.org
Thu Jan 22 01:33:23 GMT 2009
Hi,
Shaun McDonald wrote:
> It would be best if the bulk_import.py script was updated for 0.6. As
> everything needs to be wrapped into a changeset, it makes the bulk
> upload more complex than before.
Yes and no... if you're talking about uploads that are small enough to fit
into one diff upload (i.e. not something like a TIGER county ;-) then
bulk uploading should become trivial because you don't even have to keep
track of the object IDs, you just throw your diff at the server and
that's it. Such a bulk upload could basically be handled by a shell
script that has three lwp-request calls.
Hm, I see that each object in the diff must explicitly reference the
changeset ID... so that would probably add one "sed" call to the shell
script ;-)
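To illustrate the "sed" step, here is a minimal Python sketch that stamps
a changeset ID onto every object in an osmChange document. The function
name stamp_changeset and the sample document are made up for illustration;
the three HTTP calls themselves (create changeset, upload diff, close
changeset) are only listed in the comments and not performed.

```python
import xml.etree.ElementTree as ET

# The three requests such a script would make against the 0.6 API:
#   1. PUT  /api/0.6/changeset/create       -> returns the new changeset ID
#   2. POST /api/0.6/changeset/<id>/upload  -> body is the osmChange document
#   3. PUT  /api/0.6/changeset/<id>/close
# Only the step between 1 and 2 (the "sed" call) is sketched here.


def stamp_changeset(osc_xml, changeset_id):
    """Return osc_xml with changeset="<id>" set on every object.

    Hypothetical helper: assumes the usual osmChange layout, i.e. action
    blocks (<create>, <modify>, <delete>) directly under the root, each
    containing <node>, <way> or <relation> elements.
    """
    root = ET.fromstring(osc_xml)
    for action in root:
        for element in action:
            element.set("changeset", str(changeset_id))
    return ET.tostring(root, encoding="unicode")


# Made-up sample diff with a placeholder changeset attribute.
SAMPLE = """<osmChange version="0.6" generator="example">
  <create>
    <node id="-1" lat="49.0" lon="8.4" changeset="0"/>
  </create>
</osmChange>"""

if __name__ == "__main__":
    print(stamp_changeset(SAMPLE, 42))
```

Parsing the XML rather than doing a textual substitution avoids touching
a "changeset" string that happens to appear inside a tag value.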
BTW: It seems that we're not currently imposing an upper limit on the
number of changes in a diff upload, is that true? If so, we should
perhaps add such a limit, because the transactionality of diff uploads
would otherwise make it too easy for a thoughtless script writer to
mess up our database... The only thing I'm unsure about is whether we
should simply abort after "n" cycles in the DiffReader.commit method
(easy to implement, but by the time we abort, the database has already
been loaded unnecessarily), or whether there is perhaps a way to make
the limit depend on the size (in bytes) of the upload, so that it could
easily be checked before we even start to process it?
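The byte-size variant could look roughly like the sketch below. It is
written in Python purely for illustration and is not how the server does
or would do it; MAX_UPLOAD_BYTES, UploadTooLarge and read_limited are all
hypothetical names, and the limit value is arbitrary.

```python
import io

# Hypothetical limit; the right value would need to be agreed on.
MAX_UPLOAD_BYTES = 1_000_000


class UploadTooLarge(Exception):
    """Raised when an upload body exceeds the configured byte limit."""


def read_limited(stream, limit=MAX_UPLOAD_BYTES):
    """Read the upload body, refusing anything longer than `limit` bytes.

    Reads at most limit + 1 bytes, so an oversized upload is rejected
    before any parsing or database work has started.
    """
    data = stream.read(limit + 1)
    if len(data) > limit:
        raise UploadTooLarge("upload exceeds %d bytes" % limit)
    return data


if __name__ == "__main__":
    body = io.BytesIO(b"<osmChange/>")
    print(len(read_limited(body)))
```

A byte limit is only a rough proxy for the number of changes, but it can
be enforced without touching the database at all.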
Bye
Frederik
--
Frederik Ramm ## eMail frederik at remote.org ## N49°00'09" E008°23'33"