[OSM-dev] 0.6 bulk uploader

Stefan de Konink stefan at konink.de
Thu Jan 22 03:11:38 GMT 2009


Frederik Ramm wrote:
> BTW: It seems that we're not currently imposing an upper limit for the 
> number of changes in a diff upload, is that true? If so, we should 
> perhaps add such a limit because the transactionality of diff uploads 
> would otherwise make it too easy for the thoughtless script writer to 
> mess up our database... only thing I'm unsure about is whether we should 
> simply abort after "n" cycles in the DiffReader.commit method (easy to 
> implement, but by the time we abort the database has already been 
> unnecessarily loaded), or whether there is perhaps a way to make this 
> depend on the size (in bytes) of the upload and it could easily be 
> checked before even starting to process it?

So 0.6 brought us atomic transactions, and you are proposing to break 
transactions that are too big into pieces? What is the use case for 
breaking them up?

Someone loads 'a city' in South Africa, this is an upload of 1GB, a 
transaction is started, everything is inserted, a transaction is 
committed, profit?

Where do we need a limit? The *SQL server will build up a transaction 
log, and until the request is actually committed we do not return that 
data in any query anyway. The only significant problem arises if we want 
to return a result after the commit and the processing time is longer 
than our HTTP timeout. Then again, if a user is able to query the 
changeset, he must also be able to query the progress of its processing, 
so the return value of the upload request could simply be 'queued'.


A simpler approach that needs less code:

- Upload a file to OSM (By API/FTP/DAV) to changeset.osm
- This returns an 'in queue' response on successful upload
- The files are processed in the order of upload
- Start Transaction
  - Create new
  - Delete from diff
  - Update existing
- Commit/Rollback
- Update status of the changeset
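
The steps above can be sketched roughly as follows. This is only an illustration, not the OSM API code: it uses SQLite as a stand-in for the real database, and the `nodes` table, the dict layout of a diff, the status strings and the name `process_queue` are all assumptions made up for the example.

```python
import sqlite3
from collections import deque


def process_queue(conn, queue, status):
    """Apply queued diffs in upload order, one transaction per changeset.

    `queue` holds (changeset_id, diff) pairs; `diff` is a hypothetical dict
    with optional "create", "delete" and "modify" lists. `status` maps a
    changeset id to a status string that a client could query later.
    """
    while queue:
        changeset_id, diff = queue.popleft()
        try:
            # "with conn" commits if the block succeeds and rolls back
            # everything done inside it if an exception is raised.
            with conn:
                for node_id, tags in diff.get("create", []):
                    conn.execute(
                        "INSERT INTO nodes (id, tags) VALUES (?, ?)",
                        (node_id, tags),
                    )
                for node_id in diff.get("delete", []):
                    cur = conn.execute(
                        "DELETE FROM nodes WHERE id = ?", (node_id,)
                    )
                    if cur.rowcount != 1:
                        raise LookupError(f"node {node_id} does not exist")
                for node_id, tags in diff.get("modify", []):
                    cur = conn.execute(
                        "UPDATE nodes SET tags = ? WHERE id = ?",
                        (tags, node_id),
                    )
                    if cur.rowcount != 1:
                        raise LookupError(f"node {node_id} does not exist")
            status[changeset_id] = "committed"
        except (sqlite3.Error, LookupError):
            status[changeset_id] = "rolled back"
```

Because each changeset gets its own transaction, a failing upload leaves no partial data behind, and the uploader only ever sees 'queued' followed by a final status.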


An editor that uploads diffs this way would have to poll the OSM server 
for the status of the changeset.
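
On the editor side, the polling could look something like the sketch below. No such status call exists in the API today, so `get_status` is a hypothetical callable that asks the server for the changeset status, and the 'queued' string, the interval and the poll limit are assumptions as well.

```python
import time


def wait_for_changeset(get_status, interval=5.0, max_polls=60, sleep=time.sleep):
    """Poll until the changeset leaves the 'queued' state.

    `get_status` is a hypothetical zero-argument callable returning the
    current status string. Returns the first status other than 'queued',
    or 'queued' if the poll limit is reached first.
    """
    for _ in range(max_polls):
        state = get_status()
        if state != "queued":
            return state
        sleep(interval)  # wait before asking the server again
    return "queued"
```

The `sleep` parameter is only there so the waiting can be replaced in tests; an editor would normally leave it at the default.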


Stefan



