[OSM-dev] 0.6 bulk uploader
Stefan de Konink
stefan at konink.de
Thu Jan 22 03:11:38 GMT 2009
Frederik Ramm wrote:
> BTW: It seems that we're not currently imposing an upper limit for the
> number of changes in a diff upload, is that true? If so, we should
> perhaps add such a limit because the transactionality of diff uploads
> would otherwise make it too easy for the thoughtless script writer to
> mess up our database... only thing I'm unsure about is whether we should
> simply abort after "n" cycles in the DiffReader.commit method (easy to
> implement, but by the time we abort the database has already been
> unnecessarily loaded), or whether there is perhaps a way to make this
> depend on the size (in bytes) of the upload and it could easily be
> checked before even starting to process it?
So 0.6 brought us atomic transactions, and you are proposing to break
transactions that are too big into pieces? What is the use case of breakage?
Someone loads 'a city' in South Africa. This is an upload of 1GB: a
transaction is started, everything is inserted, the transaction is
committed, profit?
Where do we need a limit? We will create a transaction log in the *SQL
server; until the request is actually committed we do not return that
data in any query anyway. The only significant problem arises if we
want to return after the commit and the processing time is longer than
our HTTP timeout. Then again, if a user is able to query the changeset,
he must also be able to query how its processing is going, so the
return value of the request could simply be 'queued'.
A simpler approach that needs less code:
- Upload a file to OSM (by API/FTP/DAV) as changeset.osm
- This returns an 'in queue' response on successful upload
- The files are processed in the order of upload:
  - Start transaction
  - Create new
  - Delete from diff
  - Update existing
  - Commit/Rollback
- Update status of the changeset
An editor that uploads in the diff way would have to poll the OSM server
for the status of the changeset.
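
To make the flow above concrete, here is a minimal sketch in Python. It
is not the OSM API and not a proposal for the rails code: the class
name UploadQueue, the status strings 'done' and 'failed', the dict
layout of a diff and the one-table schema are all assumptions made for
illustration, and an in-memory SQLite database stands in for the real
server. It only shows that each upload is answered with 'queued' right
away, processed later in upload order inside a single transaction, and
that the status is what an editor would poll.

```python
import sqlite3
from collections import deque

# Hypothetical status values; the post itself only names 'queued'.
QUEUED, DONE, FAILED = "queued", "done", "failed"


class UploadQueue:
    """Toy stand-in for the server side: a FIFO of uploaded diffs."""

    def __init__(self):
        # isolation_level=None puts sqlite3 in autocommit mode, so the
        # BEGIN/COMMIT/ROLLBACK statements below are fully explicit.
        self.db = sqlite3.connect(":memory:", isolation_level=None)
        self.db.execute(
            "CREATE TABLE nodes (id INTEGER PRIMARY KEY, tag TEXT NOT NULL)"
        )
        self.pending = deque()   # uploads waiting, oldest first
        self.status = {}         # changeset id -> status string

    def upload(self, changeset_id, diff):
        """Accept the upload and answer immediately, without processing."""
        self.pending.append((changeset_id, diff))
        self.status[changeset_id] = QUEUED
        return QUEUED

    def process_next(self):
        """Apply the oldest queued diff as one atomic transaction."""
        changeset_id, diff = self.pending.popleft()
        try:
            self.db.execute("BEGIN")
            # Same order as in the list above: create, delete, update.
            for node_id, tag in diff.get("create", []):
                self.db.execute(
                    "INSERT INTO nodes (id, tag) VALUES (?, ?)", (node_id, tag)
                )
            for node_id in diff.get("delete", []):
                self.db.execute("DELETE FROM nodes WHERE id = ?", (node_id,))
            for node_id, tag in diff.get("modify", []):
                self.db.execute(
                    "UPDATE nodes SET tag = ? WHERE id = ?", (tag, node_id)
                )
            self.db.execute("COMMIT")
            self.status[changeset_id] = DONE
        except sqlite3.Error:
            # Any failure undoes the whole diff, not just the failing row.
            self.db.execute("ROLLBACK")
            self.status[changeset_id] = FAILED
```

An editor would call upload(), get 'queued' back, and then look up
status[changeset_id] (over HTTP in a real setup) until it is no longer
'queued'. A diff that fails halfway leaves no partial data behind,
which is the point of keeping the whole upload in one transaction.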
Stefan