[OSM-dev] Preferred Method of Bulk Upload?
zerebubuth at gmail.com
Sat May 9 01:24:08 BST 2009
On Sat, May 9, 2009 at 12:25 AM, Ian Dees <ian.dees at gmail.com> wrote:
> What is the preferred method of bulk upload nowadays? By bulk, I mean
> thousands of features with millions of nodes.
Probably best to use the diff upload feature. I think Ivan has a script for it.
More GNIS stuff?
> Assuming the preferred pseudocode for upload looks like this:
> open changeset
> while more uploads:
> upload a diff file
> close changeset
> ... how many changes should we put inside of each diff file? How many
> uploads should we make in one changeset?
The present limit on changeset size is 50,000 edits (i.e. nodes, ways
and relations), so you'll probably need to use more than one
changeset, which complicates the code a little. You can upload a single
diff with all 50,000 changes in it, but that would be huge; it's
probably better to split it into a number of smaller diffs.
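The split described above can be sketched in Python. The 50,000-edit figure comes from this thread; the per-diff size is an arbitrary, tunable assumption, and `plan_uploads` is a hypothetical helper, not part of any existing OSM tool:

```python
MAX_CHANGESET_EDITS = 50_000  # server-side changeset limit mentioned in this thread
DIFF_SIZE = 1_000             # edits per diff upload; an arbitrary, tunable choice

def plan_uploads(edits, changeset_limit=MAX_CHANGESET_EDITS, diff_size=DIFF_SIZE):
    """Group edits into changesets, each changeset split into smaller diffs.

    Returns a list of changesets; each changeset is a list of diffs, and
    each diff is a list of edits.
    """
    plans = []
    for start in range(0, len(edits), changeset_limit):
        cs_edits = edits[start:start + changeset_limit]
        diffs = [cs_edits[i:i + diff_size]
                 for i in range(0, len(cs_edits), diff_size)]
        plans.append(diffs)
    return plans

# The HTTP flow against the 0.6 API would then look roughly like:
#   PUT  /api/0.6/changeset/create       -> returns the new changeset id
#   POST /api/0.6/changeset/<id>/upload  -> one call per diff (osmChange body)
#   PUT  /api/0.6/changeset/<id>/close
# repeated once per changeset in the plan.
```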
> I assume that the capabilities API call will tell me some of this, but I'm
> not entirely sure which piece of the capabilities call lines up with which
> piece in my example.
The only relevant bits are:
"waynodes maximum" is the maximum number of nodes per way; any more and
you'll need to split the way.
"changesets maximum" is, well, the maximum number of edits in a changeset.
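Reading those two values out of the capabilities response can be sketched as below. The XML is a hand-written sample of what GET /api/0.6/capabilities might return; the attribute names and numbers are assumptions for illustration, not captured server output:

```python
import xml.etree.ElementTree as ET

# Hand-written sample of a capabilities response (attribute names and
# values are assumptions, not verified server output).
SAMPLE_CAPS = """
<osm version="0.6">
  <api>
    <waynodes maximum="2000"/>
    <changesets maximum_elements="50000"/>
  </api>
</osm>
"""

def read_limits(xml_text):
    """Pull the two limits relevant to a bulk upload out of the capabilities XML."""
    api = ET.fromstring(xml_text).find("api")
    return {
        "max_way_nodes": int(api.find("waynodes").get("maximum")),
        "max_changeset_edits": int(api.find("changesets").get("maximum_elements")),
    }
```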
> Is the data written to the database after each diff upload or is it stored
> in memory, then written out at the close of a changeset?
It's written atomically at each diff upload, i.e. each diff upload
either succeeds or fails entirely.
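Because each diff is all-or-nothing on the server side, a failed diff can simply be retried (or the run resumed from it) without re-sending the diffs that already went through. A minimal sketch, where `send` stands in for whatever function performs the HTTP POST:

```python
def upload_diffs(diffs, send, max_retries=3):
    """Upload diffs one by one, retrying each a few times on failure.

    send(diff) is a caller-supplied function that POSTs one diff and
    raises on error. Since each diff upload is atomic server-side, a
    retry never duplicates edits from an earlier, already-applied diff.
    """
    uploaded = 0
    for diff in diffs:
        for attempt in range(max_retries):
            try:
                send(diff)
                break
            except OSError:
                if attempt == max_retries - 1:
                    raise  # give up after the last retry
        uploaded += 1
    return uploaded
```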