[OSM-dev] Minute Diffs Broken

Karl Newman siliconfiend at gmail.com
Sun May 10 05:29:26 BST 2009


On Tue, May 5, 2009 at 3:43 PM, Brett Henderson <brett at bretth.com> wrote:

> Matt Amos wrote:
> > the XML document is parsed incrementally to save memory, rather than
> > for its behaviour, but it appears that rails, lighty and fastcgi all
> > support streaming input. i am unsure if they all work together, but
> > the rails docs suggest that it does.
> >
> > streaming input isn't nearly as difficult as streaming output. since
> > rails treats (most) uncaught exceptions as 500 errors, but the error
> > status is in the header, rails takes the safe option of having the
> > full response in a buffer before it returns the status header.
> >
> Is this a good idea?  I'm not comfortable with the idea of streaming
> input.  At least with streaming output you are performing a read-only
> operation (I think) and therefore shouldn't be holding any important
> locks.  Unless you update the user table during downloads in which case
> the damage is limited to the user record with the cr@ppy connection.
> Even osmosis which is running physically near the database server always
> dumps each query to disk to minimise read time before assembling results
> and writing "diff" files.
>
> But input is triggering writes and could be obtaining various locks all
> across the database.  If we're waiting on an unknown quality network
> connection to send the data then our transaction durations are totally
> unbounded.  We're exposing our nice database to the whims of the outside
> world.  Not to mention making my life more difficult than it otherwise
> would be ;-)
>
> Brett
>

Okay, so I'm 5 days behind on my email (although scanning through more
recent subjects doesn't show any new activity about this), and maybe this
problem has been solved already. Apologies if so, but I want to capture my
thoughts while they're fresh. Please take this with a huge grain of salt and
don't be harsh with me, because I don't know the details of the API workings
(but I did read this entire thread).

Some ideas to solve the root problem of Osmosis minute diffs missing
elements, which appears to be because the timestamp is dated from when the
transaction is opened instead of when it is committed:

1. If the transaction commit delay is caused by slow network connections,
stream the diff upload to a temp file and wait until it finishes before
passing it off to a rails daemon. The downside is increased disk I/O, which
may or may not be significant.

2. If the delay is caused by precondition verification or other rails
operations, then create the timestamp (in rails) once all data has been
received and preconditions verified, just prior to inserting all the rows.
This may not be possible without a major restructuring of the code,
though... (like if it does precondition checks and inserts in sections for
nodes, then ways, then relations, for example).
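
To make the two ideas concrete, here is a rough sketch in Python (not the
actual rails code, which I haven't read). Every name in it (spool_upload,
apply_diff, verify, insert_rows) is made up for illustration, and the
"database" is just whatever callable you pass in. It only shows the ordering:
receive everything first, verify, and only then take the timestamp.

```python
import tempfile
from datetime import datetime, timezone


def spool_upload(stream, chunk_size=64 * 1024):
    """Idea 1: copy a possibly slow upload into a temp file.

    No database work happens while we wait on the network.
    `stream` is any object with a read(n) method returning bytes.
    """
    spooled = tempfile.TemporaryFile()
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        spooled.write(chunk)
    spooled.seek(0)  # rewind so the next stage reads from the start
    return spooled


def apply_diff(spooled, verify, insert_rows,
               now=lambda: datetime.now(timezone.utc)):
    """Idea 2: take the timestamp only after verification succeeds.

    `verify` and `insert_rows` are hypothetical callbacks standing in
    for the precondition checks and the row inserts.
    """
    changes = spooled.read().decode("utf-8").splitlines()
    verify(changes)              # may raise; no timestamp taken yet
    stamp = now()                # taken just before the inserts
    insert_rows(changes, stamp)  # every row gets the same late timestamp
    return stamp
```

Note that this only narrows the window between the timestamp and the commit;
it does not close it, since the inserts and the commit itself still take some
time after the timestamp is taken.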

Karl