[OSM-dev] Minute Diffs Broken

Greg Troxel gdt at ir.bbn.com
Tue May 5 02:10:35 BST 2009


Frederik Ramm <frederik at remote.org> writes:

> 3. Make a semantic change to the way we handle diffs: Let the diff for 
> interval X not be "all changes with timestamp within X" but instead "all 
> changes that happened in a changeset that was closed within X". 
> Changesets not being atomic should pose no problem for this (because 
> when it's closed, it's closed). This would adversely affect downstream 
> systems in that some changes are held back until the changeset is closed 
> (whereas they are passed on immediately now), but on the other hand you 
> could afford to generate the minutely diff at 5 seconds past the minute 
> because you do not have to wait for transactions to settle (the actual 
> changeset close never happens inside a transaction).

So obviously we aren't running "SET TRANSACTION ISOLATION LEVEL
SERIALIZABLE", since that would kill performance and make things harder,
but it would solve this :-)

It's possible for a transaction with effective time T to have a
commit time of T'.  If the minute scan for A-B runs at B with
T < B < T', it does not see the changeset, and the B-C minute scan
then considers it out of bounds, so it never lands in any diff.
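
To make the window concrete, here is a toy sketch in Python, not the
actual diff generator.  The Change record, the scan() helper and the
second counts are all made up for illustration; the only assumption is
that a scan sees just the rows that have committed by the time it runs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Change:
    id: int
    timestamp: int  # effective time T, in seconds (hypothetical units)
    committed: int  # commit time T', when other readers can first see the row


def scan(changes, start, end, now):
    """Timestamp-window scan executed at wall-clock time `now`.

    Only rows already committed at `now` are visible to the scan.
    """
    return [c.id for c in changes
            if c.committed <= now and start <= c.timestamp < end]


changes = [
    Change(id=1, timestamp=10, committed=11),
    Change(id=2, timestamp=58, committed=63),  # T=58 < B=60 < T'=63
]

diff_ab = scan(changes, 0, 60, now=60)     # A-B scan runs at B: change 2 not yet visible
diff_bc = scan(changes, 60, 120, now=120)  # B-C scan: change 2 visible, but T is before B
```

Change 2 ends up in neither diff_ab nor diff_bc, which is the hole
described above.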

If the real requirement for minute diffs is that the union of them is
right, then the minute diff generator could keep track of all the
changeset IDs it has seen in the last hour, and do a query that is
basically:

  select all changesets from the last 30 minutes
  exclude all changesets in the previous 60 minute diffs

Then the missing changeset would show up in the next diff, which would
be the minute it was committed in, not the minute it was started in.  If
it's known there are no holes, then restricting the query to
changeset > top_changeset could make this faster.
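
A rough sketch of that bookkeeping in Python, under stated assumptions:
the MinuteDiffGenerator class, its generate() method and the
(changeset_id, timestamp) tuples are hypothetical names for
illustration, and `visible` stands in for whatever the database query
returns at the moment the diff is generated.

```python
from collections import deque


class MinuteDiffGenerator:
    """Emit each changeset once, in the first diff that can see it."""

    def __init__(self, lookback=30 * 60, remember=60):
        self.lookback = lookback              # seconds: "the last 30 minutes"
        self.recent = deque(maxlen=remember)  # ID sets of the previous 60 diffs

    def generate(self, visible, now):
        """`visible` is an iterable of (changeset_id, timestamp) pairs
        that are committed, and therefore readable, at time `now`."""
        already_sent = set().union(*self.recent)
        diff = sorted(
            cid for cid, ts in visible
            if now - self.lookback <= ts < now and cid not in already_sent
        )
        self.recent.append(set(diff))  # the oldest set falls off automatically
        return diff


gen = MinuteDiffGenerator()
first = gen.generate([(1, 10)], now=60)             # changeset 2 not committed yet
second = gen.generate([(1, 10), (2, 58)], now=120)  # 2 is now visible and unsent
third = gen.generate([(1, 10), (2, 58)], now=180)   # nothing new to send
```

Here changeset 2 lands in the second diff, the minute it was committed
in.  Remembering 60 one-minute diffs covers more than the 30 minute
lookback, so nothing inside the window is sent twice.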