[OSM-dev] Minute Diffs Broken
gdt at ir.bbn.com
Tue May 5 13:45:35 BST 2009
My aim all along has been to provide people with up to date data. The
nice thing about the minute changesets is that they let you have an
offline database that exactly matches the API as of 6 minutes ago. I'd
completely agree with you if the API only released data once the
changeset was closed but that's not the case.
I am a bit confused by some of the terms being used here. The basic
issue for me is that we have API operations, which correspond to
database transactions. Ignoring SERIALIZABLE vs READ COMMITTED, these
operations are quite safe. These operations are not changesets.
Here are logs from squid, showing me querying data and uploading some changes.
1241488036.615 5688 172.16.32.240 TCP_MISS/200 38363 GET http://www.openstreetmap.org/api/0.6/map? - DIRECT/188.8.131.52 text/xml
1241488089.540 276 172.16.32.240 TCP_MISS/200 661 GET http://www.openstreetmap.org/api/capabilities - DIRECT/184.108.40.206 text/xml
1241488099.552 312 172.16.32.240 TCP_MISS/200 373 PUT http://www.openstreetmap.org/api/0.6/changeset/create - DIRECT/220.127.116.11 text/plain
1241488102.378 2817 172.16.32.240 TCP_MISS/200 656 POST http://www.openstreetmap.org/api/0.6/changeset/1081606/upload - DIRECT/18.104.22.168 text/xml
1241488102.570 163 172.16.32.240 TCP_MISS/200 366 PUT http://www.openstreetmap.org/api/0.6/changeset/1081606/close - DIRECT/22.214.171.124 text/html
So there are one read and then three write database transactions, and
one changeset. The read is not hard and the writes all happen close in
time (JOSM), but it won't necessarily be so.
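As a rough illustration of that count, here is a minimal Python sketch that classifies the five squid lines above. It assumes squid's native log layout (method in the sixth field, URL in the seventh) and assumes that the capabilities call does not touch the map data, so it is counted as "other" rather than as a read. The helper name classify is made up for this example.

```python
import re
from collections import Counter
from urllib.parse import urlparse

# The five squid access-log lines quoted above, verbatim.
LOG = """\
1241488036.615 5688 172.16.32.240 TCP_MISS/200 38363 GET http://www.openstreetmap.org/api/0.6/map? - DIRECT/188.8.131.52 text/xml
1241488089.540 276 172.16.32.240 TCP_MISS/200 661 GET http://www.openstreetmap.org/api/capabilities - DIRECT/184.108.40.206 text/xml
1241488099.552 312 172.16.32.240 TCP_MISS/200 373 PUT http://www.openstreetmap.org/api/0.6/changeset/create - DIRECT/220.127.116.11 text/plain
1241488102.378 2817 172.16.32.240 TCP_MISS/200 656 POST http://www.openstreetmap.org/api/0.6/changeset/1081606/upload - DIRECT/18.104.22.168 text/xml
1241488102.570 163 172.16.32.240 TCP_MISS/200 366 PUT http://www.openstreetmap.org/api/0.6/changeset/1081606/close - DIRECT/22.214.171.124 text/html
"""


def classify(line):
    """Return (kind, changeset_id) for one access-log line.

    kind is "write" for PUT/POST, "read" for a GET under /api/0.6/,
    and "other" for anything else (assumed not to hit the map data).
    """
    fields = line.split()
    method, url = fields[5], fields[6]
    path = urlparse(url).path
    match = re.search(r"/changeset/(\d+)/", path)
    changeset = match.group(1) if match else None
    if method in ("PUT", "POST"):
        kind = "write"
    elif path.startswith("/api/0.6/"):
        kind = "read"
    else:
        kind = "other"
    return kind, changeset


results = [classify(line) for line in LOG.splitlines()]
counts = Counter(kind for kind, _ in results)
changesets = {cs for _, cs in results if cs is not None}

print(dict(counts), changesets)
```

The changeset id only shows up in the upload and close URLs, so the create call is counted as a write but contributes no id of its own.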
Given the way the world is, it seems like the minute diffs really should
be looking for new transactions, not new changesets. I can see
Frederik's point of only exporting closed changesets, but for that to
really make sense I think the main database has to isolate changesets
from each other until they are fully committed (meaning either
long-running transactions or an API change to have an API operation be
open/upload/close) -- trying to add transaction properties on a copy
when they aren't there in the original seems like it just won't work.
This is also confusing in wording because in svn a changeset is a
transaction, and it's not just SERIALIZABLE but actually SERIALIZED, so
the word changeset can have a wrong connotation.
I think we have
uploads == db transactions (perhaps "microchangesets" or "changeset fragments"??)
changesets == (some group of uploads, with a common id and comment)
minute diffs == (some collection of uploads)
or maybe we will have
minute diffs == (some collection of changesets)
but in that case the db created by the minute diff may refer to objects
which are not present, breaking the integrity guarantees that 0.6 got
I don't have a clue about how the uploads are numbered and how easy it
is to extract all of them, but given that the main DB can have committed
transactions with uploads that are not part of a closed changeset, I
think the minute diff and replicated dbs should have that too.
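The integrity concern above can be sketched with a toy model. Everything here is hypothetical and is not how the real diff generator works: the Upload record, the object ids, and the two selection functions are invented for illustration. Changeset 1 is still open but has already committed an upload creating two nodes. Changeset 2 is closed and its upload creates a way that refers to those nodes.

```python
from dataclasses import dataclass, field


@dataclass
class Upload:
    """One committed upload (one db transaction) in this toy model."""
    changeset: int
    committed_at: int                            # made-up clock, in seconds
    creates: set = field(default_factory=set)    # ids of objects created
    refs: set = field(default_factory=set)       # ids of objects referred to


uploads = [
    Upload(changeset=1, committed_at=10, creates={"n1", "n2"}),
    Upload(changeset=2, committed_at=20, creates={"w1"}, refs={"n1", "n2"}),
]
closed_changesets = {2}  # changeset 1 has not been closed yet


def diff_by_uploads(uploads, start, end):
    """Select every upload committed in [start, end)."""
    return [u for u in uploads if start <= u.committed_at < end]


def diff_by_closed_changesets(uploads, closed):
    """Select only uploads that belong to a closed changeset."""
    return [u for u in uploads if u.changeset in closed]


def dangling_refs(diff, already_present=frozenset()):
    """Apply a diff in commit order and collect refs to missing objects."""
    present = set(already_present)
    missing = set()
    for u in sorted(diff, key=lambda u: u.committed_at):
        missing |= u.refs - present - u.creates
        present |= u.creates
    return missing


by_upload = dangling_refs(diff_by_uploads(uploads, 0, 60))
by_changeset = dangling_refs(diff_by_closed_changesets(uploads, closed_changesets))
print(by_upload, by_changeset)
```

In this model the upload-based diff leaves no dangling references, while the closed-changeset diff leaves the way pointing at two nodes the replica has never seen.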