[OSM-dev] Minute Diffs Broken
Brett Henderson
brett at bretth.com
Wed May 6 00:26:38 BST 2009
Greg Troxel wrote:
> Brett Henderson <brett at bretth.com> writes:
>
>
> Given the use of pgsql transactions, osmosis won't see data from
> uncommitted transactions. So I really meant "changes in the database,
> subject to the notion that uncommitted transactions won't be visible."
>
Hehe, terminology strikes again. I think I know what you mean :-)
>> To complicate things slightly further, the full history files
>> http://planet.openstreetmap.org/history/
>> are similar but contain a full delta from one point in time to
>> another and may contain several versions of a single entity.
>>
>> So perhaps the term "diffs" is the right one for the existing files
>> and "deltas" is the right one for full history files.
>>
>
> I would hope that both have the property that, given a copy of the DB
> that was right at the earlier time, applying the delta or diff to that
> copy gets one a copy of the database as of when the osmosis extract
> transaction ran. Perhaps then the delta has the intermediate steps and
> the diff is permitted to collapse them?
>
That's correct and exactly how it works. Both deltas and diffs (if I
can call them that) extract the full history between two points in time,
but the diffs collapse the changes into a minimal set. Diffs are fine
if you just want the latest snapshot; deltas are required to replicate
full history.
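
As a rough illustration of the collapsing idea (a sketch only, not
osmosis code), the snippet below reduces a delta to a diff by keeping
the last change per entity. The Change tuple, the collapse() function
and the rule that "create followed by modify" is reported as a create
are all assumptions made for the example; osmosis may resolve actions
differently.

```python
from typing import NamedTuple


class Change(NamedTuple):
    """Hypothetical record for one entity version in a change stream."""
    action: str   # "create", "modify" or "delete"
    kind: str     # "node", "way" or "relation"
    id: int
    version: int


def collapse(delta):
    """Reduce a full-history delta to at most one change per entity.

    Assumes the delta is ordered oldest to newest.
    """
    first = {}
    last = {}
    for change in delta:
        key = (change.kind, change.id)
        first.setdefault(key, change)   # remember how the entity first appeared
        last[key] = change              # later versions overwrite earlier ones

    diff = []
    for key, change in last.items():
        # Assumption: an entity created inside the interval and still
        # present at the end is reported as a create, not a modify.
        if first[key].action == "create" and change.action != "delete":
            change = change._replace(action="create")
        diff.append(change)
    return diff


delta = [
    Change("create", "node", 1, 1),
    Change("modify", "node", 1, 2),
    Change("modify", "node", 2, 5),
    Change("modify", "node", 2, 6),
    Change("delete", "node", 2, 7),
]
diff = collapse(delta)
```

Here the five-entry delta collapses to two entries: node 1 created at
version 2 and node 2 deleted at version 7. Applying either list to the
earlier snapshot gives the same end state, but only the delta preserves
the intermediate versions.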
>
>> The reason I've tended to avoid the word "diffs" is because the planet
>> directory also contains diffs between planet files. These diffs are
>> yet another way of describing changes/differences and are truly a
>> difference between two planet files.
>>
>
> As in the output of the diff command on two text files which happen to
> contain xml, it sounds like.
>
It's not the same as the diff command either. It produces output like:
<delete>
  <node id="80871" lat="58.4206038728957" lon="15.5648050309828"
        timestamp="2007-05-24T23:33:56+01:00"/>
</delete>
<add>
  <node id="80871" lat="58.4206038728957" lon="15.5648050309828"
        timestamp="2007-07-20T00:30:23+01:00"/>
</add>
which is very similar to the osmosis diffs but it treats a modify as a
delete and an add. Osmosis just creates a modify element containing the
new version and leaves out the old version.
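
To make the difference concrete, here is a small sketch that reads the
delete/add style and reports a matching delete+add pair as a single
modify, which is roughly the view an osmosis change file gives. The
<planetdiff> wrapper element and the to_actions() helper are
assumptions for the example; only the <delete> and <add> blocks come
from the output shown above.

```python
import xml.etree.ElementTree as ET

# The root element name is a placeholder so the sample parses as one document.
SAMPLE = """<planetdiff>
<delete>
  <node id="80871" lat="58.4206038728957" lon="15.5648050309828"
        timestamp="2007-05-24T23:33:56+01:00"/>
</delete>
<add>
  <node id="80871" lat="58.4206038728957" lon="15.5648050309828"
        timestamp="2007-07-20T00:30:23+01:00"/>
</add>
</planetdiff>"""


def to_actions(xml_text):
    """Return (action, kind, id, timestamp) tuples for a delete/add style diff."""
    root = ET.fromstring(xml_text)
    deleted = {}
    added = {}
    for block in root:
        if block.tag == "delete":
            target = deleted
        elif block.tag == "add":
            target = added
        else:
            continue  # ignore anything that is not a delete or add block
        for entity in block:
            target[(entity.tag, entity.get("id"))] = entity

    actions = []
    for key, entity in added.items():
        # An entity that is both deleted and added is really a modify;
        # only the new version is kept.
        action = "modify" if key in deleted else "create"
        actions.append((action, key[0], key[1], entity.get("timestamp")))
    for key, entity in deleted.items():
        if key not in added:
            actions.append(("delete", key[0], key[1], entity.get("timestamp")))
    return actions


actions = to_actions(SAMPLE)
```

For the sample above this yields a single modify for node 80871 carrying
the newer timestamp, and the old version is dropped.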
>> If you're not familiar with it already, please check out the API
>> schema. If information isn't stored there, we can't query it. For
>> example, there is no concept of an upload in the database, the only
>> grouping feature it has is changesets.
>> http://gweb.bretth.com/apidb06-pgsql-latest.sql
>>
>
> Thanks - read and sort of understood - there's a lot in there.
>
From memory the only tables that really matter are the nodes,
node_tags, ways, way_tags, way_nodes, relations, relation_tags,
relation_members, changesets, and users. The current_* tables are
irrelevant. And the rest I don't understand at all ;-)