[OSM-dev] Complete History Changesets

Brett Henderson brett at bretth.com
Tue Oct 14 09:54:14 BST 2008


Hi All,

The topic of getting access to bulk history data has come up a few times 
now, so I'm wondering whether people see a need for it.

Osmosis already provides the ability to produce a changeset for a 
specific time interval.  This is currently being used to produce minute, 
hourly and daily changesets.  By downloading and keeping all minute 
changesets it is possible to keep *most* history items but not all, 
because the current osmosis extraction code looks at all the changes in 
an interval and collapses them into a single change.  Any intermediate 
edits to the same id within one minute are therefore lost.
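
To illustrate why collapsing loses history, here is a minimal Python 
sketch.  It is not the osmosis code, just a toy model under simple 
assumptions: the Edit record and the collapse_by_interval function are 
made-up names, timestamps are plain seconds, and "collapsing" is modelled 
as keeping the last edit per id in each interval.

```python
from collections import namedtuple

# Hypothetical record for illustration: element id, edit time in seconds, payload.
Edit = namedtuple("Edit", ["element_id", "timestamp", "value"])


def collapse_by_interval(edits, interval_seconds):
    """Keep only the last edit per element within each interval (toy model)."""
    latest = {}
    for edit in sorted(edits, key=lambda e: e.timestamp):
        bucket = edit.timestamp // interval_seconds
        # A later edit to the same id in the same interval overwrites the earlier one.
        latest[(bucket, edit.element_id)] = edit
    return sorted(latest.values(), key=lambda e: e.timestamp)


edits = [
    Edit(1, 5, "a"),
    Edit(1, 40, "b"),  # same id, same minute as the edit above
    Edit(1, 70, "c"),  # same id, next minute
    Edit(2, 10, "x"),
]

collapsed = collapse_by_interval(edits, 60)
lost = [e for e in edits if e not in collapsed]
```

With minute intervals the edit at second 5 ends up in "lost", since the 
edit at second 40 replaces it.  A full history extract would keep all 
four edits.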

It would be fairly straightforward to modify osmosis to extract full 
history for a time interval.  This could fit into the existing osmChange 
format without any modifications although clients might potentially have 
to make some changes to cope with multiple changes to a single id.  
Potentially the file extension should be modified from *.osc to 
something else like *.osh (history) to avoid confusion.
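
As a rough idea of what "coping with multiple changes to a single id" 
could mean for a client, the sketch below reads an osmChange-style 
document in which the same node appears several times and groups the 
entries per id in timestamp order.  The sample document, its attribute 
values and the read_history function are assumptions made up for 
illustration; a real client would need to handle ways, relations, tags 
and much larger files.

```python
import xml.etree.ElementTree as ET

# Made-up sample: one node created, modified and deleted within a minute.
SAMPLE = """<osmChange version="0.3" generator="example">
  <create><node id="1" timestamp="2008-10-14T09:00:05Z" lat="51.0" lon="0.0"/></create>
  <modify><node id="1" timestamp="2008-10-14T09:00:40Z" lat="51.1" lon="0.0"/></modify>
  <delete><node id="1" timestamp="2008-10-14T09:00:55Z" lat="51.1" lon="0.0"/></delete>
</osmChange>"""


def read_history(xml_text):
    """Group every change by (element type, id), ordered by timestamp."""
    history = {}
    root = ET.fromstring(xml_text)
    for action in root:          # <create>, <modify> or <delete>
        for element in action:   # <node>, <way> or <relation>
            key = (element.tag, element.get("id"))
            entry = (element.get("timestamp"), action.tag, dict(element.attrib))
            history.setdefault(key, []).append(entry)
    for entries in history.values():
        # UTC timestamps in this fixed format sort correctly as strings.
        entries.sort(key=lambda e: e[0])
    return history


history = read_history(SAMPLE)
actions = [action for _, action, _ in history[("node", "1")]]
```

Here "actions" lists the three changes to node 1 in the order they 
happened, rather than a single collapsed change.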

With this functionality in place it would be straightforward to produce 
full history dumps to build a sequential set of history files for OSM.  
These files wouldn't be small but as a rough guesstimate the total size 
of all files would be less than double the existing planet dump.  
Daily, hourly and minute history files could be produced alongside the 
existing changesets, potentially replacing them over time.

I didn't include full history in existing changesets because my main 
focus was to introduce the concept of distributed up-to-date OSM 
databases for secondary purposes such as rendering, routing, etc.  This 
is starting to take off with OSMXAPI, ROMA, Mapnik's DB, etc.  The next 
step is full history.  As described above, this should be fairly simple 
to introduce with the 0.5 API.

0.6 introduces some nice things like explicit version numbers, which will 
make correctly ordered application of multiple changes to a database 
simpler, but replication of changeset information will require some 
changes to the osmChange format which I haven't put much thought into yet.

Reactions?

Brett




