[OSM-dev] "Deep History" App

Brett Henderson brett at bretth.com
Tue Sep 15 23:50:57 BST 2009

Frederik Ramm wrote:
> Hi,
> Ian Dees wrote:
>> That's disappointing. Am I the only one that cares about this? With 
>> permission I'd be happy to try and write something for the 
>> openstreetmap.org site to make this feasible 
>> and not kill the API frontend servers.
I don't think you're the only one who cares about this; it has come up 
fairly often in the past.
> I am very much in favour of supplying, and periodically updating, a full 
> OSM history dump. I think the main reasons why this is not being done 
> currently are
> (a) nobody wrote a program for it
I have, but not in a single file format.
> (b) any program that does get written would have to be engineered in a 
> clever way in order not to put too much strain on the data base and 
> still produce something consistent
The easiest way to avoid strain is to extract deltas based on timestamp 
and merge offline.
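As a rough illustration only, the idea can be sketched in Python (the record layout, field names, and helper functions below are invented for illustration; they are not Osmosis code):

```python
# Hypothetical sketch of timestamp-based delta extraction with an
# offline merge. Entity records are plain dicts; in reality Osmosis
# reads these from the API database and change files.

def extract_delta(entities, last_run):
    """Select only entities modified since the previous extraction,
    so each run touches a small slice of the database."""
    return [e for e in entities if e["timestamp"] > last_run]

def merge_deltas(*deltas):
    """Offline merge: concatenate deltas and order by (id, version),
    keeping every version rather than collapsing to a snapshot."""
    merged = [e for d in deltas for e in d]
    merged.sort(key=lambda e: (e["id"], e["version"]))
    return merged
```

The point is that only the cheap timestamp scan runs against the live database; the expensive ordering and consolidation happens elsewhere.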
> So my suggestion would be to write such a program and then campaign for 
> it being used on openstreetmap.org. - Third party servers importing that 
> data could then offer all sorts of APIs for history queries.
Something like this does already exist in Osmosis.  The only reason it 
isn't being run regularly is due to lack of disk space on dev.

They are full history diffs, broken into daily chunks.  Grouping them 
into larger chunks is fairly trivial and can be done offline.

Once the new services server comes online I'll run it daily and keep it 
up to date.  A new full history file could be generated every so often 
(e.g. weekly) by merging the most recent daily files into an existing 
rollup file.
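The weekly rollup amounts to a de-duplicating merge keyed on entity type, id and version. A hedged sketch in Python (the function name and data layout are my own invention, not Osmosis internals):

```python
def roll_up(rollup, daily_files):
    """Merge daily full-history files into an existing rollup.
    A (type, id, version) triple identifies one edit, so any entry
    straddling a chunk boundary collapses to a single copy."""
    seen = {}
    for source in [rollup, *daily_files]:
        for e in source:
            seen[(e["type"], e["id"], e["version"])] = e
    return sorted(seen.values(),
                  key=lambda e: (e["type"], e["id"], e["version"]))
```

Because the key includes the version, every historical edit survives the merge; only exact duplicates are dropped.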

If you wish to keep an offline database up to date with full history you 
can use:

Note that the above replication files are very experimental at the 
moment.  They are full history files, which means they may contain 
multiple changes for a single entity.  They appear to work, but I can't 
be 100% sure they're not missing data.  Any missing data will be due to 
bugs in code, not in the replication technique.  However, they are 100% 
up to date (i.e. no 5 minute delay).  They use PostgreSQL transaction 
ids to select data and don't use timestamps at all.  They're generated 
once per minute, but the data inside them is not aligned to minute 
boundaries.  Existing Osmosis tasks should be used with care because 
most of them have not been verified against full history files 
(e.g. the --apply-change task will probably fail).
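To illustrate why a snapshot-oriented apply can go wrong on full history input: a snapshot apply keeps one row per entity id, so intermediate versions are silently discarded, whereas a history-aware apply must keep every (id, version) pair. A minimal Python sketch of the distinction (hypothetical data model, not the actual --apply-change implementation):

```python
def apply_change_snapshot(base, change):
    """Mimics a snapshot-style apply: one row per entity id, last
    version wins.  Fed full history, it drops intermediate versions."""
    current = {e["id"]: e for e in base}
    for e in change:
        current[e["id"]] = e
    return list(current.values())

def apply_change_history(base, change):
    """A history-aware apply keeps every (id, version) pair."""
    return sorted(base + change, key=lambda e: (e["id"], e["version"]))
```

Running both over a change file containing two versions of the same node makes the difference obvious: the snapshot apply keeps one row, the history apply keeps all three.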

