[OSM-dev] Complete history of OSM data - questions and discussion

Matt Amos zerebubuth at gmail.com
Wed Nov 11 11:07:18 GMT 2009


On Wed, Nov 11, 2009 at 6:41 AM, Lars Francke <lars.francke at gmail.com> wrote:
> There are a few questions that probably need answering first and I
> hope we can start a discussion about this:
> - Am I correct in assuming that there are no general objections from
> the OSM server folks against such a dump? (Which would render the rest
> of this E-Mail useless ;-)

the response has always been "if someone writes it, and it's good,
we'll run it" :-)

> - Is anyone else currently working on this?

for some values of "working", yes. it's on my list of things to do for
the license change plan - clearly we'll need a full data dump before
we can re-license.

> - Which format should the data be dumped in

(3) is the easiest to get done and most easily supported, in my opinion.

> - Distribution of the data and storage space requirements

i have a feeling that the data, while big, will still be small enough
for the usual method of planet.osm.org + heanet mirror to work.

> - Interval of dumps

based on back-of-the-envelope calculations, a full dump in planet
format would take something like 7-10 days to do in parallel with
normal server activity. so it couldn't be run every week and would
probably be cumbersome to do every month. in my opinion, we should be
looking at every 3-6 months.
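
to make the arithmetic behind that concrete, here is a rough sketch. the only input taken from the estimate above is the 7-10 day dump time; the interval lengths (30, 90 and 180 days) and the function name are assumptions chosen for illustration.

```python
# rough duty-cycle arithmetic: what fraction of each interval would be
# spent with a dump running alongside normal server activity?
DUMP_DAYS = (7, 10)  # estimate from above: 7-10 days per full dump


def dump_load(interval_days, dump_days):
    """Fraction of the interval during which a dump would be running."""
    return dump_days / interval_days


# interval lengths in days are assumptions for illustration
intervals = {"weekly": 7, "monthly": 30, "3 months": 90, "6 months": 180}

for name, days in intervals.items():
    low, high = (dump_load(days, d) for d in DUMP_DAYS)
    print(f"{name:>9}: {low:.0%} to {high:.0%} of the time spent dumping")
```

a weekly dump would never finish before the next one starts, a monthly one would keep the dump running for roughly a quarter to a third of the time, and at 3-6 months the share drops to around a tenth or less.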

> 3) A dump of all OSM elements in OSM format
> (http://www.openstreetmap.org/api/0.6/node/60078445/history)

this is my favourite method as well. the easiest approach would be to
modify planet.c to dump the full history, instead of just the
current_* tables.
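
as a minimal sketch of the difference, not of planet.c itself: the toy schema below is a simplified assumption (the real tables have more columns and may use different names), and sqlite stands in for the real database. the point is only that the history dump reads every version from the full table rather than the latest row from current_*.

```python
import sqlite3

# toy stand-in for the real database. table and column names here are
# simplified assumptions, not the actual schema that planet.c reads.
db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE nodes (id INTEGER, version INTEGER,
                        lat REAL, lon REAL, visible INTEGER);
    CREATE TABLE current_nodes (id INTEGER PRIMARY KEY, version INTEGER,
                                lat REAL, lon REAL, visible INTEGER);
    INSERT INTO nodes VALUES
        (1, 1, 51.50, -0.10, 1), (1, 2, 51.51, -0.11, 1),
        (2, 1, 48.85,  2.35, 1), (2, 2, 48.85,  2.35, 0);
    INSERT INTO current_nodes VALUES
        (1, 2, 51.51, -0.11, 1), (2, 2, 48.85, 2.35, 0);
""")

COLUMNS = "id, version, lat, lon, visible"


def node_xml(row):
    """Format one row as an OSM-style <node> element (tags omitted)."""
    node_id, version, lat, lon, visible = row
    vis = "true" if visible else "false"
    return (f'<node id="{node_id}" version="{version}" '
            f'visible="{vis}" lat="{lat}" lon="{lon}"/>')


def dump_current(db):
    """Latest version of each visible node, read from current_nodes."""
    rows = db.execute(
        f"SELECT {COLUMNS} FROM current_nodes WHERE visible = 1 ORDER BY id")
    return [node_xml(r) for r in rows]


def dump_history(db):
    """Every version of every node, deletions included, from nodes."""
    rows = db.execute(f"SELECT {COLUMNS} FROM nodes ORDER BY id, version")
    return [node_xml(r) for r in rows]


print("\n".join(dump_history(db)))
```

with this toy data dump_current(db) yields a single element (node 1 at version 2), while dump_history(db) yields all four versions, including the deletion of node 2.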

note that brett has been working on option (2) by using osmosis to
dump historical diffs going all the way back to the inception of the
database. you can see the experimental results at
http://planet.openstreetmap.org/history/

for my money, if we do both (2) and (3), then we cater for all
consumers, and in a standard format. the output of the COPY command,
while good for backups, isn't really suited to dumping the information
that we have in the planet (given there will be edits by users who are
still not public, etc...)
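
to illustrate the kind of filtering a raw table dump can't do, here is a hedged sketch of leaving user details out of an element when the editor's data isn't public. the field names (data_public, display_name) and the helper functions are hypothetical, and the exact attributes a real dump would emit are an assumption.

```python
from xml.sax.saxutils import quoteattr


def user_attrs(user):
    """Return user attributes for the dump, or '' for non-public users."""
    if not user["data_public"]:
        return ""
    # quoteattr escapes the name and wraps it in quotes
    return f' user={quoteattr(user["display_name"])} uid="{user["id"]}"'


def node_line(node, user):
    """Format a node element, attributing it only if the user is public."""
    return (f'<node id="{node["id"]}" version="{node["version"]}"'
            f'{user_attrs(user)}/>')


# hypothetical sample records
node = {"id": 60078445, "version": 3}
public_user = {"id": 7, "display_name": "alice", "data_public": True}
hidden_user = {"id": 8, "display_name": "bob", "data_public": False}

print(node_line(node, public_user))
print(node_line(node, hidden_user))
```

a straight COPY of the tables would write out every row as stored, so this sort of per-element decision has to live in the dump program.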

if you want to get started hacking on planet.c then i'm happy to help.
otherwise i'm hoping to get around to it by the end of the month, but
there are never any guarantees ;-)

cheers,

matt



