[OSM-dev] full-history and 'the rails port' db schema

Andrew Harvey andrew.harvey4 at gmail.com
Sat Jul 23 01:50:28 BST 2011


Hi, I'm trying to recreate the complete database that backs the rails
port, i.e. the 2TB 'openstreetmap' postgres database running on smaug
which has schema as defined at
http://wiki.openstreetmap.org/wiki/Database_schema

It seems there are several possible routes; if you could provide any
wisdom here, it would be of great help. I believe my options are:

1. Start from an empty database and build it up by applying changes
one at a time. I'm guessing the files at
http://planet.openstreetmap.org/history/ are the ones to use; however,
the /history/ ones don't seem to have changeset details, and the
/{hour|minute}-replicate/ ones don't go back to the beginning of OSM
time. This is akin to replaying the transactions made to the live OSM
database against my mirror (a rough sketch of one replay step follows
below). Even if I could find the right files to use, I'm worried that
this may take too long to complete (i.e. months). Would this approach
be wise?
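For concreteness, a single replay step might look something like the
following. This is only a sketch: the connection details are
placeholders, and I'm assuming the --read-xml-change and
--write-apidb-change osmosis tasks are the right ones for this schema.

    # Apply one change file to the apidb mirror (hypothetical
    # host/database/user/password values).
    osmosis --read-xml-change file=changes.osc.gz \
            --write-apidb-change host="localhost" \
                database="openstreetmap" user="osm" password="osm" \
                validateSchemaVersion="no"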

2. I could use a full planet .osm file
http://planet.openstreetmap.org/full-experimental/full-planet-110619-1430.osm.bz2
as a base, and use osmosis to load this into postgres using
--write-apidb. I don't believe that osmosis currently supports this
from the full-planet file though. Is this a viable option? I tried it,
but osmosis gave errors about multiple nodes with the same id breaking
a db constraint, presumably because the full-history file contains
several versions of each element while --write-apidb expects a single
current version per id.
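Roughly the invocation I tried (connection details are placeholders):

    # Load a planet file into the apidb schema (hypothetical
    # credentials; this is the step that hit the constraint error).
    osmosis --read-xml file=full-planet-110619-1430.osm.bz2 \
            --write-apidb host="localhost" database="openstreetmap" \
                user="osm" password="osm"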

Ideally though, because I don't have a super powerful machine, I would
rather load an extract than the complete planet, if it would speed
things up (see the bounding-box sketch below).
http://lists.openstreetmap.org/pipermail/dev/2011-May/022624.html
should help here.
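If cutting an extract myself is sensible at all, I imagine something
like the following, though I doubt --bounding-box copes with
multi-version full-history files, so this may only apply to a snapshot
planet (the coordinates are just example values):

    # Cut a regional extract from a (snapshot) planet file.
    osmosis --read-xml file=planet.osm.bz2 \
            --bounding-box left=150.0 right=152.0 \
                top=-33.0 bottom=-35.0 \
            --write-xml file=extract.osm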

3. An OSM sysadmin could take a raw postgres dump/extract from smaug,
e.g. with pg_dump (is
http://trac.openstreetmap.org/browser/applications/utils/osmosis-history
supposed to make a dump file?). Would this be a fast way to replicate
the database? I suppose I wouldn't be able to load an extract this
way. On the upside, if it is fast to load, I won't need an extract, as
disk capacity isn't the bottleneck.
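What I'm picturing is just the standard dump/restore pair, assuming
the postgres versions involved support parallel restore:

    # On smaug: dump the database in custom format.
    pg_dump --format=custom openstreetmap > openstreetmap.dump

    # On my machine: restore it, using several parallel jobs.
    pg_restore --dbname=openstreetmap --jobs=4 openstreetmap.dump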


