[OSM-dev] Effort in the US wasted until TIGER import is complete?
frederik at remote.org
Wed Mar 21 11:19:21 GMT 2007
> The original approach on one machine at 1 sec insert cycle produces a huge
> amount of data quite quickly (it by far swamped the volume of data for the
> rest of the world put together in the time it ran) it will still take a
> considerable time to import everything, many many months rather than weeks.
Would it not be sensible - for this special case, where such a large
amount of data is imported - to import directly into the central
MySQL database? Of course this would require some admin cooperation
and oversight but for an amount of data that will (my guess) instantly
double what we already have, one shouldn't do anything without admin
cooperation or oversight anyway;-)
I guess the table structure used by the central server is in SVN, so
anyone could set up a MySQL database like ours and experiment with
direct importing of TIGER data, and once that works, prepare the "big
import". That would surely be much quicker than anything else. If
need be, index generation can be halted during the import and indexes
could be rebuilt afterwards, which is often considered a good idea
when importing a lot of data (don't know about the super special
tricky spatial indexes used in OSM but I guess it's the same with them).
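
A minimal sketch of the "drop the index, load everything, rebuild the
index" idea described above. It uses Python's standard sqlite3 module as
a stand-in for MySQL so that it runs anywhere; the table "nodes", its
columns and the index name are hypothetical and are not the OSM schema.
On MySQL with MyISAM tables, the comparable statements would probably be
ALTER TABLE ... DISABLE KEYS before the load and ALTER TABLE ... ENABLE
KEYS after it, which only affects non-unique indexes; whether that helps
with the spatial indexes is untested here.

```python
import sqlite3


def make_db():
    """Create an in-memory database with a hypothetical 'nodes' table and index."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE nodes (id INTEGER PRIMARY KEY, lat REAL, lon REAL)")
    conn.execute("CREATE INDEX idx_nodes_lat_lon ON nodes (lat, lon)")
    return conn


def bulk_import(conn, rows):
    """Drop the secondary index, insert all rows in one transaction, rebuild the index."""
    # Without the index, each insert avoids the index maintenance cost.
    conn.execute("DROP INDEX IF EXISTS idx_nodes_lat_lon")
    # 'with conn' commits on success and rolls back if an insert fails.
    with conn:
        conn.executemany(
            "INSERT INTO nodes (id, lat, lon) VALUES (?, ?, ?)", rows
        )
    # Rebuild the index once, over the complete data set.
    conn.execute("CREATE INDEX idx_nodes_lat_lon ON nodes (lat, lon)")
    conn.commit()


if __name__ == "__main__":
    conn = make_db()
    # Made-up coordinates, purely to have something to insert.
    rows = [(i, 49.0 + i * 1e-6, 8.0 + i * 1e-6) for i in range(10000)]
    bulk_import(conn, rows)
    print(conn.execute("SELECT COUNT(*) FROM nodes").fetchone()[0])
```

The same shape would apply to a trial run on a separate machine: load a
copy of the current data, run the import with indexes off, rebuild, and
time each step before touching the central server.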
Importing TIGER bit by bit over a period of weeks will probably
cause more server trouble than just closing shop for a night and
running the full import. A test run where a separate MySQL instance
(not on OSM hardware but somewhere else) is primed with the full
current database and TIGER then inserted on top of it would also be
All this said with the caveat that I had nothing to do with the
original TIGER import besides reading what went over the lists.
Frederik Ramm ## eMail frederik at remote.org ## N49°00.09' E008°23.33'