[OSM-dev] Effort in the US wasted until TIGER import is complete?

Frederik Ramm frederik at remote.org
Wed Mar 21 11:19:21 GMT 2007


Hi,

> The original approach on one machine at 1 sec insert cycle produces a huge
> amount of data quite quickly (it by far swamped the volume of data for the
> rest of the world put together in the time it ran) it will still take a
> considerable time to import everything, many many months rather than weeks.

Would it not be sensible - for this special case, where such a large
amount of data is imported - to import directly into the central
MySQL database? Of course this would require some admin cooperation
and oversight, but for an amount of data that will (my guess) instantly
double what we already have, one shouldn't do anything without admin
cooperation or oversight anyway ;-)

I guess the table structure used by the central server is in SVN, so  
anyone could set up a MySQL database like ours and experiment with  
direct importing of TIGER data, and once that works, prepare the "big  
import". That would surely be much quicker than anything else. If  
need be, index generation can be halted during the import and indexes  
could be rebuilt afterwards, which is often considered a good idea  
when importing a lot of data (don't know about the super special  
tricky spatial indexes used in OSM but I guess it's the same with them).
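
The "drop the index, bulk insert, rebuild afterwards" idea can be sketched
roughly as follows. This is only an illustration under stated assumptions:
it uses Python's built-in sqlite3 module as a stand-in for the MySQL server,
and the table name "nodes", its columns, and the index name are hypothetical,
not the real OSM schema. On MySQL with MyISAM tables, the comparable
statements for non-unique indexes would be ALTER TABLE ... DISABLE KEYS and
ALTER TABLE ... ENABLE KEYS; whether the same works for the spatial indexes
is not covered here.

```python
import sqlite3

# Hypothetical names for illustration only; not the actual OSM schema.
INDEX_NAME = "idx_nodes_lat_lon"


def make_demo_db():
    """Create an in-memory database with one table and one secondary index."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE nodes (id INTEGER PRIMARY KEY, lat REAL, lon REAL)")
    conn.execute(f"CREATE INDEX {INDEX_NAME} ON nodes (lat, lon)")
    return conn


def bulk_import(conn, rows):
    """Drop the secondary index, insert all rows in one transaction,
    then rebuild the index once at the end."""
    conn.execute(f"DROP INDEX IF EXISTS {INDEX_NAME}")
    with conn:  # commits on success, rolls back if an insert fails
        conn.executemany(
            "INSERT INTO nodes (id, lat, lon) VALUES (?, ?, ?)", rows
        )
    conn.execute(f"CREATE INDEX {INDEX_NAME} ON nodes (lat, lon)")
    conn.commit()


def index_names(conn):
    """Return the names of user-created indexes on the nodes table."""
    cur = conn.execute(
        "SELECT name FROM sqlite_master "
        "WHERE type = 'index' AND tbl_name = 'nodes' "
        "AND name NOT LIKE 'sqlite_%' ORDER BY name"
    )
    return [row[0] for row in cur.fetchall()]


if __name__ == "__main__":
    conn = make_demo_db()
    # Made-up sample points, not real TIGER data.
    sample = [(i, 40.0 + i * 0.001, -75.0 - i * 0.001) for i in range(1000)]
    bulk_import(conn, sample)
    print(conn.execute("SELECT COUNT(*) FROM nodes").fetchone()[0], index_names(conn))
```

The point of the pattern is that the index is built once over the full data
set instead of being updated on every single insert; how much that saves on
the real server would have to be measured on a test instance.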

Importing TIGER bit by bit over a period of weeks will probably
cause more server trouble than just closing shop for a night and
running the full import. A test run where a separate MySQL instance
(not on OSM hardware but somewhere else) is primed with the full
current database and TIGER then inserted on top of it would also be
possible.

All this said with the caveat that I had nothing to do with the  
original TIGER import besides reading what went over the lists.

Bye
Frederik

-- 
Frederik Ramm  ##  eMail frederik at remote.org  ##  N49°00.09' E008°23.33'





