[OSM-dev] Suggestion about design of OSM data structure

Thu Jul 26 21:58:48 BST 2007

Hi,

>> How do you trim the ways before trimming the segments? The segments 
>> appear in the planet.osm file before the ways. Remember that the 
>> planet.osm file is huge (> 5 GB); you cannot load it into memory. You 
>> need to read it one line at a time and process each line in turn.
> 
> The way that I was thinking of doing it would work when you are using 
> some advanced SQL queries in a database. As you correctly point out, 
> when reading from the planet file, this would be impractical. If 
> incomplete ways were an issue, then the planet file could be re-read to 
> include the segments and nodes that are missing.

Yes, I've been doing that on some occasions already. You can even, while 
doing the first pass, create an index that maps node/segment IDs to 
byte positions in the planet file. We have 12 million of each at the 
moment, so an index storing an 8-byte offset for each node and segment 
(24 million entries, about 192 MB) will need less than 300 MB of RAM 
(and you can even trade speed for lower memory use by storing only 
every 10th entry or so, as long as they're ordered).
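
To make the idea concrete, here is a minimal sketch in Python (not the 
Perl I would normally hand out). It assumes one element per line, 
double-quoted id attributes, and IDs in ascending order within the 
file. The names build_index and fetch_line are made up for this 
example. Note that it stores the ID next to each offset, so with 
every=1 it uses roughly twice the memory estimated above; every=10 
brings that down again at the cost of a short forward scan per lookup.

```python
import re
from array import array
from bisect import bisect_right

# Matches the opening of a <node> or <segment> element and captures its id.
# Assumption: one element per line, id attribute in double quotes.
ELEMENT_RE = re.compile(rb'<(node|segment)\b[^>]*?\bid="(-?\d+)"')


def build_index(path, kind, every=1):
    """First pass: remember the byte offset of every `every`-th element.

    Returns two parallel arrays of signed 8-byte integers (ids, offsets).
    Assumes elements of `kind` appear in ascending id order.
    """
    ids, offsets = array("q"), array("q")
    want = kind.encode()
    seen = 0
    with open(path, "rb") as f:
        offset = 0
        for line in f:
            m = ELEMENT_RE.search(line)
            if m and m.group(1) == want:
                if seen % every == 0:
                    ids.append(int(m.group(2)))
                    offsets.append(offset)
                seen += 1
            offset += len(line)  # binary mode, so this counts bytes
    return ids, offsets


def fetch_line(path, index, kind, elem_id):
    """Later pass: seek near the element, then scan forward to it.

    Returns the element's line as text, or None if it is not found.
    """
    ids, offsets = index
    pos = bisect_right(ids, elem_id) - 1  # last indexed id <= elem_id
    if pos < 0:
        return None
    want = kind.encode()
    with open(path, "rb") as f:
        f.seek(offsets[pos])
        for line in f:
            m = ELEMENT_RE.search(line)
            if m and m.group(1) == want:
                found = int(m.group(2))
                if found == elem_id:
                    return line.decode("utf-8")
                if found > elem_id:
                    return None  # ids are ordered, so we have passed it
    return None
```

Typical use would be one build_index call per element type during the 
first pass, then fetch_line for each node or segment that a trimmed way 
turns out to be missing.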

Of course nothing beats a proper database, but for little one-off tasks 
I always like to be able to offer a 50-line Perl script with "nothing 
else required, just run this on the planet file and wait half a day" 
which many people seem to find less of a hurdle than to install a 
database first.

Bye
Frederik

-- 
Frederik Ramm  ##  eMail frederik at remote.org  ##  N49°00.09' E008°23.33'