[Geocoding] Local Mirror of OSM Data...

andrzej zaborowski balrogg at gmail.com
Fri Oct 23 18:27:50 BST 2009


2009/10/23 Peter Childs <pchilds at bcs.org>:
> No I'm using...
> bzcat planet-latest.osm.bz2 | osmosis-0.31/bin/osmosis --read-xml
> file=/dev/stdin --write-pgsql database=map
> and apparently CPU usage, (according to TOP, (Give or take))
> osmosis 55%, postgres 15%, bzcat 12%
> Can't believe it takes 4 times the resources to convert XML that it takes to
> decompress a file.....

On Linux I usually pin the processes to different processors manually
using "taskset" to make sure they're using both cores.  I'm not sure
why Linux doesn't get it right by default.  Someone suggested on IRC
that there's overhead from "copying" the data between CPUs, which I
believe is a myth: an unbound process on a dual-core CPU is normally
migrated about 200 times a second from one CPU to the other when
they're both idle, and there's no overhead from this.
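
For reference, a rough sketch of what that pinning can look like.  It
assumes the util-linux "taskset" tool and a machine whose cores are
numbered 0 and 1; "pinned" is just a helper name made up for this
example, and it falls back to running the command unpinned if taskset
is not installed.

```shell
#!/bin/sh
# Run a command pinned to a single core when taskset is available,
# otherwise run it unpinned. "pinned" is a hypothetical helper name.
pinned() {
    core=$1
    shift
    if command -v taskset >/dev/null 2>&1; then
        # -c takes a core list (here a single core number)
        taskset -c "$core" "$@"
    else
        "$@"
    fi
}

# Usage with the pipeline quoted above (core numbers are assumptions):
# pinned 0 bzcat planet-latest.osm.bz2 |
#     pinned 1 osmosis-0.31/bin/osmosis --read-xml file=/dev/stdin --write-pgsql database=map
```

postgres runs as a separate server process, so it is not covered by
this; it would have to be pinned on its own if that matters.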

Also, if I know I'm going to process a planet snapshot more than once,
I download it and immediately recompress it using gzip (it grows to
about 11GB).  Both compression and decompression are ~10 times faster
for gzip than for bzip2, and the bottleneck moves from the CPU closer
to IO.  On my hardware, which is an average PC, the gzip ratio seems to
be close to the sweet spot where both the CPU and the disk are equally
busy.
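
The one-off recompression could be done roughly like this.  It is only
a sketch: "recompress_to_gz" is a made-up helper name, and it assumes
the input file name ends in ".bz2" and that there is enough disk space
to keep both copies.

```shell
#!/bin/sh
# Recompress a .bz2 file to .gz next to it, keeping the original.
# "recompress_to_gz" is a hypothetical helper name.
recompress_to_gz() {
    src=$1
    dst=${src%.bz2}.gz      # planet-latest.osm.bz2 -> planet-latest.osm.gz
    bzcat "$src" | gzip > "$dst"
}

# Usage (file names taken from the command quoted above):
# recompress_to_gz planet-latest.osm.bz2
# zcat planet-latest.osm.gz | osmosis-0.31/bin/osmosis --read-xml file=/dev/stdin --write-pgsql database=map
```

Later runs then read the .gz copy with zcat in place of bzcat.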

Cheers



