[OSM-dev] Memory error while converting osm to gml

Stefan de Konink stefan at konink.de
Thu Apr 2 23:40:33 BST 2009


Iván Sánchez Ortega wrote:
> On Thursday, 2 April 2009, Stefan de Konink wrote:
>> Does anyone have any timings on importing the current planet.osm in
>> PostgreSQL with any tool available? Using lets say 8GB of ram and
>> 'typical' disk?
> 
> I can only say that I imported a planet (with custom, non-OSM data) worth 170m 
> nodes and 12m ways (about 2GB in .bz2) in less than five hours.

The current OSM is worth around 330m nodes, of that 360m used in ways, 
having 26m ways.

> Just for the record: I do think the import is roughly as fast as batch 
> processing 2 GB of small files. And I'm sorry I didn't run any 
> benchmarking tools at the time O:-)

Preparations are being made to benchmark, but I want to know what numbers 
I can expect. My usual import consists of extracting the planet, 
creating CSV files, loading them, and validating constraints.
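
A minimal Python sketch of the "creating CSV files" stage described above, assuming the planet is already extracted to OSM XML. This is not the tool used for the import; the function name `nodes_to_csv` and the column layout (id, lat, lon) are hypothetical and only illustrate the streaming approach.

```python
import csv
import xml.etree.ElementTree as ET


def nodes_to_csv(osm_source, csv_out):
    """Stream <node> elements from OSM XML into CSV rows (id, lat, lon).

    osm_source is a path or a binary file object, csv_out a text file
    object. Returns the number of nodes written.
    """
    writer = csv.writer(csv_out)
    count = 0
    context = ET.iterparse(osm_source, events=("start", "end"))
    _event, root = next(context)  # first event is the start of <osm>
    for event, elem in context:
        if event != "end":
            continue
        if elem.tag == "node":
            writer.writerow([elem.get("id"), elem.get("lat"), elem.get("lon")])
            count += 1
        if elem.tag in ("node", "way", "relation"):
            # Drop finished top-level elements so memory use stays flat.
            root.clear()
    return count
```

The resulting file could then be bulk loaded, for example with PostgreSQL's COPY, before the constraints are validated.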

Matt Amos wrote:
 >>> i second frederik's recommendation - import into postgres using
 >>> osm2pgsql then export the GML from that. its a little convoluted, but
 >>> the toolchain is well tested.
 >> Does anyone have any timings on importing the current planet.osm in
 >> PostgreSQL with any tool available? Using lets say 8GB of ram and
 >> 'typical' disk?
 >
 > we do these in about 3 hours in-memory from a bz2 file, but it uses
 > about 4.5Gb. machine is a dual quad-core xeon with 16Gb ram, disks are
 > 5x sata raid 5. during import the max disk read was 15k blocks/s.
 > osm2pgsql seems to spend most of its time at >80% cpu, so it looks
 > like most of the effort is going on decompressing and xml parsing.

What I have seen with on-the-fly gzip and bzip2 decompression on a 5x 
RAID 0: only one CPU was busy decompressing, and the actual tool had a CPU 
time of .30. It was clear that decompressing first and then running the tool 
on an mmap'ed file was faster.
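
The "decompress first, then mmap" approach can be sketched roughly as follows in Python. The helper names `decompress_to_file` and `count_in_mmap` are hypothetical, and counting a byte pattern merely stands in for whatever the real tool does with the mapped file.

```python
import bz2
import mmap
import shutil


def decompress_to_file(bz2_path, out_path, chunk_size=1 << 20):
    """Decompress a .bz2 file to disk once, before any processing."""
    with bz2.open(bz2_path, "rb") as src, open(out_path, "wb") as dst:
        shutil.copyfileobj(src, dst, chunk_size)


def count_in_mmap(path, needle):
    """Memory-map the decompressed file and count occurrences of needle."""
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            count = 0
            pos = mm.find(needle)
            while pos != -1:
                count += 1
                pos = mm.find(needle, pos + len(needle))
            return count
```

With the file decompressed up front, the processing tool no longer waits on a single-threaded decompressor, at the cost of the disk space for the uncompressed planet.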

So typically the first two stages are both done in less than two hours; 
but today I discussed the possibility of natively writing binary files, 
which is trivial for tables/columns that do not contain strings.
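
To illustrate why string-free columns are the easy case: every row has the same size, so it can be written as a fixed-width record with no escaping or length prefixes. The Python sketch below uses a hypothetical layout (64-bit id, lat and lon as 32-bit integers in units of 1e-7 degrees); it is not PostgreSQL's binary COPY format and not the format discussed above.

```python
import struct

# Hypothetical record layout: little-endian, no padding, 16 bytes per node.
NODE_RECORD = struct.Struct("<qii")
SCALE = 10_000_000  # store degrees as integers in units of 1e-7 degrees


def pack_node(node_id, lat, lon):
    """Encode one node as a fixed-width binary record."""
    return NODE_RECORD.pack(node_id, round(lat * SCALE), round(lon * SCALE))


def unpack_nodes(buf):
    """Decode a buffer of concatenated node records."""
    for node_id, lat, lon in NODE_RECORD.iter_unpack(buf):
        yield node_id, lat / SCALE, lon / SCALE
```

String columns such as tag keys and values vary in length, so they would need a length prefix or an offset table on top of this.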


For ASB-1 [Api Six Bench times one] I hope to provide some generic numbers.


Stefan



