[osmosis-dev] planet import
Brian DeRocher
brian at derocher.org
Sun Apr 10 03:27:31 BST 2011
Hey everyone,
I started a full planet import on March 29th, 11 days ago. I'm trying to get an idea of how long this will take. I just want to know whether it will take about 20 days or more like 40 days.
Here's my setup:
2 dual-core Opterons; CPU is not the bottleneck
8 GB RAM; htop reports this RES memory usage:
postgres 1082M UPDATE
java osmosis 91928 (15 processes/threads?)
Areca RAID 5, 1 TB, with 3 disks
/var is 552 GB: 444 GB used (87%), 80 GB available
This usage has gone up and down from 84% to 91% a few times per day.
The import added about 300GB.
Debian 6.0
PostgreSQL 8.4 is probably not tuned well for this hardware, and it's not tuned well for large imports.
work_mem 1MB
maintenance_work_mem 16MB
checkpoint_segments 3
fsync on (I have a BBU and may set this to off in the future)
shared_buffers 24MB
The xlog is on the RAID 5 array too.
I've modified osmosis to connect to port 5433. Did I miss something? Can I specify the port on the command line?
I ran: $ bzcat planet-110316.osm.bz2 | src/osmosis-0.34+ds1/bin/osmosis --read-xml file=- --write-pgsql host="localhost" user="osm" password="Shut up, Ted."
Here's the log so far.
Mar 29, 2011 11:11:43 PM org.openstreetmap.osmosis.core.Osmosis run
INFO: Osmosis Version 0.34
log4j:WARN No appenders could be found for logger (org.java.plugin.ObjectFactory).
log4j:WARN Please initialize the log4j system properly.
Mar 29, 2011 11:11:44 PM org.openstreetmap.osmosis.core.Osmosis run
INFO: Preparing pipeline.
Mar 29, 2011 11:11:44 PM org.openstreetmap.osmosis.core.Osmosis run
INFO: Launching pipeline execution.
Mar 29, 2011 11:11:44 PM org.openstreetmap.osmosis.core.Osmosis run
INFO: Pipeline executing, waiting for completion.
Sadly, I did not configure logging correctly.
According to pg_stat_activity, PostgreSQL is currently running this statement, so it looks like the import is mostly done.
UPDATE ways SET bbox = (SELECT Envelope(Collect(geom)) FROM nodes JOIN way_nodes ON way_nodes.node_id = nodes.id WHERE way_nodes.way_id = ways.id)
Looks like a correlated subquery to me. Probably performing a nested loop.
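One way to confirm the plan without executing anything would be to ask the planner directly. This is only a sketch: plain EXPLAIN (without ANALYZE) prints the estimated plan and does not run the UPDATE, and I have not checked what it reports on this database.

```sql
-- Show the estimated plan only; without ANALYZE the statement is not executed.
EXPLAIN
UPDATE ways
SET bbox = (
    SELECT Envelope(Collect(geom))
    FROM nodes
    JOIN way_nodes ON way_nodes.node_id = nodes.id
    WHERE way_nodes.way_id = ways.id
);
```

A SubPlan node attached to the scan of ways would indicate that the subquery is evaluated once per way.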
I've read on the mailing list that adding the bbox and linestring columns will make the import "much" longer. So does that mean 10 days or 100 days?
I checked \d ways and I see "idx_ways_bbox" gist (bbox) and "idx_ways_linestring" gist (linestring). So either those indexes were created after "UPDATE ways set bbox..." or I am seeing the database as it was before the transaction started.
I don't know whether this is running in a transaction or not. I can't find the BEGIN in the code. I do see setAutoCommit(), and it appears to be called with false.
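Whether the statement is inside a longer transaction should be visible from pg_stat_activity. A sketch, assuming the PostgreSQL 8.4 column names (procpid, current_query); later versions renamed these, and the LIKE pattern is just my guess at how the statement text begins:

```sql
-- xact_start is when the backend's current transaction began;
-- query_start is when the statement it is currently running began.
SELECT procpid, xact_start, query_start, waiting, current_query
FROM pg_stat_activity
WHERE current_query LIKE 'UPDATE ways%';
```

If xact_start is well before query_start, the UPDATE is running inside a transaction that was opened earlier.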
Any tips?
Thanks,
Brian
--
Brian DeRocher
http://brian.derocher.org