[OSM-dev] Osmosis Postgres Vacuum Stall Upload

Humphries, Grant HumphriG at TriMet.org
Tue Apr 7 21:49:35 UTC 2015


Thanks for the response Flo, I was able to do as you suggested. Within this file: https://github.com/openstreetmap/osmosis/blob/fa8ff0e3d3ed39a7951c8daf3169dcf892141b37/osmosis-pgsnapshot/src/main/java/org/openstreetmap/osmosis/pgsnapshot/v0_6/impl/CopyFilesetLoader.java

I simply replaced:
dbCtx.getJdbcTemplate().update("VACUUM ANALYZE");

with:
dbCtx.getJdbcTemplate().update("VACUUM ANALYZE users");
dbCtx.getJdbcTemplate().update("VACUUM ANALYZE nodes");
dbCtx.getJdbcTemplate().update("VACUUM ANALYZE ways");
dbCtx.getJdbcTemplate().update("VACUUM ANALYZE way_nodes");
dbCtx.getJdbcTemplate().update("VACUUM ANALYZE relations");
dbCtx.getJdbcTemplate().update("VACUUM ANALYZE relation_members");

and built the tool as outline on the github page.  Everything seems to running smoothly and it reduced load time of the data from ~30 minutes to about 90 seconds.

-----Original Message-----
From: Florian Lohoff [mailto:f at zz.de] 
Sent: Saturday, April 04, 2015 12:00 PM
To: Humphries, Grant
Cc: 'dev at openstreetmap.org'
Subject: Re: [OSM-dev] Osmosis Postgres Vacuum Stall Upload

On Sat, Apr 04, 2015 at 01:02:38AM +0000, Humphries, Grant wrote:
> Hi all,
> 
> I have a script set up that loads OSM data into a postgres db via Osmosis (into the snapshot schema) each night.  The database that it being loaded has a lot of other data in various schemas and the osm data is loaded into its own schema.  However it seems that as a part of the loading process Osmosis vacuums the entire database and because the db is so large this is taking about 25 minutes, while the loading of the data itself takes less than 5 minutes.
> 
> (At least this is what I believe is happening, when I run osmosis under verbose this is the line it hangs up on for ~25 mins:
> Apr 3, 2015 5:13:02 PM 
> org.openstreetmap.osmosis.pgsnapshot.v0_6.impl.CopyFilesetLoader run
> FINE: Vacuuming database.)
> 
> Does anyone have advice as to how I could address this issue?

I would enable statement + time logging in postgres to see which statement results in the delay.

Then modify osmosis to not call that statement. In the end its open source and if it breaks you may keep both halves.

Flo
-- 
Florian Lohoff                                                 f at zz.de
     We need to self-defense - GnuPG/PGP enable your email today!



More information about the dev mailing list