[OSM-talk] Size of installed Database?

Paul Norman penorman at mac.com
Sat Aug 31 22:34:16 UTC 2013


> From: TorstenMohr at schleim.mac.com [mailto:TorstenMohr at schleim.mac.com]
> Subject: [OSM-talk] Size of installed Database?
> 
> Hello,
> 
> I'm about to install the latest planet.osm and I'd like to install it on
> an SSD.
> 
> The bzipped XML file is approximately 31 GB in size, but how much space
> will I need on an SSD for the PostgreSQL database?

You haven't said you're using the XML file, but I'd suggest using the PBF 
instead.
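You can download the PBF planet directly, or convert an XML planet you
already have. As one illustration (osmosis is just one tool that can do
this; it's not something the original mail mentions):

```shell
# Convert a bzipped XML planet to PBF with osmosis. PBF is both smaller
# on disk and much faster for osm2pgsql to parse.
osmosis --read-xml file=planet-latest.osm.bz2 \
        --write-pbf file=planet-latest.osm.pbf
```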

> I haven't bought an SSD yet and I'd like to make sure that the database
> will fit on the SSD.
> 
> It would be very kind if somebody could tell me how much space an
> installation takes on disk and how old their installation is (if it is
> not current).

I'm assuming you're asking about an osm2pgsql rendering database being
updated with diffs. If not, some of the following concepts still apply,
but the numbers will differ.

Because it's been a while since this has come up, I figure I might as
well go over osm2pgsql size in detail. The disk space required is
governed by the size of the OSM data, the import options used, and
database bloat.

1. Size of OSM data 

More data = more space. PBF is faster and smaller, but doesn't change 
database size. 

2. Import options used 

If you're dealing with the entire planet, you should be using 
--flat-nodes, as it's faster and saves space. The flat-nodes file uses 
about the same space for an extract as for the full planet, so it's not 
worth using for small extracts. Aside from that, there's not much you 
can tweak to save space, except --drop. If you aren't planning on doing 
updates, use --slim --drop to get rid of the slim tables and save a lot 
of space. 

Using hstore or a bigger .style file will use more space.
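Putting those options together, a no-updates planet import might look
something like this (the database name, file paths, and cache size here
are placeholders, not values from anyone's actual setup):

```shell
# Full-planet import with no update support: --slim --drop throws away
# the slim tables once the import finishes, and --flat-nodes keeps node
# locations in a flat file instead of in the database.
osm2pgsql --slim --drop \
    --flat-nodes /ssd/flat-nodes.bin \
    --cache 24000 \
    -d gis planet-latest.osm.pbf
```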

3. Database bloat 

Over time, data gets updated or deleted in the database. Postgres 
recovers this space with autovacuum, but the default autovacuum settings 
are not aggressive enough. You want to lower 
autovacuum_vacuum_scale_factor so it will autovacuum more frequently. 
autovacuum_analyze_scale_factor may also be adjusted. I use 0.04 and 
0.02 for these.
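In postgresql.conf terms, those settings look like this (the defaults
are 0.2 and 0.1, so these make autovacuum kick in far more often):

```
# postgresql.conf -- vacuum/analyze when 4%/2% of a table has churned,
# instead of the default 20%/10%
autovacuum_vacuum_scale_factor = 0.04
autovacuum_analyze_scale_factor = 0.02
```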

A second problem is index bloat. If you can tolerate the database locks, 
just do a REINDEX (or a CLUSTER). Otherwise, create a new index 
concurrently and then use it to replace the old index. You can do 
reindexing on an index-by-index basis.
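A sketch of the concurrent swap for a single index (the table and index
names here are examples, not necessarily what osm2pgsql created in your
database; note CREATE INDEX CONCURRENTLY can't run inside a transaction
block, so these go in as separate statements):

```shell
# Build a replacement index without taking the long exclusive lock:
psql -d gis -c "CREATE INDEX CONCURRENTLY planet_osm_polygon_way_new
                ON planet_osm_polygon USING gist (way);"
# Once it's built, swap it in for the bloated one:
psql -d gis -c "DROP INDEX planet_osm_polygon_way;"
psql -d gis -c "ALTER INDEX planet_osm_polygon_way_new
                RENAME TO planet_osm_polygon_way;"
```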

Now, for numbers. It's important to remember that bloat changes the 
size: if you turn autovacuum off, import a year-old planet, and then 
update it, you get a different size than if you import a fresh planet.

Yevaud had a fresh planet imported 3 weeks ago. It is 263GB in-DB 
currently, with an additional ~20GB for flat-nodes. Of this, it is 60GB 
for slim data, 105GB for slim indexes, 76GB for rendering data, 19GB for 
rendering indexes and 1.6GB for non-standard rendering indexes. 

This gives you a total of roughly 95GB for rendering data and indexes, 
and an additional ~190GB for slim data, slim indexes, and the 
flat-nodes file.
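As a quick sanity check on those figures (integer GB, so the totals come
out a touch under the rounded numbers):

```shell
# Sum the per-category sizes from Yevaud (GB, 1.6 rounded up to 2).
slim=$((60 + 105))           # slim data + slim indexes
rendering=$((76 + 19 + 2))   # rendering data + indexes + non-standard
echo "in-DB total: $((slim + rendering)) GB"   # ~263GB as reported
echo "slim + flat-nodes: $((slim + 20)) GB"    # the ~190GB slim figure
```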

One additional caution: the import temporarily requires extra space to 
do the CLUSTER.

This leads to a few conclusions:

- If you don't need updates, use --slim --drop --flat-nodes and delete 
the flat-nodes file to save lots of space, particularly during import 

- You probably want to run reindex every 1-3 months 

- If you want to update your database by reloading into a new DB and 
replacing the old DB with it, the additional space requirements during 
import vs. non-slim are modest (slim size - CLUSTER requirements for 
non-slim tables = ~60GB) 

- I'd recommend a 512GB-class SSD with a new server if you want 
everything on an SSD. The OSMF rendering servers use 512GB 840 Pros. If 
you have less space, use the tablespace options to split things up, 
particularly if you're only rendering and just need the slim tables for 
the import. 
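For example, to keep the rendering tables on the SSD while pushing the
slim tables to a spinning disk (the tablespace name and paths are made
up for illustration; check your osm2pgsql version for the exact option
names):

```shell
# Create a tablespace on the slower disk (run as a Postgres superuser):
psql -d gis -c "CREATE TABLESPACE slimspace LOCATION '/hdd/pg_slim';"

# Point the slim tables and their indexes at it during import:
osm2pgsql --slim --flat-nodes /ssd/flat-nodes.bin \
    --tablespace-slim-data slimspace \
    --tablespace-slim-index slimspace \
    -d gis planet-latest.osm.pbf
```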

- Tune autovacuum aggressively if you're short on space.



