[OSM-dev] Postgres 8.4/PostGIS 1.4 performance
Frederik Ramm
frederik at remote.org
Tue Sep 15 10:42:46 BST 2009
Hi,
(taking this back onto dev)
Jon Burgess wrote:
> It sounds odd that they would be quite so wrong. The osm2pgsql code
> performs an ANALYZE as part of the processing.
Something must be wrong there then. The osm2pgsql import with 8.3 has
now completed, again needing 1800 minutes compared to the 2000 needed by
the 8.4 version. The pg_class table, again, listed the nodes as having
relpages=0. Then I did a simple "analyze" which took a minute or two,
and afterwards I got sensible results:
For 8.3 with osm2pgsql 0.67, slim mode and -l:
relname | size | reltuples | relpages
--------------------------+------------+-----------+----------
planet_osm_ways_nodes | 20 GB | 32636734 | 2577003
planet_osm_nodes | 18 GB | 420728864 | 2333748
planet_osm_nodes_pkey | 9007 MB | 420728864 | 1152902
planet_osm_line | 7105 MB | 27390248 | 909401
planet_osm_ways | 6373 MB | 32636734 | 815762
planet_osm_line_index | 1613 MB | 27390248 | 206460
planet_osm_polygon | 1248 MB | 4276819 | 159700
planet_osm_ways_pkey | 1002 MB | 32636734 | 128192
planet_osm_roads | 800 MB | 2496160 | 102457
planet_osm_point | 615 MB | 7954021 | 78728
planet_osm_line_pkey | 587 MB | 27390248 | 75093
planet_osm_point_index | 464 MB | 7954021 | 59423
planet_osm_polygon_index | 260 MB | 4276819 | 33337
planet_osm_point_pkey | 171 MB | 7954021 | 21825
planet_osm_roads_index | 151 MB | 2496160 | 19373
planet_osm_ways_idx | 113 MB | 0 | 14499
planet_osm_polygon_pkey | 92 MB | 4276819 | 11735
planet_osm_rels_parts | 85 MB | 171435 | 10904
planet_osm_rels | 84 MB | 171435 | 10727
planet_osm_roads_pkey | 54 MB | 2496160 | 6929
planet_osm_rels_pkey | 3784 kB | 171435 | 473
planet_osm_rels_idx | 8192 bytes | 0 | 1
(22 rows)
For 8.4 with the same parameters (this time after "analyze"):
relname | size | reltuples | relpages
--------------------------+------------+-----------+----------
planet_osm_ways_nodes | 18 GB | 32352920 | 2406878
planet_osm_nodes | 18 GB | 420052384 | 2333748
planet_osm_nodes_pkey | 9007 MB | 420052384 | 1152902
planet_osm_line | 7104 MB | 27426062 | 909316
planet_osm_ways | 6364 MB | 32352920 | 814552
planet_osm_line_index | 1607 MB | 27426062 | 205720
planet_osm_polygon | 1247 MB | 4286884 | 159647
planet_osm_ways_pkey | 1001 MB | 32352920 | 128190
planet_osm_roads | 800 MB | 2525946 | 102446
planet_osm_point | 615 MB | 7954690 | 78728
planet_osm_line_pkey | 587 MB | 27426062 | 75093
planet_osm_point_index | 464 MB | 7954690 | 59423
planet_osm_polygon_index | 260 MB | 4286884 | 33337
planet_osm_point_pkey | 171 MB | 7954690 | 21825
planet_osm_roads_index | 151 MB | 2525946 | 19373
planet_osm_ways_idx | 113 MB | 0 | 14499
planet_osm_polygon_pkey | 92 MB | 4286884 | 11735
planet_osm_rels | 84 MB | 171433 | 10727
planet_osm_rels_parts | 67 MB | 171433 | 8545
planet_osm_roads_pkey | 54 MB | 2525946 | 6929
planet_osm_rels_pkey | 3784 kB | 171433 | 473
planet_osm_rels_idx | 8192 bytes | 0 | 1
(22 rows)
Now the two tables create more questions than they answer. Why does the
node table under 8.4 seem to have less entries than under 8.3 when both
have been populated with the same planet file?
osm=# select max(id) from planet_osm_nodes;
max
-----------
480684210
same for both databases.
osm=# select count(*) from planet_osm_nodes;
count
-----------
420480680
also the same for both. Which means that 8.3 over- and 8.4
under-estimates the actual number in the pg_class listing.
Maybe the 18 GB versus 20 GB is just an estimate as well because I
cannot seem to find a difference in node storage size:
osm=# select id,pg_column_size(planet_osm_nodes.*) from planet_osm_nodes
where id in (1,2,3,4,5,49786136,49786237,50052568,51308246,52779935);
id | pg_column_size
----------+----------------
2 | 109
3 | 36
4 | 36
5 | 36
49786136 | 236
49786237 | 240
50052568 | 276
51308246 | 208
52779935 | 204
Same for 8.3 and 8.4.
osm=# select sum(pg_column_size(planet_osm_nodes.*)) from
planet_osm_nodes where id > 400000000;
sum
------------
2622633489
Same too.
But then the disk usage for 8.3:
# du 8.3/main
328 8.3/main/global
4 8.3/main/pg_xlog/archive_status
688976 8.3/main/pg_xlog
12 8.3/main/pg_clog
12 8.3/main/pg_subtrans
4 8.3/main/pg_twophase
12 8.3/main/pg_multixact/members
12 8.3/main/pg_multixact/offsets
28 8.3/main/pg_multixact
4352 8.3/main/base/1
4352 8.3/main/base/11510
4448 8.3/main/base/11511
73569488 8.3/main/base/16385
4 8.3/main/base/pgsql_tmp
73582648 8.3/main/base
4 8.3/main/pg_tblspc
74272028 8.3/main
and 8.4:
# du 8.4/main
488 8.4/main/global
4 8.4/main/pg_xlog/archive_status
688976 8.4/main/pg_xlog
12 8.4/main/pg_clog
12 8.4/main/pg_subtrans
4 8.4/main/pg_twophase
12 8.4/main/pg_multixact/members
12 8.4/main/pg_multixact/offsets
28 8.4/main/pg_multixact
5484 8.4/main/base/1
5484 8.4/main/base/11563
5592 8.4/main/base/11564
72235152 8.4/main/base/16384
4 8.4/main/base/pgsql_tmp
72251720 8.4/main/base
4 8.4/main/pg_tblspc
4 8.4/main/pg_stat_tmp
72941260 8.4/main
There seems to be some truth about the 18 GB vs 20 GB figure; 8.4 seems
to save a few GB somewhere, and my suspicion is that for some reason
this saving costs some time during import, making for the 2000 vs 1800
minutes.
Bye
Frederik
More information about the dev
mailing list