[OSM-dev] osm2pgsql and bbox imports

Fri Oct 12 15:27:59 BST 2012

Peter Wendorff writes: 

> osm2pgsql has to store nodes outside the bbox because geometries that 
> overlap the borders etc. should be included in the result, too.
That depends on the cutting algorithm used.
I could live with osm2pgsql doing a hard cut, as I made my bbox large enough 
to have some buffer. If reference completeness is a requirement, it's still 
possible to run the data through a softcut filter first and omit the bbox 
altogether.
Here is a description of two implementations of a cutting algorithm:
https://github.com/MaZderMind/osm-history-splitter 
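
To make the difference concrete, here is a rough Python sketch of the two 
strategies as I understand them. It assumes a toy in-memory model (nodes as 
a dict of id -> (lon, lat), ways as a dict of id -> list of node ids). The 
names in_bbox, hard_cut and soft_cut are made up for illustration and are 
not taken from osm-history-splitter or osm2pgsql.

```python
def in_bbox(lon, lat, bbox):
    """bbox is (min_lon, min_lat, max_lon, max_lat) in degrees."""
    min_lon, min_lat, max_lon, max_lat = bbox
    return min_lon <= lon <= max_lon and min_lat <= lat <= max_lat


def hard_cut(nodes, ways, bbox):
    """Keep only nodes inside the bbox; ways are cropped to those nodes."""
    kept_nodes = {nid: pos for nid, pos in nodes.items()
                  if in_bbox(pos[0], pos[1], bbox)}
    kept_ways = {}
    for wid, refs in ways.items():
        inside = [r for r in refs if r in kept_nodes]
        # Assumption: a way with fewer than two remaining nodes is dropped.
        if len(inside) >= 2:
            kept_ways[wid] = inside
    return kept_nodes, kept_ways


def soft_cut(nodes, ways, bbox):
    """Keep every way touching the bbox, plus all nodes those ways reference."""
    inside_ids = {nid for nid, pos in nodes.items()
                  if in_bbox(pos[0], pos[1], bbox)}
    kept_ways = {wid: list(refs) for wid, refs in ways.items()
                 if any(r in inside_ids for r in refs)}
    # Second pass: pull in the outside nodes that kept ways depend on.
    needed = set(inside_ids)
    for refs in kept_ways.values():
        needed.update(refs)
    kept_nodes = {nid: nodes[nid] for nid in needed if nid in nodes}
    return kept_nodes, kept_ways
```

With the hard cut the node set never grows beyond the bbox, at the cost of 
cropped ways. The soft cut keeps ways reference-complete but stores nodes 
outside the bbox.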

> Yes, preprocessing might be faster therefore, but that might depend on 
> your system setup and where the bottleneck of your pipeline is, as the 
> cutting process faces the same problem here: it runs several times over 
> the input file to find dependent nodes for ways that are partly in the 
> extracted target area.

My problem is that a database which was always around 2GB grew to 40GB 
during the import process, and this is killing my vserver. It simply can't 
cope with the I/O load. I used the Asia extract as my input file, as I did 
before, but now the nodes table contains data 90 degrees away from my 
bounding box. 
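
A check along these lines shows how far outside the bbox the stored nodes 
are. This is only a sketch: it assumes the rows have already been exported 
from the nodes table as (id, lon, lat) tuples in plain degrees, and it makes 
no assumption about how osm2pgsql actually lays out or scales its columns. 
The function names are hypothetical.

```python
def degrees_outside(lon, lat, bbox):
    """Return how many degrees a point lies outside the bbox (0.0 if inside).

    bbox is (min_lon, min_lat, max_lon, max_lat) in degrees.
    """
    min_lon, min_lat, max_lon, max_lat = bbox
    d_lon = max(min_lon - lon, 0.0, lon - max_lon)
    d_lat = max(min_lat - lat, 0.0, lat - max_lat)
    return max(d_lon, d_lat)


def summarize_outliers(rows, bbox):
    """Count nodes outside the bbox and report the largest distance seen."""
    outside = 0
    worst = 0.0
    for _node_id, lon, lat in rows:
        d = degrees_outside(lon, lat, bbox)
        if d > 0:
            outside += 1
            worst = max(worst, d)
    return outside, worst
```

A large count together with a worst distance of tens of degrees would 
suggest that the import stored nodes without applying the bbox to them.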

My setup documentation does not mention that I manually cleaned up the 
nodes table after import. So it could also have been a change in 
osm2pgsql. 

Stephan