[OSM-dev] Osm2pgsql and failed planet import

Tue Aug 30 17:44:38 BST 2011

On 08/30/2011 11:50 AM, John Smith wrote:

> osm2pgsql doesn't have any code to check for memory allocation
> failures and to deal with it in a sane way, it just assumes all
> allocations are fine until it checks the nodes when going over pending
> ways etc. Anthony posted a patch a couple of months back, I didn't
> hear if the patch was added to the svn version(s).

I'm not aware of any other patch, but I changed the cache allocation
code quite recently to allocate the full configured cache size up
front instead of doing so in blocks of 8KB.

The main reason for this was that, although all cache blocks get
freed once the cache is no longer needed, all this memory is very
unlikely to get returned to the OS because of heap fragmentation.

Allocating everything up front has the advantage that malloc()
(glibc's, at least) will use mmap() to map a request of this size
into the process address space instead of carving it out of the heap,
so it can later be free()d and returned to the OS without being
affected by potential heap fragmentation.
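
A minimal sketch of the single-allocation idea in C. The struct and
function names are hypothetical and not taken from the osm2pgsql
source; the point is only that the cache is one malloc() call and one
free() call. With glibc, requests above M_MMAP_THRESHOLD (128 KiB by
default) are served by mmap(), which is what lets free() hand the
region back to the OS.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical cache handle; the names are illustrative only. */
struct node_cache {
    void  *mem;    /* one contiguous region holding the whole cache */
    size_t bytes;  /* size of that region in bytes */
};

/* Allocate the whole cache in a single call instead of many 8KB blocks.
 * Returns 0 on success, -1 if the size is zero or malloc() fails. */
static int node_cache_init(struct node_cache *c, size_t bytes)
{
    assert(c != NULL);
    c->mem = NULL;
    c->bytes = 0;
    if (bytes == 0)
        return -1;
    c->mem = malloc(bytes);
    if (c->mem == NULL)
        return -1;
    c->bytes = bytes;
    return 0;
}

/* Release the region in one free(); for an mmap()-backed block this
 * unmaps it rather than leaving it on a fragmented heap. */
static void node_cache_free(struct node_cache *c)
{
    assert(c != NULL);
    free(c->mem);
    c->mem = NULL;
    c->bytes = 0;
}
```

A caller would run node_cache_init() once at startup and
node_cache_free() right before the index building step.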

As a side effect, this also makes it possible to check up front that
there is sufficient memory for the configured cache size, instead of
running into out-of-memory situations only after the import has
already been running for potentially quite a while.
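
The up-front check could look roughly like the following C sketch.
reserve_cache() and its message texts are assumptions made for
illustration, not the actual osm2pgsql code: it tries to obtain the
whole configured cache (given in MB) before any input is read and
reports a reason if that is not possible.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

#define MIB ((size_t)1024 * 1024)

/* Hypothetical startup check: reserve the full cache before the import
 * starts. On failure, write a human-readable reason into msg and return
 * NULL; on success, msg is left untouched. */
static void *reserve_cache(size_t cache_mb, char *msg, size_t msg_len)
{
    void *mem;

    assert(msg != NULL && msg_len > 0);

    /* Reject sizes that are zero or would overflow size_t in bytes. */
    if (cache_mb == 0 || cache_mb > SIZE_MAX / MIB) {
        snprintf(msg, msg_len, "invalid cache size: %zu MB", cache_mb);
        return NULL;
    }

    mem = malloc(cache_mb * MIB);
    if (mem == NULL) {
        snprintf(msg, msg_len,
                 "cannot allocate %zu MB for the node cache; "
                 "reduce the configured cache size", cache_mb);
    }
    return mem;
}
```

Note that on Linux with memory overcommit enabled, a successful
malloc() does not strictly guarantee that every page can be backed
later, so a check like this catches the obvious cases rather than
all of them.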

As the memory is now really freed before the index building step,
it is possible to configure larger work_mem and maintenance_work_mem
buffers on the PostgreSQL side without having to wait for the
unused-but-still-allocated osm2pgsql process memory to be gradually
pushed out to swap.

Current code can be found here:

  https://github.com/hholzgra/osm2pgsql/tree/freeable_cache

-- 
hartmut