[OSM-dev] speeding up loading an OSM dump into PostGIS?

Kai Krueger kakrueger at gmail.com
Wed Nov 30 16:01:23 GMT 2011


Jukka Rahkonen-2 wrote
> 
> [...]
> For me it takes many hours with the Finnish dataset and if it fails it
> happens in some "Going over pending ways" phase. I will need to make some
> further tests some day so I can give you better information.
> 
If it fails at the very beginning of "Going over pending ways" (before it
prints out any other information), then that is probably during the SQL
query that loads all of the pending ways into memory. It was mentioned in
another thread that this can be problematic on low-memory systems. One could
possibly recode it to use Postgres cursors so that the full result set does
not have to be loaded into memory, but that would break the new
parallelisation stage, which relies on having the full result set in memory
before forking the helper processes. I'll have to think about whether there
are solutions to this.
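
To illustrate the cursor idea, here is a minimal sketch of reading a result
set in fixed-size batches rather than all at once. It is not osm2pgsql code:
it uses Python's built-in sqlite3 module as a stand-in so that it runs
without a database server, and the table "ways", the column "pending" and
the function name are hypothetical. With PostgreSQL the same pattern would
be done with DECLARE ... CURSOR and FETCH.

```python
import sqlite3


def iter_pending_ways(conn, batch_size=1000):
    """Yield pending way ids in batches instead of building one big list."""
    cur = conn.cursor()
    # Hypothetical schema: one row per way, with a 0/1 "pending" flag.
    cur.execute("SELECT id FROM ways WHERE pending = 1 ORDER BY id")
    while True:
        rows = cur.fetchmany(batch_size)  # at most batch_size rows held at once
        if not rows:
            break
        yield [row[0] for row in rows]


# Small in-memory database so the sketch is self-contained.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE ways (id INTEGER PRIMARY KEY, pending INTEGER)")
conn.executemany(
    "INSERT INTO ways VALUES (?, ?)",
    [(i, i % 2) for i in range(1, 10001)],  # every odd id is marked pending
)

batch_sizes = [len(batch) for batch in iter_pending_ways(conn, batch_size=1000)]
total_pending = sum(batch_sizes)
print(total_pending, len(batch_sizes))
```

The trade-off mentioned above is visible here: a consumer that wants to split
the whole list among helper processes up front cannot do so from a stream
like this without first collecting all of the batches.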


Jukka Rahkonen-2 wrote
> 
>   I know that
> it is Ubuntu 10.04 and PostgreSQL 9.0 with PostGIS is running on the same
> machine.  I believe it is 32-bit and osm2pgsql is half a year old. I have
> not used it many times because of failures.
> 
A half-year-old osm2pgsql probably predates my improvements, which went in
in October. Osm2pgsql used to be incredibly inefficient with memory for its
node cache on extracts, as it was optimised for full planet imports. So it
both used a lot of RAM and, on memory-constrained systems, didn't get a good
cache hit ratio, which makes imports very slow.
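
As a rough illustration of why a layout tuned for the full planet can waste
memory on an extract, the toy model below compares an array indexed directly
by node id with a sorted array searched by bisection. This is only a sketch
under assumptions: it is not osm2pgsql's actual cache code, and the node ids,
coordinates and function names are made up for the example.

```python
from bisect import bisect_left


def dense_slots(node_ids):
    """Slots needed by an array indexed directly by node id."""
    return max(node_ids) + 1


def sparse_lookup(sorted_ids, coords, node_id):
    """Binary search in a sorted id array; returns None on a cache miss."""
    i = bisect_left(sorted_ids, node_id)
    if i < len(sorted_ids) and sorted_ids[i] == node_id:
        return coords[i]
    return None


# Hypothetical extract: 1000 nodes scattered over a large id range.
node_ids = list(range(5, 1_000_000, 1000))
coords = [(60.0 + i * 1e-4, 25.0) for i in range(len(node_ids))]

dense = dense_slots(node_ids)   # one slot per possible id up to the maximum
sparse = len(node_ids)          # one slot per node actually present
hit = sparse_lookup(node_ids, coords, 1005)
miss = sparse_lookup(node_ids, coords, 6)
print(dense, sparse, hit is not None, miss)
```

In this toy case the id-indexed array needs roughly a thousand times more
slots than there are nodes, which is the kind of gap that hurts on a machine
with little RAM.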

If you can, it would therefore be good to try a new version of
osm2pgsql. You might need to set "--cache-strategy" to "optimized" or
"sparse", as I think it defaults to the old inefficient behaviour on 32-bit
builds. 64-bit builds should use the optimized strategy by default.
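
For anyone scripting their imports, a small sketch of how such a command
line might be assembled follows. Only "--cache-strategy" and its two values
come from this message; the "--slim" and "-d" options, the database name
"gis", the input file name and the helper function are assumptions for the
example, so check them against "osm2pgsql --help" for your version. The
sketch only builds and prints the command; it does not run it.

```python
import shlex


def osm2pgsql_command(input_file, database, cache_strategy="sparse"):
    """Build (but do not run) an osm2pgsql argument list."""
    # Only the two strategies named in this thread are accepted here.
    if cache_strategy not in ("optimized", "sparse"):
        raise ValueError("unexpected cache strategy: " + cache_strategy)
    return [
        "osm2pgsql",
        "--slim",                            # assumed: slim-mode import
        "--cache-strategy", cache_strategy,  # flag discussed above
        "-d", database,                      # assumed database name
        input_file,                          # hypothetical extract file
    ]


cmd = osm2pgsql_command("finland.osm.bz2", "gis")
print(shlex.join(cmd))
```

The resulting list can be passed to subprocess.run(cmd, check=True) once the
options have been confirmed for the installed version.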


Jukka Rahkonen-2 wrote
> 
> My Windows laptop is not much faster but it does not fail. Osm2pgsql
> version is the newest that exists for Windows, it is rather old (April 9,
> 2010).
> 

Ah, Windows... Unfortunately the Windows osm2pgsql is indeed ancient. Can
anyone try to compile a recent osm2pgsql for Windows? Or does anyone know
how to compile it? I could set up a Windows VM to try and build it, but
currently I don't even know where to get a C compiler or the
autoconf/automake tools for Windows, so I am not sure how successful I
would be. But it would be very helpful to have a modern osm2pgsql for
Windows, as that would allow more people to play with rendering their own
tiles.


Jukka Rahkonen-2 wrote
> 
>  Your 20 minutes total time feels amazing for me. But my server is
> very basic and I do not wait very much for 27 euros per month. But when
> the data are in it works well as a WFS server.
> 
Yes, I'd like the whole rendering stack to become more lightweight, at least
for small extracts, so that more people can play with rendering their own
tiles, either on their home laptop/desktop or on fairly cheap servers. The
huge resources often required for working with OSM data are, imho, a big
barrier to more people using it and thus need to be addressed. My recent
improvements to osm2pgsql, together with the easy-to-install Ubuntu packages,
have hopefully helped somewhat in this respect, but there is still much more
to do...

Kai

--
View this message in context: http://gis.638310.n2.nabble.com/speeding-up-loading-an-OSM-dump-into-PostGIS-tp7045762p7047301.html
Sent from the Developer Discussion mailing list archive at Nabble.com.


