[OSM-dev] osm2pgsql update
Jukka Rahkonen
jukka.rahkonen at latuviitta.fi
Thu Nov 24 10:17:06 GMT 2011
Hi,
I have a feeling that I can't test this --slim --drop on Windows. It is
a much-desired feature for me anyhow.
-Jukka Rahkonen-
Frederik Ramm wrote:
> Hi,
>
> Kai has made a number of interesting improvements to osm2pgsql in
> the last weeks. I believe some bits are still work in progress but on
> the whole osm2pgsql has become a lot more efficient - it makes better
> use of cache memory and can even use multiple processes for some tasks.
> Anyone who regularly spends time waiting for osm2pgsql to complete is
> encouraged to check out a recent version from svn and see whether that
> improves things for them.
>
> I think it would be great to share results of osm2pgsql runs among users
> - how long does it take to import X on infrastructure Y?
>
> I've made a start here, please add/modify as you see fit:
>
> http://wiki.openstreetmap.org/wiki/Osm2pgsql/Benchmarks
>
> There's one particular use case that osm2pgsql did not cover so well in
> the past - the "I don't want to apply updates but I need to use slim
> mode nonetheless because I don't have enough memory for non-slim" use
> case.
>
> osm2pgsql is not very well suited for this because it puts all its
> temporary information into the database instead of a more efficient
> random-access structure. This is something I'll leave for someone else
> to fix, but I did one thing to make this use case a bit better; I
> introduced a "--drop" flag that makes osm2pgsql drop the temporary
> tables after import, and also does not create the indexes on way id and
> relation id that a --slim import normally creates. So now, after
> importing a data set with --drop and --slim, you should have a database
> that looks almost the same as one imported without --slim. By dropping
> the unnecessary tables and indexes, the database usually is only 25% of
> the size of a complete --slim import (but of course it is unsuitable for
> updates).
>
> There's one strange thing I noticed. When I dropped the creation of
> indexes (more precisely, primary keys) on way id and polygon id,
> suddenly osm2pgsql took ages to run - even though these indexes are
> clearly not created in non-slim mode and therefore should not be required.
>
> I found out that the culprit is in the multipolygon code, where after
> finding out that a one-way outer ring is tagged the same as the
> multipolygon relation itself, a "delete_way_from_output" is issued,
> presumably to remove that already-generated ring. This leads to a
> "DELETE from <table> where osm_id=<id>" which requires a table scan
> because the primary keys are missing.
>
> I have now disabled this for --slim --drop mode (the change will not
> affect normal --slim mode), but have to investigate further - this will
> likely create some extra areas for outer rings, but since it doesn't
> have these indexes, non-slim mode should exhibit the same behaviour.
>
> Is anyone aware of multipolygon handling not working right when not
> using --slim? We might have to (re)introduce the primary key for osm_id
> at least on the polygon table to allow this deletion of duplicate areas.
>
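
To illustrate the table-scan point in the quoted message, here is a minimal
sketch of how a "DELETE ... where osm_id=<id>" is planned with and without a
primary key. It is not osm2pgsql code, and it uses Python's built-in SQLite
as a stand-in for PostgreSQL, so it only shows the general effect. The table
name "polygons" and the function "delete_plan" are made up for the example.

```python
import sqlite3


def delete_plan(with_primary_key: bool) -> str:
    """Return the query plan SQLite reports for a DELETE by osm_id.

    Assumption: SQLite stands in for PostgreSQL here, and "polygons"
    is a hypothetical table, not an actual osm2pgsql output table.
    """
    conn = sqlite3.connect(":memory:")
    pk = " PRIMARY KEY" if with_primary_key else ""
    conn.execute(f"CREATE TABLE polygons (osm_id INTEGER{pk}, tags TEXT)")
    conn.executemany(
        "INSERT INTO polygons VALUES (?, ?)",
        [(i, "landuse=forest") for i in range(1000)],
    )
    # Ask the planner how it would locate the row to delete.
    rows = conn.execute(
        "EXPLAIN QUERY PLAN DELETE FROM polygons WHERE osm_id = ?", (42,)
    ).fetchall()
    conn.close()
    # The last column of each plan row holds the human-readable detail.
    return " | ".join(str(row[-1]) for row in rows)


if __name__ == "__main__":
    print("without primary key:", delete_plan(False))
    print("with primary key:   ", delete_plan(True))
```

Without the key the plan is a scan of the whole table; with it the plan is a
keyed search. The exact wording of the plan text depends on the SQLite
version. One such delete per matching multipolygon, each scanning a large
table, would be consistent with the slowdown described above.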