[Geocoding] Nominatim SQL script indexing speed

Brian Quinion openstreetmap at brian.quinion.co.uk
Mon Jan 3 17:42:03 GMT 2011


On 3 January 2011 08:32, Teemu Ikonen <teemu.ikonen at iki.fi> wrote:
> Hi,
>
> I did few experiments with latest Nominatim sql scripts and for some
> reason the indexing is very, very slow. The gazetteer-loaddata.sql
> progresses maybe 0.5 - 2GB per 24h (as reported by  select
> pg_size_pretty(pg_relation_size('placex'));
>
> Few facts
> - Indexing seems to be CPU bound, faster or slower IO setup does not
> make significant difference
> - Same results with Ubuntu 10.40 / Postgres 9.0 / PostGIS 1.5.2  and
> Fedora core 8 / Postgres 8.3 / PostGIS 1.5.1
> - Version: http://svn.openstreetmap.org/applications/utils/export/osm2pgsql/gazetteer,
> Revision: 24892
> - "HW" is Amazon EC2 m1.xlarge instance.
>
> Last time I did this was on osm2pgsql SVN Revision: 22731, the
> loaddata step took ~24 hours on Postgres 8.3. Now, loaddata probably
> takes 10-40 days if it ever finishes.

the most likely suspect is that it now pre-calculates the country code
from most features, but as of postgis 1.5.1 the internal caching
should take care of that (from a performance point of view)

You could try moving the following (first line in add_location) inside the if:

    keywords := make_keywords(name);

you could try commenting out

    IF place_country_code IS NULL THEN
      country_code := get_country_code(geometry);
    END IF;

just below it.

see if either of these speed it up to at least narrow down where to look.

Other than that it is a question of debugging placex_insert to narrow
down the cause of the problem.

I'm not aware of any particular performance problems introduced recently.

--
 Brian



More information about the Geocoding mailing list