[Geocoding] Nominatim SQL script indexing speed
Frans Hals
fhals7 at googlemail.com
Tue Jan 4 08:20:40 GMT 2011
My experiences with Amazon and Nominatim had been horrible, didn't matter which instance I took.
Just a big waste of time/money. Switched to a dedicated server for much less the price...
Frans
Am 04.01.2011 um 08:19 schrieb Teemu Ikonen:
> Thanks for help! I'll try that out as soon as current run ends.
>
> Now gazetteer-loaddata.sql progresses steadily 400MB / hour on
> following setup. Table 'place' is 24GB, so it will take at least 60
> hours to finish.
>
> Postgres 8.3.9
> Postgis 1.5.1
> osm2psql revision: 22731
> Fedora Core 8 2.6.21.7-2.fc8xen x86_64
>
> "Hardware"
> Amazon m1.xlarge instance (4 core, 15GB)
> Raid 0 on 4 x ESB disks, 64kB chunk (used only < 100 tps per second,
> indexing not IO bound)
> Single core utilized 100%, rest 3 are idle.
>
> Delta to the default database configuration file
>
> 107,108c107
> < shared_buffers = 2GB
> ---
>> shared_buffers = 32MB
> 115,116c114,115
> < work_mem = 256MB
> < maintenance_work_mem = 256MB
> ---
>> #work_mem = 1MB
>> #maintenance_work_mem = 16MB
> 121c120
> < max_fsm_pages = 1092000
> ---
>> max_fsm_pages = 204800
> 153c152
> < fsync = off
> ---
>> #fsync = on
> 163c162
> < wal_buffers = 16MB
> ---
>> #wal_buffers = 64kB
> 172c171
> < checkpoint_segments = 20
> ---
>> #checkpoint_segments = 3
> 174c173
> < checkpoint_completion_target = 0.9
> ---
>> #checkpoint_completion_target = 0.5
> 205c204
> < random_page_cost = 1.5
> ---
>> #random_page_cost = 4.0
> 209c208
> < effective_cache_size = 8GB
> ---
>> #effective_cache_size = 128MB
> 383c381
> < autovacuum = off
> ---
>> #autovacuum = on
>
> Br Teemu
>
> On Mon, Jan 3, 2011 at 7:42 PM, Brian Quinion
> <openstreetmap at brian.quinion.co.uk> wrote:
>> On 3 January 2011 08:32, Teemu Ikonen <teemu.ikonen at iki.fi> wrote:
>>> Hi,
>>>
>>> I did few experiments with latest Nominatim sql scripts and for some
>>> reason the indexing is very, very slow. The gazetteer-loaddata.sql
>>> progresses maybe 0.5 - 2GB per 24h (as reported by select
>>> pg_size_pretty(pg_relation_size('placex'));
>>>
>>> Few facts
>>> - Indexing seems to be CPU bound, faster or slower IO setup does not
>>> make significant difference
>>> - Same results with Ubuntu 10.40 / Postgres 9.0 / PostGIS 1.5.2 and
>>> Fedora core 8 / Postgres 8.3 / PostGIS 1.5.1
>>> - Version: http://svn.openstreetmap.org/applications/utils/export/osm2pgsql/gazetteer,
>>> Revision: 24892
>>> - "HW" is Amazon EC2 m1.xlarge instance.
>>>
>>> Last time I did this was on osm2pgsql SVN Revision: 22731, the
>>> loaddata step took ~24 hours on Postgres 8.3. Now, loaddata probably
>>> takes 10-40 days if it ever finishes.
>>
>> the most likely suspect is that it now pre-calculates the country code
>> from most features, but as of postgis 1.5.1 the internal caching
>> should take care of that (from a performance point of view)
>>
>> You could try moving the following (first line in add_location) inside the if:
>>
>> keywords := make_keywords(name);
>>
>> you could try commenting out
>>
>> IF place_country_code IS NULL THEN
>> country_code := get_country_code(geometry);
>> END IF;
>>
>> just below it.
>>
>> see if either of these speed it up to at least narrow down where to look.
>>
>> Other than that it is a question of debugging placex_insert to narrow
>> down the cause of the problem.
>>
>> I'm not aware of any particular performance problems introduced recently.
>>
>> --
>> Brian
>>
>
> _______________________________________________
> Geocoding mailing list
> Geocoding at openstreetmap.org
> http://lists.openstreetmap.org/listinfo/geocoding
More information about the Geocoding
mailing list