[Geocoding] Nominatim SQL script indexing speed
Teemu Ikonen
teemu.ikonen at iki.fi
Tue Jan 4 07:19:24 GMT 2011
Thanks for help! I'll try that out as soon as current run ends.
Now gazetteer-loaddata.sql progresses steadily 400MB / hour on
following setup. Table 'place' is 24GB, so it will take at least 60
hours to finish.
Postgres 8.3.9
Postgis 1.5.1
osm2psql revision: 22731
Fedora Core 8 2.6.21.7-2.fc8xen x86_64
"Hardware"
Amazon m1.xlarge instance (4 core, 15GB)
Raid 0 on 4 x ESB disks, 64kB chunk (used only < 100 tps per second,
indexing not IO bound)
Single core utilized 100%, rest 3 are idle.
Delta to the default database configuration file
107,108c107
< shared_buffers = 2GB
---
> shared_buffers = 32MB
115,116c114,115
< work_mem = 256MB
< maintenance_work_mem = 256MB
---
> #work_mem = 1MB
> #maintenance_work_mem = 16MB
121c120
< max_fsm_pages = 1092000
---
> max_fsm_pages = 204800
153c152
< fsync = off
---
> #fsync = on
163c162
< wal_buffers = 16MB
---
> #wal_buffers = 64kB
172c171
< checkpoint_segments = 20
---
> #checkpoint_segments = 3
174c173
< checkpoint_completion_target = 0.9
---
> #checkpoint_completion_target = 0.5
205c204
< random_page_cost = 1.5
---
> #random_page_cost = 4.0
209c208
< effective_cache_size = 8GB
---
> #effective_cache_size = 128MB
383c381
< autovacuum = off
---
> #autovacuum = on
Br Teemu
On Mon, Jan 3, 2011 at 7:42 PM, Brian Quinion
<openstreetmap at brian.quinion.co.uk> wrote:
> On 3 January 2011 08:32, Teemu Ikonen <teemu.ikonen at iki.fi> wrote:
>> Hi,
>>
>> I did few experiments with latest Nominatim sql scripts and for some
>> reason the indexing is very, very slow. The gazetteer-loaddata.sql
>> progresses maybe 0.5 - 2GB per 24h (as reported by select
>> pg_size_pretty(pg_relation_size('placex'));
>>
>> Few facts
>> - Indexing seems to be CPU bound, faster or slower IO setup does not
>> make significant difference
>> - Same results with Ubuntu 10.40 / Postgres 9.0 / PostGIS 1.5.2 and
>> Fedora core 8 / Postgres 8.3 / PostGIS 1.5.1
>> - Version: http://svn.openstreetmap.org/applications/utils/export/osm2pgsql/gazetteer,
>> Revision: 24892
>> - "HW" is Amazon EC2 m1.xlarge instance.
>>
>> Last time I did this was on osm2pgsql SVN Revision: 22731, the
>> loaddata step took ~24 hours on Postgres 8.3. Now, loaddata probably
>> takes 10-40 days if it ever finishes.
>
> the most likely suspect is that it now pre-calculates the country code
> from most features, but as of postgis 1.5.1 the internal caching
> should take care of that (from a performance point of view)
>
> You could try moving the following (first line in add_location) inside the if:
>
> keywords := make_keywords(name);
>
> you could try commenting out
>
> IF place_country_code IS NULL THEN
> country_code := get_country_code(geometry);
> END IF;
>
> just below it.
>
> see if either of these speed it up to at least narrow down where to look.
>
> Other than that it is a question of debugging placex_insert to narrow
> down the cause of the problem.
>
> I'm not aware of any particular performance problems introduced recently.
>
> --
> Brian
>
More information about the Geocoding
mailing list