[Geocoding] gazetteer-index.sql explained

Brian Quinion openstreetmap at brian.quinion.co.uk
Sat Jan 23 17:01:58 GMT 2010


> For at least some days pgstat shows that the database is working on
> update placex set indexed = true where not indexed and rank_search = 25 and
> name is not null;
> Whenever I ask for
> select count (*) from placex where not indexed and rank_search = 25 and name
> is not null;

This is because the whole update operation is performed as a single
transaction, which means that until it commits no other PostgreSQL
session will see any progress.  You can get a rough feel for the
progress by watching the table size of search_name; however, for full
planet indexing I would recommend using the util.update.php command-line
utility instead:

gazetteer/util.update.php --index

You will need to make sure settings.php is configured to connect to
your database.

This utility runs the updates in much smaller transactions, so progress
is actually visible; it also helps to prevent deadlocks.  There are
also options for multi-processor systems to try to speed things up;
see --help for more information.
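
To illustrate why smaller transactions make progress visible, here is a
minimal sketch in Python.  It is not how util.update.php is implemented;
it uses an in-memory SQLite database as a stand-in for PostgreSQL, a
cut-down placex table with only a hypothetical place_id key and the
indexed flag, and a made-up helper name (index_in_batches).  The point is
only that a commit after each small batch lets the remaining count be
observed between batches.

```python
import sqlite3


def index_in_batches(conn, batch_size):
    """Mark rows as indexed in small transactions.

    Yields the number of rows still waiting after each committed batch.
    """
    while True:
        cur = conn.execute(
            "UPDATE placex SET indexed = 1 WHERE place_id IN ("
            " SELECT place_id FROM placex WHERE indexed = 0 LIMIT ?)",
            (batch_size,),
        )
        # Committing here is what makes the work so far visible to others.
        conn.commit()
        if cur.rowcount == 0:
            break
        remaining = conn.execute(
            "SELECT count(*) FROM placex WHERE indexed = 0"
        ).fetchone()[0]
        yield remaining


# Stand-in table; the real placex has many more columns.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE placex ("
    " place_id INTEGER PRIMARY KEY,"
    " indexed INTEGER NOT NULL DEFAULT 0)"
)
conn.executemany(
    "INSERT INTO placex (place_id) VALUES (?)", [(i,) for i in range(10)]
)
conn.commit()

progress = list(index_in_batches(conn, 4))
print(progress)  # [6, 2, 0]
```

With a single large UPDATE there is only one commit at the very end, so
a count run from another session stays unchanged until then.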

> Is there any documentation of the rank_search indexes and what they stand for
> that I have overlooked?

I'm afraid there is no documentation for the internals yet.  I've
still not had a chance to get round to it and probably won't now, as
I'm already working on a version 2 which has significant changes.
There are quite a few comments in gazetteer-functions which may help,
or feel free to ask questions as needed.

rank_search is an indicator of how 'big' the feature is.  Based around
admin_level * 2.
rank_address is where in the address output the place should be used.
0 = do not use.
In both cases a smaller number is a larger feature, same as admin_level.
--
 Brian



