[Taginfo-dev] Finding Tags (and achieving a word translation list) - How?

Jochen Topf jochen at remote.org
Thu Jun 23 09:26:48 BST 2011


On Thu, Jun 23, 2011 at 01:46:34AM +0200, Stefan Keller wrote:
> I'd like to turn taginfo into an even more searchable
> internationalized tool: Use case is that somebody knows a keyword - in
> any language - but has not a clue what the most used (set of)
> tag/value is for that keyword. I already discussed some issues about
> this with Jochen and we made already some experiments based on our
> "TagFinder".
> 
> My current solution path is a two-phase process (after the user typed
> in a keyword):
> 1. The (probably non-English) word is translated by Google Translation,
> 2. current taginfo services are called at least once for searching key
> names and once for looking for value names.
> 
> Then it turned out that Google Translation does not do a good job
> translating e.g. German words to English (and it only offers one
> single result). The problem is that the translated English words
> often are not the ones which exist in the OSM database.

And Google has already announced that it will discontinue its Translation
API in the autumn, so it doesn't make sense to build on it anyway.

> My idea is now to make this process better with a (manually?)
> maintained translation list of key-values which indicate which is the
> preferred target tag-value. And I already have a filtered list of
> about 1000 tag-value-pairs from my PostGIS terminal
> (http://152.96.80.16/ ). I know about
> http://wiki.openstreetmap.org/wiki/Nominatim/Special_Phrases but
> that's covering only a fraction of the tag-value-pairs.
> 
> In addition in the tag-value-pair translation list - e.g. from German
> to English - a distinction should be made between tag-values that
> classify a spatial feature and tag-values which are only "facets"
> (meaning: additional attributes) of a spatial feature.
> 
> How could we achieve a maintained version of such a list?

There is a lot of existing information around. You mentioned the Nominatim
special phrases page; there are also translations of many tags in editor
configurations. There are wiki pages for many keys and tags in several
languages. Then add a general thesaurus for each language you support and some
dictionaries. It should be possible to bring all of that together in some
clever algorithm to find tags that are probably related to the search query.
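
A rough sketch of what such a combination could look like, in Python. The
source names, the weights and the sample entries are all made up for
illustration; nothing here is taken from the real Nominatim, editor or wiki
data. Each source maps a word to candidate tags, and a tag scores higher the
more sources (and the more trusted sources) suggest it.

```python
from collections import defaultdict

# Hypothetical sources: each maps a lowercased word to candidate "key=value" tags.
# The entries are illustrative only, not copied from any real list.
SOURCES = {
    "special_phrases": {"bäckerei": ["shop=bakery"]},
    "editor_presets": {"bäckerei": ["shop=bakery"], "bäcker": ["shop=bakery"]},
    "wiki_pages": {"bäckerei": ["shop=bakery", "craft=bakery"]},
}

# Assumed weights expressing how much each source is trusted.
WEIGHTS = {"special_phrases": 3.0, "editor_presets": 2.0, "wiki_pages": 1.0}


def find_tags(query, sources=SOURCES, weights=WEIGHTS):
    """Return (tag, score) pairs for a query word, best score first."""
    word = query.strip().lower()
    scores = defaultdict(float)
    for name, mapping in sources.items():
        for tag in mapping.get(word, []):
            # A tag suggested by several sources accumulates their weights.
            scores[tag] += weights.get(name, 1.0)
    # Sort by descending score, then alphabetically for a stable order.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))
```

A thesaurus or dictionary would fit in as a step before the lookup, expanding
the query word into several words that are then scored the same way.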

There is no need for a manually maintained list specifically for Taginfo. There
are many lists out there already maintained by people who have an interest in
maintaining them. A list specially for Taginfo will probably be maintained by
nobody, so it's not a solution. This is the old "metadata problem": Nobody likes
to maintain metadata.

If you think this is not enough you could add some kind of "Keyword" macro to
the wiki. People can add that to pages about keys and tags to specifically mark
certain words as "belonging to the concept of the given key/tag". Then add this
as one input to the algorithm. Again, this will probably not be maintained
properly, but at least it gives people an easy option of adding information
to the system if they want to.
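
To make the idea concrete, here is a small Python sketch that pulls such
keywords out of wiki text. The template syntax {{Keyword|<lang>|<word>}} is
purely an assumption for this example; no such macro exists yet, and the real
one could look quite different.

```python
import re

# Hypothetical template syntax: {{Keyword|<lang>|<word>}}
# Whitespace around the separators is tolerated.
KEYWORD_RE = re.compile(
    r"\{\{\s*Keyword\s*\|\s*([^|}]+?)\s*\|\s*([^|}]+?)\s*\}\}"
)


def extract_keywords(wikitext):
    """Return a list of (language, word) pairs found in the wiki text."""
    return [
        (lang.lower(), word.lower())
        for lang, word in KEYWORD_RE.findall(wikitext)
    ]
```

The resulting pairs could then be fed into the search as one more word list
next to the others.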

Jochen
-- 
Jochen Topf  jochen at remote.org  http://www.remote.org/jochen/  +49-721-388298



