[OSM-talk] capitols
David Earl
david at frankieandshadow.com
Wed Jun 13 17:54:49 BST 2007
Iván Sánchez Ortega wrote:
> On Wednesday, 13 June 2007, Sebastian Spaeth wrote:
>>> [...] "capital_of=Texas". [...]
>> so what do you do if there happens to be a municipality of texas and a
>> galactic imperium called texas? this is way too ambiguous,
>
> You don't have to go outside the Earth to find such occurrences. For example, I
> know about:
>
> Granada, a province in Spain, Europe
> Granada, a province in Nicaragua, South America
> Granada, a county in Mississippi, USA, North America
> etc etc etc
We had this discussion a while back in the context of is_in:
http://lists.openstreetmap.org/pipermail/talk/2007-May/014125.html (and
the rest of a long thread).
The good reason to use links by id is integrity, but the downside is
complexity - you can't just use the tag value on its own, you have to go
to a database to fetch something useful; more importantly we have to
substantially upgrade all our tools, while the looser textual linkage
means only the tools that want to make use of the linkages need be
changed. The same applies to users: it is easy to type Granada (but
equally easy to get wrong), but a much more fiddly process to find the
relevant node, which may not be in the scope of the data you're working
on at the time.
Simplicity of the data structures was a starting point for OSM, as Steve
explained in the blog he circulated this week.
I think that in nearly all cases you can disambiguate names by
proximity: it will be clear from the lat/lon which Granada is relevant
in the above example. A counterexample is one that Ben Laenen pointed out
in that previous thread: there are Limburg provinces on either side of the
Belgian/Dutch border, which are hard to tell apart by proximity. But then I
think it will be hard to discriminate in any context which involves humans
reading it too, so perhaps the names could be disambiguated in these
very few cases ("Limburg (NL)"), so that a human reader can see which
is meant, and so can data processing.
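As a rough illustration of disambiguation by proximity, here is a minimal
sketch. The function names, the record layout and the approximate
coordinates are all made up for the example; it is not how any existing
OSM tool works, just one way the idea could be tried out.

```python
import math

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two lat/lon points."""
    r = 6371.0  # mean Earth radius in km
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dp = p2 - p1
    dl = math.radians(lon2 - lon1)
    a = math.sin(dp / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

# Hypothetical candidate places sharing one name; coordinates are rough
# illustrative values, not taken from the OSM database.
PLACES = [
    {"name": "Granada", "where": "Spain", "lat": 37.18, "lon": -3.60},
    {"name": "Granada", "where": "Nicaragua", "lat": 11.93, "lon": -85.96},
    {"name": "Granada", "where": "Mississippi", "lat": 33.77, "lon": -89.81},
]

def resolve(name, lat, lon, places=PLACES):
    """Return the same-named place nearest to (lat, lon), or None if no match."""
    candidates = [p for p in places if p["name"] == name]
    if not candidates:
        return None
    return min(candidates, key=lambda p: haversine_km(lat, lon, p["lat"], p["lon"]))

# A node tagged with the plain text "Granada", located in southern Spain.
print(resolve("Granada", 37.0, -3.5)["where"])
```

A point close to the Belgian/Dutch border could be almost equally near
both Limburgs, so picking the nearest candidate is unreliable there. With
distinct strings such as "Limburg (NL)", the name filter alone already
leaves a single candidate.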
The language variants problem needn't be too big a deal either: so long
as Brussels has 'name=Bruxelles; name:en=Brussels' then the thing that
is trying to navigate the hierarchy will be able to find it in either
form, so the creator doesn't have to worry about that, merely about
spelling it correctly.
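The lookup across language variants could be sketched like this. It
assumes objects are held as plain dictionaries with a "tags" mapping,
which is a simplification chosen for the example, and the helper names
are hypothetical.

```python
def matches_name(tags, wanted):
    """True if `wanted` equals the name tag or any name:<lang> variant."""
    return any(
        value == wanted
        for key, value in tags.items()
        if key == "name" or key.startswith("name:")
    )

def find_by_any_name(objects, wanted):
    """Return every object whose name, in any language variant, is `wanted`."""
    return [obj for obj in objects if matches_name(obj["tags"], wanted)]

# Hypothetical sample data following the tagging mentioned above.
OBJECTS = [
    {"id": 1, "tags": {"name": "Bruxelles", "name:en": "Brussels"}},
    {"id": 2, "tags": {"name": "Granada"}},
]

# Either spelling finds the same object.
print(find_by_any_name(OBJECTS, "Brussels"))
print(find_by_any_name(OBJECTS, "Bruxelles"))
```

The matching is exact, so a misspelt value still finds nothing, which is
the one thing the creator does have to get right.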
In summary, the text form can be done now, is easy to create, and can be
recalled from memory rather than looked up in a database, so it is likely
to encourage people to participate in the appropriate tagging. If creating
the links is a harder process and has to wait for lots of software to be
written, people won't participate.
Finally, consider how both our wiki and Wikipedia cross reference their
pages - for essentially the same reasons.
David