[OSM-talk] is_in

David Earl david at frankieandshadow.com
Sun May 27 11:46:57 BST 2007


A rather rambling reply from me I'm afraid, but in essence what I think your
message amounts to is to deprecate is_in altogether (and place=county,
place=country), whereas I was just trying to rationalise this existing
feature. If is_in has no value, fine, but if people think it has value, I
think we can improve on it.


> > is_in has the potential to be a very useful source of relationship
> > data.
>
> Let's put it that way: Relationship data would be very useful ;-)
>
> The "isin" you propose is the best that can be done inside the OSM
> database at the moment, but of course it is just a meek imitation of
> what real object relationships could give us.

Agreed.


> What you really want is to create a relationship between the object
> representing (say) a suburb and the object representing (say) a town.

Indeed, or between a village and its county (or whatever).

> What you do with "isin" is creating relationships between names of
> objects. If someone has a typo somewhere (or uses different names -
> you pointed out the language problem), then your whole hierarchy
> breaks. The name of a town is duplicated in each of the isin tags of
> its suburb, and changing the name of the town breaks the
> relationships. Unlikely, but still ugly.

I hate to admit it, but my own experience says it is likely. Just yesterday
I noticed that the English city of Newcastle was wrongly named. Its correct
name is Newcastle Upon Tyne. So I fixed it.

A couple of weeks ago I also changed Brussels to Bruxelles, having added
name:en=Brussels .

This is a pretty fundamental problem, though one that can be detected. But
it is also a problem for the present is_in and all I'm trying to do here is
to rationalise is_in.


> Also, your relationships are not unique because names are not unique.
> A suburb carrying "isin=city:Perth" can be in Scotland or in New
> South Wales, and nobody will want to name one city "Perth (Scotland)"
> and the other "Perth (NSW)".


True, but that's easily resolved by proximity if there's more than one hit.
Without boundaries we can't decide which city, region or county something is
in, but we can easily distinguish between two or more competing
alternatives.
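
A minimal sketch of what I mean by resolving by proximity, assuming each
place is a plain record with a position. The record layout, the function
names and the rounded coordinates are made up for illustration; this is not
the name finder's actual code.

```python
import math

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two lat/lon points."""
    r = 6371.0  # mean Earth radius in km
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

def resolve_parent(child, candidates):
    """Among same-named candidates, pick the one nearest to the child."""
    return min(
        candidates,
        key=lambda c: haversine_km(child["lat"], child["lon"], c["lat"], c["lon"]),
    )

# Hypothetical records with approximate coordinates.
candidates = [
    {"id": "perth-scotland", "name": "Perth", "lat": 56.40, "lon": -3.43},
    {"id": "perth-australia", "name": "Perth", "lat": -31.95, "lon": 115.86},
]
suburb = {"name": "Scone", "lat": 56.42, "lon": -3.40}

parent = resolve_parent(suburb, candidates)
```

This only chooses between competing hits; it says nothing about whether the
suburb really lies inside the chosen city.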


> What's more, the larger your "parent" objects get the less likely it
> is to see them in OSM because we do not have the option of creating
> location-less objects - and where to put the node for "nation=England"?

No one has added England, but I can reliably inform people from the name
finder that...

country Germany in Europe found about 15km north-east of town Eisenach
(which is about 30km south-east of town Eschwege and about 65km south-west
of city Nordhausen)
http://www.openstreetmap.org/index.html?lat=51.100000&lon=10.400000&zoom=5


:-)


> Honestly I don't really see what is gained by having this data *in*
> OSM at all.


OK, so the alternative is to abolish is_in, and also place=county,
place=country etc. Fine, but that doesn't seem to be what people have wanted
to do.

I was trying to suggest an improvement to what we have that really doesn't
work terribly well before it gets too entrenched, but if people don't want
it at all, let's withdraw it.

But actually I do think it does have some value if georeferenced. For
example, it is hard to get administrative boundaries onto the map for
copyright reasons, but if I can find sufficiently many places in a county I
can do some interesting things to deduce approximate county boundaries at
sufficiently small scale maps (obviously not accurately at large scale).
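
Purely as an illustration of one such deduction (an assumption about how it
might be done, not something that exists): assign any point to the county of
the nearest place whose county is known. That approximates each county by the
cells around its known places, which is only plausible at small scale. The
sample coordinates are approximate and the function names are hypothetical.

```python
import math

def approx_distance_km(lat1, lon1, lat2, lon2):
    """Equirectangular approximation, adequate over county-sized distances."""
    x = math.radians(lon2 - lon1) * math.cos(math.radians((lat1 + lat2) / 2))
    y = math.radians(lat2 - lat1)
    return 6371.0 * math.hypot(x, y)

def guess_county(lat, lon, places):
    """Return the county of the known place nearest to (lat, lon)."""
    nearest = min(places, key=lambda p: approx_distance_km(lat, lon, p[0], p[1]))
    return nearest[2]

# Hypothetical sample: (lat, lon, county) for places with a known county.
places = [
    (52.205, 0.119, "Cambridgeshire"),  # Cambridge
    (52.183, 0.221, "Cambridgeshire"),  # Fulbourn
    (52.247, 0.718, "Suffolk"),         # Bury St Edmunds
    (52.628, 1.299, "Norfolk"),         # Norwich
]

county = guess_county(52.19, 0.18, places)
```

The more places there are in each county, the closer the implied boundary
gets to the real one, but it will never be accurate at large scale.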

One could also argue that anything that is an area should not be represented
by a point. So why have place=village? That doesn't have a specific location
any more than a county. Nor does a street. Actually nothing does. Everything
is actually a set of points defined by a boundary, and we are choosing to
approximate. What I think you are saying is that the approximation breaks
down beyond a certain scale. I'm arguing that, though you're right in some
circumstances, there are applications for which it has some value.

> In other situations I have said that I want data in OSM
> because it is then editable; but in this case, it is very difficult
> to extract the actual structure tree (you cannot click a town in your
> editor and see the suburb nodes highlighted... well... it *could* be
> done but would only ever work for smaller areas)

The structure I have behind the name finder could easily accommodate this.
The OSM database can't, but the reason the name finder works is because I am
post-processing the planet file to derive structure.


> , and modifying the
> whole structure tree is even more complex. You can modify the "isin"
> of a suburb here and there, but to systematically find anything else
> in order to change it would be a challenge suitable only for planet
> file hackers.

... or in an application derived from it. It's hard to do relationship-type
searches within OSM, and we've had this issue crop up in other contexts,
like how to relate the name or number of a junction to its corresponding
highway elements.

> Also, such "structure trees" are available in good quality from free
> sources already (e.g. geonames - who have been criticised on the
> grounds of using Google to derive co-ordinates but we wouldn't have
> to use co-ordinates for what we want to do).

Maybe I should start to bring in data into the name finder from other
sources too. I'd like to know more about "in" as opposed to "near",
especially for larger irregular areas like counties.

Does geonames have these relationships? I can't see evidence in the web
searches you can do that it has a complete hierarchy (suburb, town,
district, county, nation, country) or similar (it just tells me "Fulbourn,
UK" and that it is a "populated place"). But I haven't looked at the files
you can export.

> Is it wise to re-invent the wheel here?


Err ... that rather applies to OSM as a whole doesn't it? The motivation
being financial.


> I think I'd opt for having this kind of data in a text file, or web
> service or so. Can't see how having it in the planet file helps.



> I am no expert on this but does each place have exactly one
> undisputed local language, or are there "OSM edit wars" looming?

Places like Wales (or should I say Cymru) and Catalonia are problems
certainly. I would think some consensus is possible though. Maybe we can
even cope with variation (even without language variation some people have
put UK and others "United Kingdom" in their is_in's already): if we have
  place=nation; name=Cymru; name:en=Wales
and someone puts
  place=county; name=Gwynedd; isin:nation=Wales
or in current form
  place=county; name=Gwynedd; is_in="Wales, UK"
it's not beyond the bounds of possibility to match on either name. (I
already do that for name finder searches).
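
A minimal sketch of matching on either name, assuming tags are held as a
plain dictionary and that the current is_in form is a comma-separated list.
The function names are hypothetical and this is not the name finder's actual
code.

```python
def place_names(tags):
    """All names a place is known by: name plus any name:xx variants."""
    return {
        v.strip().lower()
        for k, v in tags.items()
        if k == "name" or k.startswith("name:")
    }

def is_in_values(tags):
    """Parent names from is_in (comma-separated) and isin:<kind> tags."""
    values = []
    for k, v in tags.items():
        if k == "is_in":
            values.extend(v.split(","))
        elif k.startswith("isin:"):
            values.append(v)
    return {x.strip().lower() for x in values if x.strip()}

def is_child_of(child_tags, parent_tags):
    """True if any parent name the child mentions matches any name of the parent."""
    return bool(is_in_values(child_tags) & place_names(parent_tags))

wales = {"place": "nation", "name": "Cymru", "name:en": "Wales"}
gwynedd_new = {"place": "county", "name": "Gwynedd", "isin:nation": "Wales"}
gwynedd_old = {"place": "county", "name": "Gwynedd", "is_in": "Wales, UK"}
```

Both forms of Gwynedd match the Wales node here, whichever of Cymru or Wales
is used. Coping with "UK" versus "United Kingdom" would need an alias
list or similar on top of this.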

David




