[OSM-talk] is_in

Frederik Ramm frederik at remote.org
Sun May 27 16:37:18 BST 2007


Hi,

> A rather rambling reply from me I'm afraid

Well I was the one who started it ;-)

> but in essence what I think your
> message amounts to is to deprecate is_in altogether (and place=county,
> place=country)

I just wanted to say this before more work is invested here. Your
suggestion would mean that all existing is_in tags have to be changed
step by step, so it is probably quite a lot of work, and anyone doing
it properly is likely to try to add is_in tags to everything that
doesn't have one currently, so it will be a LOT of work. One should
think about whether the concept is really good, or whether the same
work would be better invested in doing it a wholly different way.

>> What you do with "isin" is creating relationships between names of
>> objects. If someone has a typo somewhere (or uses different names -
>> you pointed out the language problem), then your whole hierarchy
>> breaks. The name of a town is duplicated in each of the isin tags of
>> its suburb, and changing the name of the town breaks the
>> relationships. Unlikely, but still ugly.
>
> I hate to admit it, but my own experience says it is likely. Just
> yesterday I noticed that the English city of Newcastle was wrongly
> named. Its correct name is Newcastle Upon Tyne. So I fixed it.

Hm. Do you think that the gazetteer you created from the planet file
could be used in reverse, i.e. write a script against your gazetteer
database that can be used to change the name of a place, and that
generates a .osm file for use in JOSM containing updates for all
objects having an is_in tag with the old name? A tool like that
would be required...
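
To make the idea concrete, here is a minimal sketch of such a rename tool. It does not talk to the real gazetteer (whose schema I don't know); the list of node dicts, the function names and the api_version default are all assumptions for illustration. The action="modify" attribute is the way current JOSM marks changed objects in a .osm file, so check that against the JOSM version you use.

```python
import xml.etree.ElementTree as ET


def rename_in_is_in(value, old_name, new_name):
    """Replace old_name with new_name in a comma-separated is_in value.

    Only whole entries are replaced, so "Newcastle-under-Lyme" is left
    alone when renaming "Newcastle".
    """
    parts = [p.strip() for p in value.split(",")]
    return ",".join(new_name if p == old_name else p for p in parts)


def build_osm_update(nodes, old_name, new_name, api_version="0.6"):
    """Return OSM XML containing only the nodes whose is_in tag changed.

    nodes is a hypothetical stand-in for gazetteer rows: dicts with the
    keys "id", "lat", "lon" and "tags".
    """
    root = ET.Element("osm", version=api_version, generator="is_in-rename-sketch")
    for node in nodes:
        tags = node["tags"]
        old_value = tags.get("is_in")
        if old_value is None:
            continue
        new_value = rename_in_is_in(old_value, old_name, new_name)
        if new_value == old_value:
            continue  # this node does not reference the renamed place
        elem = ET.SubElement(
            root, "node",
            id=str(node["id"]), lat=str(node["lat"]), lon=str(node["lon"]),
            action="modify",
        )
        for key, val in tags.items():
            ET.SubElement(elem, "tag", k=key, v=new_value if key == "is_in" else val)
    return ET.tostring(root, encoding="unicode")
```

Matching whole entries only avoids touching places that merely start with the old name, but it does nothing about two different places that share exactly the same name.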

> This is a pretty fundamental problem, though one that can be
> detected. But it is also a problem for the present is_in and all I'm
> trying to do here is to rationalise is_in.

Sure, but I always treated the present is_in as something that doesn't
work anyway so I did not bother; but if you try to turn the present
is_in into something workable, you will have to deal with it ;-)
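
For what it's worth, detecting the breakage could look roughly like the sketch below: collect all known place names and report every is_in entry that matches none of them. The input layout (dicts with "name" and an optional comma-separated "is_in") and the function name are assumptions made for the example, not the format of any existing tool.

```python
def find_dangling_is_in(places):
    """Return (place, missing_parent) pairs for is_in entries that match
    no known place name.

    places is a hypothetical list of dicts with a "name" key and an
    optional comma-separated "is_in" key.
    """
    known = {p["name"] for p in places}
    problems = []
    for p in places:
        for parent in p.get("is_in", "").split(","):
            parent = parent.strip()
            if parent and parent not in known:
                problems.append((p["name"], parent))
    return problems
```

This only finds references that point nowhere. It cannot tell which of several identically named places was meant.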

> No one has added England, but I can reliably inform people from the
> name finder that...
>
> country Germany in Europe found about 15km north-east of town Eisenach
> (which is about 30km south-east of town Eschwege and about 65km
> south-west of city Nordhausen)
> http://www.openstreetmap.org/index.html?lat=51.100000&lon=10.400000&zoom=5

Congratulations! Knowing where Germany is puts you ahead of the  
average US high school student ;-)

Could you perhaps change your name finder to say "slightly to the  
right of France"?

>> Honestly I don't really see what is gained by having this data *in*
>> OSM at all.
>
> OK, so the alternative is to abolish is_in, and also place=county,
> place=country etc. Fine, but that doesn't seem to be what people have
> wanted to do.

Who am I to know what people want to do... many mappers seem to look
at the tools they're given and use them as well as they can...

> I was trying to suggest an improvement to what we have that really
> doesn't work terribly well before it gets too entrenched, but if
> people don't want it at all, let's withdraw it.

I had the impression that is_in was far from being entrenched but  
rather sidelined, and that your work would run the risk of making it  
more important again.

> But actually I do think it does have some value if georeferenced. For
> example, it is hard to get administrative boundaries onto the map for
> copyright reasons, but if I can find sufficiently many places in a
> county I can do some interesting things to deduce approximate county
> boundaries at sufficiently small scale maps (obviously not accurately
> at large scale).

Good point.
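
One crude way to do that, as a sketch only: take all places whose is_in mentions the county and compute the convex hull of their positions. I'm using Andrew's monotone chain algorithm here as my own choice, and treating lon/lat as flat x/y, which is an assumption that only holds for small areas. The data layout and function names are made up for the example.

```python
def cross(o, a, b):
    """Z component of (a - o) x (b - o); positive for a left turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points):
    """Andrew's monotone chain; returns hull vertices counter-clockwise."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower = []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    upper = []
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    # The last point of each chain is the first point of the other.
    return lower[:-1] + upper[:-1]


def approximate_boundary(places, area_name):
    """Hull of all places whose is_in lists area_name.

    places is a hypothetical list of dicts with "lon", "lat" and an
    optional comma-separated "is_in".
    """
    pts = [
        (p["lon"], p["lat"])
        for p in places
        if area_name in [s.strip() for s in p.get("is_in", "").split(",")]
    ]
    return convex_hull(pts)
```

A hull will cut off concave parts of a county and can overlap its neighbours, so this is only usable at small scale, as you say.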

>> Also, such "structure trees" are available in good quality from free
>> sources already (e.g. geonames - who have been criticised on the
>> grounds of using Google to derive co-ordinates but we wouldn't have
>> to use co-ordinates for what we want to do).
>
> Maybe I should start to bring in data into the name finder from other
> sources too. I'd like to know more about "in" as opposed to "near",
> especially for larger irregular areas like counties.
>
> Does geonames have these relationships? I can't see evidence in the
> web searches you can do that it has a complete hierarchy (suburb,
> town, district, county, nation, country) or similar (it just tells me
> "Fulbourn, UK" and that it is a "populated place"). But I haven't
> looked at the files you can export.

I may have been overly optimistic here. They do show Karlsruhe with  
all its suburbs but the connection seems to be by proximity only.


Hm. Maybe a good solution would be not to put the entire
relationship tree into is_in tags in OSM, but still to use is_in as
input for that tree (so that people entering a new suburb would not
have to go to an extra input form for a separate database and put it
in there too).

If there was a master geoname relationship tree web service somewhere
that would automatically parse the planet file to find new
information it didn't yet know about (but the "master" would still be
the database in the web service - not everything the web service
knows would be in OSM, e.g. countries or maybe even imported
third-party data), then we could create a cool edit plugin for JOSM
that would use that external service to help you structure your is_in
tags.
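
Roughly what I have in mind, as a sketch: the master tree is just a child -> parent mapping, OSM is_in tags only add relationships the master does not know yet, and the editor plugin asks the master for the full chain. No such service exists; the function names, the data layout and the rule that the first is_in entry is the immediate parent are all assumptions for the example.

```python
def merge_is_in(master, osm_places):
    """Add relationships found in OSM that the master does not know yet.

    master: dict mapping a place name to its parent's name. It stays
    authoritative, so existing entries are never overwritten.
    osm_places: hypothetical list of dicts with "name" and an optional
    comma-separated "is_in".
    Returns the relationships that were added.
    """
    added = {}
    for p in osm_places:
        name = p["name"]
        parents = [s.strip() for s in p.get("is_in", "").split(",") if s.strip()]
        if not parents or name in master:
            continue
        # Assumption: the first is_in entry is the immediate parent.
        master[name] = parents[0]
        added[name] = parents[0]
    return added


def suggest_is_in(master, name):
    """Walk up the master tree and return the ancestors as an is_in value."""
    chain = []
    seen = {name}
    current = master.get(name)
    while current is not None and current not in seen:  # guard against cycles
        chain.append(current)
        seen.add(current)
        current = master.get(current)
    return ",".join(chain)
```

With that, a mapper who enters a suburb with just its town in is_in could be offered the complete chain, including levels such as the country that are not in OSM at all.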

(Someone else said that one should use object ids in is_in, which is
theoretically good, but as long as the API does not support
relationships and thus we can never be sure whether a referenced
object still exists, this doesn't work. "is_in=Newcastle" may have
ambiguity problems and break when Newcastle changes its name, but at
least it retains a little value even if the Newcastle node is
accidentally deleted, while "is_in=123456789" is worthless if that
happens.)

Bye
Frederik

-- 
Frederik Ramm  ##  eMail frederik at remote.org  ##  N49°00.09' E008°23.33'





