[OSM-talk] administrative boundaries and is_in

Thu Jan 10 08:51:53 GMT 2008

At 09:44 PM 1/9/2008, Robin Paulson wrote:
>can someone explain a few things about the way boundaries work, and
>their relation to the is_in key?
>
>as far as i can tell, when a location (say the suburb of balham, in
>london) is added to the map, the is_in tag needs to be set, multiple
>times. in this case, it would be set as follows:
>
>is_in:Westminster (...i think)
>is_in:greater london
>is_in:england
>is_in:united_kingdom
>is_in:British_Isles
>is_in:Great_Britain
>is_in:Europe
>...etc.
>
>which seems counter-intuitive, not to mention requiring huge amounts
>of work. do we set this for every item - roads, churches,
>supermarkets,....thousands of other items?
>is there anything underway to enable OSM to calculate where an object
>is, based upon knowledge of administrative boundaries - after all,
>they are only a polygon-shaped bounding box?

Yes, sort of, but the other way around: I am working on deriving 
administrative boundaries from "is_in" and "place" tags. *If* it 
works, the answer to your main question would be to use is_in tags 
randomly on low-level items such as roads and churches and let the 
computer work out a boundary around them.  I should be able to report 
back in February.

I've spent a year seeding OSM with "is_in" and "place"  tags as 
described below.  I've also generated some simple bounding boxes for 
countries from the US government GNS place names data and am working 
on the same for their ADM1 level (states and provinces).  What I am 
working on now is matching the two tags together.  In your example 
above, I'd have entered Balham something like this:

name=Balham
place=suburb
is_in=England,Greater London

Then programmatically I'm looking for the closest higher-level place 
tags with the names "England" and "Greater London".  That should 
determine what they are.  Hopefully, the England node will also have 
information saying it is inside the United Kingdom and Europe, so the 
process can be repeated.  So in the best case I end up with all the 
values in your example.  I also have a lat/lon that I know lies inside 
all of them ... if I also have a lat/lon for Moscow and also know 
that it is in Europe, I can begin to build up a model describing the 
size and extent of Europe.
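
A minimal sketch of that lookup, not the actual program: it assumes 
each place is a plain dict with name, place, pos (lat, lon) and an 
optional is_in string.  The PLACE_RANK ordering, the helper names and 
the sample nodes (with rough, made-up coordinates and place values) 
are all hypothetical and only there to make the example run.

```python
import math

# Hypothetical ordering of place values, highest level first.
PLACE_RANK = {"continent": 0, "country": 1, "state": 2, "county": 3,
              "city": 4, "town": 5, "village": 6, "hamlet": 7, "suburb": 8}


def distance_km(a, b):
    """Great-circle distance between two (lat, lon) pairs (haversine)."""
    lat1, lon1, lat2, lon2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))


def closest_parent(node, name, nodes):
    """Closest node called `name` with a higher-level place tag than `node`."""
    candidates = [n for n in nodes
                  if n["name"].lower() == name.lower()
                  and PLACE_RANK[n["place"]] < PLACE_RANK[node["place"]]]
    return min(candidates,
               key=lambda n: distance_km(node["pos"], n["pos"]),
               default=None)


def containers(node, nodes):
    """Follow is_in values upward and collect every enclosing place name."""
    found = []
    queue = [node]
    while queue:
        current = queue.pop()
        for raw in current.get("is_in", "").split(","):
            name = raw.strip()
            if not name or name in found:
                continue
            found.append(name)
            parent = closest_parent(current, name, nodes)
            if parent is not None:
                queue.append(parent)  # repeat the process one level up
    return found


# Made-up sample data; coordinates are rough and for illustration only.
nodes = [
    {"name": "Balham", "place": "suburb", "pos": (51.44, -0.15),
     "is_in": "England,Greater London"},
    {"name": "Greater London", "place": "county", "pos": (51.5, -0.12),
     "is_in": "England"},
    {"name": "England", "place": "state", "pos": (52.5, -1.5),
     "is_in": "United Kingdom,Europe"},
    {"name": "United Kingdom", "place": "country", "pos": (54.0, -2.0),
     "is_in": "Europe"},
    {"name": "Europe", "place": "continent", "pos": (50.0, 10.0)},
]

result = containers(nodes[0], nodes)
```

Note that a nearby village called England would also pass the rank 
test here, which is exactly the kind of false match described next.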

That is the theory.  In practice, there are many issues to contend 
with: what if there is a nearby town called England?  What about 
spelling variations?  How does Greater London relate to Westminster, 
and are they place tagged?  Etc, etc.  I'm reasonably confident 
though.  Random use of namespaced tags like is_in:country=Sweden will 
also help.
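
One possible way to soften the spelling problem and to pick up the 
namespaced tags, sketched with Python's standard difflib module.  The 
function names and the 0.85 cutoff are assumptions for illustration, 
not part of any existing tool.

```python
import difflib


def is_in_values(tags):
    """Collect (level, name) pairs from is_in and namespaced is_in:* tags.

    Plain is_in values carry no level, so their level is None.
    """
    pairs = []
    for key, value in tags.items():
        if key == "is_in":
            pairs += [(None, v.strip()) for v in value.split(",") if v.strip()]
        elif key.startswith("is_in:"):
            pairs.append((key.split(":", 1)[1], value.strip()))
    return pairs


def canonical_name(name, known_names, cutoff=0.85):
    """Map a possibly misspelled name onto a known place name, else None."""
    lowered = {k.lower(): k for k in known_names}
    match = difflib.get_close_matches(name.lower(), list(lowered),
                                      n=1, cutoff=cutoff)
    return lowered[match[0]] if match else None


tags = {"name": "Example", "is_in": "England, Greater London",
        "is_in:country": "Sweden"}
pairs = is_in_values(tags)
fixed = canonical_name("Westminister", ["Westminster", "England"])
```

A fuzzy match like this can also merge two genuinely different places 
with similar names, so the cutoff would need tuning against real data.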

How I do is_in tagging:

countries, states, counties, cities

- always have a place tag and put as much info in the is_in tag as 
possible.  Use is_in:state etc.

towns, villages, hamlets

- always use a place tag and put at least the country and 
state/county into the is_in tag if known for certain.

suburbs

- always use a place tag and put just the city/town/kommun of 
which it is a suburb into the is_in tag

streets and POIs (churches, supermarkets ...)

- RANDOMLY use is_in tags. Add a postal_code tag if possible.  The 
idea being to generate a good spread of points so that the computer 
can draw a polygon around the outermost points and say that is a 
reasonable approximation of the boundary of a town or suburb.
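
The "polygon around the outermost points" idea can be sketched as a 
convex hull per is_in value.  This uses the standard monotone chain 
algorithm as a stand-in, treats lon/lat as flat coordinates, and runs 
on made-up points; the function names are hypothetical.

```python
def cross(o, a, b):
    """Z component of (a - o) x (b - o); positive for a left turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points):
    """Monotone chain convex hull, returned counter-clockwise."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower = []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    upper = []
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    # The last point of each half is the first point of the other half.
    return lower[:-1] + upper[:-1]


def approximate_boundaries(tagged_points):
    """Group (lon, lat) points by each is_in value and hull every group."""
    groups = {}
    for position, is_in in tagged_points:
        for raw in is_in.split(","):
            name = raw.strip()
            if name:
                groups.setdefault(name, []).append(position)
    return {name: convex_hull(pts) for name, pts in groups.items()}


# Made-up points on a small grid, standing in for tagged streets and POIs.
sample = [((0, 0), "Balham"), ((2, 0), "Balham"), ((2, 2), "Balham"),
          ((0, 2), "Balham"), ((1, 1), "Balham")]
boundaries = approximate_boundaries(sample)
```

A convex hull cannot follow concave boundaries, so it will tend to 
overestimate the area; the better the spread of tagged points, the 
closer the result should get.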

Mike
Stockholm