[OSM-dev] What country is something in?
Roland Olbricht
roland.olbricht at gmx.de
Tue Nov 11 10:57:17 GMT 2008
> Hi recently I've uploaded all the borders for Italy as written here:
> http://wiki.openstreetmap.org/index.php/Italian_Borders
Thank you, that's great. I've just checked the data and did not find even a
single bug in it.
> now I would like to do the opposite, so being able to obtain a list of
> vertexes that define a border between the various Italian regions
> (20), provinces (104) and municipalities (8100).
>
> I did try to hack your RUNME file a little bit, but I was not
> successful in getting the data I needed.
>
> I would like to get a file, one for each of the above entities, with
> data formatted for osmosis use. So I could make available regional and
> provincial data/planets for later debugging and analysis.
>
> Is there a way with you program to do this?
Not until yesterday. But the presence of the data and the request are a good
motivation. It took a day to tune the given software to something useful for
this task. I've put the newest version on
http://wmaz.math.uni-wuppertal.de/olbricht/osm/osm-boundaries-source.tgz
I have preferred the timeliness of this release over good documentation.
So I hope, for the first try, the following instructions might help:
For the impatient:
1) edit the file "country-patches.not_osm" and add the pivot nodes
2) run "./RUNME_subnational ../osm/non-us.osm.bz2 4 region 36.6 47.2 6.6 18.6"
or, for provinces,
"./RUNME_subnational ../osm/non-us.osm.bz2 6 province 36.6 47.2 6.6 18.6"
3) if everything is satisfactory, please upload the pivot nodes
ad 1)
Please add to the file "country-patches.not_osm" pivot nodes for each region
and each province. There is an example for the "Aosta Valley" in the file.
The node's id "addon-0" is just a dummy value, but some id is necessary; it
can be retired after step 3. The values for lat(itude) and lon(gitude) are the
essential data: each pivot node must lie somewhere inside the region it names.
The node must be tagged as "place" with value "region" or "province", or
whatever you pass as the third parameter to RUNME_subnational, because
report-results uses this tag to recognize the node as relevant. It should also
carry a tag with key "name", because report-results uses its value as the
target filename.
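As an illustration only, here is a small Python sketch that writes such a
pivot node. It assumes that "country-patches.not_osm" uses ordinary OSM XML
node syntax; the "Aosta Valley" example shipped in the file is the
authoritative template. The helper name pivot_node and the coordinates are
made up for this example.

```python
import xml.etree.ElementTree as ET


def pivot_node(node_id, lat, lon, place, name):
    """Return one pivot node as an XML string (OSM XML syntax is assumed)."""
    node = ET.Element("node", id=node_id, lat=str(lat), lon=str(lon))
    # "place" must match the third parameter given to RUNME_subnational.
    ET.SubElement(node, "tag", k="place", v=place)
    # "name" is what report-results uses as the target filename.
    ET.SubElement(node, "tag", k="name", v=name)
    return ET.tostring(node, encoding="unicode")


# Illustrative coordinates only; any point inside the region will do.
print(pivot_node("addon-0", 45.74, 7.32, "region", "Aosta Valley"))
```

Each further region or province would get its own dummy id and its own
coordinates.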
ad 2)
This works similarly to the RUNME script, with some small changes:
- the second parameter controls which ways to use as boundaries: "2" takes
only those ways tagged as "admin_level" "2", "4" takes ways tagged as
"admin_level" "2" or "admin_level" "4", and so on for even numbers
up to 10.
- the third parameter controls which kind of pivot nodes are taken into
account: It specifies the value of the tag with key "place". For nations,
this is usually "country".
- the subsequent parameters specify a bounding box with southern and northern
latitude, then western and eastern longitude. The values above should be a
reasonable approximation for Italy. You may omit them, but you might end up
with far more data than expected, and you might get into trouble if two areas
in different countries have the same name.
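To make the meaning of the parameters concrete, here is a rough Python sketch
of how I read them. It is not taken from the script: the function names are
hypothetical, and rejecting values other than even numbers from 2 to 10 is a
choice of this sketch, not necessarily what RUNME_subnational does.

```python
def selected_admin_levels(max_level):
    """admin_level values used as boundaries for a given second parameter."""
    if max_level % 2 != 0 or not 2 <= max_level <= 10:
        raise ValueError("expected an even level between 2 and 10")
    # "4" selects levels 2 and 4, "6" selects 2, 4 and 6, and so on.
    return list(range(2, max_level + 1, 2))


def in_bbox(lat, lon, south, north, west, east):
    """True if the point lies inside the box, in the order the script takes."""
    return south <= lat <= north and west <= lon <= east


# Bounding box from the example command lines: south, north, west, east.
ITALY_BBOX = (36.6, 47.2, 6.6, 18.6)

print(selected_admin_levels(6))
print(in_bbox(45.74, 7.32, *ITALY_BBOX))
```

So the province run with "6" uses national, regional and provincial
boundaries together, and only data inside the box is considered.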
ad 3)
Please upload the pivot nodes you have written for the regions or provinces to
OSM. It's a good deal of work to write them and they would be useful for
other mappers as well. Also, they get ordinary id-values that way.
Background concerning the data:
The software can determine the components delimited by the boundary ways
automatically, but it can't guess the names of the areas. A first glance at
tagwatch has shown that there are a couple of approaches in the current OSM
database to represent this data:
- A node tagged with "place" something and coordinates somewhere in the
respective area. This data is almost consistently present for nations, so it
formed the basis for this software. It would be useful to extend this
representation to regions and provinces as well, but this has not happened yet.
- A tag whatever:left|right and the name of whatever. This is handy for a
renderer to write this name immediately on the border, but it's not useful
for most other types of software: these entries are often missing, or
broken by typos, wrong entries or name variants. Software using this
data would have to do a lot of guesswork about which variant is right. Thus,
this software does not support this representation.
- The most recent approach stores the tags not in a pivot node but rather in a
relation that contains as members the ways that form the boundary. Since a
database is excellent at performing joins, this
solution provides all the data of the two above at almost the same speed. And
it is the only solution that allows storing the information on exclaves and
enclaves in a natural way. So I personally would prefer to do some automatic
conversion to that representation in the future. But this has been a
controversial point of discussion. Anyway, the data in the nodes could be
easily converted, so no information is lost if you store the data in
pivot nodes now.
Writing the pivot nodes by hand for the regions should be quick. For the
provinces this work already becomes tedious, but it remains a
matter of at most one evening. For the municipalities, this is probably too
much work. So if you already have the municipalities in another data
representation, feel free to ask me or anyone else for an automatic conversion
script.
Tagwatch has also shown that the tag "admin_level" is rarely used with odd
numbers as values. Thus, those ways are ignored here. You can change it in
the script at lines 12-19 if there is an intended usage of these ways.
Cheers,
Roland