[OSM-talk] A forest ... what?

Sandor Seres sandors39 at gmail.com
Mon Apr 10 08:04:22 UTC 2017

Three weeks ago I posted some multipolygon-related notes. This mail is, in a
way, an addition to that former mail.

My first note was triggered by some users' worries about poorer maps if they
use data from the osm2pgsql preparation. Dropping "broken multipolygons"
will result in many large empty/white places, with a long repair period.
Strengthening the preparation on this point might be a better option in my
opinion (I know, I was there). However, in the end, how this subject is
handled is entirely up to the authors of the osm2pgsql application. Users
starting from the OSM source data will not be affected, whatever strategy
is selected.

The second note was related to mass/programmatic correction of the source
data. This could have a dangerous/damaging impact on many OSM users.
Fortunately, the replies say that programmatic correction is not a strategy
in the "fixing multipolygons" actions. I mentioned the "self-crossings"
issue, which is not an error for many users (depending on which
interpretations of the notion and which tools one uses). To clear up the
confusion, this note needs some additional words. Assume someone corrected
all polygon self-crossings in the source data. Assume the selected fixing
model is the popular dividing model (the polygon is divided into new
polygons between self-crossings). The "fix" will be correct but the
consequences damaging. Namely, in scaling and rendering, the new small areas
quickly reach ignorable/collapsing size, causing breaks. It is worth noting
here that the self-crossing issue remains a topic in modern vector-based
digital mapping even if all self-crossings are somehow resolved in the
source data. Namely, while scaling and doing edge-smoothing in data
generalisation, self-crossings on thin area sections (like fjords,
peninsulas, rivers and so on) are unavoidable, and dividing produces many
tiny areas. High fragmentation of the source data and freedom of tag
selection (river sections tagged as lakes) make the issue even worse. Just
look at the Amazon river-system rendering from a popular vector map-maker
here: <http://goo.gl/bT1Bu9>

(the screen dump is from yesterday, from a demo system, at roughly 1:6.7
million scale). There are many large, unacceptable breaks. However, from the
same data source, using topology geometry as suggested in my former mail, it
is possible to create a compact minimal coverage for the same river system,
like this: <https://goo.gl/pNQwDm>. Note that the river system here is one
simple area (one outer and many inner borders never touching each other)
from Peru to the Atlantic. To be fair, the last image should be rendered
from a zoom/scale level that corresponds to the 1:6.7 million scale. This is
done here: <https://goo.gl/eaAWNy>. That zoom level contains approximately
250 times fewer nodes than the level used for the previous image. The area
connectivity is still perfectly preserved and the image is much cleaner in
this scale extract. Finally, if a user still insists on fixing the polygon
self-crossings, exchanging and reversing the polylines between two
consecutive self-crossings (possibly just reversing the end loop after a
self-crossing) should be a much better strategy.
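
The difference between the two fixing models can be sketched on the smallest
possible case, a "bow-tie" ring with one self-crossing. This is only an
illustration under assumptions: the function names are hypothetical, the
coordinates are planar, only the first proper crossing is handled, and
"reversing" is read here as reversing the loop between the two passes through
the crossing point.

```python
from typing import List, Optional, Tuple

Point = Tuple[float, float]
Ring = List[Point]  # closed ring; the last point implicitly joins the first


def signed_area(ring: Ring) -> float:
    """Shoelace formula; positive for counter-clockwise rings."""
    total = 0.0
    for i in range(len(ring)):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % len(ring)]
        total += x1 * y2 - x2 * y1
    return total / 2.0


def segment_crossing(p: Point, q: Point, r: Point, s: Point) -> Optional[Point]:
    """Proper (interior) crossing point of segments pq and rs, or None."""
    d1 = (q[0] - p[0], q[1] - p[1])
    d2 = (s[0] - r[0], s[1] - r[1])
    denom = d1[0] * d2[1] - d1[1] * d2[0]
    if denom == 0:
        return None  # parallel or collinear segments are ignored in this sketch
    t = ((r[0] - p[0]) * d2[1] - (r[1] - p[1]) * d2[0]) / denom
    u = ((r[0] - p[0]) * d1[1] - (r[1] - p[1]) * d1[0]) / denom
    if 0 < t < 1 and 0 < u < 1:
        return (p[0] + t * d1[0], p[1] + t * d1[1])
    return None


def first_self_crossing(ring: Ring) -> Optional[Tuple[int, int, Point]]:
    """Return (i, j, x): edges i and j of the ring cross at point x."""
    n = len(ring)
    for i in range(n):
        for j in range(i + 2, n):
            if i == 0 and j == n - 1:
                continue  # first and last edge share a vertex
            x = segment_crossing(ring[i], ring[(i + 1) % n],
                                 ring[j], ring[(j + 1) % n])
            if x is not None:
                return i, j, x
    return None


def divide(ring: Ring, i: int, j: int, x: Point) -> Tuple[Ring, Ring]:
    """Dividing model: cut the ring into two separate rings at x."""
    return [x] + ring[i + 1:j + 1], ring[:i + 1] + [x] + ring[j + 1:]


def reverse_loop(ring: Ring, i: int, j: int, x: Point) -> Ring:
    """Reversing model: keep one ring, reverse the loop between the passes through x."""
    return ring[:i + 1] + [x] + ring[i + 1:j + 1][::-1] + [x] + ring[j + 1:]


bowtie = [(0.0, 0.0), (2.0, 2.0), (2.0, 0.0), (0.0, 2.0)]
i, j, x = first_self_crossing(bowtie)
part_a, part_b = divide(bowtie, i, j, x)   # two small rings
single = reverse_loop(bowtie, i, j, x)     # still one ring, touching itself at x
```

In this toy case the dividing model returns two rings that can each collapse
independently when scaled, while the reversed ring stays one object whose
signed area is the sum of both lobes. Whether a ring that touches itself at a
point is acceptable depends on the tools used.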

However, the third and last note was my major point. Just as a reminder:
there is a large set of area-related anomalies caused by relations between
objects from different classes (between seas, forests, lakes, rivers, etc.).
The extent and complexity of this set is far beyond the "broken polygons"
issue and should be more in the development focus. Even if the
areas/multipolygons within a class are in perfect conformity with the
strictest OSM and OGC rules, these anomalies are still there, though
sometimes hardly visible in maps. Therefore many map-makers tolerate them,
but in GIS systems they appear as strong limitations and should not be
tolerated. In the former mail I presented many examples and some hints on
how these anomalies could be resolved. Unfortunately, the discussion went in
the wrong direction, about the Scandinavian forests, while the region
selection is irrelevant to the subject. To avoid much repetition I will
present further examples without details of the procedures. The
illustrations are from the area of Japan (one of the best mapped areas) and
the source is the standard OSM dump from some weeks ago.

Honestly, I am not sure what a forest is. More precisely: if you ask me, I
know; if you ask me to tell what it is, I do not know. However, among the
many interpretations, I am closest to accepting the topological
interpretation of the notion: the green area in the front-page map (or in
other OSM-based maps), usually covering the areas tagged as forest and/or
wood. In Japan, as everywhere, forests are uploaded highly fragmented and
overlap in the strangest combinations; the same goes for river and lake area
objects. The most common case is when borders of neighbouring objects run in
and out of each other. The fragmentation itself causes lots of problems,
even in rendering. Just look at these examples (the well-known light/dark
stripes) here: <http://osm.org/go/7WCEND?layers=H>, or here:
<http://osm.org/go/7WCzACu--?layers=C>, or here: <https://goo.gl/JVI1E7>, or
here: <https://goo.gl/Xhv1nq>. Extending the areas within the object classes
may help in rendering, but the fragmentation is still there.

Assume we have managed to remove all redundancy, repair most of the "broken
polygons" and perform full defragmentation within the area classes: forests,
lakes, rivers and land masses. Besides, we have managed to recognize and
replace missing river sections and missing islands in lakes and rivers. So,
within any of these object classes we have the best data presentation that
is potentially possible from the source data. Yet we quickly discover that
there are forests overwritten by lakes, rivers running over forests, borders
of lakes running in and out of forests and so on (the inter-class
anomalies). While these anomalies are not show-stoppers in rendering, they
limit the corresponding GIS's quality, statistics, quantitative analyses and
forecasts (number of trees in forests, CO2 consumption per year, oxygen
production per year and so on). Let us assume we have managed to repair all
these anomalies by using the topology geometry/calculus as hinted in my
previous mail. Then some of the results are like these:

The country's land area created from the coastline data is here:
<http://goo.gl/O1L60r>; the border polygons are disjoint and there are no
holes at all. Subtracting all inland water areas and adding the islands
within these, we get the land-masses illustrated here:
<http://goo.gl/OM2dqn>. The yellow areas represent a minimal simple/compact
land-mass coverage. The inland waters make up only about 0.5% of the land
area.
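
The arithmetic behind "land area minus inland waters plus islands" can be
sketched on toy data. Everything here is an assumption made for illustration:
the rectangles, the helper names `rect` and `land_mass_area`, and planar
coordinates (real OSM data would first need an equal-area projection, and the
rings would have to be the cleaned, non-overlapping ones described above).

```python
from typing import List, Tuple

Ring = List[Tuple[float, float]]  # closed ring, last point joins the first


def ring_area(ring: Ring) -> float:
    """Unsigned area of a simple ring (shoelace formula)."""
    total = 0.0
    for i in range(len(ring)):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % len(ring)]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def rect(x: float, y: float, w: float, h: float) -> Ring:
    """Hypothetical helper: an axis-aligned rectangle as a ring."""
    return [(x, y), (x + w, y), (x + w, y + h), (x, y + h)]


def land_mass_area(coast: List[Ring], waters: List[Ring],
                   islands: List[Ring]) -> float:
    """Coastline area, minus inland waters, plus islands inside those waters."""
    return (sum(ring_area(r) for r in coast)
            - sum(ring_area(r) for r in waters)
            + sum(ring_area(r) for r in islands))


coast = [rect(0, 0, 100, 100)]    # land area from the coastline
waters = [rect(10, 10, 10, 5)]    # one lake
islands = [rect(12, 11, 2, 2)]    # one island inside that lake

land_area = sum(ring_area(r) for r in coast)
land_mass = land_mass_area(coast, waters, islands)
water_share = (land_area - land_mass) / land_area  # net inland water fraction
```

The same subtraction only gives meaningful numbers when the classes do not
overlap each other, which is exactly the inter-class clean-up discussed above.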

The country's forest coverage is pretty high: <https://goo.gl/HU63M7>. The
forests cover around 63.6% of the land-masses, though there are still some
forests to be mapped (see Kyushu island). The largest compact/simple forest
area, here: <https://goo.gl/4yzeyC>, equals 24% of all forests by size. It
consists of one outer/container polygon and 25831 inner/excluding polygons.
All polygons are disjoint, and from any point A to any point B in this area
one can walk exclusively through the forest (hm, the shortest way?).
However, the holes of this largest simple area contain an additional 2892
new (small) "forests". An extract from this complete, largest regional
forest is presented here: <https://goo.gl/mzgDRg>. The light green is the
largest simple forest area, while the dark green represents the smaller
forests in holes. One can see that there are even holes in these small
forests, new forests in their holes, and so on. Similar inclusions sometimes
go up to 6 levels. The ten largest simple areas make up 70.2% of all forests
in the country.
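
The "forests in holes in forests" nesting can be illustrated with a small
sketch. It assumes what the text states for the cleaned data, namely that the
border rings never touch or cross each other, so one vertex per ring is enough
to decide containment. The function names and the toy squares are
hypothetical, and how the levels are counted here (0 for an outermost ring)
is an assumption, not necessarily the counting used above.

```python
from typing import List, Tuple

Point = Tuple[float, float]
Ring = List[Point]  # closed ring, last point joins the first


def point_in_ring(pt: Point, ring: Ring) -> bool:
    """Ray-casting test: is pt strictly inside the ring?"""
    x, y = pt
    inside = False
    for i in range(len(ring)):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % len(ring)]
        if (y1 > y) != (y2 > y):  # edge straddles the horizontal ray
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def nesting_depths(rings: List[Ring]) -> List[int]:
    """Depth of each ring = number of other rings that contain it."""
    depths = []
    for a, ring in enumerate(rings):
        probe = ring[0]  # valid only because rings never touch or cross
        depths.append(sum(1 for b, other in enumerate(rings)
                          if a != b and point_in_ring(probe, other)))
    return depths


def square(x: float, y: float, side: float) -> Ring:
    """Hypothetical helper: an axis-aligned square as a ring."""
    return [(x, y), (x + side, y), (x + side, y + side), (x, y + side)]


# forest outer border, a hole in it, a small forest in the hole, a hole in that
rings = [square(0, 0, 10), square(1, 1, 8), square(2, 2, 6), square(3, 3, 1)]
depths = nesting_depths(rings)
kinds = ["forest" if d % 2 == 0 else "hole" for d in depths]
```

Even depths are read here as forest borders and odd depths as holes. The
pairwise test is quadratic in the number of rings, so for tens of thousands of
inner polygons a spatial index would be the more realistic choice.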

Finally, extending the case to other object types and/or larger areas like
continents or the Planet, one can feel the huge potential of OSM, especially
in the future with growing content. Simply, it is difficult not to be an
enthusiast of it. 

Regards, Sandor