SandorS sandors39 at gmail.com
Sat Jul 30 14:02:21 UTC 2016

Probably the most complete river systems that can be created from the OSM source data.
I have raised this subject several times, but mostly within the river area object class. At that time, I was only pointing out problems that cross the water-area object classes. Now, here is one option that resolves these problems as well. The complete river systems are generated from the latest OSM dump and can be downloaded from here:
You can use it as you like, but please respect the OSM licensing rules. Note that in this presentation there are no missing objects caused by overlapping areas with different contents (holes in one area that are not in the other and are consequently not visible in any vector mapping system), no breaks caused by tagging river sections as lakes, and so on.
The following notes outline the major processing skeleton and might be of interest to vector map-makers and to those interested in heavy topology and polygon algebra issues.
The extracted and used object classes are rivers (waterway=riverbank and natural=water + water=river), riverlines (waterway=river), lakes (natural=water and natural=water + water=lake) and the planet_land (the natural=coastline based land/water area objects).
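As an illustration only, the tag-to-class mapping listed above could be written as follows. The function name `classify` and the handling of other `water=*` values (they are simply skipped) are assumptions made for this sketch, not part of the actual tool chain.

```python
def classify(tags):
    """Return the object class for an OSM tag dict, or None if not used.

    Hypothetical helper: the class names follow the text above, the
    precedence between the rules is an assumption.
    """
    if tags.get("waterway") == "riverbank":
        return "rivers"
    if tags.get("natural") == "water":
        water = tags.get("water")
        if water == "river":
            return "rivers"
        if water is None or water == "lake":
            return "lakes"
        return None  # e.g. water=pond: not mentioned in the text, skipped here
    if tags.get("waterway") == "river":
        return "riverlines"
    if tags.get("natural") == "coastline":
        # Coastline ways are only the raw input for the planet_land areas.
        return "planet_land"
    return None
```

For example, `classify({"natural": "water", "water": "river"})` gives `"rivers"`, while an untyped `natural=water` falls into `"lakes"`.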
All these object classes pass through a robust data-preparation tool chain that ends with simple objects. During this procedure more than 100K consecutive replicated nodes are removed, more than 10K corridor and exact replicated polylines are removed, and so on. In addition, for the area object classes, two polygon classes are extracted: P0, the polygons that are not inside any other polygon, and P1, all the other polygons. E.g. rivers_P0 contains all the outer disjoint polygons from the rivers simple area class, while rivers_P1 contains all the corresponding holes, holes-in-holes, holes-in-holes-in-… After this, I have added the following correction areas to the rivers simple area class:
-planet_land_P1 (1119) lying inside at least one of the rivers_P0 (120 cases),
-planet_land_P0 lying inside one of the rivers_P0 (490 cases),
-circular/closed riverlines, longish areas statistically similar to riverbanks (4645 cases),
-longish lakes crossed lengthwise by a riverline (28692 cases) and
-longish lakes neighbouring similar riverbanks (5560 cases).
The recognition algorithms and especially the heuristic criteria used are complex and based on many, many experiments and on large samples.
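The two simplest preparation steps mentioned above, dropping consecutive replicated nodes and splitting an area class into P0 and P1, can be sketched as below. This is a minimal sketch and not the actual implementation: all function names are hypothetical, rings are assumed to be lists of (x, y) pairs without a repeated closing point, and rings are assumed never to cross or touch, so testing a single vertex is enough to decide nesting. The heuristic recognition criteria are not covered here.

```python
def drop_consecutive_duplicates(nodes):
    """Remove every node that repeats its immediate predecessor."""
    out = []
    for node in nodes:
        if not out or node != out[-1]:
            out.append(node)
    return out


def point_in_ring(pt, ring):
    """Ray-casting point-in-polygon test against one ring."""
    x, y = pt
    inside = False
    n = len(ring)
    for i in range(n):
        (x1, y1), (x2, y2) = ring[i], ring[(i + 1) % n]
        # Only edges that straddle the horizontal line through pt can count.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def split_p0_p1(rings):
    """Return (P0, P1) as index lists; P0 rings lie inside no other ring."""
    p0, p1 = [], []
    for i, ring in enumerate(rings):
        nested = any(
            j != i and point_in_ring(ring[0], other)
            for j, other in enumerate(rings)
        )
        (p1 if nested else p0).append(i)
    return p0, p1
```

The pairwise test is quadratic in the number of rings; on planet-scale data a spatial index would be needed in front of it.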
Now, the extended river area set is input to the create-area-coverage algorithm, which performs defragmentation and creates a perfect coverage. Instead of river fragments there are now river systems like the Mississippi, Amazonas, Danube, Volga… river systems, presented as huge and complex areas/relations, yet each in a simple area structure. These objects are, if necessary, input to data generalisation (scale levels), tiling and so on. Unfortunately, there are several breaks in large river systems where obvious river sections are incorrectly connected to large lakes (yes I know, it is legal, but that is something else). These cases need manual intervention. This is yet to be done.
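To give an idea of what defragmentation means, here is a deliberately simplified sketch. It is not the create-area-coverage algorithm itself. It assumes that neighbouring fragments share identical node ids along their common border and never overlap, and that each ring is a list of node ids without a repeated closing node. Fragments sharing an edge are grouped with a union-find structure, and the outline of each group is the set of edges used by exactly one fragment. All names are hypothetical.

```python
from collections import defaultdict


def ring_edges(ring):
    """Yield the undirected edges of a closed ring of node ids."""
    n = len(ring)
    for i in range(n):
        a, b = ring[i], ring[(i + 1) % n]
        yield (a, b) if a <= b else (b, a)


def merge_fragments(fragments):
    """Group fragments sharing an edge.

    fragments: dict mapping a fragment id to its ring (list of node ids).
    Returns a list of (member_ids, outline_edges) tuples, one per group.
    """
    parent = {fid: fid for fid in fragments}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    # Record which fragments use each edge.
    owners = defaultdict(list)
    for fid, ring in fragments.items():
        for edge in ring_edges(ring):
            owners[edge].append(fid)

    # Fragments that use the same edge belong to the same river system.
    for fids in owners.values():
        for other in fids[1:]:
            parent[find(other)] = find(fids[0])

    # Edges with a single owner form the outline of the merged area.
    groups = defaultdict(lambda: (set(), set()))
    for edge, fids in owners.items():
        members, outline = groups[find(fids[0])]
        members.update(fids)
        if len(fids) == 1:
            outline.add(edge)
    return [(sorted(m), sorted(o)) for m, o in groups.values()]
```

Two squares `[1, 2, 3, 4]` and `[2, 5, 6, 3]` share the edge (2, 3), so they end up in one group whose outline no longer contains that edge. Real data with overlaps or slightly different borders needs true polygon algebra instead of this edge cancelling.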
Finally, just to mention, the lakes, the riverlines and the planet_land/planet_sea are corrected in a similar way. E.g. the planet_land now contains only simple areas (with no holes at all), and from there the planet_sea (which is now the Global Ocean) is created in just several seconds.
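One plausible reading of why a hole-free planet_land makes the planet_sea so cheap to build: the sea can then be written as a single outer ring with every land outline as a hole, with no polygon algebra at all. The sketch below illustrates this under strong assumptions that are mine and not taken from the actual processing: planar lon/lat coordinates, a rectangular world ring, and land rings that do not touch that ring. The winding order (outer ring counterclockwise, holes clockwise) follows the GeoJSON convention. All names are hypothetical.

```python
def signed_area(ring):
    """Shoelace formula; positive for counterclockwise rings."""
    pts = list(ring)
    return 0.5 * sum(
        x1 * y2 - x2 * y1
        for (x1, y1), (x2, y2) in zip(pts, pts[1:] + pts[:1])
    )


def oriented(ring, ccw):
    """Return the ring with the requested winding order."""
    pts = list(ring)
    return pts if (signed_area(pts) > 0) == ccw else pts[::-1]


def sea_from_land(land_rings, bbox=(-180.0, -90.0, 180.0, 90.0)):
    """Build one sea polygon: the world ring with all land rings as holes."""
    min_x, min_y, max_x, max_y = bbox
    world = [(min_x, min_y), (max_x, min_y), (max_x, max_y), (min_x, max_y)]
    return {
        "outer": oriented(world, ccw=True),
        "holes": [oriented(ring, ccw=False) for ring in land_rings],
    }
```

The cost is linear in the number of land nodes, since each ring is only checked for its winding order and copied.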
Regards, Sandor.
