[OSM-talk] Fixing broken multipolygons, some notes

john whelan jwhelan0112 at gmail.com
Sat Mar 18 20:54:44 UTC 2017


There has been some discussion on the HOT mailing list that makes things a
bit clearer.

OSM in general has a fair number of things that have been added in a less
than ideal way.  It can be difficult to correct some of them, as we have
guidelines or recommended practices as opposed to hard and fast rules, but
MapRoulette has managed to identify a number of areas where there is some
agreement about what needs to be corrected.

JOSM validation also tries to identify problem areas, so I suspect fixing
the underlying data is the better long-term solution, rather than making
all the different rendering systems more robust.  Robustness costs
machine cycles as well.

Cheerio John

On 18 Mar 2017 4:43 pm, "Sandor Seres" <sandors39 at gmail.com> wrote:

> I am new to this list, so I apologize for any misinterpretations and
> wrong style. The motivation for this mail is a worrying mail on the local
> list about the poorer osm2pgsql-based maps and about the “broken polygons”
> fixing strategies. The mentioned white spots in the Scandinavian forests
> are just an illustration. If broken polygons are simply dropped, empty
> spots will be typical for any area type and for any corner of the Planet.
>
> As I understand it, osm2pgsql is an application that prepares data from
> the OSM source data into a DB used by many mapmakers for rendering. We can
> see that almost all OSM-based public mapping systems use this database and
> consequently repeat the same anomalies. Therefore, making osm2pgsql more
> robust could be a better strategy. There is still a large potential for
> such strengthening. Just waiting for “do-ocracy” repairs is really a
> long-term strategy. Anyway, users starting from the source OSM data will
> not be affected by any of these strategies.
>
> “Fixing broken polygons”, especially programmatic/mass fixing, could be
> more dangerous to all users. Just look at the many possible options for
> fixing self-crossings. Loosely defined notions are open to different
> interpretations and different sets of error criteria. Consequently, for
> the same object type we may have (and we do have) different error classes
> and repair tools. Besides the typical interpretations of a polygon as an
> area (the ESRI polygon redefinition) or as a closed polygonal line, we
> simply can’t find in the documentation what the “outer”, “inner”, “hole” …
> notions actually mean. The interpretation (individual perception) of
> these notions is left to us, and there we have a source of
> misunderstandings. For instance, if we assume that “outer” border polygons
> define the interior candidate points (points inside and on the border) and
> “inner” border polygons define (in the same way) the exterior points of an
> area, then self-crossings, touching polygons, polygon overlaps, crossings…
> are not errors at all.
>
> However, my point here is something else. The “broken multipolygon”
> (whatever that means) issue is just “the tip of the iceberg”. There still
> remains a huge number of anomalies caused by relations between area
> objects from different area classes. I intentionally use the notion of
> anomaly, as a moderate form of error, because many people/mapmakers may
> live with them. But a modern GIS system and vector-layer-based digital
> cartography cannot tolerate them. Let me present some arguments and
> illustrations.
>
> Let us look at a map extract from the mentioned Scandinavian forests here
> http://osm.org/go/0Tt1PZIt- . The example could be taken from any corner
> of the Planet and, as mentioned, there is a huge number of similar cases.
> At first glance, everything looks correct and nice (and it is). However,
> we immediately see that something is still wrong. The forest type symbols
> are placed directly over the water. In another style, typical land-related
> names are on the water, like here http://osm.org/go/0Tt1PZIt-?layers=T .
> Looking at the source data we can see that the lake in the middle is placed
> over an empty space (intentionally, not a hole) where the border of the
> lake runs slightly in and outside the forests. At the same time, we can see
> many forest areas inside the mentioned empty space overwritten by the
> lake, which has no holes. Consequently, there are many missing islands in
> the lake and many missing forest areas in the extract. Note that on that
> little extract alone there are more than 40 of the described anomalies.
> What is more, there are many lakes with borders running in/out of forest
> areas (corridor border overlaps), having considerable parts over a forest
> and holes in forests, partly overlapping several disjoint forest areas
> and so on, and the contrary. Extending the case to the Planet and to other
> combinations of area types, we may feel the extent of the issue.
>
> There were attempts to compensate for these problems in rendering, like
> rendering the holes, rendering smaller objects over larger ones and so on.
> These actions generally do not work. Simply, they do good in some places
> and damage in other places. So, the question is whether we can do
> something about the problem, and what. Just waiting for do-ocracy based
> repairs is, obviously, irrational. Fortunately, the source data has a
> large potential to remove most of the mentioned anomalies. Let me present
> some hints for the forest, lake and river combinations.
>
> Assume {F0} is the set of all forest outer border polygons (closed
> polygonal lines) and {F1,L0,R0} is the set of all inner forest, outer lake
> and outer river border polygons (the orientations and the relations are
> irrelevant). Then, you can prove the existence of a minimal disjoint
> simple area coverage of the forests. In other words, you can find a set of
> isolated simple areas (one outer and zero or more inner polygons) where
> any area point is on/inside at least one element of {F0} and never
> on/inside any element of {F1,L0,R0}. This coverage is the topological area
> difference, or subtraction, {D}=U{F0}-U{F1,L0,R0}, where U stands for
> union. Finding this coverage is really a nice challenge for researchers in
> topology, algorithms and, of course, programming. Some data preparation
> tools already have procedures for making this coverage for some major area
> type combinations like the planet_sea/global_ocean, forests, lakes, rivers
> and some more. An extract from such a coverage for the forests, lakes and
> rivers combination is presented in this image
> https://drive.google.com/file/d/0B6qGm3k2qWHqLWMtcVRIVklXUmc/view?usp=sharing .
> Note that whatever Z/rendering order one takes, the image is always the
> same. The only difference may appear in the borderline colours if hard
> edge rendering is used, but even this difference disappears with the
> “smooth edge” anti-aliasing technology.
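
For illustration only, here is a minimal sketch of the point-wise definition
above: a point belongs to {D} if it is on/inside at least one ring of {F0}
and on/inside no ring of {F1,L0,R0}. It only tests single points; it does
not construct the coverage polygons, and it is not how osm2pgsql or any
other tool works. The function names, the even-odd rule for self-crossing
rings and the toy coordinates are all my own assumptions.

```python
from typing import Sequence, Tuple

Point = Tuple[float, float]
Ring = Sequence[Point]  # closed polygonal line; repeating the first vertex is optional


def on_segment(p: Point, a: Point, b: Point) -> bool:
    """True if p lies exactly on segment a-b (exact coordinates assumed)."""
    cross = (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0])
    if cross != 0:
        return False
    return (min(a[0], b[0]) <= p[0] <= max(a[0], b[0])
            and min(a[1], b[1]) <= p[1] <= max(a[1], b[1]))


def on_or_inside(p: Point, ring: Ring) -> bool:
    """Even-odd ray casting; points on the border count as inside."""
    inside = False
    n = len(ring)
    for i in range(n):
        a, b = ring[i], ring[(i + 1) % n]
        if on_segment(p, a, b):
            return True
        # Does a horizontal ray from p towards +x cross the edge a-b?
        if (a[1] > p[1]) != (b[1] > p[1]):
            x_cross = a[0] + (p[1] - a[1]) * (b[0] - a[0]) / (b[1] - a[1])
            if x_cross > p[0]:
                inside = not inside
    return inside


def in_forest_coverage(p: Point, f0: Sequence[Ring], subtract: Sequence[Ring]) -> bool:
    """Membership in D = U{F0} - U{F1,L0,R0}, tested for a single point."""
    return (any(on_or_inside(p, r) for r in f0)
            and not any(on_or_inside(p, r) for r in subtract))


# Hypothetical toy data: one forest outer ring, and a lake whose border
# runs in and out of the forest.
forest = [(0, 0), (10, 0), (10, 10), (0, 10)]
lake = [(8, 4), (12, 4), (12, 6), (8, 6)]

for pt in [(2, 2), (9, 5), (11, 5), (8, 5), (0, 5)]:
    print(pt, in_forest_coverage(pt, [forest], [lake]))
```

Because membership depends only on the two sets of rings and not on the
order in which they are listed, the result is the same whatever drawing
order is used, which is the property noted above.
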
>
> Regards, Sandor.
>
>
>
> _______________________________________________
> talk mailing list
> talk at openstreetmap.org
> https://lists.openstreetmap.org/listinfo/talk
>
>