[Imports] Import of forests, farmland and other types of land cover for Sweden generated from Naturvårdsverkets Nationella Marktäckedata 2018

Grigory Rechistov ggg_mail at inbox.ru
Wed Apr 24 07:49:15 UTC 2019


Hi all,

I agree on some points with Christoph, Peter and Martin, and on some points
with Jarek. So let me comment on the remaining ones, and suggest some
future course of action for myself and others in regard to this import, if I may.

First, I want to point out that the vast majority of OSM data imports [2] deal with
single-node objects (e.g. bus stops, addresses), linear objects (roads),
and sometimes "small" man-made closed polygons (buildings). There are, however,
few imports dealing with naturally shaped polygons (land cover).
This last category of objects has some unique properties which create fascinating
challenges, in particular:

1. Natural objects have "blurred" and fractal borders, leading to endless disputes
   about whose border drawing is right, when no one is right: it is simply impossible
   to represent such a border as a polygon with a pre-defined error margin.

2. Conflation with existing natural borders. In OSM, it is often expected that
   two adjacent natural areas share a single border. As follows from #1,
   two different humans or tools will always draw this shared border differently.
   There should be a way to settle on one border, however; see below.

3. The sheer amount of data and the need to control it. As follows from #1,
   machines are especially eager to generate polygons with a crazy high number of
   nodes. It is then a human's task to filter the nodes, leaving a reasonable
   subset. Again, some people will argue that a particular forest area does not
   look right when compared against particular aerial imagery.
   And it never will, even if we allow it to have a crazy high number of
   nodes. Then other people will argue that their editor freezes when
   rendering such an area.
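
To make the node-count trade-off in #3 concrete, here is a minimal Python
sketch of Ramer-Douglas-Peucker simplification, one common way to thin out a
machine-generated border. It is only an assumption that an import pipeline
would use this particular algorithm; the function names and the tolerance
value are made up for the example, and coordinates are treated as planar
(already projected), not as raw latitude/longitude.

```python
import math


def point_segment_distance(p, a, b):
    """Distance from point p to segment a-b; all are (x, y) tuples in planar units."""
    ax, ay = a
    bx, by = b
    px, py = p
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0.0:
        # Degenerate segment: a and b coincide.
        return math.hypot(px - ax, py - ay)
    # Project p onto the segment and clamp to its endpoints.
    t = ((px - ax) * dx + (py - ay) * dy) / seg_len_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def simplify(points, tolerance):
    """Ramer-Douglas-Peucker on an open polyline (hypothetical helper).

    Keeps the end points and every node that deviates from the simplified
    line by more than `tolerance`.
    """
    if len(points) < 3:
        return list(points)
    first, last = points[0], points[-1]
    worst_index, worst_dist = 0, 0.0
    for i in range(1, len(points) - 1):
        d = point_segment_distance(points[i], first, last)
        if d > worst_dist:
            worst_index, worst_dist = i, d
    if worst_dist <= tolerance:
        return [first, last]
    left = simplify(points[:worst_index + 1], tolerance)
    right = simplify(points[worst_index:], tolerance)
    # The split node ends `left` and starts `right`; keep it once.
    return left[:-1] + right
```

For a border such as (0,0), (1,0.05), (2,-0.04), (3,0.02), (4,0), (5,3), (6,0)
and a tolerance of 0.1, the three nearly collinear nodes are dropped while the
sharp corner at (5,3) survives. Whatever tolerance is chosen, someone can still
argue that the result does not match a particular imagery layer, which is
exactly the dispute described in #1.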

Note how this differs from man-made features: their borders are always sharp.
People would be uncomfortable living in fuzzy buildings, or riding fractal roads.
Even our farmlands have straight borders, unless they meet a forest, where
natural "chaos" comes into conflict with the human desire to simplify things.

My understanding (please correct me if I am wrong!) is that OSM currently
lacks good tools and checks to assist with areas.
There are tools such as "simplify way", "simplify area", "merge contours",
"merge areas" etc., and I have used them countless times to "straighten out"
natural objects along borders.
There are also tools to conflate nodes and linear ways to help with importing
such data [5]. However, these tools do not scale well or do not work for closed areas.
Editors such as JOSM often give no warning for intersecting natural areas,
possibly because of #1 above, even though I do wish to be reminded when
a forest slips into water.
To me, this partially explains the community's reluctance to accept such imports:
we do not have tools to do quality assurance of such data, and doing it manually
is daunting. Could someone create a Summer of Code task to work on such tools,
or something similar?


Now, to some comments and answers.

> it should not be an acceptable excuse for (e.g.) topological errors
Agreed, topological errors are not acceptable. By that I mean things such as
unclosed polygons, multipolygons with inner ways outside outer ways,
extremely long ways, self-intersecting ways etc.
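
Two of these checks are easy to sketch. The following Python snippet is only
an illustration, not code from any existing validator: the function names are
hypothetical, a ring is assumed to be a list of (x, y) tuples, and only
"proper" crossings are detected (segments that merely touch or overlap
collinearly are not reported).

```python
def is_closed(ring):
    """A closed ring repeats its first node at the end and has at least 3 distinct corners."""
    return len(ring) >= 4 and ring[0] == ring[-1]


def _orient(a, b, c):
    """Twice the signed area of triangle a-b-c; the sign tells which side of a-b c lies on."""
    return (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])


def segments_cross(p1, p2, q1, q2):
    """True if segment p1-p2 and segment q1-q2 cross at a single interior point."""
    d1 = _orient(q1, q2, p1)
    d2 = _orient(q1, q2, p2)
    d3 = _orient(p1, p2, q1)
    d4 = _orient(p1, p2, q2)
    return d1 * d2 < 0 and d3 * d4 < 0


def is_self_intersecting(ring):
    """Brute-force O(n^2) check of every pair of non-adjacent segments."""
    n = len(ring) - 1  # number of segments in the ring
    for i in range(n):
        for j in range(i + 2, n):
            if i == 0 and j == n - 1:
                continue  # first and last segment are neighbours via the closing node
            if segments_cross(ring[i], ring[i + 1], ring[j], ring[j + 1]):
                return True
    return False
```

The quadratic loop is fine for a single forest outline but would need a
spatial index to run over a whole import; that choice is left open here.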

> that the average mappers produce or use them as well.
Pray tell, what is an average mapper's work? Shouldn't we just refuse to accept
an average mapper's contributions until they have a diploma of a qualified
OSM-mapper :-) ?

I would gratefully accept *actionable* critique of the form:
"In your import, X is bad because Y. To make it better, do Z". There is some
good criticism in this thread, and in the talk-se thread, that we will turn
into action.

> Almost only new ways and it seems everything is tagged with landuse=forest,
> no matter if it's natural scrub or wetland or whatever else

This import does have areas of forest, scrub, wetland, and their combinations.
We also have additional tags available for forests to clarify their type
(broadleaved etc.), but I omitted them for the time being to cut down on the
number of polygons. One is always surprised to see how frequently one type of
forest gives way to another in reality, and how this translates into an
increase in the number of polygons.

>> The first 16 of those changesets
>> uploaded nodes, 10k each of them (i.e. total of 160000 nodes).
> Limitations of API should be dealt with for the convenience of the
> people doing the mapping, not of the API.

I used JOSM's upload interface for this first medium-sized import,
just to learn how well it deals with larger changesets, network issues,
intermittent conflicts etc. Things went much better than I expected. At least it
did not screw up the whole database when a minor conflict was discovered somewhere
in the middle of the upload, and it allowed me to fix it and continue from the
point of interruption.

I am aware of other tools that reorder changesets to look more human-like,
not just "nodes first, ways second, relations last" or whatever. I plan
to use these tools later for really large subareas of this import.
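
To illustrate what such reordering could look like, here is a rough Python
sketch that packs each way together with its own nodes into a batch, so that
every batch is self-contained. It does not reproduce any particular tool:
`pack_ways` and its parameters are hypothetical, and the default limit of
10000 elements is simply taken from the changeset sizes quoted above.

```python
def pack_ways(ways, max_elements=10_000):
    """Group ways into batches, each holding the ways plus the nodes they need.

    `ways` maps a way id to its list of node ids. Returns a list of batches,
    each batch being a list of way ids. Simplifications: a node shared by
    ways in different batches is counted in both, and a single way larger
    than `max_elements` still gets a batch of its own.
    """
    batches = []
    current_ways, current_nodes = [], set()
    for way_id, node_ids in ways.items():
        new_nodes = set(node_ids) - current_nodes
        cost = len(new_nodes) + 1  # nodes not yet in the batch, plus the way itself
        size = len(current_nodes) + len(current_ways)
        if current_ways and size + cost > max_elements:
            # Batch is full: close it and start a new one with this way.
            batches.append(current_ways)
            current_ways, current_nodes = [], set()
            new_nodes = set(node_ids)
        current_ways.append(way_id)
        current_nodes |= new_nodes
    if current_ways:
        batches.append(current_ways)
    return batches
```

Each resulting batch could then be uploaded as one changeset that adds
complete, viewable areas rather than thousands of bare nodes.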

If JOSM's way of uploading larger changes looks so disturbing, shouldn't it
just be fixed? Is there already a bug report for it? Should I file one?

> It might be charitable to note here that there is a large discussion
> about what is supposed to be the meaning of landuse=forest

The discussion of "landuse=forest" vs "natural=wood" vs "..." is endless and I
won't be able to add anything valuable to it. The majority of forests in Sweden
are tagged with "landuse=forest". When I used "natural=wood", I was asked not to.
I am a simple man, and I know it is a forest when I see it [1]. If I am unsure
about a certain area, I leave it untouched, no matter what the import data
layer says.

Now, to future improvements.

> But outlines also don't match. Strange, but ok. A bit more to the east
> I noticed the lakeshore: It intersects the forests multiple times.
> Strange. A bit to the south, the island Lövholmen intersects the forest
> yet again. And so on and so on.

This bothers me as well. I have personally spent a lot of time manually merging
closely placed but mismatched and sometimes intersecting forest/water borders
of pre-existing objects, whether drawn by humans or created by past land cover
imports such as CORINE [3]. We definitely do not want to create more such
borders. I do have some ideas on how to detect and/or resolve at least some of
these conflation problems, and I am working on improving the current conflation
script [4] to take even more details of pre-existing objects into account.

Currently the script already either skips or marks for human resolution those
cases where land use in pre-existing data and in new data may intersect.
So it was not a data or machine error; they did their job.
It was my personal oversight, as I mostly focused on residential/forest
intersections.
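
For readers who want a picture of what "possibly intersect" can mean, below is
a deliberately crude Python sketch of such a triage step. It is not the logic
of the conflation script [4]; the function names are hypothetical, rings are
assumed to be lists of (x, y) tuples, and the test is a plain bounding-box
overlap, which errs on the side of sending too much to a human.

```python
def bbox(ring):
    """Bounding box of a ring as (min_x, min_y, max_x, max_y)."""
    xs = [x for x, _ in ring]
    ys = [y for _, y in ring]
    return (min(xs), min(ys), max(xs), max(ys))


def bboxes_overlap(a, b):
    """True if two bounding boxes share at least one point."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def triage(new_rings, existing_rings):
    """Split new rings into (safe_to_import, needs_human_review).

    A new ring needs review if its bounding box overlaps the bounding box of
    any pre-existing ring. No false negatives for real intersections, but
    plenty of false positives.
    """
    existing_boxes = [bbox(r) for r in existing_rings]
    safe, review = [], []
    for ring in new_rings:
        box = bbox(ring)
        if any(bboxes_overlap(box, other) for other in existing_boxes):
            review.append(ring)
        else:
            safe.append(ring)
    return safe, review
```

A forest whose box overlaps an existing lake's box lands in the review list
even if the outlines never actually touch; a precise polygon test could be
run afterwards on that shorter list only.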

We won't be trying to import data for further subareas of the country until
the intersection problems and some other annoying issues are reliably detected and,
preferably, automatically solved to reduce the chance of human error.
If there are other systematic problems present in this first batch that you
would like to see fixed (and in what manner), please inform us.

Thank you, everyone, and sorry for the long read.

[1] https://en.wikipedia.org/wiki/I_know_it_when_I_see_it
[2] https://wiki.openstreetmap.org/wiki/Import/Catalogue
[3] https://wiki.openstreetmap.org/wiki/WikiProject_Corine_Land_Cover
[4] https://github.com/grigory-rechistov/nmd-osm-tools/blob/master/conflate.py
[5] https://wiki.openstreetmap.org/wiki/Conflation


>Wednesday, 24 April 2019, 2:39 +03:00 from Jarek Piórkowski <jarek at piorkowski.ca>:
>
>On Tue, 23 Apr 2019 at 16:59, Peter Barth < osm-peda at won2.de > wrote:
>> I didn't read the plan btw, but wanted to read the ML if there really
>> was community acceptance as any note about this was left out in this
>> thread. A huge thread, all swedish, so no idea if swedish community is
>> ok or not. Did they accept or decline or abstain?!
>
>Hello Peter,
>
>This comment attracted my attention. In my experience it is quite
>usual that when local issues are discussed in local forums, it is done
>in the local language, whether English, French or Swedish. I have not
>often seen a summary written in a foreign language towards the end of
>the discussion. Perhaps it is common in the German-speaking forums or
>mailing lists?
>
>> The first 16 of those changesets
>> uploaded nodes, 10k each of them (i.e. total of 160000 nodes). No ways
>> or relations. Ok, so a huge change, split into changesets leaving
>> traceability to the mapper instead of the importer. As always, I want
>> to add. Changeset Nr. 17[2] actually adds something. Of course again
>> a quite large one but at least achavi[3] could load it.
>
>Limitations of API should be dealt with for the convenience of the
>people doing the mapping, not of the API. OSM tooling for viewing
>changesets and area history is known to be poor but this cannot be a
>determining factor for contributing to OSM (otherwise I request that
>rosemary 0.4.4 or wheelmap changesets with bounding boxes spanning the
>globe be blocked immediately). Would you rather have contributors
>figuring out how to deal with API limitations and making pretty
>changesets, or doing mapping or indeed importing?
>
>> Almost only new ways
>
>This cannot be surprising in an area where forests are largely unmapped.
>
>> and it seems everything is tagged with
>> landuse=forest, no matter if it's natural scrub or wetland or whatever
>> else. Seems wrong to me and taginfo[4].
>
>It might be charitable to note here that there is a large discussion
>about what is supposed to be the meaning of landuse=forest, and of
>landuse and landcover tags in general, with no OSM-wide consensus that
>I've seen so far. If there were trees grown in an area, they were cut
>down for wood, and it's regrowing as scrub, is the land still "being
>used as a forest"? Is wetland=swamp automatically not a forest?
>
>> I opened a small test area in an editor, a small island[5]. It has an
>> offset to all imageries out there. Ok, imagery can be wrong for sure.
>> But outlines also don't match. Strange, but ok. A bit more to the east
>> I noticed the lakeshore: It intersects the forests multiple times.
>> Strange. A bit to the south, the island Lövholmen intersects the forest
>> yet again. And so on and so on.
>
>Indeed from imagery it looks like the lake, which was previously
>mapped (seemingly by hand in changeset 29491171), is inaccurate, and
>the bits of forest that are now mapped at 16/59.0365/16.0570 appear
>fairly accurate to me comparing to local orthophoto (Lantmäteriet
>Historic Orthophoto 1975). From the achavi visualizations we can see
>that the lake outline was largely unchanged. Would you have liked to
>see the lakes manually fixed while importing forests?
>
>> From this quite small random sample I'd argue that this is a very low
>> quality import. I'm not really astonished about that, but I'm
>> questioning if it isn't time to increase our quality standards wrt
>> imports and introduce import permissions as opposed to just ignore
>> criticism and wait a week or two to import.
>
>And who might issue such a permission? The local community? The
>already overworked DWG? OSMF?
>
>Best,
>Jarek
>


With best regards,
Grigory Rechistov