[Imports] UN Mappers import of UNSOS waterways in Somalia

Rafael Avila Coya ravilacoya at gmail.com
Wed Apr 8 18:05:46 UTC 2020

Hi, Christoph:

Thank you very much for this informative and thorough assessment!

On 08/04/20 at 17:38, Christoph Hormann wrote:
> Thanks for the additional info, that makes things a bit clearer.
> Based on this it seems your suggested tagging is not right.  You base
> the classification into waterway types on the HYP attribute - which is
> not what this indicates apparently.

As explained already, I based that on checking a sample. We could leave
the waterway tag empty, but that would slow the process and make it more
tiresome. If most HYP=1 and HYP=2 features are rivers, and HYP=4
features are streams most of the time, we can set that as the default,
and users will correct it when needed. I can't see any problem in this.
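
To illustrate the defaulting rule described above, here is a minimal
sketch in Python. It is hypothetical: the dictionary and function names
are made up for illustration, and it is not the script used to prepare
the data. It only encodes the sample-based rule (HYP 1 and 2 default to
river, HYP 4 to stream) and leaves any other value untagged so the
mapper decides.

```python
# Hypothetical sketch, not the actual conversion script.
# Default waterway value per HYP attribute, based on the sampled data:
# HYP 1 and 2 are mostly rivers, HYP 4 is mostly streams.
DEFAULT_WATERWAY_BY_HYP = {1: "river", 2: "river", 4: "stream"}


def default_tags(hyp):
    """Return the default OSM tags for a feature with the given HYP value.

    The result is only a starting point: the mapper is expected to
    correct or remove the tag after checking imagery.
    """
    waterway = DEFAULT_WATERWAY_BY_HYP.get(hyp)
    if waterway is None:
        # No default for this HYP value; leave the tag for the mapper.
        return {}
    return {"waterway": waterway}
```

For example, default_tags(4) gives a waterway=stream default, while an
unlisted HYP value gives no tag at all.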

> More generally speaking i have doubts about the wisdom in importing
> large parts of the data.  Looking over it a large fraction of the
> waterways in the data set are not correct.  This is already hinted in
> the information you provided.  Most of the waterways (more than 40k,
> i.e. more than 90 percent) are indicated to be dry, that is without any
> evidence of present day water flow.  Looking over the data i could find
> a lot of cases where a drainage shaped landform was interpreted to be a
> stream and that stream was then continued downhill without any physical
> indication of even historic water flow - sometimes along tracks misread
> to be streambeds, sometimes also right across villages and other human
> built structures.

There are actually some examples of those waterways in the workflow
wiki: examples of ways that should not be imported (that should be
deleted instead). I can add more examples if you let me know, maybe in
an appendix, to avoid making the wiki difficult to follow because of
too many captions.

> In subtropical Africa it is very common that due to climate change (both
> recent human made and natural changes over the last few thousand years)
> as well as immensely intensified groundwater use valleys created by
> water flow - with often indications of that visible in imagery - do not
> carry even sporadic water flow any more at present time.  This is
> called a fossile waterway.  According to OSMs verifiability principle
> mapping such structures as waterway however is clearly wrong.
> The problem i see is that importing such data where a large portion of
> the features are factually incorrect will either result in
> * a lot of incorrect data in serious need of cleanup imposing a serious
> debt on the local community.
> * a lot of work to evaluate every single one of the >40k features to
> assess if it really represents a verifiable waterway.  My estimate
> would be that this work might be more efficiently invested in mapping
> those from scratch.

We make errors all the time when mapping remotely, not only when adding
wadis that might not carry water all year round. We, as mappers, put
all our effort into mapping in the best way possible anyway. It's the
responsibility of the mappers to decide, with a validation step too.
Nothing will be perfect, as usual. Take any two experienced users, have
them map waterways in a certain area, and you won't get the same OSM
file.

Let me give you one example: we recently asked the community to help
the UN mission in Somalia map some features in some areas of the
country. One project ( https://tasks.hotosm.org/project/7918 ) asked for
waterway mapping in an area of the south, an area where there aren't any
of these candidate-for-import waterways. The project was divided into
100 tasks (squares), and one user, Arne Kimmig, who is participating in
this thread, did 92% of them, I think with very good quality.

As nobody else jumped in, I've been validating all the tasks of that
project, and have made some modifications, corrections, and additions to
most of them. I've validated 83% already, so I am almost finished. But I
am sure that if a third mapper came in, there would be further
modifications, and so on. With this import, which is basically a
waterway mapping effort where we integrate the data in a controlled way,
we will follow the same process.

> Now this of course varies a bit across the coverage area - in the
> western part a significant fraction of the HYP=4 waterways show
> indications they could be legitimate intermittent streams based on
> available images (though you often have to spend a lot of time looking
> for hints for that).  In the east this is much less so and i would
> probably consider the majority of features bogus.

We will actually start from the east. For each new project (this import
is so huge that it will need 15 TM projects, if not more) we can check
the data and decide. It's not a matter of adding data that is 80% bad;
my understanding after sampling is that, in general, the majority of
ways are OK for importing, some with modifications. If some areas don't
look worth the effort, we will discuss them with the interested users
and the local community, and if we think they aren't worth it we will
drop those areas and continue with others. We want to improve the map;
that's the aim.

> At the same time the positional accuracy of many of the features is poor
> with positional errors in the order of often 50-100m.  This is another
> indication that importing this could be quite wasteful in terms of time
> spent.

Not many, in my opinion. Participants are being asked to correct
geometries when needed. But I've seen thousands of waterways everywhere
that are less accurate than most of these waterways. So I don't see any
issue here, really.

> Overall i would probably say that for mappers invested in improving the
> area this data could be useful to help identify where there are
> possibly waterways to map.  But as an actual import where the mapper
> just does a quick verification and tag adjustment before adding the
> feature more or less as is it seems less suitable.  And actual import
> would bear a high risk of well meaning mappers adding a lot of
> incorrect data because of the principal difficulty of proving a
> negative (it is hard to reliably prove a waterway from the data does
> not exist in reality and many likely will be inclined to trust the data
> being correct in such cases).

I get your point, and I will put more emphasis in both wikis on the
fact that waterways don't have to be uploaded when they aren't OK. The
aim is clear: to help map waterways taking this dataset as a base, but
not trusting it blindly. Participants aren't asked to prove that a
waterway is incorrect; users are free to decide what they think, and so
are the validators.

> Existing waterway data in the area is - while being incomplete - fairly
> decent quality it seems and it would be unfortunate if that was diluted
> by low reliability data.

So let's make sure this doesn't happen. With a well-documented workflow
wiki, the goal of this data integration kept clearly in mind, good and
constant communication among participants, and proper validation, I am
sure the results will be fairly OK and the map improved and enriched.

Thank you again for spending your valuable time on this review. I really
appreciate it.


