[Imports] Tanzania roads manual import

Rafael Avila Coya ravilacoya at gmail.com
Fri Aug 21 23:12:23 UTC 2015

Hi, Christoph:

Thanks for your feedback, and sorry for replying late these days. I am
out for leisure for a few days, and my internet connection is very bad.

Here are my answers to your comments:

On 20/08/15 14:52, Christoph Hormann wrote:
> These comments are based on the data made available now.  This
> covers only a small part of Tanzania which is mostly already quite
> well mapped and is fairly uncritical w.r.t. imagery (not much dense
> forest vegetation, no persistent cloud cover).  I am going to
> assume this is also what is going to be imported, if the actual
> import is supposed to cover other areas as well things might vary.

This subset is the same one that I used for my assessment, on which I
based my opinion about the best solution for integrating the data into
OSM. But there is one key thing I have to remind you of here: most of
the data has been collected with GPS trackers, and it has been
completed with some road segments for roads that a) weren't present in
either OSM or the MSD&JSI data, AND b) lie where hi-res Bing imagery
was available. So we can be sure that, for areas where no hi-res
imagery was available OR areas with dense forest, clouds, etc., the
MSD&JSI data comes from GPS tracks, so we rely on that data for those
areas. I think that for those roads we shouldn't add any surface tag,
but apart from that we should consider them good and fine for import.
A good reason for this is that the GPS-collected roads in areas where
we can check against hi-res imagery are of very good quality.

But for sure, this has to be added to the wiki and workflow, as well
as other ideas given by you and others in this thread.

> In general since a large portion of the roads in the data set are 
> already in OSM in comparable or better quality it would make sense
> in terms of efficiency, avoidance of errors and ease for the
> mappers to remove those roads in advance (i.e. everything that is
> within some small distance of an existing road).  In some areas
> manually finding the roads not yet mapped in OSM comes close to a
> needle in a haystack problem.

You can't imagine how many times and for how long I have thought about
this idea of removing beforehand, by software, the MSD&JSI road
segments that are already present in OSM. The advantage would be very
clear: we would save the import users a lot of time and effort.
But at the same time, I also spent quite a few hours going through the
data, and my conclusion is that a good number of MSD&JSI segments are
an improvement on their OSM counterparts. And I think it would be a
pity to miss this opportunity to improve many of the existing OSM roads.
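
A possible middle ground between the two positions is to flag likely
duplicates instead of deleting them, so the mapper can still compare
both geometries. The following is only a minimal sketch of that idea
in Python. It assumes coordinates already projected to metres, and the
function names and the 10 m tolerance are hypothetical, not part of
any existing tool or of the proposed workflow.

```python
import math

def point_segment_distance(p, a, b):
    """Distance from point p to the segment a-b (projected x, y in metres)."""
    px, py = p
    ax, ay = a
    bx, by = b
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0.0:
        # Degenerate segment: both ends are the same point.
        return math.hypot(px - ax, py - ay)
    # Projection parameter of p onto the segment, clamped to its ends.
    t = ((px - ax) * dx + (py - ay) * dy) / seg_len_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))

def distance_to_ways(p, ways):
    """Smallest distance from p to any segment of any way (inf if none)."""
    return min(
        (point_segment_distance(p, way[i], way[i + 1])
         for way in ways
         for i in range(len(way) - 1)),
        default=math.inf,
    )

def covered_fraction(candidate, osm_ways, tolerance_m):
    """Fraction of the candidate's vertices within tolerance_m of OSM ways."""
    near = sum(
        1 for p in candidate if distance_to_ways(p, osm_ways) <= tolerance_m
    )
    return near / len(candidate)

# Hypothetical example data: one existing OSM way and two candidates.
osm_ways = [[(0, 0), (100, 0)]]
likely_duplicate = [(0, 5), (50, 4), (100, 6)]
mostly_new = [(0, 5), (0, 200), (0, 400)]

print(covered_fraction(likely_duplicate, osm_ways, 10))
print(covered_fraction(mostly_new, osm_ways, 10))
```

A segment with a high fraction would be shown to the mapper as
"probably already in OSM" rather than removed. Checking only vertices
is a simplification: a long segment with few vertices could be
misjudged, so densifying the geometry first may be needed.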

There are certainly areas where most roads are already mapped in OSM,
especially around big towns and cities. But this dataset will grow the
total road distance in OSM from the approx. 55,000 km that we have now
inside Tanzania to around 115,000 km, so this means that around 52% of
the MSD&JSI road segments aren't mapped in OSM yet. That's the average.
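
One way to arrive at the 52% figure, assuming it is taken relative to
the expected total of 115,000 km (that reading is an assumption, the
variable names are only illustrative):

```python
osm_km_now = 55_000      # approx. road length in OSM inside Tanzania today
osm_km_after = 115_000   # approx. road length expected after the import

added_km = osm_km_after - osm_km_now   # road length that would be new to OSM
share_new = added_km / osm_km_after    # share of the expected total

print(f"{added_km} km added, {share_new:.0%} of the total")
```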

Still, deciding on the best approach is tricky, as there are pros and
cons. One thing that plays in our favour is that the import will be
done in several chunks (each one will go into a different Tasking
Manager project), so after working with the first one, we could
see/discuss in this same thread the things that we think we can
improve for the subsequent TM projects, based on the experience of the
first ones. Makes sense?

> Differences between OSM roads and this data are in most cases of
> the same order of magnitude as the satellite images' typical
> positional accuracy so imagery is not going to help much in
> deciding which data is better in that regard.

Yes. That's why we say in the workflow that MSD&JSI segments that are
of similar quality to the existing OSM ways will be ignored (not
imported). This needs, maybe, a more detailed explanation, but I am
always afraid of ending up writing a book instead of a workflow wiki ;)

>>> - if you advise the mappers to always run line simplifications
>>> on the ways first it would be better to do this on the whole
>>> data set in advance, this would avoid unnecessary work for the
>>> mappers.
>> Running the Simplify Way plugin for each task takes just
>> seconds, [...]
> Since the lines only touch at the ends it is easy to run a 
> simplification before converting to OSM format.

Yes. I already answered that I will check this again. I think it will
be possible to do this simplification beforehand. Just let me come
back to "normal" (next week), and I will keep you informed about this.
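
For reference, a simplification pass over the whole data set could
look roughly like the sketch below. It is a generic Douglas-Peucker
implementation in Python, not necessarily what the Simplify Way plugin
does internally. It assumes projected coordinates in metres, and the
function names and tolerance value are hypothetical. Because the first
and last point of each line are always kept, lines that only touch at
their ends stay connected when each one is simplified on its own.

```python
import math

def point_segment_distance(p, a, b):
    """Distance from point p to the segment a-b (projected x, y in metres)."""
    px, py = p
    ax, ay = a
    bx, by = b
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0.0:
        return math.hypot(px - ax, py - ay)
    t = ((px - ax) * dx + (py - ay) * dy) / seg_len_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))

def simplify(points, tolerance):
    """Douglas-Peucker simplification; the two end points are always kept."""
    if len(points) < 3:
        return list(points)
    first, last = points[0], points[-1]
    # Find the intermediate point farthest from the first-last chord.
    max_dist, index = 0.0, 0
    for i in range(1, len(points) - 1):
        d = point_segment_distance(points[i], first, last)
        if d > max_dist:
            max_dist, index = d, i
    if max_dist <= tolerance:
        # Every intermediate point is close enough to the chord: drop them.
        return [first, last]
    # Otherwise keep the farthest point and simplify both halves.
    left = simplify(points[: index + 1], tolerance)
    right = simplify(points[index:], tolerance)
    return left[:-1] + right

# Hypothetical GPS-like line with small jitter and one real bend.
track = [(0, 0), (10, 0.5), (20, -0.4), (30, 0.2), (40, 30), (50, 0)]
print(simplify(track, 1.0))
```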

>> The workflow is still unfinished, and I will add instructions
>> when no Bing nor Mapbox are present. In those cases, when no OSM
>> ways are present, we can safely import the MSD&JSI segments as
>> they are, bearing in mind that they have been collected with GPS
>> devices and, as you will see, they are most of the time of very
>> good quality.
> That would mean you generally consider cross checking with an 
> independent source not necessary.  The question is how are the
> various decisions you ask the mapper to make (tagging, deciding
> which data is more accurate in case the road is already in OSM) to
> be made then?

To keep the maximum level of consistency across users, I am trying my
best to make sure that all mappers follow the same guidelines, with
as few situations as possible where they have to make a decision on
their own. Regarding the roads in areas without hi-res imagery, I
already answered further up. My suggestion for the workflow in these
areas would be something like: "1) For segments not yet present in
OSM, we import them, but we will avoid adding the surface tag (we
could instruct the mappers to add a fixme="Please, add surface tag").
2) For segments already mapped in OSM, we won't import them (ignore

>> Sometimes the imagery let's see clearly that surface=asphalt. But
>> I agree that in case of doubt we should stick to that
>> paved/unpaved tagging.
> Reliably distinguishing asphalt from concrete is generally very
> hard. This might be feasible based on quantitative analysis of low
> noise infrared data in case of relatively clean surfaces but not
> from visual color jpegs from a web service with dust during dry
> season.

I think concrete is very rare in Africa. But, as surface=paved/unpaved
is an acceptable solution for all, let's simplify it to paved/unpaved.

>>> - Instead of 'source:date=2014..2015' use
>>> 'source:date=2014;2015'
>> I disagree. "2014..2015" means that the data has been collected
>> any time during 2014 and 2015. [...]
> That is not how source:date is generally used in OSM.  There are
> only eight cases of use of a '..' connector in source:date at the
> moment, all covering longer time spans with many in between values.
>  source:date=2014;2015 means data is from the years 2014 and 2015
> which seems exactly what is intended.

To my understanding, we use a ";" inside the value of a tag when we
need to add a collection of values to the key. For example:
contact:phone=+34 555555555;+34 777777777 means two contact phones for
an object. For source:date I base this tagging on the start_date wiki
[1]. As an example from that wiki, start_date=1914..1918 means the
object was constructed (started to exist) some time during WWI. And
about ";", you can find further down: "Avoid multiple entries, for
example 2010-08-01;2010-08-19 without providing at least an
explanation in a note". So, again, I think "2014;2015" is wrong for
each individual segment. But it might be correct for the changeset
tags, although unnecessary (I prefer "2014..2015" as with the segments).
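
To make the two readings concrete, here is a small Python sketch of
how a data consumer might tell them apart. The function name is
hypothetical and this is not an existing OSM library call. It simply
treats ".." as a range, following the start_date wiki [1], and ";" as
a list of separate values.

```python
def describe_date_value(value):
    """Classify a date-like tag value as a 'range', a 'list' or a 'single'."""
    if ".." in value:
        # "2014..2015": some time between the two dates.
        start, end = value.split("..", 1)
        return ("range", [start.strip(), end.strip()])
    if ";" in value:
        # "2014;2015": a collection of separate values.
        return ("list", [part.strip() for part in value.split(";")])
    return ("single", [value.strip()])

print(describe_date_value("2014..2015"))
print(describe_date_value("2014;2015"))
```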

Thank you very much again for the time and feedback, and have a nice


[1] http://wiki.openstreetmap.org/wiki/Key:start_date

-- 
Twitter: @ravilacoya <http://twitter.com/ravilacoya>
*Humanitarian OpenStreetMap Team (HOT)* <http://www.hotosm.org>