[Imports] Facebook's AI-Assisted Road Tracing for OSM

Mon Mar 27 19:51:45 UTC 2017

Thanks for the additional details, this clarifies quite a few things.

> How do you deal with clouds in imagery?
>
> Our ML algorithm has been trained on images with no cloud cover only.

That is what I feared.

This means the AI is unaware of the phenomenon of clouds in images.  If 
you run it on images with clouds the result is going to be 
unpredictable, since the training data gives it no reference for how 
to interpret them.  You'd either have to train it specifically to 
recognize cloudy areas as containing no roads, or you'd have to 
reliably exclude areas with clouds from your processing.
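
To illustrate the second option (excluding cloudy areas before 
processing), here is a minimal sketch in Python.  The thresholds, the 
tile layout and the function names are all assumptions made up for 
illustration, not anything known about the actual pipeline.  A real 
cloud mask would need something far more robust, since snow, sand and 
bright roofs would fool this simple test.

```python
def cloud_fraction(tile, brightness_min=200, spread_max=30):
    """Estimate the fraction of pixels in an RGB tile that look like cloud.

    A pixel counts as cloud-like if it is bright (all channels at or above
    brightness_min) and nearly grey (channel spread at or below spread_max).
    Both thresholds are hypothetical values chosen for illustration.
    """
    total = 0
    cloudy = 0
    for row in tile:
        for r, g, b in row:
            total += 1
            lo, hi = min(r, g, b), max(r, g, b)
            if lo >= brightness_min and hi - lo <= spread_max:
                cloudy += 1
    return cloudy / total if total else 0.0


def tiles_safe_to_process(tiles, max_cloud_fraction=0.01):
    """Return the ids of tiles whose estimated cloud fraction is within the limit."""
    return [
        tile_id
        for tile_id, tile in tiles.items()
        if cloud_fraction(tile) <= max_cloud_fraction
    ]
```

Tiles rejected by such a filter would simply not be passed to the road 
detection, so no roads would be generated from them at all.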

This might appear overly nitpicky considering that the overall fraction 
of clouds is very small, but since we do not have access to the images 
used we would have no way to specifically identify and correct 
cloud-based problems.

> What is the mapping experience of those people in OpenStreetMap?
>
> Our editors are indeed Facebook employees. While most of the editors
> are new to OSM, and therefore do not have personal accounts they were
> hired because of their strong backgrounds in mapping and are well
> versed with GIS tools. [...]

That is going to be a problem.  The road tagging system in OSM is a 
complex subject and you cannot really learn how to use it properly 
just from lessons in a classroom.  Since deciding on the tags to use 
for roads is apparently going to be at least partially the task of 
your editors, solid experience in road mapping in OSM would be 
paramount.

> At this stage we only have machine generated data (before human
> validation) at province scale. As you all know in any project you
> have to balance resources, cost, and more importantly employee time.
> Preparing human validated .osm files at country scale is both costly
> and time consuming.

We don't need that, we just need the machine generated data to verify 
the correct operation of the process and the quality of the results.

This is not a unique situation, we have had many imports in the past 
that relied on human work during the import, for example for 
verification or conflation.  Nonetheless we always look at the 
original data first.

> We are also using population distribution
> maps<https://info.internet.org/en/wp-content/uploads/sites/4/2016/07/population_density_final_mj2_ym_tt2113.pdf>
> created by another team here at Facebook to help us understand where
> people are. This
> data<https://code.facebook.com/posts/1676452492623525/connecting-the-world-with-better-maps>
> was released and has been used by humanitarian organizations to help
> with planning and decision making.

The question remains whether data other than DG imagery goes into the 
production of this population data.  The main reason for these 
questions is to look for possible legal implications, since apparently 
none of these data sets is published under an open data license.

For the rest of the questions discussed I am going to wait for the 
data - there is little use in theoretically discussing how well the 
algorithm works on tag selection or connecting roads.  Nonetheless, 
thanks for elaborating on these points.

-- 
Christoph Hormann
http://www.imagico.de/