[Imports] Importing Kerala, India road network from Facebook's ML generated data

Naveen Francis naveenpf at wikimedia.in
Fri Aug 31 18:28:27 UTC 2018

Hello Christoph,

Thank you for your feedback. We've considered your notes, and here are
our responses.

As Blake mentioned, I reached out to Facebook on behalf of the local
community via a public Tweet [1]. Even though India was not on their
mapping plan, it seems the team ran their models over the weekend to
produce road data for Kerala and released it quickly, with the caution
not to upload it directly until reviewed.

Positional accuracy

We're aware of this issue, but this is always a tradeoff when using
satellite imagery where we lack GPS traces. In the context of the
current disaster response, we believe the value of having these roads on
the map outweighs the concern. I’m sure the editors at Facebook
understand the complexity, since they completed mapping an entire
country.

Geometry data

We've discussed this with Facebook, and we're comfortable that the
conflation process they will be going through, combined with our
post-import validation, will be able to address these issues adequately
where they occur. We are also confident the editing team at Facebook
will deliver great quality.

As we made clear on the import wiki page, the raw data posted will go
through Facebook's conflation process, using their specialized tools,
before contribution to OSM, and will *not* go in in raw form. We are
grateful to them for offering their support in this process.

In the future, we hope to be able to do this step ourselves, once we
have Facebook's open source iD fork running locally, but unfortunately
we're not able to do this under present conditions. You can see the
video posted on their wiki -
types are always corrected manually during the import, and we also
discussed that the community would be able to verify this easily once
the roads are in.

“Overall my assessment of this is that the work required to bring the
data shown to a level of quality similar to good quality manual mapping
is probably similar - if not larger - than mapping the roads manually.”

Thank you for your assessment; however, in light of the support offered
by Facebook, we disagree. We believe it would take much more time and
effort to map all these roads manually, and we are grateful for the help
to speed up this work during a disaster.

“Making such approval contingent on training and hiring locals to
perform the work could be a useful approach on several levels”

We appreciate your concern, and we do hope to be able to perform the
initial import ourselves in the future. Facebook has already offered to
train us on their workflows with their open source iD version (thank
you!). We will also be performing a final validation pass following the
Facebook team's initial import. So we have no concerns about being
excluded from this workflow, and we appreciate Facebook's support.

In Kerala, National Highways and State Highways are already tagged and
linked to Wikidata [2][3]. We use OSM directly to display the highway
network in the English and Malayalam Wikipedias. We also plan to tag all
the Major District Roads with proper tags [4].

Lastly, the local OSM community in India is highly competent and has
been mapping for years. While we appreciate the free advice, we believe
we are able to see the needs of our community and make decisions
accordingly.

We look forward to more collaboration across the local communities, tech
companies and the humanitarian sector, since we all have the same goal
of improving the world map. :)

Thanks,

naveenpf

1. https://twitter.com/Drish_T/status/1031285152440180736
<https://wiki.openstreetmap.org/wiki/India:Roads/Kerala>
4. <https://wiki.openstreetmap.org/wiki/Major_District_Road_(Kerala)>

Naveen Francis

On Thu, Aug 30, 2018 at 4:36 PM, Blake Girardot <bgirardot at gmail.com> wrote:

> Hi Christoph,
> Thank you very much for the review of the proposed import data, it is
> much appreciated.
> My quick review did not see the same degree of issues, so your review
> is really helpful.
> I think based on your feedback, as you suggest, the local Kerala OSM
> folks can further review the data and workflow and figure out if and
> how the import can move forward.
> You have presented several things to investigate further, so let us
> allow the Kerala OSM folks to review them.
> I will just point out again, this process was started by the local
> Kerala OSM mappers and no one else, suggesting it happened in any
> other way is just not correct. I am really impressed that Facebook
> offered to do what they could to support them. That is the kind of
> collaboration I think we need more of, not less.
> I am also very impressed that the local Kerala OSM mappers, regardless
> of the circumstances, are following the import guidelines and are
> dedicated to making sure they do everything in collaboration with the
> larger OSM community. It may or may not work out this time, but I
> think they really show the way forward for this kind of import process
> in the future.
> Respectfully,
> blake
> On Thu, Aug 30, 2018 at 11:51 AM, Christoph Hormann <osm at imagico.de>
> wrote:
> > On Wednesday 29 August 2018, Blake Girardot wrote:
> >> Greetings,
> >>
> >> An import specific wiki page has been created for this import:
> >>
> >> https://wiki.openstreetmap.org/wiki/Kerala_Road_Import
> >>
> >
> > Ok, that looks much better already.  Based on that, here is my review.
> >
> > The data files are from August 21 and contain the conflation with OSM
> > data which raises the question i already hinted at - you will likely
> > have significant problems with editing conflicts.  This applies to both
> > hard conflicts you need to resolve and semantic incompatibilities - if
> > for example buildings have been added in the meantime which happen to
> > intersect some of the roads to be imported.
> >
> > Looking over the data it seems that the number of roads to be added is
> > relatively small compared to the number of roads that already exists in
> > most parts (which is a good thing obviously).  Where roads are added
> > and where not looks arbitrary in a lot of cases when looking at
> > satellite images - there is no visible reason in many cases why a
> > certain road is in the data and a nearby much more prominent road is
> > not.  This is not bad per se, just curious.
> >
> > The data contains a source=digitalglobe tag which should only be added
> > to the changesets and not the data.  The import=yes tag should
> > also be removed before upload.
> >
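
The tag cleanup described in the paragraph above can be sketched as follows. This is a minimal illustration only, assuming the data files are OSM XML and that the two tags appear as `<tag>` children of `<way>` elements; the function name `strip_import_tags` and the sample data are hypothetical and not part of the actual import workflow.

```python
import xml.etree.ElementTree as ET

# (key, value) pairs the review says should not be uploaded on the objects.
# source=digitalglobe would go on the changeset instead.
TAGS_TO_DROP = {("source", "digitalglobe"), ("import", "yes")}


def strip_import_tags(osm_xml: str) -> str:
    """Return OSM XML with the unwanted <tag> children removed from every way."""
    root = ET.fromstring(osm_xml)
    for way in root.iter("way"):
        # Copy the list first, because we remove children while looping.
        for tag in list(way.findall("tag")):
            if (tag.get("k"), tag.get("v")) in TAGS_TO_DROP:
                way.remove(tag)
    return ET.tostring(root, encoding="unicode")


# Hypothetical sample way, for illustration only.
SAMPLE = """<osm version="0.6">
  <way id="-1">
    <nd ref="-10"/><nd ref="-11"/>
    <tag k="highway" v="residential"/>
    <tag k="source" v="digitalglobe"/>
    <tag k="import" v="yes"/>
  </way>
</osm>"""

cleaned = strip_import_tags(SAMPLE)
print(cleaned)
```

Matching on both key and value keeps any other `source=*` values untouched, which seems the safer default when the data has already been conflated with existing OSM objects.
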
> > So far the general remarks, now regarding the data quality.  This is
> > based not on looking at all of the data but at a cross section -
> > covering both samples near the coast and in the mountains.
> >
> > Positional accuracy
> >
> > First the available images show a fairly large variance in alignment, in
> > particular in the mountain areas, and there is no indication that the
> > images the road geometries are generated from have better positional
> > accuracy than the others.  In fact i found several places where the image
> > matching the roads had visibly the largest off-nadir angle and
> > therefore likely the largest positional error.  Offsets between the
> > different images available are frequently about 20m, sometimes more
> > than 50m.  Existing road data has a similar level of accuracy, in a few
> > cases also more than 50m offset to the average alignment of the image
> > layers.
> >
> > Geometry data
> >
> > Regarding the geometries - i would estimate about 10-15 percent of the
> > road geometries are clearly faulty, the most common cases were:
> >
> > * nonsense geometries resulting from conflating roads with existing data
> > with significant relative offset.
> > * intersections between roads without nodes
> > * roads drawn where there is evidently no road
> >
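
One of the faults listed above, intersections between roads without nodes, can be checked mechanically. The sketch below is an assumption-laden illustration, not the validation used by anyone in this thread: it treats lon/lat as planar coordinates, represents each way as a list of hypothetical `(node_id, x, y)` tuples, and only reports segments that properly cross. A way ending on the middle of another way is not detected, and legitimate bridges or tunnels would need to be filtered out separately.

```python
from itertools import combinations


def _orient(a, b, c):
    """Sign of the cross product (b - a) x (c - a): 1, 0 or -1."""
    v = (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])
    return (v > 0) - (v < 0)


def _segments_cross(p1, p2, q1, q2):
    """True if the two segments cross at a point interior to both."""
    return (_orient(p1, p2, q1) * _orient(p1, p2, q2) < 0
            and _orient(q1, q2, p1) * _orient(q1, q2, p2) < 0)


def crossings_without_node(ways):
    """ways: dict of way_id -> list of (node_id, x, y).

    Returns the pairs of way ids that cross each other between nodes.
    A crossing interior to both segments cannot be at a node of either
    way, so it is by construction an intersection without a shared node.
    """
    found = []
    for (id_a, a), (id_b, b) in combinations(ways.items(), 2):
        segs_a = list(zip(a, a[1:]))
        segs_b = list(zip(b, b[1:]))
        if any(_segments_cross(s[0][1:], s[1][1:], t[0][1:], t[1][1:])
               for s in segs_a for t in segs_b):
            found.append((id_a, id_b))
    return found


# Hypothetical example data.
ways = {
    "A": [(1, 0.0, 0.0), (2, 2.0, 2.0)],
    "B": [(3, 0.0, 2.0), (4, 2.0, 0.0)],  # crosses A, no shared node
    "C": [(2, 2.0, 2.0), (5, 3.0, 2.0)],  # joins A at shared node 2
}
print(crossings_without_node(ways))
```

The pairwise loop is quadratic in the number of ways, which is fine for spot checks on a sample but would need a spatial index for a full data file.
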
> > I would estimate an additional 10 percent where without additional data
> > (ground level photos or local knowledge) you can't reliably verify if
> > there is a road (i.e. the geometry looks guessed and doubtful but you
> > can neither verify nor falsify it reliably).
> >
> > Tagging
> >
> > The tags assigned to the generated roads look wrong in the majority of
> > cases. Specifically, i would estimate:
> >
> > * unclassified: wrong in about 60 percent of cases (in particular where
> > the road has no connecting function)
> > * residential: evidently wrong in about 70 percent of cases (mostly
> > because no residential buildings near it or because it is clearly just
> > a service road)
> > * service: too few for an accurate estimate but mostly wrong (in
> > particular roads with a connecting function)
> > * path/footway: too few for an accurate estimate but most are likely
> > wide enough for cars, hence wrong
> > * track: too few as well but this might actually be correct in the
> > majority of cases
> >
> > Conclusions and recommendations:
> >
> > * there is no basis for the tags chosen - replacing them all with
> > highway=road would be a big improvement.
> > * running the import as is would create significant technical debt
> > because it would conflate data with different alignments all of unknown
> > accuracy.  Improving the overall accuracy later or just mapping other
> > stuff with better accuracy would require a lot of hand work
> > (essentially checking and correcting every road manually) which would
> > be much more work in total than importing the data.
> >
> > The second point is of course something you also have with manual
> > mapping to some extent but
> >
> > * you don't accumulate that much debt in such a short time.
> > * you have the possibility to significantly improve the accuracy by
> > aligning images locally based on ground reference data or by taking
> > into account other image sources.  With the errors mentioned above the
> > difference this can make is significant.
> >
> > Overall my assessment of this is that the work required to bring the
> > data shown to a level of quality similar to good quality manual mapping
> > is probably similar - if not larger - than mapping the roads manually.
> > For Facebook this is not so relevant because (a) they have made the
> > overall decision they want to take this approach independent of its
> > efficiency in the individual case and (b) they are doing a mixed
> > calculation that a large part of the required work is either done
> > through free labour from the community rather than paid work by their
> > staff or not at all.
> >
> > The Kerala community needs to contemplate and discuss if their goals in
> > the long term in mapping their region (read long term here as 5-10
> > years) are compatible with that approach and actually more work
> > efficient for them than mapping by local mappers (which can still be
> > supported by algorithm help). I don't know the answer to this question
> > but i have not seen a serious discussion of this question by the local
> > community either.
> >
> > -- end of review --
> >
> > As a general remark and recommendation to local communities approached
> > by international corporations for approving import or organized editing
> > plans:  Making such approval contingent on training and hiring locals
> > to perform the work could be a useful approach on several levels - both
> > to support the local economy and to ensure work is performed with
> > proper knowledge of the local geography as well as to support a
> > sustained growth of the local community.
> >
> > --
> > Christoph Hormann
> > http://www.imagico.de/
> >
> > _______________________________________________
> > Imports mailing list
> > Imports at openstreetmap.org
> > https://lists.openstreetmap.org/listinfo/imports
> --
> ----------------------------------------------------
> Blake Girardot
> OSM Wiki - https://wiki.openstreetmap.org/wiki/User:Bgirardot
> HOTOSM Member - https://hotosm.org/users/blake_girardot
> skype: jblakegirardot