[Imports] Importing Kerala, India road network from Facebook's ML generated data

Blake Girardot bgirardot at gmail.com
Thu Aug 30 11:06:59 UTC 2018


Hi Christoph,

Thank you very much for the review of the proposed import data, it is
much appreciated.

My quick review did not see the same degree of issues, so your review
is really helpful.

I think based on your feedback, as you suggest, the local Kerala OSM
folks can further review the data and workflow and figure out if and
how the import can move forward.

You have presented several things to investigate further, so let us
allow the Kerala OSM folks to review them.

I will just point out again, this process was started by the local
Kerala OSM mappers and no one else; suggesting it happened in any
other way is just not correct. I am really impressed that Facebook
offered to do what they could to support them. That is the kind of
collaboration I think we need more of, not less.

I am also very impressed that the local Kerala OSM mappers, regardless
of the circumstances, are following the import guidelines and are
dedicated to making sure they do everything in collaboration with the
larger OSM community. It may or may not work out this time, but I
think they really show the way forward for this kind of import process
in the future.

Respectfully,
blake

On Thu, Aug 30, 2018 at 11:51 AM, Christoph Hormann <osm at imagico.de> wrote:
> On Wednesday 29 August 2018, Blake Girardot wrote:
>> Greetings,
>>
>> An import specific wiki page has been created for this import:
>>
>> https://wiki.openstreetmap.org/wiki/Kerala_Road_Import
>>
>
> Ok, that looks much better already.  Based on that, here is my review.
>
> The data files are from August 21 and contain the conflation with OSM
> data which raises the question i already hinted at - you will likely
> have significant problems with editing conflicts.  This applies to both
> hard conflicts you need to resolve and semantic incompatibilities - if
> for example buildings have been added in the meantime which happen to
> intersect some of the roads to be imported.
>
> Looking over the data it seems that the number of roads to be added is
> relatively small compared to the number of roads that already exist in
> most parts (which is a good thing obviously).  Where roads are added
> and where not looks arbitrary in a lot of cases when looking at
> satellite images - there is no visible reason in many cases why a
> certain road is in the data and a nearby much more prominent road is
> not.  This is not bad per se, just curious.
>
> The data contains a source=digitalglobe tag which should only be added
> to the changesets and not the data.  The import=yes tag should
> also be removed before upload.
>
> So far the general remarks, now regarding the data quality.  This is
> based not on looking at all of the data but at a cross section -
> covering both samples near the coast and in the mountains.
>
> Positional accuracy
>
> First the available images show a fairly large variance in alignment, in
> particular in the mountain areas, and there is no indication that the
> images the road geometries are generated from have better positional
> accuracy than the others.  In fact i found several places where the image
> matching the roads had visibly the largest off-nadir angle and
> therefore likely the largest positional error.  Offsets between the
> different images available are frequently about 20m, sometimes more
> than 50m.  Existing road data has a similar level of accuracy, in a few
> cases also more than 50m offset to the average alignment of the image
> layers.
>
> Geometry data
>
> Regarding the geometries - i would estimate about 10-15 percent of the
> road geometries are clearly faulty, the most common cases were:
>
> * nonsense geometries resulting from conflating roads with existing data
> with significant relative offset.
> * intersections between roads without nodes
> * roads drawn where there is evidently no road
>
> I would estimate an additional 10 percent where without additional data
> (ground level photos or local knowledge) you can't reliably verify if
> there is a road (i.e. the geometry looks guessed and doubtful but you
> can neither verify nor falsify it reliably).
>
> Tagging
>
> The tags assigned to the generated roads look wrong in the majority of
> cases.  Specifically, i would estimate:
>
> * unclassified: wrong in about 60 percent of cases (in particular where
> the road has no connecting function)
> * residential: evidently wrong in about 70 percent of cases (mostly
> because there are no residential buildings near it or because it is
> clearly just a service road)
> * service: too few for an accurate estimate but mostly wrong (in
> particular roads with a connecting function)
> * path/footway: too few for an accurate estimate but most are likely
> wide enough for cars, hence wrong
> * track: too few as well but this might actually be correct in the
> majority of cases
>
> Conclusions and recommendations:
>
> * there is no basis for the tags chosen - replacing them all with
> highway=road would be a big improvement.
> * running the import as is would create significant technical debt
> because it would conflate data with different alignments all of unknown
> accuracy.  Improving the overall accuracy later or just mapping other
> stuff with better accuracy would require a lot of hand work
> (essentially checking and correcting every road manually) which would
> be much more work in total than importing the data.
>
> The second point is of course something you also have with manual
> mapping to some extent but
>
> * you don't accumulate that much debt in such a short time.
> * you have the possibility to significantly improve the accuracy by
> aligning images locally based on ground reference data or by taking
> into account other image sources.  With the errors mentioned above the
> difference this can make is significant.
>
> Overall my assessment of this is that the work required to bring the
> data shown to a level of quality similar to good quality manual mapping
> is probably similar to - if not larger than - mapping the roads manually.
> For Facebook this is not so relevant because (a) they have made the
> overall decision they want to take this approach independent of its
> efficiency in the individual case and (b) they are doing a mixed
> calculation that a large part of the required work is either done
> through free labour from the community rather than paid work by their
> staff or not at all.
>
> The Kerala community needs to contemplate and discuss if their goals in
> the long term in mapping their region (read long term here as 5-10
> years) are compatible with that approach and actually more work
> efficient for them than mapping by local mappers (which can still be
> supported by algorithm help). I don't know the answer to this question
> but i have not seen a serious discussion of this question by the local
> community either.
>
> -- end of review --
>
> As a general remark and recommendation to local communities approached
> by international corporations for approving import or organized editing
> plans:  Making such approval contingent on training and hiring locals
> to perform the work could be a useful approach on several levels - both
> to support the local economy and to ensure work is performed with
> proper knowledge of the local geography as well as to support a
> sustained growth of the local community.
>
> --
> Christoph Hormann
> http://www.imagico.de/
>
> _______________________________________________
> Imports mailing list
> Imports at openstreetmap.org
> https://lists.openstreetmap.org/listinfo/imports



-- 
----------------------------------------------------
Blake Girardot
OSM Wiki - https://wiki.openstreetmap.org/wiki/User:Bgirardot
HOTOSM Member - https://hotosm.org/users/blake_girardot
skype: jblakegirardot


