[Imports] Importing Kerala, India road network from Facebook's ML generated data
Blake Girardot
bgirardot at gmail.com
Wed Aug 29 16:19:18 UTC 2018
Hi Frederik,
My comments are in line below.
But many of your comments are off topic for this thread. Please stay on
the topic of this import so we can get it squared away.
I would suggest creating a new thread about Machine Learning data to
address your general items, most of which I agree with.
If you have comments or questions specific to this import, please
share them without a lot of off topic points mixed in.
On Wed, Aug 29, 2018 at 5:42 PM, Frederik Ramm <frederik at remote.org> wrote:
> Hi,
>
> I would like to use this as an opportunity to offer a few general remarks.
>
> It is clear that OSM is no place to import "machine learning" results
> without thorough checks by people familiar with the situation on the
> ground. I think that, at least for the moment, everyone agrees with that
> statement.
Everyone I know agrees, and that is what is happening here.
>
> What we're seeing, though, is the "white-washing" (for lack of a more
> culturally neutral term... could I say the "community-washing"?) of
> machine learning or other low quality data, in various ways:
No evidence to support this claim, here or anywhere as far as I know.
This is a community-led project: some of the main folks in the Kerala,
India OSM community reached out and asked FB for this, and FB said
"sure and we will be glad to help you do the conflation since we have
experience and tools for that working with this data."
>
> * people suggest importing low-quality machine-learning data and promise
> a manual review and improvements during the import process, but the
> reality later is that a focus was put on quantity not quality, and
> individuals "review" tens of thousands of objects a day - this is not
> what most people would understand to be a review.
Another unsupported claim. Off topic. This does not apply to this
import; we are not talking about low quality data.
>
> * people create derived works from the machine-learning data, e.g.
> aggregate low quality building traces into "residential areas" and then
> import them, again with quality control on the level of spot checks
Off topic; this does not apply here. We are talking about roads, nothing else.
>
> This often happens because there is no good collaboration platform for
> fixing errors before the data goes into OSM; the hope is that *after*
> the data has gone into OSM, "the community" - here used as a nebulous
> term that often means "other people not us" - can and will fix things.
>
Not applicable to this import. The folks involved have been mapping in
Kerala for years; it is their data and they care for it.
> This is a trend that we should be wary about. Looking at the GitHub
> issue, I do find a few "hopeful" statements there that would raise an
> alarm for me: "Floating roads - easy to fix" (yes but who does it?), "we
> usually take care of all those mentioned fixes in our editing process"
> (from Drishtie - I am unsure what "usually" means and how much time
> Facebook have agreed to spend on this?), "Yes we can go. May be some
> problems will be fixed later", and so on.
This is addressed in the import plan: the Kerala OSM mappers will use
the Tasking Manager to review the data and fix it where needed.
> As always, the concrete discussion is made difficult by an existing
> disaster condition, where anyone who says "wait a minute" feels the
> pressure of standing in the way of humanitarians saving lives. My
> respect to Christoph for taking a principled stand here and separating
> the aspects of "due process" and "humanitarian situation".
Unsupported claim.
No one is rushing this due to the current situation in Kerala. I see
folks learning how to do an import and following the guidelines, no
one is saying skip anything because of the current situation.
Stop using this as a strawman. I know one person who says things like
this, but they are not active in this import or HOT's activities in
general and I 100% disagree with them. Crisis is never an excuse to
ignore OSM guidelines or generally accepted practices.
> I'm loath to oppose the concrete project, but I think we really have to
> be more strict here if we do not want to become a rubbish dump in the
> long run. We can't call for Facebook's machine-learning output to be
> imported every time there's a natural disaster somewhere.
No idea what you are saying here.
But no one is "dumping rubbish" into OSM with this import.
Also, if there is a good process in place for generating roads,
importing and conflating them, I am not sure why you say there cannot
be more of this in the future.
>
> Every import should have a post-import review, where after 6 months or
> one year, we actually analyze what has happened:
>
> * how much has been imported (compared to what was planned),
> * have the quality checks/controls that were promised/hoped for during
> the planning actually materialised, or has problematic data been waved
> through with just spot checks?
> * has the data been healthily assimilated by a local community working
> on OSM, or does it just sit there and rot away?
>
> Such an "import health check" could then lead to concrete projects to
> improve it if deemed problematic, or in drastic cases, a decision could
> be made to remove an import again if it is found out that it was not
> helpful.
Totally support this, but off topic to this import.
> Actually, this does not only apply to imports but also to concerted
> mapping efforts - quote from a post that Pierre Beland made on osm-talk
> just yesterday: "The number of contributors is limited in Africa and the
> risk is that errors created by mapathons while participating to Crisis
> responses stay as is for years."
Off topic, this is not Africa and the local community is involved and active.
>
> Bye
> Frederik
Bye
--
----------------------------------------------------
Blake Girardot
OSM Wiki - https://wiki.openstreetmap.org/wiki/User:Bgirardot
HOTOSM Member - https://hotosm.org/users/blake_girardot
skype: jblakegirardot