[OSM-dev] GSoC - Anomaly Detection Engine
Paul Norman
penorman at mac.com
Sat May 12 05:14:02 BST 2012
> From: Frederik Ramm [mailto:frederik at remote.org]
> Subject: Re: [OSM-dev] GSoC - Anomaly Detection Engine
>
> Adam,
>
> On 05/11/2012 04:52 PM, Adam Velkei wrote:
> > Hello, my name is Adam Istvan Velkei, I'm a student from Hungary and I
> > got the chance to work on an engine to detect vandalism and other
> > kinds of unwanted map edits.
>
> Such an engine could really be useful.
>
> You'll surely have opportunities to discuss the details with your mentor
> but one thing that I find most important is: What happens after your
> project is done.
>
> If your work is to be useful in the long-term, then whatever you write
> must be done in a way that enables other project members to continue to
> work on the software and to maintain it; to tune it for new use cases or
> new types of vandalism and so on.
>
> This is a slight conflict of goals with what GSoC usually is about - you
> are expected to make a proper plan and to carry it out and to wrap it up
> and have a "finished" product at the end.
>
> For OSM and for the long-term usability of your code, having something
> "finished" is good, but it is even more important to have something
> "manageable". It must be done in a non-exotic programming language, it
> must be documented well, it must be easily approachable for Tinkerers.
>
> If you do some sort of giant black box with a fantastic and most elegant
> neural network implementation that after so-and-so many rounds of
> machine learning has an 87% probability to find out whether something is
> vandalism or not then that might, while academically very interesting,
> be less useful than a solid set of rather dumb scripts that everyone can
> easily submit patches for!
https://github.com/pnorman/osm-weirdness/blob/master/detect_osm_weirdness.py#L144
That is a set of dumb scripts, with some bugs.
They detect imports and mechanical edits with some success. They don't
detect small-scale vandalism. A more advanced set of detection algorithms
might be able to detect small-scale vandalism, but I do not feel it is
possible with a set of dumb scripts.
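
As a rough illustration of what a "dumb" check of this kind can look like,
here is a minimal sketch. It is not code from detect_osm_weirdness.py: the
function name flag_changeset, the (element_id, tag_key, tag_value) tuple
layout and all thresholds are assumptions made up for the example.

```python
from collections import Counter

# Hypothetical thresholds, chosen only for illustration.
MAX_EDITS_PER_CHANGESET = 1000
MIN_EDITS_FOR_SHARE_CHECK = 50
MAX_SAME_TAG_SHARE = 0.9


def flag_changeset(edits):
    """Return reasons why a changeset looks like an import or mechanical edit.

    `edits` is a list of (element_id, tag_key, tag_value) tuples. This
    layout is an assumption for the sketch, not the OSM API format.
    """
    reasons = []

    # Heuristic 1: very large changesets are often imports.
    if len(edits) > MAX_EDITS_PER_CHANGESET:
        reasons.append("large changeset")

    # Heuristic 2: one tag applied to nearly every element suggests a
    # mechanical (scripted) edit. Skip tiny changesets to avoid noise.
    if len(edits) >= MIN_EDITS_FOR_SHARE_CHECK:
        tag_counts = Counter((key, value) for _, key, value in edits)
        _, top_count = tag_counts.most_common(1)[0]
        if top_count / len(edits) > MAX_SAME_TAG_SHARE:
            reasons.append("repetitive tagging")

    return reasons


if __name__ == "__main__":
    bulk = [(i, "highway", "residential") for i in range(1500)]
    print(flag_changeset(bulk))
```

A check like this only looks at volume and repetition, so a single moved or
retagged object passes straight through it, which is the small-scale case
described above.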