[OSM-dev] OSM Project Idea for GSoC 17: Vandalism Detection in Map Edits

Ethan Nelson ethan.nelson.osm at outlook.com
Tue Dec 20 03:02:56 UTC 2016

Hi Animesh,

You can check out the vandalism page on the OSM Wiki, which gives a pretty good overview of OSM vandalism and the challenge of detecting it [0].

I think a binary classification won't be a straightforward basis for a wide-sweeping 'vandal detection' tool, because the problem of vandalism in OSM is multifaceted: you can change many objects in one changeset, and each object itself has multiple dimensions (there are the spatial dimensions--shape, detail, etc.--and then there are the data property dimensions). In addition, the line between poor-quality edits and vandalism is sometimes very thin, so what looks like vandalism may not be the result of malice but rather just of an uninformed editor.

Thus, for a binary classification, it would be useful to focus on one type of vandalism. Perhaps it could be detecting doodles (in which case you could search for geometry that isn't normally shaped: small angles, very high detail, and so on). Or it could be finding times when people are deleting a lot of data. I started a form that aims to collect "bad" edits in general [1], but I haven't really advertised it and thus don't have data that could help show which type of vandalism is most common.
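
To make the two ideas above a bit more concrete, here is a rough Python sketch, not taken from any of the linked projects. The function names, the 30 degree threshold, and the assumption that way coordinates are already projected to a planar (x, y) system are all mine, purely for illustration. Real thresholds would have to be tuned on labelled edits.

```python
import math

def interior_angles(points):
    """Interior angle (degrees) at each interior vertex of a polyline.

    points: list of (x, y) tuples, assumed to be in a planar projection.
    """
    angles = []
    for a, b, c in zip(points, points[1:], points[2:]):
        v1 = (a[0] - b[0], a[1] - b[1])
        v2 = (c[0] - b[0], c[1] - b[1])
        n1 = math.hypot(*v1)
        n2 = math.hypot(*v2)
        if n1 == 0 or n2 == 0:
            continue  # skip duplicate consecutive points
        cos_t = (v1[0] * v2[0] + v1[1] * v2[1]) / (n1 * n2)
        cos_t = max(-1.0, min(1.0, cos_t))  # guard against rounding error
        angles.append(math.degrees(math.acos(cos_t)))
    return angles

def sharp_angle_fraction(points, sharp_deg=30.0):
    """Fraction of interior vertices whose angle is below sharp_deg.

    sharp_deg is a hypothetical threshold, not an established value.
    """
    angles = interior_angles(points)
    if not angles:
        return 0.0
    return sum(1 for a in angles if a < sharp_deg) / len(angles)

def deletion_fraction(created, modified, deleted):
    """Share of a changeset's object changes that are deletions."""
    total = created + modified + deleted
    return deleted / total if total else 0.0

# A straight road has no sharp angles; a tight zigzag is all sharp angles.
straight = [(0, 0), (1, 0), (2, 0), (3, 0)]
zigzag = [(0, 0), (1, 10), (2, 0), (3, 10), (4, 0)]
```

A way with a high sharp-angle fraction, or a changeset with a high deletion fraction, would only be a candidate for human review, since legitimate edits (detailed coastlines, cleanup of duplicate imports) can score high too.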

You may also check out some of the projects that have implemented parts of the algorithms listed on the wiki page for further inspiration [2,3,4].


Ethan aka FTA

[0]: http://wiki.openstreetmap.org/wiki/Vandalism

[1]: https://docs.google.com/forms/d/e/1FAIpQLSf4bVukO5OUXviSujW1gUtM1NTroTz3lPsXy7EcKxIp8ZzX5g/viewform

[2]: http://www.mdpi.com/2220-9964/1/3/315

[3]: https://github.com/willemarcel/osmcha-django

[4]: https://github.com/ethan-nelson/osm_hall_monitor

