[OSM-talk] OSM SPAM detector

James james2432 at gmail.com
Mon Mar 5 14:55:09 UTC 2018


In most but not all cases, undiscussed imports get reverted, and when they
later get the go-ahead they would still be marked as spam. That is a very bad
way to build the training dataset compared with ground-truthed spam
identification.

On Mar 5, 2018 9:50 AM, "Michał Brzozowski" <www.haxor at gmail.com> wrote:

Could we use something similar to detect generic vandalism by training on
reverted changesets? Many of them have "this changeset was reverted fully
or in part..." comments. Also, analyzing object history or detecting
created_by=reverter;JOSM * would give you more examples to train on.

* Unfortunately this persists for the whole JOSM session, so there will be
some false positives.
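
A minimal sketch of how those two signals could be turned into weak training
labels. The dictionary layout ("tags", "comments") and the function names are
assumptions made for illustration, not the OSM API's actual response format,
and the labels would still need manual checking given the false positives
mentioned above.

```python
# Hypothetical input shape: {"id": int, "tags": {...}, "comments": [str, ...]}
REVERT_PHRASE = "this changeset was reverted fully or in part"


def has_revert_comment(comments):
    """True if any discussion comment contains the usual revert notice."""
    return any(REVERT_PHRASE in text.lower() for text in comments)


def made_with_reverter(tags):
    """True if created_by lists the reverter plugin, e.g. 'reverter;JOSM ...'."""
    created_by = tags.get("created_by", "")
    parts = [part.strip().lower() for part in created_by.split(";")]
    return any(part.startswith("reverter") for part in parts)


def weak_label(changeset):
    """Return a rough label for one changeset.

    'reverted'         : someone left the revert notice on this changeset.
    'revert_candidate' : this changeset was uploaded with the reverter plugin
                         loaded, so it may itself be a revert (or just a later
                         edit in the same JOSM session).
    'unlabelled'       : neither signal is present.
    """
    if has_revert_comment(changeset.get("comments", [])):
        return "reverted"
    if made_with_reverter(changeset.get("tags", {})):
        return "revert_candidate"
    return "unlabelled"
```

Changesets labelled 'revert_candidate' only point at where reverted edits
might be found; the reverted changesets themselves would still have to be
looked up from the object history.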

Michał

On 5 Mar 2018 15:09, "Jason Remillard" <remillard.jason at gmail.com> wrote:

> Hi,
>
> This weekend I put together a SPAM detector for OSM changesets.
>
> https://github.com/jremillard/osm-changeset-classification
>
> You don't need to be a developer to contribute: send over any spammy
> changesets you come across via a GitHub issue, a pull request, or even an
> email to me. I just need the changeset ID.
>
> The code is currently hitting 99+% accuracy detecting the difference
> between 1500 random normal edits and 1500 sketchy changesets that Fredrick
> shared with talk-us last week. This is with zero tuning, so it
> looks like it will work well.
>
> Jason
>
> _______________________________________________
> talk mailing list
> talk at openstreetmap.org
> https://lists.openstreetmap.org/listinfo/talk
>
>