[Talk-us] NYC Name Vandalism

Simon Poole simon at poole.ch
Thu Sep 6 09:22:17 UTC 2018


OSM changesets, and even the diffs, only contain the changes to objects.
In the simple case (somebody vandalising the name tag on NYC) that may
be enough to determine that something is "bad", but detecting, for
example, a messed-up motorway exit will require you to actually apply
(in one way or another) the changes to the existing data, or even the
changes from multiple changesets. Then naturally you will have edits
that undo previous bad edits, which need to be handled in some way too
(this creates a potential for conflicts, as the fix may come a
substantial amount of time later).
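
To illustrate the simple case only, here is a rough sketch (not an
existing tool) that scans an osmChange document for a changed name tag
on a watched object. The watch list, the object ids and the sample data
are invented for the example; only the osmChange element layout is taken
from the real format.

```python
import xml.etree.ElementTree as ET

# Hypothetical watch list: (element type, id) -> name we expect to see.
# The id is a placeholder, not the real NYC object.
WATCHED_NAMES = {("node", "1001"): "New York"}


def suspicious_name_changes(osc_xml, watched):
    """Return (type, id, new_name) for watched objects whose name differs."""
    root = ET.fromstring(osc_xml)
    hits = []
    for action in root:  # <create>, <modify> or <delete> blocks
        if action.tag != "modify":
            continue
        for elem in action:
            key = (elem.tag, elem.get("id"))
            if key not in watched:
                continue
            # A modified element carries its complete new tag set.
            tags = {t.get("k"): t.get("v") for t in elem.findall("tag")}
            if tags.get("name") != watched[key]:
                hits.append((key[0], key[1], tags.get("name")))
    return hits


# Invented sample diff: one renamed node and one modified motorway link.
SAMPLE = """<osmChange version="0.6">
  <modify>
    <node id="1001" version="8" lat="40.7" lon="-74.0">
      <tag k="name" v="Some Vandalised Name"/>
    </node>
    <way id="2002" version="3">
      <nd ref="1"/><nd ref="2"/>
      <tag k="highway" v="motorway_link"/>
    </way>
  </modify>
</osmChange>"""

print(suspicious_name_changes(SAMPLE, WATCHED_NAMES))
```

The renamed node is caught from the diff alone. The modified way only
lists node references, so judging whether the exit is still connected
would need the rest of the data, which is the harder case described
above.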

I suspect some of the above is why Mapbox uses a different granularity
for grouping changes in their review process (see the SotM talk by Lukas).

In any case all of the above potentially leads to your private fork of
the planet getting out of sync real fast with the original, implying
that applying diffs will become more problematic over time.  So you
wouldn't be able to take your fixed and known good planet fork, apply
only good diffs, and expect to be able to continue to do that for a year
or so.

IMHO the only thing that could really work in the OSM model is reverting
real fast in the -original- dataset.

Naturally there is the other aspect that we want our contributors to
gain experience and become better mappers over time. You are only going
to get that if you leave the opportunity to make mistakes open and don't
robo-fix everything that goes wrong.

Simon


On 06.09.2018 at 02:56, Alan Brown wrote:
> Hi,
>
> Perhaps I didn't express it clearly, but my interest was in the idea
> that certain, rather limited changelists could be flagged for
> moderation before they are put into the main dataset.  There will always
> be things that seem like they should be blocked, but are actually
> appropriate.   In the interest of having the most accurate data, I'm
> not convinced this form of moderation can't have a role.  As I
> understand it, the virtue of OSM is to allow anyone to contribute
> accurate, detailed local knowledge about the places they know about;
> however, there's no value in having junk in the database for even a
> moment, if it can be avoided.  Place names are usually verifiable
> facts, even disputed place names.  So you don't want the open nature
> of OSM to compromise accuracy, or a quest for accuracy to discourage
> people from contributing accurate information.
>
> I've said my piece; I suspect the OSM community is not culturally
> disposed to that form of moderation. So I will ask about a different
> approach.
>
> In my case, I've seen editing errors that affected motorway
> connectivity (not vandalism), that were made and corrected within a
> couple hours.  Pretty good - except our planet file was in that two
> hour window.  I want to avoid these errors, without getting caught in
> the errors of the next two hour window.
>
> I'm not sure if Mapbox or others use a process like this, but this is
> what I can imagine:
>
> PLANETcur is the current planet file
> PLANETprev is the last used planet file
> CHANGEcur-prev is a comprehensive list of changelists between the two
> datasets
>
> A particular consumer of OSM data can automatically scan
> CHANGEcur-prev and/or PLANETcur for potentially troubling content,
> according to their own criteria.  In their local copy, if they detect
> something they do not want to accept - offensive place names,
> incomplete topology - they can attempt to revert - in their local copy
> only! - recent changes that violate their criteria.  They accept
> whatever mistakes their "reversion" algorithm makes.  The identified
> "questionable changelists"  can be submitted back to the OSM community
> to review and revert, but always by a human.
>
> My hope is that I am being completely unoriginal, and I can cobble
> together existing tools quickly. How unoriginal am I?
>
> I am looking over the osmcha.mapbox.com page, and saw reference to a
> utility called "osm-compare":  
> https://github.com/mapbox/osm-compare/blob/master/comparators/README.md
> - which has an obscenity filter.  If I understand this correctly,
> osm-compare flags changelists for review, and osmcha.mapbox.com allows
> people to review the flagged datasets and reverse bad edits.  Could
> someone define osm-compare filters that produce results that can be
> automatically pulled into a local copy?
>
> (If a changeset has been reviewed by a second person, can that
> information be provided?)
>
> All I want is something that allows me to be a little bit more
> conservative in accepting edits, without requiring complex processes
> or large resources.  A little insight would be appreciated.
>
> Thanks,
> Alan
>
> On Wednesday, September 5, 2018, 7:52:46 AM PDT, Simon Poole
> <simon at poole.ch> wrote:
>
>
> osmcha (osmcha.mapbox.com) already does most of this. While detecting
> vandalism in general is difficult, edits like those in question are easy
> to detect and small in number.
>
> IMHO it really isn't an issue with OpenStreetMap in this case, as even
> with the delay (somebody reported the user in question instead of
> reverting and then reporting) in the specific case the vandalism was
> swiftly removed. The reason that this is being discussed at all is
> because of the edit resurfacing with a third party and having to be a)
> detected, b) reported, and c) fixed again. Yes, we know this was a
> glitch in the third party's workflow, but such glitches are bound to happen and
> we shouldn't pretend given the large number of edits that any procedures
> put in place are going to be 100% effective, be it directly with OSM or
> by third parties. 
>
> Simon
>
>
> On 05.09.2018 at 16:23, Greg Troxel wrote:
> > I tend to agree that automated systems are going to be not that useful.
> >
> > I tend to notice some things in my area, but it's hard to keep track.
> >
> > This makes me wonder about a tool that
> >
> >  - lets people sign up to watch edits, in some area, or in general,
> >    sort of like maproulette.  Use some scoring system where new
> >    mappers' edits are more likely to be looked at by somebody, and
> >    people who claim an area as theirs are more likely to get shown
> >    edits there, or maybe let people get all edits in some bbox
> >
> >  - lets people give a rating to a changeset, something like:
> >        i) high priority for inspection by others
> >        ii) worthy of being checked by a local
> >        iii) probably ok
> >        iv) definitely ok
> >
> >  - presents things to multiple people
> >
> >  - somehow uses a rater's own edit history to validate this (perhaps be
> >    cautious about people with < 500 changesets, and very cautious < 50)
> >
> >
> > This is a slippery slope to a reputation system, but I think in terms of
> > culture, the fact that anybody can review is there already, and the
> > bright line is needing permission to change things, vs a more efficient
> > way of others looking over changes.
> >
> >
> > Unfortunately my editor crashed and I lost the source code :-)
> >
> >
> > _______________________________________________
> > Talk-us mailing list
> > Talk-us at openstreetmap.org <mailto:Talk-us at openstreetmap.org>
> > https://lists.openstreetmap.org/listinfo/talk-us
>
