[Talk-us] SEO Damage to OSM

Denis Carriere carriere.denis at gmail.com
Wed Jul 5 21:12:07 UTC 2017


Great research Frederik!

*~~~~~~*
*Denis Carriere*

On Wed, Jul 5, 2017 at 5:05 PM, Frederik Ramm <frederik at remote.org> wrote:

> Hi,
>
> > These spam changes do not need that complexity to detect.
>
> I've done some numbers, maybe it helps.
>
> I counted all users that only ever committed one changeset with one edit
> inside. This number is 140352.
>
> Then I discarded those where the changeset comment was shorter than 50
> characters or where the content had been redacted a long time ago, leaving
> me with 12173.
>
> Then I looked at the objects modified/created, and discarded all where
> the object had neither website, nor description, nor note tag. This left
> me with 3323 objects.
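
A minimal sketch of the three filtering steps described above, in Python. The SingleEdit record and its field names are hypothetical, chosen only for illustration; they are not the schema Frederik used, and the input is assumed to already be limited to users with exactly one changeset containing one edit.

```python
from dataclasses import dataclass, field

# Hypothetical record for "a user's only edit"; the field names are
# assumptions for this sketch, not a real OSM dump or API schema.
@dataclass
class SingleEdit:
    user: str
    changeset: int
    comment: str
    redacted: bool
    tags: dict = field(default_factory=dict)

# Thresholds and keys taken from the description in the mail.
MIN_COMMENT_LENGTH = 50
INTERESTING_KEYS = {"website", "description", "note"}

def is_candidate(edit: SingleEdit) -> bool:
    """Keep edits with a long comment, not redacted, and at least one
    of the website/description/note tags on the object."""
    if len(edit.comment) < MIN_COMMENT_LENGTH or edit.redacted:
        return False
    return any(key in edit.tags for key in INTERESTING_KEYS)

def filter_candidates(edits):
    """Apply the filter to an iterable of SingleEdit records."""
    return [e for e in edits if is_candidate(e)]
```

The result is only a list of candidates for review, not a list of spam.
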
>
> Then I looked at the list and found a broad range of edits. Some, while
> having an advertising slant, seem a legit addition of someone's own
> business:
>
> user=Martin Merkur
> changeset=38362589
> comment=Our doors are always open.  Come and visit, taste our coffee,
> see what we do
> object=node 4103514010
> addr:city=Berlin;addr:housenumber=38;addr:postcode=12435;
> addr:street=Elsenstraße;amenity=cafe;cuisine=coffee_shop;
> internet_access=no;name=passenger coffee;
> note=https://www.facebook.com/PassengerEspresso/;
> opening_hours=7:30-15:00 Uhr;smoking=outside;website=passenger-coffee.de
>
> or
>
> user=otheryan
> changeset=13150739
> comment=Added in West Town Bikes as it is at the same address and has
> enough of its own activity that it needs to be recognized on the map.
> object=node 1585399965
> addr:housenumber=2459;addr:postcode=60622;addr:street=W Division;
> name=Ciclo Urbano/West Town Bikes;shop=bicycle;
> website=http://ciclourbanochicago.com/
>
> some look more SEO-y
>
> user=northcarolinahealth
> changeset=43324244
> comment=Updated Osborne Insurance Services at Raleigh, NC
> object=node 4474950186
> addr:city=Raleigh;addr:housenumber=5316;addr:postcode=27609;addr:state=NC;
> addr:street=Six Forks Road;hours=Mon-Fri :8.00AM-6.00PM;
> name=Osborne Insurance Services;phone=919-845-9955;suite=110;
> website=http://northcarolinahealth.org
>
> or
>
> user=blakemanhart
> changeset=43027180
> comment=Updated State Farm - Blake Manhart at Springfield, VA
> object=node 4456153164
> addr:city=Springfield;addr:housenumber=8322;addr:postcode=22152;
> addr:state=VA;addr:street=Traford Ln #B;name=State Farm - Blake Manhart;
> Owner=Blake Manhart;phone=703-992-9664;website=http://blakemanhart.com
>
> I had a look at trying to automatically match website and user name; 457
> of them actually contain the user name in the website, but that is too
> coarse a check. I fear that it might be necessary to look through the
> rest manually to detect the dodgy ones.
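
One way such a user-name/website match could look, as a rough Python sketch. The normalisation (lower-casing and dropping everything but letters and digits from the user name) is an assumption for illustration; the mail does not say how the actual match was done.

```python
def username_in_website(user: str, website: str) -> bool:
    """Return True if the normalised user name occurs in the website value.

    Assumed normalisation: lower-case the user name and keep only
    letters and digits, so "Martin Merkur" becomes "martinmerkur".
    """
    needle = "".join(ch for ch in user.lower() if ch.isalnum())
    # An empty needle would match every website, so reject it explicitly.
    return bool(needle) and needle in website.lower()


# The four examples quoted in the mail:
examples = [
    ("Martin Merkur", "passenger-coffee.de"),
    ("otheryan", "http://ciclourbanochicago.com/"),
    ("northcarolinahealth", "http://northcarolinahealth.org"),
    ("blakemanhart", "http://blakemanhart.com"),
]
matches = [user for user, site in examples if username_in_website(user, site)]
```

On these four examples only the two SEO-looking ones match. As noted above, that is too coarse a check on its own: a shop owner who registers under the shop's name would match as well.
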
>
> Of the 3323, 208 have a highway tag. But here it bites me that I took
> everything that had either note or description or website, because some
> of the edits with highway=* are legit and have a description/note where
> the newbie mapper explained what they did. 170 of the 208 do have a
> website tag, and finally, they *all* seem dodgy. (Interestingly it was
> not all ways - some highway=traffic_signals too!)
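
The highway-plus-website check described above could be sketched like this in Python, working on tag dumps in the key=value;key=value form used in this mail. Both function names are hypothetical, and the parser assumes that no tag value contains a semicolon, which does not hold for OSM data in general.

```python
def parse_tags(dump: str) -> dict:
    """Parse a 'k=v;k=v' tag dump into a dict.

    Assumption: values contain no ';'. Only the first '=' of each pair
    separates key and value, so URLs with '=' in them survive.
    """
    tags = {}
    for pair in dump.split(";"):
        key, sep, value = pair.partition("=")
        if sep:  # skip fragments that contain no '=' at all
            tags[key.strip()] = value.strip()
    return tags


def is_highway_with_website(tags: dict) -> bool:
    """True for objects carrying both a highway tag and a website tag."""
    return "highway" in tags and "website" in tags


# Made-up example objects, not taken from the actual result list.
spammy = parse_tags("highway=traffic_signals;website=http://example.com/?a=b")
legit = parse_tags("highway=residential;note=added missing street")
```

Here `is_highway_with_website(spammy)` is true, while `legit`, which carries only a note, is not flagged.
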
>
> I've run a revert on these 170 but the majority had already been fixed
> by others!
>
> That leaves us with a good 3115 objects to investigate. Many do clearly
> violate our "no advertising" rules but then again we don't want to be
> too harsh with the cycle shop owner who maybe oversteps the line.
>
> I've put my interim results here
>
> http://www.remote.org/frederik/tmp/username-in-url.csv
>
> (for those where the username is in the URL) - do you think we should
> revert them all automatically? (Keep in mind many may have been reverted
> already - we'd only work on those where the spam version is still current.)
>
> and
>
> http://www.remote.org/frederik/tmp/other.csv
>
> for those where the username is not (fully) in the URL.
>
> Bye
> Frederik
>
> --
> Frederik Ramm  ##  eMail frederik at remote.org  ##  N49°00'09" E008°23'33"
>
> _______________________________________________
> Talk-us mailing list
> Talk-us at openstreetmap.org
> https://lists.openstreetmap.org/listinfo/talk-us
>