[OSM-talk] Redacting 75,000 street names contributed by user chdr

James james2432 at gmail.com
Sun Aug 27 13:58:30 UTC 2017


If we validate names via survey, say in Canada, will we be able to remove
those way IDs from the revert list? Canada has CanVec, which we can
reference, as well as OpenStreetCam and Mapillary.
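
The idea can be sketched roughly as follows. This assumes the revert list is
plain text with the way ID as the first whitespace-separated field on each
line (the real layout of chdr.details may differ), and that independently
verified IDs are collected by hand. The function name and the sample values
are hypothetical.

```python
def drop_verified(revert_lines, verified_ids):
    """Return the revert-list lines whose leading way ID was NOT verified.

    revert_lines: iterable of strings, way ID assumed to be the first field.
    verified_ids: set of way IDs (as strings) confirmed by survey.
    """
    kept = []
    for line in revert_lines:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        if fields[0] in verified_ids:
            continue  # name confirmed independently, keep it out of the revert
        kept.append(line)
    return kept


# Hypothetical sample data, not taken from the real list.
sample = ["123 Canada Ontario", "456 Canada Quebec"]
remaining = drop_verified(sample, {"456"})
```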

On Aug 27, 2017 9:50 AM, "Frederik Ramm" <frederik at remote.org> wrote:

> Hi,
>
>    in 2010 I was privately contacted by another OSM user with the
> suspicion that user "chdr" might be copying names from Google maps
> (there were a few "easter eggs" in Oman that were only on Google and not
> in the real world, and they suddenly popped up on OSM). "chdr" was
> contacted at the time, but continued unfazed. In 2013 another mapper
> lodged a complaint with DWG about edits by chdr, and I emailed chdr
> asking him about his sources. At that point chdr stopped mapping. He
> never replied about his sources though, even when I set an ultimatum (of
> 31st August 2013) threatening to remove all names he contributed if he
> couldn't tell us his source. We therefore have to assume that all names
> contributed by chdr are copyright violations.
>
> (chdr has added names all around the world, making it unlikely that they
> came from a harmless on-the-ground survey.)
>
> For various reasons I neglected to act on this, and was only reminded
> now, 4 years later, when DWG received a complaint from a user in Brazil
> where chdr has even used "source=google" occasionally. (But as I said,
> the suspicion is that Google was used throughout.)
>
> I have now compiled a list of all street names that were contributed by
> chdr and are still visible today; we're talking about almost 75,000
> street names world wide. The most affected countries are:
>
>   18023 "United States of America"
>   16345 "Mexico"
>   15109 "Brazil"
>    6791 "RSA"
>    2802 "Spain"
>    2614 "Australia"
>    1923 "Argentina"
>    1673 "Nigeria"
>    1569 "India"
>    1441 "Canada"
>     954 "Malaysia"
>     744 "Botswana"
>     717 "Philippines"
>     619 "Indonesia"
>     553 "Italy"
>     414 "Turkey"
>     290 "Hungary"
>     284 "Chile"
>     250 "Kenya"
>     127 "Saudi Arabia"
>     107 "Paraguay"
>     106 "Panama"
>     100 "Morocco"
>
> I've left out those countries with fewer than 100 affected ways.
>
> For the US, I can break it down by state:
>
>    5696 "Arizona"
>    5116 "Texas"
>    2294 "New York"
>    1164 "District of Columbia"
>     740 "Iowa"
>     494 "Colorado"
>     416 "New Jersey"
>     339 "Illinois"
>     268 "Michigan"
>     239 "Pennsylvania"
>     181 "Missouri"
>     147 "Georgia"
>     129 "New Mexico"
>     123 "North Carolina"
>     115 "California"
>     106 "Virginia"
>
> The breakdown for Mexico:
>
>    7749 "Baja California"
>    2084 "Puebla"
>    1964 "Chihuahua"
>    1539 "Coahuila"
>    1161 "Mexico"
>    1040 "Chiapas"
>     342 "Tamaulipas"
>     241 "Sonora"
>     185 "San Luis Potosi"
>     129 "New Mexico"
>
> and Brazil:
>
>   10904 "São Paulo"
>    2605 "Paraná"
>     945 "Rio de Janeiro"
>     270 "Rio Grande do Sul"
>     154 "Goiás"
>
> and South Africa:
>
>    4422 "Gauteng"
>     750 "KwaZulu-Natal"
>     600 "Eastern Cape"
>     439 "Western Cape"
>     400 "Northern Cape"
>     179 "Mpumalanga"
>
> - each time leaving out a couple others under 100.
>
> We believe that only names, not geometries, have been taken from other
> maps, so we'll remove and redact the names only. In identifying "names
> contributed by chdr" I took care to pick up only the names that were
> introduced by them, not names that were there before; also, when chdr
> split a way that already had a name, I made sure that the newly created
> way doesn't count as "named by chdr". Additionally, I have ignored those
> cases where chdr simply performed a TIGER expansion (St->Street etc) of
> a name that was there before.
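
The "TIGER expansion" exclusion described above could be checked along these
lines. This is only a sketch, not the process actually used: the abbreviation
table is a small hypothetical subset, and the function names are made up for
illustration.

```python
# Hypothetical subset of abbreviation expansions; a real table would be much
# larger, and "St" is ambiguous (Street vs. Saint), which this sketch ignores.
EXPANSIONS = {
    "St": "Street",
    "Ave": "Avenue",
    "Rd": "Road",
    "Dr": "Drive",
    "Blvd": "Boulevard",
    "N": "North",
    "S": "South",
    "E": "East",
    "W": "West",
}


def expand(name):
    """Expand known abbreviations token by token (a trailing '.' is ignored)."""
    return " ".join(EXPANSIONS.get(tok.rstrip("."), tok) for tok in name.split())


def is_pure_expansion(old_name, new_name):
    """True if new_name differs from old_name only by abbreviation expansion."""
    if old_name == new_name:
        return False  # nothing changed, so not an expansion edit
    return expand(old_name) == new_name
```

An edit that turns "Main St" into "Main Street" would count as a pure
expansion and be skipped, while one that turns "Main St" into "High Street"
would not.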
>
> My process has two weak points (that I am aware of):
>
> 1. It doesn't properly "follow" a chdr-contributed name through way
> splits performed by other users; if someone has split a way created by
> chdr, then the name will remain on the bit that was created by this
> user. This is somewhat unsatisfying but after having manually checked a
> random sample I think the problem is small enough to be ignored.
>
> 2. It is possible that, like with a recent case in Switzerland where I
> had to do a similar redaction, some of these chdr-contributed names will
> have been confirmed by others in a survey, i.e. someone else surveyed
> the area and checked the name, but saw no need to change it in any way
> since it was already correct. Sadly my process will now remove the name
> even though, had the name not been there in the first place, that person
> could have added the name. This is not nice but I don't see how it could
> be avoided.
>
> Here's a list of way IDs affected, with country and state:
>
> http://www.remote.org/frederik/tmp/chdr.details
>
> I am trying to keep the damage to OSM to a minimum while at the same
> time respecting copyright. If anyone wants to spot check a few names in
> their area and can suggest a refinement of the process that would leave
> more names in place because there's reason to assume they are legit, I'm
> all ears.
>
> It has been suggested to me that even if names in the US were taken
> from Google, Google would in turn have had them from TIGER and hence we
> could simply leave them be. I am not convinced of this reasoning but
> willing to hear that case argued.
>
> It is sad that chdr isn't available for comment but I must take their
> silence as an admission of wrongdoing. I will fire off another message
> to them pointing to this thread.
>
> Bye
> Frederik
>
> --
> Frederik Ramm  ##  eMail frederik at remote.org  ##  N49°00'09" E008°23'33"