[OSM-talk] Suggested mass edits

Martin Machyna machyna at gmail.com
Mon Apr 19 14:26:53 UTC 2021


Hmm, I can see how the DWG could find this data interesting. For a quick look 
at the per-user info you can check the Overpass output and just scroll through 
the list (btw woodpeck_repair has a count of 165.. oh oh :). Maybe you can 
carefully craft a message to those users reminding them to be more 
vigilant before submitting. I would just keep in mind that having 50 
errors in 500,000 changes is different from having 50 errors in 5,000.
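
To make that comparison concrete, here is a minimal sketch that turns an error count and a total change count into a percentage. The function name error_rate is made up for illustration, and the numbers are just the ones from the sentence above, not measured values.

```shell
# error_rate ERRORS TOTAL: print errors as a percentage of all changes.
# (hypothetical helper, not part of any OSM tool)
error_rate() {
    awk -v e="$1" -v t="$2" 'BEGIN { printf "%.2f\n", 100 * e / t }'
}

error_rate 50 500000   # prints 0.01
error_rate 50 5000     # prints 1.00
```

The same 50 errors are a hundred times more significant, relatively speaking, for the user with fewer changes.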

[out:csv(user,emptyRelations)][timeout:250];
relation[type](if: count_members() == 0);
rel(br) -> .parents;
(._; - rel(r.parents););

for (user()){
     make stat
      "user"=_.val,
      emptyRelations = count(relations);
     out;
};

(I archived the outputs here: 
https://storage.googleapis.com/osm_cleanup/emptyRelation_userCount.csv ; 
https://storage.googleapis.com/osm_cleanup/typeMultipolygon_userCount.csv )


The info about all objects can be accessed with e.g.

[out:csv(::user,::uid,::changeset,::timestamp)][timeout:250];
relation[type](if: count_members() == 0);
rel(br) -> .parents;
(._; - rel(r.parents););
out meta;

(Archived at: 
https://storage.googleapis.com/osm_cleanup/emptyRelation.csv ; 
https://storage.googleapis.com/osm_cleanup/typeMultipolygon.csv)
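
The CSV is tab-separated with the four columns named in the query header, so it can be sliced with standard tools. As a rough sketch of one possible analysis, the function below counts objects per year of last edit; the name count_by_year and the sample rows are made up for illustration.

```shell
# count_by_year: read the tab-separated CSV (with header line) on stdin
# and print "year count" pairs, oldest year first.
# (hypothetical helper; assumes the timestamp is the 4th column)
count_by_year() {
    tail -n +2 \
        | cut -f 4 \
        | cut -c 1-4 \
        | sort \
        | uniq -c \
        | awk '{print $2, $1}'
}

# Made-up sample rows in the user, uid, changeset, timestamp layout.
printf '%s\t%s\t%s\t%s\n' \
    user  uid changeset timestamp \
    alice 1   100       2014-05-20T10:00:00Z \
    bob   2   101       2014-06-01T09:30:00Z \
    carol 3   102       2015-01-02T12:00:00Z \
    | count_by_year
```

To run it on the real data, pipe the archived file in instead of the sample rows, e.g. curl -s on the emptyRelation.csv URL above followed by | count_by_year.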

So anybody can run whatever analysis they want on the output. More skilled 
people can dig out some useful stats. I just quickly ran this for the first 
1000 records, going back to 2014-05-20.

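# Collect the distinct changeset IDs (third tab-separated column, header skipped).
# sort -u is used because plain uniq only drops adjacent duplicates.
ids=$(curl -s https://storage.googleapis.com/osm_cleanup/emptyRelation.csv \
     | tail -n +2 \
     | cut -f 3 \
     | sort -u)

# Fetch each changeset, pull out the value of its created_by tag,
# keep only the first word (the editor name) and count changesets per editor.
for id in $ids; do
     curl -s "https://www.openstreetmap.org/api/0.6/changeset/${id}" \
         | grep created_by \
         | cut -d "\"" -f 4
done | cut -d " " -f 1 \
         | awk '{stat[$1] += 1}
                 END {for (key in stat) {
                     print key, stat[key]
                     }}'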


Seems the biggest perpetrator in terms of changeset counts is JOSM:
Go 1
JOSM/1.5 378
iD 76
Potlatch 9
Vespucci 7
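
The list comes out unsorted because awk does not guarantee any order for "for (key in stat)". A sketch of a sorted variant of the tally step, using sort and uniq -c instead of the awk array; the function name tally_editors and the sample created_by strings are made up for illustration.

```shell
# tally_editors: read created_by values on stdin (one per line), keep the
# first word of each, and print "editor count" pairs, largest count first.
# (hypothetical helper)
tally_editors() {
    cut -d " " -f 1 \
        | sort \
        | uniq -c \
        | sort -rn \
        | awk '{print $2, $1}'
}

# Made-up sample input.
printf '%s\n' "JOSM/1.5 (17428 en)" "iD 2.19.6" "JOSM/1.5 (17329 en)" \
    | tally_editors
```

In the script above, this function could replace everything after "done |". Note that cutting at the first space is also why an editor with a space in its name shows up truncated in the list.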


I would be interested to see what cool things you find. Please keep me 
updated.



On 18.4.21 6:34 , Frederik Ramm wrote:
> Hi,
>
> On 4/18/21 19:20, stevea wrote:
>> Along with the usual caveats about "mass edits" (these must be done 
>> very carefully and with deliberate, targeted purpose)
>
> Another usual caveat is:
>
> If these buggy objects appear in clusters, they might hint at a deeper 
> problem. Are many of them created by the same user(s) or by the same 
> editor(s)? If that is the case, more research might be appropriate so 
> that editors or workflows can be improved. Sometimes analysis of the 
> broken data can also point to a broken import or mass edit that has so 
> far been undetected and will, upon closer look, have more problems 
> than just these obvious ones. In such a situation, simply deleting the 
> buggy objects will remove the "red flags" that would otherwise have 
> pointed at the broken import or mass edits.
>
> Bye
> Frederik
>


