[OSM-talk] Suggested mass edits
Martin Machyna
machyna at gmail.com
Mon Apr 19 14:26:53 UTC 2021
Hmm, I can see how the DWG might find this data interesting. For a quick look
at the per-user info you can check the Overpass output and just scroll through
the list. (Btw, woodpeck_repair has a count of 165.. oh oh :) Maybe you can
carefully craft a message to those users reminding them to be more
vigilant before submitting. I would just keep in mind that having 50
errors out of 500,000 changes is different from having 50 out of 5,000.
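[out:csv(user,emptyRelations)][timeout:250];
// relations that have a type tag but no members
relation[type](if: count_members() == 0);
// relations that contain any of them as a member
rel(br) -> .parents;
// keep only the empty relations that are not members of a parent
(._; - rel(r.parents););
// one output row per user: user name and number of empty relations
for (user()){
  make stat
    "user"=_.val,
    emptyRelations = count(relations);
  out;
};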
(I archived the outputs here:
https://storage.googleapis.com/osm_cleanup/emptyRelation_userCount.csv ;
https://storage.googleapis.com/osm_cleanup/typeMultipolygon_userCount.csv )
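To illustrate the point above about error counts versus edit volume, here is a minimal sketch that turns a count into a rate. The helper name error_rate is made up for this example, and the numbers are the illustrative ones from this message, not measured per-user totals (the CSVs above do not contain each user's total number of changes).

```shell
# Error rate as a percentage: errors / total changes * 100.
# error_rate is a hypothetical helper; the inputs below are the
# illustrative figures from the message, not real per-user data.
error_rate() {
  awk -v errors="$1" -v total="$2" \
    'BEGIN { printf "%.2f%%\n", 100 * errors / total }'
}

error_rate 50 500000   # prints 0.01%
error_rate 50 5000     # prints 1.00%
```

The same 50 errors are a hundred times more significant for the second user, which is worth checking before writing to anybody.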
The info about all objects can be accessed with e.g.
[out:csv(::user,::uid,::changeset,::timestamp)][timeout:250];
relation[type](if: count_members() == 0);
rel(br) -> .parents;
(._; - rel(r.parents););
out meta;
(Archived at:
https://storage.googleapis.com/osm_cleanup/emptyRelation.csv ;
https://storage.googleapis.com/osm_cleanup/typeMultipolygon.csv)
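As a starting point for that kind of analysis, here is a rough sketch that counts records per user in the per-object CSV. It assumes the file is tab-separated with a header row and the user name in column 1, as the query above should produce; count_by_user is just a name picked for this example.

```shell
# Count records per user from a tab-separated CSV read on stdin.
# Assumptions: line 1 is a header row and column 1 holds the user name.
# Output: "<count><TAB><user>", highest count first, ties sorted by name.
count_by_user() {
  awk -F '\t' 'NR > 1 { count[$1]++ }
       END { for (user in count) printf "%d\t%s\n", count[user], user }' \
    | sort -t "$(printf '\t')" -k1,1nr -k2,2
}

# Example (needs network access):
#   curl -s https://storage.googleapis.com/osm_cleanup/emptyRelation.csv | count_by_user | head
```

Splitting on tabs in awk, rather than using uniq -c, keeps user names that contain spaces intact.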
So anybody can run whatever analysis they want on the output. More skilled
people can dig out some useful stats. I just quickly ran this for the first
1000 records, going back to 2014-05-20.
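# changeset IDs are in column 3 of the tab-separated CSV; skip the header row
# (uniq only collapses adjacent duplicates; use sort -u for a strict dedupe)
ids=$(curl -s https://storage.googleapis.com/osm_cleanup/emptyRelation.csv \
  | tail -n +2 \
  | cut -f 3 \
  | uniq)
# fetch each changeset, take the first word of its created_by tag, tally per editor
for id in $ids; do
  curl -s "https://www.openstreetmap.org/api/0.6/changeset/${id}" \
    | grep created_by \
    | cut -d "\"" -f 4
done | cut -d " " -f 1 \
  | awk '{stat[$1] += 1}
    END {for (key in stat) {
      print key, stat[key]
    }}'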
It seems the biggest perpetrator in terms of changeset count is JOSM:
Go 1
JOSM/1.5 378
iD 76
Potlatch 9
Vespucci 7
I would be interested to see what cool things you find. Please keep me
updated.
On 18.4.21 6:34 , Frederik Ramm wrote:
> Hi,
>
> On 4/18/21 19:20, stevea wrote:
>> Along with the usual caveats about "mass edits" (these must be done
>> very carefully and with deliberate, targeted purpose)
>
> Another usual caveat is:
>
> If these buggy objects appear in clusters, they might hint at a deeper
> problem. Are many of them created by the same user(s) or by the same
> editor(s)? If that is the case, more research might be appropriate so
> that editors or workflows can be improved. Sometimes analysis of the
> broken data can also point to a broken import or mass edit that has so
> far been undetected and will, upon closer look, have more problems
> than just these obvious ones. In such a situation, simply deleting the
> buggy objects will remove the "red flags" that would otherwise have
> pointed at the broken import or mass edits.
>
> Bye
> Frederik
>