[OSM-talk] Code of conduct for automated (mass-) edits
Frederik Ramm
frederik at remote.org
Mon Sep 29 00:08:50 BST 2008
Hi,
as OpenStreetMap draws more and more sophisticated users, we're also
seeing more scripts or, as they would be called in Wikipedia, "bots",
modifying data.
I'm in a bit of a dilemma here. I have been using such scripts for a
while now (see my "Fixbot" page on the Wiki, or some past "undo"
actions). I have always felt that being able to use scripts gives me
considerably more power than the average user, and I have tried to
ensure that I use that power responsibly.
That kind of self-restraint is, however, not the norm for everyone who is
capable of doing automated modifications. We are now seeing automated
edits on a large scale, often un-discussed and un-documented. When you
ask the authors they respond with something like "oh, I read on the Wiki
that something should be so-and-so, so I thought I'll just change it".
One example to which I took exception is that someone in Germany has
"corrected" a five-digit number of ways by inserting spaces in "ref"
tags (ref=A18 became ref=A 18) and/or changing "Strasse" in the name to
"Straße", which is the correct spelling (but nonetheless "Strasse" is
often found on signs).
Now the actual changes done are not too bad; they are actually, ex post,
welcomed by the majority of people on talk-de. Had the author of the
script discussed the issue on talk-de before, he'd probably have
received an almost unanimous go-ahead from the community.
Still, this issue makes me feel uneasy. We take pride in not having
fixed rules. If someone, somewhere, decides to tag a road as "Strasse"
not "Straße" because that's what's on the signs, however wrong
orthographically: Should someone else, armed with no local knowledge but
just a set of spelling rules, without prior discussion, run a script
that changes this? Is this not showing disrespect to other people's
contributions?
Another issue is, *if* something is changed, *how* this is done. Lacking
0.6's versioning, if anyone analyzes yesterday's planet file to find
ways he'd like to fix and uploads changed versions of each, chances are
he'll overwrite all those that have been changed between the generation
of the planet file and his script run. Whoever wants to run an automated
update should know exactly what he's doing, and be in a position to
exactly revert his changes should it turn out they were faulty.
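The overwrite risk described above can be sketched as a staleness check before each upload. This is only an illustration under stated assumptions: the `Way` record, the `plan_edits` function, and the idea of comparing the snapshot's timestamp with the live object's timestamp are hypothetical and not part of any OSM API. The sketch also keeps the original tags of every changed object so the job could be reverted exactly.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass
class Way:
    # Hypothetical minimal stand-in for an OSM way; not a real API type.
    way_id: int
    timestamp: str  # last-modified time as reported by the data source
    tags: Dict[str, str]


def plan_edits(
    snapshot: Dict[int, Way],
    live: Dict[int, Way],
    fix: Callable[[Dict[str, str]], Dict[str, str]],
) -> Tuple[List[Way], List[int], Dict[int, Dict[str, str]]]:
    """Decide which ways are safe to upload.

    snapshot: ways as found in the (older) planet file
    live:     the same ways as they exist right now
    fix:      function returning corrected tags for a copy of the old tags
    """
    uploads: List[Way] = []
    skipped: List[int] = []
    undo_log: Dict[int, Dict[str, str]] = {}
    for way_id, old in snapshot.items():
        current = live.get(way_id)
        # Someone edited or deleted the way after the snapshot: do not overwrite.
        if current is None or current.timestamp != old.timestamp:
            skipped.append(way_id)
            continue
        new_tags = fix(dict(old.tags))  # work on a copy, keep the original intact
        if new_tags != old.tags:
            uploads.append(Way(way_id, old.timestamp, new_tags))
            undo_log[way_id] = dict(old.tags)  # enough to restore the old state
    return uploads, skipped, undo_log


def add_ref_space(tags: Dict[str, str]) -> Dict[str, str]:
    # Example fix taken from the ref=A18 -> ref=A 18 case mentioned earlier.
    ref = tags.get("ref", "")
    if ref.startswith("A") and ref[1:].isdigit():
        tags["ref"] = "A " + ref[1:]
    return tags
```

Ways listed in `skipped` would need to be re-fetched and re-examined rather than overwritten, and `undo_log` is what makes an exact revert possible.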
And still another thing is documentation: I expect that any automated,
large-scale change should be documented. When was it done, what exactly
was done, how many objects were affected, and what were the "source"
and/or username settings for the job, so that it can be identified
later?
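The documentation items listed above could be captured in a small record written once per job. This is a minimal sketch, not an established format: the `BulkEditRecord` class, its field names, and the `format_record` helper are all hypothetical.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class BulkEditRecord:
    # Hypothetical field names, one per question raised in the text.
    run_date: str          # when was it done
    description: str       # what exactly was done
    objects_affected: int  # how many objects were affected
    source_tag: str        # "source" setting used for the job
    username: str          # account the job ran under


def format_record(record: BulkEditRecord) -> str:
    """Render the record as JSON text suitable for pasting on a wiki page."""
    return json.dumps(asdict(record), indent=2, sort_keys=True)
```

Publishing such a record alongside the discussion that preceded the job lets others identify the edit later and judge whether it matches what was agreed.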
When I issued words of caution on the German list, some people came to
me grinning and said "well there you have it, that's what happens when
you have a project without rules, and anyone making automated changes
has the same right to do so as anyone else".
I don't think this is true; scripts or "bots" are a powerful means of
enforcing rules. If they proliferate in an uncontrolled fashion, we'll
soon have a number of mini dictators who have constructed their own set
of rules and will modify anything that dares to be different. The
philosopher Karl Popper has called this the "paradox of tolerance" -
even if you preach tolerance, your tolerance has to stop at intolerance.
So if we preach the freedom to tag whatever you want and how you want
it, that freedom has to stop if people start mass-changing existing data.
I am in favour of setting up a code of conduct for automated edits. The
key elements would be:
1. Make a plan of what you want to change, and discuss it in the relevant forum
(usu. mailing list). If there are many objections, drop the plan. If
there are few objections, maybe exempt certain areas or objects created
by certain people in order to respect their objections. Remember that
they can easily change things back again if you act against their will,
so don't even try to play the superiority card.
2. Make sure your tools and knowledge are good: You have to be able to
revert your changes if something goes wrong, and you need to keep any
collateral damage to an absolute minimum. If you cannot guarantee that,
ask someone for help who can.
3. Run the job. If it is something big or something you will probably do
more often, consider creating an extra account for it so it is easily
recognizable.
4. Provide documentation that tells people what exactly you have done.
5. Remember: With great power comes great responsibility.
I would also accompany this by the notion that if you see an automated
edit that you believe has problems, and it has not been discussed or
documented, it's ok to revert it.
Bye
Frederik
--
Frederik Ramm ## eMail frederik at remote.org ## N49°00'09" E008°23'33"