[OSM-talk] Automated edits code of conduct

Wed Jul 13 02:08:05 UTC 2016

On 12/07/2016 09:53, Christoph Hormann wrote:
> I would suggest to look at things more in terms of consistency - OSM is 
> all about local knowledge and mappers mapping their day-to-day 
> environment.  It is inconsistent with this aim to allow others to mess 
> around in this local mapping through automated edits without looking at 
> individual features one by one.

Automated edits should also have a place. For things like pure tagging
errors:
  - URLs lacking the "https://" prefix
  - leading and trailing spaces in names
  - common, obvious and unambiguous typos
the probability of error is almost nil (an error in the script, or a typo?)
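
As a minimal sketch of what the first two kinds of fix could look like in
a script (Python; the function name `clean_tags` and the choice of the
`name` and `website` keys are assumptions for illustration, not an
existing tool):

```python
def clean_tags(tags):
    """Return a cleaned copy of a tag dict and a list of the changes made.

    Hypothetical helper: only the "name" and "website" keys are handled,
    and the input dict is never modified.
    """
    cleaned = dict(tags)
    changes = []

    # Leading and trailing spaces in names.
    name = cleaned.get("name")
    if name is not None and name != name.strip():
        cleaned["name"] = name.strip()
        changes.append("name: stripped surrounding whitespace")

    # URLs lacking a scheme. Assumption: the site really is served over
    # HTTPS, which a script cannot know without checking.
    url = cleaned.get("website")
    if url and "://" not in url:
        cleaned["website"] = "https://" + url.strip()
        changes.append("website: added https:// prefix")

    return cleaned, changes
```

Returning the list of changes alongside the cleaned tags is meant to make
the review afterwards easier: the script reports what it touched instead
of silently rewriting objects.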

And then there are things like:
  - renaming a ministry (that was renamed) which was used in operator tags
  - renaming an electricity network operator
  - adding the leaf type of a precise species where there is absolutely
no doubt or ambiguity about its leaf type
  - a whole bike sharing network closes in a city → deletion of the
station nodes
  - a brand changed its name countrywide and, after a few months, when the
physical renaming seems to have been completely deployed → mass renaming
There is a sliding scale along which more and more semantics and
subjectivity are involved and the risk of error increases. But we need
many of these kinds of cleanups so the database doesn't rot. And of
course prior discussion and review after the work are necessary to ensure
that the edit is not replacing errors with other ones.

---

On 12/07/2016 14:35, Frederik Ramm wrote:
> For example, a few years ago we had a user called "worst-fixer" or so
> who did a couple of large-scale edits removing the "created-by" tag. Now
> this was a mechanical edit against the rules, and there was a consensus
> in the community to remove those unwanted tags piecemeal instead of
> creating a new version for hundreds of thousands of objects, needlessly.
> 
> Strictly enforcing rules would have meant reverting all these edits but
> that would have been quite silly (causing another extra version to be
> created), so they were allowed to stand.

If we (still) accept such edits going undiscussed, then it's difficult to
draw the line with other edits that are unambiguously backed by the wiki,
which would also give the impression of community consensus. And there
have been cases where such edits were reverted.
Could a fairer approach (in the case of a first incident) be to
systematically ask the contributor (even in a case such as the
"created-by" tag) to quickly open a discussion calling for review of the
edit? That would limit the frustration of a first-time mass edit conflict
(by showing a simple way to avoid a revert, if the discussion validates
the edit) and still ensure that the contributor will discuss their edits
in the future.

> Having a DWG whose legitimacy comes from rules would allow everyone to
> start endless discussions about DWG's interpretation of the rules, or
> finding loopholes in the wording.

Alright, I agree, in fact that already seems to happen to some extent.

Should DWG members intervene as DWG members? At least when blocking
power is not needed. During complicated discussions, it seems to happen
that DWG members mention their membership. That could help, as the
moderator status can give legitimacy, but the legitimacy of the status
can be called into question by controversial choices made through lack of
time or simple error. And more importantly, it makes the contributor
forget that they are facing another contributor who is also volunteering
to do QA. I think this last point should be emphasized to avoid a
police-versus-offender relationship. Because in these situations, we are
all damn janitors trying to clean up or avoid a mess. :)

> Perhaps we could make JOSM cleverer in detecting such cases and alerting
> people to the rules. JOSM already pops up tons of warnings - about
> moving lots of nodes, about displaced aerial imagery, etc. - it could
> also say "you're changing a lot of objects over a geographically large
> area at the same time and you haven't zoomed in on any, are you sure you
> have read the rules..."

That's actually a great idea! Even something as simple as a message when
first using search and replace would be effective.
*Who has the time to open a ticket?*
https://josm.openstreetmap.de/newticket
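
To make the suggestion a bit more concrete for such a ticket, here is a
rough sketch of the kind of check described above. It is plain Python,
not JOSM code; the function name, the parameters and both thresholds are
assumptions picked for illustration only:

```python
def looks_like_mass_edit(coords, inspected_count,
                         min_objects=500, min_span_deg=1.0):
    """Guess whether a pending upload deserves a "have you read the
    rules?" warning.

    coords          -- list of (lat, lon) pairs of the modified objects
    inspected_count -- how many of them the user zoomed in on
    The thresholds are arbitrary placeholders, not agreed values.
    """
    # Few objects, or at least one looked at individually: no warning.
    if not coords or len(coords) < min_objects or inspected_count > 0:
        return False

    lats = [lat for lat, _ in coords]
    lons = [lon for _, lon in coords]

    # Size of the bounding box in degrees (ignores the antimeridian).
    span = max(max(lats) - min(lats), max(lons) - min(lons))
    return span >= min_span_deg
```

The check only decides whether to show a message; it blocks nothing, in
line with the other warnings JOSM already pops up.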

Are there any other common places where first-time mass edits could be
performed without going through the wiki?

> tuxayo:
>> The reporting of AECoC violations could be done in a dedicated open
>> mailing list so we could have accountability about how these issues are
>> handled.
>> *Any thoughts about this? This is a concrete proposal.*
> 
> DWG is happy about every case that the community manages to handle
> between themselves, without DWG having to get involved. If such a
> mailing list would help taking some of the load off DWG's shoulders and
> DWG would then only deal with those cases that the community can't
> handle or where things aren't clear enough, sure that would be great.

Alright, the steps that come to mind to try to set this up are:

- Draft guidelines for handling automated edit issues, to encourage
contributors interested in QA to try the "meta QA" level ^^

- Decide the scope: it could include generic bad-edit issues because:
    - If I send a changeset comment or a private message to a
      contributor about a bad changeset, I could forget it and it
      could remain unfixed.
      Then having a place where one could send an email saying: "this
      contributor mapped a lot of dubious stuff, I posted a comment to
      ask for clarification/fix" would help keep track of that,
      while also giving the community an overview of the state of
      these issues.
    - One may take the time to contact a contributor about issues, get
      no response, and find that the situation needs a revert, yet
      lack the time/knowledge/confidence to perform it.
      Having a place to ask someone else (not the user SomeoneElse :-P)
      to perform the revert or to confirm that it's justified would be
      nice.
    - I guess the DWG must also receive many requests like:
        - A contributor doesn't respond to issues and a revert is
          needed. (If the contributor is active then a block could be
          needed, so yeah, calling the DWG totally makes sense.)
        - Someone who fears forgetting to follow up on the case, or who
          doesn't want to bother contacting the other contributor,
          reports the case directly to the DWG.
  And it's very likely that the people interested in helping to handle
  these more generic issues overlap a lot with those interested in
  automated edit issues. If the needs above would justify creating a
  similar structure, then it would be simpler to have one with a
  broader scope.
  However, we might want to keep the scope narrow if it turns out to be
  too complicated to find a consensus to set this up.

- Find how to redirect all (or a fraction) of the automated edit issues
  that are currently reported to the DWG. It would effectively be a
  proxy for that kind of work that will hopefully absorb a fraction of
  it.

- Find a relevant list name: it depends on the scope and should be
  future-proof. Remember that the wiki pages about mechanical edits were
  renamed to automated edits. Since the name could change, and "automated
  edits" might mislead about the scope (because the intuitive
  definition might suggest that mass search and replace is excluded and
  that it's only for bots), it must be chosen with care.

- Submit all of the above for comment and approval

- Ask the appropriate people to create the list

*Is an important step missing?*

---

On 12/07/2016 14:35, Frederik Ramm wrote:
> tuxayo:
>> Considering the standards required for tags and automated edits,
>> not having comparable ones for the content of the AECoC is inconsistent
>> compared to it's importance.

> The rules about automated edits stem from their ability to upset many
> people in the community. Reverting an automated edit will usually only
> upset one person. 

It limits the possibilities for cleaning the database, for which there is
often more than one person who agrees with a given edit (e.g. fixing
spelling mistakes; it's case by case). So yes, in the strict sense, in
those situations there is often one person upset by a revert, but it's
too simplistic to just count like that. That's why a balance has to be
found (it has already been found to a certain extent, but we shouldn't
underweight either of the two sides).

> It is a logical fallacy to believe that just because
> automated edits are a problem that needs to be regulated, the reverting
> of automated edits needs to be regulated as well.

That might indeed be the case, because I can't clearly phrase why «*just*
because automated edits are a problem that needs to be regulated, the
reverting of automated edits needs to be regulated».

It's more about the need to follow standards when asking and forcing
other people to follow standards. ← That would be a general principle,
but I understand that it's not very clear why it should be universally
respected (or at least respected by default).

---

On 12/07/2016 18:29, Éric Gillet wrote:
>  The DWG currently use those (AE CoC) rules to revert changesets without
> further justification.

It's actually only about the "discuss your edit" point, isn't it?

Which is essentially easy enough to justify.

> It's a good thing that rules could be bent a little, but that means that they should be modified. Defining rules but overriding them when convenient is not a sane approach in the long term.

Agreed, it should be made more explicit how they are actually bent and why.

Because when reading:
«Automated Edits code of conduct must be followed at all times when
performing Automated edits»
«you should therefor be sensitive to proceeding with major changes even
where the great majority support the change»
«If you find that your plan is widely accepted except for a few
dissenters, then work with those people to understand their reasons for
objecting. If you can not find accommodation then consider making an
exception for their edits or area.»
- Making a wiki page is given the same importance as discussing.
«Your edit may be reverted even if you have followed this policy»

The feeling is:
- I have to do *everything* if I want my several hours of work to stand
a chance of not being reverted. (I'm curious how many edits comply with
this.)
- One objection in the discussion (except when there is an overwhelming
consensus) will prevent executing the change without risking the loss of
all the effort put into planning and executing it.

So basically, to still want to do such QA work, one must disregard all
of the guidelines (deliberate lone wolves) or half of them (and they are
written explicitly as rules). This could actually filter out mostly the
contributors who would be the most cautious about their work.

This is a bit of an extreme interpretation, but when reading the AECoC
without any experience of how it actually works (which edits are fine
and which shouldn't be done), I don't think it's that exaggerated.

> 2016-07-12 14:35 GMT+02:00 Frederik Ramm <frederik at remote.org>
>> DWG is happy about every case that the community manages to handle
>> between themselves, without DWG having to get involved. If such a
>> mailing list would help taking some of the load off DWG's shoulders and
>> DWG would then only deal with those cases that the community can't
>> handle or where things aren't clear enough, sure that would be great.
> 
> 
> So at least one user should reach out to the contributor before involving the DWG ? That would be great but that's not currently the case in my experience.

Which is why having a "proxy" (← any better metaphor?) placed in front of
the DWG could help get more attempts at discussion before invoking the
DWG. This would mainly come from having a much lower entry barrier than
joining the DWG, which would allow more contributors to take part in
handling these issues. More resources should hopefully mean more time on
each case:
- to limit bad experiences when trying to do QA using mass edits
- to convince people why always at least discussing the edit is important.

>> The rules about automated edits stem from their ability to upset many
>> people in the community. Reverting an automated edit will usually only
>> upset one person. 
> 
> At least some reversal were done after only one complaint, so it doesn't currently work like that.

That means that the DWG member also didn't agree with the edit. But
yeah, it's still the problem of true consensus being unreachable.

---

On 12/07/2016 19:35, Andy Townsend wrote:
> There will be cases where the data that's in OSM is "a bit woolly", and doesn't quite get the sense of a real-world entity across (but without an on-the-ground survey it's difficult to say what the problem is).  Sometimes the fact that OSM mappers have captured something that "doesn't quite fit" OSM's frequently used categories is really useful, because it identifies something that we should categorise better - so it's important that the _sense_ of what the original mapper reported is kept, rather than their square peg being hammered down into a round hole**.

Does this represent the majority of the automated edits that are wrong
in their content? (Content as in the changeset itself, not its process,
such as discussing the edit.) I'm curious what the most important content
errors are.

---

@Christoph Hormann @Frederik Ramm

I think you are right to think more in terms of guidelines than rules,
and not to blindly enforce many complex rules, in order to avoid
"lawyering" issues. It can't be "that simple" and actually work :-(

-- 
tuxayo