[Rebuild] About keeping and removing tags
Frederik Ramm
frederik at remote.org
Fri Mar 2 23:27:08 GMT 2012
Hi,
On 03/02/2012 11:25 PM, errt at gmx.de wrote:
> We should certainly do that, there's lots of valuable information that
> can and therefore should be kept. The interesting point is: What
> algorithm do we use to decide on what is a clean override and what is a
> derivation? Something like Levenshtein distance might be worth
> considering.
I thought about that, but it's not going to work, because
J.-W.-v.Goethe-Strasse -> Johann-Wolfgang-von-Goethe-Strasse
will have a greater distance than
Talstrasse -> Bachstrasse
even though the former change is not significant while the latter is.
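
To make the comparison concrete, here is a minimal sketch of the plain Levenshtein (edit) distance applied to the two renames above. The function name and the variable names are only illustrative, and nothing here is taken from the actual rebuild code.

```python
def levenshtein(a: str, b: str) -> int:
    """Plain edit distance: insertions, deletions and substitutions cost 1."""
    # previous[j] holds the distance between the prefix of a processed so far
    # and the first j characters of b.
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete ca
                current[j - 1] + 1,      # insert cb
                previous[j - 1] + cost,  # keep or substitute
            ))
        previous = current
    return previous[-1]


# Expanding an abbreviation: same street, not a significant change.
expansion = levenshtein("J.-W.-v.Goethe-Strasse",
                        "Johann-Wolfgang-von-Goethe-Strasse")
# Renaming the street: a significant change.
rename = levenshtein("Talstrasse", "Bachstrasse")

print(expansion, rename)  # 15 3
```

The insignificant expansion scores five times higher than the real rename, so no single distance threshold separates the two cases.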
> There's also a point in just saying every change can be
> kept, as the changer apparently has some knowledge, otherwise he
> couldn't do the change (ok, that's not true with typo correction, but
> the question is: How many such cases are there?)
xybot alone has made around 5 million edits, none of which is based on
any knowledge of the object in question, and I'm sure there will be many
others. If we tried that approach then we would have to create another
list of *agreeing* users/changesets whose work we know to be usually
automatic...
I agree that a small number of situations where we accidentally
white-wash a decliner's contribution because it received a minor
automatic edit or spelling correction would be acceptable, but we would
certainly have to make *some* effort to find out the top "automatic
modifiers".
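
As a rough sketch of how such a list could be used under the "every change can be kept" approach: ignore edits from known automatic editors and let the most recent remaining edit decide. Everything below is hypothetical (the `TagEdit` record, the `AUTOMATIC_EDITORS` set, the function name and the sample user names); only "xybot" comes from this thread, and the fallback for purely automatic histories is an assumption.

```python
from dataclasses import dataclass

# Hypothetical list of accounts whose edits are known to be automatic.
AUTOMATIC_EDITORS = {"xybot"}


@dataclass
class TagEdit:
    user: str     # who set this tag value
    agreed: bool  # whether that user is an agreeing user
    value: str    # the tag value after the edit


def can_keep_value(history: list[TagEdit]) -> bool:
    """Decide whether the latest tag value may be kept.

    history is ordered oldest first. Automatic edits are skipped, so they
    can never white-wash a decliner's contribution.
    """
    for edit in reversed(history):
        if edit.user in AUTOMATIC_EDITORS:
            continue  # not based on knowledge of the object
        return edit.agreed
    # Assumption: with only automatic edits, fall back to the first author.
    return history[0].agreed if history else False


# A decliner's name that was only touched by a bot is not kept ...
bot_only = [TagEdit("decliner_a", False, "Talstr."),
            TagEdit("xybot", True, "Talstrasse")]
# ... while a change made by an agreeing mapper is.
human_fix = [TagEdit("decliner_a", False, "Talstrasse"),
             TagEdit("mapper_b", True, "Bachstrasse")]

print(can_keep_value(bot_only), can_keep_value(human_fix))  # False True
```

This does not catch manual typo corrections by agreeing users; those would still be treated as clean overrides.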
Bye
Frederik
--
Frederik Ramm ## eMail frederik at remote.org ## N49°00'09" E008°23'33"