[Rebuild] About keeping and removing tags

Frederik Ramm frederik at remote.org
Fri Mar 2 23:27:08 GMT 2012


Hi,

On 03/02/2012 11:25 PM, errt at gmx.de wrote:
> We should certainly do that, there's lots of valuable information that
> can and therefore should be kept. The interesting point is: What
> algorithm do we use to decide on what is a clean override and what is a
> derivation? Something like Levenshtein distance might be worth
> considering.

I thought about that, but it's not going to work as

J.-W.-v.Goethe-Strasse -> Johann-Wolfgang-von-Goethe-Strasse

will have a greater distance than

Talstrasse -> Bachstrasse

but the former is not significant while the latter is.
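
To make the comparison concrete, here is a minimal sketch of a plain
Levenshtein distance (insertions, deletions and substitutions all cost
1). The function name and the choice of Python are only for
illustration; nothing like this is part of the rebuild code as far as
this thread is concerned.

```python
def levenshtein(a: str, b: str) -> int:
    """Plain edit distance: insert, delete and substitute each cost 1."""
    # previous[j] holds the distance between the processed prefix of a
    # and the first j characters of b.
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete ca
                current[j - 1] + 1,      # insert cb
                previous[j - 1] + cost,  # substitute (or keep) ca -> cb
            ))
        previous = current
    return previous[-1]


# Expanding an abbreviation: same street, insignificant change.
abbreviation = levenshtein("J.-W.-v.Goethe-Strasse",
                           "Johann-Wolfgang-von-Goethe-Strasse")

# Renaming the street: a different name, significant change.
rename = levenshtein("Talstrasse", "Bachstrasse")

print(abbreviation, rename)
```

The rename comes out at a distance of 3, while the expanded
abbreviation cannot be below 12 because the new name is twelve
characters longer. Any simple threshold on the distance would therefore
classify these two cases the wrong way round.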

> There's also a point in just saying every change can be
> kept, as the changer apparently has some knowledge, otherwise he
> couldn't do the change (ok, that's not true with typo correction, but
> the question is: How many such cases are there?)

xybot alone has made around 5 million edits, none of which is based on
any knowledge of the object in question, and I'm sure there will be many
others. If we tried that approach then we would have to create another
list of *agreeing* users/changesets whose work we know to be mostly
automatic...

I agree that a small number of cases where we accidentally whitewash
a decliner's contribution because it received a minor automatic edit
or spelling correction would be acceptable, but we would certainly
have to make *some* effort to identify the top "automatic modifiers".

Bye
Frederik

-- 
Frederik Ramm  ##  eMail frederik at remote.org  ##  N49°00'09" E008°23'33"


