[OSM-legal-talk] decision removing data
Frederik Ramm
frederik at remote.org
Wed Aug 4 14:40:37 BST 2010
Hi,
80n wrote:
> This quickly gets quite complex when factored across multiple
> generations of way splits.
You're right, let's just ignore way splits altogether then ;)
> Changesets are a relatively recent invention. Edits prior to the
> introduction of changesets don't have any formal grouping so this
> approach will not work for old data.
When changesets were introduced, they were synthesized for all old
editing sessions according to a reasonably clever scheme that should
indeed catch any split ways.
> Even older data that was converted from segments will have no history at
> all because it was discarded.
The amount of data that existed back then was relatively small. It is
trivial to find out the list of contributors for each way that still
exists today. (The database still lives on a backup.) Again, there may
be fringe cases that get overlooked which we will then fix if someone
complains, but on the whole I think this is manageable.
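To illustrate why listing contributors per surviving way is straightforward, here is a minimal sketch. It assumes the history can be read as rows of (way_id, version, username, visible); that row layout and both function names are hypothetical, since the actual schema of the backup is not described in this thread.

```python
from collections import defaultdict


def contributors_per_way(history_rows):
    """Map each way that still exists to the set of users who edited it.

    history_rows: iterable of (way_id, version, username, visible) tuples.
    This layout is an assumption made for illustration only.
    """
    editors = defaultdict(set)
    latest = {}  # way_id -> (version, visible) of the newest row seen so far
    for way_id, version, username, visible in history_rows:
        editors[way_id].add(username)
        if way_id not in latest or version > latest[way_id][0]:
            latest[way_id] = (version, visible)
    # Keep only ways whose newest version is still visible (not deleted).
    return {w: names for w, names in editors.items() if latest[w][1]}


def ways_needing_review(contributors, agreed_users):
    """Return the ways touched by at least one user who has not agreed,
    together with the names of those users."""
    return {w: names - agreed_users
            for w, names in contributors.items()
            if names - agreed_users}
```

Because usernames rather than IDs were recorded back then, a lookup like this would be keyed on names and would miss contributors who later renamed themselves.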
(The thing I'm more concerned about is what happened when users changed
their names; back then we only recorded names, not IDs, so it might be
difficult to trace back a contributor through name changes. But again,
this is something where we should not throw the baby out with the
bathwater - if we, despite best efforts, overlook someone's changes
because they have changed their username, and accidentally keep what we
shouldn't have kept, let them complain and then we'll fix it.)
>> Such auto-detection could be limited to areas where we have recorded
>> contributions that are not being relicensed; in all other areas we
>> would not have to bother.
>
> Prolific editors don't tend to restrict their activity to a single
> location. This might be more widespread than anticipated.
Prolific editors also tend not to leave the project in a huff.
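The idea quoted above, of limiting auto-detection to areas that contain non-relicensed contributions, could be sketched roughly as follows. The tile size, the (lat, lon, username) node layout and all function names are assumptions made for illustration; nothing in this thread specifies how such areas would actually be delimited.

```python
import math

TILE_DEG = 0.01  # assumed tile size in degrees; purely illustrative


def tile_of(lat, lon, size=TILE_DEG):
    """Return the grid tile (row, column) that contains a coordinate."""
    return (math.floor(lat / size), math.floor(lon / size))


def affected_tiles(nodes, agreed_users):
    """Collect the tiles containing at least one node from a user who has
    not agreed to relicense.

    nodes: iterable of (lat, lon, username) tuples (hypothetical layout).
    """
    return {tile_of(lat, lon)
            for lat, lon, user in nodes
            if user not in agreed_users}


def needs_detection(lat, lon, tiles):
    """True if an object at this position lies in an affected tile."""
    return tile_of(lat, lon) in tiles
```

Anything outside the affected tiles would be skipped entirely. How much work that saves depends on how widely the non-agreeing contributors have edited, which is exactly the point raised in the quoted objection.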
>> Any such mechanism, in my eyes, need not be 100% perfect; it is
>> sufficient to make an honest attempt at doing the right thing, and if
>> a few things slip through, then fix them in case of complaints.
>
> Anyone who cares strongly enough to not want to relicense their work
> will probably make a lot of complaints if their work is not fully
> purged. This could generate a very large amount of manual remediation.
I think we're already doing a *lot* in respecting their wish, planning
to use a huge amount of manpower to actually purge their data. You know
that there are voices who say let's just relicense everything and ignore
these people - we won't do that because we think that if they want their
data removed, even if it hurts the project, it is prudent to do it. We
don't even say "remove it yourself" (and you know that there are voices
who recommend that - simply declare that everyone who doesn't want their
data relicensed at date X should please remove it now). They just have
to say it and we will try to remove their contributions as well as we can.
I think it is safe to assume that those who "care strongly" will most
certainly not be silent, no matter how much diligence is invested on our
part. If all else fails, they will claim intellectual property on the
corner pub that has been placed where they drew the roads. For those who
"care strongly", leaving the project is probably painful, or sad, and in
many cases they will spend time or even money to make the process
painful for us as well, if only to prove that they were right in
forecasting trouble.
This will happen no matter how diligent we are in removing their
contribution. We might as well not make an effort at all; the amount of
ire will probably be roughly the same.
> If there is anything under development it would be good if we could see
> it. It is unlikely to be a trivial piece of code and I'd be very
> surprised if it can be developed by September 1st if it hasn't already
> been started.
I am not aware of a deadline like that. If I were the LWG, I would not
want to get into discussions about what exactly has to be removed in
which case before I know how many people agree to relicense their data.
> The whole relicensing effort would be a bit of a non-starter if this
> deletion process cannot be done.
I'm sure it can be done. I'm also pretty sure it can never be discussed
to everybody's satisfaction on this mailing list, so I'm all for
postponing that until we have the acceptance figures.
Bye
Frederik