[Rebuild] Do I win a prize if I am the first to post?
frederik at remote.org
Wed Jan 11 00:05:47 GMT 2012
Let me open up discussions on this list with a few thoughts.
I do not have solutions for all this, I just want to make sure we're "on
the same page" and see the same problems.
When we rebuild the database, there will be some objects edited only by
people accepting the license and some edited only by people who don't.
These are both easy - they will pass unchanged into the new database,
unless someone suggests that while we're at it we make some other change
like re-numbering everything to make the ID space more dense or so.
Then there are objects edited by both agreers and decliners.
One could think "let's just keep those versions done by agreers, and
drop those by decliners, and let's make a new version of all objects
that contains only the content not added by decliners."
This would lead to a situation where some versions are missing. Parts of
our Rails code might have to be hardened against that - it is possible
that somewhere we have code that just counts versions from 1 to n. Also
it is possible that client software out in the wild has such problems,
and if we decide to go this way it would be good to offer something like
relicensing.dev.openstreetmap.org with such a "database with holes" so
that clients can be tested against that.
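To illustrate the kind of assumption I mean, here is a minimal Python sketch (not taken from the Rails port or any real client; the function names and the dict layout are made up for the example). It contrasts code that counts versions from 1 to n with code that only walks the versions that actually exist:

```python
def versions_naive(history):
    """Fragile: assumes version numbers run 1..n with no gaps."""
    n = max(history)
    # Raises KeyError as soon as one version number is missing.
    return [history[v] for v in range(1, n + 1)]


def versions_robust(history):
    """Walks only the version numbers that are actually present."""
    return [history[v] for v in sorted(history)]


# Hypothetical history of one way after version 2 has been dropped.
history_with_hole = {1: "v1 data", 3: "v3 data", 4: "v4 data"}

print(versions_robust(history_with_hole))
```

A test server with a "database with holes" would flush out the first kind of code quickly.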
Then there is the issue that data by decliners might affect more than
the current version, e.g.
Version 1 of way created by woodpeck
Version 2: John Smith adds "name=Blah Road"
Version 3: woodpeck adds "oneway=yes"
Version 4: our rebuild script removes the name tag
We would now delete version 2 from our database, so only 1,3,4 are kept.
But what happens to the "name=Blah Road" tag that is still present in
version 3?
We can either remove that tag from version 3, thereby falsifying history
(making it look like the tag was never there) - probably a bad idea.
Or we can remove all versions that contain any information contributed
by non-agreers, which might be a lot, and we would lose a lot of history
along the way.
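To make the "how far does a decliner's contribution reach" question concrete, here is a rough Python sketch. It only looks at tags (no geometry, no way membership), the Version class and function name are invented for the example, and the highway tag on version 1 is an assumption since the example above does not say what version 1 contains:

```python
from dataclasses import dataclass


@dataclass
class Version:
    number: int
    user: str
    tags: dict


def tainted_versions(history, decliners):
    """Return numbers of versions carrying a tag value set by a decliner."""
    tainted = {}   # tag key -> value, as last set by a decliner
    previous = {}  # tags of the preceding version
    result = []
    for v in history:
        for key, value in v.tags.items():
            if previous.get(key) != value:   # added or changed here
                if v.user in decliners:
                    tainted[key] = value
                else:
                    tainted.pop(key, None)   # overwritten by an agreer
        for key in list(tainted):
            if key not in v.tags:            # tag removed in this version
                del tainted[key]
        if tainted:
            result.append(v.number)
        previous = v.tags
    return result


history = [
    Version(1, "woodpeck", {"highway": "residential"}),
    Version(2, "John Smith", {"highway": "residential", "name": "Blah Road"}),
    Version(3, "woodpeck", {"highway": "residential", "name": "Blah Road",
                            "oneway": "yes"}),
    Version(4, "rebuild script", {"highway": "residential", "oneway": "yes"}),
]

print(tainted_versions(history, decliners={"John Smith"}))
```

For the example above this reports versions 2 and 3, i.e. the version by the agreer is affected too, which is exactly the problem.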
Another option is dropping the whole history for everything now, and
starting with a clean database where version 1 (or version n) is the
current version and no other versions exist. (We could keep a read-only
version of the last CC-BY-SA database with full Rails port functions on
a simple server somehow, doesn't matter if it's slow - just so that
people can still access history if they want, but that would all be
Or we could opt for a limited keeping of history whereby every object
with more than one historic version is reduced to having exactly two
versions - v1 is the very first, and v2 is the current one, and
everything in between is removed.
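As a sketch of that reduction (Python, function name made up; whether the surviving versions keep their old numbers or get renumbered to 1 and 2 is left open here, the sketch simply keeps the old numbers):

```python
def first_and_current(version_numbers):
    """Keep only the first and the current version of an object."""
    ordered = sorted(version_numbers)
    if len(ordered) <= 1:
        return ordered                  # nothing to reduce
    return [ordered[0], ordered[-1]]    # drop everything in between


print(first_and_current([1, 2, 3, 4, 5]))
```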
When I talk of "removing", this does not necessarily mean actual deletion;
we could also implement functionality a bit like the "visible" flag that
would allow us to make sure that while all versions are still in the
database, we would only ever return those not flagged. Then we'd flag
everything that contains a scrap of CC-BY-SA-only contribution, and all
normal history calls etc. would refuse to return those objects. (If
built in a more general way, such a flagging mechanism could also help
us to weed out license-violating stuff in the future.)
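A minimal sketch of such a flag, in Python rather than in the actual Rails models. The field name "redacted" and both function names are invented for illustration; nothing like this exists in the schema as far as this post is concerned:

```python
from dataclasses import dataclass


@dataclass
class StoredVersion:
    number: int
    tags: dict
    redacted: bool = False   # hypothetical flag, analogous to "visible"


def flag_versions(stored, numbers_to_flag):
    """Mark versions as redacted without deleting them from storage."""
    for v in stored:
        if v.number in numbers_to_flag:
            v.redacted = True


def public_history(stored):
    """What a normal history call would return: unflagged versions only."""
    return [v for v in stored if not v.redacted]


stored = [
    StoredVersion(1, {"highway": "residential"}),
    StoredVersion(2, {"highway": "residential", "name": "Blah Road"}),
    StoredVersion(3, {"highway": "residential", "oneway": "yes"}),
]
flag_versions(stored, {2})
print([v.number for v in public_history(stored)])
```

All versions stay in the database; only the read path changes, so the same mechanism could be reused for other reasons to hide a version.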
I assume that, since OSMF is the compiler and publisher of this
database, we would be exempt from the ODbL rule that "if you combine
ODbL and non-ODbL content in a database you have to release the derived
database", so it would be ok for us to keep both.
We could perhaps even offer a special API call, like
that would allow users to access individual old versions under CC-BY-SA
where these are not returned by the normal API because of relicensing
problems. That special API would then issue a huge XML comment saying
the following is CC-BY-SA, and would of course not be available for
anything that is created post-changeover.
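Roughly, the response side of such a call could look like the following Python sketch. The function name, the wording of the notice and the changeover date are placeholders, and the XML is only a simplified imitation of API output:

```python
from datetime import datetime, timezone
import xml.etree.ElementTree as ET

# Placeholder wording for the license comment.
LICENSE_NOTICE = " The following data is CC-BY-SA only. "


def legacy_version_xml(way_id, version, tags, timestamp, changeover):
    """Serialize one old version, prefixed with a license comment."""
    if timestamp >= changeover:
        # Nothing created after the changeover is served by this call.
        raise ValueError("post-changeover versions are not available here")
    root = ET.Element("osm")
    root.append(ET.Comment(LICENSE_NOTICE))
    way = ET.SubElement(root, "way", id=str(way_id), version=str(version))
    for key, value in sorted(tags.items()):
        ET.SubElement(way, "tag", k=key, v=value)
    return ET.tostring(root, encoding="unicode")


# The changeover date below is a made-up placeholder for the example.
changeover = datetime(2012, 4, 1, tzinfo=timezone.utc)
edited = datetime(2010, 6, 15, tzinfo=timezone.utc)
print(legacy_version_xml(1234, 2, {"name": "Blah Road"}, edited, changeover))
```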
My final idea is a slightly outlandish variant of the above but even
easier: Simply make the new API return *no* pre-changeover versions at
all, and keep all the pre-changeover versions in a special CC-BY-SA-only
I guess that's enough for a first post.
Frederik Ramm ## eMail frederik at remote.org ## N49°00'09" E008°23'33"