[Rebuild] Do I win a prize if I am the first to post?

Frederik Ramm frederik at remote.org
Wed Jan 11 00:05:47 GMT 2012


Hello everyone,

    let me open up discussions on this list with a few thoughts.

I do not have solutions for all this, I just want to make sure we're "on 
the same page", see the same problems.

When we rebuild the database, there will be some objects edited only by 
people accepting the license and some edited only by people who don't. 
These are both easy - the latter are simply left out, and the former 
will pass unchanged into the new database, unless someone suggests that 
while we're at it we make some other change like re-numbering everything 
to make the ID space more dense or so.

Then there are objects edited by both agreers and decliners.

One could think "let's just keep those versions done by agreers, and 
drop those by decliners, and let's make a new version of all objects 
that contains only the content not added by decliners."

This would lead to a situation where some versions are missing. Parts of 
our Rails code might have to be hardened against that - it is possible 
that somewhere we have code that just counts versions from 1 to n. Also 
it is possible that client software out in the wild has such problems, 
and if we decide to go this way it would be good to offer something like 
relicensing.dev.openstreetmap.org with such a "database with holes" so 
that clients can be tested against that.
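
To illustrate the kind of assumption that would break, here is a minimal 
Python sketch. It is not taken from the Rails port; the `history` 
structure and both function names are made up for illustration. It 
contrasts a loop that counts versions from 1 to n with one that only 
visits the versions that actually exist.

```python
# Hypothetical in-memory history of one object: version number -> tags.
# Version 2 is missing, as it would be in a "database with holes".
history = {
    1: {"highway": "residential"},
    3: {"highway": "residential", "oneway": "yes"},
    4: {"highway": "residential", "oneway": "yes"},
}


def versions_naive(history):
    """Assumes versions run 1..n without gaps; raises KeyError on a hole."""
    latest = max(history)
    return [history[v] for v in range(1, latest + 1)]


def versions_robust(history):
    """Visits only the version numbers that exist, in ascending order."""
    return [history[v] for v in sorted(history)]


if __name__ == "__main__":
    try:
        versions_naive(history)
    except KeyError as missing:
        print("naive loop failed on missing version", missing)
    print(len(versions_robust(history)), "versions returned by robust loop")
```

Client software that behaves like `versions_naive` is what a test server 
with holes would be meant to flush out.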

Then there is the issue that data by decliners might affect more than 
the current version, e.g.

Version 1 of way created by woodpeck
Version 2: John Smith adds "name=Blah Road"
Version 3: woodpeck adds "oneway=yes"
Version 4: our rebuild script removes the name tag

We would now delete version 2 from our database, so only 1,3,4 are kept. 
But what happens to the "name=Blah Road" tag that is still present in 
version 3?

We can either remove that tag from version 3, thereby falsifying history 
(making it look like the tag was never there) - probably a bad idea.

Or we can remove all versions that contain any information contributed 
by non-agreers, which might be a lot, and we would lose a lot of history 
along the way.
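
To get a feeling for how many versions that might affect, one could 
trace for each tag who first introduced it and then list every version 
that still carries a decliner's tag. The following Python sketch does 
that for the example above. It is a simplification under stated 
assumptions: it only looks at tags (not geometry or way nodes), it 
treats a tag as a key/value pair, the `highway=residential` tag is 
invented to give version 1 some content, and the function name is 
hypothetical.

```python
def tainted_versions(history, decliners):
    """Return the version numbers that contain any tag introduced by a decliner.

    `history` is a list of (version, user, tags) tuples in ascending
    version order. A tag keeps its original author for as long as it is
    carried over unchanged from one version to the next.
    """
    origin = {}  # (key, value) -> user who introduced it
    tainted = []
    for version, user, tags in history:
        # Tags already present in the previous version keep their author;
        # anything new or changed is attributed to the current editor.
        origin = {item: origin.get(item, user) for item in tags.items()}
        if any(author in decliners for author in origin.values()):
            tainted.append(version)
    return tainted


# The example from above, before the rebuild script has run.
history = [
    (1, "woodpeck", {"highway": "residential"}),
    (2, "John Smith", {"highway": "residential", "name": "Blah Road"}),
    (3, "woodpeck", {"highway": "residential", "name": "Blah Road",
                     "oneway": "yes"}),
]

if __name__ == "__main__":
    print(tainted_versions(history, decliners={"John Smith"}))  # [2, 3]
```

Version 3 is flagged even though woodpeck made the edit, because it 
still contains the name tag that John Smith added.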

Another option is dropping the whole history for everything now and 
starting with a clean database where version 1 (or version n) is the 
current version and no other versions exist. (We could keep a read-only 
version of the last CC-BY-SA database with full Rails port functions on 
a simple server somehow, doesn't matter if it's slow - just so that 
people can still access history if they want, but that would all be 
under CC-BY-SA.)

Or we could opt for a limited keeping of history whereby every object 
with more than one historic version is reduced to having exactly two 
versions - v1 is the very first, and v2 is the current one, and 
everything in between is removed.
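
As a rough sketch of what that reduction would mean per object (Python, 
with a made-up list-of-dicts representation and a hypothetical function 
name, not anything from the actual schema):

```python
def collapse_history(versions):
    """Keep only the first and the current version, renumbered as 1 and 2.

    `versions` is a list of dicts sorted by ascending "version". Objects
    with a single version are returned as they are.
    """
    if len(versions) <= 1:
        return [dict(v) for v in versions]
    first, current = versions[0], versions[-1]
    # Copy the dicts so the caller's data is not modified in place.
    return [{**first, "version": 1}, {**current, "version": 2}]


if __name__ == "__main__":
    way = [
        {"version": 1, "tags": {}},
        {"version": 2, "tags": {"name": "Blah Road"}},
        {"version": 3, "tags": {"name": "Blah Road", "oneway": "yes"}},
        {"version": 4, "tags": {"oneway": "yes"}},
    ]
    for v in collapse_history(way):
        print(v)
```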

When I talk of "removing", this does not necessarily mean deleting; 
we could also implement functionality a bit like the "visible" flag that 
would allow us to make sure that while all versions are still in the 
database, we would only ever return those not flagged. Then we'd flag 
everything that contains a scrap of CC-BY-SA-only contribution, and all 
normal history calls etc. would refuse to return those objects. (If 
built in a more general way, such a flagging mechanism could also help 
us to weed out license-violating stuff in the future.)
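
A minimal sketch of how such a flag could behave, again in Python and 
purely illustrative: the `redacted` field, the `RedactedError` exception 
and the two functions are my own names for this example, not existing 
API code.

```python
from dataclasses import dataclass, field


class RedactedError(Exception):
    """Raised when a flagged version is requested through the normal API."""


@dataclass
class ObjectVersion:
    version: int
    tags: dict = field(default_factory=dict)
    redacted: bool = False  # hypothetical flag, analogous to "visible"


def get_history(versions):
    """Normal history call: all versions stay stored, flagged ones are hidden."""
    return [v for v in versions if not v.redacted]


def get_version(versions, number):
    """Normal single-version call: refuses to return a flagged version."""
    for v in versions:
        if v.version == number:
            if v.redacted:
                raise RedactedError(f"version {number} is not available")
            return v
    raise LookupError(f"version {number} does not exist")


if __name__ == "__main__":
    way = [
        ObjectVersion(1),
        ObjectVersion(2, {"name": "Blah Road"}, redacted=True),
        ObjectVersion(3, {"name": "Blah Road", "oneway": "yes"}, redacted=True),
        ObjectVersion(4, {"oneway": "yes"}),
    ]
    print([v.version for v in get_history(way)])  # [1, 4]
```

Nothing is deleted here; the flag only decides what the normal calls 
hand out, which is what would make the mechanism reusable later.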

I assume that, since OSMF is the compiler and publisher of this 
database, we would be exempt from the ODbL rule that "if you combine 
ODbL and non-ODbL content in a database you have to release the derived 
database", so it would be ok for us to keep both.

We could perhaps even offer a special API call, like

http://ccbysa.openstreetmap.org/api/0.6/node/1234/6

that would allow users to access individual old versions under CC-BY-SA 
where these are not returned by the normal API because of relicensing 
problems. That special API would then issue a huge XML comment saying 
the following is CC-BY-SA, and would of course not be available for 
anything that is created post-changeover.
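
In code, the behaviour I have in mind for that special API would be 
roughly the following. This is a Python sketch under assumptions: the 
function name, the wording of the notice and the way the changeover 
time is passed in are all invented for this example, and the date used 
below is an arbitrary placeholder.

```python
from datetime import datetime, timezone
from typing import Optional

# Hypothetical wording; the real notice would need to be decided.
LICENSE_NOTICE = "<!-- The following data is licensed under CC-BY-SA -->"


def ccbysa_response(version_xml: str,
                    version_timestamp: datetime,
                    changeover: datetime) -> Optional[str]:
    """Return the old version prefixed with a license comment.

    Versions created at or after the changeover are not served by this
    API at all, so None is returned for them.
    """
    if version_timestamp >= changeover:
        return None
    return f"{LICENSE_NOTICE}\n{version_xml}"


if __name__ == "__main__":
    # Arbitrary placeholder, not a real changeover date.
    changeover = datetime(2013, 1, 1, tzinfo=timezone.utc)
    old = datetime(2010, 5, 17, tzinfo=timezone.utc)
    xml = '<osm version="0.6"><node id="1234" version="6"/></osm>'
    print(ccbysa_response(xml, old, changeover))
```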

My final idea is a slightly outlandish variant of the above but even 
easier: Simply make the new API return *no* pre-changeover versions at 
all, and keep all the pre-changeover versions in a special CC-BY-SA-only 
API.

I guess that's enough for a first post.

Bye
Frederik

-- 
Frederik Ramm  ##  eMail frederik at remote.org  ##  N49°00'09" E008°23'33"


