[Rebuild] Communication to data consumers wrt the licence change (draft)

Dermot McNally dermot at osmfoundation.org
Fri Mar 23 15:17:52 GMT 2012


On 23 Mar 2012, at 13:50, Frederik Ramm wrote:

> I would still like to find out who exactly the "we" in this is. Because I surely am not, and I am not aware of any discussion within the rebuild group where we said "oh, let's change our minds".

I'll try to give a good answer to this. In an ad hoc rebuild group with volunteers, no official responsibilities and differing opinions it's not so easy.

First let's recap on some of the players here, in no particular order (there are loads more, but I'm focussing on those who might be implied under the "we" above):

Matt: Is writing the bulk of the code for this based on the in-place model that has been productively discussed and found enough fans to become The Best Option. Is on the board.

Frederik: Did most of the thinking and coding behind the WTFE that crystallised people's expectations of what we mean by clean or dirty. Provided corresponding tools, went on to contribute code to Matt's efforts, plenty of other really good stuff.

Simon: Strongly engaged in entire process towards licence change, participated in LWG over a long period, created tools to help in the cleaning process and took ownership of the communications to data consumers.

RichardF: All round good guy and ODbL advocate, hasn't spoken a lot about the rebuild but is in this list because he contributed code to Matt's effort.

Me: LWG member, contributed some code to Matt's effort, volunteered to try and improve communications by documenting and communicating the plan to get from here to a completed rebuild.

LWG: Stated position once LWG was happy that all policy (especially legally significant) decisions were clear was to get out of the way and let The Community JFDI the rebuild, remaining on hand for queries on such matters as edge cases for redaction logic.

Board: Wants to see this all concluded, has been strongly criticised for dragging it out so long, to this end set an April 1st date late last year to bring things to a head. Which it has done. Oh, and I'm a member of it too.

Sysadmins: Will be needed to help get this stuff done, it has already been flagged that they may not have had as early sight of stuff as they would have wished, something for which I am personally sorry. In typically incestuous OSM fashion, Matt is, of course, not terribly far away...

Members of rebuild@ mailing list, many of whom also participated in a planning conference call: You know who you are, great that you are taking an interest, don't be strangers!

You address the issue of decision making. So I have been documenting a bunch of stuff, some of which comes as news to many people. It's worth mentioning that I'm not making it up as I go along, nor am I attempting to apply personal preferences. The first draft of the document, before it went live, was reviewed by Frederik and I made many changes to it based on the valuable feedback he provided. Matt subsequently provided a lot of steering and continues to do so, the structure of the document having from the start been based on my improving understanding of his code. As such, in as much as anybody can be seen to be influencing the direction of the document I maintain it is Matt. Likewise, it is surely this document that is setting Simon's interpretation of The Plan.

So the "we" in Simon's words is either me and Matt or just Matt, in as much as I defer to his judgement on matters relating to his code and how best to deploy it. This is a crude form of the rules of JFDI, but I'll come back to this...

> We all thought that a soft cut over would make sense when we had our telephone call. Why the sudden change? Everyone I spoke to thinks that the soft cut over is prudent, easier to monitor, better for avoiding mistakes, and it is clear that it is less stressful.

Let's get the easy bits over with first - yes, it's sudden.

Prudent? Yeah, maybe. Let's get "easier to monitor" in there too. I was, of course, an early fan of the soft version (I had a slightly different model in mind, but for our purposes here the difference isn't significant). Prudence for me is a question of avoiding risk. Redact a running database, even in a reversible process like ours, and you run certain risks that will not arise if you do it offline instead. I'm thinking here mostly of cases where an object is redacted, a mapper notices and changes it, then it is determined that the redaction logic was flawed. In the live model, you have a more difficult time reprocessing the object than you would have had offline (this assumes, of course, that you can spot your mistake before re-enabling writes). All this said, this reasoning was _not_ why Matt and I turned towards an offline process.

Rather Matt reached the conclusion that to have a sufficiently performant rebuild process, the offline approach, which is by definition faster than the live one, would be a better bet. I'm paraphrasing, but Matt can correct me if I misrepresent him. He made a technical call based on his own code.

> I am relatively sure that the "hard cut over" is going to be a major cock-up and if I cannot prevent it, then at the very least I want to be able to point the finger at someone afterwards and say: This person/these people have decided that we will not go the soft route.

How noble. It will be somebody who stuck their neck out anyway. Far too few people have done so in this process, and you may have offered an insight into the reason why.

Do please let's confront one fact here: the reason I'm spending my time maintaining the plan document and answering emails like this (so those doing the real work don't have to) is to provide as much transparency as possible in a fast moving process with too few people in it. There hasn't been secrecy in the "decision" to favour an offline rebuild. Just as soon as Matt and I had discussed the issues involved in an offline update and he had adopted it as his preferred approach, I updated the plan accordingly. The plan is a living document that exists for the rebuild team in its entirety. In as much as it has problems it has the problem that too few people are discussing it.

That we are now discussing it is good. I'd prefer that we not do so under a cloud of conspiracy theories and threats of retribution.

I'll reiterate what I tried to say on the rebuild list a day or so ago - yes, Matt and I discussed and documented a process involving a hard cutover. We have not burnt any boats in this regard. We are as a rebuild team free to discuss objective pros and cons of each approach. Matt has indicated that he sees no reason not to develop the software tools necessary for either rebuild model in order to give ourselves options. I won't lecture you (Frederik) on the evils of criticising from the sidelines given that you have been among the most engaged in the entire process including rebuild. But there are others who I feel have been overly critical while insufficiently engaged.

Let us return to an assumption that other mappers are acting in good faith. We have a documented plan. It may not be perfect. It is not immune from either criticism or improvement. So far we have had a lot of the former. Before we hit the point of no return we will have discovered more about our options. (/me notices copious use of the word "we" in this para, I think and hope it works in the most inclusive interpretation). I'm talking in particular about the weekend tests. If a possible course of action can be seen to be foolish I don't think there is any one of us who will insist on following it anyway.

On a side note, I'd like to invite you all to join me in publicly wishing Matt a very happy birthday. He is working his arse off to make the rebuild a success and it's worth remembering that we are all good-hearted human beings who like each other and who strive towards a broadly common goal.

For he's a jolly good fellow!
