[Rebuild] Rebuild plan (followups to rebuild list, please)
tom at compton.nu
Thu Mar 22 14:43:44 GMT 2012
On 22/03/12 14:07, Dermot McNally wrote:
> It would be nice to be able to claim that it was gratifying to see
> such a sudden surge of interest in a topic for which it has, until
> now, been difficult to drum up much enthusiasm. Those who have
> participated in the process of getting us to the point where we have a
> plan and an emerging toolset - they deserve our thanks and they have
I made a deliberate decision not to get involved with the details of
designing the migration algorithm etc as I felt I already had enough on
my plate.
I assumed that once you had worked out what needed to be done and had
the necessary code you would contact me and other people involved in the
operational side of things to discuss planning the actual operational
side of making the change, scheduling any downtime etc.
> "The plan should be postponed until after April 1st"
> To this I will simply state that deadlines are a Good Thing when you
> are trying to get something done. Until we have completed this task it
> is good that we should work to some deadlines even if they have to
> evolve in the light of circumstances. If a safe rebuild or a portion
> of it really has to slip beyond 1st April then that will have to
> happen. There is, however, no virtue in ensuring that we slip by a
> token few days just to prove that the world will not end. But be
> assured that the plan is a living document that will not ignore
> emerging realities.
As I understood it 1st April was not a deadline. It was, like the other
working group goals, "a challenging aspirational target" or some such.
As such it has, as Frederick observed, obviously helped to get things
moving but I certainly did not, and do not, see it as a hard deadline
that must be met to avoid the world catching fire.
> As this is a fast moving process, the plan does not yet reflect the
> fact that we also hope to commission the new database server and
> install a full API database. The redaction process will then also be
> commenced on this box (we have a choice whether to test the offline or
> online redaction), something that will give us the fairest benchmark
> (and the most random distribution of test cases) possible. Even during
> the running of this full planet test it will be possible to view and
> validate the decisions being made.
Again, you're talking about doing this in the next few days and you
haven't even spoken to anybody on the ops side about this.
For example, if you want a full database loaded on the new server for
testing this weekend then I probably need to start loading it right now.
> "Downtime should have more notice"
> Yes, it should. Maybe we will manage to shorten the length of it
> and/or move it to a more acceptable time. There are not many of us and
> we are under pressure.
So delay it a bit and give yourselves more time, then the pressure will
be less. It seems that the only driving force behind the short notice is
a desire to be seen to have completed the job by 1st April.
> This mail is long already, so apologies if I have missed something
> important. I look forward to seeing a lot of you on rebuild@ and we
> can identify any gaps together.
Well you haven't responded to my concerns about the fact that the
proposed API changes have not yet been presented for review which makes
me very sceptical about the chances of being able to deploy them next week.
Tom Hughes (tom at compton.nu)