[OHM] server offline - investigating

Wed Aug 10 17:51:51 UTC 2016

I'm curious: why don't we move to a cloud hosting provider, such as Heroku,
DigitalOcean, or AWS? Forgive me if I dozed off during that part of the
discussion about reviving the server.

Cheers!

On Wed, Aug 10, 2016 at 9:39 AM, Tim Waters <chippy2005 at gmail.com> wrote:

> Hi folks,
>
> bit of an update: good news.
>
> We got into the server and booted it in rescue mode, and this Monday
> Sanjay successfully got the good drive mountable.
> One of the drives in the RAID array had indeed broken, but it broke in
> a non-standard way: it looked like the system removed the good drive
> rather than the failed one, which complicated matters somewhat!
>
> Rob has backed up the PostgreSQL data directory, hooked it up on his
> server, and is making a dump of the API / website database.
> The database also holds the data for the tiles.
> Rob will try to produce a recent planet file from this database.
>
> We've got backups of various configuration files, tiles and past planets
> too.
> We've got a backup of Overpass as well.
>
> Our next step is to mount the home partition and back up the data
> there.
>
> I'll try to keep you all updated as we go.
>
> Where we go from there we will have to see.
>
> I imagine either fixing the server, getting a new drive and recreating
> the RAID array, or resurrecting the services on a healthier server.
> We may try to boot the poorly server up on the good drive, and put the
> API into read-only mode if it's not too complicated.
> I believe Topomancy is no longer able to continue to host OHM after
> the summer, so it could be best to move everything initially.
>
> Regards,
>
> Tim
>
> On 5 August 2016 at 14:51, Richard Welty <rwelty at averillpark.net> wrote:
> > On 8/5/16 9:43 AM, Tim Waters wrote:
> >> It's software RAID 1, I think: two drives, each with its own copy of
> >> the data. It's a common Hetzner setup.
> >>
> >> We did have some email help from a friendly admin who had a similar
> >> issue a while back. I imagine this kind of server admin knowledge is
> >> hard to find, though, and we've not progressed any further than that
> >> help. I have some time early next week to follow through the
> >> steps and see what happens in case we don't get any more help.
> >>
> >>
> > normally if a disk fails in a truly redundant RAID setup (hint, RAID 0
> > is not actually redundant), you replace with a similar disk and rebuild,
> > which in RAID 1 is simply copying the data from the good (old) drive
> > to the good (new) drive. i've never done the procedure for Linux
> > software raid, i've only done this stuff with a couple of different
> > commercial hardware RAID controllers, where the rebuild is frequently
> > automagic.
> >
> > richard
> >
> > --
> > rwelty at averillpark.net
> >  Averill Park Networking - GIS & IT Consulting
> >  OpenStreetMap - PostgreSQL - Linux
> >  Java - Web Applications - Search
> >
> >
>
> _______________________________________________
> Historic mailing list
> Historic at openstreetmap.org
> https://lists.openstreetmap.org/listinfo/historic
>

-- 
Tod Robbins
Digital Asset Manager, MLIS
todrobbins.com | @todrobbins <http://www.twitter.com/#!/todrobbins>