[OHM] server offline - investigating

Tim Waters chippy2005 at gmail.com
Wed Aug 10 15:39:44 UTC 2016


Hi folks,

A bit of an update: good news.

We got into the server and booted it in rescue mode, and this Monday
Sanjay successfully got the good drive mountable.
One of the drives in the RAID array had indeed broken, but it broke in
a non-standard way: it looked like the system had removed the good
drive rather than the failed one, which complicated matters
somewhat!

Rob has backed up the Postgres data directory, hooked it up on his
server, and is making a dump of the API / website database.
The database also holds the data for the tiles.
Rob will try to produce a recent planet file from this database.

We've got backups of various configuration files, tiles and past planets too.
We've also got a backup of Overpass.

Our next step is to mount the home partition to back up data from there.

I'll try to keep you all updated as we go.

Where we go from there we will have to see.

I imagine it will be either fixing the server (getting a new drive and
recreating the RAID array) or resurrecting the services on a healthier
server.
We may try to boot the poorly server from the good drive and put the
API into read-only mode, if it's not too complicated.
I believe Topomancy will no longer be able to host OHM after the
summer, so it could be best to move everything initially.

Regards,

Tim

On 5 August 2016 at 14:51, Richard Welty <rwelty at averillpark.net> wrote:
> On 8/5/16 9:43 AM, Tim Waters wrote:
>> It's software RAID1 I think, two drives each with their own copy of
>> the data. It's a common Hetzner setup.
>>
>> We did have some email help from a friendly admin who had a similar
>> issue a while back - I imagine that this kind of server admin
>> knowledge is hard to find and we've not progressed any more past this
>> help though. I have some time early next week to follow through the
>> steps and see what happens in case we don't get any more help.
>>
>>
> normally if a disk fails in a truly redundant RAID setup (hint, RAID 0
> is not actually redundant), you replace with a similar disk and rebuild,
> which in RAID 1 is simply copying the data from the good (old) drive
> to the good (new) drive. i've never done the procedure for Linux
> software raid, i've only done this stuff with a couple of different
> commercial hardware RAID controllers, where the rebuild is frequently
> automagic.
>
> richard
>
> --
> rwelty at averillpark.net
>  Averill Park Networking - GIS & IT Consulting
>  OpenStreetMap - PostgreSQL - Linux
>  Java - Web Applications - Search
>
>


