Things you can do when RAID fails you

A few years back I worked for the subsidiary of a major Japanese electronics conglomerate. One of the subsidiary businesses makes hard disk drives. I recall that one of the things I needed to become familiar with was “mean time between failures” (MTBF), a number which, in layman's terms, indicates the reliability of a hard disk.

MTBF is often used as a marketing ploy, and the everyday person equates it very closely with reliability. While there is a connection, MTBF is just one of several components that make up true reliability in a disk drive.
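
To make the connection between MTBF and reliability concrete, here is a rough sketch in Python. It assumes a constant failure rate (the simple exponential model) and drives that fail independently of each other, neither of which holds exactly in practice. The 1,000,000-hour MTBF and the 8-drive array are made-up illustrative figures, not numbers from any vendor, and the function names are my own.

```python
import math

HOURS_PER_YEAR = 8760  # 365 days * 24 hours


def annualized_failure_rate(mtbf_hours: float) -> float:
    """Chance that a single drive fails within one year.

    Assumes a constant failure rate (exponential model), which is a
    simplification of how real drives age.
    """
    if mtbf_hours <= 0:
        raise ValueError("mtbf_hours must be positive")
    return 1.0 - math.exp(-HOURS_PER_YEAR / mtbf_hours)


def prob_any_drive_fails(afr: float, drives: int) -> float:
    """Chance that at least one of `drives` drives fails within one year.

    Assumes failures are independent, which ignores shared causes such
    as heat, vibration or power surges.
    """
    if drives < 0:
        raise ValueError("drives must be non-negative")
    return 1.0 - (1.0 - afr) ** drives


if __name__ == "__main__":
    # Hypothetical figures, chosen only for illustration.
    afr = annualized_failure_rate(1_000_000)
    print(f"Single drive, one year: {afr:.2%}")
    print(f"Any of 8 drives, one year: {prob_any_drive_fails(afr, 8):.2%}")
```

Under these assumptions a single drive has a bit under a 1% chance of failing in a year, while an 8-drive array has close to a 7% chance of seeing at least one failure. An impressive-sounding MTBF, in other words, still leaves plenty of room for drives to fail.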

Another often misrepresented storage term is RAID. I recently touched base with Bill Margeson, CEO, President and Co-Founder of CBL Data Recovery Technologies, on the subject of RAID, reliability and how companies in Asia plan their disaster recovery practices.

What follows is an interesting compilation of responses to often misconstrued ideas.

RAID is a technical term. What should business users understand / know about RAID?

“Despite built-in redundancy and active failover capability, RAIDs do fail.” Bill Margeson, CEO, President and Co-Founder of CBL Data Recovery Technologies

A Redundant Array of Independent Disks, or RAID, combines physical hard disks into a single logical unit, either by using unique hardware or software, thereby offering redundancy and increasing data availability. Despite manufacturers’ marketing material claiming 99.999% uptime, multiple-drive failures do happen.

RAID configurations are complex, and when a RAID system does fail, it can threaten a large amount of critical data and put a business at extreme risk. While today’s RAIDs are more reliable than their predecessors, RAID failures do happen and can be caused by component quality issues, firmware changes or by environmental factors such as heat, vibration, moisture and power surges. And, of course, we’d be remiss if we didn’t acknowledge that human error may sometimes be the cause of a RAID failure, and the subsequent data loss.