Ok some more techie guidance needed please

Rob Keeble

I have been reading but cannot find what I want to know.

I am planning to set up a raid drive in my new WHS 2011.

My plan is to use RAID 10, as I understand RAID 10 mirrors the drives.

So, the question: RAID 10 requires a minimum of 4 drives. Does this mean I need 8 to cater for the mirrors, or is it going to be a case of 2 and 2?

The RAID mirror is part of my backup plan. WHS also has a built-in facility to easily have drives mirror, so now I am confused as to whether I should run RAID at all.

WHS old version before 2011 had drive extender which allowed one to create a single contiguous drive. WHS 2011 does not.

I am thinking of using four 2TB WD Red drives for the WHS. The issue I am having is whether to buy 4 or 8. The motherboard I have chosen has only six SATA 3 (6 Gbit/s) ports on it for the RAID.

I plan to use one port for an SSD for the OS. I thought 4 drives would be enough, but now the RAID mirror thing has me snookered.

This is one of those issues where I'd rather trust one of our family techies than what I am not finding on the web. :) Sorry to put work stuff here but I've got no one else to turn to.

Thanks in advance for any help or advice.
 
Thanks Darren, appreciate the help. I read up some more in the interim and am now thinking I might add a SATA extender and use 1TB drives but increase the number to 8. I may also now look to mirror with no RAID striping.
My reasoning is that if I implement RAID 10 (or, as you got me to realise, RAID 1+0) with 4TB drives, it would then have single-drive fault tolerance, and my concern would be the time it would take to re-mirror while it's rebuilding.
I dunno, this now needs a bit more contemplating.

 
Shorter version of the Wikipedia explanation that I find people grok easier :D:
RAID 10 is stripes over mirrors, so it's like:
stripe(mirror(A+B), mirror(C+D))
so for 4 disks you get 2 disks' worth of space.
The reason it's not:
mirror(stripe(A+B), stripe(C+D))
is that the second configuration is substantially less reliable: in order to recover the full mirror you have to read the entirety of two disks instead of one, thus doubling your odds of total failure during the rebuild when the bad disk is replaced.
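
A small sketch of the two layouts above, to see which second disk failure is fatal once disk A has already died. The function names are made up for illustration, and it only models whole-disk failures (no rebuild timing or read errors):

```python
DISKS = ["A", "B", "C", "D"]
GROUPS = [{"A", "B"}, {"C", "D"}]  # the A+B and C+D groupings from the post


def raid10_alive(failed):
    # stripe(mirror(A+B), mirror(C+D)):
    # every mirror must still have at least one working disk.
    return all(group - failed for group in GROUPS)


def raid01_alive(failed):
    # mirror(stripe(A+B), stripe(C+D)):
    # at least one stripe must be completely untouched.
    return any(not (group & failed) for group in GROUPS)


def fatal_second_failures(alive, first):
    # Which remaining disks would kill the array if they failed after `first`?
    return [d for d in DISKS if d != first and not alive({first, d})]


print("RAID 10, A failed, fatal next:", fatal_second_failures(raid10_alive, "A"))
print("RAID 0+1, A failed, fatal next:", fatal_second_failures(raid01_alive, "A"))
```

With stripes over mirrors only the dead disk's mirror partner is fatal; with a mirror of stripes either disk in the other stripe is.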

Which brings me to my second point..
So... I did a lot of math on this a few years back and ended up making a large computer vendor buy a few thousand drives in addition to what they thought they could get away with supplying in order to meet the space+reliability spec.

The short version is that RAID 1 (and RAID 5, for that matter) are .. questionable .. as a backup solution (well, technically _any_ RAID is not a "backup" solution, it's a redundancy solution.. more later). The problem is basically that your odds of the surviving drive(s) failing while re-reading them are non-zero (exactly how non-zero is wildly subject to debate, but it's there). As disks get larger the problem has actually gotten worse because there are more bits to fail (and reliability per bit hasn't kept up).
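
To illustrate why bigger disks make this worse, here is one common back-of-envelope model. It is not the math referred to above, just a sketch. It assumes every bit read has an independent chance of an unrecoverable read error; the default of 1 in 1e14 bits is a figure often quoted on consumer drive spec sheets and is an assumption here. Real drives usually do much better, so treat the output as a trend, not a prediction:

```python
import math


def p_read_error(disk_tb, ure_per_bit=1e-14):
    """Chance of at least one unrecoverable read error while reading a whole disk.

    disk_tb      -- disk size in decimal terabytes
    ure_per_bit  -- assumed per-bit error probability (hypothetical default)
    """
    bits = disk_tb * 1e12 * 8
    # 1 - (1 - p)^bits, written with log1p/expm1 to stay accurate for tiny p
    return -math.expm1(bits * math.log1p(-ure_per_bit))


for size_tb in (1, 2, 4):
    print(f"{size_tb} TB full read: {p_read_error(size_tb):.1%} under this model")
```

The point is only that the number climbs as the disk gets larger, since a rebuild has to read every bit of the surviving disk.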

My rough math says there is a bit less than a 0.5% chance of a double failure in your setup; not huge, but it's there (the example with many drives was all RAID 0.. and hundreds of computers had to survive multi-hour runs.. muhahaha good luck).

If I had the choice (I know about -zero- about WHS 2011, or WHS pre-2011 for that matter, so this may or may not even be feasible; I'm so out of the Windows world :bliss: :crash:) and all other things being equal and I was extra paranoid about failure modes.. I would generally choose RAID 6 over RAID 10. You get the same space (4 x 2T disks == 4T of space) but through the Magic of parity you get the ability to survive if any two disks fail (you can survive 2 disks failing in RAID 10 as well, but they have to be one of 4 lucky pairs: A+C or A+D or B+C or B+D, but not the unlucky A+B or C+D, which take out both halves of one mirror). If you have > 4 disks then RAID 6 also offers space savings, as the "cost" is always 2 drives, leaving N-2 usable (N == number of drives), so for 5x2T drives you'd get 6T of space, 6x2T == 8T of space and so on. Now all things are not totally equal.. RAID 6 is ~75% of the speed of RAID 10 for writes and up to 50% slower for reads (this is assuming the RAID 10 implementation is smart enough to read from either side of the mirrors in a sane way, otherwise read performance approaches parity; and again the actual numbers vary wildly depending on the implementation and hardware/software used), and RAID 6 may well not be available (my 30s google search was inconclusive but not encouraging).
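
The space and failure-pair arithmetic can be checked with a few lines. This is a sketch that assumes the stripe(mirror(A+B), mirror(C+D)) layout from the earlier post and ignores filesystem overhead; `usable_tb` and `raid10_survives` are illustrative names, not part of any RAID tool:

```python
from itertools import combinations


def usable_tb(level, n_disks, disk_tb):
    """Usable space for the two levels discussed, in TB."""
    if level == "raid10":
        return (n_disks // 2) * disk_tb  # half the disks hold mirror copies
    if level == "raid6":
        return (n_disks - 2) * disk_tb   # two disks' worth of parity
    raise ValueError(f"unknown level: {level}")


def raid10_survives(failed, mirrors):
    # The array lives as long as no mirror has lost all of its disks.
    return all(set(mirror) - set(failed) for mirror in mirrors)


mirrors = [("A", "B"), ("C", "D")]
pairs = list(combinations("ABCD", 2))
lucky = [p for p in pairs if raid10_survives(p, mirrors)]

print("RAID 10 survives", len(lucky), "of", len(pairs), "two-disk failures:", lucky)
for n in (4, 5, 6):
    print(n, "x 2T in RAID 6 ->", usable_tb("raid6", n, 2), "T usable")
print("4 x 2T in RAID 10 ->", usable_tb("raid10", 4, 2), "T usable")
```

RAID 6 survives all six possible pairs by construction, which is the whole attraction.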

Finally on to backups vs redundancy...
Real-time data mirroring and striping solutions provide redundancy, not backup. The difference is that a backup can (hopefully, if done correctly) survive people and software errors. Redundancy, however, means that when you (or a virus or .. the cat jumping on the keyboard or ..) accidentally delete files, it's done in all the places at once. Both are worth having: what redundancy adds is quick recovery (err, scratch that, rebuilds take forever.. let's go with "resiliency") against hardware failures.
 
Thank you Ryan. You hit the nail on the head. I read an article recently about this exact problem which certainly left me with the same conclusion. I do plan to use a further proper backup but, as discussed and suggested in a thread some moons ago, I am going to break up my data and categorise it into critical data versus non-critical data.
Now that I am comfortable that my understanding of the method is reasonably accurate :), especially after your confirmation of what I thought I had understood (it helps like crazy when someone else offers the same understanding, especially if you have respect for their abilities and knowledge :thumb:), one can decide the path to walk.
The large drive scenario is very attractive from a price per gig of storage, but the fundamental reliability of these storage methods has not increased in proportion to the growth in size. I am also finding so few in my "environment" have any grasp of what a gigabyte is and how long it takes to move it in bits and bytes without corruption.
So what do you think of the two strategies I am thinking of to offset some of the risk of a second drive failure during recovery?
a) Make the drives smaller and put more in the RAID. In my case there is a limit to max drives for RAID 10 or 0+1 but there are other RAIDs such as RAID 30 or 0+3 :)
My thinking is that if the data is important and availability, or as you say resiliency, is an issue, it is better to use smaller drives and more of them.
b) Since we tend to build these computers with all the drives coming from pretty much the same manufacturing batch, my thought has been to buy 50% of the required drives initially, leave a decent gap (what decent is I have not determined yet), and then buy the other half. The thinking here is that the mirror element gets added later, gambling that the second batch is unlikely to fail at the same time as the first. Whereas if they all start out together and one reaches the stage where failure takes place, the view I have read is that there is a greater probability of catastrophic failure during recovery, due to all drives being of the same "age" and "wear" status.

I would embrace the "cloud" aspect and online storage to a far greater extent if I knew that companies would go on forever. But they don't. Things happen, and when a company is sold or goes bankrupt and the server with your data on it gets scrapped or sold piecemeal, one's data is either in the wrong hands or gone from the face of the earth.
Our old Kodak happy snaps are more likely to still be around, when you consider the b&w images we still see today from people's pasts. Excuse me if I am skeptical about the whole cyber world. :)
 
I would generally choose RAID 6 over RAID 10. You get the same space (4 x 2T disks == 4T of space) but through the Magic of parity you get the ability to survive if any two disks fail (you can survive 2 disks failing in RAID 10 as well, but they have to be one of 4 lucky pairs: A+C or A+D or B+C or B+D, but not the unlucky A+B or C+D, which take out both halves of one mirror). If you have > 4 disks then RAID 6 also offers space savings, as the "cost" is always 2 drives, leaving N-2 usable (N == number of drives), so for 5x2T drives you'd get 6T of space, 6x2T == 8T of space and so on. Now all things are not totally equal.. RAID 6 is ~75% of the speed of RAID 10 for writes and up to 50% slower for reads

At work we're even more paranoid. We buy TWO external RAID boxes. Those boxes are each set up under RAID 6, plus a couple of hot spares. Then we mirror the two RAID boxes together in software (zfs in our case). Downtime at work is REALLY unpleasant and disks are cheap enough ....

And we still do backups. Though zfs snapshots are VERY cool, and eliminate a lot of the need for backups. It would be nice if zfs would migrate down into Mac and Windows. Just imagine Mac Time Machine being smart enough to use zfs snapshots. Sigh.


All this is at work though. At home I confess I just run time machine and don't worry about the hassle of RAID or NAS boxes. Well, I also run an automatic off-site backup.

...art

ps: Rob, Ryan speaks wisdom. I also know pretty much zilch about WHS. I have read lots of folks using various NAS boxes in their houses, but that's all :dunno:
 
The large drive scenario is very attractive from a price per gig of storage, but the fundamental reliability of these storage methods has not increased in proportion to the growth in size. I am also finding so few in my "environment" have any grasp of what a gigabyte is and how long it takes to move it in bits and bytes without corruption.

I wouldn't worry about the transfer protocol/media reliability that much, as generally there are enough checks built into all of the transport protocols to avoid corruption (speed, for sure; it's kinda painful to take a week to recover from backups when you need to :D). The drives themselves are even better than they were 15-20 years ago (although single-bit errors are more common, they have pretty good recovery/checks built into the drives for those now). The main problems left (simplified :D) are either catastrophic whole-drive failure or failure of whole blocks. So basically structure your RAID so that you reduce the odds of those sufficiently to not have to deal with the slow solution in any meaningful timeframe.

So what do you think of the two strategies I am thinking of to offset some of the risk of a second drive failure during recovery?
a) Make the drives smaller and put more in the RAID. In my case there is a limit to max drives for RAID 10 or 0+1 but there are other RAIDs such as RAID 30 or 0+3 :)
My thinking is that if the data is important and availability, or as you say resiliency, is an issue, it is better to use smaller drives and more of them.

Surprised that R30 is available... At 6 disks it does offer somewhat better space utilization than R10, but no better than R50. The failure probabilities are no better than R50 and I don't think better than R10 (math is hard... :D) and I believe worse than R6. Personally I'd pick R10, R5/R50, or R6 over R3/R30 because the former are more commonly used, meaning the code paths are better tested. I like well-used code paths; it means someone else has found all of the problems.

b) Since we tend to build these computers with all the drives coming from pretty much the same manufacturing batch, my thought has been to buy 50% of the required drives initially, leave a decent gap (what decent is I have not determined yet), and then buy the other half. The thinking here is that the mirror element gets added later, gambling that the second batch is unlikely to fail at the same time as the first. Whereas if they all start out together and one reaches the stage where failure takes place, the view I have read is that there is a greater probability of catastrophic failure during recovery, due to all drives being of the same "age" and "wear" status.

You have two likely failure curves. First is the bathtub curve (infant mortality, followed by the "good times", followed by age and decay.. amusingly software has a similar lifecycle..); almost all hardware follows this to some extent (some have higher beginning or end curves of course). The second is just bad parts/manufacturing defects (we had one batch of several hundred drives all have a spring fail within ~2 weeks of each other at just about 6 months). Mitigating against both is kinda hard. An alternative strategy would be to buy from two manufacturers (making sure they aren't just relabeled :D) and make those the mirror pairs.

I don't think your initial plan is/was bad, you just need to consider the risks and see if the cost/benefit is probably worth it.

Mostly what you're trying to do with the local redundancy is reduce time to recovery when you either:
a) suffer a disk failure
or possibly
b) accidentally oops a file or three
Another option to consider is adding a USB connected external drive that you periodically mirror to. With that you could take incremental backups from the point of the mirror and then you'd usually only have a small amount to recover if the main drive failed.
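
A minimal sketch of that kind of periodic one-way mirror to an external drive, using only the Python standard library. The function name and the paths in the usage comment are hypothetical, and a real setup would more likely use an existing backup tool; this just shows the idea of copying only what changed since the last run:

```python
import shutil
from pathlib import Path


def mirror_changed_files(source: Path, target: Path) -> list:
    """Copy files that are new or newer in `source` into `target`.

    Returns the list of files written. Nothing is ever deleted from
    `target`, so an accidental delete on the main drive is not repeated
    on the external copy.
    """
    copied = []
    for src in source.rglob("*"):
        if not src.is_file():
            continue
        dst = target / src.relative_to(source)
        if dst.exists() and dst.stat().st_mtime >= src.stat().st_mtime:
            continue  # external copy is already up to date
        dst.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src, dst)  # copy2 keeps the modification time
        copied.append(dst)
    return copied


# Hypothetical usage, run from a scheduled task:
# mirror_changed_files(Path("D:/critical"), Path("E:/mirror/critical"))
```

Because only changed files are copied, each run after the first is usually quick, and recovery after a main drive failure starts from the last run.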

You have to remember as well that you're dealing with very small probabilities here. This gets really important when you're working with hundreds of computers or super high volumes of data, but the actual odds of a home array failure in the lifespan of the computer are really quite low. So it's worth spending some effort for peace of mind, but eventually you pick one of several decent options, quit worrying and move on :D

My current home setup is a two-disk R1 array for kinda critical data, backed up to a USB/network drive (these oddly don't cost much more than bare drives and sometimes cost less because of volume contracts, etc..), and a 5-disk R5 array (5 disks because we had 5 left over from another project, not because it's better than three larger ones :D) for less important stuff. I'd have gone R6 but the setup didn't warrant it (and the R1 array is also the boot array because that's easier to recover the OS from, as either disk works stand-alone).

I would embrace the "cloud" aspect and online storage to a far greater extent if I knew that companies would go on forever. But they don't. Things happen, and when a company is sold or goes bankrupt and the server with your data on it gets scrapped or sold piecemeal, one's data is either in the wrong hands or gone from the face of the earth.
Our old Kodak happy snaps are more likely to still be around, when you consider the b&w images we still see today from people's pasts. Excuse me if I am skeptical about the whole cyber world. :)

A smart friend (mentor as well) of mine once explained it to me like this: "when evaluating a technology don't consider how hard it is to get your data in, consider how hard it will be to get your data back out." That is so so very true of the cloud services. There is nothing magical about the "cloud", it's just a bunch of computers in a bunch of datacenters run by companies and people. Things happen to both :D I wouldn't worry too much about the "wrong hands" part if you are diligent about encryption :D

And we still do backups. Though zfs snapshots are VERY cool, and eliminate a lot of the need for backups. It would be nice if zfs would migrate down into Mac and Windows. Just imagine Mac Time Machine being smart enough to use zfs snapshots. Sigh.

I LOVE snapshots! They have saved me sooo much time. The downside is when you write something you didn't .. want to.. to the disk and its preserved in the snapshot - DOH!

At the current gig we've gone the other way from hyper redundancy per computer and everything hardware wise is replaceable, we just throw more computers at the problem :D
 
One other thing I was thinking. I would add a note of pessimism to all our pessimism :rofl: :huh: When working on lots of computers you see a lot of computers, so you see more problems overall. People tend to remember problems, and in my current job I'm kind of a professional pessimist so naturally take a very conservative path; weigh that against the costs :D
 
Thanks Ryan, your most recent post had me lol. It's quite relevant. I find a similar cool head needs to prevail when shopping for hardware.
If one examines some of the user reviews on a place like Newegg (which in my view is an excellent e-retailer and has excellent user reviews), one has to consider one's own needs and application of the device rather than merely accepting some user's condemnation or praise of a product, and price is absolutely no safe harbor.

Lol, I think when I am done my biggest concern about loss of data is not going to come from hardware but from those who shall remain nameless in our house not having the discipline to even store data off a device. lol.
Training old dogs to do new tricks is a challenge in this sphere of life. :)
Thanks again all for your inputs.

I'll make one last point, which I made to Linda to get this across.
I have black and white photos, which my Dad travelled to relatives in the UK to obtain, that go back several generations. These paper prints have survived over 100 years. The issue we need to be aware of and consider is that if we, as the front edge of the digital end of taking these family records, want to see them handed down and continue to be handed down, it's going to take some deliberate effort on our individual parts to take the steps to ensure that happens. Someone in the household/family is going to have to take this role under their wing. The old shoe box approach is not going to hack it. :)

 
In the event some family members have an interest in setting up a home server and worry that this is an overly complex and expensive venture, it doesn't need to be at all.

Here are some examples of how economically it can be done.

For a server complete with one drive and box

http://www.newegg.ca/Product/Product.aspx?Item=N82E16859107052CVF

$377 + $12.40 shipping. Note: all prices are in Canadian dollars from the Canadian Newegg site. The US will be better, but Newegg US only ships to the US.

Windows Home server Operating system .....
http://www.newegg.ca/Product/Product.aspx?Item=N82E16832416443CVF
$50 + $6.99 shipping


So a total of about $446 plus 13% tax in Canada and you've got a home server with a single drive.

USA Version you lucky people

http://www.newegg.com/Product/Product.aspx?Item=N82E16859107052

$319 US after rebate + $15 shipping

Software same as Canadian.

$394 if you take into account software shipping in US. Then you have to add sales taxes if they apply to you.


For those wanting support there is this site on the WHS topic.

http://homeservershow.com/


This is a popular neat machine .

The latest version is this one
http://www.newegg.com/Product/Product.aspx?Item=N82E16859107921&Tpk=n54l


It's simply amazing what people are doing with these little boxes.
 
I have black and white photos, which my Dad travelled to relatives in the UK to obtain, that go back several generations. These paper prints have survived over 100 years. The issue we need to be aware of and consider is that if we, as the front edge of the digital end of taking these family records, want to see them handed down and continue to be handed down, it's going to take some deliberate effort on our individual parts to take the steps to ensure that happens. Someone in the household/family is going to have to take this role under their wing. The old shoe box approach is not going to hack it. :)

For irreplaceable stuff like that there is no better solution than lots of redundancy in storage medium. Put them in the cloud, burn them to CD, save them on the redundant computer, put printouts in a series of shoe boxes and distribute to family. Figure long term that maybe one of the shoe boxes will survive if you're lucky.. :rolleyes: :rofl:

Another alternative for folks who are less technically inclined (or just lazy like myself :D) is that some of the low-end network attached storage devices are getting pretty reasonable: http://www.newegg.com/Network-Attached-Storage-NAS/Category/ID-241?Tid=18253
Once you count in the price of the drives, etc.. they're a pretty reasonable deal.
http://www.newegg.com/Product/Product.aspx?Item=N82E16822165448 is about the same price as the Micro Tower Server Rob linked to.
The upside of these is that they're pretty easy to set up and use. The downside is that they're less general purpose, so you can't use them for other things (this is sometimes a good thing as it keeps folks from fiddling with them as much :D)

I'm with you on how amazing the little computers are these days. Even more amazing are things like the Roku stick: http://www.roku.com/streamingstick or, for the gadgeteers, a $35 Raspberry Pi http://www.raspberrypi.org/ which is crazy powerful compared to the computers we grew up with.
 