@cp2000 said in Storage solutions:
Question - I thought if one of the two raid drives fail I can still access the data on one of them? Then put in a new fresh replacement for the other and it rebuilds? Correct?
But I can still access the data if one drive failed?
It depends on the RAID level. For a 2-drive enclosure you only have two options:
RAID 0 aka striping
- Not "true" RAID, because RAID stands for Redundant Array of Independent/Inexpensive Disks and this one is not redundant
- Half the data is written to one disk, half the data to the other
- You get the combined storage space of both disks
- Read speed is 2x that of a single disk (because you read half of the file from one disk and half from the other, in parallel)
- Write speed is 2x that of a single disk (because you write half of the file to one disk and half to the other, in parallel)
- If one of the two disks fails, all data is lost
RAID 1 aka mirroring
- Both disks contain the full data (they are identical)
- You only get half of the total disk space (because the other half is an identical clone)
- Read speed can be up to 2x that of a single disk (because the controller can read half of the file from one disk and half from the other, in parallel), although how close you get depends on the implementation
- Write speed is the same as a single disk (because the full file needs to be written to each disk, even though it's in parallel)
- If one of the two disks fails, you don't lose any data (just your 2x read speed advantage because now you are just using a single disk) and you can insert a new disk and have it rebuild from the other copy to regain redundancy
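To put numbers on the two lists above, here is a minimal sketch that compares capacity and failure tolerance for a 2-disk array. The helper name `two_disk_raid` and the 4 TB disk size are just assumptions for illustration, not anything your enclosure exposes.

```python
from dataclasses import dataclass


@dataclass
class RaidSummary:
    usable_tb: float            # space you can actually store data on
    survives_one_failure: bool  # is the data still readable with one disk dead?


def two_disk_raid(level: int, disk_tb: float) -> RaidSummary:
    """Summarise a 2-disk array of identical disks (illustrative helper)."""
    if level == 0:
        # Striping: each disk holds unique data, so capacity adds up,
        # but losing either disk loses everything.
        return RaidSummary(usable_tb=2 * disk_tb, survives_one_failure=False)
    if level == 1:
        # Mirroring: the second disk is a clone, so capacity is one disk,
        # and either disk alone still holds all the data.
        return RaidSummary(usable_tb=disk_tb, survives_one_failure=True)
    raise ValueError("a 2-disk enclosure only offers RAID 0 or RAID 1")


# Hypothetical example: two 4 TB disks
for level in (0, 1):
    print(f"RAID {level}: {two_disk_raid(level, 4.0)}")
```

So the trade is simple: RAID 0 doubles your space and gives up all safety, RAID 1 keeps the space of one disk and lets you survive a single failure.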
However, one often overlooked issue with RAID is that you will typically use identical disks (needed for optimal performance), often even from the same factory batch if you bought them together. The disks will have experienced exactly the same workload over their lifespan under the exact same operating conditions (temperature, vibrations, ...). The result: when one disk fails due to old age, the other one will likely be on its last legs as well.
This is, by the way, why RAID 5 & 6 are problematic with large drives. The rebuild process on those is more complex and can easily put all the remaining disks under stress for 48 hours. At that point the chance of 1 or 2 more disks failing during the rebuild becomes a very real possibility.
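A rough back-of-the-envelope sketch shows why rebuild times balloon with disk size. It only computes the time to write one full replacement disk at a constant rate, so it is a lower bound. The 150 MB/s sustained rate is purely an assumption, and real RAID 5/6 rebuilds are usually slower because of parity calculations and normal use of the array during the rebuild.

```python
def min_rebuild_hours(disk_tb: float, mb_per_s: float) -> float:
    """Lower bound: hours needed to fill one disk at a sustained write rate."""
    total_bytes = disk_tb * 1e12            # decimal terabytes, as drives are sold
    seconds = total_bytes / (mb_per_s * 1e6)
    return seconds / 3600


# Hypothetical disk sizes, assumed 150 MB/s sustained throughput
for size_tb in (4, 12, 20):
    hours = min_rebuild_hours(size_tb, 150)
    print(f"{size_tb} TB: at least {hours:.1f} hours")
```

Under those assumptions a 20 TB disk needs well over a day of non-stop writing in the best case, and every remaining disk in the array is being read flat out the whole time.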
For my Fun files I just got a 6 stackable enclosure with each drive having a mirror so essentially 3 hard drives with 3 mirrors. I went with this because I thought about the wear of a Raid system so its interesting you pointed that out.
This should be fine in practice but you would be wise to replace failed disks immediately. If you put it off for a few months you are playing with fire. I speak from experience (RIP my Tumblr archive).
having one seeding disk is brilliant.... im looking at the seedboxes as a better option and not seed from any of my local drives.
Even better if it's an SSD. The random IO performed by torrents (as opposed to sequential reads/writes) puts a lot more strain on mechanical hard drives. The read/write arm then has to constantly jump all over the platter, whereas sequential access moves it gradually, like the pick-up needle on a record player.
Seedboxes are also a good option. Doesn't hog your home bandwidth, no risk of exposing your home IP if your VPN fails for some reason etc. But they do tend to be more expensive for the same storage space than a DIY solution.
Ill have to look more into this JBOD / mergerfs - is all data backed up on other drives so if one fails you havent lost anything?
There is absolutely zero redundancy with JBOD / mergerfs. It is just a convenience thing that lets you access data spread across multiple disks as a single volume, without having to think about which file is on which disk and which disk still has space available. A typical implementation will simply start filling up the first disk, then once it's full start filling up the second disk and so on.
Advantages are:
- You can simply add more disks as your storage needs grow
- Disks don't have to be identical, you can mix various sizes and even SSDs/HDDs
- Each disk will experience its own unique workload based on how often the files on it are accessed, so you don't get the "everything is near-death at the same time" problem
- If one disk starts showing early signs of failing, you only have to offload the data on that one disk to somewhere else in order to safeguard it (just like using separate individual disks - which this essentially is)
- If one disk does fail you only lose the data that was on that particular disk (again, just like using separate individual disks)
However you don't really get any of the RAID performance benefits (although you can access multiple files in parallel at increased speeds if those files happen to live on different disks). And, like I said, there is no redundancy whatsoever in this, so you need some kind of backup solution in addition to this.
JBOD stands for Just a Bunch Of Disks so that pretty much says it all. 
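The "fill the first disk, then the next" behaviour mentioned above can be sketched in a few lines. This is only a toy model of that one placement rule, not how mergerfs or any particular enclosure is actually implemented. The function names and the sizes are made up for the example.

```python
from typing import List, Optional


def pick_disk(free_gb: List[float], file_gb: float) -> Optional[int]:
    """Return the index of the first disk with enough room, or None."""
    for index, free in enumerate(free_gb):
        if free >= file_gb:
            return index
    return None


def store_files(free_gb: List[float], files_gb: List[float]) -> List[Optional[int]]:
    """Place files one by one and return the disk index chosen for each."""
    remaining = list(free_gb)  # work on a copy, leave the caller's list alone
    placement = []
    for size in files_gb:
        disk = pick_disk(remaining, size)
        if disk is not None:
            remaining[disk] -= size  # the file now occupies space on that disk
        placement.append(disk)
    return placement


# Hypothetical pool: a 100 GB disk and a 500 GB disk, four files to store
print(store_files([100, 500], [60, 30, 50, 400]))
```

Note that every file lands on exactly one disk and nothing is ever written twice, which is exactly why losing a disk only costs you the files that happened to live on it, and also why there is no redundancy.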
Im mainly concerned about loss prevention which best practices the deeper i dug the more they are like yeah essentially you need THREE backups with one offsite or in another spot but I dont want todo cloud for any of this.
Multiple / offsite / offshore backups are something for companies who stand to lose millions from data loss. At that point you're in the terrain of "but what if an airplane crashes on top of the data center?". There are many data loss scenarios that are extremely unlikely but still have a non-zero chance. It's always a tradeoff, use common sense to decide what's best in your scenario. If you live in an area that's prone to flooding, maybe don't keep all your data copies in the basement. If your collection is so precious to you that losing it would be a major heartache if your house were to ever burn down (or be burgled or swept up by a tornado), maybe keep an extra copy somewhere else. For most people 1 backup copy will be enough to reduce the risk to negligible levels they can live with, 2 if you're paranoid. Anything more and we're talking about insurance against real-world disasters, not disk failures.
The cold storage extra drive has been my normal MO up to this point but the data has just grown exponentially.
I want to keep one cold storage disk with periodic updates.... possibly some sort of interim secondary backup till each is sync'd. Then with my new 2 disk 8tb RAID i love it just for my personal files and how quick it is, but I think im going to have to have a third disk cold storage with periodic updates.
You could have a main JBOD for your everyday use and a backup JBOD in an external enclosure which you connect and sync periodically. That would largely eliminate the hassle of juggling too many drives to manage comfortably.
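As a rough idea of what "connect and sync periodically" boils down to, here is a minimal one-way sync sketch. It copies files that are missing from the backup or that look changed (different size, or newer modification time). The function name `sync_to_backup` is an assumption for this example, it never deletes anything from the backup, and in practice a dedicated sync tool will be more robust than this.

```python
import shutil
import tempfile
from pathlib import Path
from typing import List


def sync_to_backup(source: Path, backup: Path) -> List[Path]:
    """Copy new or changed files from source to backup; return what was copied."""
    copied = []
    for src in source.rglob("*"):
        if not src.is_file():
            continue
        dst = backup / src.relative_to(source)
        if dst.exists():
            s, d = src.stat(), dst.stat()
            # Same size and not newer than the backup copy: assume unchanged
            if s.st_size == d.st_size and s.st_mtime <= d.st_mtime:
                continue
        dst.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src, dst)  # copy2 also preserves the modification time
        copied.append(dst)
    return copied


# Demo on throwaway temp folders standing in for the main and backup pools
with tempfile.TemporaryDirectory() as main_dir, tempfile.TemporaryDirectory() as backup_dir:
    (Path(main_dir) / "photos").mkdir()
    (Path(main_dir) / "photos" / "cat.txt").write_text("meow")
    print(len(sync_to_backup(Path(main_dir), Path(backup_dir))), "file(s) copied")
    print(len(sync_to_backup(Path(main_dir), Path(backup_dir))), "file(s) copied on second run")
```

The second run copies nothing because the backup is already up to date, which is what keeps a periodic sync quick even when the collection is large.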
One final note: these days you can buy very large consumer drives (20TB+, even 30TB with enterprise drives). While there are definitely use cases for them, always bear in mind the old adage about putting all your eggs in one basket.