Lec-114: What is RAID? RAID 0, RAID 1, RAID 4, RAID 5, RAID 6, Nested RAID 10 Explained
Summary
TLDR: This educational video explains RAID technology and why it matters for storage performance and data availability. It covers several RAID levels, including RAID 0 for performance via data striping, RAID 1 for data security through mirroring, and RAID 5, which balances both by distributing parity across disks. It uses the Facebook outage as a real-life example of why data availability matters, and introduces RAID 6, which adds a second parity for greater fault tolerance. The video is aimed at students, professionals preparing for exams or interviews, and anyone interested in data storage solutions.
Takeaways
- 💾 RAID stands for Redundant Array of Independent Disks (or Inexpensive Disks), which is a method of storing data across multiple hard drives.
- 🔄 RAID 0, or data striping, splits data into pieces and stores them across different disks to enhance performance but offers no data redundancy.
- 🔄 RAID 1, or mirroring, creates exact copies of data on separate disks to ensure data availability and security in case of disk failure.
- 🔄 RAID 0+1 (or 1+0) combines both data striping and mirroring to balance performance and data security.
- 🔄 RAID 3 stripes data across disks with a dedicated parity disk, allowing recovery from a single disk failure, but the parity disk is heavily used and can become a bottleneck.
- 🔄 RAID 4 follows the same concept as RAID 3 and also keeps parity on a dedicated disk, so it has the same bottleneck; the video notes there is little difference between the two.
- 🔄 RAID 5 distributes parity information across all disks, ensuring equal utilization and enhancing performance compared to RAID 3 and 4.
- 🔄 RAID 6 calculates two parities and stores them across all disks, allowing for recovery from up to two disk failures.
- 💡 The choice of RAID level depends on the balance between performance, data security, and cost, with each level offering different advantages and trade-offs.
- 🌐 Real-world examples, such as Facebook, WhatsApp, and Instagram outages, highlight the importance of data availability and the potential financial and credibility impact of downtime.
Q & A
What does RAID stand for?
-RAID stands for Redundant Array of Independent Disks, or sometimes Redundant Array of Inexpensive Disks.
Why is redundancy important in RAID?
-Redundancy in RAID is important for duplicating data across multiple disks to ensure data availability and fault tolerance, which helps in case one or more disks fail.
What are the two main factors companies consider when using RAID?
-Companies consider performance, which includes fast read and write speeds, and security or availability, ensuring data is accessible 24/7.
What is the significance of the Facebook, WhatsApp, and Instagram outage mentioned in the script?
-The outage of these platforms for 6-7 hours resulted in a significant loss of revenue and credibility, highlighting the importance of data availability and the impact of downtime on businesses.
What is RAID 0 and how does it improve performance?
-RAID 0 is a level where data is broken into pieces and distributed across multiple disks. This data striping allows for parallel reading and writing, which increases performance and throughput.
How does RAID 1 provide data security?
-RAID 1, also known as mirroring, involves creating exact copies of data on separate disks. This ensures that if one disk fails, the data is still accessible from the other copies.
What is the difference between RAID 1 and RAID 0+1?
-RAID 1 only mirrors data. RAID 1+0 (or 0+1) is a nested RAID level that combines mirroring and striping: the data is broken into pieces that are spread across disks, and each piece is also kept as a mirrored copy, offering both performance and data security.
What is RAID 3 and how does it use parity for data recovery?
-RAID 3 is a level where data is divided into blocks and distributed across multiple disks, with a dedicated disk for storing parity information. Parity is used to recover data if one disk fails, but if two disks fail, data recovery is not possible.
How does RAID 4 differ from RAID 3?
-RAID 4 is similar to RAID 3: both stripe data across the disks and keep the parity on a single dedicated disk. The video says there is little difference between them (standard definitions distinguish them by stripe granularity, byte-level for RAID 3 and block-level for RAID 4), so RAID 4 has the same parity-disk bottleneck, which RAID 5 removes.
What is RAID 5 and how does it distribute parity?
-RAID 5 distributes parity information across all disks in the array, rather than storing it on a single disk. This prevents any single disk from becoming a bottleneck and ensures that all disks are utilized equally.
What advantage does RAID 6 offer over RAID 5?
-RAID 6 offers the advantage of having two parity blocks, which allows for the recovery of data even if two disks fail simultaneously, providing an additional layer of fault tolerance.
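The parity placement described in the RAID 3, 4, and 5 answers above can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and the rotation rule used for the distributed case is one common convention among several, not something the video specifies. It counts how many parity updates each disk receives if every stripe is written once.

```python
from collections import Counter


def parity_disk(stripe: int, num_disks: int, distributed: bool) -> int:
    """Return the index of the disk holding parity for a given stripe."""
    if distributed:
        # RAID 5 style (assumed rotation rule): parity moves to a
        # different disk on each successive stripe.
        return (num_disks - 1 - stripe) % num_disks
    # RAID 3/4 style: parity always lives on the last disk.
    return num_disks - 1


def parity_writes(num_stripes: int, num_disks: int, distributed: bool) -> Counter:
    """Count parity updates per disk when every stripe is written once."""
    return Counter(
        parity_disk(s, num_disks, distributed) for s in range(num_stripes)
    )


dedicated = parity_writes(8, 4, distributed=False)
rotated = parity_writes(8, 4, distributed=True)
print(dedicated)  # every parity update lands on disk 3
print(rotated)    # parity updates are spread evenly over disks 0..3
```

With a dedicated parity disk, all eight parity updates hit the same disk, which is the bottleneck the video describes; with rotation, each of the four disks takes two.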
Outlines
💾 Introduction to RAID
The video begins with an introduction to RAID, which stands for Redundant Array of Independent Disks or Redundant Array of Inexpensive Disks. The presenter explains that RAID is used to store data across multiple disks, which are independent of each other, meaning the failure of one disk does not affect the others. The term 'inexpensive' refers to the cost-effective nature of disks compared to other forms of memory like registers or RAM. The importance of RAID is highlighted in terms of its relevance for competitive exams, college studies, and interviews. The video emphasizes the need for both performance and data availability, using the example of Facebook, WhatsApp, and Instagram's downtime causing significant financial losses. The presenter also touches on the increasing storage capacities in laptops as an indicator of the cost reduction in disk technology.
🔄 RAID 0: Data Striping for Performance
The second paragraph delves into RAID 0, also known as data striping. In RAID 0, data is broken down into pieces and distributed across multiple disks, which improves read and write performance because these operations can be done in parallel. The video uses a diagram to illustrate how data is split and stored across different disks. The benefit of RAID 0 is higher performance and throughput. However, it provides no data redundancy, meaning if one disk fails, the entire data set is lost. The video also mentions RAID 1, which is mirroring, where data is duplicated across disks to ensure availability and security, contrasting it with RAID 0's focus on performance over redundancy.
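As a rough illustration of the striping described above, the sketch below deals fixed-size chunks round-robin across a set of in-memory "disks" and then reassembles them. The names `stripe` and `unstripe`, the chunk size, and the use of Python lists as disks are all assumptions made for the example; a real RAID 0 controller works on disk blocks and issues the reads and writes in parallel.

```python
def stripe(data: bytes, num_disks: int, chunk_size: int) -> list[list[bytes]]:
    """Split data into fixed-size chunks and deal them round-robin across disks."""
    disks: list[list[bytes]] = [[] for _ in range(num_disks)]
    for offset in range(0, len(data), chunk_size):
        chunk = data[offset:offset + chunk_size]
        disks[(offset // chunk_size) % num_disks].append(chunk)
    return disks


def unstripe(disks: list[list[bytes]]) -> bytes:
    """Reassemble the data by reading one chunk from each disk in turn."""
    out = []
    longest = max(len(disk) for disk in disks)
    for row in range(longest):
        for disk in disks:
            if row < len(disk):
                out.append(disk[row])
    return b"".join(out)


# Mirrors the diagram in the video: A1..A6 spread over two disks.
disks = stripe(b"A1A2A3A4A5A6", num_disks=2, chunk_size=2)
print(disks)            # disk 0 holds A1, A3, A5; disk 1 holds A2, A4, A6
print(unstripe(disks))  # the original byte string
```

Note that there is no copy of any chunk: if either list is lost, `unstripe` can no longer rebuild the data, which is the lack of redundancy mentioned above.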
🔄 RAID 1 and RAID 1+0: Mirroring and Nested RAID
This paragraph discusses RAID 1, which involves mirroring data across two or more disks, providing data security and availability. If one disk fails, the data can still be accessed from the mirror. The video points out that while this does increase cost due to the need for additional disks, the use of inexpensive disks makes it a viable option. The concept of RAID 1+0, or nested RAID, is introduced as a combination of RAID 0 and RAID 1. It involves both mirroring and striping data, providing both high performance and data security. The video explains that this configuration is commonly used in email and web servers due to its balance of performance and reliability.
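The combination of mirroring and striping described above can be sketched as follows. This is a toy model, not a real driver: `MirroredPair` and `Raid10` are hypothetical names, disks are Python lists, and a failed disk is represented by `None`. The layout shown is a stripe across mirrored pairs.

```python
class MirroredPair:
    """Two disks holding identical copies (RAID 1). A failed disk is None."""

    def __init__(self) -> None:
        self.disks: list = [[], []]

    def write(self, chunk: bytes) -> None:
        # Every write goes to each disk that is still alive.
        for disk in self.disks:
            if disk is not None:
                disk.append(chunk)

    def fail(self, index: int) -> None:
        self.disks[index] = None

    def read(self, row: int) -> bytes:
        # Read from the first surviving copy.
        for disk in self.disks:
            if disk is not None:
                return disk[row]
        raise IOError("both mirrors failed; data is lost")


class Raid10:
    """Stripe chunks round-robin across several mirrored pairs."""

    def __init__(self, num_pairs: int) -> None:
        self.pairs = [MirroredPair() for _ in range(num_pairs)]
        self.count = 0

    def write(self, chunk: bytes) -> None:
        self.pairs[self.count % len(self.pairs)].write(chunk)
        self.count += 1

    def read_all(self) -> bytes:
        n = len(self.pairs)
        return b"".join(self.pairs[i % n].read(i // n) for i in range(self.count))


array = Raid10(num_pairs=2)
for chunk in [b"A1", b"A2", b"A3", b"A4"]:
    array.write(chunk)

# One disk in each pair fails; the surviving mirrors still serve the data.
array.pairs[0].fail(0)
array.pairs[1].fail(1)
print(array.read_all())  # b'A1A2A3A4'
```

If both disks of the same pair fail, `read` raises an error, which matches the video's point that mirroring lowers the probability of data loss but cannot eliminate it.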
🔄 RAID 3, 4, 5, and 6: Advanced RAID Levels
The final paragraph covers the parity-based RAID levels, starting with RAID 3, which stripes data and keeps parity on a single dedicated disk. The parity allows recovery if one disk fails, but if two disks fail, recovery is not possible. Because every write must also update the parity, the parity disk is heavily used and can become a bottleneck. RAID 4 follows the same concept, and the video notes there is little difference between RAID 3 and RAID 4. RAID 5 solves the bottleneck by distributing the parity across all disks so that no single disk is overused. Lastly, RAID 6 computes two parities, allowing recovery even if two disks fail. The video concludes by emphasizing the importance of understanding these RAID levels for various applications and potential interview questions.
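The single-parity recovery described above is usually implemented with bitwise XOR rather than the addition used in the video's simplified example. The sketch below is a minimal illustration under that assumption; `xor_blocks` and `rebuild` are hypothetical helper names and the block contents are made up.

```python
from functools import reduce


def xor_blocks(blocks: list[bytes]) -> bytes:
    """XOR equal-length blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def rebuild(surviving: list[bytes], parity: bytes) -> bytes:
    """Recover the single missing block from the survivors and the parity."""
    return xor_blocks(surviving + [parity])


data = [b"\x01\x0f", b"\x02\xf0", b"\x03\xaa"]  # three data blocks
parity = xor_blocks(data)                        # stored on the parity disk

# Suppose the disk holding data[1] fails.
recovered = rebuild([data[0], data[2]], parity)
print(recovered == data[1])  # True
```

XOR works here because XOR-ing a value with itself cancels it out, so combining the parity with every surviving block leaves exactly the missing block. With two blocks missing, one parity gives only their XOR, which is not enough to separate them.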
Keywords
💡RAID
💡Redundancy
💡Performance
💡Availability
💡Data Striping
💡Mirroring
💡Parity
💡Bottleneck
💡RAID 5
💡RAID 6
Highlights
Introduction to RAID (Redundant Array of Independent Disks) and its importance for data storage.
Explanation of the term 'redundant' in RAID and its significance for data duplication.
Discussion on the cost-effectiveness of disks and their role in RAID configurations.
Importance of RAID for performance and data availability in organizations.
Real-world example of Facebook, WhatsApp, and Instagram outage highlighting the need for RAID.
Description of RAID 0 and its method of data striping for improved performance.
Advantages of RAID 0 in terms of read and write speed due to parallel operations.
Introduction to RAID 1 and its mirroring technique for data redundancy.
Cost-benefit analysis of using inexpensive disks for RAID 1 configurations.
Explanation of data security and availability in RAID 1 with real-world examples.
Combination of RAID 0 and RAID 1 to form RAID 10, offering both performance and data security.
Description of RAID 3 and its data striping with parity for fault tolerance.
Challenges with RAID 3, such as the single parity disk becoming a bottleneck.
Introduction to RAID 4, which follows the same concept as RAID 3 and differs little from it.
Explanation of RAID 5, which distributes parity across all disks to prevent bottlenecks.
Description of RAID 6, which calculates two parities to allow recovery from the failure of two disks.
Practical applications of different RAID levels in email and web servers.
Summary of RAID levels and their respective advantages for data storage solutions.
Transcripts
Dear student, welcome to Gate Smashers
In this video I'm going to explain
RAID
that is, Redundant Array of Independent Disks
or Redundant Array of Inexpensive Disks
So guys, here I'll discuss the different RAID levels
with real-life examples
So this video, from the point of view of competitive exams, college and university, and interviews,
is very important
So guys, like the video and subscribe to the channel
if you haven't already, and if you have, do subscribe from another device
Subscribers are very important
Let's start. First of all,
here we are writing "redundant array"
redundant means
redundant means copy
it means duplication
meaning multiple copies of one thing
so here
the redundancy,
the duplication,
that is of
that is of disks
and why are we using disks?
we're using disks for storing data
so we are storing data on disks
and those disks are independent
let's say I have 5 disks
and array means
that we don't have just 1 or 2 disks, we have multiple disks
so these disks,
first, they are independent
meaning it is not the case that if one goes down
the rest automatically go down too; there is no dependency between them
and second, why are they called inexpensive?
because if we talk about the memory hierarchy
registers are the most expensive
then cache
then RAM
so these memories are more costly
but if we talk about disks
at this point the cost of disks is low
and it keeps gradually decreasing
when you purchased your laptop
you will have seen that first there was 256 GB, then 512 GB,
then 1 TB, and nowadays a normal laptop comes with 1 TB by default
and in future you may well get 2 TB
So it means we're using independent and inexpensive disks
for storing the data
but why are we doing this?
what is the need for it? this is the first question that arises
So guys, whenever we think from a company's or organisation's point of view
we need 2 things
first is performance
performance in terms of read and write
meaning that, with data,
you want to read data from the disk
and write
meaning you want to change data on the disk
some transaction has changed something
you want read and write
to be fast
so you want performance
Second, you want security,
availability
the data should be available 24/7
and I'll give you a real-life example of this
you heard a few days ago
that Facebook, WhatsApp and Instagram
were down for 6-7 hours
and you might have heard this too
that because of it, Mark Zuckerberg's wealth,
how much of a loss did he face?
he faced a 7 billion loss
a 7 billion dollar loss
he slipped from 3rd place to 5th place
only because for 6-7 hours
the services were down
and what is there in those services?
what do we do on WhatsApp?
we only share data
what is there on Facebook?
our profile and our friends' profiles
that is also data
and if we talk about Instagram
there too it is only photos and videos, which is also just data
so they deal only with data
and just because of this short downtime
there was a 7 billion loss
so guys
that's why companies want
their data to be available 24/7
and if it is not available, it is a great loss for the company
so that loss
can be in terms of money
and also in terms of credibility
meaning trust in them
I'll give one more example of this
if you've heard about Telegram
Telegram, in just those 6-7 hours,
added 60-70 million accounts
when Facebook and the others were down
in those 6-7 hours
60-70 million new accounts were created
So see, one company benefited and another faced a loss
only because of availability and performance
so let's start, now that you understand the story
of why we use it and what its purpose is
So we'll talk about the different levels
In RAID we have different levels
for how we keep the data
The first level is RAID 0
What do we do in it?
data striping
meaning we break the data into pieces
and keep them on different disks
as you can see in the diagram
in RAID 0, the original data that I have, let's say it is A,
B, C, D
you can lay out the data this way
what I did is break A into A1 and A2,
B into B1 and B2; you can make more pieces too
but if you look at the simple diagram
I break A into A1 and A2,
then A3 and A4, and A5 and A6
I mean we break the data into pieces
and keep that data on different disks
so what is the use of this?
its benefit is performance
RAID 0 gives us
performance
because if you want to read the data of A
you can read from both disks in parallel, as they are independent disks
you can read the data from them in parallel
So your performance will be fast
the same if you want to write:
you want to change something in A7
and in A8
both writes will happen in parallel
so this gives high performance and throughput
RAID 0
So remember this point
then
so you can be asked about the next level, RAID 1
What is the purpose of RAID 1? It is called mirroring
just as RAID 0 is called data striping
this one is called mirroring
here we are not breaking the data
a copy of the same data
is placed on another disk
say I have data A, B, C
I've made a mirror of it, mirror as in a mirror image
a copy of that data
is kept on another disk as well
So you will ask, sir, won't it cost a lot?
There is a cost,
but again, inexpensive disks:
we're choosing a disk or memory
whose price is not that high
we are not using SSD, cache, or RAM
we're using a disk
whose performance is also good
and whose price is also not that high
So we're keeping multiple copies
so you get its simple advantage
it is not focused on performance
it is about availability,
data security
let's say one disk fails
I can still access the data because the other disk is still working
this is a simple example with 2 disks
companies keep 3-4 mirrors and distribute them across the globe
if there is a problem in one geographical area
they can fetch the data from a different geographical area
but the idea should be clear from this
mirroring's job is to secure the data so that
in case of
failure we can access the data from another disk
and if you're wondering, what if both disks fail?
guys, there is no solution to that; you can have 10 mirrors
and you can still say, what if all 10 mirrors fail?
that is the worst case
but still we are improving the odds:
if a failure occurs,
will I be able to access my data or not?
we are increasing the probability
that yes, I can still access my data
so we're improving that
so this is a combination
here you can see RAID 1+0
in some places it is written as 10
but we don't call it ten
it is not ten, it is 1+0
1+0 means we've mixed 1 and 0
so it can be written as 0+1 or as 1+0
but what did we do in 1+0?
the data here is mirrored
and along with that it is striped
see A1 and A2
we also broke the data; the original data was A
it is broken into 2 parts, A1 and A2,
and a copy of each is kept too
and further parts A3 and A4 are made as well
with a copy of those kept, so striping is done, as the data is broken up,
and mirroring too
that's why we say
nested RAID, because it combines both
and it is advantageous
very useful
and as for real-life applications,
whether you talk about email servers or web servers,
it is widely used there
because RAID 0 gives high performance
so it uses that
and RAID 1 gives data security
so it uses this too
so it is the combination of both
next, if we talk about RAID 3
RAID 2 also exists but has become obsolete; in it we break data at the bit level
that is not used
Now we'll come to RAID 3
in this we break data at the block level
I have data A
I've divided it into blocks
blocks means
the data broken into pieces
in pieces
and stored on different disks
A1 here and A2 here and A3 here
A4 here, A5 here and A6 here
breaking the original data into 6 pieces and keeping it like this
On the last disk you might be wondering what this A is
this is the parity bit, meaning we take the parity of the data
and store it on a different disk
for all the data
what we need to do is, along with storing the original data,
also store its parity. Why do we do this?
if one of the disk is failed
if one of the disk is failed
Still I can recover the data from
parity
I'll give ls simple example
lets say data is 1
and this is 2
and this is data 3
just for an example, I'll tell you with a simple example
so if anyone ask you can easily answer that
So method of finding parity is adding all of three
So 1+2+3 will be
it will be 6
so parity 6 we've stored here
so like this we have find out its Parity
now lets say in future A1 fails
A1 disk fails
so now how I'll recover the data
with the help of parity
Parity is 6
and I know how parity was found 1+2+3
so what I'll do
I've to find this one
take rest of the parity
its sum is 5
So will subtract from parity so it will be 5
So see its data is 1
this I'm telling in a form of example
that if 1 disk is failed
Still I can recover the data from the parity
only 1 disk but if 2 disk fail
then we can't do anything
then we can't do anything
then we can't find with parity as there will be confusion
then multiple options can come, so here
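The arithmetic in this example can be written out as a small Python sketch. It follows the simplified addition-based parity used in the video (real arrays typically use XOR instead), and the function names are hypothetical. The last part lists why a single parity cannot resolve two missing values.

```python
def sum_parity(values: list[int]) -> int:
    """Parity as used in the video's example: the sum of all data values."""
    return sum(values)


def recover_one(surviving: list[int], parity: int) -> int:
    """Recover the single missing value by subtracting the survivors."""
    return parity - sum(surviving)


data = [1, 2, 3]
parity = sum_parity(data)                # 1 + 2 + 3 = 6
lost = recover_one([2, 3], parity)       # 6 - 5 = 1
print(parity, lost)                      # 6 1

# If two disks fail (say the ones holding 1 and 2), only their sum is known.
remaining = parity - sum([3])            # the two missing values add up to 3
candidates = [(x, remaining - x) for x in range(remaining + 1)]
print(candidates)                        # [(0, 3), (1, 2), (2, 1), (3, 0)]
```

Each pair in `candidates` is consistent with the stored parity, so one equation alone cannot tell which pair was the real data.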
So here the advantage is that we broke the data at the block level,
into blocks,
made blocks of the original data and made one parity
but what is the problem here?
the problem is that all the parity is on 1 disk
the data is on the data disks
and the parity is kept on a separate disk
the problem with this is that
if a problem occurs on the parity disk
none of the parities can be used
second, a bottleneck can arise
if anyone reads
or writes, let's say a write here, and here,
and here too
whenever you write or change something
you need to change the parity too
the parity always changes
let's say here, instead of 2, I've written 20
then I have to calculate a new parity
so whenever you write
you have to update the parity here
So this disk will be used a lot
because of this it can become a bottleneck
it will be used so heavily that it can create accessibility problems
So this problem can arise, and it is solved by the next level
RAID 4 says
what are we doing in RAID 4?
the same concept
but in this we've saved the parity in such a way
actually there is not much difference between RAID 3 and 4
but this problem is solved by
RAID 5
what did RAID 5 do?
all these parities
are distributed
the parities are distributed
parity is not on 1 disk
now the parities are distributed across all disks
each disk now holds both data and parity
so what happens because of this?
no disk will be over-utilized
all disks will be equally utilized, and write operations perform well with this
as compared to RAID 3 and 4
finally there is RAID 6, the last one; in RAID 6 there is a small difference
in this we are calculating 2 parities
instead of 1 we are calculating 2 parities
A and A
for this data A1, A2 and A3
we calculated 2 parities
calculating 2 parities means: in the earlier level, if 1 disk fails
I can recover from 1 parity
but if 2 disks fail
then it is not possible
but here the advantage is
that even if 2 disks fail
still I can recover the data from the parity
because I have 2 parities
So there are 2 equations
and from 2 equations I can find the values, as we normally do
one equation relates x and y one way and the second another way
so obviously we
can solve for them
I mean to say
that in RAID 6 the advantage is
that with 2 parities, if 2 disks fail
you can easily recover them
So whatever question comes on any of these points, you can easily explain it
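The "two equations, two unknowns" idea for RAID 6 can be sketched with ordinary integer arithmetic. This is a teaching simplification and not how real arrays compute their second parity (they typically use XOR for one parity and a code over a finite field for the other); the weights 1, 2, 3 and the function names are assumptions chosen for the example.

```python
def two_parities(data: list[int]) -> tuple[int, int]:
    """P is the plain sum; Q is a weighted sum with weights 1, 2, 3, ..."""
    p = sum(data)
    q = sum((i + 1) * d for i, d in enumerate(data))
    return p, q


def recover_two(data: list, lost_i: int, lost_j: int, p: int, q: int) -> tuple[int, int]:
    """Solve for the two missing values; `data` holds None at the lost positions."""
    wi, wj = lost_i + 1, lost_j + 1
    # Remove the surviving values from both parities, leaving:
    #   x + y = s   and   wi*x + wj*y = t
    s = p - sum(d for d in data if d is not None)
    t = q - sum((k + 1) * d for k, d in enumerate(data) if d is not None)
    y = (t - wi * s) // (wj - wi)
    x = s - y
    return x, y


original = [1, 2, 3]
p, q = two_parities(original)            # p = 6, q = 1*1 + 2*2 + 3*3 = 14

# The disks holding the first two values fail.
x, y = recover_two([None, None, 3], 0, 1, p, q)
print(x, y)                              # 1 2
```

Because the two parities weight the data differently, the two equations are independent, so any two missing values can be solved for; with the single sum alone, as in the earlier levels, they could not.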
Thank you