
Computing Infrastructure

HDD performance
HDD
Service time
Exercise 1

• read/write of 1 sector of 1KB


• rotation speed: 15000 RPM (revolutions per minute)
• data transfer rate: 100 MB/s
• mean seek time: 8ms
• overhead controller: 0.1ms

• mean latency: (1/2 round) * (60 s/min) / (15000 round/min) = 0.002s = 2ms

• transfer time: (1 KB) / (100*1024 KB/s) ≈ 0.00001s = 0.01ms

• mean I/O service time = 8ms + 2ms + 0.01ms + 0.1ms = 10.11ms
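The arithmetic of Exercise 1 can be reproduced with a minimal Python sketch. The function names are hypothetical (they do not come from the slides), and the sketch assumes 1 MB = 1024 KB, as the transfer-time line does.

```python
def mean_latency_ms(rpm: float) -> float:
    """Mean rotational latency: half a revolution, in milliseconds."""
    return 0.5 * 60_000.0 / rpm


def transfer_time_ms(size_kb: float, rate_mb_s: float) -> float:
    """Time to transfer size_kb at rate_mb_s (assuming 1 MB = 1024 KB), in ms."""
    return size_kb / (rate_mb_s * 1024.0) * 1000.0


def service_time_ms(seek_ms: float, rpm: float, size_kb: float,
                    rate_mb_s: float, controller_ms: float) -> float:
    """Mean I/O service time = seek + rotational latency + transfer + controller."""
    return (seek_ms + mean_latency_ms(rpm)
            + transfer_time_ms(size_kb, rate_mb_s) + controller_ms)


# Exercise 1 data: 1 KB sector, 15000 RPM, 100 MB/s, 8 ms seek, 0.1 ms controller
t = service_time_ms(8.0, 15000.0, 1.0, 100.0, 0.1)
print(f"mean I/O service time = {t:.2f} ms")
```

The transfer time is not rounded inside the function, so the result differs from 10.11 ms only in the fourth decimal place.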


Exercise 2

• mean I/O service time = 8ms + 2ms + 0.01ms + 0.1ms = 10.11ms

• We want the mean I/O service time to be less than 5ms


• What percentage of data locality do I need?
the data locality of a disk is the percentage of blocks that do not
need seek or rotational latency to be found

• (1-Locality)*(8ms + 2ms) + 0.01ms + 0.1ms < 5ms

• Locality > 51.1%
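The inequality above can be solved for the locality directly. The following is a small sketch under the same numbers as the exercise; `min_locality` is an illustrative name, not something defined in the course material.

```python
def min_locality(target_ms: float, seek_ms: float, latency_ms: float,
                 transfer_ms: float, controller_ms: float) -> float:
    """Smallest locality L such that
    (1 - L) * (seek + latency) + transfer + controller equals target_ms."""
    fixed = transfer_ms + controller_ms      # paid on every access
    positioning = seek_ms + latency_ms       # paid only on non-local accesses
    return 1.0 - (target_ms - fixed) / positioning


loc = min_locality(5.0, 8.0, 2.0, 0.01, 0.1)
print(f"locality must exceed {loc:.1%}")
```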


Exercise 3

• How long does it take to write a 10MB file that is quite fragmented on disk (i.e., Locality = 35%)?
• 10 MB = 10240KB = 10240 blocks

• mean I/O service time = (1-0.35)*(8ms + 2ms) + 0.01ms + 0.1ms = 6.61ms

• 10240*6.61ms = 67.6864s

• Other approach:
• 10240 * (1-0.35) * 10.11ms + 10240 * 0.35 * 0.11 ms =
67.6864s
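Both approaches of Exercise 3 can be checked against each other with a short sketch. The variable and function names are assumptions made for illustration; the numeric inputs are those of the exercise.

```python
def service_time_with_locality_ms(locality: float, seek_ms: float, latency_ms: float,
                                  transfer_ms: float, controller_ms: float) -> float:
    """Mean service time when a fraction `locality` of the accesses
    needs neither seek nor rotational latency."""
    return (1.0 - locality) * (seek_ms + latency_ms) + transfer_ms + controller_ms


blocks = 10 * 1024                      # 10 MB file in 1 KB blocks
locality = 0.35

# First approach: mean service time per block, times the number of blocks
per_block_ms = service_time_with_locality_ms(locality, 8.0, 2.0, 0.01, 0.1)
total_s = blocks * per_block_ms / 1000.0

# Other approach: non-local blocks pay 10.11 ms, local blocks pay 0.11 ms
alt_total_s = (blocks * (1.0 - locality) * 10.11 + blocks * locality * 0.11) / 1000.0

print(f"{per_block_ms:.2f} ms per block, {total_s:.4f} s in total")
```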

RAID disks
RAID levels

RAID 0 striping only


RAID 1 mirroring only
§ RAID 0+1 (nested levels)
§ RAID 1+0 (nested levels)
RAID 2 bit interleaving (not used)
RAID 3 byte interleaving - redundancy (parity disk)
RAID 4 block interleaving - redundancy (parity disk)
RAID 5 block interleaving - redundancy (parity block distributed), widely used
§ RAID 5+0 (nested levels)
RAID 6 greater redundancy (2 failed disks are tolerated)
RAID 7 (proprietary solutions)
RAID levels (combined)

RAID levels can be combined

RAID x + y (or RAID xy) =>


§ n x m disks in total
§ Consider m groups of n disks
§ Apply RAID x to each group of n disks
§ Apply RAID y considering the m groups as single disks
[Figure: m groups of n disks each; RAID x is applied inside every group, RAID y across the m groups]
RAID 0, 1, 0+1 and 1+0 organizations

A1, B1,... data blocks

RAID 0 (2 disks)
A1 A2
A3 A4
A5 A6
A7 A8

RAID 1 (2 disks)
A1 A1
A2 A2
A3 A3
A4 A4

RAID 0+1 (RAID 1 over two RAID 0 groups of 3 disks)
RAID 0       RAID 0
A1 A2 A3     A1 A2 A3
A4 A5 A6     A4 A5 A6
B1 B2 B3     B1 B2 B3
B4 B5 B6     B4 B5 B6

RAID 1+0 (RAID 0 over three RAID 1 pairs)
RAID 1    RAID 1    RAID 1
A1 A1     A2 A2     A3 A3
A4 A4     A5 A5     A6 A6
B1 B1     B2 B2     B3 B3
B4 B4     B5 B5     B6 B6
RAID 4, 5, 5+0

A1, B1, ... data blocks
Ap ... parity blocks

RAID 4          RAID 5
A1 A2 A3 Ap     A1 A2 A3 Ap
B1 B2 B3 Bp     B1 B2 Bp B3
C1 C2 C3 Cp     C1 Cp C2 C3
D1 D2 D3 Dp     Dp D1 D2 D3

RAID 5+0 (RAID 0 over three RAID 5 groups)
RAID 5       RAID 5       RAID 5
A1 A2 Ap     A3 A4 Ap     A5 A6 Ap
B1 Bp B2     B3 Bp B4     B5 Bp B6
Cp C1 C2     Cp C3 C4     Cp C5 C6
D1 D2 Dp     D3 D4 Dp     D5 D6 Dp
RAID 6 organization

A1, B1,... are data blocks
Ap, Aq,... are parity blocks

RAID 6
A1 A2 A3 Ap Aq
B1 B2 Bp Bq B3
C1 Cp Cq C2 C3
Dp Dq D1 D2 D3

Not efficient when the number of disks is small!

As the number of disks increases, the loss of efficiency decreases, but the probability of 2 concurrent failures increases, so RAID 6 becomes mandatory.
selection of RAID level (2)

Redundancy required?
• NO: RAID 0
§ fast read/write
§ low reliability
§ high-perf computing (speed and capacity are more important than reliability)
• YES: duplication required?
• YES (duplication): RAID 1
§ small random writes
§ high reliability
§ database appl. (high transaction rate)
§ only 50% of the capacity (high cost)
• NO (no duplication): RAID 5 or RAID 6
§ RAID 5: random small read/write, medium reliability
§ RAID 6: fault tolerance with two failures
characteristics of RAID levels (3)

RAID level | Capacity | Reliability | R/W performance  | Rebuild performance | Suggested applications
0          | 100%     | N/A         | Very good        | Good                | Non critical data
1          | 50%      | Excellent   | Very good / good | Good                | Critical information
3          | (n-1)/n  | Good        | Good / fair      | Fair                | Single user, large file processing, image processing
5          | (n-1)/n  | Good        | Good / fair      | Poor                | Database, transaction based applications
6          | (n-2)/n  | Excellent   | Very good / poor | Poor                | Critical information, w/minimal
1+0        | 50%      | Excellent   | Very good / good | Good                | Critical information, w/better performance
3+0/5+0    | (n-1)/n  | Excellent   | Very good / good | Fair                | Critical information w/fair performance

• RAID disks do not protect against multiple failures (except lev. 6)


• thus back-ups!!! back-ups!!!
RAID
reliability
primary metrics

§ MTTF: Mean Time To Failure, mean duration of the functioning period from the start (if the resource is new) or restart (after a failure has been repaired or the resource has been replaced by a new one) to the following failure

§ MTTR: Mean Time To Repair, mean downtime for the repair of a failure (of one component)

§ MTTDL: Mean Time To Data Loss, mean time required to have a number of disk failures that cause a loss of data in the array, i.e., such that it is impossible to reconstruct the lost data from the redundant information (e.g., 1 in a RAID 0, 2 in a RAID 5, 3 in a RAID 6)
MTTF – Minimum failure time of n disks

Assuming that the times to failure are exponentially distributed, let F_X(t) be the distribution function of the time to failure X:

$$F_X(t) = 1 - e^{-\frac{t}{\mathrm{MTTF}}}$$

§ The distribution function of the minimum of n independent, identically distributed random variables X_1 … X_n is:

$$F_{\min(X_1 \dots X_n)}(t) = 1 - \left(1 - F_X(t)\right)^n$$

(Why the exponential distribution is used, and why the minimum has this formula, will be proven in the availability part of the course)
MTTF – Minimum failure time of n disks (2)

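Considering the exponential distribution of the time to failure we have:

$$F_{\min(X_1 \dots X_n)}(t) = 1 - \left(1 - \left(1 - e^{-\frac{t}{\mathrm{MTTF}}}\right)\right)^n = 1 - e^{-\frac{n \cdot t}{\mathrm{MTTF}}}$$

§ Maclaurin series:

$$e^{-x} = 1 - x + \frac{x^2}{2!} - \frac{x^3}{3!} + \dots$$

§ If x << 1, then we can approximate $e^{-x}$ with $1 - x$.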


MTTF – Minimum failure time of n disks (3)

§ Usually we have $n \cdot t \ll \mathrm{MTTF}$.

§ We can thus approximate the probability that a set of n disks, each with a given MTTF, experiences its first failure within t days as:

$$F_{\min(X_1 \dots X_n)}(t) \cong \frac{n \cdot t}{\mathrm{MTTF}}$$
MTTF – MTTDL of n disks

The time to the first failure among n disks is still exponentially distributed:

$$F_{\min(X_1 \dots X_n)}(t) = 1 - e^{-\frac{n \cdot t}{\mathrm{MTTF}}}$$

§ The mean of this distribution is thus:

$$E\left[\min(X_1 \dots X_n)\right] = \frac{\mathrm{MTTF}}{n}$$

§ In other words, the MTTDL of n disks without redundancy is equal to the MTTF of one disk, divided by n.
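The exact distribution of the first failure, its linear approximation and its mean can be compared numerically. This is only a sketch: the function names are hypothetical, and the values n = 8 and MTTF = 1000 days are chosen as example inputs.

```python
import math


def cdf_first_failure(t: float, n: int, mttf: float) -> float:
    """Exact probability that at least one of n disks has failed by time t."""
    return 1.0 - math.exp(-n * t / mttf)


def cdf_first_failure_linear(t: float, n: int, mttf: float) -> float:
    """Linear approximation n*t/MTTF, reasonable only when n*t << MTTF."""
    return n * t / mttf


def mean_time_to_first_failure(n: int, mttf: float) -> float:
    """Mean of the exponential distribution with rate n/MTTF."""
    return mttf / n


n, mttf = 8, 1000.0
for t in (1.0, 10.0, 100.0):
    exact = cdf_first_failure(t, n, mttf)
    approx = cdf_first_failure_linear(t, n, mttf)
    print(f"t = {t:5.0f} days: exact = {exact:.4f}, linear = {approx:.4f}")

print("mean time to first failure:", mean_time_to_first_failure(n, mttf), "days")
```

The linear form always overestimates the exact probability, and the gap grows as n·t approaches the MTTF.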
Exercise 2: MTTDL of an array RAID0

§ MTTF(disk) = 1000 days;


§ n = 8 disks
§ MTTR = 10 days
§ MTTDL = MTTF(1 disk) / n = 125 days

§ MTTDL does not depend on MTTR


RAID 0

$$\mathrm{MTTDL} = \frac{\mathrm{MTTF}}{n}$$

[Figure: MTTDL versus MTTF for RAID 0 with MTTR = 1; exact curves for N = 4, 5 and 6]
Exercise 3: MTTDL of an array RAID1

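The system fails if there is a second failure while the first failed resource is still under repair.
We can approximate the probability (per unit of time) that one of the two disks fails first as:

$$P(1^{st}\ \mathrm{fail}) = \frac{2}{\mathrm{MTTF}}$$

And compute the failure probability of the entire RAID 1 as:

$$P(\mathrm{RAID1}) = P(1^{st}\ \mathrm{fail}) \cdot P(2^{nd}\ \mathrm{fail} < \mathrm{MTTR}) = \frac{2}{\mathrm{MTTF}} \cdot \frac{\mathrm{MTTR}}{\mathrm{MTTF}}$$

Finally we can approximate the MTTDL as:

$$\mathrm{MTTDL} = \frac{1}{P(\mathrm{RAID1})} = \frac{\mathrm{MTTF}^2}{2 \cdot \mathrm{MTTR}}$$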
Exercise 3: MTTDL of an array RAID1 (2)

§ MTTF(disk) = 1000 days; total number of disks = 2
§ MTTR = 10 days
§ p(1st failure) = n / MTTF(disk) = 2 / 1000
§ p(2nd failure: of the mirror disk) = MTTR / MTTF = 10 / 1000
§ MTTDL = 1/(2/1000 × 10/1000) = 50000 days

§ The RAID 1 approach has extended the lifetime of a single disk by a factor of 50.
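The two-step reasoning for RAID 1 (first failure, then failure of the mirror during the repair) can be written as a minimal sketch. `mttdl_raid1` is a hypothetical helper name; it implements the approximate formula, not the exact curve shown in the plots.

```python
def mttdl_raid1(mttf: float, mttr: float) -> float:
    """Approximate MTTDL of a two-disk mirror."""
    p_first = 2.0 / mttf       # either of the two disks fails first
    p_second = mttr / mttf     # the surviving disk fails during the repair
    return 1.0 / (p_first * p_second)


mttdl = mttdl_raid1(1000.0, 10.0)
print(f"MTTDL = {mttdl:.0f} days, {mttdl / 1000.0:.0f} times the MTTF of one disk")
```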
M. Gribaudo
RAID 1

$$\mathrm{MTTDL} = \frac{\mathrm{MTTF}^n}{n \cdot \mathrm{MTTR}^{n-1}}$$

[Figure: MTTDL versus MTTF (logarithmic axes) for RAID 1 with MTTR = 1; exact and approximate curves for N = 2 and N = 3]
Exercise 4: MTTDL of an array RAID 1+0

In RAID 1+0, the second fault causes data loss only if it hits the mirror of the disk that is already broken.

Here we consider 8 groups (RAID 0) of 2 disks each (RAID 1), for a total of 16 disks.

§ MTTF(disk) = 1000 days; total number of disks = 16 (2 × 8)
§ MTTR = 10 days
§ p(1st failure) = n / MTTF(disk) = 16 / 1000
§ p(2nd failure: of the mirror disk of the 1st failed) = MTTR / MTTF = 10 / 1000
§ MTTDL = 1/(16/1000 × 10/1000) = 6250 days

§ Note: MTTDL is the same for any pair of disks (data + mirror) of the array
§ In general: MTTDL = MTTF^2 / (n * MTTR)
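As a sketch of the same computation in code, the function below follows the two probabilities used in the exercise. The name `mttdl_raid10` is an assumption made for illustration.

```python
def mttdl_raid10(mttf: float, mttr: float, n: int) -> float:
    """Approximate MTTDL of a RAID 1+0 array with n disks (n/2 mirrored pairs)."""
    p_first = n / mttf         # any of the n disks fails first
    p_second = mttr / mttf     # its mirror fails during the repair
    return 1.0 / (p_first * p_second)


mttdl = mttdl_raid10(1000.0, 10.0, 16)
print(f"MTTDL(RAID 1+0, 16 disks) = {mttdl:.0f} days")
```

Only one specific disk (the mirror) is dangerous after the first failure, which is why n appears once and not squared.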
RAID 1+0

MTTDL = MTTF^2 / (n * MTTR)

[Figure: MTTDL versus MTTF (logarithmic axes) for RAID 1+0 with MTTR = 1; exact and approximate curves for N = 8, 12 and 16]
Exercise 5: MTTDL of an array RAID 0+1

In RAID 0+1, the second fault causes data loss if it hits any of the disks of the other group.

Here we consider 2 groups (RAID 1) of 8 disks each (RAID 0), 16 disks in total.

§ MTTF(disk) = 1000 days
§ total number of disks = 16 (2 × 8)
§ MTTR = 10 days
§ p(1st failure) = one of the 16 = n / MTTF(disk) = 16 / 1000
§ p(2nd failure: a disk fails in the other mirror group) = MTTR / (MTTF × 2/n) = 10 × 8 / 1000

§ MTTDL = 1/(16/1000 × 10 × 8/1000) = 781 days

§ In general: MTTDL = 2 * MTTF^2 / (n^2 * MTTR)
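The RAID 0+1 case differs only in the second probability, since any of the n/2 disks of the other group is fatal. A minimal sketch, with the hypothetical name `mttdl_raid01`:

```python
def mttdl_raid01(mttf: float, mttr: float, n: int) -> float:
    """Approximate MTTDL of a RAID 0+1 array: two mirrored stripes of n/2 disks."""
    p_first = n / mttf                  # any of the n disks fails first
    p_second = (n / 2) * mttr / mttf    # any disk of the other stripe fails during repair
    return 1.0 / (p_first * p_second)


mttdl = mttdl_raid01(1000.0, 10.0, 16)
print(f"MTTDL(RAID 0+1, 16 disks) = {mttdl:.2f} days")
```

Dividing MTTF^2 / (n * MTTR) by this formula gives n/2, so with 16 disks RAID 1+0 outlives RAID 0+1 by a factor of 8.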
RAID 0+1

MTTDL = 2 * MTTF^2 / (n^2 * MTTR)

[Figure: MTTDL versus MTTF (logarithmic axes) for RAID 0+1 with MTTR = 1; exact and approximate curves for N = 8, 12 and 16]
RAID 1+0 and RAID 0+1 comparison

We have seen that:


the RAID 1+0 has an MTTDL of 6250 days
the RAID 0+1 has an MTTDL of 781 days
a RAID 0 with the same capacity (with only 8 disks) has an MTTDL of
125 days

• RAID 1+0 has MTTDL that is 6250/781 = 8 times larger than the
RAID 0+1
• However, as we have seen, RAID 0+1 can be simpler to
implement, and this is why sometimes it is used.
Failures in RAID 5

G: number of disks in RAID 5

RAID 5 fails when two disks are failed concurrently (the second one fails during the interval required to repair the first one).
If there are G disks in the beginning, when one has failed, G-1 disks remain.
The probability of the second failure must consider that one of the G-1 remaining disks fails during the MTTR of the first:

$$P(2^{nd}\ \mathrm{fail} < \mathrm{MTTR}) = 1 - \left(e^{-\frac{\mathrm{MTTR}}{\mathrm{MTTF}}}\right)^{G-1} \cong \frac{\mathrm{MTTR} \cdot (G-1)}{\mathrm{MTTF}}$$
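Failures in RAID 5 (2)

Thus, as for the RAID 1 case, we have:

$$P(\mathrm{RAID5}) = P(1^{st}\ \mathrm{fail}) \cdot P(2^{nd}\ \mathrm{fail} < \mathrm{MTTR}) = \frac{G}{\mathrm{MTTF}} \cdot \frac{(G-1) \cdot \mathrm{MTTR}}{\mathrm{MTTF}}$$

$$\mathrm{MTTDL} = \frac{1}{P(\mathrm{RAID5})} = \frac{\mathrm{MTTF}^2}{G \cdot (G-1) \cdot \mathrm{MTTR}}$$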
Failures in RAID 6

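Similarly, for RAID 6, where two more disks must fail before the repair of the first disk is completed, we have:

$$P(\mathrm{RAID6}) = P(1^{st}\ \mathrm{fail}) \cdot P(2^{nd}\ \mathrm{fail} < \mathrm{MTTR}) \cdot P(3^{rd}\ \mathrm{fail} < \mathrm{MTTR}/2) = \frac{G}{\mathrm{MTTF}} \cdot \frac{(G-1) \cdot \mathrm{MTTR}}{\mathrm{MTTF}} \cdot \frac{(G-2) \cdot \mathrm{MTTR}}{2 \cdot \mathrm{MTTF}}$$

§ We have to use MTTR/2 since, when the 3rd failure occurs, there are two disks being repaired, and thus we have to consider the minimum of the two repair times.

$$\mathrm{MTTDL} = \frac{1}{P(\mathrm{RAID6})} = \frac{2 \cdot \mathrm{MTTF}^3}{G \cdot (G-1) \cdot (G-2) \cdot \mathrm{MTTR}^2}$$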
Exercise 6: MTTDL of a RAID5 and RAID6

§ MTTF(disk) = 1000 days;
§ n = 5 disks for RAID5, and n = 6 for RAID6
§ MTTR = 10 days

§ Both systems have exactly the same capacity (equivalent to four disks).

$$\mathrm{MTTDL}_{\mathrm{RAID5}} = \frac{1000^2}{5 \cdot 4 \cdot 10} = 5000\ \mathrm{days}$$

$$\mathrm{MTTDL}_{\mathrm{RAID6}} = \frac{2 \cdot 1000^3}{6 \cdot 5 \cdot 4 \cdot 10^2} = 166667\ \mathrm{days}$$

§ RAID6 has an MTTDL about 33.3 times larger than RAID5!
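The two formulas of this exercise can be evaluated with a short sketch. The function names are hypothetical, and `g` stands for the number of disks called G in the derivation.

```python
def mttdl_raid5(mttf: float, mttr: float, g: int) -> float:
    """Approximate MTTDL of a RAID 5 array with g disks."""
    return mttf**2 / (g * (g - 1) * mttr)


def mttdl_raid6(mttf: float, mttr: float, g: int) -> float:
    """Approximate MTTDL of a RAID 6 array with g disks."""
    return 2.0 * mttf**3 / (g * (g - 1) * (g - 2) * mttr**2)


r5 = mttdl_raid5(1000.0, 10.0, 5)
r6 = mttdl_raid6(1000.0, 10.0, 6)
print(f"RAID 5: {r5:.0f} days, RAID 6: {r6:.0f} days, ratio: {r6 / r5:.1f}")
```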
RAID 5

$$\mathrm{MTTDL} = \frac{\mathrm{MTTF}^2}{n \cdot (n-1) \cdot \mathrm{MTTR}}$$

[Figure: MTTDL versus MTTF (logarithmic axes) for RAID 5 with MTTR = 1; exact and approximate curves for N = 4, 8 and 16]
RAID 6

$$\mathrm{MTTDL} = \frac{2 \cdot \mathrm{MTTF}^3}{n \cdot (n-1) \cdot (n-2) \cdot \mathrm{MTTR}^2}$$

[Figure: MTTDL versus MTTF (logarithmic axes) for RAID 6 with MTTR = 1; exact and approximate curves for N = 5, 8 and 16]
Exercise 7: Comparison of RAID 1+0 and RAID 6

§ MTTF(disk) = 1000 days;
§ n = 4 disks, 2 for data and 2 for redundancy
§ MTTR = 10 days

§ Compare the MTTDL of RAID 6 (4 disks), and RAID 1+0 (2 groups of 2 disks, 4 disks in total).

$$\mathrm{MTTDL}_{\mathrm{RAID6}} = \frac{2 \cdot 1000^3}{4 \cdot 3 \cdot 2 \cdot 10^2} = 833333\ \mathrm{days}$$

$$\mathrm{MTTDL}_{\mathrm{RAID1+0}} = \frac{1000^2}{4 \cdot 10} = 25000\ \mathrm{days}$$

§ RAID 6 has a higher MTTDL than RAID 1+0, even though they both use the same number of disks
Exercise 8: RAID 1+0 and RAID 6 in general

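§ In general, with 4 disks:

$$\mathrm{MTTDL}_{\mathrm{RAID6}} = \frac{2 \cdot \mathrm{MTTF}^3}{4 \cdot 3 \cdot 2 \cdot \mathrm{MTTR}^2}$$

$$\mathrm{MTTDL}_{\mathrm{RAID1+0}} = \frac{\mathrm{MTTF}^2}{4 \cdot \mathrm{MTTR}}$$

$$\frac{\mathrm{MTTDL}_{\mathrm{RAID6}}}{\mathrm{MTTDL}_{\mathrm{RAID1+0}}} = \frac{\mathrm{MTTF}}{3 \cdot \mathrm{MTTR}}$$

RAID 6 is better if MTTF > 3 * MTTR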
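The break-even point MTTF = 3 * MTTR can be checked numerically for the 4-disk case. This is a sketch with hypothetical function names; the second and third input pairs are made-up values chosen only to sit at and below the threshold.

```python
def mttdl_raid6(mttf: float, mttr: float, g: int) -> float:
    """Approximate MTTDL of a RAID 6 array with g disks."""
    return 2.0 * mttf**3 / (g * (g - 1) * (g - 2) * mttr**2)


def mttdl_raid10(mttf: float, mttr: float, n: int) -> float:
    """Approximate MTTDL of a RAID 1+0 array with n disks."""
    return mttf**2 / (n * mttr)


def raid6_over_raid10(mttf: float, mttr: float) -> float:
    """MTTDL ratio for 4 disks; it should reduce to MTTF / (3 * MTTR)."""
    return mttdl_raid6(mttf, mttr, 4) / mttdl_raid10(mttf, mttr, 4)


for mttf, mttr in [(1000.0, 10.0), (30.0, 10.0), (20.0, 10.0)]:
    ratio = raid6_over_raid10(mttf, mttr)
    better = "RAID 6" if ratio > 1.0 else "RAID 1+0 (or tie)"
    print(f"MTTF = {mttf:6.0f}, MTTR = {mttr:4.0f}: ratio = {ratio:6.3f} -> {better}")
```

With realistic values (MTTF far larger than MTTR) the ratio is well above 1, so the threshold matters only as a limit case of the approximation.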
MTTDL of an array RAID 5+0

We have a total of N disks, divided into m groups of G disks each.

The first failure is of any of the N disks (from any of the groups).
The second failure must happen among the G-1 remaining disks that compose the group where the first failure occurred.
A simple way of deriving the formula is by applying the RAID 0 formula (with m groups) to the MTTDL of a RAID 5 of G disks.

$$\mathrm{MTTDL}_{\mathrm{array}} = \frac{\mathrm{MTTDL}_{\mathrm{group}}}{m} = \frac{\dfrac{\mathrm{MTTF}^2}{G \cdot (G-1) \cdot \mathrm{MTTR}}}{\dfrac{N}{G}} = \frac{\mathrm{MTTF}^2}{N \cdot (G-1) \cdot \mathrm{MTTR}}$$
MTTDL of an array RAID 6+0

If the number of parity disks per group is two (RAID 6+0), the data are lost when a third disk fails during the repair interval of the first two failed disks; the second and the third failures must both happen in the same group as the first.
We can apply the same trick used for the RAID 5+0 to compute the MTTDL of a RAID 6+0:

$$\mathrm{MTTDL}_{\mathrm{array}} = \frac{2 \cdot \mathrm{MTTF}^3}{N \cdot (G-1) \cdot (G-2) \cdot \mathrm{MTTR}^2}$$
Exercise 9: MTTDL of an array RAID 5 + 0

RAID 5+0 (RAID 0 over RAID 5 groups)
RAID 5       RAID 5       RAID 5
A1 A2 Ap     A1 A2 Ap     A1 A2 Ap
B1 Bp B2     B1 Bp B2     B1 Bp B2
Cp C1 C2     Cp C1 C2     Cp C1 C2
D1 D2 Dp     D1 D2 Dp     D1 D2 Dp

§ N=25 total number of disks organized in 5 groups of G=5 disks each
§ each group has a redundant disk
§ MTTF(disk)=1000 days, MTTR(disk)=10 days
Exercise 9 (cont.)

§ The MTTDL of the RAID 5+0 is:

$$\mathrm{MTTDL}_{\mathrm{RAID5+0}} = \frac{1000^2}{25 \cdot 4 \cdot 10} = 1000\ \mathrm{days}$$
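The "RAID 0 over RAID 5 groups" derivation can be sketched in code and compared with the closed formula. The function names are hypothetical, and the sketch assumes that the total number of disks is a multiple of the group size.

```python
def mttdl_raid5(mttf: float, mttr: float, g: int) -> float:
    """Approximate MTTDL of a single RAID 5 group of g disks."""
    return mttf**2 / (g * (g - 1) * mttr)


def mttdl_raid50(mttf: float, mttr: float, n_total: int, g: int) -> float:
    """RAID 0 formula applied to m = n_total / g RAID 5 groups."""
    m = n_total // g           # assumes n_total is a multiple of g
    return mttdl_raid5(mttf, mttr, g) / m


via_groups = mttdl_raid50(1000.0, 10.0, 25, 5)
closed_form = 1000.0**2 / (25 * (5 - 1) * 10.0)
print(f"via groups: {via_groups:.0f} days, closed form: {closed_form:.0f} days")
```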
RAID 5+0

$$\mathrm{MTTDL} = \frac{\mathrm{MTTF}^2}{n \cdot (g-1) \cdot \mathrm{MTTR}}$$

[Figure: MTTDL versus MTTF (logarithmic axes) for RAID 5+0 with MTTR = 1; exact and approximate curves for N = 15, 25 and 50]
Exercise 10: comparison between RAID 5 and RAID 5+0

§ Compare the MTTDL of a 25-disk RAID 5+0, divided into 5 groups, with the MTTDL of a 21-disk RAID 5.
§ Both systems have exactly the same capacity.

§ With 21 disks we have an array RAID 5 with:
§ MTTDL(RAID 5) = MTTF^2 / (21 × 20 × MTTR)
§ With 25 disks, it is possible to have an array RAID 5+0 with:
§ MTTDL(RAID 5+0) = MTTF^2 / (25 × 4 × MTTR)

§ Thus, the extra 4 disks allow an MTTDL that is more than four times larger:

§ MTTDL(RAID 5+0) / MTTDL(RAID 5) = (21 × 20) / (25 × 4) = 21/5 = 4.2


Exercise 11: comparison between RAID 6 and RAID 5+0

§ Compare the MTTDL of 6 disks in a RAID 5+0, divided into 2 groups of 3 disks each, with the MTTDL of a 6-disk RAID 6.
§ Both systems have exactly the same capacity and use the same number of disks.

§ For RAID 5+0 we have:
§ MTTDL(RAID 5+0) = MTTF^2 / (6 × 2 × MTTR)
§ For RAID 6 we have:
§ MTTDL(RAID 6) = 2 × MTTF^3 / (6 × 5 × 4 × MTTR^2)

§ MTTDL(RAID 6) / MTTDL(RAID 5+0) = MTTF / (5 × MTTR)

§ If MTTF > 5 × MTTR, then RAID 6 is better. Otherwise RAID 5+0 is more reliable.
Exercise 12: RAID 6 and RAID 5+0 in general

§ In general, compare the MTTDL of 2k disks, divided into 2 groups of k disks each, for a RAID 5+0, with the MTTDL of a 2k-disk RAID 6.

§ For RAID 5+0 we have:
§ MTTDL(RAID 5+0) = MTTF^2 / (2k × (k-1) × MTTR)
§ For RAID 6 we have:
§ MTTDL(RAID 6) = 2 × MTTF^3 / (2k × (2k-1) × (2k-2) × MTTR^2)

§ MTTDL(RAID 6) / MTTDL(RAID 5+0) = 2 × (k-1) × MTTF / ((2k-1) × (2k-2) × MTTR) = MTTF / ((2k-1) × MTTR), since 2k-2 = 2 × (k-1)

§ If 2*(k-1)*MTTF > (2k-1)*(2k-2)*MTTR, i.e., MTTF > (2k-1)*MTTR, then RAID 6 is better. Otherwise RAID 5+0 is more reliable.
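The general comparison can be verified for a few values of k with a minimal sketch. The function names are hypothetical, and the code assumes k >= 2 so that both denominators are non-zero.

```python
def mttdl_raid50_two_groups(mttf: float, mttr: float, k: int) -> float:
    """Approximate MTTDL of a RAID 5+0 made of 2 groups of k disks."""
    return mttf**2 / (2 * k * (k - 1) * mttr)


def mttdl_raid6(mttf: float, mttr: float, n: int) -> float:
    """Approximate MTTDL of a RAID 6 array with n disks."""
    return 2.0 * mttf**3 / (n * (n - 1) * (n - 2) * mttr**2)


def raid6_over_raid50(mttf: float, mttr: float, k: int) -> float:
    """MTTDL(RAID 6 with 2k disks) / MTTDL(RAID 5+0 with 2 groups of k disks)."""
    return mttdl_raid6(mttf, mttr, 2 * k) / mttdl_raid50_two_groups(mttf, mttr, k)


mttf, mttr = 1000.0, 10.0
for k in (3, 5, 10):
    ratio = raid6_over_raid50(mttf, mttr, k)
    simplified = mttf / ((2 * k - 1) * mttr)
    print(f"k = {k:2d}: ratio = {ratio:.3f}, MTTF / ((2k-1) * MTTR) = {simplified:.3f}")
```

For k = 3 the ratio equals MTTF / (5 * MTTR), which is the 6-disk result of the previous exercise.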