Trademarks
IBM, the IBM logo, and ibm.com are trademarks or registered trademarks of International Business Machines Corp., registered in many jurisdictions worldwide. Other product and service names might be trademarks of IBM or other companies. A current list of IBM trademarks is available on the Web at "Copyright and trademark information", www.ibm.com/legal/copytrade.shtml.
Linux is a registered trademark of Linus Torvalds in the United States, other countries, or both.
Java and all Java-based trademarks and logos are trademarks of Sun Microsystems, Inc. in the United States, other countries, or both.
Agenda
Conclusions
To speed up data transfer to one volume or a group of volumes, use more than 1 rank, because then simultaneously
more than 8 physical disks are used
more cache and NVS can be used
more device adapters can be used
Speed-up techniques
use more than 1 rank: storage pool striping, Linux striped logical volume
use more channels: ECKD logical path groups, SCSI Linux multipath multibus
ECKD: use more than 1 subchannel: PAV, HyperPAV
z9 LPAR:
8 CPUs
512 MiB memory
4 FICON Express4 features
2 ports per feature used for FICON
2 ports per feature used for FCP
Total: 8 FICON paths, 8 FCP paths
DS8700:
8 FICON Express4
1 port per feature used for FICON
1 port per feature used for FCP
Linux:
SLES11 SP1 (with HyperPAV, High Performance FICON)
Kernel: 2.6.32.13-0.5-default (+ dm stripe patch)
Device mapper:
multipath: version 1.2.0
striped: version 1.3.0
multipath-tools-0.4.8-40.23.1
8 FICON paths defined in a channel path group
[Diagram: System z connected through a switch to the DS8K, with FICON ports, FCP ports, and unused ports marked]
Configuration DS8700
DS8700:
In case the adapters and disks are not already accessible (here: 8 channel paths and 2 volumes):
#--- first volume: add its FCP LUN (placeholder <fcp_lun_volume_1>) on all 8 paths ---
echo <fcp_lun_volume_1> > /sys/bus/ccw/drivers/zfcp/0.0.1700/0x500507630410c7ed/unit_add
echo <fcp_lun_volume_1> > /sys/bus/ccw/drivers/zfcp/0.0.1780/0x500507630408c7ed/unit_add
echo <fcp_lun_volume_1> > /sys/bus/ccw/drivers/zfcp/0.0.1800/0x500507630400c7ed/unit_add
echo <fcp_lun_volume_1> > /sys/bus/ccw/drivers/zfcp/0.0.5000/0x500507630418c7ed/unit_add
echo <fcp_lun_volume_1> > /sys/bus/ccw/drivers/zfcp/0.0.5100/0x50050763041bc7ed/unit_add
echo <fcp_lun_volume_1> > /sys/bus/ccw/drivers/zfcp/0.0.5900/0x500507630413c7ed/unit_add
echo <fcp_lun_volume_1> > /sys/bus/ccw/drivers/zfcp/0.0.5a00/0x500507630403c7ed/unit_add
echo <fcp_lun_volume_1> > /sys/bus/ccw/drivers/zfcp/0.0.5b00/0x50050763040bc7ed/unit_add
#--- second volume: add its FCP LUN (placeholder <fcp_lun_volume_2>) on all 8 paths ---
echo <fcp_lun_volume_2> > /sys/bus/ccw/drivers/zfcp/0.0.5100/0x50050763041bc7ed/unit_add
echo <fcp_lun_volume_2> > /sys/bus/ccw/drivers/zfcp/0.0.5900/0x500507630413c7ed/unit_add
echo <fcp_lun_volume_2> > /sys/bus/ccw/drivers/zfcp/0.0.5a00/0x500507630403c7ed/unit_add
echo <fcp_lun_volume_2> > /sys/bus/ccw/drivers/zfcp/0.0.5b00/0x50050763040bc7ed/unit_add
echo <fcp_lun_volume_2> > /sys/bus/ccw/drivers/zfcp/0.0.1700/0x500507630410c7ed/unit_add
echo <fcp_lun_volume_2> > /sys/bus/ccw/drivers/zfcp/0.0.1780/0x500507630408c7ed/unit_add
echo <fcp_lun_volume_2> > /sys/bus/ccw/drivers/zfcp/0.0.1800/0x500507630400c7ed/unit_add
echo <fcp_lun_volume_2> > /sys/bus/ccw/drivers/zfcp/0.0.5000/0x500507630418c7ed/unit_add
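If the FCP devices themselves are still offline, they have to be set online before the LUNs can be added; a minimal sketch, assuming the channel device numbers used above:
# set the eight FCP devices online
chccwdev -e 0.0.1700,0.0.1780,0.0.1800,0.0.5000,0.0.5100,0.0.5900,0.0.5a00,0.0.5b00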
Prepare /etc/multipath.conf so the device mapper uses the multibus path grouping policy and an appropriate value for switching paths (a sketch of the resulting configuration follows the commands below)
/etc/init.d/multipathd stop
Shutting down multipathd
cp /etc/multipath.conf /etc/multipath.conf.backup
vim /etc/multipath.conf
set path_grouping_policy multibus
set rr_min_io 1
/etc/init.d/multipathd start
Starting multipathd
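A minimal sketch of the corresponding /etc/multipath.conf fragment, assuming the two values go into the defaults section:
defaults {
        path_grouping_policy    multibus
        rr_min_io               1
}
With path_grouping_policy multibus all paths to a volume end up in one path group, and rr_min_io 1 makes the round-robin path selector switch to the next path after every I/O request.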
dmsetup table
36005076304ffc7ed0000000000006114_part1: 0 20971488 linear 253:2 32
36005076304ffc7ed0000000000006024: 0 20971520 multipath 0 0 1 1 round-robin 0 8 1 8:64 1 8:96 1 8:112 1 8:0 1 8:32 1 8:16 1 8:48 1 8:80 1
36005076304ffc7ed0000000000006024_part1: 0 20971488 linear 253:0 32
36005076304ffc7ed0000000000006114: 0 20971520 multipath 0 0 1 1 round-robin 0 8 1 8:128 1 8:160 1 8:144 1 8:176 1 8:192 1 8:208 1 8:224 1 8:240 1
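The multipath lines follow the device-mapper multipath table format; a decoded view of the first one (field names taken from the dm-multipath target parameters):
# <start> <length> multipath <#features> <#hwhandler_args> <#path_groups> <initial_group>
#   per group: <selector> <#selector_args> <#paths> <#path_args>, per path: <major:minor> <repeat_count>
# 0 20971520 multipath 0 0 1 1 round-robin 0 8 1 8:64 1 8:96 1 ...
# -> one path group, round-robin over 8 paths, repeat count 1 per path (the rr_min_io 1 setting)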
Multipath details
multipath -ll
36005076304ffc7ed0000000000006024 dm-0 IBM,2107900
size=10G features='0' hwhandler='0' wp=rw
`-+- policy='round-robin 0' prio=1 status=active
  |- 4:0:12:1076117600 sde 8:64  active ready running
  |- 6:0:10:1076117600 sdg 8:96  active ready running
  |- 7:0:11:1076117600 sdh 8:112 active ready running
  |- 0:0:15:1076117600 sda 8:0   active ready running
  |- 2:0:14:1076117600 sdc 8:32  active ready running
  |- 1:0:8:1076117600  sdb 8:16  active ready running
  |- 3:0:9:1076117600  sdd 8:48  active ready running
  `- 5:0:12:1076117600 sdf 8:80  active ready running
36005076304ffc7ed0000000000006114 dm-2 IBM,2107900
size=10G features='0' hwhandler='0' wp=rw
`-+- policy='round-robin 0' prio=1 status=active
  |- 4:0:12:1075069025 sdi 8:128 active ready running
  |- 6:0:10:1075069025 sdk 8:160 active ready running
  |- 5:0:12:1075069025 sdj 8:144 active ready running
  |- 7:0:11:1075069025 sdl 8:176 active ready running
  |- 0:0:15:1075069025 sdm 8:192 active ready running
  |- 1:0:8:1075069025  sdn 8:208 active ready running
  |- 2:0:14:1075069025 sdo 8:224 active ready running
  `- 3:0:9:1075069025  sdp 8:240 active ready running
#--- create logical volume ---
lvcreate --stripes 2 --stripesize 64K --extents 5118 --name vol01 vg01
Logical volume "vol01" created
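The lvcreate above assumes a volume group vg01 on the two multipath devices; a hypothetical sketch of that preparation, assuming one partition per multipath device (the _part1 devices from the dmsetup output) is used as physical volume:
#--- create physical volumes and volume group (sketch) ---
pvcreate /dev/mapper/36005076304ffc7ed0000000000006024_part1
pvcreate /dev/mapper/36005076304ffc7ed0000000000006114_part1
vgcreate vg01 /dev/mapper/36005076304ffc7ed0000000000006024_part1 /dev/mapper/36005076304ffc7ed0000000000006114_part1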
lvdisplay
--- Logical volume ---
LV Name                /dev/vg01/vol01
VG Name                vg01
LV UUID                uBqAxh-QvlQ-4Qq8-peI3-UJj1-PRyr-IrhsjV
LV Write Access        read/write
LV Status              available
# open                 0
LV Size                19.99 GB
Current LE             5118
Segments               1
Allocation             inherit
Read ahead sectors     auto
- currently set to     1024
Block device           253:4
#--- activate LV ---
lvchange -a y /dev/vg01/vol01
#--- create file system ---
mkfs -t ext3 /dev/vg01/vol01
#--- mount LV ---
mount /dev/vg01/vol01 /mnt/subw0
Check data
dmsetup table
36005076304ffc7ed0000000000006114_part1: 0 20971488 linear 253:2 32
36005076304ffc7ed0000000000006024: 0 20971520 multipath 0 0 1 1 round-robin 0 8 1 8:64 1 8:96 1 8:112 1 8:0 1 8:32 1 8:16 1 8:48 1 8:80 1
36005076304ffc7ed0000000000006024_part1: 0 20971488 linear 253:0 32
vg01-vol01: 0 41156608 striped 2 128 253:1 384 253:3 384
36005076304ffc7ed0000000000006114: 0 20971520 multipath 0 0 1 1 round-robin 0 8 1 8:128 1 8:160 1 8:144 1 8:176 1 8:192 1 8:208 1 8:224 1 8:240 1
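The vg01-vol01 line follows the device-mapper striped target format; decoded (chunk size is given in 512-byte sectors):
# vg01-vol01: <start> <length> striped <#stripes> <chunk_sectors> <dev1> <offset1> <dev2> <offset2>
#             0       41156608 striped 2          128             253:1  384       253:3  384
# -> 2 stripes with 128-sector (64 KiB) chunks, matching the --stripesize 64K of the lvcreate,
#    spread over 253:1 and 253:3, presumably the two _part1 partition devices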
HyperPAV devices
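A minimal sketch of setting HyperPAV alias devices online under SLES11, assuming hypothetical alias device numbers 0.0.75f0-0.0.75f7 defined in the same logical control unit as the base device:
chccwdev -e 0.0.75f0-0.0.75f7
lsdasd then lists the base devices as usual and the alias devices with the status "alias"; with HyperPAV the DASD driver distributes I/O for a base device across the available aliases automatically.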
Workload
Threaded I/O benchmark (IOzone)
Each process writes or reads a single file
Options to bypass the page cache; separate execution of sequential write, sequential read and random read/write
Setup
Main memory was restricted to 512 MiB
File size: 2 GiB, record size: 8 KiB or 64 KiB
Runs with 1, 8 and 32 processes
Sequential run: write, rewrite, read
Random run: write, read (with a previous sequential write)
Runs with direct I/O and with the Linux page cache
Sync and drop caches prior to every invocation of the workload to reduce noise (an illustrative invocation follows below)
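A sketch of what one of these runs could look like with IOzone, assuming the file system is mounted at /mnt/subw0 (options and file names chosen for illustration, not the exact command line used for the measurements):
sync; echo 3 > /proc/sys/vm/drop_caches
iozone -t 8 -s 2g -r 64k -I -i 0 -i 1 -i 2 -F /mnt/subw0/f1 /mnt/subw0/f2 /mnt/subw0/f3 /mnt/subw0/f4 /mnt/subw0/f5 /mnt/subw0/f6 /mnt/subw0/f7 /mnt/subw0/f8
Here -t 8 runs 8 processes, -s 2g and -r 64k set file and record size, -I requests direct I/O (omit it for the page cache runs), and -i 0/1/2 select sequential write/rewrite, sequential read, and random read/write.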
Scenario naming used in the following charts (baseline: 1 single disk)
FICON/ECKD:
ECKD 1d = one DASD volume
ECKD 1d hpav = one DASD volume with HyperPAV aliases
ECKD 1d hpav sps = one storage pool striped DASD volume with HyperPAV aliases
ECKD 2d lv hpav sps = Linux striped logical volume over two storage pool striped DASD volumes with HyperPAV aliases
ECKD 8d lv hpav = Linux striped logical volume over eight DASD volumes with HyperPAV aliases
FCP/SCSI:
SCSI 1d = one SCSI disk over a single path
SCSI 1d mb = one SCSI disk with multipath multibus
SCSI 1d mb sps = one storage pool striped SCSI disk with multipath multibus
SCSI 2d lv mb sps = Linux striped logical volume over two storage pool striped SCSI disks with multipath multibus
SCSI 8d lv mb = Linux striped logical volume over eight SCSI disks with multipath multibus
ECKD
For 1 process the scenarios show equal throughput
For 8 processes HyperPAV improves the throughput by up to 5.9x
For 32 processes the combination of a Linux logical volume with HyperPAV DASD improves throughput by 13.6x
SCSI
For 1 process the scenarios show equal throughput
For 8 processes multipath multibus improves throughput by 1.4x
For 32 processes multipath multibus improves throughput by 3.5x
ECKD versus SCSI
Throughput for the corresponding scenario is always higher with SCSI
[Chart: throughput vs. number of processes (1, 8, 32) for the ECKD and SCSI scenarios]
ECKD
For 1 process the throughput of all scenarios shows only minor deviation
For 8 processes HyperPAV plus storage pool striping or a Linux logical volume improves the throughput by 3.5x
For 32 processes the combination of a Linux logical volume with HyperPAV or storage pool striped DASD improves throughput by 10.8x
SCSI
For 1 process the throughput of all scenarios shows only minor deviation
For 8 processes the combination of storage pool striping and multipath multibus improves throughput by 5.4x
For 32 processes the combination of a Linux logical volume and multipath multibus improves throughput by 13.3x
ECKD versus SCSI
ECKD is better for 1 process
SCSI is better for multiple processes
General
More NVS keeps throughput up with 32 processes
[Chart: throughput vs. number of processes (1, 8, 32) for the ECKD and SCSI scenarios]
ECKD
For 1 process the scenarios show equal throughput
For 8 processes HyperPAV improves the throughput by up to 5.9x
For 32 processes the combination of a Linux logical volume with HyperPAV DASD improves throughput by 13.7x
SCSI
For 1 process the throughput of all scenarios shows only minor deviation
For 8 processes multipath multibus improves throughput by 1.5x
For 32 processes multipath multibus improves throughput by 3.5x
ECKD versus SCSI
Throughput for the corresponding scenario is always higher with SCSI
General
Same picture as for random read
[Chart: throughput vs. number of processes (1, 8, 32) for the ECKD and SCSI scenarios]
ECKD
For 1 process the throughput of all scenarios shows only minor deviation
For 8 processes HyperPAV plus a Linux logical volume improves the throughput by 5.7x
For 32 processes the combination of a Linux logical volume with HyperPAV DASD improves throughput by 8.8x
SCSI
For 1 process multipath multibus improves throughput by 2.5x
For 8 processes the combination of storage pool striping and multipath multibus improves throughput by 2.1x
For 32 processes the combination of a Linux logical volume and multipath multibus improves throughput by 4.3x
ECKD versus SCSI
For 1 process there are sometimes advantages for ECKD, sometimes for SCSI
SCSI is better in most cases for multiple processes
[Chart: throughput vs. number of processes (1, 8, 32) for the ECKD and SCSI scenarios]
ECKD
For 1 process the throughput of all scenarios shows only minor deviation
For 8 processes HyperPAV improves the throughput by up to 6.2x
For 32 processes the combination of a Linux logical volume with HyperPAV DASD improves throughput by 13.8x
SCSI
For 1 process multipath multibus improves throughput by 2.8x
For 8 processes multipath multibus improves throughput by 4.3x
For 32 processes the combination of a Linux logical volume and multipath multibus improves throughput by 6.5x
ECKD versus SCSI
SCSI is better in most cases
[Chart: throughput vs. number of processes (1, 8, 32) for the ECKD and SCSI scenarios]
General
Compared to direct I/O:
Helps to increase throughput for scenarios with 1 or a few processes
Limits throughput in the many-process case
The advantage of SCSI scenarios with additional features is no longer visible
ECKD
HyperPAV, storage pool striping and a Linux logical volume still improve throughput by up to 4.6x
SCSI
Multipath multibus with storage pool striping and/or a Linux logical volume still improves throughput by up to 2.2x
[Chart: throughput vs. number of processes (1, 8, 32) for the ECKD and SCSI scenarios]
General
The SLES11 read-ahead setting of 1024 helps a lot to improve throughput
Compared to direct I/O:
Big throughput increase with 1 or a few processes
Limits throughput in the many-process case for SCSI
The advantage of SCSI scenarios with additional features is no longer visible
The number of available pages in the page cache limits the throughput at a certain rate
ECKD
HyperPAV, storage pool striping and a Linux logical volume still improve throughput by up to 9.3x
SCSI
Multipath multibus with storage pool striping and/or a Linux logical volume still improves throughput by up to 4.8x
[Chart: throughput vs. number of processes (1, 8, 32) for the ECKD and SCSI scenarios]
[Charts: ext3 versus xfs throughput for initial write, rewrite, read, random read and random write]
General
xfs improves disk I/O, especially writes in our case
Page-cached I/O has lower throughput, due to the memory-constrained setup
Improvement in our setup
sequential write up to 62% (page-cached I/O)
sequential write 20% (direct I/O)
random write up to 41% (direct I/O)
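The xfs numbers would come from repeating the runs on an XFS file system on the same logical volume; a sketch following the ext3 commands shown earlier:
#--- create file system ---
mkfs -t xfs /dev/vg01/vol01
#--- mount LV ---
mount /dev/vg01/vol01 /mnt/subw0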
Conclusions
Small sets of I/O processes benefit from the Linux page cache in the case of sequential I/O
Reads benefit most from using HyperPAV (FICON/ECKD) and multipath multibus (FCP/SCSI)
CPU consumption
Linux features such as the page cache, PAV, striped logical volumes or multipath consume additional processor cycles
The consumption
grows with the number of I/O requests and/or the number of I/O processes
depends on the Linux distribution and the versions of components such as the device mapper or the device drivers
depends on customizable values such as the Linux memory size (and implicitly the page cache size), the read-ahead value, the number of alias devices, the number of paths, the rr_min_io setting and the I/O request size issued by the applications
is similar for ECKD and SCSI in the 1-disk case with no further options
HyperPAV and static PAV in SLES11 consume far fewer CPU cycles than static PAV in older Linux distributions
Summary
Linux options
Hardware options
FICON Express4 or 8
Number of channel paths to the storage server
Port selection to exploit link speed
No switch interconnects with less bandwidth
Storage server configuration
Extent pool definitions
Disk placement
Storage pool striped volumes
Questions
Further information is at
Linux on System z Tuning hints and tips
http://www.ibm.com/developerworks/linux/linux390/perf/index.html
Live Virtual Classes for z/VM and Linux
http://www.vm.ibm.com/education/lvc/
Mustafa Mešanović
Linux on System z
Performance Evaluation