
Exadata and Database Machine

Administration Workshop
Student Guide

D67016GC20
Edition 2.0
January 2011
D71669

Authors
Peter Fusek
Jean-Francois Verrier
Mark Fuller
Dave Winter

Technical Contributors and Reviewers
Andrew Babb
Bharat Baddepudi
Maria Billings
Robert Carlin
Michael Cebulla
Nilesh Choudhury
Christian Craft
Ravindra Dani
Aslam Edah-Tally
Boris Erlikhman
Amit Ganesh
Ed Gilowski
Joel Goodman
Scott Gossett
Jim Hall
Roger Hansen
James He
David Hitchcock
Bill Hodak
Vimala Jacob
Martin Jensen
Kevin Jernigan
Caroline Johnston
Larry Justice
Vikram Kapoor
Bruce Kyro
Sumeet Lahorani
Sue Lee
Juan Loaiza
Barb Lundhild
Varun Malhotra
Louis Nagode
Dan Norris
Michael Nowak
Sriram Palapudi
Umesh Panchaksharaiah
Sugam Pandey
Robert Pastijn
Marshall Presser
Georg Schmidt
Akshay Shah
Kam Shergill
Tim Shelter
Eric Siglin
Sundararaman Sridharan
Vijay Sridharan
Mahesh Subramaniam
Lawrence To
Alex Tsukerman
Kodi Umamageswaran
Douglas Utzig
Harald van Breederode
Mark Van de Wiel
Dave Winter

Publishers
Sujatha Nagendra
Giri Venugopal

Copyright 2010, Oracle and/or its affiliates. All rights reserved.

Disclaimer
This document contains proprietary information and is
protected by copyright and other intellectual property
laws. You may copy and print this document solely for
your own use in an Oracle training course. The
document may not be modified or altered in any way.
Except where your use constitutes "fair use" under
copyright law, you may not use, share, download,
upload, copy, print, display, perform, reproduce, publish,
license, post, transmit, or distribute this document in
whole or in part without the express authorization of
Oracle.

The information contained in this document is subject to
change without notice. If you find any problems in the
document, please report them in writing to: Oracle
University, 500 Oracle Parkway, Redwood Shores,
California 94065 USA. This document is not warranted
to be error-free.

Restricted Rights Notice
If this documentation is delivered to the United States
Government or anyone using the documentation on
behalf of the United States Government, the following
notice is applicable:
U.S. GOVERNMENT RIGHTS
The U.S. Government's rights to use, modify, reproduce,
release, perform, display, or disclose these training
materials are restricted by the terms of the applicable
Oracle license agreement and/or the applicable U.S.
Government contract.

Trademark Notice
Oracle and Java are registered trademarks of Oracle
and/or its affiliates. Other names may be trademarks of
their respective owners.

Contents

1 Introduction
Course Objectives 1-2
Audience and Prerequisites 1-3
Course Scope 1-4
Course Contents 1-5
Terminology 1-6
Additional Resources 1-7
Practice 1 Overview: Introducing the Laboratory Environment 1-8

2 Exadata Overview
Objectives 2-2
Traditional Enterprise Database Storage Deployment 2-3
Exadata Storage Deployment 2-4
Exadata Implementation Architecture Overview 2-6
Introducing Exadata 2-7
Exadata Hardware Details (Sun Fire X4270 M2) 2-8
Exadata Specifications 2-9
InfiniBand Network 2-10
Classic Database I/O and SQL Processing Model 2-11
Exadata Smart Scan Model 2-12
Exadata Smart Storage Capabilities 2-13
Exadata Smart Scan Scale-Out Example 2-16
Exadata Hybrid Columnar Compression 2-19
Exadata Hybrid Columnar Compression Architecture Overview 2-20
Exadata Smart Flash Cache 2-21
Exadata Storage Index 2-23
Storage Index with Partitions Example 2-25
Database File System 2-26
I/O Resource Management 2-27
Benefits Multiply 2-28
Exadata Key Benefits for Data Warehousing 2-29
Exadata Key Benefits for OLTP 2-31
Quiz 2-32
Summary 2-34
Additional Resources 2-35
Practice 2 Overview: Introducing Exadata Features 2-36

3 Exadata Architecture
Objectives 3-2
Exadata Software Architecture Overview 3-3
Exadata Software Architecture Details 3-5
Exadata Smart Flash Cache Architecture 3-7
Exadata Monitoring Architecture 3-9
Disk Storage Entities and Relationships 3-10
Interleaved Grid Disks 3-12
Flash Storage Entities and Relationships 3-13
Disk Group Configuration 3-14
Quiz 3-15
Summary 3-17
Additional Resources 3-18
Practice 3 Overview: Introducing Exadata Cell Architecture 3-19

4 Exadata Configuration
Objectives 4-2
Exadata Installation and Configuration Overview 4-3
Initial Network Preparation 4-4
Configuration of New Exadata Servers 4-6
Answering Questions During the Initial Boot Sequence 4-7
Exadata Administrative User Accounts 4-11
Configuring a New Exadata Cell 4-12
Important I/O Metrics for Oracle Databases 4-13
Testing Performance Using CALIBRATE 4-14
Configuring the Exadata Cell Server Software 4-15
Creating Cell Disks 4-16
Creating Grid Disks 4-17
Creating Flash-Based Grid Disks 4-18
Configuring Hosts to Access Exadata Cells 4-19
Configuring ASM and Database Instances for Exadata 4-20
Configuring ASM Disk Groups for Exadata 4-21
Optional Configuration Tasks 4-22
Exadata Storage Security Overview 4-23
Exadata Storage Security Implementation 4-24
Quiz 4-26
Summary 4-29
Additional Resources 4-30
Practice 4 Overview: Configuring Exadata 4-31

5 Exadata Performance Monitoring and Maintenance
Objectives 5-2
Monitoring Overview 5-3
Exadata Metrics and Alerts Architecture 5-4
Monitoring Exadata with Metrics 5-6
Monitoring Exadata with Metrics: Example 5-8
Monitoring Exadata with Alerts 5-9
Displaying Alert Examples 5-11
Monitoring Exadata with Active Requests 5-13
Monitoring SQL Execution Plans 5-14
Smart Scan Execution Plan Example 5-15
Predicate Offloading Considerations 5-16
Monitoring Exadata from Your Database 5-17
Monitoring Exadata with Wait Events 5-18
Monitoring Exadata with Enterprise Manager 5-19
Additional Monitoring Tools and Utilities 5-20
Cell Maintenance Overview 5-21
Automated Cell Maintenance Operations 5-23
Replacing a Damaged Physical Disk 5-24
Replacing a Damaged Flash Card 5-26
Moving All Disks from One Cell to Another 5-27
Using the Exadata Software Rescue Procedure 5-28
Quiz 5-30
Summary 5-32
Additional Resources 5-33
Practice 5 Overview: Monitoring Exadata 5-34

6 Exadata and I/O Resource Management
Objectives 6-2
I/O Resource Management Overview 6-3
I/O Resource Management Concepts 6-5
I/O Resource Management Plans 6-6
IORM Architecture 6-7
I/O Resource Management Plans Example 6-8
Enabling Intradatabase Resource Management 6-11
Intradatabase Plan Example 6-12
Enabling IORM for Multiple Databases 6-13
Interdatabase Plan Example 6-14
Category Plan Example 6-16
Complete Example 6-17
Using Database I/Os Metrics 6-20
Quiz 6-21
Summary 6-25
Additional Resources 6-26

7 Optimizing Database Performance with Exadata
Objectives 7-2
Optimizing Performance 7-3
Flash Memory Usage 7-4
Compression Usage 7-6
Index Usage 7-8
ASM Allocation Unit Size 7-9
Minimum Extent Size 7-10
Quiz 7-11
Summary 7-13
Additional Resources 7-14
Practice 7 Overview: Optimizing Database Performance with Exadata 7-15

8 Database Machine Overview and Architecture
Objectives 8-2
Introducing Database Machine 8-3
Database Machine X2-2 Full Rack 8-4
X2-2 Database Server Hardware Details (Sun Fire X4170 M2) 8-5
Start Small and Grow 8-6
Database Machine X2-8 Full Rack 8-7
X2-8 Database Server Hardware Details (Sun Fire X4800) 8-8
Database Machine Capacity 8-9
Database Machine Performance 8-10
Database Machine X2-2 Architecture 8-11
InfiniBand Network Architecture 8-13
X2-2 Leaf Switch Topology 8-14
Full Rack Spine and Leaf Topology 8-15
Scale Performance and Capacity 8-16
Scaling Out to Multiple Full Racks 8-17
Quiz 8-18
Summary 8-20

9 Database Machine Configuration
Objectives 9-2
Database Machine Implementation Overview 9-3
Configuration Worksheet Overview 9-5
Getting Started 9-6
Configuration Worksheet Example 9-7
Configuring ASM Disk Groups with Configuration Worksheet 9-11
Generating the Configuration Files 9-13
Other Pre-Installation Tasks 9-14
The Result After Installation and Configuration 9-15
Supported Additional Configuration Activities 9-17
Unsupported Configuration Activities 9-18
Quiz 9-20
Summary 9-22
Additional Resources 9-23

10 Migrating Databases to Database Machine


Objectives 10-2
Migration Best Practices Overview 10-3
Performing Capacity Planning 10-4
Database Machine Migration Considerations 10-5
Choosing the Right Migration Path 10-6
Logical Migration Approaches 10-7
Physical Migration Approaches 10-9
Other Approaches 10-11
Post-Migration Best Practices 10-12
Quiz 10-13
Summary 10-15
Additional Resources 10-16
Practice 10 Overview: Migrating to Database Machine using Transportable Tablespaces 10-18

11 Bulk Data Loading with Database Machine
Objectives 11-2
Bulk Data Loading Overview 11-3
Preparing the Data Files 11-4
Staging the Data Files 11-5
Configuring the Staging Area 11-6
Configuring the Staging Area 11-7
Configuring the Target Database 11-10
Loading the Target Database 11-11
Quiz 11-13
Summary 11-15
Additional Resources 11-16
Practice 11 Overview: Bulk Data Loading with Database Machine 11-17
12 Backup and Recovery with Database Machine
Objectives 12-2
Backup and Recovery Overview 12-3
Using RMAN with Database Machine 12-4
General Recommendations for RMAN 12-5
Disk Based Backup Strategy 12-7
Disk Based Backup Configuration 12-8
Tape Based Backup Strategy 12-10
Tape Based Backup Configuration 12-11
Hybrid Backup Strategy 12-15
Restore and Recovery Recommendations 12-16
Backup and Recovery of Database Machine Software 12-17
Quiz 12-18
Summary 12-20
Additional Resources 12-21
Practice 12 Overview: Using RMAN Optimizations for Database Machine 12-22
13 Monitoring and Maintaining Database Machine
Objectives 13-2
Monitoring Tools Overview 13-3
ILOM Overview 13-4
ILOM Example 13-6
DCLI Overview 13-7
DCLI Examples 13-8
InfiniBand Diagnostic Utilities 13-9
Database Machine Support Overview 13-11
Patching and Updating Overview 13-12
Maintaining Exadata Software 13-13
Maintaining Database Server Software 13-14
Maintaining Other Software 13-15
Quiz 13-16
Summary 13-18
Additional Resources 13-19
Practice 13 Overview: Using the distributed command line utility (dcli) 13-20

A New Features in Update Release 11.2.1.3.1


Objectives A-2
New Features Overview A-3
Auto Service Request (ASR) A-4
The ASR Process A-5
ASR Requirements A-6
Oracle Linux 5.5 A-7
Enhanced Operating System Security A-8
Pro-active Disk Quarantine A-9
Other New Features A-10
Summary A-11

Introduction

Course Objectives
After completing this seminar, you should be able to:
Describe the key capabilities of Exadata and Database
Machine
Identify the benefits of using Database Machine for
different application classes
Describe the architecture of Database Machine and its
integration with Oracle Database, Clusterware and ASM
Complete the initial configuration of Database Machine
Describe various recommended approaches for migrating
to Database Machine
Configure Exadata I/O Resource Management
Monitor Database Machine health and optimize
performance

Exadata and Database Machine Administration Workshop 1 - 2

Audience and Prerequisites

This course is primarily designed for administrators who
will configure and administer Oracle Exadata Database
Machine.
Prior knowledge and understanding of the following is
assumed:
Oracle Database 11g Release 2, including RAC and ASM.
Linux and general network, storage and system
administration concepts.

Recommended prior training:

Oracle Database 11g: Administration Workshop I


Oracle Database 11g: Administration Workshop II
Oracle 11g: RAC and Grid Infrastructure Administration
Oracle Linux: Linux Fundamentals

Audience and Prerequisites


This seminar is primarily designed for administrators who will configure and administer Oracle
Exadata Database Machine.
Please be mindful of the prerequisites because this course does not teach all aspects of the
technologies used inside Database Machine. Rather, it focuses on topics that are specific to
Exadata and Database Machine.
Prior knowledge and understanding of Oracle Database 11g Release 2, including Automatic
Storage Management (ASM) and Real Application Clusters (RAC), is assumed. In addition, a
working knowledge of Linux is assumed, along with an understanding of general networking,
storage and system administration concepts.
For students who do not meet these prerequisites, the recommended prior training includes
the following courses:
Oracle Database 11g: Administration Workshop I
Oracle Database 11g: Administration Workshop II
Oracle 11g: RAC and Grid Infrastructure Administration
Oracle Linux: Linux Fundamentals

Exadata and Database Machine Administration Workshop 1 - 3

Course Scope

This course covers two main subject areas:

Exadata Storage Server X2-2
This section focuses on the architecture and key capabilities of
Exadata along with how to configure, monitor and optimize it.

Oracle Exadata Database Machine
This section introduces students to Database Machine.
The installation and configuration process is covered so that
students can make appropriate configuration decisions.
Students also learn how to maintain, monitor and optimize
Database Machine after initial configuration.

Hardware is discussed during the course; however,
detailed hardware installation and maintenance is outside
the scope of this course.

Course Scope
This course covers two main subject areas:
The first section introduces students to Exadata Storage Server X2-2 (formerly known
as Exadata Storage Server Version 2). Students learn about the architecture and key
capabilities of Exadata along with how to configure, monitor and optimize it.
The second section introduces students to Oracle Exadata Database Machine. Students
learn about the various Database Machine configurations. The installation and
configuration process is covered so that students are equipped to make appropriate
up-front configuration decisions. They also learn how to maintain, monitor and optimize
Database Machine after initial configuration. Students are introduced to various options
for migrating to Database Machine and learn how to select the best approach.
Although the hardware components of Database Machine are introduced and described to
varying degrees throughout this course, you should consult the hardware documentation for
specific hardware installation and maintenance details.

Exadata and Database Machine Administration Workshop 1 - 4

Course Contents
1. Introduction
2. Exadata Overview
3. Exadata Architecture
4. Exadata Configuration
5. Exadata Monitoring and Maintenance
6. Exadata and I/O Resource Management
7. Optimizing Database Performance with Exadata
8. Database Machine Overview and Architecture
9. Database Machine Configuration
10. Migrating Databases to Database Machine
11. Bulk Data Loading with Database Machine
12. Backup and Recovery with Database Machine
13. Database Machine Monitoring and Maintenance

Course Contents
The slide shows the ordering of lessons in this course.

Exadata and Database Machine Administration Workshop 1 - 5

Terminology

Unless otherwise indicated, Exadata refers to Exadata
Storage Server.
Typically a reference to Exadata refers to the combination of
software and hardware used in Exadata Storage Server.
However, at times there are specific references to Exadata
hardware or Exadata software.
Unless otherwise indicated, Exadata X2-2 (formerly known
as Exadata Version 2) is implied throughout the course.
Exadata X2-2 is based on Sun hardware and is the only
version of Exadata supported in Oracle Exadata Database
Machine.

Unless otherwise indicated, Database Machine refers to
Oracle Exadata Database Machine.
Typically, Database Machine refers to the entire system
including both hardware and software.

Terminology
The slide indicates the conventions used throughout this course to abbreviate the formal
product names for Exadata Storage Server and Oracle Exadata Database Machine.

Exadata and Database Machine Administration Workshop 1 - 6

Additional Resources

Demonstrations (Viewlets)
http://www.oracle.com/technetwork/tutorials/index.html
Enter the Oracle Learning Library and conduct a search for
content in the Database Machine functional category. Look
out for demonstrations with Exadata and Database Machine
Version 2 Series in the title.

Oracle Technology Network (OTN) Exadata and Database Machine Page
http://www.oracle.com/technetwork/database/exadata/index.html

OTN Exadata Discussion Forum
http://forums.oracle.com/forums/forum.jspa?forumID=829

Exadata and Database Machine Administration Workshop 1 - 7

Practice 1 Overview:
Introducing the Laboratory Environment
In this practice you will be introduced to the laboratory
environment used to support all the practices during this
course.


Exadata and Database Machine Administration Workshop 1 - 8

Exadata Overview

Objectives
After completing this lesson, you should be able to:
Contrast the Exadata storage architecture with traditional
shared storage offerings
Describe the hardware components of Exadata
Outline the capabilities of Exadata
Describe the main advantages of using Exadata compared
to traditional storage servers


Exadata and Database Machine Administration Workshop 2 - 2

Traditional Enterprise Database Storage Deployment

[Figure labels: Database Servers; Storage Arrays]

Traditional Enterprise Database Storage Deployment


The graphic in the slide illustrates the traditional deployment approach for multiple databases.
Each database has an isolated allocation of storage resources and its bandwidth is limited by
the hardware allocated to it. The isolation and dedication of hardware resources to individual
databases can simultaneously lead to unused space and unused input/output (I/O) bandwidth
for some databases, and overcommitted bandwidth with insufficient free space in others. The
right balance is almost never achieved because real-world workloads are very dynamic.
Large storage arrays are used today for many enterprise database deployments. These large
storage arrays must be partitioned and have their bandwidth and space allocated across the
databases and applications sharing the storage array. Because these storage arrays house
vast quantities of mission-critical data, they must be highly engineered, and consequently
very expensive, to deliver high levels of reliability and availability. Enterprise-class storage
arrays are not only costly to procure, they also require highly specialized skills to manage and
maintain. The result is a very high total cost of ownership when traditional large storage
arrays are used in real-world enterprise database deployments.

Exadata and Database Machine Administration Workshop 2 - 3

Exadata Storage Deployment

[Figure labels: Oracle Database 11g Servers; Smart storage operations; I/O Resource Management; High performance storage network; Storage consolidation (transparent to databases); Data compression]

Exadata Storage Deployment


The graphic in the slide illustrates the general deployment approach with Exadata.
You can use Exadata to consolidate your storage environment. Using Exadata, multiple
databases can use storage from a single pool. Exadata uses Oracle Automatic Storage
Management (ASM) to evenly distribute the storage load for every database across
every available disk in the storage pool. Every database can use all the available disks
to maximize performance. Exadata requires the use of Oracle Database 11g Release 2.
Exadata works equally well with single-instance or Oracle Real Application Clusters
(RAC) databases. Users and database administrators use the same tools and
knowledge they are already familiar with. Being based on industry-standard components
and technologies, Exadata is inexpensive to deploy. In addition, tight integration with the
full suite of Oracle Database high-availability features ensures that the reliability and
integrity needs of mission-critical environments are met.
A key advantage of Exadata is the ability to offload some database processing to
Exadata servers. With Exadata, the database can offload single table scan predicate
filters and projections, join processing based on Bloom filters, along with CPU-intensive
decompression and decryption operations. This ability is known as SQL processing
offload or Smart Scan.

Exadata and Database Machine Administration Workshop 2 - 4

Exadata Storage Deployment (continued)

In addition to Smart Scan, Exadata has other smart storage capabilities including the
ability to offload incremental backup optimizations, file creation operations, and more. This
approach yields substantial CPU, memory, and I/O bandwidth savings in the database
server, resulting in potentially massive performance improvements.
Exadata includes Exadata Hybrid Columnar Compression. This feature provides very high
levels of data compression implemented inside Exadata. Exadata Hybrid Columnar
Compression allows the database to reduce the number of I/Os required to scan a table.
For example, for data with a compression ratio of 10 to 1, the I/Os required to scan the
data are reduced from 10 to 1 as well.
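The 10-to-1 relationship between compression ratio and scan I/O can be checked with simple arithmetic. The Python sketch below is only an illustration: the function name `scan_ios`, the 10 GB table size, and the 1 MB I/O size are assumptions chosen for the example and are not Exadata parameters.

```python
import math

def scan_ios(table_bytes: int, compression_ratio: float,
             io_size_bytes: int = 1024 * 1024) -> int:
    """Estimate how many fixed-size I/Os are needed to scan a table
    that is stored at the given compression ratio (hypothetical model)."""
    stored_bytes = table_bytes / compression_ratio
    return math.ceil(stored_bytes / io_size_bytes)

table_bytes = 10 * 1024 ** 3                  # assumed 10 GB of uncompressed data

uncompressed_ios = scan_ios(table_bytes, 1.0)   # no compression
compressed_ios = scan_ios(table_bytes, 10.0)    # 10-to-1 compression

# The ratio of the two I/O counts matches the compression ratio.
print(uncompressed_ios, compressed_ios, uncompressed_ios / compressed_ios)
```

In this model the number of I/Os falls in direct proportion to the compression ratio, which is the effect the paragraph above describes.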
Exadata ensures that I/O resources are made available whenever, and to whichever,
database needs them based on priorities and policies that you can define. The Database
Resource Manager (DBRM) and Exadata I/O Resource Management (IORM) work
together to manage intradatabase and interdatabase I/O resource usage to ensure that
your defined service-level agreements (SLAs) are met when multiple applications and
databases share Exadata storage.
Finally, even for queries that do not use Smart Scan, Exadata has many advantages over
conventional storage. Exadata is highly optimized for fast processing of large queries. It
has been carefully architected to ensure no bottlenecks in the controller or in other
components inside the storage server. It makes intelligent use of high-performance flash
memory to boost performance and also uses a state-of-the-art InfiniBand network that has
much higher throughput than conventional storage networks.

Exadata and Database Machine Administration Workshop 2 - 5

Exadata Implementation Architecture Overview

[Figure labels: Oracle Database 11g Servers; two Exadata Cells, each with Exadata software, Linux OS, and Disk]

Exadata Implementation Architecture Overview


Exadata is a self-contained storage platform that houses disk storage and runs the Exadata
Storage Server Software provided by Oracle. A single Exadata server is also called a cell. A
cell is the building block for a storage grid. More cells provide greater capacity and I/O
bandwidth. Databases are typically deployed across multiple cells, and multiple databases
can share a single cell. The databases and cells communicate with each other via a
high-performance InfiniBand network.
Each cell is a purely dedicated storage platform for Oracle Database files, although you can
use Database File System (DBFS), a feature of Oracle Database, to store your business files
inside the database.
Like other storage arrays, each cell is a computer with CPUs, memory, a bus, disks, network
adapters, and the other components normally found in a server. It also runs an operating
system (OS), which in the case of Exadata is Linux. The Oracle-provided software resident in
the Exadata cell runs under this operating system. The OS is accessible in a restricted mode
to administer and manage Exadata.

Exadata and Database Machine Administration Workshop 2 - 6

Introducing Exadata

Exadata Storage Server

High performance storage for Oracle Database
  Up to 1.8 GB/sec raw data bandwidth
  Up to 75,000 I/Os per second using flash
64-bit Intel-based Sun Fire Server
Preinstalled software
  Exadata Storage Server Software
  Oracle Linux x86_64
  Drivers and Utilities
Only available in conjunction with Database Machine

Introducing Exadata
Exadata is highly optimized for use with Oracle Database. Exadata delivers outstanding I/O
and SQL processing performance for data warehousing and online transaction processing
(OLTP) applications.
Exadata is based on a 64-bit Intel-based Sun Fire server. Oracle provides the storage server
software to impart database intelligence to the storage, and tight integration with Oracle
Database and its features. Each cell is shipped with all the hardware and software
components preinstalled, including the Exadata Storage Server Software, Oracle Linux
x86_64 operating system and InfiniBand protocol drivers.
Since March 2010, Exadata has no longer been offered as a standalone storage product. Exadata
is now available only for use in conjunction with Database Machine. Individual Exadata
servers can still be purchased; however, they must be connected to Database Machine.
Custom configurations using Exadata are no longer supported for new installations.

Exadata and Database Machine Administration Workshop 2 - 7

Exadata Hardware Details (Sun Fire X4270 M2)

Processors: 2 Six-Core Intel Xeon L5640 Processors (2.26 GHz)
Memory: 24 GB (6 x 4 GB)
Local Disks: 12 x 600 GB 15K RPM High Performance SAS, or 12 x 2 TB 7.2K RPM High Capacity SAS
Flash: 4 x 96 GB Sun Flash Accelerator F20 PCIe Cards
Disk Controller: Disk controller HBA with 512 MB battery backed cache
Network: Two InfiniBand 4X QDR (40 Gb/s) ports (1 dual-port PCIe 2.0 HCA); four embedded Gigabit Ethernet ports
Remote Management: 1 Ethernet port (ILOM)
Power Supplies: 2 redundant hot-swappable power supplies

Exadata Hardware Details (Sun Fire X4270 M2)


The slide shows a description of the Exadata Storage Server hardware.

Exadata and Database Machine Administration Workshop 2 - 8

Exadata Specifications

                                          HP Disks    HC Disks
Exadata Smart Flash Cache (1)             384 GB      384 GB
Raw Disk Capacity (1)                     7.2 TB      24 TB
Uncompressed Data Capacity (2)            2 TB        7 TB
Raw Disk Throughput (MBPS)                1,800       1,000
Effective Throughput with Flash (MBPS)    3,600       3,600
Disk I/Os per Second (IOPS)               3,600       1,440
Flash I/Os per Second (IOPS)              75,000      75,000

(1) Raw capacity calculated using 1 GB = 1000 x 1000 x 1000 bytes and 1 TB = 1000 x 1000 x 1000 x 1000 bytes.
(2) User Data: Actual space for uncompressed end-user data, computed after single mirroring (ASM normal redundancy)
and after allowing space for database structures such as temporary space, logs, undo space, and indexes. Actual user data
capacity varies by application. User Data capacity calculated using 1 TB = 1024 * 1024 * 1024 * 1024 bytes.

Exadata Specifications
Exadata is available in two configurations: with high performance (HP) disks or with high
capacity (HC) disks. The table in the slide lists the key capacity and performance
specifications for both configuration options.
Note: MBPS stands for megabytes per second, IOPS stands for I/Os per second.
Note: These metrics do not take into account compression. With compressed data, you can
achieve much higher effective throughput rates. In all cases, actual performance will vary by
application.
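The raw capacity figures follow directly from the per-cell hardware counts (12 disks and 4 flash cards) and the decimal units defined in footnote 1. The Python sketch below reproduces that arithmetic and shows how much smaller the same capacity looks after ASM normal-redundancy mirroring and conversion to the binary terabytes of footnote 2. All names are illustrative assumptions. The sketch does not attempt to derive the 2 TB and 7 TB user-data figures, because the space reserved for temporary space, logs, undo, and indexes is not quantified in this lesson.

```python
DECIMAL_GB = 1000 ** 3   # footnote 1: 1 GB = 1000 x 1000 x 1000 bytes
DECIMAL_TB = 1000 ** 4   # footnote 1: 1 TB = 1000^4 bytes
BINARY_TB = 1024 ** 4    # footnote 2: 1 TB = 1024^4 bytes

def raw_capacity_bytes(disk_count: int, disk_size_bytes: int) -> int:
    """Raw capacity of one cell: number of disks times disk size."""
    return disk_count * disk_size_bytes

hp_raw = raw_capacity_bytes(12, 600 * DECIMAL_GB)   # 12 x 600 GB high performance disks
hc_raw = raw_capacity_bytes(12, 2 * DECIMAL_TB)     # 12 x 2 TB high capacity disks
flash_gb = 4 * 96                                   # 4 x 96 GB flash cards

def mirrored_binary_tb(raw_bytes: int) -> float:
    """Capacity after single mirroring, expressed in binary terabytes."""
    return raw_bytes / 2 / BINARY_TB

print(hp_raw / DECIMAL_TB, hc_raw / DECIMAL_TB, flash_gb)
print(round(mirrored_binary_tb(hp_raw), 2), round(mirrored_binary_tb(hc_raw), 2))
```

The gap between the mirrored figures and the published user-data capacities is the allowance for database structures described in footnote 2.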

Exadata and Database Machine Administration Workshop 2 - 9

InfiniBand Network
InfiniBand:
Is the Exadata storage network:
  Provides the highest performance available (40 Gb/sec in each direction)
  Has been widely used in high-performance computing since 2002
Looks like normal Ethernet to host software:
  All IP-based tools work transparently (TCP/IP, UDP, HTTP, SSH, and so on)
Has the efficiency of a SAN:
  Zero copy and buffer reservation capabilities
Is used for both storage and RAC interconnect:
  Less configuration, lower cost, higher performance
Uses high-performance ZDP InfiniBand protocol (RDS V3):
  Zero-copy, zero-loss Datagram protocol
  Open Source software developed by Oracle
  Very low CPU overhead

InfiniBand Network
InfiniBand is the only storage network supported by Exadata because of its performance and
proven track record in high-performance computing. InfiniBand works like normal Ethernet but
is much faster. It has the efficiency of a SAN, using zero copy and buffer reservation. Zero copy
means that data is transferred across the network without intermediate buffer copies in the
various network layers. Buffer reservation is used so that the hardware knows exactly where
to place buffers ahead of time. These are two important characteristics that distinguish
InfiniBand from normal Ethernet.
InfiniBand is also supported as a unified network fabric for Exadata and the Oracle RAC
interconnect. This facilitates easier configuration and fewer cables and switches. You can
also use it for high-performance external connectivity, such as to connect backup servers or
ETL servers.
On top of InfiniBand, Exadata uses the Zero Data loss UDP (ZDP) protocol. ZDP is open
source software that is developed by Oracle. It is like UDP but more reliable. Its full technical
name is RDS (Reliable Datagram Sockets) V3. The ZDP protocol has a very low CPU
overhead with tests showing only a 2 percent CPU utilization while transferring 1 GB/sec of
data.
Each Exadata server is configured with one dual-port InfiniBand card designed to be
connected to two separate InfiniBand switches for high availability. Each InfiniBand link is
able to carry the full data bandwidth of the entire cell, which means you can lose an entire
network without losing any performance.

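The claim that a single link can carry the full bandwidth of a cell can be sanity-checked against the specification table. The Python sketch below is a rough estimate only. It assumes the 8b/10b line encoding used by QDR InfiniBand (background knowledge, not stated in this course), and it compares the result with the 3,600 MB/sec effective throughput with flash listed in the Exadata specifications. The function and variable names are illustrative.

```python
def link_data_rate_mb_per_sec(signal_bits_per_sec: int) -> int:
    """Usable data rate of one link in MB/sec (decimal megabytes),
    assuming 8b/10b encoding: 8 data bits for every 10 bits on the wire."""
    data_bits_per_sec = signal_bits_per_sec * 8 // 10
    return data_bits_per_sec // 8 // 1_000_000

qdr_signal_rate = 40_000_000_000      # 40 Gb/sec per direction, from the slide
cell_peak_mb_per_sec = 3600           # effective throughput with flash, from the spec table

one_link = link_data_rate_mb_per_sec(qdr_signal_rate)
single_link_is_enough = one_link >= cell_peak_mb_per_sec

print(one_link, single_link_is_enough)
```

Under these assumptions one link offers about 4,000 MB/sec, which is above the published per-cell peak, so the loss of one of the two links should not limit throughput.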
Exadata and Database Machine Administration Workshop 2 - 10

Classic Database I/O and SQL Processing Model

SELECT customer_id
FROM orders
WHERE order_amount > 20000;

[Figure labels: Extents identified; I/O issued; I/O executed: 10 GB returned; SQL processing: 2 MB returned; Row returned]

Classic Database I/O and SQL Processing Model


With traditional storage, all the database intelligence resides in the software on the database
server. To illustrate how SQL processing is performed in this architecture, an example of a
table scan is shown in the graphic in the slide.
1. The client issues a SELECT statement with a predicate to filter a table and return only
the rows of interest to the user.
2. The database kernel maps this request to the file and extents containing the table.
3. The database kernel issues the I/Os to read all the table blocks.
4. All the blocks for the table being queried are read into memory.
5. SQL processing is conducted against the data blocks searching for the rows that satisfy
the predicate.
6. The required rows are returned to the client.
As is often the case with large queries, the predicate filters out most of the rows in the
table. Yet all the blocks from the table need to be read, transferred across the storage
network, and copied into memory. Many more rows are read into memory than required to
complete the requested SQL operation. This generates a large amount of unproductive I/O,
which wastefully consumes resources and impacts application throughput and response time.

Exadata and Database Machine Administration Workshop 2 - 11

Exadata Smart Scan Model


SELECT customer_id FROM orders WHERE order_amount>20000;

[Slide graphic: the client issues the query; an iDB command is constructed and sent to the Exadata cells; SQL processing takes place in Exadata; 2 MB is returned to the server; a consolidated result set is built from all Exadata cells; the rows are returned to the client.]


Exadata Smart Scan Model


Using Exadata, database operations are handled differently. Queries that perform table scans
can be processed within Exadata and return only the required subset of data to the database
server. Row filtering, column filtering, some join processing, and other functions can be
performed within Exadata. Exadata uses a special direct-read mechanism for Smart Scan
processing. The above graphic illustrates how a table scan operates with Exadata:
1. The client issues a SELECT statement to return some rows of interest.
2. The database kernel determines that Exadata is available and constructs an iDB
command representing the SQL command and sends it to the Exadata cells. iDB is a
unique Oracle data transfer protocol that is used for Exadata storage communications.
3. The Exadata server software scans the data blocks to extract the relevant rows and
columns which satisfy the SQL command.
4. Exadata returns to the database instance an iDB message containing the requested
rows and columns of data. These results are not block images, so they are not stored in
the buffer cache.
5. The database kernel consolidates the result sets from across all the Exadata cells. This
is similar to how the results from a parallel query operation are consolidated.
6. The rows are returned to the client.
Moving SQL processing off the database server frees server CPU cycles and eliminates a
massive amount of unproductive I/O transfers. These resources are free to better service
other requests. Queries run faster, and more of them can be processed.
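
The numbered steps above can be sketched in a few lines of Python. This is only an illustrative model: the function names, the in-memory "cells", and the sample rows are assumptions made for the sketch, and none of it represents the iDB protocol or any Oracle API. It shows each cell filtering and projecting its own data, with the database server consolidating the small result sets.

```python
# Illustrative model only: real Smart Scan runs inside the Exadata cell
# software and communicates over iDB; none of these names are Oracle APIs.

def cell_smart_scan(rows, predicate, columns):
    """Filter rows and project columns on the (simulated) storage cell."""
    return [{c: row[c] for c in columns} for row in rows if predicate(row)]


def database_query(cells, predicate, columns):
    """Send the same request to every cell and consolidate the results."""
    result = []
    for cell_rows in cells:
        result.extend(cell_smart_scan(cell_rows, predicate, columns))
    return result


# Two hypothetical cells, each holding part of the orders table.
cells = [
    [{"customer_id": 1, "order_amount": 500},
     {"customer_id": 2, "order_amount": 25000}],
    [{"customer_id": 3, "order_amount": 90000},
     {"customer_id": 4, "order_amount": 20000}],
]

# Equivalent of: SELECT customer_id FROM orders WHERE order_amount>20000;
rows = database_query(cells, lambda r: r["order_amount"] > 20000, ["customer_id"])
print(rows)
```

Only the matching customer_id values leave the simulated cells; the order_amount column and the non-matching rows never reach the consolidation step.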
Exadata and Database Machine Administration Workshop 2 - 12

Exadata Smart Storage Capabilities

Predicate filtering:
Only the rows requested are returned to the database server
rather than all the rows in a table.

Column filtering:
Only the columns requested are returned to the database
server rather than all the columns in a table.


Exadata Smart Storage Capabilities


The following database functions are integrated within Exadata:
Exadata enables predicate filtering for table scans. Rather than returning all the rows for
the database to evaluate, Exadata returns only the rows that match the filter condition.
The conditional operators that are supported include =, !=, <, >, <=, >=, IS [NOT] NULL,
LIKE, [NOT] BETWEEN, [NOT] IN, EXISTS, IS OF type, NOT, AND, OR. In addition, many
common SQL functions are evaluated by Exadata during predicate filtering. For a full list
of functions that can be offloaded to Exadata, use the following query:
SELECT * FROM v$sqlfn_metadata WHERE offloadable = 'YES';
Exadata provides column filtering, also called column projection, for table scans. Only
the requested columns are returned to the database server rather than all columns in a
table. For tables with many columns, or columns containing LOBs, the I/O bandwidth
saved by column filtering can be very large.
When used together, the combination of predicate and column filtering dramatically improves
performance and reduces I/O bandwidth consumption. For example, when processing the
following query, Exadata returns only the employee names that are longer than five
characters:
SELECT name FROM employees WHERE LENGTH(name) > 5;
Without predicate and column filtering, the storage subsystem would need to send all the
rows and columns of the employees table to the database to evaluate.
Exadata and Database Machine Administration Workshop 2 - 13

Exadata Smart Storage Capabilities

Join processing:
Simple star join processing is performed within Exadata.

Scans on encrypted data


Scans on compressed data
Scoring for Data Mining:
All data mining scoring functions are offloaded.
Up to 10x performance gains.


Exadata Smart Storage Capabilities (continued)

Exadata performs join processing for star schemas (between large tables and small
lookup tables). This is implemented using Bloom filters, which are a very efficient
probabilistic method to determine whether an element is a member of a set.
Exadata performs Smart Scans on encrypted tablespaces and encrypted columns. For
encrypted tablespaces, Exadata can decrypt blocks and return the decrypted blocks to
Oracle Database, or it can perform row and column filtering on encrypted data.
Significant CPU savings can be made within the database server by offloading the CPU-intensive decryption task to Exadata cells.
Smart Scan works in conjunction with Exadata Hybrid Columnar Compression so that
column projection and row filtering can be executed along with decompression at the
storage level to save CPU cycles on the database servers.
Exadata can perform scoring functions for data mining models. All data mining scoring
functions, such as PREDICTION_PROBABILITY, are offloaded to Exadata cells for
processing. This accelerates warehouse analysis while it reduces database server CPU
consumption and the I/O load between the database server and Exadata.
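
To make the Bloom filter idea mentioned for join processing more concrete, here is a minimal Python sketch. The class, its sizing parameters, and the sample keys are assumptions chosen for illustration; the guide does not describe how Exadata implements its filters. The sketch builds a filter from the join keys of a small lookup table and uses it to discard rows of a large table that cannot possibly join.

```python
import hashlib


class BloomFilter:
    """Tiny Bloom filter: no false negatives, occasional false positives."""

    def __init__(self, size_bits=1024, num_hashes=3):
        self.size = size_bits
        self.num_hashes = num_hashes
        self.bits = [False] * size_bits

    def _positions(self, item):
        # Derive several bit positions per item by salting a SHA-256 hash.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.size

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos] = True

    def might_contain(self, item):
        return all(self.bits[pos] for pos in self._positions(item))


# Join keys from a hypothetical small lookup table.
lookup_keys = [10, 20, 30]
bloom = BloomFilter()
for key in lookup_keys:
    bloom.add(key)

# Rows of a hypothetical large table as (row_id, join_key) pairs.
fact_rows = [(1, 10), (2, 99), (3, 30), (4, 20)]
candidates = [row for row in fact_rows if bloom.might_contain(row[1])]
print(candidates)
```

Every row whose key is in the lookup table is guaranteed to survive the filter. A row with a non-matching key is almost always discarded, but a rare false positive is possible, which is why the join itself still has to verify the surviving rows.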

Exadata and Database Machine Administration Workshop 2 - 14

Exadata Smart Storage Capabilities

Backups:
I/O for incremental backups is much more efficient because
only changed blocks are returned to the database server.

Create/extend tablespace:
Exadata formats database blocks.


Exadata Smart Storage Capabilities (continued)

The speed and efficiency of incremental database backups is enhanced with Exadata.
The granularity of change tracking in the database is much finer with Exadata. With
Exadata, changes are tracked at the individual Oracle block level rather than at the level
of a large group of blocks. This results in less I/O bandwidth being consumed for
backups and faster running backups.
With Exadata, the create/extend tablespace operation is also executed much more
efficiently. Instead of formatting blocks in database server memory and writing them to
storage, a single iDB command is sent to Exadata instructing it to format the blocks.
Database server memory usage is reduced and I/O associated with the creation and
formatting of the database blocks is eliminated with Exadata.
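
The effect of tracking granularity on incremental backups can be illustrated with a small Python sketch. The group size of 32 blocks and the changed block numbers are assumptions made up for the example; the guide states only that tracking happens per block rather than per large group of blocks.

```python
def blocks_read(changed_blocks, group_size):
    """Blocks a backup must read when changes are tracked per group of blocks."""
    dirty_groups = {block // group_size for block in changed_blocks}
    return len(dirty_groups) * group_size


changed = [3, 40, 41, 130]           # hypothetical changed block numbers
coarse = blocks_read(changed, 32)    # tracking per group of 32 blocks (assumed size)
fine = blocks_read(changed, 1)       # tracking per individual block
print(coarse, fine)
```

With coarse tracking, every group that contains at least one changed block has to be read in full; with block-level tracking, only the changed blocks themselves are read.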

Exadata and Database Machine Administration Workshop 2 - 15

Exadata Smart Scan Scale-Out Example

Database
Server

dbs1

InfiniBand Storage Network


40 Gb/s Maximum

Exadata
Cell

edsc1

edsc2

edsc13

edsc14

Each cell can deliver 1.8 GB/s.


Total of 14 cells that can deliver
14 x 1.8 = 25.2 GB/s
Disks
(12/cell)


Exadata Smart Scan Scale-Out Example


The example in the next three slides illustrates the power of Smart Scan in a quantifiable
manner using a typical case in which multiple Exadata cells scale out to share a workload.
The database server, depicted in the upper portion of the slide, is connected to the InfiniBand
storage network, which can deliver a maximum of 40 gigabits per second (Gb/s). To keep the
example clear and simple, assume that the InfiniBand storage network can deliver data at 40
Gb/s with no messaging overhead. We will also assume that a single database server has
access to the full I/O bandwidth of all the Exadata cells.
In this scenario, there are 14 Exadata cells. Assuming that each Exadata cell can deliver 1.8
gigabytes (GB) of I/O throughput per second, the potential scanning power of all the Exadata
cells is 25.2 GB per second.

Exadata and Database Machine Administration Workshop 2 - 16

Exadata Smart Scan Scale-Out Example


select /*+ full(lineitem) */ count(*)
from lineitem
where l_orderkey < 0;
Database
Server

dbs1

If the table is evenly distributed


across all disks, each cell
cannot send more than 40 / 14 =
2.85 Gb/s = 0.357 GB/s
to the database instance.

If the table is 4800 GB in size, the


complete scan would take approximately
16 minutes.

Exadata
Cell

edsc1

edsc2

Database asks to retrieve all blocks


by doing a full table scan, and then
filters matching rows.

edsc13

edsc14

0.357 GB/s
Disks are throttled
by the network bandwidth!

Disks
(12/cell)


Exadata Smart Scan Scale-Out Example (continued)


Now assume a 4800 gigabyte table is evenly spread across the 14 Exadata cells and a query
is executed which requires a full table scan. As is commonly the case, assume that the query
returns a small set of result records.
Without Smart Scan capabilities, each Exadata server behaves like a traditional storage
server by delivering database blocks to the client database.
Because the storage network is bandwidth-limited to 40 gigabits per second, it is not possible
for the Exadata cells to deliver all their power. In this case, each cell cannot deliver more than
0.357 gigabytes per second to the database and it would take approximately 16 minutes to
scan the whole table.

Exadata and Database Machine Administration Workshop 2 - 17

Exadata Smart Scan Scale-Out Example


select /*+ full(lineitem) */ count(*)
from lineitem
where l_orderkey < 0;
Database
Server

dbs1

If the table is evenly distributed


across all disks, each cell
cannot send more than 40 / 14 =
2.85 Gb/s = 0.357 GB/s
to the database instance.

If the table is 4800 GB in size, the complete


table scan will complete in approximately
three minutes and ten seconds!

Exadata
Cell

1.8 GB/s

Disks
(12/cell)

edsc1

edsc2

Database asks Exadata cells


to send back all matching rows.

edsc13

edsc14

Each cell can scan at a
speed of 1.8 GB/s,
and send its matching
rows to the database
instance. This represents
a total scan at a speed
of 25.2 GB/s!


Exadata Smart Scan Scale-Out Example (continued)


Now consider if Smart Scan is enabled for the same query. The same storage network
bandwidth limit applies. However, this time the entire 4800 GB is not transported across the
storage network; only the matching rows are transported back to the database server. So
each Exadata cell can process its part of the table at full speed; that is, 1.8 GB per second. In
this case, the entire table scan would be completed in approximately three minutes and ten
seconds.
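
The arithmetic behind this scale-out example can be checked with a short Python sketch. It uses only the figures from the example (a 4800 GB table, 14 cells at 1.8 GB/s each, and a 40 Gb/s network) and the same simplifying assumptions: no messaging overhead, and a Smart Scan result set small enough that the network is not a factor. The function and constant names are chosen for the sketch.

```python
TABLE_GB = 4800            # table size from the example
CELLS = 14                 # number of Exadata cells
CELL_GB_PER_S = 1.8        # scan rate of one cell
NETWORK_GBIT_PER_S = 40    # InfiniBand limit in gigabits per second


def scan_seconds(table_gb, throughput_gb_per_s):
    """Time to move or scan a table at a given throughput."""
    return table_gb / throughput_gb_per_s


def fmt(seconds):
    """Format a duration as minutes and whole seconds."""
    minutes, secs = divmod(round(seconds), 60)
    return f"{minutes} min {secs} s"


network_gb_per_s = NETWORK_GBIT_PER_S / 8        # 8 bits per byte
per_cell_limit = network_gb_per_s / CELLS        # share of the network per cell

# Without Smart Scan, every block crosses the network.
without_smart_scan = scan_seconds(TABLE_GB, network_gb_per_s)
# With Smart Scan, the cells scan locally at their combined disk rate.
with_smart_scan = scan_seconds(TABLE_GB, CELLS * CELL_GB_PER_S)

print(round(per_cell_limit, 3), fmt(without_smart_scan), fmt(with_smart_scan))
```

The sketch reproduces the 0.357 GB/s per-cell limit, the 16 minute network-bound scan, and the scan of roughly three minutes and ten seconds when the cells run at full speed.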

Exadata and Database Machine Administration Workshop 2 - 18

Exadata Hybrid Columnar Compression

Warehouse Compression

Archival Compression

Optimized for Speed

Optimized for Space

10x average storage savings
10x scan I/O reduction
Optimized for query performance

15x average storage savings
Up to 50x on some data
Some access overhead
For cold or historical data

Reduced Warehouse Size


Better Performance

Reclaim Disks
Keep Data Online

Can mix compression types by partition for ILM


Exadata Hybrid Columnar Compression


In addition to the basic and OLTP compression capabilities of Oracle Database 11g, Exadata
includes Exadata Hybrid Columnar Compression.
Exadata Hybrid Columnar Compression offers higher compression ratios for direct path
loaded data. This compression capability is recommended for data that is not updated
frequently. You can specify Exadata Hybrid Columnar Compression at the table, partition, and
tablespace level. You can also choose between two types of Exadata Hybrid Columnar
Compression, to achieve the proper trade-off between disk usage and CPU consumption,
depending on your requirements:
Warehouse compression: This type of compression is optimized for query performance,
and is intended for data warehouse applications.
Online archival compression: This type of compression is optimized for maximum
compression ratios, and is intended for data that does not change frequently.
You can use Exadata Hybrid Columnar Compression on complete tables or in combination
with basic and OLTP compression by using partitioning.
Note: A compression advisor, provided by the DBMS_COMPRESSION package, helps you
determine the expected compression ratio for a particular table with a particular compression
method.
Exadata and Database Machine Administration Workshop 2 - 19

Exadata Hybrid Columnar Compression


Architecture Overview
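[Slide graphic: a compression unit (CU) spanning four database blocks. Each block starts with a block header, and the first block also holds the CU header. The data for columns C1 through C8 is laid out column by column across the blocks.]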

A compression unit is a logical structure spanning multiple


database blocks.
Each row is self-contained within a compression unit.
Data organized by column during data load.
Each column compressed separately.
Smart Scan is supported.

Exadata Hybrid Columnar Compression Architecture Overview


Exadata Hybrid Columnar Compression is a new method for organizing data in database
blocks. Tables are organized into sets of rows called compression units (CU). Within a
compression unit, data is organized by column and then compressed. The column
organization of data brings similar values close together, enhancing compression ratios. Each
row is self-contained within a compression unit.
In addition to providing excellent compression, Exadata Hybrid Columnar Compression works
in conjunction with Smart Scan so that column projection and row filtering can be executed
along with decompression at the storage level to save CPU cycles on the database servers.
Note: Although the diagram in the slide shows a compression unit containing four data
blocks, it should not be assumed that a compression unit always contains four blocks. The
size of a compression unit is determined automatically by Oracle Database based on various
factors in order to deliver the most effective compression result while maintaining excellent
query performance.
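
The claim that column organization brings similar values together can be illustrated with a small Python sketch. It is not the algorithm used by Exadata Hybrid Columnar Compression, which the guide does not describe; it simply counts how many runs a run-length encoder would need for the same made-up rows stored row by row versus column by column.

```python
from itertools import groupby


def run_count(values):
    """Number of runs a run-length encoder would need for this sequence."""
    return sum(1 for _ in groupby(values))


# Hypothetical rows: (id, country, status).
rows = [(i, "US" if i < 4 else "UK", "OPEN") for i in range(8)]

# Row-major layout: values of different columns are interleaved.
row_major = [value for row in rows for value in row]
# Column-major layout: all values of one column are stored together.
column_major = [value for column in zip(*rows) for value in column]

print(run_count(row_major), run_count(column_major))
```

In the row-major layout no two neighboring values are equal, so nothing collapses. In the column-major layout the repeated country and status values sit next to each other and collapse into a handful of runs, which is the effect that makes columnar organization compress well.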

Exadata and Database Machine Administration Workshop 2 - 20

Exadata Smart Flash Cache

High performance cache for frequently accessed objects


Excellent for absorbing repeated random reads
Allows optimization by application table
Hundreds of
I/Os per Sec

Tens of Thousands
of I/Os per Second


Exadata Smart Flash Cache


For many years, a constraining factor for storage performance has been the number of
random I/Os per second (IOPS) that a disk can deliver. To compensate for the fact that even
a high performance disk can deliver only a few hundred IOPS, large storage arrays with
hundreds of disks are required to deliver in excess of 60,000 IOPS.
Exadata provides Exadata Smart Flash Cache, a caching mechanism for frequently accessed
data. It is a write-through cache which is useful for absorbing repeated random reads, and
very beneficial to OLTP. Using Exadata Smart Flash Cache, a single Exadata cell can support
up to 75,000 IOPS, two cells can support up to 150,000 IOPS, and so on.
Exadata Smart Flash Cache focuses on caching frequently accessed data and index blocks,
along with performance critical information such as control files and file headers. In addition,
DBAs can influence caching priorities using the CELL_FLASH_CACHE storage attribute for
specific database objects.

Exadata and Database Machine Administration Workshop 2 - 21

Exadata Smart Flash Cache


High performance cache that understands different types of
database I/O:
Frequently accessed data and index blocks are cached.
Control file reads and writes are cached.
File header reads and writes are cached.
DBA can influence caching priorities.

I/Os to mirror copies are not cached.


Backup-related I/O is not cached.
Data Pump I/O is not cached.
Data file formatting is not cached.
Table scans do not monopolize the cache.

Exadata Smart Flash Cache (continued)


In more recent times, vast and expensive storage arrays have introduced equally expensive
nonvolatile memory caches to improve performance. However, these caches know nothing
about the applications using them, so their efficiency is limited when compared to their cost.
With Exadata, each database I/O is tagged with metadata indicating the I/O type. Exadata
Smart Flash Cache uses this information to make intelligent decisions about how to use the
cache. This cooperation ensures the efficient use of Exadata Smart Flash Cache.
For example, with ASM mirroring turned on, multiple copies of each data block must be
written to disk to deliver the desired level of data protection. However, there is usually no
need to cache the secondary copies of a block because ASM will read the primary copy if it is
available. A traditional storage array would not know about this characteristic leading to
caching inefficiencies.
Similarly, with traditional storage arrays, backups and exports will typically cause all the data
to be loaded into the cache even though the operation will not read the data repeatedly.
Exadata knows that there is no need to fill the cache with backup and export data. The same
is true for data file formatting operations. Finally, Exadata does not flood the cache with data
from full table scans, as is the case with most storage arrays.
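
The idea of tagging each I/O with its type and letting the cache decide can be sketched in Python as follows. The enum members and the `should_cache` function are hypothetical names invented for this sketch, and the policy is a simplification: in particular, it treats full table scans as never cached, whereas the guide says only that they do not flood the cache.

```python
from enum import Enum, auto


class IOType(Enum):
    """Hypothetical I/O type tags, loosely following the slide."""
    DATA_BLOCK = auto()
    INDEX_BLOCK = auto()
    CONTROL_FILE = auto()
    FILE_HEADER = auto()
    MIRROR_COPY = auto()
    BACKUP = auto()
    DATA_PUMP = auto()
    FILE_FORMAT = auto()
    FULL_TABLE_SCAN = auto()


# I/O types the simplified policy keeps out of the cache.
NOT_CACHED = {
    IOType.MIRROR_COPY,
    IOType.BACKUP,
    IOType.DATA_PUMP,
    IOType.FILE_FORMAT,
    IOType.FULL_TABLE_SCAN,
}


def should_cache(io_type):
    """Decide whether an I/O with this tag is worth caching."""
    return io_type not in NOT_CACHED


for io_type in (IOType.INDEX_BLOCK, IOType.BACKUP):
    print(io_type.name, should_cache(io_type))
```

A storage array that sees only anonymous block reads and writes cannot make this distinction, which is the inefficiency the notes above describe.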

Exadata and Database Machine Administration Workshop 2 - 22

Exadata Storage Index


SELECT * FROM T1 WHERE B<2;

[Slide graphic: the storage index is held in memory as a set of region indexes, one per 1 MB storage region on an ASM disk. For table T1, the first region has Min B = 1 and Max B = 5, and the second has Min B = 3 and Max B = 8, so only the first can match the query.]


Exadata Storage Index


A storage index is a memory-based structure that reduces the amount of physical I/O required
by the cell. The storage index keeps track of minimum and maximum column values and this
information is used to avoid useless I/Os.
For example, the slide shows table T1 which contains column B. Column B is tracked in the
storage index so it is known that the first half of T1 contains values for column B that range
between 1 and 5. Likewise it is also known that the second half of T1 contains values for
column B that range between 3 and 8. Any query on T1 looking for values of B less than 2 can
quickly proceed without any I/O against the second part of the table.
Given a favorable combination of data distribution and query predicates, a storage index
could be used to drastically speed up a query by quickly skipping much of the I/O. For another
query, the storage index may provide little or no benefit. In any case, the ease of maintaining
and querying the memory-based storage index means that any I/O saved through its use
effectively increases the overall I/O bandwidth of the cell while consuming very few cell
resources.
The storage space inside each cell disk is logically divided into 1 MB chunks called storage
regions. The boundaries of ASM allocation units (AUs) are aligned with the boundaries of
storage regions. For each of these storage regions, data distribution statistics are held in a
memory structure called a region index. Each region index contains distribution information for
up to 8 columns. The storage index is a collection of the region indexes.
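
The way minimum and maximum values let a cell skip I/O can be shown with a minimal Python sketch. The dictionary layout and the function name are assumptions made for illustration; only the values for column B (1 to 5 and 3 to 8) and the predicate B < 2 come from the example in the slide.

```python
def regions_to_read(region_index, column, upper_bound):
    """Return the regions that may hold rows with column < upper_bound."""
    return [
        region
        for region, stats in region_index.items()
        if stats[column][0] < upper_bound   # compare against the region minimum
    ]


# Per-region (min, max) statistics for column B of table T1, as in the slide.
region_index = {
    "region_1": {"B": (1, 5)},
    "region_2": {"B": (3, 8)},
}

# Equivalent of: SELECT * FROM T1 WHERE B<2;
print(regions_to_read(region_index, "B", 2))
```

The second region has a minimum of 3 for column B, so no row in it can satisfy B < 2 and it is skipped without any disk I/O.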
Exadata and Database Machine Administration Workshop 2 - 23

Exadata Storage Index (continued)


The storage statistics represent the data distribution (minimum and maximum values) of
columns that are considered well clustered by Exadata. Exadata has heuristics to transparently
determine what columns are clustered enough to be included in the storage index.
The storage index works best when the following conditions are true:
The data is roughly ordered so that the same column values are clustered together.
The query has a predicate on a storage index column checking for =, <, > or some
combination of these.
It is important to note that the storage index works transparently with no user input. There is no
need to create, drop, or tune the storage index. The only way to influence the storage index is to
load your tables using presorted data.
Also, because the storage index is kept in memory, it disappears when the cell is rebooted. The
first queries that run after a cell is rebooted automatically cause the storage index to be rebuilt.
The storage index works for data types whose binary encoding is such that byte-wise binary
lexical comparison of two values of that data type is sufficient to determine the ordering of those
two values. This includes data types like NUMBER, DATE, and VARCHAR2. However, NLS data
types are an example of data types that are not included for storage index filtering.

Exadata and Database Machine Administration Workshop 2 - 24

Storage Index with Partitions Example


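[Slide graphic: an orders table with columns ORDER#, ORDER_DATE (partition key), SHIP_DATE, and ITEM. The table is partitioned by ORDER_DATE into 2007, 2008, and 2009, and the SHIP_DATE values in each partition fall in the same years.]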

Queries on SHIP_DATE do not benefit from ORDER_DATE


partitioning:
However SHIP_DATE is highly correlated with ORDER_DATE.

Storage index provides partition pruning like performance for


queries on SHIP_DATE:
Takes advantage of ordering created by partitioning


Storage Index with Partitions Example


The example in the slide contains correlated columns. ORDER_DATE is highly correlated with
SHIP_DATE. The dates are generally correlated because usually a ship date is close to an
order date.
If your table is partitioned by ORDER_DATE, and you execute a query using ORDER_DATE as a
filter, then partition pruning is used to read only the relevant partitions. However, if you do a
query using only SHIP_DATE in the WHERE clause, partition pruning cannot be used to
optimize the query.
However, if SHIP_DATE is part of the storage index, the storage index is used to skip all the
blocks that do not correspond to your query. This filtering takes place at the storage level. The
storage index helps the SHIP_DATE query to take advantage of the natural ordering implied
by the ORDER_DATE partitioning and the natural correlation that exists between the
ORDER_DATE and SHIP_DATE columns.

Exadata and Database Machine Administration Workshop 2 - 25

Database File System

Database File System (DBFS) enables the database to be used


as a file system.
Files are stored as SecureFiles LOBs inside database tables that
are stored in Exadata.
Protected like any Oracle data: ASM mirroring, Data Guard,
Flashback, and so on
Shared storage for ETL staging, scripts, reports and other
application files
5 to 7 GB/sec file system I/O throughput capable on a full rack
Database Machine
Copy files to DBFS

Transform and load into


database tables


Database File System


Oracle Database File System (DBFS) enables an Oracle database to be used as a POSIX-compatible
file system on Linux. DBFS is an Oracle Database capability that provides
Exadata users with a high performance mechanism to load data into an Oracle database.
DBFS can be used to stage your ETL files, for example.
Inside DBFS, files are stored as SecureFiles LOBs. A set of PL/SQL procedures implement
the file system access primitives, such as open, create, and so on. The dbfs_client utility
enables the mounting of a DBFS file system as a mount point on Linux. It provides the
mapping from file system operations to database operations. The dbfs_client utility runs
completely in user space and interacts with the kernel through the FUSE library infrastructure.
Note: ASM Cluster File System (ACFS) is not supported over Exadata.

Exadata and Database Machine Administration Workshop 2 - 26

I/O Resource Management

RDBMS

I/O
Requests

Traditional
Storage Server

H L H L L L
You cannot
influence the
I/O scheduler.

High-priority
workload
request

Exadata

RDBMS

FIFO Disk Queue

I/O
Requests

Low-priority
workload
request

I/O scheduler based on


prioritization scheme

H H
L L L L

L H H H


I/O Resource Management


With traditional shared storage, balancing the work of multiple databases sharing the storage
subsystem is inherently difficult. This issue is illustrated by the graphic at the top of the slide,
which shows how traditional storage servers handle I/O requests. In essence, they queue I/O
requests in a first-in, first-out (FIFO) order, which makes no distinction between high-priority
and low-priority requests.
Exadata allows for allocation of I/O resources based on user-specified priorities and policies.
This is illustrated in the graphic at the bottom of the slide where the Exadata I/O scheduler
executes I/O requests based on a prioritization scheme. It does that by internally queuing I/O
requests to prevent a low-priority but intensive workload from flooding the disks.
I/O resource management is covered in more detail in the lesson titled Exadata and I/O
Resource Management.
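
The difference between a FIFO disk queue and a scheduler that honors priorities can be sketched in Python. This is a deliberately simplified model: it uses a strict two-level priority, whereas Exadata I/O resource management is driven by user-specified priorities and policies. The request sequence H L H L L L is taken from the slide; the function names are chosen for the sketch.

```python
import heapq
from collections import deque

# Arrival order of high-priority (H) and low-priority (L) requests.
requests = ["H", "L", "H", "L", "L", "L"]


def fifo_order(reqs):
    """Traditional storage: serve requests strictly in arrival order."""
    queue = deque(reqs)
    return [queue.popleft() for _ in range(len(reqs))]


def priority_order(reqs):
    """Simplified scheduler: serve high priority first, then by arrival."""
    rank = {"H": 0, "L": 1}
    heap = [(rank[r], seq, r) for seq, r in enumerate(reqs)]
    heapq.heapify(heap)
    return [heapq.heappop(heap)[2] for _ in range(len(reqs))]


print(fifo_order(requests))
print(priority_order(requests))
```

With FIFO ordering the second high-priority request waits behind a low-priority one; with the priority-aware queue both high-priority requests are served first.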

Exadata and Database Machine Administration Workshop 2 - 27

Benefits Multiply

Multiple terabytes of user


data normally requires
multiple terabytes of I/O

Storage index skips


worthless I/O

Less with Exadata


Hybrid Columnar
Compression

Smart scan means


that only the results
are returned to the
database

Even less with


partition pruning

Results in
real-time on
Database
Machine


Benefits Multiply
This is an example that shows you how the main Exadata features that were introduced in this
lesson can work together to multiply the benefits of Exadata.
Assume you have a multi-terabyte table and somebody runs a query that is interested in a
small subset of the data, but causes a full table scan. Traditionally, the system would have to
scan the terabytes of data.
However, using Exadata Hybrid Columnar Compression could reduce the size of the table.
If the table is partitioned, the optimizer could use partition pruning to eliminate a substantial
proportion of the data.
Using storage indexes, Exadata might further reduce the amount of physical I/O that is
executed.
Finally, because of Smart Scan, the only data returned to the database is the data of interest
to the query, some of which may have been cached inside Exadata Smart Flash Cache.
This example shows how the various Exadata and Oracle Database features can work in
harmony to improve the performance of a single operation using Database Machine.
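
To see why the benefits multiply rather than add, consider the following Python sketch. All the input values are hypothetical and chosen only to make the arithmetic easy to follow; the guide gives no figures for this example, and real reductions depend entirely on the data and the query.

```python
def remaining_io_tb(table_tb, compression_ratio, partition_fraction,
                    storage_index_fraction):
    """Physical I/O left after each feature removes its share of the work."""
    compressed = table_tb / compression_ratio          # Hybrid Columnar Compression
    after_pruning = compressed * partition_fraction    # partitions still scanned
    return after_pruning * storage_index_fraction      # regions still read


# Hypothetical inputs: a 10 TB table, 10x compression, 20% of partitions
# surviving pruning, and half of the remaining regions skipped by the
# storage index.
io_tb = remaining_io_tb(10, 10, 0.2, 0.5)
print(io_tb)
```

Under these assumed values the physical I/O drops from 10 TB to about 0.1 TB, and Smart Scan then returns only the rows and columns of interest from that remainder.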

Exadata and Database Machine Administration Workshop 2 - 28

Exadata Key Benefits for Data Warehousing

Exadata uses more connections:


Modular storage cell building blocks organized into
massively parallel grid

Exadata has bigger network pipes:


InfiniBand network transfers data faster than Fibre Channel.

Exadata transports less data between the storage and the


database:
Query processing is moved into storage to dramatically
reduce data sent to servers while unloading server CPUs.

Exadata Hybrid Columnar Compression reduces the


number of physical I/Os for large table scans.
In-memory parallel query provides a powerful alternative
query strategy that complements Exadata.


Exadata Key Benefits for Data Warehousing


One of the key benefits of Exadata is extremely enhanced performance for data warehousing
applications. By replacing your existing storage with Exadata, it is possible to get up to 100
times speedup for your data warehousing queries. The larger the data warehouse, the greater
the speedup from using Exadata.
Exadata addresses three key dimensions of database I/O that can hamper data warehouse
performance.
Exadata is based on a massively parallel architecture, which provides more connections
to deliver more data faster between the storage servers and the database servers.
Exadata is built using wide network pipes that provide extremely high bandwidth
between the storage servers and the database servers. Exadata uses InfiniBand as the
storage network, which provides a throughput of 40 Gb/sec with very low latency. This is
many times the bandwidth provided by traditional SAN storage networks.
Exadata is database-aware and can transport just the data required to satisfy SQL
requests resulting in less data being sent between the storage servers and the database
servers.
Basically, Exadata reduces the volume of data transported and moves data faster compared
with other storage solutions.
Exadata and Database Machine Administration Workshop 2 - 29

Exadata Key Benefits for Data Warehousing (continued)


In addition, Exadata introduces additional capabilities that can further enhance data warehouse
performance.
Exadata includes Exadata Hybrid Columnar Compression. This feature provides very high
levels of data compression implemented inside Exadata. Exadata Hybrid Columnar
Compression benefits large scale scans, commonly used in data warehousing, by efficiently
scanning vast volumes of data using a fraction of I/Os. Compression ratios of 10 to 1 are
common which means that a 10 TB table can be scanned using 1 TB of disk I/O.
Exadata's tight integration with Oracle Database results in an intelligent platform for data
warehousing. The complete solution uses a range of technologies to deliver the best result, not
just relying on one approach to the problem. An example of this is the new in-memory parallel
query feature of Oracle Database 11g Release 2.
Normally, a Smart Scan would be used to execute portions of a query inside Exadata and return
the minimum amount of data to the database server. In some cases, however, it may be more
efficient to read all the required data into the memory on the database servers and process the
query that way.
In-memory parallel query enhances query performance by minimizing or even completely
eliminating additional physical I/O for a particular query. Oracle automatically decides if an
object being accessed using parallel execution benefits from being cached in the database
buffer cache. The decision to cache an object is based on a well-defined set of heuristics
including size of the object and the frequency that it is accessed.
In-memory parallel query harnesses the aggregated memory across a database cluster for
parallel operations, enabling it to scale out as the number of nodes in a cluster increases. In an
Oracle RAC environment, Oracle maps fragments of the object into each of the buffer caches on
the active instances. By creating this mapping, Oracle knows which buffer cache to access to
find a specific part or partition of an object. Using this information, Oracle Database will prevent
multiple instances from reading the same information from disk over and over again, thus
maximizing the amount of memory that can be used to cache the objects.
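
The fragment mapping described above can be pictured with a short Python sketch. The round-robin assignment, the fragment names, and the instance names are assumptions made for illustration; the guide does not say how Oracle chooses which instance caches which fragment. The point of the sketch is only that each fragment has exactly one home, so no two instances read the same data from disk.

```python
def map_fragments(fragments, instances):
    """Assign each object fragment to exactly one instance (round-robin here)."""
    return {
        fragment: instances[i % len(instances)]
        for i, fragment in enumerate(fragments)
    }


# Hypothetical partitions of one object and two active RAC instances.
fragments = ["p1", "p2", "p3", "p4", "p5"]
instances = ["inst1", "inst2"]

mapping = map_fragments(fragments, instances)
print(mapping)

# A parallel query looking for partition p3 knows which buffer cache to use.
print(mapping["p3"])
```

Because the combined buffer caches hold one copy of each fragment instead of several, adding instances increases the amount of distinct data that can be kept in memory.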
In-memory parallel query nicely complements Exadata. Using this combination, some queries
can be efficiently executed with little or no additional I/O by pinning tables in the database buffer
cache whereas others can harness the power of Smart Scan inside Exadata.


Exadata Key Benefits for OLTP

Exadata uses more connections:
- Modular storage cell building blocks organized into a massively parallel grid
Exadata has bigger network pipes:
- InfiniBand network transfers data faster than Fibre Channel.
Exadata Smart Flash Cache:
- Provides high-performance cache for frequently accessed objects
- Is excellent for absorbing repeated random reads
- Allows optimization by application table

[Slide graphic labels: "Hundreds of I/Os per second" and "Tens of thousands of I/Os per second"]


Exadata Key Benefits for OLTP


Some of the fundamental architectural characteristics of Exadata that are beneficial for data
warehousing are equally relevant and beneficial for online transaction processing (OLTP).
The high-performance, low-latency, InfiniBand network used in conjunction with the massively
parallel grid architecture of Exadata is ideal for supporting many thousands of simultaneous
users.
In addition, the introduction of Exadata Smart Flash Cache is of particular benefit to OLTP
performance. Exadata Smart Flash Cache allows each Exadata cell to deliver up to 75,000
IOPS. In addition, Oracle Database and Exadata Smart Flash Cache work closely with each
other. This cooperation optimizes the usage of Exadata Smart Flash Cache so that only the
most frequently accessed and performance-sensitive data is cached. Users have additional
control over which database objects should be cached more aggressively than others, and
which ones should not be cached at all.


Quiz
Exadata and Database Machine are two different names that
designate the same thing.
1. TRUE
2. FALSE


Answer: 2


Quiz
What are the three unique benefits of Exadata compared to
traditional storage servers?
1. Larger disk sizes
2. Smart storage capabilities
3. Higher storage network bandwidth
4. Higher RAM capacity
5. Integrated database I/O resource management


Answer: 2, 3, 5


Summary
In this lesson, you should have learned how to:
Contrast the Exadata storage architecture with traditional
shared storage offerings
Describe the hardware components of Exadata
Outline the capabilities of Exadata
Describe the main advantages of using Exadata compared
to traditional storage servers


Additional Resources

Lesson Demonstrations (Viewlets)


Introduction to Smart Scan

http://stcurriculum.oracle.com/demos/db/11g/r2/dbmach/021ExadataSmartScanIntro/021exadatasmartscanintro_viewlet_swf.html

Introduction to Exadata Hybrid Columnar Compression

http://stcurriculum.oracle.com/demos/db/11g/r2/dbmach/022ExadataCompressionIntro/022exadatacompressionintro_viewlet_swf.html

Introduction to Exadata Smart Flash Cache

http://stcurriculum.oracle.com/demos/db/11g/r2/dbmach/023ExadataFlashCacheIntro/023exadataflashcacheintro_viewlet_swf.html

Smart Scan Scale Out Example

http://stcurriculum.oracle.com/demos/db/11g/r2/exadatav2/smartscanscaleoutexample/smartscanscaleoutexample.swf

Storage Index

http://stcurriculum.oracle.com/demos/db/11g/r2/exadatav2/storageindex/storageindex.swf


Practice 2 Overview:
Introducing Exadata Features
In these practices, you are introduced to four major capabilities
of Exadata, namely:
Smart Scan
Exadata Hybrid Columnar Compression
Exadata Smart Flash Cache
Storage Index


Exadata Architecture


Objectives
After completing this lesson, you should be able to describe:
The Exadata architecture
The relationship between the various storage abstractions
used in Exadata


Exadata Software Architecture Overview

[Slide diagram: database servers hosting a RAC database and a single-instance database in a
single ASM cluster. Each database server runs a DB instance, DBRM, ASM, and LIBCELL. The
database servers use the iDB protocol over InfiniBand with path failover, through the InfiniBand
storage switch/network, to reach the Exadata servers. Each Exadata server runs Oracle Linux,
CELLSRV, MS, RS, and IORM. Enterprise Manager and the cell control CLI (cellcli/dcli) are
shown with an SSH connection.]

Exadata Software Architecture Overview


The architecture of Exadata includes components on the database server and on the Exadata
server. The overall architecture is shown in the slide. The following components reside on the
database server:
Oracle Database communicates with Exadata using the Intelligent Database protocol
(iDB). iDB is implemented in the database kernel and LIBCELL. iDB is a unique Oracle
data transfer protocol, built on Reliable Datagram Sockets (RDS), that runs on industry
standard InfiniBand networking hardware. iDB provides data intelligence between the
database and Exadata and enables ASM and database instances to utilize Exadata-specific
features, such as Smart Scan and I/O Resource Management. iDB transparently
maps database operations to Exadata-enhanced operations. Single-instance or Oracle
RAC databases access Exadata storage cells using iDB.
Automatic Storage Management (ASM) is required and provides a file system and volume
manager optimized for Oracle Database.
Database Resource Manager (DBRM), in combination with Exadata I/O Resource
Management (IORM), ensures that I/O resources are allocated based on defined priorities.
Note: The slide illustrates the recommended configuration where a single ASM cluster is used
to consolidate storage for all of your databases. Alternatively, you can connect multiple separate
ASM environments with separate disk groups to Exadata.

Exadata Software Architecture Overview (continued)


The software components that reside in Exadata include:
Oracle Linux provides the Exadata server operating system.
Cell Server (CELLSRV) is the primary Exadata software component and provides the
majority of Exadata storage services. CELLSRV is a multithreaded server. CELLSRV serves
simple block requests, such as database buffer cache reads, and Smart Scan requests,
such as table scans with projections and filters. CELLSRV also implements I/O Resource
Management (IORM), which works in conjunction with Database Resource Manager
(DBRM), to meter out I/O bandwidth to the various databases and consumer groups
issuing I/Os. Finally, CELLSRV collects numerous statistics relating to its operations.
Oracle Database and ASM processes use LIBCELL to communicate with CELLSRV, and
LIBCELL converts I/O requests into messages that are sent to CELLSRV using the iDB
protocol.
Management Server (MS) provides Exadata cell management and configuration. It works
in cooperation with the Exadata cell command-line interface (CellCLI). Each cell is
individually managed with CellCLI. CellCLI can only be used from within a cell to manage
that cell; however, you can run the same CellCLI command remotely on multiple cells with
the dcli utility. In addition, MS is responsible for sending alerts and collects some
statistics in addition to those collected by CELLSRV.
Restart Server (RS) is used to start up/shut down the CELLSRV and MS services and
monitors these services to automatically restart them if required.

Exadata Software Architecture Details

[Slide diagram: an Exadata cell and a database server connected through an InfiniBand switch
using the iDB protocol. The Exadata cell contains CellCLI, cellsrv, MS, RS, adrci, the CELLSRV
ADR, Smart Flash Cache, and data. The directory /opt/oracle/cell/cellsrv/deploy/config holds
cell_disk_config.xml and cellinit.ora (MS internal dictionary, and CELLSRV internal parameters
and local interface IP). The database server contains an ASM instance and an RDBMS instance,
each with an SGA, a dskm process, I/O processes, and LIBCELL, along with diskmon and css.
The directory /etc/oracle/cell/network-config holds cellip.ora (lists accessible Exadata cells)
and cellinit.ora (lists the local interface IP). The network interface shown is bond0.]

Exadata Software Architecture Details


Database-host side Exadata software:

LIBCELL Library: Provides UNIX-like I/O primitives and is linked with ASM, RDBMS, and
ASM utilities. It uses the iDB Protocol to communicate with Exadata.

DISKMON (Network/Cell Monitor): Checks the network interface state and cell liveness. It
uses a nodewide master process and one slave process (dskm) for each RDBMS or ASM
instance. The master performs monitoring and propagates state information to the slaves.
Slaves use the SGA to communicate with RDBMS or ASM processes. If there is a failure
in the cluster, DISKMON performs I/O fencing to protect data integrity. Cluster
Synchronization Services (CSS) still decides what to fence. Master DISKMON starts with
the clusterware processes. DISKMON also performs DBRM plan propagation.
Cell-side Exadata software:

CELLSRV is a multithreaded server that provides the majority of Exadata storage
services. It provides smart storage capabilities, serves data blocks when offloading is not
possible, and implements I/O Resource Management to meter out I/O bandwidth.

Management Server (MS) is an OC4J application that provides storage cell management
and configuration functions, such as cell administration, and metrics and alerts generation.
It also communicates with CELLSRV and the operating system.

Exadata Software Architecture Details (continued)

Restart Server (RS): Monitors CELLSRV and MS and restarts them, if necessary.

CellCLI: Executes user cell administration commands. The user must connect to the cell
to use CellCLI. CellCLI communicates with MS using Web Services.
ADRCI: CELLSRV uses the Automatic Diagnostic Repository (ADR) to log software errors.
An Exadata administrator may use the ADR viewer (ADRCI) to view and package ADR
incidents.

InfiniBand provides a high-speed, high-bandwidth, and low-latency network fabric to support
Exadata. InfiniBand is the only network fabric supported for communication between Exadata
and database servers. The InfiniBand implementation in Exadata and Database Machine uses
the open source RDS/Open Fabrics Enterprise Distribution (OFED). These packages are
preinstalled in Exadata and Database Machine.
Note: Exadata requires Oracle Database 11g Release 2 or later.

Exadata Smart Flash Cache Architecture

[Slide diagram: three panels showing the flow between the database (DB), cellsrv, and Exadata
Smart Flash Cache: a write operation (with acknowledgement), a read operation on uncached
data, and a read operation on previously cached data.]

Exadata Smart Flash Cache Architecture


Exadata Smart Flash Cache provides a caching mechanism for frequently accessed data on
each Exadata cell. Exadata Smart Flash Cache works in conjunction with Oracle Database to
intelligently optimize the efficiency of the cache.
Each database I/O is tagged with the following metadata:
The CELL_FLASH_CACHE setting for the object associated with the I/O:
- DEFAULT specifies that Exadata Smart Flash Cache is used normally.
- KEEP specifies that Exadata Smart Flash Cache is used more aggressively.
- NONE specifies that Exadata Smart Flash Cache is not used.

A cache hint, which is assigned by the database based on the reason for the I/O:
- CACHE indicates that the I/O should be cached. For example, the I/O is for an
index lookup.
- NOCACHE indicates that the I/O should not be cached. For example, the I/O is for
a mirrored block of data or is a log write.
- EVICT indicates that data should be removed from the cache. For example, when
an ASM rebalance operation moves data between different disks, the cached
copies that correspond to the original location are removed from the cache.

Exadata Smart Flash Cache Architecture (continued)


In addition, Exadata Smart Flash Cache takes the following into consideration when processing
I/O:
I/O size: Large I/Os on objects with CELL_FLASH_CACHE set to DEFAULT are not cached.

Current cache load: Smart table scans are usually directed to disk. However, if the object
has a CELL_FLASH_CACHE setting of KEEP, some reads may be satisfied using Exadata
Smart Flash Cache in order to best utilize the combined throughput of the disks and the
cache.

Exadata Smart Flash Cache uses all of the aforementioned information to make intelligent
decisions about which data is suitable for caching and which is not.
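
The caching inputs described above (the CELL_FLASH_CACHE setting, the cache hint, and the I/O size) can be pictured as a small decision function. The following is only an illustrative sketch in Python, not the CELLSRV implementation: the names IoRequest and cache_action are hypothetical, the order in which the rules are checked is a guess, and the LARGE_IO_BYTES threshold is an assumption because the text does not say what counts as a large I/O. The current cache load is not modeled.

```python
from dataclasses import dataclass

# Assumption: the text only says "large I/Os"; this threshold is invented.
LARGE_IO_BYTES = 128 * 1024


@dataclass(frozen=True)
class IoRequest:
    cell_flash_cache: str  # object setting: "DEFAULT", "KEEP", or "NONE"
    hint: str              # database hint: "CACHE", "NOCACHE", or "EVICT"
    size_bytes: int


def cache_action(io: IoRequest) -> str:
    """Return "evict", "skip", or "cache" for one I/O (illustrative rules only)."""
    if io.hint == "EVICT":
        return "evict"   # cached copies of the data are removed
    if io.cell_flash_cache == "NONE" or io.hint == "NOCACHE":
        return "skip"    # the cache is not populated
    if io.cell_flash_cache == "DEFAULT" and io.size_bytes >= LARGE_IO_BYTES:
        return "skip"    # large I/Os on DEFAULT objects are not cached
    return "cache"
```

For example, under these assumptions a 1 MB read against a KEEP object is cached, whereas the same read against a DEFAULT object is not.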
Exadata Smart Flash Cache is a write-through cache. This means that for write operations,
CELLSRV writes data to disk and sends an acknowledgement to the database so it can continue
without any interruption. Then, if the data is suitable for caching, it is written to Exadata Smart
Flash Cache. Write performance is not improved or diminished using this method. However, if a
subsequent read operation needs the same data, it is likely to benefit from the cache.
For read operations, CELLSRV must first determine if the requested data is already in Exadata
Smart Flash Cache. CELLSRV maintains an in-memory hash table, which it uses to quickly
determine which data blocks reside in Exadata Smart Flash Cache. If the requested data is
cached, a cache lookup is used to satisfy the I/O request.
For read operations that cannot be satisfied using Exadata Smart Flash Cache, a disk read is
performed and the requested information is sent to the database. Then if the data is suitable for
caching, it is written to Exadata Smart Flash Cache.
When suitable data is inserted into a full cache, a prioritized least recently used (LRU) algorithm
determines which data to replace. Objects with a CELL_FLASH_CACHE setting of KEEP are
subject to a different cache retention policy than objects with a CELL_FLASH_CACHE setting of
DEFAULT. KEEP objects have priority over DEFAULT objects so that new data from a DEFAULT
object will not push out cached data from any KEEP objects. To prevent KEEP objects from
monopolizing the cache, they are allowed to occupy no more than 80% of the total cache size.
Also, to prevent unused KEEP objects from indefinitely occupying the cache, they are subject to
an additional aging policy, which periodically purges unused KEEP object data.
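
The replacement policy described above can be sketched as a toy cache with two LRU pools. This is a minimal illustration under stated assumptions, not the actual Exadata algorithm: the class name PrioritizedLruCache is hypothetical, capacity is counted in blocks rather than bytes, and the periodic aging of unused KEEP data is not modeled. It shows only the two rules stated in the text: new DEFAULT data does not push out KEEP data, and KEEP data may occupy no more than 80% of the cache.

```python
from collections import OrderedDict


class PrioritizedLruCache:
    """Toy cache: DEFAULT blocks never evict KEEP blocks; KEEP is capped at 80%."""

    def __init__(self, capacity: int, keep_fraction: float = 0.8):
        self.capacity = capacity
        self.keep_limit = int(capacity * keep_fraction)
        self.keep = OrderedDict()     # block id -> None, least recently used first
        self.default = OrderedDict()

    def __contains__(self, block) -> bool:
        return block in self.keep or block in self.default

    def touch(self, block) -> bool:
        """Record a read; return True on a cache hit."""
        for pool in (self.keep, self.default):
            if block in pool:
                pool.move_to_end(block)
                return True
        return False

    def insert(self, block, keep: bool = False) -> bool:
        """Insert a block after a disk read or write; return False if not cached."""
        if self.touch(block):
            return True
        if keep:
            if self.keep_limit == 0:
                return False
            if len(self.keep) >= self.keep_limit:
                self.keep.popitem(last=False)      # replace the oldest KEEP block
            elif len(self.keep) + len(self.default) >= self.capacity:
                self.default.popitem(last=False)   # KEEP data displaces DEFAULT data
            self.keep[block] = None
            return True
        if len(self.keep) + len(self.default) >= self.capacity:
            if not self.default:
                return False                       # would have to evict KEEP data
            self.default.popitem(last=False)       # replace the oldest DEFAULT block
        self.default[block] = None
        return True
```

With a capacity of five blocks, for instance, at most four blocks can belong to KEEP objects, so at least one slot always remains available for DEFAULT data.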

Exadata Monitoring Architecture

[Slide diagram: Enterprise Manager (OMS and agent) and the dcli utility reach several Exadata
cells through a network switch on the eth0 interface, using SSH and CellCLI. Each Exadata cell
contains CellCLI, cellsrv, MS, adrci, the CELLSRV ADR, Smart Flash Cache, and data.]

Exadata Monitoring Architecture


For monitoring, there is an Enterprise Manager plug-in that you use in conjunction with Grid
Control. Using this plug-in, you can monitor all the Exadata cells in your enterprise.
The Enterprise Manager plug-in for Exadata does not require an agent on each Exadata cell.
Instead, an existing Enterprise Manager agent uses SSH to connect to each cell and execute
CellCLI commands. Using this architecture, monitoring information from numerous Exadata
cells can be consolidated on to a single Enterprise Manager screen.
The dcli utility facilitates centralized management across a group of cells. It can be used to
execute CellCLI and other cell-level operating system commands across a group of cells and
provide a consolidated view of the output. The dcli utility runs commands on multiple cells in
parallel threads. The cells are referenced by their network name or IP address. Files can be
copied to cells and command scripts can be executed on cells by using this utility. Finally, you
can use the dcli utility to set up SSH user-equivalence to a cell or group of cells.
Note: dcli is a Python script that is available on Exadata. You can copy it to your designated
central management console and execute it from there. The dcli utility requires Python version
2.3 or later. dcli is discussed further in the lesson entitled Monitoring and Maintaining
Database Machine.
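
The idea behind dcli (run the same command on many cells in parallel threads over SSH and consolidate the output) can be sketched in a few lines of modern Python. This is not the dcli source and does not reproduce its options: the function names ssh_run and run_on_cells, the celladmin default user, and the cell names in the usage note are assumptions for illustration, and the sketch presumes that SSH user-equivalence is already set up.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def ssh_run(cell: str, command: str, user: str = "celladmin") -> str:
    """Run one command on one cell over SSH and return its standard output."""
    result = subprocess.run(
        ["ssh", f"{user}@{cell}", command],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


def run_on_cells(cells, command, runner=ssh_run):
    """Run the command on every cell in parallel threads; prefix each line by cell."""
    with ThreadPoolExecutor(max_workers=max(1, len(cells))) as pool:
        outputs = list(pool.map(lambda cell: runner(cell, command), cells))
    lines = []
    for cell, output in zip(cells, outputs):
        lines.extend(f"{cell}: {line}" for line in output.splitlines())
    return lines
```

A call such as run_on_cells(["cell01", "cell02"], "cellcli -e list cell") would return one consolidated list of output lines, each prefixed with the name of the cell that produced it. The runner parameter exists so that the SSH call can be replaced during testing.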

Disk Storage Entities and Relationships

[Slide diagram: the chain physical disk, LUN, cell disk (CELLDISK), grid disk (GRIDDISK),
ASM disk within an Exadata cell. For the first two LUNs only, the LUN holds a system area and
a data storage partition that maps to the cell disk. For the other ten LUNs, the cell disk maps
to the whole LUN. A cell disk holds either one grid disk, or a grid disk on the hot part and a
grid disk on the cold part. Grid disks are created with CellCLI> CREATE GRIDDISK ... and are
visible to ASM.]

Disk Storage Entities and Relationships


Each Exadata cell contains 12 physical disks. On each of the first two disks, Exadata reserves
a system area that spans multiple partitions with a total size of approximately 29 GB. The
system area contains the OS image, swap space, Exadata software binaries, metric and alert
repository, and various other configuration and metadata files. The two system areas are
mirror copies of each other which are maintained via software mirroring.
Exadata automatically senses the physical disks in each cell. As a cell administrator, you can
only view a predefined set of physical disk attributes. Each physical disk is mapped to a
logical abstraction called a Logical Unit (LUN). A LUN exposes additional predefined
metadata attributes to a cell administrator. You cannot create or remove a LUN; LUNs are
automatically created.
A cell disk is a higher level abstraction that represents the data storage area on each LUN.
For the two LUNs that contain the system areas, Exadata recognizes the way that the LUN is
partitioned and maps the cell disk to the disk partition reserved for data storage. For the other
10 disks, Exadata maps the cell disk directly to the LUN.
After a cell disk is created, it can be subdivided into one or more grid disks, which are directly
exposed to ASM.


Disk Storage Entities and Relationships (continued)


Placing multiple grid disks on a cell disk allows the administrator to segregate the storage into
pools with different performance characteristics. For example, a cell disk could be partitioned so
that one grid disk resides on the highest performing portion of the disk (the outermost tracks on
the physical disk), whereas a second grid disk might be configured on the lower performing
portion of the disk (the inner tracks). The first grid disk might then be used in an ASM disk group
that houses highly active (hot) data, whereas the second grid disk might be used to store less
active (cold) data files.
Placing multiple grid disks on a cell disk also allows the administrator to segregate the storage
into separate pools that can be assigned to different databases.
In cases where the entire cell capacity is required for a single database or where it is difficult to
clearly define hot and cold data sets, an Exadata administrator will usually define a single grid
disk containing all the space on each cell disk.
Note: The diagram in the slide shows the cases where one or two grid disks are created from
the space on a cell disk. However, you can create more than two grid disks on a cell disk.
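
The relationships between physical disks, LUNs, cell disks, and grid disks can be summarized with a small data model. The following Python sketch is illustrative only: the class and function names are hypothetical, the cell disk names are invented, and DISK_SIZE_GB is an assumed value because the text does not state the disk capacity. The 12 disks per cell and the approximately 29 GB system area on the first two disks come from the text.

```python
from dataclasses import dataclass, field
from typing import Optional

DISKS_PER_CELL = 12   # from the text
SYSTEM_AREA_GB = 29   # approximate, from the text; first two disks only
DISK_SIZE_GB = 600    # assumption for illustration only


@dataclass
class CellDisk:
    name: str
    size_gb: int
    grid_disks: dict = field(default_factory=dict)  # grid disk name -> size in GB

    def free_gb(self) -> int:
        return self.size_gb - sum(self.grid_disks.values())

    def create_grid_disk(self, name: str, size_gb: Optional[int] = None) -> None:
        """Carve the next grid disk; by default it takes all remaining space."""
        size = self.free_gb() if size_gb is None else size_gb
        if size <= 0 or size > self.free_gb():
            raise ValueError("not enough free space on cell disk")
        self.grid_disks[name] = size


def build_cell_disks():
    """One LUN per physical disk; one cell disk per LUN's data storage area."""
    disks = []
    for lun in range(DISKS_PER_CELL):
        system = SYSTEM_AREA_GB if lun < 2 else 0
        disks.append(CellDisk(f"cd_{lun:02d}", DISK_SIZE_GB - system))
    return disks
```

In this model, creating a 200 GB grid disk followed by a second grid disk with no size on a 600 GB cell disk yields a hot grid disk of 200 GB and a cold grid disk of 400 GB, in line with the hot and cold example above.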

Interleaved Grid Disks

[Slide diagram: two physical disks. Left, default allocation: Grid Disk 1 occupies the fastest
tracks and Grid Disk 2 the slowest tracks, so Grid Disk 1 benefits from the higher performance
outer tracks of the disk. Right, interleaved allocation: Grid Disk 3 and Grid Disk 4 each take
50% of their space from the faster outer half and 50% from the slower inner half, so the
performance of Grid Disk 3 and Grid Disk 4 is more evenly balanced.]

Interleaved Grid Disks


By default, space for grid disks is allocated from the outer tracks to the inner tracks of a physical
disk. However, space for grid disks can be allocated in an interleaved manner. Grid disks that
use this type of space allocation are referred to as interleaved grid disks. This method attempts
to equalize the performance of the grid disks residing on the same physical disk.
The slide contrasts default grid disk allocation with interleaved grid disks. On the left, two grid
disks have been created on a physical disk using default space allocation. In this case, Grid
Disk 1 occupies all the fastest (outer) tracks, whereas Grid Disk 2 occupies all the slower (inner)
tracks.
On the right, you see an example of interleaved grid disks. With interleaving enabled, a disk is
divided into two equal parts: the outer half (upper portion) and the inner half (lower portion).
When a new grid disk is created, half of the grid disk space is allocated on the upper portion,
and the other half of the grid disk space is allocated on the lower portion.
Interleaved grid disks are best used in situations where you want to create separate ASM disk
groups that share cell disks without a performance bias.
Note that interleaving is enabled by setting the INTERLEAVING attribute for the cell disk. For
example:
CellCLI> CREATE CELLDISK cd_03_cell01_int LUN=03
INTERLEAVING='normal_redundancy'
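
The difference between default and interleaved allocation can be made concrete with a simplified model in which a disk is a range of offsets, with offset 0 at the outermost (fastest) track. The Python sketch below is an illustration under that assumption only: the function names are hypothetical, and the mean offset is used as a rough stand-in for performance, not as a measure that Exadata itself reports.

```python
def default_layout(sizes, disk_size):
    """Allocate grid disks one after another, from the outer tracks inward."""
    if sum(sizes.values()) > disk_size:
        raise ValueError("grid disks exceed the disk size")
    layout, offset = {}, 0
    for name, size in sizes.items():
        layout[name] = [(offset, offset + size)]
        offset += size
    return layout


def interleaved_layout(sizes, disk_size):
    """Give each grid disk half its space in the outer half, half in the inner."""
    if sum(sizes.values()) > disk_size:
        raise ValueError("grid disks exceed the disk size")
    half = disk_size / 2
    layout, offset = {}, 0
    for name, size in sizes.items():
        part = size / 2
        layout[name] = [(offset, offset + part),
                        (half + offset, half + offset + part)]
        offset += part
    return layout


def mean_position(extents):
    """Size-weighted mean offset; a smaller value is nearer the outer edge."""
    total = sum(end - start for start, end in extents)
    return sum((start + end) / 2 * (end - start) for start, end in extents) / total
```

For two grid disks of 50 units each on a disk of 100 units, the default layout gives mean offsets of 25 and 75, whereas the interleaved layout gives 37.5 and 62.5. The gap between the two grid disks is halved, which matches the intent of equalizing their performance.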

Flash Storage Entities and Relationships

[Slide diagram: the chain flash device, LUN, cell disk (CELLDISK), then either flash cache
(FLASHCACHE) or grid disk (GRIDDISK) and ASM disk within an Exadata cell. A flash-based cell
disk holds either flash cache only, or flash cache plus a grid disk that is visible to ASM.
Commands shown: CellCLI> CREATE FLASHCACHE ... and CellCLI> CREATE GRIDDISK ... FLASHDISK ...]

Flash Storage Entities and Relationships


Each Exadata cell contains 384 GB of high performance flash memory distributed across 4 PCI
flash memory cards. Each card has 4 flash devices for a total of 16 flash devices on each cell.
Each flash device has a capacity of 24 GB.
Essentially, each flash device is much like a physical disk in the Exadata storage hierarchy.
Each flash device is visible to the Cell Server software as a LUN. You can create a cell disk
based on a flash-based LUN. You can then create numerous grid disks on each flash-based cell
disk. In addition, space on a flash-based cell disk can be allocated to a special area that
supports Exadata Smart Flash Cache.
By default, the initial cell configuration process creates flash-based cell disks on all the flash
devices, and then allocates all the available flash space to Exadata Smart Flash Cache. To
create space for flash-based grid disks, you need to drop the default flash cache. Then you can
create a flash cache and flash-based grid disks with your chosen sizes.
Unlike physical disk devices, the order in which you allocate your flash space is not important
from a performance perspective. Likewise, interleaving is not applicable for flash-based cell
disks.
Note: The diagram in the slide shows the case where a flash-based cell disk is allocated
entirely to flash cache, and the case where a flash-based cell disk is used for flash cache and
one grid disk. However, you can allocate up to one flash cache area along with zero or more
flash-based grid disks from a flash-based cell disk.
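
The flash capacity figures quoted above are consistent with each other, and the allocation rule (up to one flash cache area plus zero or more grid disks per flash-based cell disk) is easy to check in code. The following Python sketch is illustrative only: the function names are hypothetical, and the returned dictionary is an invented structure, not CellCLI output.

```python
CARDS_PER_CELL = 4     # from the text
DEVICES_PER_CARD = 4   # from the text
DEVICE_GB = 24         # from the text


def flash_devices_per_cell() -> int:
    return CARDS_PER_CELL * DEVICES_PER_CARD


def total_flash_gb() -> int:
    return flash_devices_per_cell() * DEVICE_GB


def plan_flash_cell_disk(cache_gb, grid_disk_gbs=(), device_gb=DEVICE_GB):
    """Check a split of one flash cell disk into one cache area and grid disks."""
    if cache_gb < 0 or any(size <= 0 for size in grid_disk_gbs):
        raise ValueError("sizes must be positive")
    used = cache_gb + sum(grid_disk_gbs)
    if used > device_gb:
        raise ValueError("allocation exceeds the flash cell disk size")
    return {
        "flash_cache_gb": cache_gb,
        "grid_disks_gb": list(grid_disk_gbs),
        "unallocated_gb": device_gb - used,
    }
```

Here flash_devices_per_cell() evaluates to 16 and total_flash_gb() to 384, matching the text. A plan such as plan_flash_cell_disk(16, [8]) uses the whole 24 GB device for a 16 GB cache area and one 8 GB grid disk.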

Disk Group Configuration

[Slide diagram: two Exadata cells. The DATA disk group and the FRA disk group, created with
SQL> CREATE DISKGROUP, each span both cells. Each disk group has a CELL1 failure group and a
CELL2 failure group.]

Disk Group Configuration


After the grid disks are configured, ASM disk groups can be defined across your Exadata
configuration.
The slide illustrates an example where two ASM disk groups are defined. The DATA disk group
is defined across all the red grid disks, and the FRA disk group is defined across the blue grid
disks. When data is loaded into each disk group, ASM will evenly distribute the data and I/O
across the grid disks in each disk group.
To protect against the failure of an entire Exadata cell, ASM failure groups are automatically
defined on a per cell basis. This is to ensure that mirrored ASM extents are placed on different
Exadata cells. This is also illustrated in the slide. By default, when failure groups are
automatically created, their names correspond to the cell name. So, different disk groups can
have the same failure group names.
When using Exadata, it is strongly recommended to use at least NORMAL ASM redundancy for
all of your disk groups in conjunction with ASM failure groups spread across at least two
Exadata cells. Following this recommendation provides good protection against disk and cell
failure.
Using HIGH ASM redundancy in conjunction with ASM failure groups spread across at least
three Exadata cells provides the best available level of data protection. Such a configuration can
tolerate the simultaneous failure of two complete cells without compromising data availability.
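
The protection offered by failure groups can be checked with a small worst-case model. The Python sketch below is a simplification, not ASM's extent placement algorithm: it assumes that NORMAL redundancy keeps two copies of each extent and HIGH redundancy keeps three, each copy in a different cell, and it pessimistically assumes that every possible combination of mirror partners is in use. The function names and cell names are hypothetical.

```python
from itertools import combinations, product


def place_extents(cells, disks_per_cell, copies):
    """Enumerate every way to mirror an extent: one disk in each of `copies` cells."""
    placements = []
    for cell_group in combinations(cells, copies):
        for disk_choice in product(range(disks_per_cell), repeat=copies):
            placements.append(frozenset(zip(cell_group, disk_choice)))
    return placements


def data_available(placements, failed_disks):
    """True if every extent keeps at least one copy on a surviving disk."""
    failed = set(failed_disks)
    return all(not extent <= failed for extent in placements)


def whole_cell(cell, disks_per_cell):
    """The set of (cell, disk) pairs lost when an entire cell fails."""
    return {(cell, disk) for disk in range(disks_per_cell)}
```

Under these assumptions, two-way mirroring across two cells survives the loss of any number of disks in one cell, including the whole cell, but not the loss of one disk in each cell when an extent is mirrored across exactly those two disks. Three-way mirroring across three cells survives the simultaneous loss of two complete cells.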

Quiz
What are the three main Exadata services?
1. OMS
2. MS
3. GMON
4. CELLSRV
5. RS


Answer: 2, 4, 5


Quiz
If you use NORMAL ASM redundancy for all of your disk groups
in conjunction with ASM failure groups spread across two
Exadata cells, under which of the following scenarios will you
maintain data availability?
1. A single disk failure in a single cell
2. Simultaneous failure of multiple disks in a single cell
3. Simultaneous failure of a single disk in both cells
4. Complete failure of a single cell


Answer: 1, 2, 4
The prescribed configuration may provide protection against failure scenario 3 if, and only if,
there are no data extents mirrored to both of the failed disks. To guarantee data availability in
cases where simultaneous failures affect two cells, you must use HIGH ASM redundancy in
conjunction with failure groups spread across at least three Exadata cells.


Summary
In this lesson, you should have learned to describe:
The Exadata architecture
The relationship between the various storage abstractions
used in Exadata


Additional Resources

Lesson Demonstrations (Viewlets)

Exadata Process Introduction

http://stcurriculum.oracle.com/demos/db/11g/r2/dbmach/031ExadataProcessIntro/031exadataprocessintro_viewlet_swf.html

Hierarchy of Exadata Storage Objects

http://stcurriculum.oracle.com/demos/db/11g/r2/dbmach/032ExadataStorageObjects/032exadatastorageobjects_viewlet_swf.html

Creating Interleaved Grid Disks

http://stcurriculum.oracle.com/demos/db/11g/r2/dbmach/033ExadataInterleavedGridDisks/033exadatainterleavedgriddisks_viewlet_swf.html

Examining Exadata Smart Flash Cache

http://stcurriculum.oracle.com/demos/db/11g/r2/dbmach/034ExadataFlashCacheAdmin/034exadataflashcacheadmin_viewlet_swf.html

Exadata Smart Flash Cache Architecture

http://stcurriculum.oracle.com/demos/db/11g/r2/exadatav2/smartflashcachearchitecture/smartflashcachearchitecture.swf

My Oracle Support Notes

Oracle Reliable Datagram Sockets (RDS) and InfiniBand (IB) Support for RAC
Interconnect and Exadata Storage

https://support.oracle.com/CSP/main/article?cmd=show&type=NOT&id=745616.1


Practice 3 Overview:
Introducing Exadata Cell Architecture
In these practices, you will be familiarized with the Exadata cell
architecture. You will:
Examine the Exadata processes
Examine the hierarchy of cell objects
Create interleaved grid disks
Examine Exadata Smart Flash Cache


Exadata Configuration


Objectives
After completing this lesson, you should be able to:
Perform the initial Exadata boot sequence
Configure Exadata software
Create and configure ASM disk groups using Exadata
Use the CellCLI Exadata administration tool


Exadata Installation and Configuration Overview

[Slide diagram: six configuration tasks arranged as a numbered sequence: initial network
preparation; configuration of new Exadata servers; configuring Exadata software; configuring
hosts to use Exadata (step 4); configuring ASM and Database instances for Exadata; and
configuring ASM disk group for Exadata (step 6).]

Exadata Installation and Configuration Overview


Exadata ships with all hardware and software preinstalled. However, it is necessary to configure
Exadata. This slide introduces a general overview of the configuration tasks.
Note: In most cases, the installation and configuration activities described in this lesson occur as
part of the installation and configuration of Database Machine, and there is no requirement to
perform cell-by-cell configuration. You may need to conduct some of the activities described in
this lesson during the normal lifecycle of maintaining your Database Machine environment.
However, the complete Exadata configuration process would only be required in rare
circumstances, such as when upgrading from a Quarter-Rack Database Machine to a Half-Rack
or Full-Rack configuration. The Database Machine configuration process is
described later in this course in the lesson entitled Database Machine Configuration.


Initial Network Preparation

For each storage cell, assign the following IP addresses (repeat for each cell):
- One IP address for the bonded InfiniBand port
- One IP address for administration network access
- One IP address for lights out management
Note these network configuration recommendations:
- Set up a fault-tolerant, private network subnet for the InfiniBand network.
- Use the InfiniBand network for Oracle Clusterware.
- Assign a block of IP addresses for each network type.
- Do not allocate IP addresses ending in .0, .1, or .255.
- Define your storage cells to DNS.


Copyright 2010, Oracle and/or its affiliates. All rights reserved.

Initial Network Preparation


Each storage cell contains the following network ports:
1. One dual-port InfiniBand card for high-speed, high-volume data transfer: Each Exadata
cell is designed to be connected to two separate InfiniBand switches for high availability.
The card has two ports for availability reasons only, because each port is capable of
transferring the full data bandwidth generated by the storage cell. You will need to assign
one IP address to the bonded InfiniBand interface during the initial configuration of the
storage cell.
2. Gigabit Ethernet ports for general administration network access to the cell operating
system: Each Exadata server comes with four Gigabit Ethernet ports. However, only one
is required for administrative access. You will need to assign one IP address to the cell for
network access during the initial configuration process.
3. One Gigabit Ethernet port for lights out management: Exadata uses Sun Integrated Lights
Out Manager (ILOM). You should assign one IP address to the cell for ILOM during the
initial configuration of the storage cell.

Initial Network Preparation (continued)


Note the following network configuration and IP address recommendations:
The InfiniBand network should be a dedicated private network subnet for Exadata cells
and database server hosts. Multiple InfiniBand switches are
recommended to eliminate the switch as a single point of failure.
The InfiniBand network should be used for Oracle Clusterware network and storage
communications. Use the following command on your clusterware hosts to verify that the
private network for Oracle Clusterware communication is using InfiniBand:
oifcfg getif -type cluster_interconnect

The Reliable Datagram Sockets (RDS) protocol should be used over the InfiniBand
network for database server to cell communication and Oracle Real Application Clusters
(RAC) communication. Check the database alert log to verify that the private network for
Oracle RAC is running the RDS protocol over the InfiniBand network. The following
message should be in the log:
cluster interconnect IPC version: Oracle RDS/IP (generic)

Dedicate a block of IP addresses for the InfiniBand network and ensure that you allow for
future expansion.
Dedicate a block of IP addresses for the general administration interfaces and the lights
out management interfaces. The general administration interfaces and the lights out
management interfaces may be on the same subnet and may share that subnet with other
hosts. For example, on the 192.168.200.0/24 subnet, you might assign the block of IP
addresses between 192.168.200.31 and 192.168.200.50 for your Exadata general
administration interfaces and the lights out management interfaces. Other hosts sharing
the subnet would be allocated IP addresses outside the dedicated block. If you want, you
can place the general administration interfaces and the lights out management interfaces
on separate subnets; however, this is not required.
Do not allocate addresses that end in .0, .1, or .255, or those that would be used as
broadcast addresses for the specific netmask that you have selected. For example, avoid
addresses such as 192.168.200.0, 192.168.200.1, and 192.168.200.255.
Exadata cells do not require Domain Name System (DNS); however, DNS is recommended
for use in conjunction with Database Machine. If DNS is available in your network,
configure your DNS with the IP addresses and host names associated with the general
administrative network on each Exadata cell.
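
The address rules above can be checked mechanically before any cell is configured. The following is a minimal Python sketch, not an Oracle tool: the function name usable_cell_addresses is hypothetical, and the subnet and address block are the illustrative values used in the text.

```python
import ipaddress

def usable_cell_addresses(subnet, first, last):
    """Return the addresses in [first, last] that follow the lesson's rules."""
    net = ipaddress.ip_network(subnet)
    start = int(ipaddress.ip_address(first))
    end = int(ipaddress.ip_address(last))
    usable = []
    for value in range(start, end + 1):
        addr = ipaddress.ip_address(value)
        if addr not in net:
            continue  # outside the chosen subnet
        if addr in (net.network_address, net.broadcast_address):
            continue  # network and broadcast addresses for this netmask
        if value & 0xFF in (0, 1, 255):
            continue  # last octet is .0, .1, or .255
        usable.append(str(addr))
    return usable

# Illustrative block from the text: 192.168.200.31 through 192.168.200.50
block = usable_cell_addresses("192.168.200.0/24", "192.168.200.31", "192.168.200.50")
print(len(block), block[0], block[-1])
```

A block that passes this check still has to be kept clear of the other hosts that share the subnet, which the sketch does not attempt to verify.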


Configuration of New Exadata Servers

1. Check all physical connections.
2. Power on the Exadata server.
3. Answer questions during boot sequence:
     Domain Name System (DNS) server IP addresses
     Time preference (time region and location)
     Network Time Protocol (NTP) servers
     Ethernet and InfiniBand IP addresses, netmasks, gateway, and hostnames
     Remote management configuration details
4. Change the initial passwords for the root, celladmin, and cellmonitor users.
Repeat for each cell.

Configuration of New Exadata Servers


The slide lists the general steps to configure a new Exadata server:
1. Check all the physical connections to the Exadata server. It is important that all the
physical network connections are correct prior to configuring the cell. Check also that both
power supplies are connected and that you have a keyboard, video display, and mouse.
2. Power on the cell to boot its operating system.
3. Answer the configuration questions when you are prompted. The slide lists the information
that you need to provide.
4. After you successfully perform the previous step, the login screen is displayed. Change
the initial passwords for the root, celladmin, and cellmonitor users to more secure
passwords. The initial password for root is welcome1. The initial password for the
cellmonitor and celladmin users is welcome.


Answering Questions During the Initial Boot Sequence

...
Network interfaces
Name    State      IP address   Netmask   Gateway   Hostname
eth0    Linked
eth1    Unlinked
eth2    Unlinked
eth3    Unlinked
ib0     Linked
ib1     Linked
Warning. Some network interface(s) are disconnected. Check cables and switches and retry
Do you want to retry (y/n) [y]: n
Nameserver: mynameserv.company.com
Add more nameservers (y/n) [n]: n
Setting up local time...
Select country by number, [n]ext, [l]ast: 230
Select zone by number, [n]ext: 17
Selected timezone: America/Denver
Is this correct (y/n) [y]: y
The current ntp server(s):
Do you want to change it (y/n) [n]: y
Fully qualified hostname or ip address for NTP server. Press enter if none: ntp1.company.com
Continue adding more ntp servers (y/n) [n]: n
...


Answering Questions During the Initial Boot Sequence


The next four slides show an example of the initial boot configuration process for Exadata. On
each slide, the text in blue indicates a user input.
The configuration commences during the server boot sequence. The output from the initial part
of the boot sequence is not shown. This slide commences at the beginning of the interview
phase where user input is required.
In this slide, settings are made for the DNS name server, time zone, and NTP server.
Notice that the interview phase commences with a warning indicating that a number of network
interfaces are disconnected. As shown in the slide, it is safe to ignore this warning because
each Exadata server comes equipped with four Ethernet ports; however, only one (eth0) is
required. So it is normal for eth1, eth2, and eth3 to be disconnected. Always make sure that
the required network interfaces (eth0, ib0, and ib1) are correctly linked.

Answering Questions During the Initial Boot Sequence

...
Network interfaces
Name    State     IP address       Netmask         Gateway        Hostname
eth0    Linked
bond0   ib0,ib1
Select interface name to configure or press Enter to continue: eth0
Selected interface. eth0
IP address or none: 10.XXX.XXX.XXX
Netmask: 255.255.248.0
Gateway (IP address or none): 10.XXX.XXX.1
Fully qualified hostname or none: cell01.company.com
Continue configuring or re-configuring interfaces? (y/n) [y]: y
Network interfaces
Name    State     IP address       Netmask         Gateway        Hostname
eth0    Linked    10.XXX.XXX.XXX   255.255.248.0   10.XXX.XXX.1   cell01.company.com
bond0   ib0,ib1
Select interface name to configure or press Enter to continue: bond0
Selected interface. bond0
IP address: 192.168.50.76
Netmask: 255.255.255.0
Fully qualified hostname or none: cell01-priv.company.com
Continue configuring or re-configuring interfaces? (y/n) [y]: y
Network interfaces
Name    State     IP address       Netmask         Gateway        Hostname
eth0    Linked    10.XXX.XXX.XXX   255.255.248.0   10.XXX.XXX.1   cell01.company.com
bond0   ib0,ib1   192.168.50.76    255.255.255.0                  cell01-priv.company.com
Select interface name to configure or press Enter to continue: <enter>
...

Answering Questions During the Initial Boot Sequence (continued)


In this slide, the configuration phase continues with settings specified for the Ethernet network
(eth0) that supports administrative access to the storage server, along with the InfiniBand
network (bond0) that supports the main storage network.
Notice that the InfiniBand interface is named bond0 and uses bonding between the physical
InfiniBand interfaces ib0 and ib1. Bonding provides the ability to transparently fail over from
ib0 to ib1 or from ib1 to ib0 if connectivity to either interface is lost.
If you do not configure every interface in the list, the unconfigured interfaces will not be
started during system startup and the cell will not be fully functional. You can later configure, or
change, cell network settings using the ipconf utility.

Answering Questions During the Initial Boot Sequence

...
Select canonical hostname from the list below
1: cell01.company.com
2: cell01-priv.company.com
Canonical fully qualified domain name: 1
Select default gateway interface from the list below
1: eth0
Default gateway interface: 1
Canonical hostname: cell01.company.com
Nameservers: mynameserv.company.com
Timezone: America/Denver
NTP servers: ntp1.company.com
Default gateway device: eth0
Network interfaces
Name    State     IP address       Netmask         Gateway        Hostname
eth0    Linked    10.XXX.XXX.XXX   255.255.248.0   10.XXX.XXX.1   cell01.company.com
bond0   ib0,ib1   192.168.50.76    255.255.255.0                  cell01-priv.company.com
Is this correct (y/n) [y]: y
...

Answering Questions During the Initial Boot Sequence (continued)


In this slide, the network configuration is finalized with the specification of the canonical host
name and default gateway. Both of these settings map to the Ethernet network providing
administrative access to the cell.

Answering Questions During the Initial Boot Sequence

...
Do you want to configure basic ILOM settings (y/n) [y]: y
Loading basic configuration settings from ILOM ...
ILOM Fully qualified hostname [cell01-ilom.company.com]: cell01-ilom.company.com
ILOM IP address [10.XXX.XXX.YYY]: 10.XXX.XXX.YYY
ILOM Netmask [255.255.248.0]: 255.255.248.0
ILOM Gateway [10.XXX.XXX.1]: 10.XXX.XXX.1
ILOM Nameserver or none [mynameserv.company.com]: mynameserv.company.com
ILOM Use NTP Servers (enabled/disabled) [enabled]: enabled
ILOM First NTP server. Fully qualified hostname or ip address or none [ntp1.company.com]: ntp1.company.com
ILOM Second NTP server. Fully qualified hostname or ip address or none [none]: none
Basic ILOM configuration settings:
Hostname             : cell01-ilom.company.com
IP Address           : 10.XXX.XXX.YYY
Netmask              : 255.255.248.0
Gateway              : 10.XXX.XXX.1
DNS servers          : mynameserv.company.com
Use NTP servers      : enabled
First NTP server     : ntp1.company.com
Second NTP server    : none
Timezone (read-only) : America/Denver
Is this correct (y/n) [y]: y
...

Answering Questions During the Initial Boot Sequence (continued)


Configuration completes with settings for Integrated Lights Out Manager (ILOM).
If you choose not to configure ILOM at this time, you can use the ipconf utility to do so later.
After the user interview phase is completed, the Exadata server finalizes its system startup
process. The output from the remaining system startup activities is not shown in the slide.
Finally, a login prompt is displayed.


Exadata Administrative User Accounts


Three operating system users are configured for each Exadata server:
The root user can:
  Edit configuration files such as cellinit.ora and cellip.ora
  Change network configuration settings
  Run support and diagnostic utilities located under the /opt/oracle.SupportTools directory
  Run the CellCLI CALIBRATE command
  Perform all the tasks that the celladmin user can perform
The celladmin user can:
  Perform administrative tasks (CREATE, DROP, ALTER, and so on) using the CellCLI utility
  Package incidents for Oracle Support using the adrci utility
The cellmonitor user can only view (LIST) Exadata cell objects using the CellCLI utility.

Exadata Storage Server Administrative User Accounts


Three operating system users are configured for each Exadata server: root, celladmin,
and cellmonitor. The slide describes the function of each user account.
As mentioned before, after you successfully configure the cell, you should log in and change
the initial passwords for the root, celladmin, and cellmonitor users to more secure
passwords. The initial password for root is welcome1. The initial password for the
cellmonitor and celladmin users is welcome.

Configuring a New Exadata Cell

1. Run performance tests on the cell with CALIBRATE.
2. Configure the cell server software.
3. Create cell disks.
4. Create grid disks.
Repeat for each cell.

Configuring a New Exadata Cell


As part of the initial boot configuration, the cell server software is started with a basic
configuration. In addition, the flash modules are configured as cell disks and all the flash-based
cell disks are allocated to Exadata Smart Flash Cache.
At this point, you are ready to finalize the configuration of the Exadata cell. Following is a
summary of the recommended procedure. All the steps are executed using CellCLI:
1. As the root user, run performance tests on the cell with the CALIBRATE command.
2. As the celladmin or root user, configure the cell server software with the ALTER CELL
command.
3. As celladmin or root, create the disk-based cell disks by using the CREATE CELLDISK
command.
4. As celladmin or root, create the grid disks on each cell disk of the storage cell by using
the CREATE GRIDDISK command.
Repeat this process on each Exadata cell.


Important I/O Metrics for Oracle Databases


Workload                         Metric   Limiting resource   Requirement
OLTP (small random I/O)          IOPS     Disk bandwidth      High RPM and fast seek time
DW/OLAP (large sequential I/O)   MBPS     Channel bandwidth   Large I/O channel

Both metrics are measured with CALIBRATE.

Important I/O Metrics for Oracle Databases


The CALIBRATE command runs raw performance tests on Exadata disks and flash modules.
This enables you to measure two important database metrics, IOPS and MBPS:
IOPS (I/O per second): This metric represents the number of small random I/Os that can
be serviced in a second. The IOPS rate mainly depends on how fast the disk media can
spin and how many disks are present in the storage system.
MBPS (megabytes per second): The rate at which data can be transferred between the
computing server node and the storage array. This mainly depends on the capacity of the
I/O channel that is used to transfer data.
The database I/O workload typically consists of small random I/Os and large sequential I/Os.
Small random I/Os are more prevalent in an OLTP application environment in which each server
process reads a data block into the buffer cache for updates and the changed blocks are written
to storage in batches by the database writer (DBWn) process. Large sequential I/Os are
common in a data warehouse environment.
OLTP application performance mainly depends on how fast small I/Os are serviced, which
depends on how fast the disk can spin and find the data. Large I/O performance depends on the
capacity of the I/O channel that connects the server to the storage array; throughput is better
when the capacity of the channel is larger.
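
The two metrics are linked by the size of each I/O request: the data rate is the request rate multiplied by the request size. The following Python sketch illustrates that arithmetic only. The function names are hypothetical, and the request sizes (8 KB for a small random I/O, 1 MB for a large sequential I/O) and the rates are assumed round figures, not values stated in this lesson.

```python
def throughput_mbps(iops, io_size_kb):
    """Data rate in MB/s when `iops` requests of `io_size_kb` KB complete each second."""
    return iops * io_size_kb / 1024

def iops_at_throughput(mbps, io_size_kb):
    """Request rate needed to sustain `mbps` MB/s with requests of `io_size_kb` KB."""
    return mbps * 1024 / io_size_kb

# Assumed figures: a disk servicing 400 small random reads per second,
# and the same disk streaming 150 MB/s in 1 MB requests.
small_random = throughput_mbps(iops=400, io_size_kb=8)
large_sequential = iops_at_throughput(mbps=150, io_size_kb=1024)

print(f"400 IOPS at 8 KB moves {small_random:.3f} MB/s")
print(f"150 MB/s at 1 MB needs {large_sequential:.0f} IOPS")
```

With small requests the channel carries very little data even when the disk is fully busy, which is why IOPS is the limiting metric for OLTP and MBPS for data warehouse workloads.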

Testing Performance Using CALIBRATE


[root@cell01 ~]# cellcli
CellCLI: Release 11.2.1.2.0 - Production on Mon Nov 02 16:42:06 PST 2009
Copyright (c) 2007, 2009, Oracle. All rights reserved.
Cell Efficiency ratio: 1.0

CellCLI> CALIBRATE FORCE
Calibration will take a few minutes...
Aggregate random read throughput across all hard disk luns: 1601 MBPS
Aggregate random read throughput across all flash disk luns: 4194.49 MBPS
Aggregate random read IOs per second (IOPS) across all hard disk luns: 4838
Aggregate random read IOs per second (IOPS) across all flash disk luns: 137588
Controller read throughput: 1615.85 MBPS
Calibrating hard disks (read only) ...
Lun 0_0 on drive [20:0 ] random read throughput: 152.81 MBPS, and 417 IOPS
Lun 0_1 on drive [20:1 ] random read throughput: 154.72 MBPS, and 406 IOPS
...
Lun 0_10 on drive [20:10 ] random read throughput: 156.84 MBPS, and 421 IOPS
Lun 0_11 on drive [20:11 ] random read throughput: 151.58 MBPS, and 424 IOPS
Calibrating flash disks (read only, note that writes will be significantly slower).
Lun 1_0 on drive [[10:0:0:0]] random read throughput: 269.06 MBPS, and 19680 IOPS
Lun 1_1 on drive [[10:0:1:0]] random read throughput: 269.18 MBPS, and 19667 IOPS
...
Lun 5_2 on drive [[11:0:2:0]] random read throughput: 269.15 MBPS, and 19603 IOPS
Lun 5_3 on drive [[11:0:3:0]] random read throughput: 268.91 MBPS, and 19637 IOPS
CALIBRATE results are within an acceptable range.

Testing Performance Using CALIBRATE


The CALIBRATE command enables you to verify the disk and flash memory performance before
the cell is put online. You must execute this command while being logged in as the root user at
the operating system level.
The CALIBRATE FORCE command allows you to run the tests when Cell Server is running. If
you do not use the FORCE option, Cell Server must be shut down. Running CALIBRATE while
Cell Server is running impacts performance, which is why it is not recommended
during normal operations.
Because the Cell Server software is running immediately after the initial boot sequence, you
must either shut down the Cell Server software or execute the CALIBRATE FORCE command.
CALIBRATE FORCE is acceptable in this circumstance because the cell is not yet running a
user workload, so there is no work to disrupt. In the above example, which shows a typical
output for high performance disks, the results matched expectations. A message will alert you if
the performance measurements are substandard.
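
If you capture the CALIBRATE output in a text file, the per-LUN lines can be compared programmatically. The following Python sketch is an illustration only, not part of CellCLI: the function name parse_lun_results is hypothetical, and the regular expression assumes the line layout shown in the slide.

```python
import re

# Matches lines such as:
#   Lun 0_0 on drive [20:0 ] random read throughput: 152.81 MBPS, and 417 IOPS
#   Lun 1_0 on drive [[10:0:0:0]] random read throughput: 269.06 MBPS, and 19680 IOPS
LUN_LINE = re.compile(
    r"Lun\s+(\S+)\s+on drive\s+\[+([^\]]*?)\s*\]+\s+random read throughput:\s+"
    r"([\d.]+) MBPS, and (\d+) IOPS"
)

def parse_lun_results(text):
    """Extract one record per LUN line found in captured CALIBRATE output."""
    results = []
    for match in LUN_LINE.finditer(text):
        lun, drive, mbps, iops = match.groups()
        results.append(
            {"lun": lun, "drive": drive, "mbps": float(mbps), "iops": int(iops)}
        )
    return results

sample = """\
Lun 0_0 on drive [20:0 ] random read throughput: 152.81 MBPS, and 417 IOPS
Lun 0_1 on drive [20:1 ] random read throughput: 154.72 MBPS, and 406 IOPS
Lun 1_0 on drive [[10:0:0:0]] random read throughput: 269.06 MBPS, and 19680 IOPS
"""

results = parse_lun_results(sample)
slowest = min(results, key=lambda r: r["mbps"])
print(slowest["lun"], slowest["mbps"])
```

Comparing LUNs of the same type against each other is one simple way to spot a device that performs noticeably worse than its peers.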


Configuring the Exadata Cell Server Software

[celladmin@cell01 ~]$ cellcli
CellCLI: Release 11.2.1.2.0 - Production on Mon Nov 02 17:46:13 PST 2009
Copyright (c) 2007, 2009, Oracle. All rights reserved.
Cell Efficiency ratio: 1.0

CellCLI> ALTER CELL smtpServer='my_mail.example.com',
         smtpFromAddr='exadata.cell01@example.com',
         smtpPwd=<email_address_password>,
         smtpToAddr='jane.smith@example.com',
         notificationPolicy='critical,warning,clear',
         notificationMethod='mail'
Cell cell01 successfully altered

CellCLI>

Configuring the Exadata Cell Server Software


The settings provided during the initial boot sequence configure the hardware and cell operating
system. In addition, the Cell Server software is automatically configured using the CREATE
CELL command. By default, the cell name is set to the network host name of the Exadata server
and the INTERCONNECT1 attribute is set to bond0, which is the InfiniBand storage network
interface.
You can change the name of the cell or configure the optional Cell Server attributes by using the
ALTER CELL command.
The slide shows an example ALTER CELL command that configures email notification. This
facility sends email messages to the administrator of the storage cell whenever critical, warning,
and clear alerts are detected by the cell. In addition to email notification, it is possible to
configure notification using Simple Network Management Protocol (SNMP).
Note: After the initial boot configuration, Restart Server (RS) and Management Server (MS)
should be running. If not, an error message will display when using the CellCLI utility. In that
case, run the following CellCLI commands to start the RS and MS services:
ALTER CELL STARTUP SERVICES RS
ALTER CELL STARTUP SERVICES MS


Creating Cell Disks


CellCLI> CREATE CELLDISK ALL
CellDisk CD_00_cell01 successfully created
...
CellDisk CD_10_cell01 successfully created
CellDisk CD_11_cell01 successfully created

CellCLI> LIST CELLDISK
CD_00_cell01   normal
...
CD_10_cell01   normal
CD_11_cell01   normal
FD_00_cell01   normal
...
FD_14_cell01   normal
FD_15_cell01   normal

CellCLI>

Creating Cell Disks


After the Exadata cell is first configured, there are 16 flash-based cell disks, which are allocated
to Exadata Smart Flash Cache.
Before you can use the disk-based storage, you must create disk-based cell disks using the
CREATE CELLDISK command. The example in the slide shows the use of the CREATE
CELLDISK ALL command to automatically create 12 disk-based cell disks with default names.
In most cases, you can use the default cell disk names.
If desired, you can configure your cell disks to enable the creation of interleaved grid disks. Use
the following command to create cell disks with interleaving enabled:
CREATE CELLDISK ALL HARDDISK INTERLEAVING='normal_redundancy'
The above example also shows the use of the LIST CELLDISK command to display the disk-based
and flash-based cell disks. Check whether the command shows a status of normal for
all the cell disks.

Creating Grid Disks


CellCLI> CREATE GRIDDISK ALL PREFIX=data, SIZE=300G
GridDisk data_CD_00_cell01 successfully created
...
GridDisk data_CD_11_cell01 successfully created

CellCLI> CREATE GRIDDISK ALL PREFIX=fra
GridDisk fra_CD_00_cell01 successfully created
...
GridDisk fra_CD_11_cell01 successfully created

CellCLI> LIST GRIDDISK
data_CD_00_cell01   active
...
data_CD_11_cell01   active
fra_CD_00_cell01    active
...
fra_CD_11_cell01    active

CellCLI> exit
[celladmin@cell01 ~]$

[Diagram: a cell disk before and after grid disk creation. The first grid disk created uses the fastest portion of the disk.]

Creating Grid Disks


After cell disks are created, you can create grid disks by using the CREATE GRIDDISK
command. In the example in the slide, the ALL PREFIX option is used to automatically create
one grid disk on each cell disk. When the ALL PREFIX option is used, the generated grid disk
names are composed of the grid disk prefix followed by an underscore (_) and then the cell disk
name.
It is best practice to use the ASM disk group name as the prefix name for the corresponding grid
disks. In the example, prefix values data and fra are the names of the ASM disk groups that
will be created. Grid disk names must be unique across all cells within a single deployment. By
following the recommended naming conventions for naming the grid and cell disks, you will
automatically get unique names.
The optional SIZE attribute specifies the size of each grid disk. If omitted, the grid disk will
automatically consume all the space remaining on the corresponding cell disk.
The LIST GRIDDISK command shows all the grid disks that are created.
Note that for cell disks that are not enabled for interleaving, the first grid disk created on each
cell disk uses the outermost portion of the disk. In this area, each track contains more data than
the inner tracks, resulting in higher transfer rates and better performance. Because the best
available offset is chosen automatically in chronological order of grid disk creation, you should
first create those grid disks expected to contain the most frequently accessed data, and then
create the grid disks that will contain the relatively colder data.
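
The naming convention described above can be reproduced to predict, or cross-check, the grid disk names that CREATE GRIDDISK ALL PREFIX generates. This Python sketch is illustrative only. The helper names are hypothetical, and the cell disk name pattern CD_nn_<cell> with 12 disks per cell is taken from the slide examples.

```python
def cell_disk_names(cell, count=12):
    """Default disk-based cell disk names as they appear in the slides: CD_00_<cell> ..."""
    return [f"CD_{index:02d}_{cell}" for index in range(count)]

def grid_disk_names(prefix, cell_disks):
    """Grid disk name = prefix + underscore + cell disk name."""
    return [f"{prefix}_{cell_disk}" for cell_disk in cell_disks]

# Create the hotter disk group first so that it gets the outer tracks.
creation_order = ["data", "fra"]
names = []
for cell in ("cell01", "cell02"):
    for prefix in creation_order:
        names.extend(grid_disk_names(prefix, cell_disk_names(cell)))

print(names[0], names[-1], len(names))
# Uniqueness across cells follows from the cell name embedded in each cell disk name.
print(len(names) == len(set(names)))
```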

Creating Flash-Based Grid Disks


CellCLI> DROP FLASHCACHE
Flash cache cell01_FLASHCACHE successfully dropped

CellCLI> CREATE FLASHCACHE ALL SIZE=100G
Flash cache cell01_FLASHCACHE successfully created

CellCLI> CREATE GRIDDISK ALL FLASHDISK PREFIX=flash
GridDisk flash_FD_00_cell01 successfully created
GridDisk flash_FD_01_cell01 successfully created
...
GridDisk flash_FD_15_cell01 successfully created

CellCLI> LIST GRIDDISK
...
flash_FD_00_cell01   active
...
flash_FD_15_cell01   active

CellCLI> exit
[celladmin@cell01 ~]$

[Diagram: flash space before (all Flash Cache) and after (Flash Cache plus grid disks).]

Creating Flash-Based Grid Disks


By default, the initial cell configuration process creates flash-based cell disks on all the flash
devices, and then allocates all the available flash space to Exadata Smart Flash Cache. In
certain circumstances, you can benefit from creating flash-based grid disks to act as a
permanent flash-based data store. To create space for flash-based grid disks, you first need to
drop the default flash cache. Then you can create a flash cache and flash-based grid disks with
your chosen sizes.
In the example in the slide, the default flash cache is dropped. Next, a new Exadata Smart
Flash Cache is created. The new cache is 100 GB in total size with 6.25 GB of space allocated
on each of the 16 flash-based cell disks.
The CREATE GRIDDISK command is used to create flash-based grid disks in the same way as
for disk-based grid disks. Note the use of the FLASHDISK option to specify the use of flash-based
cell disks as the basis for the grid disks. In the example in the slide, 16 flash-based grid
disks are created and each consumes the remaining 17.75 GB of space available on the flash-based
cell disks. The flash-based grid disks follow the same default naming convention as disk-based
grid disks.
Although this example does not show it, you can create multiple grid disks on a flash-based cell
disk. Unlike physical disk devices, the order in which you allocate your flash space is not
important from a performance perspective. Likewise, interleaving is not applicable for flash-based
disks.
Note: Circumstances that favor the use of flash-based grid disks are discussed in the lesson
titled Optimizing Database Performance with Exadata.
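
The space figures in this example follow from simple arithmetic, sketched below in Python. The function name is hypothetical. The per-disk total and the total grid disk space are derived from the numbers quoted in the notes; they are not stated in the lesson itself.

```python
FLASH_CELL_DISKS = 16  # flash-based cell disks per cell, as stated in the notes

def cache_share_gb(total_cache_gb, disks=FLASH_CELL_DISKS):
    """Flash cache space taken from each cell disk when the cache is spread evenly."""
    return total_cache_gb / disks

cache_per_disk = cache_share_gb(100)       # CREATE FLASHCACHE ALL SIZE=100G
grid_per_disk = 17.75                      # remaining space quoted in the notes
implied_disk_size = cache_per_disk + grid_per_disk
total_grid_space = grid_per_disk * FLASH_CELL_DISKS

print(cache_per_disk, implied_disk_size, total_grid_space)
```

Under these figures, each flash-based cell disk contributes 6.25 GB to the cache, which implies roughly 24 GB per cell disk and 284 GB of flash-based grid disk space per cell.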

Configuring Hosts to Access Exadata Cells

1. Create the following directory and files:
# mkdir -p /etc/oracle/cell/network-config
# chown oracle:dba /etc/oracle/cell/network-config
# chmod ug+wx /etc/oracle/cell/network-config
$ cd /etc/oracle/cell/network-config
$ cat - > /etc/oracle/cell/network-config/cellinit.ora
ipaddress1=192.168.50.23/24
$ cat - > /etc/oracle/cell/network-config/cellip.ora
cell="192.168.51.27"
cell="192.168.51.28"
cell="192.168.51.29"
2. Restart database and ASM instances.
Repeat for each host.

Configuring Hosts to Access Exadata Cells


After your Exadata cells are configured, the database server hosts must be configured to use
the cells:
The cellinit.ora file contains the database server IP address that connects to the
storage network. This file is host specific, so the IP address will be specific to each
database server. The IP address is specified in Classless Inter-Domain Routing (CIDR)
format.
The cellip.ora file contains the IP addresses that are used by storage cells to send
data to the database server host. These IP addresses correspond to the bonded
InfiniBand interface (bond0) on the cells.
Restart the database and the Oracle ASM instances on the database server host after you finish
creating the cellinit.ora and cellip.ora files. After the files have been configured, they
should not be edited while your database or ASM instances are running.
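
Because cellinit.ora differs on every database server while cellip.ora is usually the same everywhere, it can help to generate and validate the file contents before copying them into place. The following Python sketch only builds the text of the two files in the format shown on the slide. The function names are hypothetical, and nothing here is an Oracle-supplied utility.

```python
import ipaddress

def render_cellinit(host_ib_cidr):
    """Text of cellinit.ora for one database server; the address must be in CIDR form."""
    if "/" not in host_ib_cidr:
        raise ValueError("expected CIDR notation such as 192.168.50.23/24")
    interface = ipaddress.ip_interface(host_ib_cidr)  # raises ValueError if malformed
    return f"ipaddress1={interface.with_prefixlen}\n"

def render_cellip(cell_ips):
    """Text of cellip.ora: one line per storage cell InfiniBand (bond0) address."""
    lines = []
    for ip in cell_ips:
        ipaddress.ip_address(ip)  # raises ValueError if malformed
        lines.append(f'cell="{ip}"')
    return "\n".join(lines) + "\n"

# Values copied from the slide example.
print(render_cellinit("192.168.50.23/24"), end="")
print(render_cellip(["192.168.51.27", "192.168.51.28", "192.168.51.29"]), end="")
```

Writing the rendered text to /etc/oracle/cell/network-config and restarting the instances remain manual steps, as described above.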


Configuring ASM and Database Instances for Exadata

Oracle Database and ASM software must be at least version 11.2.0.1.
Use ASM to store OCR and voting disks on Exadata.
Set the ASM_DISKSTRING ASM initialization parameter:
  ASM_DISKSTRING='o/*/*'
Set the COMPATIBLE database initialization parameter:
  COMPATIBLE='11.2.0.0.0'
Repeat for each host.

Configuring ASM and Database Instances for Exadata


Oracle Database and Oracle Grid Infrastructure 11g Release 2 (11.2.0.1) or later must be
installed on the database server before you can access Exadata from ASM and database
instances.
If you are using Oracle Clusterware, it is recommended that you place the Oracle Cluster
Registry (OCR) and voting disks on ASM.
To ensure that ASM discovers Exadata grid disks, set the ASM_DISKSTRING initialization
parameter. A search string with the following form is used to discover Exadata grid disks:
o/<cell IP address>/<grid disk name>
Wildcards may be used to expand the search string. For example, to explicitly discover all the
available Exadata grid disks, set ASM_DISKSTRING='o/*/*'. To discover a subset of
available grid disks having names that begin with data, set ASM_DISKSTRING='o/*/data*'.
Note that if the ASM_DISKSTRING initialization parameter is not set, then the default is to
discover all the available Exadata grid disks.
To configure a database instance to access cell storage, ensure that the COMPATIBLE
parameter is set to 11.2.0.0.0 or later in the database initialization file.
Note that Database Configuration Assistant (DBCA) 11.2.0.1 does not set the COMPATIBLE
initialization parameter to 11.2.0.0.0 by default, and you must set this parameter on the
Initialization Parameters page.
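
To see which grid disks a given search string would select, the wildcard behavior can be approximated outside ASM. The following Python sketch uses shell-style matching from the standard library as a stand-in. This is an assumption made for illustration: ASM performs its own discovery and its matching rules are not reproduced here. The function name and the cell IP addresses are hypothetical.

```python
from fnmatch import fnmatchcase

def discover(diskstring, disk_paths):
    """Return the paths selected by a shell-style wildcard pattern (approximation only)."""
    return [path for path in disk_paths if fnmatchcase(path, diskstring)]

# Hypothetical grid disk paths in the form o/<cell IP address>/<grid disk name>
paths = [
    "o/192.168.51.27/data_CD_00_cell01",
    "o/192.168.51.27/fra_CD_00_cell01",
    "o/192.168.51.28/data_CD_00_cell02",
    "o/192.168.51.28/fra_CD_00_cell02",
]

all_disks = discover("o/*/*", paths)
data_disks = discover("o/*/data*", paths)
print(len(all_disks), data_disks)
```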

Configuring ASM Disk Groups for Exadata

All candidate disks on cell01 and cell02:

Disk group DATA
  Failure group cell01
    o/<cell01 IP address>/data_cd_00_cell01
    o/<cell01 IP address>/data_cd_01_cell01
    ...
    o/<cell01 IP address>/data_cd_11_cell01
  Failure group cell02
    o/<cell02 IP address>/data_cd_00_cell02
    o/<cell02 IP address>/data_cd_01_cell02
    ...
    o/<cell02 IP address>/data_cd_11_cell02

Other candidate disks
  o/<cell01 IP address>/fra_cd_00_cell01
  o/<cell01 IP address>/fra_cd_01_cell01
  ...
  o/<cell01 IP address>/fra_cd_11_cell01
  o/<cell02 IP address>/fra_cd_00_cell02
  o/<cell02 IP address>/fra_cd_01_cell02
  ...
  o/<cell02 IP address>/fra_cd_11_cell02

CREATE DISKGROUP data NORMAL REDUNDANCY
DISK 'o/*/data*'
ATTRIBUTE 'compatible.rdbms' = '11.2.0.0.0',
          'compatible.asm' = '11.2.0.0.0',
          'cell.smart_scan_capable' = 'TRUE',
          'au_size' = '4M';

Configuring ASM Disk Groups for Exadata


You can now create ASM disk groups from your ASM instance. An ASM disk group can include
Exadata grid disks and conventional disks. However, to enable Smart Scan processing, all the
disks in an ASM disk group must be Exadata grid disks, and the following disk group attribute
settings must be used:
'compatible.rdbms' = '11.2.0.0.0'
'compatible.asm' = '11.2.0.0.0'
'cell.smart_scan_capable' = 'TRUE'
In addition, it is recommended that you set the AU_SIZE disk group attribute value to 4M to
optimize disk scanning.
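As a rough aid, the attribute requirements can be expressed as a small Python check. This is only a sketch: the function name and the dictionary input are hypothetical, and treating the two compatibility values as minimum versions is an assumption made for illustration.

```python
# Minimum compatibility level named in the text, as a comparable tuple.
MIN_VERSION = (11, 2, 0, 0, 0)


def version_tuple(text):
    """Convert a dotted version string such as '11.2.0.0.0' into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def smart_scan_problems(attrs):
    """Return a list of reasons why a disk group attribute set would not
    satisfy the Smart Scan settings described above (hypothetical helper)."""
    problems = []
    for name in ("compatible.rdbms", "compatible.asm"):
        value = attrs.get(name)
        # Assumption: values at or above 11.2.0.0.0 are acceptable.
        if value is None or version_tuple(value) < MIN_VERSION:
            problems.append(name + " must be 11.2.0.0.0")
    if attrs.get("cell.smart_scan_capable", "").upper() != "TRUE":
        problems.append("cell.smart_scan_capable must be TRUE")
    return problems


slide_attributes = {
    "compatible.rdbms": "11.2.0.0.0",
    "compatible.asm": "11.2.0.0.0",
    "cell.smart_scan_capable": "TRUE",
    "au_size": "4M",
}
slide_problems = smart_scan_problems(slide_attributes)
missing_problems = smart_scan_problems({"compatible.rdbms": "11.2.0.0.0"})
```

The au_size attribute is deliberately not checked, because the text describes it as a recommendation rather than a requirement.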
The example in the slide shows candidate ASM disks from two Exadata cells: cell01 and
cell02. The CREATE DISKGROUP statement references all of the candidate ASM disks having
names that start with data. By default, ASM failure groups corresponding to each cell are
automatically defined. As a result, two failure groups are automatically created using
corresponding grid disks from each cell. By default, the failure group names correspond to the
cell names.
Once created, an Exadata-based disk group can be used to house Oracle data files in the same
way as an ASM disk group based on any other storage. To complement the recommended
AU_SIZE setting of 4 MB, you should set the initial extent size to 8 MB for large segments. This
can be done using segment-level or tablespace-level settings. The recommended approaches
are discussed in the lesson entitled Optimizing Database Performance with Exadata.

Optional Configuration Tasks

Configure Exadata storage security.


Configure I/O Resource Management (IORM).


Optional Configuration Tasks


After you complete the cell configuration, you can perform the following optional tasks on the
storage cell:
Configure Exadata storage security.
Configure I/O Resource Management (IORM). IORM is covered in detail in the lesson
titled Exadata and I/O Resource Management.
Note: Repeat each configuration task on each relevant storage cell.


Exadata Storage Security Overview


Slide diagram: two ASM clusters share grid disks on Exadata Cell 1 and Exadata Cell 2.
  ASM cluster A (RAC DB instances and a non-RAC DB instance): ASM-scoped security mode
  ASM cluster B (RAC DB instances and a non-RAC DB instance): database-scoped security mode


Exadata Storage Security Overview


Exadata storage security is implemented by controlling which ASM clusters and database
clients can access specific grid disks on storage cells.
To set up security so that all database clients of an ASM cluster have access to specified
grid disks, configure ASM-scoped security.
To set up security so that specific database clients of an ASM cluster have access to
specified grid disks, configure database-scoped security.
Both concepts are illustrated in the slide. ASM cluster A shares two grid disks per cell with all of
its database clients. ASM cluster B shares one grid disk per cell to store the single instance
database, and another two grid disks (one per cell) to store the RAC database.
Note: By default, none of these security modes are implemented. This situation is called open
security where all database clients can access all grid disks. Open security does not require any
configuration, and as long as the network and database hosts are well secured you can use this
mode for your production databases. Open security is also useful for non-production
environments such as those that house test or development databases.


Exadata Storage Security Implementation


Slide diagram: key distribution between the ASM cluster hosts and the Exadata cells.

ASM-scoped security
  CREATE KEY
  ASM cluster hosts: cellkey.ora in /etc/oracle/cell/network-config
  Each cell: ASSIGN KEY FOR <ASM>
  Each disk: CREATE/ALTER GRIDDISK availableTo <ASM>

Database-scoped security
  CREATE KEY
  Each database: cellkey.ora in $ORACLE_HOME/admin/<db_unique_name>/pfile
  Each cell: ASSIGN KEY FOR <DB>
  Each disk: CREATE/ALTER GRIDDISK availableTo <DB>


Exadata Storage Security Implementation


The slide briefly describes the steps to configure ASM-scoped and database-scoped security. It
is important to realize that you must set up ASM-scoped security first if you want to set up
database-scoped security.
To implement ASM-scoped security, perform the following steps:
1. Shut down your ASM and database instances.
2. Generate a security key using the CREATE KEY CellCLI command. Run this command
once only on any cell.
3. Construct a cellkey.ora file using the generated security key. Copy the cellkey.ora
file into the /etc/oracle/cell/network-config/ directory on every host in your
ASM cluster.
4. Use the ASSIGN KEY command to assign the security key to the Oracle ASM cluster on
all the cells that you want the Oracle ASM cluster to access. The ASM cluster name is
determined by the DB_UNIQUE_NAME initialization parameter setting.
5. Enter the Oracle ASM cluster name in the availableTo attribute with the CREATE
GRIDDISK or ALTER GRIDDISK command to configure security on the grid disks on all
the cells that you want the Oracle ASM cluster to access. At the conclusion of this step,
each grid disk has an association with the ASM cluster that is allowed to use the disk.
6. Restart your ASM and database instances.

Exadata Storage Security Implementation (continued)


After you have configured and tested ASM-scoped security, you can proceed to set up
database-scoped security. Perform the following steps for each database you want to configure
with database-scoped security:
1. Shut down your ASM and database instances.
2. Generate a security key using the CREATE KEY CellCLI command. Run this command
once only on any cell.
3. Construct a cellkey.ora file using the generated security key. Copy the cellkey.ora
file into the $ORACLE_HOME/admin/<db_unique_name>/pfile/ directory on every host
running your database.
4. Use the ASSIGN KEY command to assign the security key to the database on all the cells
that you want the database to access. The database name is determined by the
DB_UNIQUE_NAME initialization parameter setting.
5. Enter the database name in the availableTo attribute with the CREATE GRIDDISK or
ALTER GRIDDISK command to configure security on the grid disks on all the cells that
you want the database to access. At the conclusion of this step, each grid disk has an
association with the ASM cluster and specific database that is allowed to use the disk.
6. Restart your ASM and database instances.
Note: For more information, including examples and further details, refer to the Oracle Exadata
Storage Server Software User's Guide 11g Release 2 (11.2).


Quiz
Grid disks are seen by ASM by using a discovery string that
starts with:
1. c/
2. o/
3. g/
4. e/


Answer: 2


Quiz
The first grid disk you create uses the slowest tracks of the
corresponding physical disk.
1. TRUE
2. FALSE


Answer: 2


Quiz
When you create a disk group for which you want Exadata
smart storage capabilities enabled, what three attributes must
you specify?
1. compatible.rdbms
2. compatible.asm
3. au_size
4. disk_repair_time
5. cell.smart_scan_capable


Answer: 1, 2, 5


Summary
In this lesson, you should have learned how to:
Perform the initial Exadata boot sequence
Configure Exadata software
Create and configure ASM disk groups using Exadata
Use the CellCLI Exadata administration tool



Additional Resources

Lesson Demonstrations (Viewlets)

Exadata Cell Configuration
http://st-curriculum.oracle.com/demos/db/11g/r2/dbmach/041ExadataCellConfig/041exadatacellconfig_viewlet_swf.html

Exadata Storage Provisioning
http://st-curriculum.oracle.com/demos/db/11g/r2/dbmach/042ExadataStorageProvisioning/042exadatastorageprovisioning_viewlet_swf.html

Consuming Exadata Grid Disks Using ASM
http://st-curriculum.oracle.com/demos/db/11g/r2/dbmach/043ExadataConsumingGridDisks/043exadataconsuminggriddisks_viewlet_swf.html

Exadata Cell User Accounts
http://st-curriculum.oracle.com/demos/db/11g/r2/dbmach/044ExadataUserAccounts/044exadatauseraccounts_viewlet_swf.html

Exadata Cell First Boot
http://st-curriculum.oracle.com/demos/db/11g/r2/exadatav2/cellfirstboot/cellfirstboot.swf

Another Example of Exadata Cell Configuration
http://st-curriculum.oracle.com/demos/db/11g/r2/exadatav2/cellcli/cellcli.swf



Practice 4 Overview:
Configuring Exadata
In these practices, you will perform a variety of Exadata
configuration tasks, including cell configuration and storage
provisioning. You will also consume Exadata storage using
ASM and exercise the privileges associated with the different
cell user accounts.



Exadata Performance Monitoring and Maintenance


Objectives
After completing this lesson, you should be able to:
Describe the various performance monitoring facilities
available for Exadata
Monitor Exadata from directly within a cell, from a
database instance, and through Enterprise Manager
Interpret SQL execution plans that use Smart Scan
Outline probable maintenance scenarios



Monitoring Overview

1. Metrics
2. Alerts
3. Active requests
4. Execution plans
5. V$ views
6. Wait events


Monitoring Overview
After Exadata is configured and in use, the administrative focus shifts to ongoing monitoring
and maintenance. To monitor Exadata, you can use the following tools and information:
1. Exadata cell metrics
2. Exadata cell alerts
3. Exadata active requests
4. Database SQL statement execution plans
5. Database V$ views
6. Database wait events
7. Oracle Enterprise Manager Exadata monitoring plug-in


Exadata Metrics and Alerts Architecture


Slide diagram: flow of metrics and alerts within a cell. Components shown: CELLSRV, MS, ADR, Disk, Admin.
  CELLSRV collects metrics. One hour of metric values is held in memory.
  MS keeps a set of the metric values. Collected metrics: Cell, Cell Disks, Grid Disk, IORM, Interconnect.
  Every hour MS flushes metric values to disk. Metric and alert history is kept on disk for seven days (ALTER CELL).
  Alert sources: metric thresholds exceeded, CELLSRV internal errors, cell software issues, cell hardware issues.
  Alerts are delivered by email and/or SNMP.
  Admin commands: LIST METRICCURRENT, LIST METRICHISTORY, LIST ALERTHISTORY.


Exadata Metrics and Alerts Architecture


You can monitor each cell with Exadata cell metrics. CELLSRV periodically records important
run-time properties, called metrics, for cell components such as CPUs, cell disks, grid disks,
flash cache, and IORM statistics. These metrics are recorded in memory.
Based on its own metric collection schedule, MS gets the set of metric data accumulated by
CELLSRV. MS keeps a subset of the metric values in memory, and writes a history to the disk
repository every hour.
The retention period for metric and alert history entries is specified by the
metricHistoryDays cell attribute. You can modify this setting with the CellCLI ALTER CELL
command. By default, it is seven days. This process is conceptually similar to database AWR
snapshots.
You can get the metric value history by using the CellCLI LIST METRICHISTORY command,
and the current metric values by using the LIST METRICCURRENT command.
At the Exadata cell level, you can define thresholds for metrics. Using the Enterprise Manager
plug-in for Exadata, you can set separate EM thresholds for all the Exadata metrics supported
by the plug-in.


Exadata Metrics and Alerts Architecture (continued)


In addition to metrics, Exadata can trigger alerts. Alerts represent events of importance
occurring within the cell, typically indicating that an Exadata cell function is compromised. MS
triggers an alert when it discovers a:
Cell hardware issue
Cell software or configuration issue
CELLSRV internal error
Metric that has exceeded an alert threshold
You can view triggered alerts using the LIST ALERTHISTORY command. In addition, you can
configure the cell to instruct MS to automatically send an email and/or SNMP messages to a
designated set of storage administrators.


Monitoring Exadata with Metrics

Metrics
  name
  metricObjectName
  metricValue
  alertState: normal, warning, critical
  metricType: cumulative, instantaneous, rate, transition
  objectType: IORM_CONSUMER_GROUP, IORM_DATABASE, IORM_CATEGORY, CELL, CELLDISK,
              CELL_FILESYSTEM, GRIDDISK, HOST_INTERCONNECT, FLASHCACHE
  unit: number, % (percentage), F (fahrenheit), C (celsius)

Thresholds (CREATE|ALTER THRESHOLD)
  name
  comparison
  critical
  occurrences
  observation
  warning


Monitoring Exadata with Metrics


Metrics are recorded observations of important run-time properties or internal instrumentation
values of the storage cell and its components, such as cell disks or grid disks. Metrics are a
series of measurements that are computed and retained in memory for an interval of time, and
stored on a disk for a more permanent history.
The graphic in the slide describes some of the important metric attributes. Each metric:
Has a name and description
Is associated with a metricObjectName that is the name of the object being measured,
such as a specific cell disk, grid disk, or consumer group
Belongs to a group that is defined by its objectType attribute. The possible groups are
shown in the slide.
Has a metricType, which is an indicator of how the statistic was created or defined.
Possible values and their meanings are:
- cumulative: Cumulative statistics since the metric was created
- instantaneous: Value at the time that the metric is collected
- rate: Rates computed by averaging statistics over observation periods
- transition: Collected at the time when the value of the metric has changed;
typically captures important transitions in hardware status
Has a measurement unit. Possible units are shown in the slide.

Monitoring Exadata with Metrics (continued)


Understanding the composition of the metric name provides a good insight into the meaning of
the metric. The value of the name attribute is a composite of abbreviations. The attribute value
starts with an abbreviation of the object type on which the metric is defined:
CL_ (cell)
CD_ (cell disk)
GD_ (grid disk)
FC_ (flash cache)
DB_ (database)
CG_ (consumer group)
CT_ (category)
N_ (interconnect network)
After the abbreviation of the object type, many metric names conclude with an abbreviation that
relates to the description of the metric. For example, CL_FANS is the instantaneous number of
working fans on the cell.
I/O-related metric name attributes continue with one of the following combinations to identify the
operation:
IO_RQ (number of requests)
IO_BY (number of MB)
IO_TM (I/O latency)
IO_WT (I/O wait time)
Next in the name could be _R for read or _W for write. Following that, there might be _SM or _LG
to identify small or large I/Os, respectively. At the end of the name, there could be _SEC to
signify per second or _RQ to signify per request. For example:
CD_IO_RQ_R_SM is the number of requests to read small blocks on a cell disk.
GD_IO_BY_W_LG_SEC is the number of MB of large block I/O per second on a grid disk.
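The naming convention can be made concrete with a small Python decoder. This is a sketch that covers only the abbreviations listed above; the function name and the returned dictionary keys are hypothetical, and any part of a name that the lists do not explain is returned unchanged as a remainder.

```python
OBJECT_TYPES = {
    "CL": "cell", "CD": "cell disk", "GD": "grid disk", "FC": "flash cache",
    "DB": "database", "CG": "consumer group", "CT": "category",
    "N": "interconnect network",
}
OPERATIONS = {
    "RQ": "number of requests", "BY": "number of MB",
    "TM": "I/O latency", "WT": "I/O wait time",
}
DIRECTIONS = {"R": "read", "W": "write"}
SIZES = {"SM": "small", "LG": "large"}
SUFFIXES = {"SEC": "per second", "RQ": "per request"}


def decode_metric_name(name):
    """Split a metric name into the components described in the text."""
    parts = name.split("_")
    decoded = {"object": OBJECT_TYPES.get(parts[0], "unknown")}
    rest = parts[1:]
    # I/O-related names continue with IO_<operation>, then optional qualifiers.
    if len(rest) >= 2 and rest[0] == "IO" and rest[1] in OPERATIONS:
        decoded["operation"] = OPERATIONS[rest[1]]
        rest = rest[2:]
        if rest and rest[0] in DIRECTIONS:
            decoded["direction"] = DIRECTIONS[rest.pop(0)]
        if rest and rest[0] in SIZES:
            decoded["size"] = SIZES[rest.pop(0)]
        if rest and rest[0] in SUFFIXES:
            decoded["per"] = SUFFIXES[rest.pop(0)]
    if rest:
        # Anything not covered by the lists above is left undecoded.
        decoded["remainder"] = "_".join(rest)
    return decoded


small_reads = decode_metric_name("CD_IO_RQ_R_SM")
large_writes = decode_metric_name("GD_IO_BY_W_LG_SEC")
fans = decode_metric_name("CL_FANS")
```

For the two I/O examples in the text, the decoder yields a cell disk read-request count for small I/Os and a grid disk large-write MB-per-second rate; CL_FANS decodes to the cell object type with FANS as the undecoded remainder.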
If a metric value crosses a user-defined threshold, an alert will be generated. Metrics can be
associated with warning and critical thresholds. Thresholds relate to extreme values in the
metric, which might indicate a problem or other event of interest to an administrator.
Thresholds are supported on cell disk and grid disk I/O error count metrics (CD_IO_ERRS_MIN
and GD_IO_ERRS_MIN), along with the cell memory utilization (CL_MEMUT) and cell filesystem
utilization (CL_FSUT) metrics. In addition, you can set thresholds for I/O Resource Management
(IORM) related metrics. The CellCLI LIST ALERTDEFINITION command lists the metrics for
which thresholds can be set.
Users of Enterprise Manager Grid Control with the Exadata Plug-in can configure a separate set
of thresholds and alerts in the Grid Control environment. These can be used in conjunction with
metrics and alerts from across your systems to provide an enterprise-level view of system health
and state.
Note: For a complete reference of metric and threshold attributes, refer to the Oracle Exadata
Storage Server Software User's Guide. For more information about the Exadata Plug-in for
Enterprise Manager Grid Control, refer to the Oracle Exadata Storage Server Documentation
library.


Monitoring Exadata with Metrics: Example


CellCLI> LIST METRICDEFINITION WHERE objectType = 'CELL' DETAIL
         name:        CL_CPUT
         description: "Cell CPU Utilization is the percentage of time over
                      the previous minute that the system CPUs were not
                      idle (from /proc/stat). "
         metricType:  Instantaneous
         objectType:  CELL
         unit:        %
         ...
CellCLI> LIST METRICHISTORY WHERE name like 'CL_.*'
         AND collectionTime > '2009-10-11T15:28:36-07:00'
         CL_RUNQ   cell03_2   6.0      2009-10-11T15:28:37-07:00
         CL_CPUT   cell03_2   47.6 %   2009-10-11T15:29:36-07:00
         CL_FANS   cell03_2   1        2009-10-11T15:29:36-07:00
         CL_TEMP   cell03_2   0.0 C    2009-10-11T15:29:36-07:00
         CL_RUNQ   cell03_2   5.2      2009-10-11T15:29:37-07:00
         ...
CellCLI> LIST METRICCURRENT WHERE objectType = 'CELLDISK'
         CD_IO_TM_W_SM_RQ   CD_1_cell03   205.5 us/request
         CD_IO_TM_W_SM_RQ   CD_2_cell03   93.3 us/request
         CD_IO_TM_W_SM_RQ   CD_3_cell03   0.0 us/request
         ...


Monitoring Exadata with Metrics: Example


The slide shows you some basic commands that you could use to display metric information:
Use the LIST METRICDEFINITION command to display the metric definitions for the
cell. A metric definition describes the configuration of a metric. The example does not
specify any particular metric, so all metrics corresponding to the WHERE clause are printed.
In addition to the WHERE clause, you can also specify the metric definition attributes you
want to print. If the ATTRIBUTES clause is not used, a default set of attributes is displayed.
To list all the attributes, you can add the DETAIL keyword at the end of the command.
Use the LIST METRICHISTORY command to display the metric history for the cell. A
metric history describes a collection of past metric observations. Similar to the LIST
METRICDEFINITION command, you can specify attribute filters, an attribute list, and the
DETAIL keyword for the LIST METRICHISTORY command. The above example lists
metrics having names that start with CL_ that were collected after the specified time.
Use the LIST METRICCURRENT command to display the current metric values for the
cell. The above example lists all cell disk metrics. The metric values shown in the slide
correspond to the average latency per request of writing small blocks to a cell disk. For
this metric there is a metric observation for every cell disk.
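If you capture LIST METRICHISTORY output for later analysis, a short script can turn each row into a structured record. The Python sketch below assumes the whitespace-separated column layout shown in the slide (name, object, value, optional unit, collection time); the actual CellCLI output layout may differ, and the function names are hypothetical.

```python
from datetime import datetime

# Rows copied from the slide example, one observation per line.
SAMPLE_OUTPUT = """\
CL_RUNQ cell03_2 6.0 2009-10-11T15:28:37-07:00
CL_CPUT cell03_2 47.6 % 2009-10-11T15:29:36-07:00
CL_FANS cell03_2 1 2009-10-11T15:29:36-07:00
CL_TEMP cell03_2 0.0 C 2009-10-11T15:29:36-07:00
CL_RUNQ cell03_2 5.2 2009-10-11T15:29:37-07:00
"""


def parse_metric_line(line):
    """Parse one row, assuming: name, object, value, optional unit, timestamp."""
    tokens = line.split()
    return {
        "name": tokens[0],
        "object": tokens[1],
        "value": float(tokens[2].replace(",", "")),
        "unit": " ".join(tokens[3:-1]),  # empty string when no unit is shown
        "time": datetime.fromisoformat(tokens[-1]),
    }


def latest_values(text):
    """Keep the most recent observation for each metric name."""
    latest = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        row = parse_metric_line(line)
        current = latest.get(row["name"])
        if current is None or row["time"] >= current["time"]:
            latest[row["name"]] = row
    return latest


latest = latest_values(SAMPLE_OUTPUT)
```

Keeping only the latest observation per metric mirrors what LIST METRICCURRENT reports, which can be convenient when comparing a saved history against current values.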


Monitoring Exadata with Alerts

Alerts
  name
  alertSource: BMC, ADR, Metric
  severity: warning, critical, clear, info
  alertType: stateful, stateless
  metricObjectName
  metricName
  examinedBy
  alertAction
  alertMessage
  failedMail
  failedSNMP
  beginTime
  endTime
  notificationState: 0, 1, 2, 3

ALTER ALERTHISTORY ALL examinedBy="<administrator>"


Monitoring Exadata with Alerts


Alerts represent events of importance occurring within the storage cell, typically indicating that
storage cell functionality is either compromised or in danger of failure. An administrator should
investigate alerts, because they might require urgent corrective or preventative action. Use the
ALTER CELL command to configure email or SNMP notification for alerts.
Alerts are either stateful or stateless. Stateful alerts represent observable cell states that can be
subsequently retested to detect whether the state has changed, indicating that a previously
observed alert condition is no longer a problem. Stateless alerts represent point-in-time events
that do not represent a persistent condition; they simply show that something has occurred.
Alerts can have one of the following severities: warning, critical, clear, or info.
Examples of possible events that trigger alerts are physical disk failure, disk read/write errors,
cell temperature exceeding recommended value, cell software failure, and excessive I/O
latency.
Metrics can be used to signal stateful alerts using warning or critical threshold values.
When the metric value crosses the threshold value, an alert is signaled. An alert with a clear
severity indicates that a previous critical or warning condition has returned to normal. For
threshold-based alerts, a clear alert is generated when the measured value crosses back over
the threshold value.
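The life cycle of a threshold-based stateful alert can be sketched as a small state machine. The Python below is an illustration only, not the MS implementation: it assumes a greater-than comparison, ignores the occurrences and observation settings, and assumes that every change of state produces a new alert entry.

```python
def severity_for(value, warning, critical):
    """Classify one measurement against the two thresholds ('>' comparison assumed)."""
    if value > critical:
        return "critical"
    if value > warning:
        return "warning"
    return "normal"


def alert_events(values, warning, critical):
    """Return the alert severities raised as the measured value changes state."""
    events = []
    state = "normal"
    for value in values:
        new_state = severity_for(value, warning, critical)
        if new_state != state:
            # Returning to normal is reported as a 'clear' alert.
            events.append("clear" if new_state == "normal" else new_state)
            state = new_state
    return events


events = alert_events([10, 1200, 2500, 1500, 500], warning=1000, critical=2000)
```

With the hypothetical measurements above, the sequence rises through warning to critical, drops back to warning, and finally produces a clear alert when the value falls below the warning threshold.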

Monitoring Exadata with Alerts (continued)


Alerts with an info severity are stateless and log conditions that might be informative to an
administrator but for which no administrator action is required. Informational alerts are not
distributed by email or SNMP notifications.
The slide illustrates some of the important alert attributes. Each alert has the following attributes:
name provides an identifier for the alert.
alertSource provides the source of the alert. Some possible sources are listed in the
slide.
severity determines the importance of the alert. Possible values are warning,
critical, clear, and info.
alertType provides the type of the alert: stateful or stateless. Stateful alerts are
automatically cleared on transition to normal. Stateless alerts are never cleared unless you
change the alert by setting the examinedBy attribute. This attribute identifies the
administrator who reviewed the alert and is the only alert attribute that can be modified by
the administrator using the ALTER ALERTHISTORY command.
metricObjectName is the object for which a metric threshold has caused an alert.
metricName provides the metric name if the alert is based on a metric.
alertAction is the recommended action to perform for this alert.
alertMessage provides a brief explanation of the alert.
failedMail is the intended email recipient when a notification failed.
failedSNMP is the intended SNMP subscriber when a notification failed.
beginTime provides the timestamp when an alert changes its state.
endTime provides the timestamp for the end of the period when an alert changes its state.
notificationState indicates progress in notifying subscribers to alert messages:
- 0: never tried
- 1: sent successfully
- 2: retrying (up to 5 times)
- 3: five failed retries
Note: Some I/O errors may result in an ASM disk going offline without generating an alert in
Exadata. You should continue to perform I/O monitoring from your databases and ASM
environments to identify and remedy these kinds of problems.


Displaying Alert Examples

CellCLI> LIST ALERTDEFINITION ATTRIBUTES name, metricName, description


ADRAlert "CELL Incident Error"
HardwareAlert "Hardware Alert"
StatefulAlert_CG_IO_RQ_LG CG_IO_RQ_LG "Threshold Based Stateful Alert"
StatefulAlert_CG_IO_RQ_LG_SEC CG_IO_RQ_LG_SEC "Threshold Based Alert"
StatefulAlert_CG_IO_RQ_SM CG_IO_RQ_SM "Threshold Based Stateful Alert"
...

CellCLI> LIST ALERTHISTORY WHERE severity = 'critical' AND examinedBy = '' DETAIL

CellCLI> ALTER ALERTHISTORY 1671443814 examinedBy="JFV"

CellCLI> CREATE THRESHOLD ct_io_wt_lg_rq.interactive warning=1000, critical=2000, comparison='>', occurrences=2, observation=5


Displaying Alert Examples


The slide shows you some examples of commands that display alert information. The
commands for displaying alerts are very similar to the ones used for displaying metric
information:
Use the LIST ALERTDEFINITION command to display the definition for every alert that
can be produced on the cell. The example in the slide displays the alert name, metric
name, and description. The metric name identifies the metric on which the alert is based.
ADRAlert and HardwareAlert are not based on any metric and, therefore, do not have
metric names.
Use the LIST ALERTHISTORY command to display the alert history that has occurred on
a cell. The example in the slide lists in detail all critical alerts that have not been reviewed
by an administrator.
Use the ALTER ALERTHISTORY command to update the alert history for the cell. The
above example shows how to set the examinedBy attribute to the user ID of the
administrator that examined the alert. The examinedBy attribute is the only
ALERTHISTORY attribute that can be modified. The example uses the alert sequence ID to
identify the alert. alertSequenceID provides a unique sequence ID number for the alert.
When an alert changes its state, another occurrence of the alert is created with the same
sequence number but with a different timestamp.

Displaying Alert Examples (continued)

The CREATE THRESHOLD command creates a threshold that specifies the conditions for
generation of a metric alert. The example creates a threshold for the CT_IO_WT_LG_RQ
metric associated with the INTERACTIVE category. This metric specifies the average
number of milliseconds that large I/O requests issued by the category have waited to be
scheduled by IORM in the past minute. A large value indicates that the I/O workload from
this category is exceeding the allocation specified for it in the category plan. The alert is
triggered by two consecutive measurements (occurrences=2) over the threshold values:
one second for a warning alert (warning=1000) and two seconds for a critical alert
(critical=2000). The observation attribute is the number of measurements over
which measured values are averaged.
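One way to picture how these attributes interact is the Python sketch below. It is an interpretation for illustration, not the cell's actual algorithm: it assumes that each new measurement is averaged with the preceding ones over a sliding window of observation values, and that an alert level is reported once the averaged value has exceeded a threshold for occurrences consecutive measurements.

```python
from collections import deque


def threshold_alerts(samples, warning, critical, occurrences, observation):
    """Return the alert level after each sample (assumed semantics, '>' comparison)."""
    window = deque(maxlen=observation)  # most recent measurements to average
    over_warning = 0                    # consecutive averages above warning
    over_critical = 0                   # consecutive averages above critical
    levels = []
    for sample in samples:
        window.append(sample)
        average = sum(window) / len(window)
        over_critical = over_critical + 1 if average > critical else 0
        over_warning = over_warning + 1 if average > warning else 0
        if over_critical >= occurrences:
            levels.append("critical")
        elif over_warning >= occurrences:
            levels.append("warning")
        else:
            levels.append("normal")
    return levels


# Hypothetical wait times in milliseconds, checked without averaging.
unaveraged = threshold_alerts(
    [500, 1500, 1500, 2500, 2500],
    warning=1000, critical=2000, occurrences=2, observation=1,
)
# The same thresholds with a two-measurement average.
averaged = threshold_alerts(
    [0, 4000, 4000],
    warning=1000, critical=2000, occurrences=2, observation=2,
)
```

Under these assumptions, a single spike does not raise an alert: the level changes only after the second consecutive measurement over a threshold, and a larger observation value smooths short bursts further.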


Monitoring Exadata with Active Requests

Active Requests
  id
  name
  parentID
  ioType: file initialization, read, write, predicate pushing, filtered backup read,
          predicate push read
  ioReason
  ioGridDisk
  ioBytes
  ioOffset
  requestState
  objectNumber
  tablespaceNumber
  fileType
  asmDiskGroupNumber
  asmFileNumber
  asmFileIncarnation
  dbName
  instanceNumber
  consumerGroupName
  sessionID
  sessionSerNumber
  sqlID

LIST ACTIVEREQUEST WHERE IoType = 'predicate pushing' DETAIL


Monitoring Exadata with Active Requests


An active request provides a client-centric or application-centric view of client I/O requests that
are currently being processed by a cell.
The slide shows the most important attributes of an active request. You can see that an active
request is characterized at all levels: instance, database, ASM, and cell. Most of the attributes
have self-explanatory names. Here is a brief explanation of some of the attributes:
ioReason is the reason for the I/O activity, such as a control-file read.
ioType identifies the type of active request. Possible values are listed in the slide.
requestState identifies the state of the active request. Possible values include:
- Accessing Disk
- Computing Result
- Network Receive
- Network Send
- Queued Extent
- Queued for Disk
- Queued for File Initialization
- Queued for Filtered Backup Read
- Queued for Network Send
- Queued for Predicate Pushing
- Queued for Read
- Queued for Write
- Queued in Resource Manager
Use the LIST ACTIVEREQUEST command to display active request details for the cell. The
syntax is very similar to other LIST commands. You can specify which attributes to display or
you can display them all using the DETAIL clause. You can also filter the output using a WHERE
clause.

Monitoring SQL Execution Plans

Relevant Initialization Parameters:


CELL_OFFLOAD_PROCESSING
TRUE | FALSE
Enables or disables Smart Scan and other smart storage
capabilities
Dynamically modifiable at the session or system level using
ALTER SESSION or ALTER SYSTEM
Specifiable at the statement level using the OPT_PARAM hint

CELL OFFLOAD PLAN DISPLAY


CELL_OFFLOAD_PLAN_DISPLAY
NEVER | AUTO | ALWAYS
Allows execution plan to show offloaded predicates
Dynamically modifiable at the session or system level using
ALTER SESSION or ALTER SYSTEM

Monitoring SQL Execution Plans


The CELL_OFFLOAD_PROCESSING initialization parameter enables SQL processing offload to
Exadata. The default value of the parameter is TRUE, which means that predicate evaluation can
be offloaded to Exadata. If it is set to FALSE, the database performs all the predicate evaluation,
with cells serving blocks like traditional storage. To enable offloading for a particular SQL
statement, use the OPT_PARAM hint as shown in the following example:
SELECT /*+ OPT_PARAM('cell_offload_processing' 'true') */ ...
The CELL_OFFLOAD_PLAN_DISPLAY initialization parameter determines whether the SQL
EXPLAIN PLAN statement displays the predicates that can be evaluated by Exadata as
STORAGE predicates for a given SQL statement. The possible values are:
AUTO instructs the SQL EXPLAIN PLAN statement to display the predicates that can be
evaluated as STORAGE only if a cell is present and if a table is on the cell.
ALWAYS produces changes to the SQL EXPLAIN PLAN statement whether or not Exadata
is present or the table is on the cell. You can use this setting to identify statements that are
candidates for offloading before migrating to Exadata.
NEVER produces no changes to the SQL EXPLAIN PLAN statement due to Exadata. This
may be desirable, for example, if you wrote tools that process execution plan output and
these tools have not been updated to deal with new syntax or when comparing plans for
two systems: one with Exadata and one without.

Smart Scan Execution Plan Example


SQL> alter session set CELL_OFFLOAD_PROCESSING = TRUE;
Session altered.
SQL> alter session set CELL_OFFLOAD_PLAN_DISPLAY = ALWAYS;
Session altered.
SQL> explain plan for select * from customers where c_customer_sk < 10;
Explained.
SQL> select * from table(dbms_xplan.display);
------------------------------------------------------------------------------
| Id | Operation                     | Name     | Rows | Bytes | Cost (%CPU)|
------------------------------------------------------------------------------
|  0 | SELECT STATEMENT              |          |    1 |   196 |   326   (1)|
|  1 |  PX COORDINATOR               |          |      |       |            |
|  2 |   PX SEND QC (RANDOM)         | :TQ10000 |    1 |   196 |   326   (1)|
|  3 |    PX BLOCK ITERATOR          |          |    1 |   196 |   326   (1)|
|* 4 |     TABLE ACCESS STORAGE FULL | CUSTOMER |    1 |   196 |   326   (1)|
------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
   4 - storage("C_CUSTOMER_SK"<10)
       filter("C_CUSTOMER_SK"<10)

Smart Scan is enabled by default for direct read operations. So Exadata uses only direct reads,
not the buffer cache, to process queries that can be offloaded.
Exadata optimization is a run-time decision, and it is not integrated with the Oracle optimizer. So
offloading is possible only for full scans and is available only with segments stored on disk
groups that are completely stored on Exadata.
If Exadata is not sure that a block is current, it transfers the read of that block to the traditional
buffer cache/read consistency path. So if you run updates at the same time as queries, you will
benefit less from Smart Scan than if you were executing a read-only workload. This is also true
for indirect rows.
The slide shows an example of SQL processing offload manifested in a query plan. The first
command enables offloading for the session. The second command enables storage predicates
to be shown in the SQL execution plans of the session, even if Exadata is not present.
At the bottom of the plan output, you can see the STORAGE operation indicating the predicate
being offloaded to Exadata ("C_CUSTOMER_SK"<10).
Note: Smart Scan is also available for index fast full scans, not just table scans. Also, you
cannot see column projection in a query plan.

Predicate Offloading Considerations


Predicate evaluation is not offloaded when:
CELL_OFFLOAD_PROCESSING is set to FALSE
The table or partition being scanned is small
The optimizer decides not to use direct path reads
A scan is performed on a clustered table
The table has row dependencies enabled or the rowscn is being
fetched
The optimizer wants the scan to return rows in ROWID order
The command is CREATE INDEX using NOSORT
A LOB or LONG column is being selected or queried
A scan is performed on a flashback table
The data is encrypted and cell-based decryption is disabled
The tablespace is not completely stored on Exadata
More than 255 columns are used in a query
The predicate evaluation is on a virtual column
The slide lists the cases where predicate evaluation is not offloaded to Exadata. The following
provides additional information for some of these cases:
The optimizer decides not to use direct path reads: Direct path reads are mostly used by
parallel operations. Serial operations can do direct reads too, depending on factors such
as the table size and the state of the buffer cache. Direct path reads can also be forced
for serial access by setting _serial_direct_read to TRUE.

The data is encrypted and cell-based decryption is disabled: In order for Exadata to
perform decryption, Oracle Database needs to send the decryption key to Exadata. If
there are security concerns about keys being shipped across the storage network, you
can disable cell-based decryption by setting the CELL_OFFLOAD_DECRYPTION
parameter to FALSE.
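The two settings mentioned above could be applied as in the following minimal sketch. It assumes that the hidden parameter must be double-quoted because of its leading underscore and that CELL_OFFLOAD_DECRYPTION can be changed with ALTER SYSTEM on your release; hidden parameters are normally changed only under guidance from Oracle Support Services.

```sql
-- Force direct path reads for serial scans in this session (hidden parameter,
-- quoted because the name starts with an underscore).
ALTER SESSION SET "_serial_direct_read" = TRUE;

-- Keep decryption keys off the storage network by disabling cell-based decryption.
-- Encrypted data is then decrypted by the database, so those scans are not offloaded.
ALTER SYSTEM SET cell_offload_decryption = FALSE;
```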

Monitoring Exadata from Your Database


V$CELL (populated from cellip.ora)
- CELL_PATH
- CELL_HASHVAL

V$SYSSTAT
- NAME
- VALUE
...
Statistics shown in the slide:
- cell flash cache read hits
- cell physical IO bytes eligible for predicate offload
- cell physical IO interconnect bytes returned by smart scan
- cell physical IO interconnect bytes
- cell physical IO bytes saved by storage index
- cell physical IO bytes saved during optimized file creation
- cell physical IO bytes saved during optimized RMAN file restore
- cell IO uncompressed bytes
- physical read total bytes
- physical write total bytes

V$SQL
- SQL_TEXT
- PHYSICAL_READ_BYTES
- PHYSICAL_WRITE_BYTES
- IO_INTERCONNECT_BYTES
- IO_CELL_OFFLOAD_ELIGIBLE_BYTES
- IO_CELL_UNCOMPRESSED_BYTES
- IO_CELL_OFFLOAD_RETURNED_BYTES
- OPTIMIZED_PHY_READ_REQUESTS
...

V$BACKUP_DATAFILE
- DATAFILE_BLOCKS
- BLOCKS_READ
- BLOCKS_SKIPPED_IN_CELL
...

You can use the following V$ views and corresponding statistics to monitor Exadata from a
database instance:
V$CELL provides the cell IP address extracted from the cellip.ora file. It also
contains a numeric hash value for the cell which is used as an identifier for the cell in
other views, such as V$SESSION_WAIT and V$ACTIVE_SESSION_HISTORY.
V$BACKUP_DATAFILE contains statistics relevant to Exadata during RMAN incremental
backups. The BLOCKS_SKIPPED_IN_CELL column indicates the number of blocks that
were filtered in Exadata to optimize the RMAN incremental backup.
V$SYSSTAT and V$SESSTAT contain key statistics that can be used to compute
Exadata effectiveness at both the system and session level. Statistics in these views
can be used to monitor the effectiveness of Exadata Smart Flash Cache, Exadata
Hybrid Columnar Compression, SQL offloading, storage indexes, fast file creation, and
optimized incremental backups. In addition, other statistics provide the total volume of
I/O exchanged over the interconnect and the total volume of physical disk reads and
writes.
V$SQL lists statistics on shared SQL areas. It contains statement-level statistics for the
volume of physical I/O (reads and writes), the volume of I/O exchanged over the
interconnect, along with information relating to the effectiveness of Exadata Smart Flash
Cache, Exadata Hybrid Columnar Compression, and SQL offloading.
Note: For more information, refer to the Oracle Exadata Storage Server Software User's
Guide.
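The views described above can be queried directly. The following is a minimal sketch that uses only statistic and column names listed in this lesson, plus SQL_ID in V$SQL and FILE# in V$BACKUP_DATAFILE; the pct_saved expression is an illustrative calculation, not an official metric.

```sql
-- System-wide Smart Scan, storage index, and flash cache statistics.
SELECT name, value
FROM   v$sysstat
WHERE  name IN ('cell physical IO bytes eligible for predicate offload',
                'cell physical IO interconnect bytes returned by smart scan',
                'cell physical IO bytes saved by storage index',
                'cell flash cache read hits',
                'physical read total bytes');

-- Statements that were eligible for offload, with the share of eligible bytes
-- that did not have to be returned to the database.
SELECT sql_id,
       io_cell_offload_eligible_bytes AS eligible_bytes,
       io_cell_offload_returned_bytes AS returned_bytes,
       ROUND(100 * (1 - io_cell_offload_returned_bytes
                        / io_cell_offload_eligible_bytes), 1) AS pct_saved
FROM   v$sql
WHERE  io_cell_offload_eligible_bytes > 0
ORDER  BY io_cell_offload_eligible_bytes DESC;

-- Blocks filtered in the cells during RMAN incremental backups.
SELECT file#, datafile_blocks, blocks_read, blocks_skipped_in_cell
FROM   v$backup_datafile
WHERE  blocks_skipped_in_cell > 0;
```

Note that pct_saved may be low or even negative for some statements, for example when compressed data is returned in uncompressed form, so treat it as a rough indicator only.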

Monitoring Exadata with Wait Events

SELECT w.event, c.cell_path, d.name, w.p3
FROM V$SESSION_WAIT w, V$EVENT_NAME e, V$ASM_DISK d, V$CELL c
WHERE e.name LIKE 'cell%' AND e.wait_class_id = w.wait_class_id
AND w.p1 = c.cell_hashval AND w.p2 = d.hash_value;

Wait Event                                         Description
cell interconnect retransmit during physical read  Database wait during retransmission for an I/O of a single-block or multiblock read
cell single block physical read                    Cell equivalent of db file sequential read
cell multiblock physical read                      Cell equivalent of db file scattered read
cell smart table scan                              Database wait for table scan to complete
cell smart index scan                              Database wait for index or IOT fast full scan
cell smart file creation                           Database wait for file creation operation
cell smart incremental backup                      Database wait for incremental backup operation
cell smart restore from backup                     Database wait during file initialization for restore
cell statistics gather                             Wait during query of V$CELL views

Oracle uses a specific set of wait events for disk I/O to Exadata that identifies the corresponding
cell and grid disk being accessed. This information is more useful for performance and
diagnostics purposes than the database file number and block number information that is
provided by wait events for conventional storage.
Information about wait events is displayed in V$ dynamic performance views, such as
V$SESSION_WAIT, V$SYSTEM_EVENT and V$SESSION_EVENT.
The slide shows an example of a query used to display the cell IP address and disk name
corresponding to cell wait events. A list of cell wait events with a brief description is also shown.
Most of the cell wait events are self-explanatory.
The cell statistics gather event is a little different. It appears when a select is done on
the V$CELL_STATE, V$CELL_THREAD_HISTORY, or V$CELL_REQUEST_TOTALS view. During
such a query, data from the cells and any wait events are shown in this wait event. Normally,
these V$CELL views are only used by Oracle Support Services.
Note: For more information about these wait events, refer to the Oracle Exadata Storage Server
Software User's Guide.
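For a cumulative view rather than the point-in-time picture from V$SESSION_WAIT, the cell wait events can also be summarized from V$SYSTEM_EVENT. This is a minimal sketch; the ordering and the choice of columns are illustrative.

```sql
-- Cumulative cell-related waits since instance startup.
-- TIME_WAITED is reported in hundredths of a second.
SELECT event, total_waits, time_waited
FROM   v$system_event
WHERE  event LIKE 'cell%'
ORDER  BY time_waited DESC;
```

The same filter can be applied to V$SESSION_EVENT to break the figures down by session.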

Monitoring Exadata with Enterprise Manager


The System Monitoring Plug-In for Exadata extends Grid Control to add support for managing
Exadata targets. By deploying the plug-in to your Grid Control environment, you gain the
following management features:
Monitor Exadata, individually or in groups, using a GUI environment.
Gather storage configuration and performance information for various Exadata-related
storage components, such as grid disks and cell disks.
Raise alerts and violations based on thresholds set for monitoring and configuration data.
Provide rich out-of-the-box metrics and reports based on the gathered data.
This plug-in supports the following versions of products:
Exadata Storage Server software release 11.2 and later.
Enterprise Manager Grid Control 10g Release 3 (10.2.0.3) and later (OMS and Agent).
The plug-in requires SSH connectivity between the celladmin user on the Exadata cells and
the Management Agent user on the computer running the Management Agent. Typically, the
agent on the OMS server is used to monitor Exadata, but another agent can be used.
Before you set up alerts in Grid Control, you must configure the Exadata cells to send SNMP
alerts to the Management Agent that is monitoring them.
Note: Refer to the Enterprise Manager System Monitoring Plug-In Installation Guide for Sun
Oracle Exadata Storage Server Release 10 (1.1.4.0.0) for more information about the plug-in.

Additional Monitoring Tools and Utilities


Facility: Integrated Lights Out Manager (ILOM)
Application: Primarily storage server hardware. Also network and operating system.
More Information: http://docs.sun.com/app/docs/coll/ilom3.0 for more information about ILOM

Facility: Standard Linux monitoring tools and utilities (vmstat, iostat, top, syslog, and so on)
Application: Primarily storage server operating system. Also network and hardware.
More Information: http://linux.oracle.com for more information about Oracle Linux.
See Linux man pages for specific utilities.

In addition to the facilities described so far in this lesson, Exadata administrators have
numerous tools and utilities that are provided to monitor Exadata hardware, operating system,
and network components. While thresholds and alerts provide the primary method for
highlighting issues and failures, there are circumstances where suboptimal performance will
not lead to an alert, but will be clearly manifested using a specific tool or utility. Also, where
administrators are already familiar with some tools, they are able to apply existing knowledge,
skills, procedures, and even code to assist in monitoring and maintaining Exadata.
The table in the slide lists some of the additional tools and utilities that are provided. You may
find more information using the resources listed in the table. Integrated Lights Out Manager
(ILOM) is introduced in more detail in the lesson entitled Monitoring and Maintaining
Database Machine.

Cell Maintenance Overview

Planned maintenance
Examples

Patch or upgrade Exadata software

Procedure overview
1. Take the corresponding ASM failure groups offline.
2. Execute your planned maintenance operation.
3. Bring the ASM failure groups back online.

Unplanned maintenance
Examples

Disk failure, Cell hardware failure, or CELLSRV process failure

Procedure overview
1. Remedy the failure.
2. Bring online or re-create the affected ASM failure groups.

Cell maintenance operations can be broadly divided into two categories: planned and
unplanned.
Planned maintenance will most likely involve patching or upgrading the Exadata software, but it
may also include replacing or upgrading an item of hardware prior to a failure. For example, you
might replace one of the storage server power supplies as a planned maintenance operation
because it is no longer functioning but its failure has not affected the operation of the cell.
If you are following the recommended pattern of maintaining ASM redundancy using failure
groups on multiple Exadata cells, then planned maintenance operations can be undertaken with
minimal disruption to normal processing. In essence, you need to:
1. Take offline the failure groups associated with the cell being maintained. You should
define a maintenance window that provides ample time for the maintenance operation.
This can be achieved by specifying a timeout clause in your ALTER DISKGROUP ...
OFFLINE statement or by setting the DISK_REPAIR_TIME disk group attribute.
2. Perform the required maintenance operation within the planned maintenance window.
3. Bring the affected ASM failure groups back online.
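The three steps above can be sketched in SQL as follows. The disk group name DATA, the failure group name CELL01, and the 8-hour window are hypothetical; the sketch assumes one failure group per cell and that the statements are run in the ASM instance. DISK_REPAIR_TIME is set here with ALTER DISKGROUP ... SET ATTRIBUTE.

```sql
-- Optionally widen the repair window for the disk group before starting.
ALTER DISKGROUP data SET ATTRIBUTE 'disk_repair_time' = '8h';

-- 1. Take the cell's failure group offline. The disks are dropped only if they
--    are still offline when the 8-hour timeout expires.
ALTER DISKGROUP data OFFLINE DISKS IN FAILGROUP cell01 DROP AFTER 8h;

-- 2. Perform the planned maintenance on the cell.

-- 3. Bring the failure group back online so that ASM can bring the disks
--    up to date with changes made while they were offline.
ALTER DISKGROUP data ONLINE DISKS IN FAILGROUP cell01;
```

If the cell hosts failure groups for more than one disk group, the OFFLINE and ONLINE statements would need to be repeated for each disk group.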

If the maintenance operation takes longer than planned, you can extend the maintenance
window before it expires. If the window has already expired, you will have to re-create the
ASM disks and failure groups.
Some failure scenarios can be automatically remedied by Exadata. For example, if the
CELLSRV process is killed, the restart server process (RS) will restart CELLSRV and in most
cases processing will continue uninterrupted.
In situations where failure cannot be automatically remedied, an unplanned maintenance
operation is required. Unplanned maintenance may be required as a result of a disk failure,
some other form of cell hardware failure, or unrecoverable cell software failure.
If you are following the recommended pattern of maintaining ASM redundancy using failure
groups on multiple Exadata cells, then the failure event will automatically cause your affected
ASM disks to be taken offline. If possible, remedy the failure. After the failure is remedied, the
affected ASM disks can either be brought back online or will need to be re-created, depending
on whether or not the failure is remedied before the amount of time specified by the
DISK_REPAIR_TIME disk group attribute elapses. If the failure cannot be repaired, the failure
groups associated with the failed cell will be dropped and should be re-created on another cell.
Note: The general procedures described in this section rely on ASM redundancy (mirroring)
across multiple Exadata cells to maintain at least one copy of data online during a planned or
unplanned maintenance operation. If you do not implement ASM redundancy across multiple
Exadata cells or you suffer simultaneous failures that affect all of your data copies, then you will
need to rebuild your database (in whole or in part) or perform a recovery operation.
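After an unplanned failure, it can help to see which ASM disks are offline and how long remains before they are dropped. The query below is a minimal sketch, run in the ASM instance; it assumes the MODE_STATUS and REPAIR_TIMER columns of V$ASM_DISK, where REPAIR_TIMER is the number of seconds remaining before an offline disk is dropped.

```sql
-- Offline ASM disks and the time left on their repair timers.
SELECT dg.name        AS diskgroup,
       d.failgroup,
       d.name         AS disk,
       d.mode_status,
       d.repair_timer AS seconds_until_drop
FROM   v$asm_disk d
       JOIN v$asm_diskgroup dg ON dg.group_number = d.group_number
WHERE  d.mode_status = 'OFFLINE'
ORDER  BY dg.name, d.failgroup, d.name;
```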

Automated Cell Maintenance Operations

Automatic addition of a replacement disk to the original disk group:
- Cell disk and grid disks are automatically re-created.
- Each grid disk is added back to the original disk group.

Automatic cell restart:
- Grid disks are brought online when a cell restarts.

Automatic firmware upgrades:
- A golden firmware copy is kept on the cell and flashed to replacement components:
  - Immediately for a disk replacement
  - During reboot for other components, such as the motherboard,
    InfiniBand HCA, disk controller, flash card, and so on

Exadata simplifies various cell maintenance operations by automating tasks that previously
required administrator intervention:
Automatic addition of a replacement disk to the original disk group: When a replacement
disk is inserted after a physical disk failure, the cell disk and grid disks are automatically
re-created, and each grid disk is automatically added back to the original ASM disk group.
The same occurs after the replacement of a flash card containing flash-based grid disks.
Automatic cell restart: Grid disks are automatically changed to online when a cell recovers
from a failure, or after a restart.
Automatic firmware upgrades: A copy of the required firmware, called the golden version,
is kept on the cell as part of the cell software distribution.
- If a disk is replaced, the new disk is automatically flashed with the golden version of
the firmware before being rebuilt as previously mentioned.
- For other components, the cell must be shut down to replace the component. After
the Exadata cell is powered on, it will apply the golden version of the firmware on the
new component and restart. For a replacement motherboard, the storage server will
shut down and the administrator will need to power on once again.
- If a firmware change is made while the cell is running, a periodic check will raise an
alert if a component does not match the golden firmware level. After the cell is
rebooted, it will update itself by reapplying the golden firmware version.

Replacing a Damaged Physical Disk


1. Determine the damaged disk.
CellCLI> LIST ALERTHISTORY WHERE ALERTMESSAGE LIKE "Logical drive lost.*" DETAIL
Logical drive lost. Lun:0_5. Status: normal. Physdisk: 20:5.
Celldisk on it: CD_05_cell01. Griddisks on it: data_CD_05_cell01.
The suggested action is: Refer to section Maintaining Physical Disks in
the User Guide.
2. Replace the physical disk (20:5 in this example).
3. Use LIST PHYSICALDISK to monitor the disk until its status returns to normal.
4. Monitor ASM to confirm the readdition of the disk.
SQL> SELECT NAME, STATE FROM V$ASM_DISK
SQL> SELECT * FROM GV$ASM_OPERATION

Replacing a physical disk due to a problem or failure is probably the most likely hardware
maintenance operation that Exadata might ever require. Assuming you are using ASM
redundancy, the procedure to replace a problem disk is quite simple.
The first step requires that you identify the problem disk. This could occur in a number of ways:
Hardware monitoring using ILOM may report a problem disk.
If a disk fails, an Exadata alert is generated. The alert includes specific instructions for
replacing the disk. If you have configured the system for alert notifications, the alert will be
sent to the designated email address or SNMP target. You can also use the LIST
ALERTHISTORY command shown in the slide to identify the failed disk.
The LIST PHYSICALDISK command may identify a disk reporting a status of warning
or critical. Even if the cell is still functioning, the problem may be a precursor to a disk
failure.
The CALIBRATE command may identify a disk delivering abnormally low throughput or
IOPS. Even if the cell is still functioning, a single bad physical disk can degrade the
performance of other good disks so you may decide to replace the identified disk. Note
that running CALIBRATE at the same time as the cell is active will impact performance.
After you have identified the problem disk, you can replace it. When you remove the disk, you
will get an alert. When you replace a physical disk, the disk must be acknowledged by the RAID
controller before it can be used. This does not take a long time, and you can use the LIST
PHYSICALDISK command to monitor the status until it returns to normal.
The grid disks and cell disks that existed on the previous disk in the slot will be automatically
re-created on the new disk. If these grid disks were part of an Oracle ASM disk group with
NORMAL or HIGH redundancy, they will be added back to the disk group and the data will be
rebalanced based on disk group redundancy and the asm_power_limit parameter.
Re-creating the ASM disk and rebalancing the data may take some time to complete. You can
monitor the progress of these operations within ASM. You can monitor the status of the disk as
reported by V$ASM_DISK.STATE until it returns to NORMAL. You can also monitor the rebalance
progress using GV$ASM_OPERATION.
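The monitoring described above might look like the following minimal sketch, run in an ASM instance. The column list for GV$ASM_OPERATION is a selection chosen for illustration.

```sql
-- Disk state as seen by ASM; the replaced disk should eventually show NORMAL.
SELECT name, failgroup, state, mode_status
FROM   v$asm_disk
ORDER  BY failgroup, name;

-- Rebalance progress across all ASM instances in the cluster.
-- No rows returned means that no rebalance is currently running.
SELECT inst_id, group_number, operation, state, power,
       sofar, est_work, est_minutes
FROM   gv$asm_operation;
```

Re-running the second query periodically shows SOFAR approaching EST_WORK as the rebalance proceeds.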
Review the following considerations when replacing a failed disk:
If the repair timer (specified by the DISK_REPAIR_TIME disk group attribute) has
not expired, the ASM disk could be offline (not dropped) and the disk group is yet to be
rebalanced. In this case, the prompt replacement of the failed disk can avoid a needless
rebalance operation.
The disk could be dropped by Oracle Automatic Storage Management (Oracle ASM), and
the rebalance operation may have been successfully run. Check the Oracle ASM alert logs
to confirm this. After the failed disk is replaced, a second rebalance will be required.
The disk could be dropped, and the rebalance operation is currently running. Check the
GV$ASM_OPERATION view to determine if the rebalance operation is still running. In this
case the rebalance operation following the disk replacement will be queued.
The disk could be dropped by ASM, and the rebalance operation failed. Check
GV$ASM_OPERATION.ERROR to determine why the rebalance operation failed. Monitor
the rebalance operation following the disk replacement to ensure it runs.
Rebalance operations from multiple disk groups can be done on different Oracle ASM
instances in the same cluster if the physical disk being replaced contains grid disks from
multiple disk groups. Multiple rebalance operations cannot be run in parallel on just one
Oracle ASM instance. The operations will be queued for the instance.

Replacing a Damaged Flash Card


1. Determine the damaged flash card.
CellCLI> LIST PHYSICALDISK DETAIL
name: [9:0:2:0]
diskType: FlashDisk
...
slotNumber: "PCI Slot: 1; FDOM: 2"
status: critical
2. Power down the cell.
3. Replace the flash card.
4. Power up the cell.
5. If the card contained a flash-based grid disk, monitor ASM to confirm the readdition of the disk.
SQL> SELECT NAME, STATE FROM V$ASM_DISK
SQL> SELECT * FROM GV$ASM_OPERATION

Each Exadata server is equipped with 4 PCI flash memory cards. Each card has 4 flash
modules (FDOMs) for a total of 16 flash modules on each cell.
Identifying a damaged flash module is similar to identifying a damaged physical disk. Hardware
monitoring using ILOM or a drop in performance indicated by the CALIBRATE command may
indicate a problem. If a failed FDOM is detected, an alert is generated. The alert message
indicates whether any flash-based grid disks were on the flash module.
As shown in the slide, a damaged flash module can also be reported using the LIST
PHYSICALDISK DETAIL command. The slotNumber attribute shows the PCI slot and the
FDOM number. In this example, the status attribute indicates a critical fault.
If there were no grid disks on the flash module, the flash module was probably being used for
Exadata Smart Flash Cache. In this mode, the bad flash module results in a decreased amount
of flash memory on the cell. The performance of the cell is affected in proportion to the amount
of flash memory lost, but the database and applications are not at risk of failure.
Although technically the PCI slots in an Exadata server are hot-replaceable, it is recommended to
power down the cell while servicing a damaged flash card. After replacing the card and
powering up the cell, no additional steps are required to re-create any flash-based grid disks.
Optionally, you can monitor ASM to confirm the readdition of a flash-based grid disk.

Moving All Disks from One Cell to Another


1. Make the grid disks inactive:
CellCLI> ALTER GRIDDISK ALL INACTIVE
2. Back up the operating system configuration files that will change when the new cell is booted.
3. Move the disks from the original cell to the new cell.
- Ensure the system disks occupy the first two slots.
4. Boot the new cell.
5. Restart Exadata cell services:
CellCLI> ALTER CELL RESTART SERVICES ALL
6. Import the cell disks:
CellCLI> IMPORT CELLDISK ALL
You may need to move all drives from one Exadata server to another Exadata server. This may
be necessary when there is a chassis-level component failure, or when troubleshooting a
hardware problem. To move the drives, perform the following steps:
1. If possible, use the ALTER GRIDDISK ALL INACTIVE command to make the grid disks
inactive.
2. If possible, back up /etc/hosts, /etc/modprobe.conf, and the files in
/etc/sysconfig/network. This is a precautionary step that retains the settings
associated with your original Exadata server in case you plan to move the disks
back to the original Exadata server in the future.
3. Move the disks from the original Exadata cell to the new Exadata cell.
Caution: Ensure the first two disks, which are the system disks, are in the same first two
slots. Failure to do so will cause the Exadata cell to not function properly.
4. Start the cell. The cell operating system will be automatically reconfigured to suit the new
server hardware.
5. Restart the cell services using ALTER CELL RESTART SERVICES ALL.
6. Import the cell disks using IMPORT CELLDISK ALL.
If you are using ASM redundancy and the procedure is completed before the amount of time
specified by the DISK_REPAIR_TIME disk group attribute elapses, then the ASM disks will be
automatically brought back online and updated with any changes made during the cell outage.

Using the Exadata Software Rescue Procedure

Every Exadata server is equipped with a CELLBOOT USB flash drive to facilitate cell rescue
Cell rescue is required in the unlikely event that both system disks fail simultaneously
- Use with extreme caution

To perform cell rescue:
1. Connect to Exadata using the console
2. Boot the cell, and as soon as you see the "Oracle Exadata" splash screen, press any key
on the keyboard
3. In the displayed list of boot options, select the last option,
CELL_USB_BOOT_CELLBOOT_usb_in_rescue_mode, and press Enter
4. Select the rescue option, and proceed with the rescue
5. Reconfigure the cell
Exadata maintains mirrored system areas on separate physical disks. If one system area
becomes corrupt or unavailable, Exadata can use the mirrored copy to recover.
In the rare event that both system disks fail simultaneously, you must use the rescue
functionality provided on the CELLBOOT USB flash drive that is built into every Exadata
server.
It is important to note the following when using the rescue procedure:
Use extreme caution when using this procedure, and pay attention to the prompts. The
rescue procedure can potentially rewrite some or all of the disks in the cell. If this
happens, then you can irrevocably lose the contents of those disks. Ideally, you should
use the rescue procedure only with assistance from Oracle Support Services.
The rescue procedure does not destroy the contents of the data disks or the contents of
the data partitions on the system disks unless you explicitly choose to do so during the
rescue procedure.
The rescue procedure restores the Exadata software to the same release. This includes
any patches that existed on the cell as of the last successful boot.

The following are not restored using the rescue procedure:
- The crash kernel support rpms kernel-debuginfo-common and kernel-debuginfo.
You will need to reinstall them. These cannot be restored due to
space limitations on the CELLBOOT USB flash drive.
- Some cell configuration details, such as alert configurations, SMTP information,
and administrator e-mail address. Note that the cell network configuration is
restored, along with SSH identities for the cell, and the root, celladmin and
cellmonitor users.
- ILOM configurations. Typically, ILOM configurations remain undamaged even in
case of Exadata software failures.
The rescue procedure does not examine or reconstruct data disks or data partitions
on the system disks. If you have data corruption on the grid disks, then do not use the
rescue procedure. Instead use the database backup and recovery procedures.

The following rescue options are available for the rescue procedure:
Partial reconstruction recovery: During partial reconstruction recovery, the rescue
process re-creates partitions on the system disks and checks the disks for the
existence of a file system. If a file system is discovered, then the process attempts to
boot. If the cell boots successfully, then you use the CellCLI commands, such as
LIST CELL DETAIL, to verify the cell is usable. You must also recover any data
disks, as appropriate. If the boot fails, then you must use the full original build
recovery option.
Full original build recovery: This option rewrites the system area of the system disks to
restore the Exadata software. It also allows you to erase any data on the data disks,
and data partitions on the system disks.
Re-creation of the CELLBOOT USB flash drive: This option is used to make a copy of
the CELLBOOT USB flash drive.
To perform a rescue using the CELLBOOT USB flash drive:
1. Connect to Exadata using the console.
2. Boot the cell, and as soon as you see the "Oracle Exadata" splash screen, press any
key on the keyboard. The splash screen remains visible for only 5 seconds.
3. In the displayed list of boot options, scroll down to the last option,
CELL_USB_BOOT_CELLBOOT_usb_in_rescue_mode, and press Enter.
4. Select the rescue option, and proceed with the rescue.
5. After a successful rescue, you must reconfigure the cell to return it to the pre-failure
configuration, and reinstall the kernel-debuginfo and kernel-debuginfo-common
rpms to use crash kernel support. If you chose to preserve the data when
prompted by the rescue procedure, then import the cell disks. If you chose not to
preserve the data, then you should create new cell disks and grid disks.

Quiz
You can define thresholds for all Exadata metrics.
1. TRUE
2. FALSE

Copyright 2010, Oracle and/or its affiliates. All rights reserved.

Answer: 2
Thresholds are supported on cell disk and grid disk I/O error count metrics (CD_IO_ERRS_MIN
and GD_IO_ERRS_MIN), along with the cell memory utilization (CL_MEMUT) and cell file system
utilization (CL_FSUT) metrics. In addition, you can set thresholds for I/O Resource Management
(IORM) related metrics. The CellCLI LIST ALERTDEFINITION command lists the metrics for
which thresholds can be set.

Exadata and Database Machine Administration Workshop 5 - 30

Quiz
You enable SQL processing offload using the
CELL_OFFLOAD_PLAN_DISPLAY initialization parameter.
1. TRUE
2. FALSE


Answer: 2
The CELL_OFFLOAD_PROCESSING parameter is used to enable SQL processing offload.

Exadata and Database Machine Administration Workshop 5 - 31

Summary
In this lesson, you should have learned how to:
Describe the various performance monitoring facilities
available for Exadata
Monitor Exadata from directly within a cell, from a
database instance, and through Enterprise Manager
Interpret SQL execution plans that use offloading
Outline probable maintenance scenarios


Exadata and Database Machine Administration Workshop 5 - 32

Additional Resources

Lesson Demonstrations (Viewlets)

Monitoring Exadata Using Metrics, Alerts and Active Requests
http://stcurriculum.oracle.com/demos/db/11g/r2/dbmach/051ExadataMetricsAlerts/051exadatametricsalerts_viewlet_swf.html

Monitoring Exadata From Within Oracle Database
http://stcurriculum.oracle.com/demos/db/11g/r2/dbmach/052ExadataDBMonitoring/052exadatadbmonitoring_viewlet_swf.html

Exadata High Availability
http://stcurriculum.oracle.com/demos/db/11g/r2/dbmach/053ExadataHighAvailability/053exadatahighavailability_viewlet_swf.html


Exadata and Database Machine Administration Workshop 5 - 33

Practice 5 Overview:
Monitoring Exadata
In these practices, you will monitor Exadata using metrics,
alerts and active requests. You will also monitor Exadata
statistics using dynamic performance views (V$ views) in your
database. Finally, you will exercise Exadata high availability by
examining the effect of a cell crash.


Exadata and Database Machine Administration Workshop 5 - 34

Exadata and I/O Resource Management


Objectives
After completing this lesson, you should be able to use Exadata
I/O Resource Management to manage workloads within a
database and across multiple databases.


Exadata and Database Machine Administration Workshop 6 - 2

I/O Resource Management Overview

Traditional benefits of shared storage:
  Lower administration costs
  More efficient use of storage
Common challenge for shared storage:
  Workloads interfere with each other. For example:
    Large queries impact each other
    Data loads impact warehouse queries
    Batch workloads interfere with OLTP performance
Exadata I/O Resource Management allows you to govern I/O resource usage among different:
  User types
  Workload types
  Applications
  Databases

I/O Resource Management Overview


Storage is often shared by different workloads on multiple databases. Shared storage provides
some important benefits:
When a storage system is dedicated to a single database, the administrator must size the
storage system based on the database's peak anticipated load and size. The correct
balance of storage resources is seldom achieved because real-world workloads are very
dynamic. This leads to unused I/O bandwidth and space on some systems, whereas
others suffer from insufficient bandwidth and space. Sharing facilitates more efficient usage
of storage space and I/O bandwidth.
Sharing lowers administration costs by reducing the number of storage systems.
Shared storage, however, is not a perfect solution. Running multiple types of workloads and
databases on shared storage often leads to performance problems. For example, large parallel
queries on one production data warehouse can impact the performance of critical queries on
another production data warehouse. Also, a data load on a data warehouse can impact the
performance of critical queries also running on it. You can mitigate these problems by
overprovisioning the storage system, but this diminishes the cost savings of shared storage. You
can also avoid running noncritical tasks at peak times, but manually achieving this is laborious.
When databases have different administrators who do not coordinate their activities, the task is
even more difficult.
Exadata and Database Machine Administration Workshop 6 - 3

I/O Resource Management Overview (continued)


I/O Resource Management (IORM) allows workloads and databases to share Exadata I/O
resources automatically according to user-defined policies. To manage workloads within a
database, you can define intradatabase resource plans using the Database Resource Manager
(DBRM), which has been enhanced to work in conjunction with Exadata. To manage workloads
across multiple databases, you can define IORM plans.
For example, if a production database and a test database are sharing an Exadata cell, you can
configure resource plans that give priority to the production database. In this case, whenever the
test database load would affect the production database performance, IORM will schedule the
I/O requests such that the production database I/O performance is not impacted. This means
that the test database I/O requests are queued until they can be issued without disturbing the
production database I/O performance.

Exadata and Database Machine Administration Workshop 6 - 4

I/O Resource Management Concepts


[Slide diagram: Database A contains the Finance, HR, and Reporting consumer groups, and
Database B contains the OnlineQuery, BatchQuery, and ETL consumer groups. Consumer groups
from both databases are collected into two categories, Interactive and Batch.]

I/O Resource Management Concepts


A database often has many types of workloads. These workloads may differ in their
performance requirements and the amount of I/O that they issue. Resource consumer groups
provide a way to group sessions that comprise a particular workload. For example, if your
database is running four different applications, you can create four consumer groups, one for
each application. Alternatively, if your data warehouse has three types of workloads, such as
critical queries, normal queries, and ETL (extraction, transformation, and loading), then you can
create a consumer group for each type of workload. After you have created the consumer
groups, you must create rules that specify how sessions are mapped to consumer groups.
The database resource plan, or intradatabase resource plan, specifies how resources are
allocated among consumer groups in a database. A database may have multiple resource
plans; however, only one resource plan can be active at any point in time. This allows database
resource management to cater for different requirements associated with different time periods.
Exadata IORM extends the consumer group concept using categories. While consumer groups
represent collections of users within a database, categories represent collections of consumer
groups across all databases. The diagram in the slide shows an example of two categories
containing consumer groups across two databases. You can manage I/O resources based on
categories by creating a category plan. For example, you can give precedence to consumer
groups in the Interactive category over consumer groups in the Batch category for all the
databases sharing an Exadata cell.
Exadata and Database Machine Administration Workshop 6 - 5

I/O Resource Management Plans

[Slide diagram: I/O resource management inside one database uses an intradatabase resource
plan. I/O resource management across multiple databases uses an interdatabase resource plan
and a category resource plan, which together make up the IORM plan.]

I/O Resource Management Plans


IORM provides different approaches for managing resource allocations. Each approach can be
used independently or in conjunction with other approaches.
Database resource management enables you to manage workloads within a database.
Database resource management is configured within each database, using Database Resource
Manager to create an intradatabase resource plan. You should use this feature if you have
multiple types of workloads within a database and you need to define a policy for specifying how
these workloads share the database resource allocation. If only one database is using Exadata,
this is the only IORM feature that you need.
Interdatabase resource management is managed with an interdatabase plan. An interdatabase
plan specifies how resources are allocated among multiple databases for each cell. The
directives in an interdatabase plan specify allocations to databases, rather than consumer
groups.
Category resource management is an advanced feature. It is useful when Exadata is hosting
multiple databases and you want to allocate resources primarily by the category of the work
being done. For example, suppose all databases have three categories of workloads: OLTP,
reports, and maintenance. To allocate the I/O resources based on these workload categories,
you would use category resource management.
Note: The combination of the interdatabase plan and the category plan is called the IORM plan.
Exadata and Database Machine Administration Workshop 6 - 6

IORM Architecture
[Slide diagram: Databases A through Z send I/O requests to the cells. Each I/O request is
tagged with the database name, the request type, and the consumer group. Within CELLSRV on
the Exadata cell, each database has a set of consumer group queues (CG1, CG2, ... CGn) and
background (BG) queues. IORM selects requests from these queues and places them on the disk
queue for the cell disk. The diagram also shows the resource plans and performance statistics
associated with IORM.]

IORM Architecture
IORM manages Exadata I/O resources on a per-cell basis. Whenever the I/O requests start to
saturate the cell, IORM schedules incoming I/O requests according to the configured resource
plans. IORM schedules I/O by selecting requests from different CELLSRV queues. The resource
plans are used to determine the order in which the queued I/O requests are issued to disk. The
goal of IORM is to fully utilize the available disk resources. Any allocation that is not fully utilized
is made available to other workloads in proportion to the configured resource plans.
IORM only intervenes when needed. For example, IORM does not intervene if there is only one
active consumer group on one database because there is no possibility of contention with
another consumer group or database.
Background I/Os are scheduled based on their priority relative to the user I/Os. For example,
redo writes and control file I/Os are critical to performance and are always prioritized above all
user I/Os. Writes by the database writer process (DBWn) are scheduled at the same priority level
as user I/Os.
The diagram in the slide illustrates the high-level implementation of IORM. For each cell disk,
each database accessing the cell has one I/O queue per consumer group and three background
I/O queues. The background I/O queues correspond to high, medium, and low priority requests
with different I/O types mapped to each queue. If you do not set an intradatabase resource plan,
all nonbackground I/O requests are grouped into a single consumer group called
OTHER_GROUPS.
Note: IORM is only used to manage I/O requests to physical disks. IORM does not manage
requests to flash-based grid disks or requests serviced by Exadata Smart Flash Cache.
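
The statement that an unused allocation is made available to other workloads in proportion to the configured resource plans can be illustrated with a small Python sketch. This is a simplified model and not the CELLSRV scheduler: the function name effective_shares, the three-group plan, and the notion of an "active" list are assumptions made for illustration only.

```python
def effective_shares(plan, active):
    """Renormalize plan allocations over the consumers that currently have I/O queued.

    plan   -- {consumer name: configured allocation percentage} (hypothetical values)
    active -- names of the consumers that are issuing I/O right now
    """
    total = sum(plan[name] for name in active)
    if total == 0:
        return {}
    # Each active consumer keeps its weight relative to the other active consumers.
    return {name: plan[name] / total for name in active}


# Hypothetical plan: CG2 is idle, so its 30% is split between CG1 and CG3
# in the ratio 50:20 rather than equally.
plan = {"CG1": 50, "CG2": 30, "CG3": 20}
shares = effective_shares(plan, ["CG1", "CG3"])
for name, share in shares.items():
    print(f"{name}: {share:.1%}")
```

In this model an idle consumer group never leaves bandwidth unused, and the remaining groups keep the relative weights given in the plan.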
Exadata and Database Machine Administration Workshop 6 - 7

I/O Resource Management Plans Example


Database A (Single Inst)
Intradatabase Plan A (DBMS_RESOURCE_MANAGER):
  Consumer group 1: 15%
  Consumer group 2: 10%
  Consumer group 3: 35%
  Consumer group 4: 40%

Database B (RAC)
Intradatabase Plan B (DBMS_RESOURCE_MANAGER):
  Consumer group 5: 22%
  Consumer group 6: 18%
  Consumer group 7: 15%
  Consumer group 8: 45%

Exadata Storage Server (controlled I/O distribution to disk, using the DB A plan, the DB B
plan, and the IORM plan)
IORM Plan:
  Interdatabase Plan (CellCLI):
    Database A: 70%
    Database B: 30%
  Category Plan (CellCLI):
    INTERACTIVE: 60%
    BATCH: 40%

I/O Resource Management Plans Example


For each database, you can use DBRM to create an intradatabase resource plan. When you set
an intradatabase resource plan, a description of the plan is automatically sent to each cell. In
the example in the slide, Database A and Database B have separate intradatabase plans. Note
also that each consumer group in each intradatabase plan is associated with either the
INTERACTIVE or BATCH category.
At each cell, an interdatabase plan can be configured and enabled. In the example in the slide,
the interdatabase plan is configured with a larger resource allocation for Database A (70%) than
for Database B (30%).
Also within each cell, you can categorize consumer groups from different databases and
distribute I/O resources according to the various categories. In the example in the slide, the
INTERACTIVE category (60%) is allocated a greater resource share than the BATCH category
(40%).

Exadata and Database Machine Administration Workshop 6 - 8

I/O Resource Management Plans Example


IORM allocation hierarchy for all user I/Os (100%):

Categories           Interdatabase       Intradatabase
BATCH (40%)          Database A (70%)    CG1 15%, CG2 10%
BATCH (40%)          Database B (30%)    CG5 22%, CG6 18%
INTERACTIVE (60%)    Database A (70%)    CG3 35%, CG4 40%
INTERACTIVE (60%)    Database B (30%)    CG7 15%, CG8 45%

I/O Resource Management Plans Example (continued)


The category, interdatabase, and intradatabase plans are used together by Exadata to allocate
I/O resources.
The category plan is first used to allocate resources among the categories. When a category is
selected, the interdatabase plan is used to select a database; only databases that have
consumer groups with the selected category can be selected. Finally, the selected database's
intradatabase plan is used to select one of its consumer groups. The percentage of resource
allocation represents the probability of making a selection at each level.
Expressing this as a formula:
Pcgn = cgn / sum(catcgs) * db% * cat%
where:
Pcgn is the probability of selecting consumer group n
cgn is the resource allocation for consumer group n
sum(catcgs) is the sum of the resource allocations for all consumer groups in the same
category as consumer group n and on the same database as consumer group n
db% is the database allocation percentage in the interdatabase plan
cat% is the category allocation percentage in the category plan
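
The formula can be checked against the example plans from the preceding slides with a short Python sketch. The function name selection_probability and the dictionary layout are assumptions made for illustration; the allocation figures are the ones shown in the slides, and the category membership of each consumer group follows the example scenario.

```python
def selection_probability(cg, same_category_cgs, db_pct, cat_pct):
    """Pcgn = cgn / sum(catcgs) * db% * cat%, with percentages given as 0-100."""
    return cg / sum(same_category_cgs) * (db_pct / 100) * (cat_pct / 100)


# (database, category) -> {consumer group: intradatabase allocation in percent}
groups = {
    ("A", "BATCH"): {"CG1": 15, "CG2": 10},
    ("A", "INTERACTIVE"): {"CG3": 35, "CG4": 40},
    ("B", "BATCH"): {"CG5": 22, "CG6": 18},
    ("B", "INTERACTIVE"): {"CG7": 15, "CG8": 45},
}
db_plan = {"A": 70, "B": 30}                    # interdatabase plan
cat_plan = {"INTERACTIVE": 60, "BATCH": 40}     # category plan

probabilities = {}
for (db, cat), cgs in groups.items():
    for name, alloc in cgs.items():
        probabilities[name] = selection_probability(
            alloc, cgs.values(), db_plan[db], cat_plan[cat]
        )

for name in sorted(probabilities):
    print(f"{name}: {probabilities[name]:.1%}")
```

Because only the ratio of allocations within the same category and database enters the formula, scaling CG1 and CG2 by the same factor leaves their probabilities unchanged.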
Exadata and Database Machine Administration Workshop 6 - 9

I/O Resource Management Plans Example (continued)


The hierarchy used to distribute I/Os is illustrated in the slide. The example is continued from
the previous slide but the consumer group names are abbreviated to CG1, CG2, and so on.
Notice that although each consumer group allocation is expressed as a percentage within each
database, IORM is concerned with the ratio of consumer group allocations within each category
and database. For example, CG1 nominally receives 16.8% of I/O resources from IORM
(15/(15+10)*70%*40%); however, this does not change if the intradatabase plan allocations for
CG1 and CG2 are doubled to 30% and 20%, respectively. This is because the allocation to CG1
remains 50% greater than the allocation to CG2. This behavior also explains why CG1 (16.8%)
and CG3 (19.6%) have a similar allocation through IORM even though CG3 belongs to the
higher priority category (60% versus 40%) and has a much larger intradatabase plan allocation
(35% versus 15%).
Note: ASM I/Os (for rebalance and so on) and I/Os issued by Oracle background processes are
handled separately and automatically by Exadata. For clarity, background I/Os are not shown in
the example.

Exadata and Database Machine Administration Workshop 6 - 10

Enabling Intradatabase Resource Management

You can enable intradatabase resource management:
  Manually:
    Set the database's RESOURCE_MANAGER_PLAN parameter.
  Automatically:
    Create a job scheduler window.
    Associate a resource plan with the window.
Exadata is notified when an intradatabase resource plan is set or modified:
  Enabled or modified plan sent to each cell using iDB
You must activate the IORMPLAN on all Exadata cells.
Following are the commonly used intradatabase plans:
  mixed_workload_plan
  dss_plan
  default_maintenance_plan

Enabling Intradatabase Resource Management


An intradatabase resource plan can be manually enabled with the RESOURCE_MANAGER_PLAN
initialization parameter or automatically enabled using the job scheduler.
When you set an intradatabase resource plan on the database, a description of the plan is
automatically sent to each cell. When a new cell is added or an existing cell is restarted, the
current intradatabase plan is automatically sent to the cell. This resource plan is used to
manage resources on both the database server and cells.
Before IORM can be used, you must activate the IORMPLAN on all corresponding Exadata cells.
Oracle Database provides several predefined intradatabase plans. The most commonly used
are mixed_workload_plan, dss_plan and default_maintenance_plan.
Intradatabase plans do not contain a directive for background I/O activity. Background I/Os are
scheduled based on their priority relative to the user I/Os. For example, redo writes, and control
file reads and writes are critical to performance and are always prioritized above all user I/Os.
Note: When an Oracle RAC database uses Exadata, all instances in the Oracle RAC cluster
must be set to the same resource plan.

Exadata and Database Machine Administration Workshop 6 - 11

Intradatabase Plan Example


BEGIN
DBMS_RESOURCE_MANAGER.CREATE_SIMPLE_PLAN(SIMPLE_PLAN => 'my_plan',
CONSUMER_GROUP1 => 'high_priority', GROUP1_PERCENT => 80,
CONSUMER_GROUP2 => 'low_priority' , GROUP2_PERCENT => 20);
END;
/

ALTER SYSTEM SET RESOURCE_MANAGER_PLAN = 'my_plan';

Consumer Group    Level 1    Level 2    Level 3
SYS_GROUP         100%
HIGH_PRIORITY                80%
LOW_PRIORITY                 20%
OTHER_GROUPS                            100%

The plan is sent directly to the Exadata cells via iDB.
Percentages are used for both CPU and I/O resources.

CellCLI> ALTER IORMPLAN ACTIVE

Intradatabase Plan Example


The intradatabase I/O resource plan specifies how I/O resources are allocated among
consumer groups in a specific database.
An intradatabase I/O resource plan is created with the procedures in the
DBMS_RESOURCE_MANAGER PL/SQL package. There are no specific I/O resource parameters
or procedures. You create an intradatabase I/O resource plan exactly the same way as you
would create a CPU resource plan. When you specify an allocation percentage, this percentage
applies to both database server CPU and Exadata I/O resources if you are using Exadata.
There are no specific I/O settings because typically you are constrained by CPU or I/O, but not
both at the same time. The intradatabase I/O resource plan is applicable only when the
database uses Exadata.
The example in the slide uses the CREATE_SIMPLE_PLAN procedure to create MY_PLAN. This
resource plan is used to manage CPU resources at the database level, and I/O resources at the
Exadata cell level.
Before I/O resources for an intradatabase plan can be managed by Exadata I/O Resource
Management, you need to make sure that the IORMPLAN is active. This can be done by
executing the ALTER IORMPLAN ACTIVE command.

Exadata and Database Machine Administration Workshop 6 - 12

Enabling IORM for Multiple Databases

Enable IORM for multiple databases by configuring an IORMPLAN:
  The category plan assigns I/O resources using categories.
  The interdatabase plan assigns I/O resources using database names.
  All combinations are possible.
Use CellCLI to define and activate the IORMPLAN on each cell.
  Configure the same IORMPLAN on each cell.
  Only one IORMPLAN can be active at a time on a cell.
  IORMPLAN settings are persistent across cell reboots.
  All databases get equal allocations in the absence of an IORMPLAN.

Enabling IORM for Multiple Databases


I/O resource management for multiple databases is configured with the IORMPLAN. The
IORMPLAN specifies how I/O resources are allocated for each cell. If you are using multiple
cells, you need to configure them all. In most cases, all of your cells should use the same
IORMPLAN.
The IORMPLAN contains both an interdatabase plan, also called a DB plan, and a category plan.
The directives in the DB plan specify I/O resource allocations to database names, rather than
consumer groups. The directives in the category plan specify I/O resource allocations to
categories, rather than databases or consumer groups. The IORMPLAN is configured and
enabled with CellCLI on each cell. Only one IORMPLAN can be active on a cell at any given
time.
At startup, the IORMPLAN is an empty string, which effectively turns off IORM. In that case, all
databases receive an equal allocation.
The IORMPLAN must be activated for I/O resource management to occur. When the IORMPLAN
is deactivated, IORM will not manage I/O resources, even if an intradatabase resource plan is
set or an IORMPLAN is configured.

Exadata and Database Machine Administration Workshop 6 - 13

Interdatabase Plan Example


CellCLI> alter iormplan
>        dbplan=((name=sales_prod, level=1, allocation=80),
>                (name=finance_prod, level=1, allocation=20),
>                (name=sales_dev, level=2, allocation=100),
>                (name=sales_test, level=3, allocation=50),
>                (name=other, level=3, allocation=50)),
>        catplan=''

CellCLI> alter iormplan active

Database        Level 1    Level 2    Level 3
sales_prod      80%
finance_prod    20%
sales_dev                  100%
sales_test                            50%
other                                 50%

Interdatabase Plan Example


On each Exadata cell, an interdatabase plan specifies how resources are divided among
multiple databases. The directives in an interdatabase plan specify allocations to databases,
rather than consumer groups. The interdatabase plan is configured and activated with CellCLI,
on each cell.
The above example implements an interdatabase plan following the directives shown in the
table.
The interdatabase plan is created by specifying the DBPLAN part of the IORMPLAN. The
interdatabase plan is similar to an intradatabase plan in that each directive consists of a level
from 1 to 8 and an allocation amount in percentage terms. For a given plan, all the
allocations at any level must add up to 100 or less. An interdatabase plan differs from an
intradatabase plan in that it cannot contain subplans and it only contains I/O resource directives.
As a best practice, you should create a directive for each database using the same Exadata
cell. To make sure that any database without an explicit directive can be managed, you need to
create an allocation named OTHER.
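
The rules stated above (levels from 1 to 8, allocations at each level adding up to 100 or less, and an OTHER directive as a catch-all) can be checked before a plan is typed into CellCLI. The following Python sketch is not part of Exadata: the function validate_dbplan and the list-of-dictionaries representation are assumptions made for illustration, and the sketch ignores the role attribute.

```python
from collections import defaultdict


def validate_dbplan(directives):
    """Return a list of problems found in a list of dbplan directive dicts."""
    problems = []
    totals = defaultdict(int)
    for d in directives:
        if not 1 <= d["level"] <= 8:
            problems.append(f"{d['name']}: level {d['level']} is outside 1-8")
        totals[d["level"]] += d["allocation"]
    for level, total in sorted(totals.items()):
        if total > 100:
            problems.append(f"level {level}: allocations add up to {total}, more than 100")
    if not any(d["name"].lower() == "other" for d in directives):
        problems.append("no OTHER directive for databases without an explicit directive")
    return problems


# The interdatabase plan from the slide, expressed as data.
slide_plan = [
    {"name": "sales_prod", "level": 1, "allocation": 80},
    {"name": "finance_prod", "level": 1, "allocation": 20},
    {"name": "sales_dev", "level": 2, "allocation": 100},
    {"name": "sales_test", "level": 3, "allocation": 50},
    {"name": "other", "level": 3, "allocation": 50},
]
print(validate_dbplan(slide_plan))

# A deliberately broken plan: level 1 is oversubscribed and OTHER is missing.
broken_plan = [
    {"name": "sales_prod", "level": 1, "allocation": 80},
    {"name": "finance_prod", "level": 1, "allocation": 30},
]
print(validate_dbplan(broken_plan))
```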

Exadata and Database Machine Administration Workshop 6 - 14

Interdatabase Plan Example (continued)


The role attribute indicates that the directive is applied only when the databases are in that
database role. This provides the flexibility to automatically adjust the IORM plan according to
the role of the database in an Oracle Data Guard environment. If the role attribute is not
specified, the directive is applied regardless of the database role. Following is an example of an
interdatabase plan using the role attribute:
ALTER IORMPLAN dbplan=(
(name=sales1, level=1, allocation=30, role=primary),
(name=sales2, level=1, allocation=35, role=primary),
(name=sales1, level=2, allocation=20, role=standby),
(name=sales2, level=2, allocation=25, role=standby),
(name=other, level=3, allocation=50))
You can remove an interdatabase plan using:
ALTER IORMPLAN dbplan=''
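
The effect of the role attribute described above can be sketched in Python as a simple filter. This only models the rule as stated in the notes (a directive with a role applies only when the database is in that role, and a directive without a role always applies); the function applicable_directives and the data layout are assumptions for illustration and do not represent how the cell evaluates the plan internally.

```python
def applicable_directives(directives, db_name, db_role):
    """Return the directives that apply to db_name while it is in db_role."""
    return [
        d for d in directives
        if d["name"] == db_name and d.get("role") in (None, db_role)
    ]


# The role-based interdatabase plan from the example, expressed as data.
role_plan = [
    {"name": "sales1", "level": 1, "allocation": 30, "role": "primary"},
    {"name": "sales2", "level": 1, "allocation": 35, "role": "primary"},
    {"name": "sales1", "level": 2, "allocation": 20, "role": "standby"},
    {"name": "sales2", "level": 2, "allocation": 25, "role": "standby"},
    {"name": "other", "level": 3, "allocation": 50},
]

as_primary = applicable_directives(role_plan, "sales1", "primary")
as_standby = applicable_directives(role_plan, "sales1", "standby")
print(as_primary)
print(as_standby)
```

After a Data Guard role change, the same plan therefore yields a different directive for sales1 without any change to the IORMPLAN.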

Exadata and Database Machine Administration Workshop 6 - 15

Category Plan Example


DBA_RSRC_CONSUMER_GROUPS

CONSUMER_GROUP               CATEGORY
---------------------------- --------------
SYS_GROUP                    ADMINISTRATIVE
BATCH_GROUP                  BATCH
INTERACTIVE_GROUP            INTERACTIVE
ORA$                         MAINTENANCE
OTHER_GROUPS                 OTHER
DEFAULT_CONSUMER_GROUP       OTHER
LOW_GROUP                    OTHER
AUTO_TASK_CONSUMER_GROUP     OTHER

DBMS_RESOURCE_MANAGER.CREATE_CATEGORY

Category       Level 1    Level 2    Level 3
Interactive    90%
Batch                     80%
Maintenance                          50%
Other                                50%

CellCLI> alter iormplan
>        dbplan='',
>        catplan=(
>          (name=interactive, level=1, allocation=90),
>          (name=batch, level=2, allocation=80),
>          (name=maintenance, level=3, allocation=50),
>          (name=other, level=3, allocation=50)
>        )

CellCLI> alter iormplan active

Category Plan Example


Database Resource Manager enables you to specify a category for every consumer group. The
predefined categories and their associated consumer groups are listed in the slide. This is the
default situation after database creation. If you decide to use these default categories, you
should map all administrative consumer groups in all databases to the ADMINISTRATIVE
category. All high-priority user activity, such as consumer groups for important online
transaction processing (OLTP) transactions and time-critical reports, should be mapped to the
INTERACTIVE category. All low-priority user activity, such as reports, maintenance, and
low-priority OLTP transactions, should be mapped to the BATCH, MAINTENANCE, and OTHER
categories.
You can create your own categories using the CREATE_CATEGORY procedure in the
DBMS_RESOURCE_MANAGER package, and then assign your category to a consumer group
using the CREATE_CONSUMER_GROUP or UPDATE_CONSUMER_GROUP procedures.
You can then manage I/O resources based on categories by creating a category plan. The
example shown in the slide implements a category plan based on the allocations described in
the table. With this plan, consumer groups associated with the INTERACTIVE category get up
to 90 percent of I/O resources. Eighty percent of the remainder, including any unutilized allocation
from the INTERACTIVE category, is allocated to the BATCH category. The MAINTENANCE and
OTHER categories share the remainder.
Any consumer group without an explicitly specified category defaults to the OTHER category.
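
The way allocations cascade from one level to the next can be worked through numerically. The Python sketch below assumes that every category is busy enough to use its full allocation, which is the simplest case; the function name cascade and the list-of-levels representation are assumptions made for illustration, not an Exadata interface.

```python
def cascade(levels):
    """Compute each name's share of total I/O for a multi-level plan.

    levels -- list of {name: percent} dicts, highest priority level first.
    Each level's percentages apply to whatever the higher levels left over.
    Assumes every name consumes its full allocation.
    """
    remaining = 1.0
    shares = {}
    for level in levels:
        used = 0.0
        for name, pct in level.items():
            shares[name] = remaining * pct / 100
            used += shares[name]
        remaining -= used
    return shares


# The category plan from the slide.
shares = cascade([
    {"interactive": 90},
    {"batch": 80},
    {"maintenance": 50, "other": 50},
])
for name, share in shares.items():
    print(f"{name}: {share:.1%}")
```

Under this assumption, INTERACTIVE receives 90 percent, BATCH receives 8 percent (80 percent of the remaining 10 percent), and MAINTENANCE and OTHER receive 1 percent each. When a higher level uses less than its allocation, the lower levels receive correspondingly more.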
Exadata and Database Machine Administration Workshop 6 - 16

Complete Example
Database A
BEGIN
-- Create the plan and its four consumer groups with their allocations.
DBMS_RESOURCE_MANAGER.CREATE_SIMPLE_PLAN(SIMPLE_PLAN => 'DB_A_Plan',
CONSUMER_GROUP1 => 'CG1', GROUP1_PERCENT => 15,
CONSUMER_GROUP2 => 'CG2', GROUP2_PERCENT => 10,
CONSUMER_GROUP3 => 'CG3', GROUP3_PERCENT => 35,
CONSUMER_GROUP4 => 'CG4', GROUP4_PERCENT => 40);
-- Assign each consumer group to a predefined category.
DBMS_RESOURCE_MANAGER.CREATE_PENDING_AREA();
DBMS_RESOURCE_MANAGER.UPDATE_CONSUMER_GROUP(CONSUMER_GROUP => 'CG1',
NEW_CATEGORY => 'BATCH');
DBMS_RESOURCE_MANAGER.UPDATE_CONSUMER_GROUP(CONSUMER_GROUP => 'CG2',
NEW_CATEGORY => 'BATCH');
DBMS_RESOURCE_MANAGER.UPDATE_CONSUMER_GROUP(CONSUMER_GROUP => 'CG3',
NEW_CATEGORY => 'INTERACTIVE');
DBMS_RESOURCE_MANAGER.UPDATE_CONSUMER_GROUP(CONSUMER_GROUP => 'CG4',
NEW_CATEGORY => 'INTERACTIVE');
DBMS_RESOURCE_MANAGER.SUBMIT_PENDING_AREA();
END;
/
-- Enable the plan; its description is then sent to the cells.
ALTER SYSTEM SET RESOURCE_MANAGER_PLAN = 'DB_A_Plan';


Complete Example
This slide is the first in a series of three slides that provide a more complete example showing
the use of the different IORM plan types at the same time. The example is based on the
scenario introduced on pages 8, 9 and 10 of this lesson.
On this slide, the commands required to configure DBRM on Database A are shown.
Note that the example does not show the creation of any categories using
DBMS_RESOURCE_MANAGER.CREATE_CATEGORY because the categories used in the
scenario (BATCH and INTERACTIVE) are categories that are predefined inside Oracle
Database by default.

Exadata and Database Machine Administration Workshop 6 - 17

Complete Example
Database B
BEGIN
-- Create the plan and its four consumer groups with their allocations.
DBMS_RESOURCE_MANAGER.CREATE_SIMPLE_PLAN(SIMPLE_PLAN => 'DB_B_Plan',
CONSUMER_GROUP1 => 'CG5', GROUP1_PERCENT => 22,
CONSUMER_GROUP2 => 'CG6', GROUP2_PERCENT => 18,
CONSUMER_GROUP3 => 'CG7', GROUP3_PERCENT => 15,
CONSUMER_GROUP4 => 'CG8', GROUP4_PERCENT => 45);
-- Assign each consumer group to a predefined category.
DBMS_RESOURCE_MANAGER.CREATE_PENDING_AREA();
DBMS_RESOURCE_MANAGER.UPDATE_CONSUMER_GROUP(CONSUMER_GROUP => 'CG5',
NEW_CATEGORY => 'BATCH');
DBMS_RESOURCE_MANAGER.UPDATE_CONSUMER_GROUP(CONSUMER_GROUP => 'CG6',
NEW_CATEGORY => 'BATCH');
DBMS_RESOURCE_MANAGER.UPDATE_CONSUMER_GROUP(CONSUMER_GROUP => 'CG7',
NEW_CATEGORY => 'INTERACTIVE');
DBMS_RESOURCE_MANAGER.UPDATE_CONSUMER_GROUP(CONSUMER_GROUP => 'CG8',
NEW_CATEGORY => 'INTERACTIVE');
DBMS_RESOURCE_MANAGER.SUBMIT_PENDING_AREA();
END;
/
-- Enable the plan; its description is then sent to the cells.
ALTER SYSTEM SET RESOURCE_MANAGER_PLAN = 'DB_B_Plan';


Complete Example (continued)


On this slide, the commands required to configure DBRM on Database B are shown. These
commands are essentially the same as for Database A except for the different consumer
group names and resource allocation percentages.

Exadata and Database Machine Administration Workshop 6 - 18

Complete Example
Exadata Cells
CellCLI> alter iormplan
>        dbplan=((name=Database_A, level=1, allocation=70),
>                (name=Database_B, level=1, allocation=30)),
>        catplan=((name=INTERACTIVE, level=1, allocation=60),
>                 (name=BATCH, level=1, allocation=40))

CellCLI> alter iormplan active

Complete Example (continued)


This slide shows the commands required to configure IORM on the Exadata cells. Exadata
uses the IORM plan in conjunction with the DBRM plans propagated by the databases to
allocate I/O resources.

Exadata and Database Machine Administration Workshop 6 - 19

Using Database I/Os Metrics

You can monitor IORM to understand resource consumption and make required adjustments.
  There are separate metrics for small (128 KB or less) and large I/Os.
  Which database has the heaviest load?
    Look for highest DB_IO_RQ_SM + DB_IO_RQ_LG values.
  Which database was throttled the most?
    Look for highest DB_IO_WT_SM + DB_IO_WT_LG values.

Name                 Description
DB_IO_RQ_SM          Total number of I/O requests issued by the database since
DB_IO_RQ_LG          any resource plan was set
DB_IO_RQ_SM_SEC      I/O requests per second issued by the database in the past
DB_IO_RQ_LG_SEC      minute
DB_IO_WT_SM          Total number of seconds that I/O requests issued by the
DB_IO_WT_LG          database waited to be scheduled

Using Database I/Os Metrics


Exadata provides three groups of I/O metrics that correspond to the three types of IORM plans:
category metrics, database metrics, and consumer group metrics. I/O metrics allow you to
understand your I/O consumption and make adjustments to optimize performance and resource
utilization.
For each I/O metric, a distinction is made between small I/Os, typically associated with OLTP
applications, and large I/Os, which are usually indicative of DSS workloads. I/O metric names
include _SM or _LG to identify small or large I/Os, respectively.
For database metrics the objectType attribute is set to IORM_DATABASE. The table in the
slide gives you a quick description of some important database I/O metrics. A separate set of
metric observations is available for each database specified in the IORM plan. Metric
observations for different databases are differentiated by the name of the database, which is set
in the metricObjectName attribute. You can compare metrics between databases to
determine which one has the heaviest load or which one was throttled most as illustrated in the
slide. A special metricObjectName value of _OTHER_DATABASE_ is used for database I/O
metrics associated with ASM and for databases that are not explicitly mentioned in the
interdatabase IORM plan.
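The comparison described above could be carried out with commands along the following lines. This is a sketch rather than a prescribed procedure: the database name Database_A is borrowed from the earlier example and is only an assumption about the name that appears in metricObjectName on your cells.

```
CellCLI> list metriccurrent where objectType = 'IORM_DATABASE'
CellCLI> list metriccurrent where objectType = 'IORM_DATABASE' and metricObjectName = 'Database_A'
CellCLI> list metriccurrent where name like 'DB_IO_WT_.*'
```

The first command lists every current database I/O metric, the second restricts the output to one database, and the third shows the wait metrics for all databases so that the most throttled database can be identified. Each cell reports only its own observations, so the values would need to be gathered from all cells for a complete picture.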
While this slide focuses on database metrics, the same principles apply for category metrics and
consumer group metrics. For example, the CG_IO_RQ_SM_SEC metric specifies the rate of
small I/O requests issued by a consumer group per second over the past minute. A large value
indicates a heavy I/O workload from this consumer group in the past minute.
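For instance, the consumer group metric mentioned above might be examined as follows. This is a minimal sketch. The output is not shown because it depends on the consumer groups that are active on your system.

```
CellCLI> list metricdefinition where objectType = 'IORM_CONSUMER_GROUP'
CellCLI> list metriccurrent CG_IO_RQ_SM_SEC
CellCLI> list metrichistory CG_IO_RQ_SM_SEC
```

The first command lists the consumer group metrics defined on the cell, the second shows the most recent observation for each consumer group, and the third shows the observations retained in the metric history.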

Quiz
If a consumer group does not require its full resource allocation,
what happens to the leftover allocation?
1. It remains unused.
2. It is divided equally among other consumer groups.
3. It is allocated to other active consumer groups, according
to the resource plan.


Answer: 3


Quiz
Which of the following conditions are required for IORM to
intervene and control the allocation of I/O resources?
1. The IORM plan must be active.
2. More than one consumer group must be active.
3. The disks must be heavily utilized.


Answer: 1, 2, 3
All of the conditions listed in this question must be present for IORM to intervene.


Quiz
In which order are the different I/O resource plans applied to
allocate I/O resources?
1. Category, intradatabase, interdatabase
2. Interdatabase, category, intradatabase
3. Category, interdatabase, intradatabase
4. Interdatabase, intradatabase, category
5. Intradatabase, interdatabase, category


Answer: 3


Quiz
You can create categories using the CellCLI utility.
1. TRUE
2. FALSE


Answer: 2
You can create your own categories using the CREATE_CATEGORY procedure in the
DBMS_RESOURCE_MANAGER package, and then assign your category to a consumer group
using the CREATE_CONSUMER_GROUP or UPDATE_CONSUMER_GROUP procedures.
You can then manage I/O resources based on categories by creating a category plan. The
category plan can be created using the CellCLI utility.


Summary
In this lesson, you should have learned how to use Exadata I/O
Resource Management to manage workloads within a
database and across multiple databases.


Additional Resources

Lesson Demonstrations (Viewlets)


Intradatabase I/O Resource Management

http://stcurriculum.oracle.com/demos/db/11g/r2/dbmach/061ExadataIntraDBIORM/061exadataintradbiorm_viewlet_swf.html

Interdatabase I/O Resource Management

http://stcurriculum.oracle.com/demos/db/11g/r2/dbmach/062ExadataInterDBIORM/062exadatainterdbiorm_viewlet_swf.html
