Six Sigma Black Belt Book
LEAN SIX SIGMA BELT SERIES
OpenSourceSixSigma.com
Legal Notice
INDIVIDUAL COPY
This Book is an Open Source Six Sigma™ copyrighted publication and is for individual use only. This publication may not be republished, electronically or physically reproduced, distributed, changed, posted to a website, an intranet or a file sharing system, or otherwise distributed in any form or manner without advance written permission from Open Source Six Sigma. Minitab is a Registered Trademark of Minitab Inc.
FBI Anti Piracy Warning: The unauthorized reproduction or
distribution of this copyrighted work is illegal. Criminal
copyright infringement, including infringement without
monetary gain, is investigated by the FBI and is punishable by
up to 5 years in federal prison and a fine of $250,000.
For reprint permission, to request additional copies, or to request customized versions of this publication, contact Open Source Six Sigma.
Open Source Six Sigma
6200 East Thomas Road Suite 203
Scottsdale, Arizona, United States of America 85251
Toll Free: 1 800 504 4511
International: +1 480 361 9983
Email: OSSS@OpenSourceSixSigma.com
Website: www.OpenSourceSixSigma.com
Table of Contents
Page
Define Phase
Understanding Six Sigma…………………………………………..………………..….…….… 1
Six Sigma Fundamentals………………………………..…………..………………..……..…. 22
Selecting Projects……………………………………….………………………..……..……… 42
Elements of Waste……………………………………..…………...……………………………64
Wrap Up and Action Items……………………...………………………………………….……77
Define Phase Quiz……………………………..…………………………………………………83
Measure Phase
Welcome to Measure……………………………………………………………….……..….....86
Process Discovery………………………………………..………………………………………89
Six Sigma Statistics…………………………………..….………………………………….….138
Measurement System Analysis……………………….……………………………………....171
Process Capability ……………………………………...…………………………… ……….203
Wrap Up and Action Items …………………………………………………………………….224
Measure Phase Quiz………………………………………………………….………………..230
Analyze Phase
Welcome to Analyze……………………………………………………………………… .…..233
“X” Sifting………………………………….………………...……………………….……….….236
Inferential Statistics……………………………………………..……………..………….…….262
Introduction to Hypothesis Testing……………………………..……….…………………….277
Hypothesis Testing Normal Data Part 1………………………………..…….………………291
Hypothesis Testing Normal Data Part 2 …………………….………………………….……334
Hypothesis Testing Non-Normal Data Part 1………………….….…………………….……364
Hypothesis Testing Non-Normal Data Part 2……………….…………….………………….390
Wrap Up and Action Items …………………………………………..………………....……..409
Analyze Phase Quiz…………………………………………….………………………………415
Improve Phase
Welcome to Improve……………………………………..………………………………...…..418
Process Modeling Regression……………………………………………………………….421
Advanced Process Modeling………………………….……………………………………….440
Designing Experiments…………………………………………………………………………467
Experimental Methods………………………………………….………………………………482
Full Factorial Experiments…………………………………………………………………..…497
Fractional Factorial Experiments……………………………………………………….……..526
Wrap Up and Action Items…………………………..…………………………………………546
Improve Phase Quiz……………………………………………………………………………552
Control Phase
Welcome to Control………………………………………..……………………………………556
Lean Controls……………………………………………………………………………………559
Defect Controls……………………………………………………………………….…………574
Statistical Process Control…………………………….……………………………………….586
Six Sigma Control Plans………………………………..………………………………………626
Wrap Up and Action Items…………………….…………………………………………….…649
Control Phase Quiz…………………………...………………………………………..……….659
Glossary
Define Phase
Understanding Six Sigma
This course has been designed to build your knowledge and capability to improve the performance of processes and, subsequently, the performance of the business of which you are a part. The focus of the course is process centric. Your role in process performance improvement will be through the use of the methodologies of Six Sigma, Lean and Process Management. By taking this course you will gain a well-rounded and firm grasp of many of the tools of these methodologies. We firmly believe this is one of the most effective classes you will ever take, and it is our commitment to assure that this is the case.
What if you had a few professional basketball players in the room – would that widen or narrow the variation?
(Normal Distribution curve spanning -6 to +6 Standard Deviations)
The higher the sigma level, the better the performance. Six Sigma refers to a process having six Standard Deviations between the average of the process center and the closest specification limit or service level.
This pictorial depicts the percentage of data which falls between Standard Deviations within a Normal
Distribution. Those data points at the outer edge of the bell curve represent the greatest variation in our
process. They are the ones causing customer dissatisfaction and we want to eliminate them.
Key Six Sigma metrics:
– Defects
– Defects per unit (DPU)
– Parts per million (PPM)
– Defects per million opportunities (DPMO)
– Sigma (s)
(Chart: these metrics compared across processes running at 1 to 5 Sigma)
Above are some key metrics used in Six Sigma. We will discuss each in detail as we go through the
course.
Source: Journal for Quality and Participation, Strategy and Planning Analysis
The Six Sigma Methodology is made up of five stages: Define, Measure, Analyze, Improve and
Control.
Each has highly defined steps to assure a level of discipline in seeking a solution to any variation or
defect present in a process.
Customer Value (Responsiveness, Cost, Quality, Delivery) = EBIT = Management (Enabler), Product Design, Process Yield, Process Speed, System Uptime, Functional Support
Six Sigma has not created new tools. It is the use and flow of the tools that is important. How they
are applied makes all the difference.
Six Sigma is also a business strategy that provides new knowledge and capability to employees so they can better organize the process activity of the business, solve business problems and make better decisions. Using Six Sigma is now a common way to solve business problems and remove waste, resulting in significant profitability improvements. In addition to improving profitability, customer and employee satisfaction are also improved.
Six Sigma is a process measurement and management system that enables employees and companies to take a process oriented view of the entire business. Using the various concepts embedded in Six Sigma, key processes are identified, the outputs of these processes are prioritized, the capability is determined, improvements are made, if necessary, and a management structure is put in place to assure the ongoing success of the business.
People interested in truly learning Six Sigma should be mentored and supported by seasoned
Belts who truly understand how Six Sigma works.
Sweet Fruit (5+ Sigma): Design for Six Sigma
Bulk of Fruit (3–5 Sigma): Process Characterization and Optimization
Ground Fruit (1–2 Sigma): Simplify and Standardize
General Electric: First, what it is not. It is not a secret society, a slogan or a cliché. Six Sigma is a highly disciplined process that helps us focus on developing and delivering near-perfect products and services. The central idea behind Six Sigma is that if you can measure how many "defects" you have in a process, you can systematically figure out how to eliminate them and get as close to "zero defects" as possible. Six Sigma has changed the DNA of GE — it is now the way we work — in everything we do and in every product we design.
Honeywell: Six Sigma refers to our overall strategy to improve growth and productivity as well as a measurement of quality. As a strategy, Six Sigma is a way for us to achieve performance breakthroughs. It applies to every function in our company, not just those on the factory floor. That means Marketing, Finance, Product Development, Business Services, Engineering and all the other functions in our businesses are included.
Lockheed Martin: We’ve just begun to scratch the surface with the cost-saving initiative called Six
Sigma and already we’ve generated $64 million in savings with just the first 40 projects. Six Sigma
uses data gathering and statistical analysis to pinpoint sources of error in the organization or products
and determines precise ways to reduce the error.
Simplistically, Six Sigma was a program that was generated around targeting a process Mean (average) six Standard Deviations away from the closest specification limit. By using the process Standard Deviation to determine the location of the Mean, the results could be predicted at 3.4 defects per million by the use of statistics.
• 1984: Bob Galvin of Motorola edicted the first objectives of Six Sigma
– 10x levels of improvement in service and quality by 1989
– 100x improvement by 1991
– Six Sigma capability by 1992
– Bill Smith, an engineer from Motorola, is the person credited as the father of Six Sigma
• 1984: Texas Instruments and ABB work closely with Motorola to further develop Six Sigma
• 1994: Application experts leave Motorola
• 1995: AlliedSignal begins Six Sigma initiative as directed by Larry Bossidy
– Captured the interest of Wall Street
• 1995: General Electric, led by Jack Welch, began the most widespread undertaking of Six Sigma ever attempted
• 1997 to present: Six Sigma spans industries worldwide
There is an allowance for the process Mean to shift 1.5 Standard Deviations. This number is an academic, esoteric and somewhat controversial issue not worth debating here. We will get into a discussion of this number later in the course.
Today the Define Phase is an important aspect of the methodology. Motorola was a mature culture from a process perspective and didn't necessarily have a need for the Define Phase. Most organizations today DEFINITELY need it to properly approach improvement projects. As you will learn, properly defining a problem or an opportunity is key to putting you on the right track to solve it or take advantage of it.
(Flowchart: project selection. The Process Owner estimates COPQ and recommends a project focus; if the project is approved, the project is chartered, a team is created and chartered, and the effort moves into Measure; the team then identifies, prioritizes and selects solutions to control or eliminate the X's causing problems.)
Listed below are the types of Define Phase deliverables that will be reviewed in this course. By the end of this course, you should understand what would be necessary to provide these deliverables in a presentation.
Wherever there are processes, Six Sigma can improve their performance.
(Diagram: LSL – Target – USL; each specification limit represents a requirement.)
Conventional strategy was to create a product or service that met certain specifications.
Assumed that if products and services were of good quality then their
performance standards were correct.
Rework was required to ensure final quality.
Efforts were overlooked and unquantified (time, money, equipment
usage, etc).
The conventional strategy was to create a product or service that met certain specifications. It was assumed that if products and services were of good quality, then their performance standards were correct, irrespective of how they were met.
Using this strategy often required rework to ensure final quality, or the rejection and trashing of some products, and the efforts to accomplish this "inspect in quality" were largely overlooked and unquantified.
You will see more about these issues when we investigate the Hidden Factory.
Problem Solving Strategy
Y = f(Xi)
This simply states that Y is a function of the X's. In other words, Y is dictated by the X's.
Y = f(X) is a key concept that you must fully understand and remember. It is a fundamental principle
to the Six Sigma methodology. In its simplest form it is called “cause and effect”. In its more robust
mathematical form it is called “Y is equal to a function of X”. In the mathematical sense it is data
driven and precise, as you would expect in a Six Sigma approach. Six Sigma will always refer to an
output or the result as a Y and will always refer to an input that is associated with or creates the
output as an X.
Another way of saying this is that the output is dependent on the inputs that create it through the
blending that occurs from the activities in the process. Since the output is dependent on the inputs
we cannot directly control it, we can only monitor it.
Example
Y = f(Xi)
Which process variables (causes) have critical impact on the output (effect)?
Y=f(x) is a transfer function tool to determine what input variables (X’s) affect the output responses
(Y’s). The observed output is a function of the inputs. The difficulty lies in determining which X’s
are critical to describe the behavior of the Y’s.
In the Measure Phase we will introduce a tool to manage the long list of input variables and their relationship to the output responses. It is the X-Y Matrix or Input-Output Matrix.
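The cause-and-effect idea can be sketched in code. In this minimal, hypothetical example (the function and its inputs are not from the book), cycle time plays the role of Y and three process inputs play the role of the X's; in a real project the function is unknown and must be discovered from data:

```python
# Y = f(X1, X2, X3): the output is dictated by the inputs.
# Hypothetical example: cycle time (Y) as a function of three process X's.
def cycle_time(batch_size: float, machine_rate: float, queue_delay: float) -> float:
    # We cannot set Y directly; we can only monitor it and control the X's.
    return batch_size / machine_rate + queue_delay

y = cycle_time(batch_size=100, machine_rate=25, queue_delay=1.5)
print(y)  # 5.5 -- changing any X changes the Y
```

Notice that the only way to move Y is to change one of the X's, which is exactly why the methodology hunts for the Critical X's.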
Y=f(X) Exercise
Exercise: Espresso = f(X1, X2, X3, X4, … Xn)
Notes
As you go through the application of DMAIC you will have a goal to find the root causes to the
problem you are solving. Remember that a vital component of problem solving is cause and effect
thinking or Y=f(X). To aid you in doing so, you should create a visual model of this goal as a funnel – a funnel that takes in a large number of the "trivial many contributors" and narrows them to the "vital few contributors" by the time they leave the bottom.
At the top of the funnel you are faced with all possible causes - the “vital few” mixed in with the
“trivial many.” When you work an improvement effort or project, you must start with this type of
thinking. You will use various tools and techniques to brainstorm possible causes of performance
problems and operational issues based on data from the process. In summary, you will be applying
an appropriate set of “analytical methods” and the “Y is a function of X” thinking, to transform data
into the useful knowledge needed to find the solution to the problem. It is a mathematical fact that 80 percent of a problem is related to six or fewer causes, the X's. In most cases it is between one and three.
The goal is to find the one to three Critical X’s from the many potential causes when we start an
improvement project. In a nutshell, this is how the Six Sigma methodology works.
Breakthrough Strategy
(Chart: performance over time, from Bad to Good. A 6-Sigma Breakthrough moves the process from the Old Standard, with wide control limits (UCL/LCL), to a New Standard at a better level of performance with much tighter control limits.)
By utilizing the DMAIC problem solving methodology to identify and optimize the vital few variables, we will realize sustainable breakthrough performance as opposed to incremental improvements or, even worse, temporary and non-sustainable improvement.
The image above shows how after applying the Six Sigma tools, variation stays within the specification
limits.
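The breakthrough idea can be sketched numerically. In the hypothetical before/after samples below (all values invented for illustration), the New Standard settles at a better level with much tighter control limits, taken here as Mean ± 3 Standard Deviations:

```python
# Control limits (Mean +/- 3 Standard Deviations) before and after breakthrough.
import statistics

old_standard = [14.2, 15.1, 13.8, 15.6, 14.9, 13.5, 15.3, 14.4]  # hypothetical
new_standard = [10.1, 10.3, 9.9, 10.2, 10.0, 10.1, 9.8, 10.2]    # hypothetical

def control_limits(data):
    m, s = statistics.mean(data), statistics.stdev(data)
    return m - 3 * s, m + 3 * s  # (LCL, UCL)

lcl_old, ucl_old = control_limits(old_standard)
lcl_new, ucl_new = control_limits(new_standard)
print(ucl_old - lcl_old)  # wide limits around the Old Standard
print(ucl_new - lcl_new)  # much tighter limits around the New Standard
```

The narrowed spread between LCL and UCL is what "breakthrough" rather than incremental improvement looks like in the data.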
The foundation of Six Sigma requires focus on the voices of the Customer, the Business, and the Employee, which provides:
– VOC is Customer Driven
– VOB is Profit Driven
– VOE is Process Driven
Awareness of the needs that are critical to the quality (CTQ) of our products and
services
Identification of the gaps between “what is” and “what should be”
Identification of the process defects that contribute to the “gap”
Knowledge of which processes are “most broken”
Enlightenment as to the unacceptable costs of poor quality (COPQ)
Six Sigma puts a strong emphasis on the customer because they are the ones assessing our performance, and they respond by either continuing to purchase our products and services or… by NOT!
So, while the customer is the primary concern, we must keep in mind the Voice of the Business – how do we meet the business's needs so we stay in business? And we must keep in mind the Voice of the Employee – how do we meet employees' needs such that they remain employed by our firm and remain inspired and productive?
(Diagram: the Six Sigma role hierarchy – Executive Leadership, Champion/Process Owner, Master Black Belt (MBB), Black Belts, Green Belts, Yellow Belts.)
Just like a winning sports team, various people who have specific positions or roles have defined
responsibilities. Six Sigma is similar - each person is trained to be able to understand and perform the
responsibilities of their role. The end result is a knowledgeable and well coordinated winning business
team.
The division of training and skill will be delivered across the organization in such a way as to provide a
specialist: it is based on an assistant structure much as you would find in the medical field between a
Doctor, 1st year Intern, Nurse, etc. The following slides discuss these roles in more detail.
In addition to the roles described herein, all other employees are expected to have essential Six Sigma
skills for process improvement and to provide assistance and support for the goals of Six Sigma and the
company.
Six Sigma has been designed to provide a structure with various skill levels and knowledge for all
members of the organization. Each group has well defined roles and responsibilities and communication
links. When all individuals are actively applying Six Sigma principles, the company operates and performs
at a higher level. This leads to increased profitability and greater employee and customer satisfaction.
Executive Leadership
Not all Six Sigma deployments are driven from the top by executive leadership. The data is clear,
however, that those deployments that are driven by executive management are much more successful
than those that are not.
Makes decision to implement the Six Sigma initiative and develop accountability
method
Sets meaningful goals and objectives for the corporation
Sets performance expectations for the corporation
Ensures continuous improvement in the process
Eliminates barriers
The executive leadership owns the vision for the business; they provide sponsorship and set expectations for the results from Six Sigma. They enable the organization to apply Six Sigma and then monitor the progress against expectations.
Champion/Process Owner
Champions are responsible for functional business activities and for providing business deliverables to either internal or external customers. They are in a position to be able to recognize problem areas of the business, define improvement projects, assign projects to appropriate individuals, review projects and support their completion. They are also responsible for a business roadmap and employee training plan to achieve the goals and objectives of Six Sigma within their area of accountability.
MBBs should be well versed in all aspects of Six Sigma, from technical applications to Project Management. MBBs need to have the ability to influence change and motivate others.
A Master Black Belt is a technical expert, a “go to” person for the Six Sigma methodology. Master
Black Belts mentor Black Belts and Green Belts through their projects and support Champions. In
addition to applying Six Sigma, Master Black Belts are capable of teaching others in the practices
and tools.
Black Belt
Black Belts are application experts and work projects within the business. They should be well versed with the Six Sigma technologies and have the ability to drive results.
A Black Belt is a project team leader, working full time to solve problems under the direction of a Champion, and with technical support from the Master Black Belt. Black Belts work on projects that are relatively complex and require significant focus to resolve. Most Black Belts conduct an
average of 4 to 6 projects a year -- projects that usually have a high financial return for the
company.
Green Belt
Green Belts are practitioners of Six Sigma Methodology and typically work within their
functional areas or support larger Black Belt Projects.
Green Belts are capable of solving problems within their local span of control. Green Belts remain in
their current positions, but apply the concepts and principles of Six Sigma to their job environment.
Green Belts usually address less complex problems than Black Belts and perform at least two projects
per year. They may also be a part of a Black Belt’s team, helping to complete the Black Belt project.
Yellow Belt
Training as a Six Sigma Belt can be one of the most rewarding undertakings of your career and
one of the most difficult.
You can expect to experience:
Organizational Behaviors
All players in the Six Sigma process must be willing to step up and act according to the Six Sigma
set of behaviors.
Leadership by example: “walk the talk”
Six Sigma is a system of improvement. It develops people skills and capability for the participants. It consists of a proven set of analytical tools, project-management techniques, reporting methods and management methods, combined to form a powerful problem-solving and business-improvement methodology. It solves problems, resulting in increased revenue and profit, and business growth.
The strategy of Six Sigma is a data-driven, structured approach to managing processes, quantifying
problems, and removing waste by reducing variation and eliminating defects.
The tactics of Six Sigma are the use of process exploration and analysis tools to solve the equation
of Y = f(X) and to translate this into a controllable practical solution.
As a performance goal, a Six Sigma process produces less than 3.4 defects per million
opportunities. As a business goal, Six Sigma can achieve 40% or more improvement in the
profitability of a company. It is a philosophy that every process can be improved, at breakthrough
levels.
Notes
Define Phase
Six Sigma Fundamentals
The output of the Define Phase is a well developed and articulated project. It has been correctly
stated that 50% of the success of a project is dependent on how well the effort has been defined.
Process Metrics
Selecting Projects
What is a Process?
Why have a process focus?
– So we can understand how and why work gets done
– To characterize customer & supplier relationships
– To manage for maximum customer satisfaction while utilizing
minimum resources
– To see the process from start to finish as it is currently being
performed
– Blame the process, not the people
What is a Process? Many people perform a process every day, but do you really think of it as a process? Our definition of a process is a repetitive and systematic series of steps or activities where inputs are modified to achieve a value-added output.
Examples of Processes
We go through processes every day. Below are some examples of processes. Can you think of other processes within your daily environment?
Injection molding Recruiting staff
Decanting solutions Processing invoices
Filling vial/bottles Conducting research
Crushing ore Opening accounts
Refining oil Reconciling accounts
Turning screws Filling out a timesheet
Building custom homes Distributing mail
Paving roads Backing up files
Changing a tire Issuing purchase orders
Process Maps
Process Mapping, also called flowcharting, is a technique to visualize the tasks, activities and steps necessary to produce a product or a service. The preferred method for describing a process is to identify it with a generic name, show the workflow with a Process Map and describe its purpose with an operational description. Remember that a process is a blending of inputs to produce some desired output. The intent of each task, activity and step is to add value.
• The purpose of Process Maps is to:
– Identify the complexity of the process
– Communicate the focus of problem solving
• Process Maps are living documents and must be changed as the process is changed
– They represent what is currently happening, not what you think is happening
– They should be created by the people who are closest to the process
The individual processes are linked together to see the total effort and flow for meeting business and
customer needs. In order to improve or to correctly manage a process, you must be able to describe it
in a way that can be easily understood. Process Mapping is the most important and powerful tool you
will use to improve the effectiveness and efficiency of a process.
Standard symbols for process mapping (available in Microsoft Office™, Visio™, iGrafx™, SigmaFlow™ and other products):
There may be several interpretations of some of the process mapping symbols; however, just about everyone uses these primary symbols to document processes. As you become more practiced you will find additional symbols useful, i.e. reports, data storage, etc. For now we will start with just these symbols.
At a minimum a high level Process Map must include start and stop points, all process steps, all decision points and directional flow. Also be sure to include Value Categories such as Value Added (Customer Focus) and Value Enabling (External Stakeholder focus).
One of the deliverables from the Define Phase is a high level process map; at a minimum it must include:
– Start and stop points
– All process steps
– All decision points
– Directional flow
– Value categories as defined below
• Value Added:
– Physically transforms the "thing" going through the process
– Must be done right the first time
– Meaningful from the customer's perspective (is the customer willing to pay for it?)
• Value Enabling:
– Satisfies requirements of non-paying external stakeholders (government regulations)
• Non-Value Added:
– Everything else
(Example Process Map: a call-center case-handling process, with decision points such as "CALL or WALK-IN?", phone data capture, case creation with case type and date/time, queries to internal HRSC SMEs, research and call-back handling, and case closure with date/time stamps.)
A cross-functional ("swim-lane") Process Map is used where the process involves several different departments in the company. (Example: Vendor, Accounting, Financial Accounting and General Accounting lanes for a payments process – producing an invoice, filling out an ACH enrollment form, receiving payment, matching against the database, review and transfer in a journal entry, and bank reconciliation.)
1. Create a high level process map; use enough detail to make it useful.
• It is helpful to use rectangular post-its for process steps and square ones turned to a diamond for decision points.
2. Color code the value added (green) and non-value added (red) steps.
3. Be prepared to discuss this with your mentor.
An important element of Six Sigma is understanding your customer. This is called VOC, or Voice of the Customer. Doing this allows you to find all of the necessary information that is relevant between your product/process and the customer, better known as CTQ's (Critical to Quality). The CTQ's are the customer requirements for satisfaction with your product or service.
There are four steps that can help you in understanding your customer. These steps focus on the customer's perspective of features, your company's integrity, delivery mechanisms and perceived value versus cost. The customer's perspective has to be foremost in the mind of the Six Sigma Belt throughout the project cycle.
1. Features
• Does the process provide what the customers expect and need?
• How do you know?
2. Integrity
• Is the relationship with the customer centered on trust?
• How do you know?
3. Delivery
• Does the process meet the customer's time frame?
• How do you know?
4. Expense
• Does the customer perceive value for cost?
• How do you know?
What is a Customer?
Value Chain
The relationship from one process to the next in an organization creates a "value chain" of suppliers and receivers of process outputs.
Each process has a contribution and accountability to the next to satisfy the external customer.
External customers' needs and requirements are best met when all process owners work cooperatively in the value chain.
Careful – each move has many impacts!
The disconnect from Design and Production in some organizations is a good example. If Production
is not fed the proper information from Design how can Production properly build a product?
Every activity (process) must be linked to move from raw materials to a finished product on a store
shelf.
What is a CTQ?
• Critical to Quality (CTQ's) are measures that we use to capture VOC properly (also referred to in some literature as CTC's – critical to customer).
• CTQ's can be vague and difficult to define.
– The customer may identify a requirement that is difficult to measure directly, so it will be necessary to break down what is meant by the customer into identifiable and measurable terms.
Example: Making an Online Purchase. Reliability – the correct amount of money is taken from the account.
Developing CTQ's
The steps in developing CTQ's are identifying the customer, capturing the Voice of the Customer and finally validating the CTQ's.
Step 1 – Identify Customers
• Listing
• Segmentation
• Prioritization
Step 2 – Capture VOC
• Review existing performance
• Determine gaps in what you need to know
• Select tools that provide data on gaps
• Collect data on the gaps
Step 3 – Validate CTQ's
• Translate VOC to CTQ's
• Prioritize the CTQ's
• Set Specified Requirements
• Confirm CTQ's with customer
Another important tool from this phase is COPQ, Cost of Poor Quality. COPQ represents the financial opportunity of your team's improvement efforts. Those opportunities are tied to either hard or soft savings. COPQ is a symptom measured in loss of profit (financial quantification) that results from errors (defects) and other inefficiencies in our processes. This is what we are seeking to eliminate!
• COPQ stands for Cost of Poor Quality
• As a Six Sigma Belt, one of your tasks will be to estimate COPQ for your process
• Through your process exploration and project definition work you will develop a refined estimate of the COPQ in your project
• This project COPQ represents the financial opportunity of your team's improvement effort (VOB)
• Calculating COPQ is iterative and will change as you learn more about the process
(No, not that kind of cop queue!)
You will use the concept of COPQ to quantify the benefits of an improvement effort and also to
determine where you might want to investigate improvement opportunities.
Prevention costs are typically costs associated with ensuring product quality; they are viewed as an investment that companies make to ensure product quality. The final element is appraisal costs, which are tied to product inspection and auditing.
This idea of COPQ was defined by Joseph Juran and is a great point of reference for gaining further understanding.
Over time and with Six Sigma, COPQ has migrated towards the reduction of waste. Waste is a better
term, because it includes poor quality and all other costs that are not integral to the product or service
your company provides. Waste does not add value in the eyes of customers, employees or investors.
COPQ - Categories
Internal COPQ: Quality Control Department; Inspection; Quarantined Inventory; etc.
External COPQ: Warranty; Customer Complaint Related Travel; Customer Charge Back Costs; etc.
Prevention: Error Proofing Devices; Supplier Certification; Design for Six Sigma; etc.
Detection: Supplier Audits; Sorting Incoming Parts; Repaired Material; etc.
COPQ - Iceberg

Even worse are the intangible Costs of Poor Quality. These are typically 20 to 35% of sales. If you average the intangible and tangible costs together, it is not uncommon for a company to be spending 25% of its revenue on COPQ or waste.
Implementing Lean fundamentals can also help identify areas of COPQ. Lean will be discussed later.
While hard savings are always more desirable because they are easier to quantify, it is also necessary to think about soft savings.
COPQ Exercise
Notes
• Better: DPU, DPMO, RTY (there are others, but they derive from these
basic three)
• Faster: Cycle Time
• Cheaper: COPQ
If you make the process better by eliminating defects, you will make it faster.

If you choose to make the process faster, you will have to eliminate defects to be as fast as you can be.

If you make the process better or faster, you will necessarily make it cheaper.

The metrics for all Six Sigma projects fall into one of these three categories.
The previous slides have been discussing process management and the concepts behind a process perspective. Now we begin to discuss process improvement and the metrics used.
– It is not simply the “touch time” of the value-added portion of the process

What is the cycle time of the process you mapped? Is there any variation in the cycle time? Why?

Cycle time includes any wait or queue time for either people or products.
DPU, or Defects per Unit, quantifies individual defects on a unit and not just defective units. A returned unit or transaction can be defective and have more than one defect.

Defect: A physical count of all errors on a unit, regardless of the disposition of the unit.

Six Sigma methods quantify individual defects and not just defectives.
– Defects account for all errors on a unit
• A unit may have multiple defects
• An incorrect invoice may have the wrong amount due and the wrong due date
– Defectives simply classifies the unit as bad
• It doesn’t matter how many defects there are
• The invoice is wrong; the causes are unknown
– A unit:
• Is the measure of volume of output from your area.
• Is observable and countable. It has a discrete start and stop point.
• Is an individual measurement and not an average of measurements.

EXAMPLES: An online transaction has errors (typed wrong card number, internet failed). In this case one online transaction had 2 defects (DPU = 2): two defects, one defective.

A mobile computer that has 1 broken video screen, 2 broken keyboard keys and 1 dead battery has a total of 4 defects. (DPU = 4)
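The two examples can be expressed as a quick calculation. This is a minimal sketch; the helper name `dpu` is my own, and the defect counts come from the examples above.

```python
# Sketch: DPU counts every defect on every unit, not just whether a
# unit is defective. Counts are taken from the examples in the text.

def dpu(total_defects, total_units):
    """Defects per Unit = total defects observed / total units processed."""
    return total_defects / total_units

# One online transaction with 2 defects (wrong card number, failed connection)
print(dpu(2, 1))    # 2.0

# One mobile computer with 4 defects (screen, two keys, battery)
print(dpu(4, 1))    # 4.0
```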
Is a process that produces 1 DPU better or worse than a process that generates 4 DPU? If you assume equal weight on the defects, obviously a process that generates 1 DPU is better; however, cost and severity should be considered. Still, the only way you can model or predict a process is to count all the defects.
Traditional metrics, when chosen poorly, can lead the team in a direction that is not consistent with the focus of the business. One of the metrics we must be concerned about is FTY - First Time Yield. It is very possible to have 100% FTY and spend tremendous amounts in excess repairs and rework.
Instead of relying on FTY - First Time Yield, a more efficient metric to use is RTY-Rolled Throughput
Yield. RTY has a direct correlation (relationship) to Cost of Poor Quality.
In the few organizations where data is readily available, the RTY can be calculated using actual defect
data. The data provided by this calculation would be a binomial distribution since the lowest yield
possible would be zero.
As depicted here, RTY is the multiplied yield of each subsequent operation throughout a process (X1 *
X2 * X3…)
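The multiplication just described can be sketched in a few lines. The three step yields below are illustrative values, not figures from the text.

```python
# Sketch of Rolled Throughput Yield as the product of step yields
# (X1 * X2 * X3 ...); step yields here are illustrative.
import math

def rolled_throughput_yield(step_yields):
    return math.prod(step_yields)

rty = rolled_throughput_yield([0.98, 0.95, 0.99])
print(round(rty, 4))  # 0.9217
```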
RTY Estimate
Sadly, in most companies there is not enough data to calculate RTY over the long term, and installing the data collection practices required to provide such data would not be cost effective. In those instances, it is necessary to utilize a prediction of RTY in the form of e^(-dpu) (e to the negative dpu).

• In many organizations the long-term data required to calculate RTY is not available; we can, however, estimate RTY using a known DPU as long as certain conditions are met.
• The Poisson distribution generally holds true for the random distribution of defects in a unit of product and is the basis for the estimation.
– The best estimate of the proportion of units containing no defects, or RTY, is:

RTY = e^(-dpu)

The mathematical constant e is the base of the natural logarithm.
e ≈ 2.71828 18284 59045 23536 02874 7135

When using the e^(-dpu) equation to calculate the probability of a product or service moving through the entire process without a defect, there are several things that must be held for consideration. While this would seem to be a constraint, it is appropriate to note that if a process has in excess of 10% defects, there is little need to concern yourself with the RTY.
In such extreme cases, it would be much more prudent to correct the problem at hand before worrying
about how to calculate yield.
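The Poisson-based estimate can be computed directly; `rty_estimate` is a hypothetical helper name for the RTY = e^(-dpu) relation described above.

```python
# RTY estimate from a known DPU: the probability of a unit passing
# through the whole process with zero defects, assuming Poisson defects.
import math

def rty_estimate(dpu):
    return math.exp(-dpu)

print(round(rty_estimate(0.02), 6))  # 0.980199, i.e. about 98.02%
print(round(rty_estimate(1.0), 3))   # 0.368
```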
Poisson vs Binomial (r=0, n=1)

Probability of a defect   Yield (Binomial)   Yield (Poisson)   % Overestimated
0.0                       100%               100%              0%
0.1                       90%                90%               0%
0.2                       80%                82%               2%
0.3                       70%                74%               4%
0.4                       60%                67%               7%
0.5                       50%                61%               11%
0.6                       40%                55%               15%
0.7                       30%                50%               20%
0.8                       20%                45%               25%
0.9                       10%                41%               31%
1.0                       0%                 37%               37%
Binomial:
n = number of units
r = number of predicted defects
p = probability of a defect occurrence
q = 1 - p

For low defect rates (p < 0.1), the Poisson approximates the Binomial fairly well.
Our goal is to predict yield. For process improvement, the “yield” of interest is the ability of a process
to produce zero defects (r=0). Question: What happens to the Poisson equation when r=0?
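The comparison in the table can be reproduced numerically. This sketch takes the Binomial zero-defect yield for n=1, which is simply 1 - p, against the Poisson estimate e^(-p).

```python
# For r=0 and n=1 the Binomial yield is 1 - p; the Poisson estimate is
# e^(-p). The Poisson overestimates yield as p grows, matching the table.
import math

for p in (0.1, 0.5, 1.0):
    binomial_yield = 1 - p
    poisson_yield = math.exp(-p)
    print(p, round(binomial_yield, 2), round(poisson_yield, 2))
# 0.1 0.9 0.9
# 0.5 0.5 0.61
# 1.0 0.0 0.37
```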
Deriving RTY from DPU - Modeling

Basic Question: What is the likelihood of producing a unit with zero defects?

• For the unit shown above the following data was gathered:
– 60 defects observed
– 60 units processed
• What is the DPU?
• What is the probability that any given opportunity will be a defect?
• What is the probability that any given opportunity will NOT be a defect?
• What is the probability that all 10 opportunities on a single unit will be defect-free?

The probability that any opportunity is a defect = # defects / (# units x # opportunities per unit).

RTY for DPU = 1:

Opportunities   P(defect)   P(no defect)   RTY (Prob defect-free unit)
10              0.1         0.9            0.34867844
100             0.01        0.99           0.366032341
1000            0.001       0.999          0.367695425
10000           0.0001      0.9999         0.367861046
100000          0.00001     0.99999        0.367877602
1000000         0.000001    0.999999       0.367879257

To what value is the P(0) converging? If we extend the concept to an infinite number of opportunities, all at a DPU of 1.0, we will approach the value of 0.368.

Note: Ultimately, this means that you need the ability to track all the individual defects which occur per unit via your data collection system.
The point of this slide is to demonstrate the mathematical model used to predict the probability of an outcome of interest. It has little practical purpose other than to acquaint the Six Sigma Belt with the math behind the tool they are learning and let them understand that there is a logical basis for the equation.
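The convergence toward 0.368 can be reproduced with the same arithmetic the table uses: a DPU of 1.0 spread across n opportunities gives P(defect-free unit) = (1 - 1/n)^n.

```python
# Reproducing the convergence: (1 - 1/n)^n approaches e^-1 ≈ 0.368
# as the number of opportunities per unit grows, at a fixed DPU of 1.0.
for n in (10, 100, 1000, 1000000):
    rty = (1 - 1.0 / n) ** n
    print(n, round(rty, 8))
# 10 0.34867844
# 100 0.36603234
# 1000 0.36769542
# 1000000 0.36787926
```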
The DPU for a given operation can be calculated by dividing the number of defects found in the operation by the number of units entering the operational step.

100 parts built
2 defects identified and corrected
dpu = 0.02

So RTY for this step would be e^(-0.02) (0.980199) or 98.02%.
RTY_TOT = 0.90

RTY1 = 0.98   RTY2 = 0.98   RTY3 = 0.98   RTY4 = 0.98   RTY5 = 0.98
dpu = .02     dpu = .02     dpu = .02     dpu = .02     dpu = .02

dpu_TOT = .1
If the process had only 5 process steps with the same yield, the process RTY would be: 0.98 * 0.98 * 0.98 * 0.98 * 0.98 = 0.903921 or 90.39%. Since our metric of primary concern is the COPQ of this process, we can say that nearly 10% of the time we will be spending dollars in excess of the pre-determined standard or value-added amount to which this process is entitled.

As the number of steps in a process increases, we continue to multiply the yield from each step to find the overall process yield. For the sake of simplicity, let’s say we are calculating the RTY for a process with 8 steps. Each step in our process has a yield of .98. Again, there will be a direct correlation between the RTY and the dollars spent to correct errors in our process.
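Both multi-step figures reduce to a single power of the per-step yield, a quick sketch:

```python
# Process RTY with identical step yields: yield ** number_of_steps.
print(round(0.98 ** 5, 6))  # 0.903921 -> the five-step example, 90.39%
print(round(0.98 ** 8, 6))  # 0.850763 -> the eight-step example, 85.08%
```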
Product A: FTY = 80%
Product B: FTY = 80%

As you have seen, there are many factors behind the final number for FTY. That’s where we need to look for process improvements. Let’s look at the DPU of each product, assuming equal opportunities and margin, and answer the questions.

Now we have a better idea of:
“What does a defect cost?”
“What product should get the focus?”

Product A: dpu = 200 / 100 = 2
Product B: dpu = 100 / 100 = 1

Now, can you tell which to work on? The product with the highest DPU? …think again!
Explain COPQ
Notes
Define Phase
Selecting Projects
Selecting Projects
Overview
The core fundamentals of this phase are Selecting Projects, Refining and Defining, and Financial Evaluation.

Understanding Six Sigma
Six Sigma Fundamentals
Selecting Projects
 – Selecting Projects
 – Refining & Defining
 – Financial Evaluation

The output of the Define Phase is a well developed and articulated project. It has been correctly stated that 50% of the success of a project is dependent on how well the effort has been defined.
Selecting Projects

Project Charter – The project charter is a more detailed version of the business case. This document further focuses the improvement effort. It can be characterized by two primary sections: basic project information and simple project performance metrics.

Document: Project Charter
Responsible Party: Six Sigma Belt
Resources: Champion (Process Owner) & Master Black Belt
Frequency of Update: Ongoing
Selecting Projects
The Starting Point is defined by the Champion or Process Owner and the Business Case is the output.
– These are some examples of business metrics or Key Performance Indicators, commonly referred to as KPI’s.
– The tree diagram is used to facilitate the process of breaking down the metric of interest.

Level 1: EBIT
Level 2: Cycle time, Defects, Cost, Revenue, Complaints, Compliance, Safety

What metric should you focus on? It depends. What is the project focus? What are your organization’s strategic goals? Are Costs of Sales preventing growth? Are customer complaints resulting in lost earnings? Are excess cycle times and yield issues eroding market share? Is the fastest growing division of the business the refurbishing department?
It depends, because the motivations of organizations vary so much, and all projects should be directly aligned with the organization’s objectives. Answer the questions: What metrics is my department not meeting? What is causing us pain?
Selecting Projects
Be sure to start with higher level metrics, whether they are measured at the Corporate Level,
Division Level or Department Level, projects should track to the Metrics of interest within a given
area. Primary Business Measures or Key Performance Indicators (KPI’s) serve as indicators of the
success of a critical objective.
Primary Business Measure → Business Activities (Measure) → Business Processes (Measure)

Post business measures (product/service) are lower-level metrics and must focus on the end product.
Selecting Projects
Primary Business Measure → Business Activities (Measure) → Business Processes (Measure)

Y = f(x1, x2, x3 … xn)
1st Call Resolution = f(Calls, Operators, Resolutions … xn)

Business measures are a function of activities. These activities are usually created or enforced by direct supervision of functional managers. Activities are usually made up of a series of processes or specific processes.
Business Case Components - Processes

Primary Business Measure → Business Activities (Measure) → Business Processes (Measure)

Y = f(x1, x2, x3 … xn)
Resolutions = f(New Customers, Existing Customers, Defective Products … xn)

The processes represent the final stage of the matrix, where multiple steps result in the delivery of some output for the customer. These deliverables are set by the business and customer and are captured within the Voice of the Customer, Voice of the Business or Voice of the Employee. What makes up these processes are the X’s that determine the performance of the Y, which is where the actual breakthrough projects should be focused.
Selecting Projects
Let’s get
down to
business!
As you review this statement remember the following format of what needs to be in a Business Case: WHAT is wrong, WHERE and WHEN is it occurring, what is the BASELINE magnitude at which it is occurring, and what is it COSTING me?
You must take caution to avoid under-writing a Business Case. Your natural tendency is to write too
simplistically because you are already familiar with the problem. You must remember that if you are to
enlist support and resources to solve your problem, others will have to understand the context and the
significance in order to support you.
The Business Case cannot include any speculation about the cause of the problem or what actions will
be taken to solve the problem. It’s important that you don’t attempt to solve the problem or bias the
solution at this stage. The data and the Six Sigma methodology will find the true causes and solutions
to the problem.
Selecting Projects
You need to make sure that your own Business Case captures the units of pain, the business measures, the performance and the gaps. If this template does not seem to be clicking, use your own or just free-form your Business Case, ensuring that it is well articulated and quantified.
Using the Excel file ‘Define Templates.xls’, Business Case, perform this exercise.
Selecting Projects
Components:
• The Problem
• Project Scope
• Project Metrics
y & Secondary
• Primary y
• Graphical Display of Project Metrics
• Primary & Secondary
• Standard project information
• Project, Belt & Process Owner
names
• Start date & desired End date
• Division or Business Unit
• Supporting Master Black Belt
(Mentor)
• Team Members
The Project Charter is an important document – it is the initial communication of the project. The first
phases of the Six Sigma methodology are Define and Measure. These are known as
“Characterization” phases that focus primarily on understanding and measuring the problem at hand.
Therefore some of the information in the Project Charter, such as primary and secondary metrics, can change several times. By the time the Measure Phase is wrapping up, the Project Charter should be in its final form, meaning defects and the metrics for measuring them are clear and agreed upon.
As you can see some of the information in the Project Charter is self explanatory, especially the first
section. We are going to focus on establishing the Problem Statement and determining Objective
Statement, scope and the primary and secondary metrics.
Project Charter - Definitions

• Problem Statement - Articulates the pain of the defect or error in the process.
• Primary Metric – The actual measure of the defect or error in the process.
• Charts – Graphical displays of the Primary and Secondary Metrics over a period of time.
Selecting Projects
Pareto Analysis
Assisting you in determining which inputs are having the greatest impact on your process is the Pareto Analysis approach.

Pareto Analysis: A bar graph used to arrange information in such a way that priorities for process improvement can be established.
Selecting Projects
(Pareto chart data, three levels:)

Level 1 – Scrap:
Category    A        B       C
Cost        150000   30000   25000
Percent     73.2     14.6    12.2
Cum %       73.2     87.8    100.0

Level 2 – Department:
Department  J       M       F       W       Other
Cost        95000   23000   19000   17500   5000
Percent     59.6    14.4    11.9    11.0    3.1
Cum %       59.6    74.0    85.9    96.9    100.0

Level 3 – Part:
Part        Z101    Z876    X492
Cost        75000   15000   5000
Percent     78.9    15.8    5.3
Cum %       78.9    94.7    100.0
The Pareto Charts are often referred to as levels. For instance the first graph is called the first level,
the next the second level and so on.
Start high and drill down. Let’s look at how we interpret this and what it means.
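The percent and cumulative-percent arithmetic behind each level can be sketched in a few lines; the category names and costs below are the Level 1 scrap figures, and the sorting logic is a minimal illustration.

```python
# Level 1 Pareto computation: sort categories by cost descending, then
# report each category's percent of total and the running cumulative percent.
scrap = {"A": 150000, "B": 30000, "C": 25000}

total = sum(scrap.values())
cum = 0.0
for name, cost in sorted(scrap.items(), key=lambda kv: -kv[1]):
    pct = 100 * cost / total
    cum += pct
    print(f"{name}: {cost} ({pct:.1f}%, cumulative {cum:.1f}%)")
# A: 150000 (73.2%, cumulative 73.2%)
# B: 30000 (14.6%, cumulative 87.8%)
# C: 25000 (12.2%, cumulative 100.0%)
```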
Department J makes up 60% of the scrap, and part Z101 makes up nearly 80% of Department J’s scrap cost. You may be eager to jump into trying to fix Department J.
Selecting Projects
An indication that your project has too broad of a focus is when your Pareto looks flat. It’s telling you that there is no one or two inputs that are impacting your process; multiple inputs are having similar effects.

You need to reduce the scope of the project to get to a more granular level.

(Flat Pareto by FAILURE MODE: six failure modes with counts 495, 489, 478, 472, 468 and 455, each contributing roughly 16 to 17% of the total, with no dominant bar.)
Selecting Projects
This gives a better picture of which product category produces the highest defect count.

PRODUCT CATEGORIES   PLATINUM-BUS   PLATINUM-IND   GREEN-BUS   GREEN-IND   Other
Count                1238           450            362         201         106
Percent              52.5           19.1           15.4        8.5         4.5
Cum %                52.5           71.6           87.0        95.5        100.0
Now we’ve got something to work with. Notice the 80% area: draw a line from the 80% mark across to the cumulative percent line (Red Line) in the graph, as shown here.
Now you are beginning to see what needs work to improve the performance of your project.
Selecting Projects
Remember to keep focused on finding the biggest bang for the buck.

TRAVEL     CAR     HOTEL   AIR
Count      428     420     390
Percent    34.6    33.9    31.5
Cum %      34.6    68.5    100.0
This does not mean there is NO opportunity for improvements to be had; it simply means nothing obvious is sticking out at this level.
So keep looking.
Selecting Projects
Moving on to the next element of the Project Charter… Using the Excel file ‘Define Templates.xls’, Project Charter, perform the following exercise.

Since we will be narrowing in on the defect through the Measure Phase, it is common for the primary metric to change several times while we struggle to understand what is happening in our process of interest.

Establishing the Primary Metric:

The primary metric is a very important measure in the Six Sigma project; this metric is a quantified measure of the defect or primary issue of the project.

We can only have one Primary Metric. Recall the equation y = f(x): once your defect is located, Y will be your defect, and your primary metric will measure it.

– Quantified measure of the defect
– Serves as the indicator of project success
– Links to the KPI or Primary Business measure
– Only one primary metric per project
The primary metric also serves as the gauge for when we can claim victory with the project.
Selecting Projects
Selecting Projects
The financial evaluation establishes the value of the project.
Standard financial principals should be followed at the beginning and end of the project to provide a
true measure of the improvement’s effect on the organization.
A financial representative of the firm should establish guidelines on how savings will be calculated
throughout the Six Sigma deployment.
Whatever your organization’s protocol may be, these aspects should be accounted for within any improvement project.

There are two types of Impact: “One-Off” and Sustainable.
Selecting Projects
• Benefits should be measured in accordance with Generally Accepted Accounting Principles (GAAP).
A. Projects directly impact the Income Statement or Cash Flow Statement.
B. Projects impact the Balance Sheet (working capital).
Selecting Projects
It is highly recommended that you follow the involvement governance shown here.
Benefits Capture - Summary
It’s a wrap!
Selecting Projects
The Benefits Calculation Template facilitates and aligns with the aspects discussed for Project
Accounting.
Selecting Projects
Notes
Define Phase
Elements of Waste
Elements of Waste
Overview
7 Components of Waste
5S
Definition of Lean
Elements of Waste
Lean – History
Lean Manufacturing has been going on for a very long time; however, the phrase is credited to James Womack in 1990. A small list of accomplishments is noted in the slide above, primarily focused on higher-volume manufacturing.
Forms of waste include: Wasted capital (inventory), wasted material (scrap), wasted time (cycle time),
wasted human effort (inefficiency, rework) and wasted energy (energy inefficiency). Lean is a
prescriptive methodology for relatively fast improvements across a variety of processes, from
administrative to manufacturing applications. Lean enables your company to identify waste where it
exists. It also provides the tools to make improvements on the spot.
Elements of Waste
Lean removes many forms of waste so that Six Sigma can focus on eliminating variability. Variation leads to defects, which are a major source of waste. Six Sigma is a method to make processes more capable through the reduction of variation; thus the symbiotic relationship between the two methodologies.
Lean brings these opportunities for savings back into focus with specific approaches to finding
and eliminating waste.
Elements of Waste
Overproduction

Examples are:
✓ Over-ordering materials
✓ Duplication of effort/reports

Producing more parts than necessary to satisfy the customer’s quantity demand, thus leading to idle capital invested in inventory.

Producing parts at a rate faster than required such that a work-in-process queue is created – again, idle capital.
Elements of Waste
Correction
Examples are:
✓ Misspelled words in communications

Inventory
Examples are:
✓ Over-ordering materials consumed in-house

Inventory is a drain on an organization’s overhead. The greater the inventory, the higher the overhead costs become. If quality issues arise and inventory is not minimized, defective material is hidden in finished goods.
To remain flexible to customer requirements and to control product variation, we must minimize
inventory. Excess inventory masks unacceptable change-over times, excessive downtime,
operator inefficiency and a lack of organizational sense of urgency to produce product.
Elements of Waste
Motion

Motion is the unnecessary movement of people and equipment.
– This includes looking for things like documents or parts, as well as movement that is straining.

Examples are:
✓ Extra steps
✓ Having to look for something
Any movement of people or machinery that does not contribute added value to the product, i.e.
programming delay times and excessive walking distance between operations.
Overprocessing

Waste of Over-processing relates to over-processing anything that may not be adding value in the eyes of the customer.

Examples are:
✓ Sign-offs
✓ Communications, reports, emails, contracts, etc. that contain more than the necessary points (briefer is better)
✓ Voice mails that are too long
Processing work that has no connection to advancing the line or improving the quality of the product. Examples include typing memos that could be hand written, or painting components or fixtures internal to the equipment.
Elements of Waste
Conveyance
Examples are:
9Distance traveled
Conveyance is incidental, required action that does not directly contribute value to the product. Perhaps the product must be moved; however, the time and expense incurred do not produce product or service characteristics that customers see.
It’s vital to avoid conveyance unless it is supplying items when and where they are needed (i.e.
just-in-time delivery).
Waiting
Examples are:
Idle time between operations or events, i.e. an employee waiting for machine cycle to finish or a
machine waiting for the operator to load new parts.
Elements of Waste
– Overproduction ___________________
– Correction ___________________
– Inventory ___________________
– Motion ___________________
– Overprocessing ___________________
– Conveyance ___________________
– Waiting ___________________
Notes
Elements of Waste
5S – The Basics
Seiso – Clean
Seiketsu – Purity
Shitsuke - Commitment
The term “5S” derives from the Japanese words for five practices leading to a clean and manageable
work area. The five “S” are:
‘Seiri' means to separate needed tools, parts and instructions from unneeded materials and to
remove the latter.
'Seiton' means to neatly arrange and identify parts and tools for ease of use.
'Seiso' means to conduct a cleanup campaign.
'Seiketsu' means to conduct seiri, seiton and seiso at frequent, indeed daily, intervals to maintain a workplace in perfect condition.
'Shitsuke' means to form the habit of always following the first four S’s.
Simply put, 5S means the workplace is clean, there is a place for everything and everything is in its
place. The 5S will create a work place that is suitable for and will stimulate high quality and high
productivity work. Additionally it will make the workplace more comfortable and a place of which you
can be proud.
Developed in Japan, this method assumes that no effective, quality job can be done without a clean and safe environment and without behavioral rules.
The 5S approach allows you to set up a well adapted and functional work environment, ruled by
simple yet effective rules. 5S deployment is done in a logical and progressive way. The first three S’s
are workplace actions, while the last two are sustaining and progress actions.
It is recommended to start implementing 5S in a well chosen pilot workspace or pilot process and
spread to the others step by step.
Elements of Waste
English Translation
There have been many attempts to find five English “S” words that maintain the original intent of 5S from Japanese. Listed below are typical English words used to translate:
1. Sort (Seiri)
2. Straighten or Systematically Arrange (Seiton)
3. Shine or Spic and Span (Seiso)
4. Standardize (Seiketsu)
5. Sustain or Self-Discipline (Shitsuke)
Sort – Identify necessary items and remove unnecessary ones; use time management.
Straighten – Neatly arrange and identify items for ease of use.
Shine – Visual sweep of areas; eliminate dirt, dust and scrap. Make the workplace shine.
Standardize – Work to standards, maintain standards, wear safety equipment.
Self-Discipline – Make 5S a strong habit. Make problems appear and solve them.
Regardless of which “S” words you use, the intent is clear: Organize the workplace, keep it neat and
clean, maintain standardized conditions and instill the discipline required to enable each individual to
achieve and maintain a world class work environment.
Elements of Waste
5S Exercise
• Sort ____________________
• Straighten ____________________
• Shine ____________________
• Standardize ____________________
• Self-Discipline ____________________
Notes
Elements of Waste
Describe 5S
Notes
Define Phase
Wrap Up and Action Items
Now we will conclude the Define Phase with “Wrap Up and Action Items”.
• Making data-based decisions
Look for the potential roadblocks and plan to address them before
they become problems:
– N o historical data exists to support the project.
– Team members do not have the time to collect data.
– Data presented is the best guess by functional managers.
– Data is communicated from poor systems.
– The project is scoped too broadly.
– The team creates the “ideal” Process Map rather than the “as is” Process Map.
DMAIC Roadmap

(Flowchart: the Champion/Process Owner estimates the COPQ and establishes the team; if the project focus is not approved, the COPQ is re-estimated and a new project focus recommended; once approved, the team creates the Team Charter. The Measure Phase then proves or disproves the impact the x’s have on the problem.)
Define Questions
Step One: Project Selection, Project Definition And Stakeholder Identification
Project Charter
• What is the problem statement? Objective?
• Is the business case developed?
• What is the primary metric?
• What are the secondary metrics?
• Why did you choose these?
• What are the benefits?
• Have the benefits been quantified? If not, when will this be done?
Date:____________________________
• Who is the customer (internal/external)?
• Has the COPQ been identified?
• Has the controller’s office been involved in these calculations?
• Who are the members on your team?
• Does anyone require additional training to be fully effective on the team?
Voice of the Customer (VOC) and SIPOC defined
• Voice of the customer identified?
• Key issues with stakeholders identified?
• VOC requirements identified?
• Business Case data gathered, verified and displayed?
Step Two: Process Exploration
Processes Defined and High Level Process Map
• Are the critical processes defined and decision points identified?
• Are all the key attributes of the process defined?
• Do you have a high level process map?
• Who was involved in its development?
General Questions
• Are there any issues/barriers that prevent you from completing this phase?
• Do you have adequate resources to complete the project?
• Have you completed your initial Define report out presentation?
These are some additional questions to ensure all the deliverables are achieved.
Notes
Measure Phase
Welcome to Measure
Now that we have completed the Define Phase, we are going to jump into the Measure Phase. “Welcome to Measure” will give you a brief look at the topics we are going to cover.
Welcome to Measure
Overview
• Process Discovery
• Six Sigma Statistics
• Measurement System Analysis
• Process Capability
• Wrap Up & Action Items
DMAIC Roadmap
[Roadmap diagram: the Champion/Process Owner determines the appropriate project focus; in Define the COPQ is estimated and the team established; the Measure Phase follows, and the financial impact is verified.]
Here is the overview of the DMAIC process. Within Measure we are going to start getting into details about
process performance, measurement systems and variable prioritization.
Welcome to Measure
[Measure flowchart: confirm the measurement system is repeatable and reproducible, then select the vital few X’s causing problems (X-Y Matrix, FMEA).]
This provides a process look at putting “Measure” to work. By the time we complete this phase you will have a thorough understanding of the various Measure Phase concepts.
Measure Phase
Process Discovery
Overview
• Welcome to Measure
• Process Discovery
  – Detailed Process Mapping
  – Cause and Effect Diagrams
  – FMEA
• Six Sigma Statistics
• Measurement System Analysis
• Process Capability
• Wrap Up & Action Items
The purpose of this module is highlighted above. We will review tools to help facilitate Process
Discovery.
This will be a lengthy step as it requires a full characterization of your selected process.
On the next lesson page we will help you develop a visual and mental model that will give you leverage in finding the causes to any problem.
Process Discovery
[Fishbone Diagram: the Y, or Problem Condition, sits at the head; the X’s (Causes) branch off legs grouped under categories such as Material, Measurement and Environment.]
You will need to use brainstorming techniques to identify all possible problems and their causes.
Brainstorming techniques work because the knowledge and ideas of two or more persons is
always greater than that of any one individual.
Brainstorming will generate a large number of ideas or possibilities in a relatively short time.
Brainstorming tools are meant for teams but can also be used at the individual level.
Brainstorming will be a primary input for other improvement and analytical tools that you will use.
You will learn two excellent brainstorming techniques, cause and effect diagrams and affinity
diagrams. Cause and effect diagrams are also called Fishbone Diagrams because of their
appearance and sometimes called Ishikawa diagrams after their inventor.
In a brainstorming session, ideas are expressed by the individuals in the session and written down
without debate or challenge. The general steps of a brainstorming session are:
Process Discovery
A cause and effect diagram is a composition of lines and words representing a meaningful
relationship between an effect, or condition, and its causes. To focus the effort and facilitate thought,
the legs of the diagram are given categorical headings. Two common templates for the headings are
for product related and transactional related efforts. Transactional is meant for processes where
there is no traditional or physical product; rather it is more like an administrative process.
Transactional processes are characterized as processes dealing with forms, ideas, people,
decisions and services. You would most likely use the product template for determining the cause of
burnt pizza and use the transactional template if you were trying to reduce order defects from the
order taking process. A third approach is to identify all categories as you best perceive them.
When performing a cause and effect diagram, keep drilling down, always asking why, until you find
the root causes of the problem. Start with one category and stay with it until you have exhausted all
possible inputs and then move to the next category. The next step is to rank each potential cause by its likelihood of being the root cause: rank the most likely as a 1, the second most likely as a 2 and so on. This may take some time; you may even have to create sub-sections like 2a, 2b, 2c, etc., then come back to reorder the sub-sections into the larger ranking. This is your first attempt at really
finding the Y=f(X); remember the funnel? The top X’s have the potential to be the Critical X’s, those
X’s which exert the most influence on the output Y.
Finally you will need to determine if each cause is a Control or a Noise factor. This, as you know, is a
requirement for the characterization of the process. Next we will explain the meaning and methods
of using some of the common categories.
Process Discovery
The People category groups root causes related to people, staffing, and
organizations:
Examples of questions to ask:
• Are people trained, do they have the right skills?
• Is there person to person variation?
• Are people over-worked?
The Method category groups root causes related to how the work is done, the
way the process is actually conducted:
Examples of questions to ask:
• How is this performed?
• Are procedures correct?
• What might be unusual?
The Materials category groups root causes related to parts, supplies, forms or
information needed to execute a process:
Process Discovery
The Equipment category groups root causes related to tools used in the process:
Examples of questions to ask:
• Have machines been serviced recently, what is the uptime?
• Have tools been properly maintained?
• Is there variation?
The Environment (a.k.a. Mother Nature) category groups root causes related to
our work environment, market conditions, and regulatory issues.
Examples of questions to ask:
• Is the workplace safe and comfortable?
• Are outside regulations impacting the business?
• Does the company culture aid the process?
For each of the X’s identified in the Fishbone diagram classify them
as follows:
– Controllable – C (Knowledge)
– Procedural – P (People, Systems)
– Noise – N (External or Uncontrollable)
WHICH X’s CAUSE DEFECTS?
The Cause and Effect Diagram is an organized way to approach brainstorming. This approach allows
us to further organize ourselves by classifying the X’s into controllable, procedural or noise types.
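The classification and ranking steps described above can be captured as simple data. Here is a minimal sketch; the example causes, categories, types and ranks for the burnt-pizza problem are hypothetical illustrations, not values from the text:

```python
# Hypothetical fishbone output for the burnt-pizza example: each potential X
# is tagged Controllable (C), Procedural (P) or Noise (N) and ranked by
# likelihood of being the root cause (1 = most likely).
causes = [
    {"x": "Oven temperature setting", "category": "Equipment", "type": "C", "rank": 1},
    {"x": "Observation frequency",    "category": "Method",    "type": "P", "rank": 2},
    {"x": "Ingredient moisture",      "category": "Material",  "type": "N", "rank": 3},
    {"x": "New-hire training",        "category": "People",    "type": "P", "rank": 4},
]

# Sort by rank so the top candidates for Critical X's come first.
for cause in sorted(causes, key=lambda c: c["rank"]):
    print(f'{cause["rank"]}. [{cause["type"]}] {cause["category"]}: {cause["x"]}')
```

A list like this feeds directly into the X-Y Matrix and FMEA described later, where the top-ranked X’s are evaluated further.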
Process Discovery
[Fishbone Diagram for Chemical Purity, with inputs on each branch classified: Capability (C), Adherence to procedure (P), Specifications (C), Startup inspection (P), Room Humidity (N), Column Capability (C), Handling (P), RM Supply in Market (N), Nozzle type (C), Purification Method (P), Shipping Methods (C), Temp controller (C), Data collection/feedback (P).]
This example of the Cause and Effect Diagram is of chemical purity. Notice how the input variables for
each branch are classified as Controllable, Procedural and Noise.
Below is a Cause & Effect Diagram for surface flaws. The next few
slides will demonstrate how to create it in MINITAB™.
The Fishbone Diagram shown here for surface flaws was generated in MINITAB™. We will now
review the various steps for creating a Cause and Effect Diagram using the MINITAB™
statistical software package.
Process Discovery
Open the MINITAB™ Project “Measure Data Sets.mpj” and select the worksheet
Surfaceflaws.mtw.
Take a few moments to study the worksheet. Notice the first 6 columns are the classic bones for a
Fishbone. Each subsequent column is labeled for one of the X’s listed in one of the first six columns
and are the secondary bones.
After you have entered the Labels, click on the first field under the “Causes” column to bring up the
list of branches on the left hand side. Next double-click the first branch name on the left hand side to
move “C1 Man” underneath “Causes”.
Process Discovery
To continue identifying
the secondary
branches, select the
button, “Sub…” to the
right of the “Label”
column.
In order to adjust the Fishbone Diagram so the main cause titles are not rolled, grab the line with your mouse and move the entire bone.
Process Discovery
[Linear Process Map: Start → Step A → Step B → Step C → Step D → Finish, with an inspection step.]
Process Mapping, also called flowcharting, is a technique to visualize the tasks, activities and steps
necessary to produce a product or a service. The preferred method for describing a process is to
identify it with a generic name, show the workflow with a Process Map and describe its purpose with
an operational description.
Remember that a process is a blending of inputs to produce some desired output. The intent of each
task, activity and step is to add value, as perceived by the customer, to the product or service we are
producing. You cannot discover if this is the case until you have adequately mapped the process.
Individual maps developed by Process Members form the basis of Process Management. The
individual processes are linked together to see the total effort and flow for meeting business and
customer needs.
In order to improve or to correctly manage a process, you must be able to describe it in a way that
can be easily understood, that is why the first activity of the Measure Phase is to adequately describe
the process under investigation. Process Mapping is the most important and powerful tool you will
use to improve the effectiveness and efficiency of a process.
Process Discovery
Process Mapping
Then there is the third view: “what it should be”. This is the result of process improvement activities. It
is precisely what you will be doing to the key process you have selected during the weeks between
classes. As a result of your project you will either have created the “what it should be” or will be well
on your way to getting there. In order to find the “what it should be” process, you have to learn
process mapping and literally “walk” the process via a team method to document how it works. This is a much easier task than you might suspect, as you will learn over the next several lessons.
Process Discovery
There may be several interpretations of some of the Process Mapping symbols; however, just
about everyone uses these primary symbols to document processes. As you become more
practiced you will find additional symbols useful, e.g. reports, data storage, etc. For now we will
start with just these symbols.
Process Discovery
Level 1 – The Macro Process Map, sometimes called a Management level or viewpoint.
[Level 1 Macro Process Map for pizza: Customer Hungry → Calls for Order → Take Order → Make Pizza → Cook Pizza → Box Correct Pizza → Deliver Pizza → Customer Eats.

Level 2 Process Map: Take Order from Cashier → Add Ingredients → Place in Oven → Observe Frequently → Check if Done? (No: keep observing; Scrap and Start New Pizza if burnt) → Remove from Oven → Pizza Correct? (No: Scrap) → Tape Order on Box → Put on Box → Place in Delivery Rack.]
Before Process Mapping starts, you have to learn about the different level of detail on a Process
Map and the different types of Process Maps. Fortunately these have been well categorized and
are easy to understand.
There are three different levels of Process Maps. You will need to use all three levels and you most
likely will use them in order from the macro map to the micro map. The macro map contains the
least level of detail, with increasing detail as you get to the micro map. You should think of and use
the level of Process Maps in a way similar to the way you would use road maps. For example, if
you want to find a country, you look at the world map. If you want to find a city in that country, you
look at the country map. If you want to find a street address in the city, you use a city map. This is
the general rule or approach for using Process Maps.
The Macro Process Map, what is called the Level 1 Map, shows the big picture; you will use this to
orient yourself to the way a product or service is created. It will also help you to better see which
major step of the process is most likely related to the problem you have and it will put the various
processes that you are associated with in the context of the larger whole. A Level 1 PFM,
sometimes called the “management” level, is a high-level process map having the following
characteristics:
Process Discovery
Probably not, you are going to need a Level 3 Map called the Micro Process Map. It is also known as the improvement view of a process. There is, however, a lot of value in the Level 2 Map, because it is helping you to “see” and understand how work gets done, who does it, etc. It is a necessary stepping stone to arriving at improved performance.
Next we will introduce the four different types of Process Maps. You will want to use different
types of Process Maps, to better help see, understand and communicate the way processes
behave.
The four types are the Linear Flow Map, the Swim Lane Map, the SIPOC Map (pronounced “sipoc”) and the Value Stream Map. While they all show how work gets done, they emphasize different aspects of process flow and provide you with alternative ways to understand the behavior of the process so you can do something about it. The Linear Flow Map is the most traditional and is usually where most start the mapping effort.

The Swim Lane Map adds another dimension of knowledge to the picture of the process: now you can see which department, area or person is responsible. The value of the Swim Lane Map is that it shows you who or which department is responsible for the steps in a process. This can provide powerful insights into the way a process performs. A timeline can be added to show how long it takes each group to perform their work. Also, each time work moves across a swim lane, there is a “Supplier – Customer” interaction. This is usually where bottlenecks and queues form. You can use the various types of maps in the form of any of the three levels of a Process Map.
Process Discovery
[Linear Process Map for Door Manufacturing, including steps: Begin, Prep doors, Inspect, Pre-cleaning, Mark for door handle drilling, Install into work jig, Light sanding, Inspect finish (Return for rework / Rework), Drill holes, De-burr and smooth hole, Apply part number, Move to finishing, Apply stain and dry, Final cleaning, Final Inspect (Scratch repair / Scrap), End.

Swim Lane Process Map for Capital Equipment, with lanes for Business Unit, I.T., Finance, Corporate/Top Mgt, Procurement and Supplier: Define needs → Prepare paperwork (CAAR & installation request) → Review & approve CAAR in each function → Acquire equipment → Supplier ships → Configure & install → Issue payment → Supplier paid → Receive & use. Stage durations: 21, 6, 15, 5, 17, 7, 71 and 50 days.]
The SIPOC diagram is especially useful after you have been able to construct either a Level 1 or Level 2 Map because it facilitates your gathering of other pertinent data that is affecting the process in a systematic way. It will help you to better see and understand all of the influences affecting the behavior and performance of the process.

[Level 1 Process Map for Customer Order Process: Call for an Order → Answer Phone → Write Order → Confirm Order → Sets Price → Address & Phone → Order to Cook. Data gathered includes: name, phone number, time, day and date, drink types & quantities, other products, volume, order transaction, delivery info, correct price.]

You may also add a requirements section to both the supplier side and the customer side to capture the expectations for the inputs and the outputs of the process. Doing a SIPOC is a great building block to creating the Level 3 Micro Process Map. The two really complement each other and give you the power to make improvements to the process.
Process Discovery
[Value stream data at this Process Map level: queue times of 2.65 days, 20.47 days, 16.9 days, 1.60 days and 7.57 days.]
Read the following background for the exercise: You have been concerned
about your ability to arrive at work on time and also the amount of time it takes
from the time your alarm goes off until you arrive at work. To help you better
understand both the variation in arrival times and the total time, you decide to create a Level 1 Macro Process Map. For purposes of this exercise, the start is when your alarm goes off the first time and the end is when you arrive at your work station.
Task 1 – Mentally think about the various tasks and activities that you routinely
do from the defined start to the end points of the exercise.
Task 2 – Using a pencil and paper, create a linear process map at the macro level, but with enough detail that you can see all the major steps of your process.
Task 3 – From the Linear Process Map, create a swim lane style Process Map.
For the lanes you may use the different phases of your process, such as the
wake up phase, getting prepared, driving, etc.
Process Discovery
Process Mapping follows a general order, but sometimes you may find it necessary, even advisable, to deviate somewhat. However, you will find this a good path to follow as it has proven itself to generate significant results. On the lessons ahead we will always show you where you are at in this sequence of tasks for Process Mapping.

[Process Mapping sequence: Select the process → Determine approach to map the process → Complete Level 1 PFM worksheet → Create Level 1 PFM → Define the scope for the Level 2 PFM → Create the Level 2 PFM → Perform SIPOC → Identify all X’s and Y’s → Identify customer requirements → Identify supplier requirements → Create a Level 3 PFM → Add performance data → Identify VA/NVA steps.]

Before we begin our Process Mapping we will first start you off with how to determine the approach to mapping the process.
Basically there are two approaches: the individual and the team approach.
If you decide to do the individual approach, here are a few key factors: You must pretend that you are the
product or service flowing through the process and you are trying to “experience” all of the tasks that
happen through the various steps.
You must start by talking to the manager of the area and/or the process owner. This is where you will
develop the Level 1 Macro Process Map. While you are talking to him, you will need to receive permission
to talk to the various members of the process in order to get the detailed information you need.
Process Discovery
Process Mapping works best with a team approach. The logistics of performing the mapping with a team are somewhat different, but overall it takes less time, the quality of the output is higher and you will have more “buy-in” into the results. Input should come from individuals familiar with all stages of the process.

Using the Team Approach:
1. Start with the Level 1 Macro Process Map.
2. Meet with process owner(s) / manager(s). Create a Level 1 Map and obtain approval to call a process mapping meeting with process members (see team workshop instructions for details on running the meeting).
3. Bring key members of the process into the process flow workshop. If the process is large in scope, hold individual workshops for each subsection of the total process. Start with the beginning steps. Organize the meeting to use the “post-it note” approach to gather individual tasks and activities, based on the macro map, that comprise the process.
4. Immediately assemble the information that has been provided into a Process Map.
5. Verify the PFM by discussing it with process owners and by observing the actual process from beginning to end.
Where appropriate the team should include line individuals, supervisors, design engineers, process
engineers, process technicians, maintenance, etc. The team process mapping workshop is where it
all comes together.
In summary, after adding to and agreeing to the Macro Process Map, the team process mapping
approach is performed using multiple post-it notes where each person writes one task per note and,
when finished, places them onto a wall which contains a large scale Macro Process Map.
This is a very fast way to get a lot of information including how long it takes to do a particular task.
Using the Value Stream Analysis techniques which you will study later, you will use this data to improve the process. We will now discuss the development of the various levels of Process Mapping.
Process Discovery
A Macro Process Map can be useful when reporting project status to management. A macro-map can
show the scope of the project, so management can adjust their expectations accordingly. Remember, only major process steps are included. For example, a step listed as “Plating” in a manufacturing Macro Process Map might actually consist of many steps: pre-clean, anodic cleaning, cathodic
activation, pre-plate, electro-deposition, reverse-plate, rinse and spin-dry, etc. The plating step in the
macro-map will then be detailed in the Level 2 Process Map.
Exercise – Generate a Level 1 PFM
Process Discovery
If necessary, you may look at the example for the pizza order entry process.

1. Identify a generic name for the process:
2. Mentally “walk” through the major steps of the process and write them down:

Example:
1. Identify a generic name for the process: (i.e. customer order process).
2. Mentally “walk” through the major steps of the process and write them down: (Receive the order via phone call from the customer, calculate the price, create a build order and provide the order to the chef).
Process Discovery
the details. If the efficiency or effectiveness of the process could be significantly improved by a broad summary analysis, the improvement would be done already. If you map the process at an actionable level, you can identify the source of inefficiencies and defects. But you need to be careful about mapping too little an area and missing your problem cause, or mapping too large an area in detail, thereby wasting your valuable time.

[The Level 2 Process Map of the pizza process is shown for reference.]

The rules for determining the Level 2 Process Map scope:
• From your Macro Process Map, select the area which represents your problem.
• Map this area at a Level 2.
• Start and end at natural starting and stopping points for a process; in other words, you have the complete associated process.
Process Discovery
Building a SIPOC
The tool name prompts the team to consider the suppliers (the 'S' in SIPOC) of your process, the
inputs (the 'I') to the process, the process (the 'P') your team is improving, the outputs (the 'O') of
the process and the customers (the 'C') that receive the process outputs.
Requirements of the customers can be appended to the end of the SIPOC for further detail and
requirements are easily added for the suppliers as well.
The SIPOC tool is particularly useful in identifying:
Who supplies inputs to the process?
What are all of the inputs to the process we are aware of? (Later in the DMAIC methodology you will use other tools which will find still more inputs; remember Y=f(X), and if we are going to improve Y, we are going to have to find all the X’s.)
What specifications are placed on the inputs?
What are all of the outputs of the process?
Who are the true customers of the process?
What are the requirements of the customers?
You can actually begin with the Level 1 PFM that has 4 to 8 high-level steps, but a Level 2 PFM is even
of more value. When creating a SIPOC with a process mapping team, the recommended method is again a wall exercise similar to your other process mapping workshop. Create an area that will allow the team to place post-it note additions to the 8.5 x 11 sheets with the letters S, I, P, O and C on them, with a copy of the Process Map below the sheet with the letter P on it.
Hold a process flow workshop with key members. (Note: If the process is large in scope, hold an
individual workshop for each subsection of the total process, starting with the beginning steps).
The preferred order of the steps is as follows:
1. Identify the outputs of this overall process.
2. Identify the customers who will receive the outputs of the process.
3. Identify customers’ preliminary requirements.
4. Identify the inputs required for the process.
5. Identify suppliers of the required inputs that are necessary for the process to function.
6. Identify the preliminary requirements of the inputs for the process to function properly.
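The six steps above can be sketched as a simple record built in the recommended order. This is a minimal illustration; the pizza order-taking entries are assumptions for the example, not content from the text:

```python
# A minimal SIPOC record for the pizza order-taking example; entries are
# illustrative. Built in the recommended order: outputs first, then
# customers, customer requirements, inputs, suppliers, supplier requirements.
sipoc = {
    "process": "Customer order process",
    "outputs": ["Build order", "Price quote"],                       # step 1
    "customers": ["Chef (internal)", "Customer (external)"],          # step 2
    "customer_requirements": ["Correct order", "Correct price"],      # step 3
    "inputs": ["Phone call", "Menu", "Price list"],                   # step 4
    "suppliers": ["Customer", "Store manager"],                       # step 5
    "supplier_requirements": ["Clear order details", "Current prices"],  # step 6
}

# Lay the record out in S-I-P-O-C order, as on the workshop wall.
for letter, key in [("S", "suppliers"), ("I", "inputs"), ("P", "process"),
                    ("O", "outputs"), ("C", "customers")]:
    print(letter, "-", sipoc[key])
```

Capturing the wall exercise in a structure like this makes it easy to transfer the results into the Excel forms described next.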
Process Discovery
The Excel spreadsheet is somewhat self-explanatory. You will use a similar form for identifying the
supplier requirements. Start by writing in the process name followed by the process operational
definition. The operational definition is a short paragraph which states why the process exists, what it
does and what its value proposition is. Always take sufficient time to write this such that anyone who
reads it will be able to understand the process. Then list each of the outputs, the Y’s, and write in the
customer’s name who receives this output, categorized as an internal or external customer.
Next are the requirements data. To specify and measure something, it must have a unit of measure;
called a metric. As an example, the metric for the speed of your car is miles per hour, for your weight it is
pounds, for time it is hours or minutes and so on. You may know what the LSL and USL are but you may
not have a target value. A target is the value the customer prefers all the output to be centered at;
essentially, the average of the distribution. Sometimes it is stated as “1 hour +/- 5 minutes”. One hour is
the target, the LSL is 55 minutes and the USL is 65 minutes. A target may not be specified by the
customer; if not, put in what the average would be. You will want to minimize the variation from this value.
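The “1 hour +/- 5 minutes” example can be made concrete: the target is 60 minutes, the LSL 55 and the USL 65. A minimal sketch, where the sample delivery times are made-up data:

```python
# Spec from the text: "1 hour +/- 5 minutes" -> target 60, LSL 55, USL 65.
LSL, TARGET, USL = 55, 60, 65

def within_spec(minutes):
    """True if a measured time meets the customer requirement."""
    return LSL <= minutes <= USL

# Illustrative measurements (made-up data):
times = [58, 61, 66, 54, 60]
defects = [t for t in times if not within_spec(t)]
print(f"{len(defects)} of {len(times)} observations out of spec: {defects}")
# prints: 2 of 5 observations out of spec: [66, 54]
```

Anything outside the LSL/USL window is a defect against the requirement; minimizing variation around the target reduces how often this happens.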
You will learn more about measurement, but for now you must know that if something is required, you
must have a way to measure it as specified in column 9. Column 10 is how often the measurement is
made and column 11 is the current value for the measurement data. Column 12 is for identifying if this is
a value or non value added activity; more on that later. And finally column 13 is for any comments you
want to make about the output.
You will come back to this form and rank the significance of the outputs in terms of importance to identify the CTQ’s.
Process Discovery
[Form column headings: Metric, LSL, Target, USL, Measurement System (How is it Measured), Frequency of Measurement, Performance Level Data, VA or NVA, Comments.]
Later you will come back to this form and rank the importance of the inputs to the success of your
process and eventually you will have found the Critical X’s.
Process Discovery
It is important to distinguish which category an input falls into. You know through Y=f(X), that if it is a
Critical X, by definition, that you must control it. Also, if you believe that an input is or needs to be controlled, then you have automatically implied there are requirements placed on it and that it must be measured. You must always think and ask whether an input is or should be controlled or if it is uncontrolled.
Read the following background for the exercise: You will use your selected key process for this exercise (if more than one person in the class is part of the same process you may do it as a small group). You may not have all the pertinent detail to correctly identify all supplier requirements; that is OK, do the best you can. This will give you a starting template when you go back to do your workplace assignment. Use the process input identification and analysis form for this exercise.

Task 1 – Identify a generic name for the process.
Task 2 – Write an operational description for the process.
Task 3 – Complete the remainder of the form except the Value – Non-value added column.
Task 4 – Report out to the class when called upon.
Process Discovery
[Level 2 Process Map of the pizza process, with inputs such as pizza dough identified.]

[Output Identification and Analysis form columns: Output Data (Process Output – Name (Y); Customer (Name); Internal/External), Requirements Data (Metric, LSL, Target, USL), Measurement Data (Measurement System (How is it Measured), Frequency of Measurement, Performance Level Data), Value Data (VA or NVA) and General Data/Information (Comments).]

[Input Identification and Analysis form columns: Input Data (Process Input – Name (X); Supplier (Name); Controlled (C) or Noise (N); Internal/External), Requirements Data (Metric, LSL, Target, USL), Measurement Data (Measurement System (How is it Measured), Frequency of Measurement, Performance Level Data), Value Data (VA or NVA) and General Data/Information (Comments).]
You have a decision at this point to continue with a complete characterization of the process you have
documented at a Level 2 in order to fully build the process management system or to narrow the effort
by focusing on those steps that are contributing to the problem you want solved.
Usually just a few of the process steps are the root cause areas for any given higher level process
output problem. If your desire is the latter, there are some other Measure Phase actions and tools you
will have to use to narrow the number of potential X’s and subsequently the number of process steps.
To narrow the scope so it is relevant to your problem, consider the following: Remember using the pizza
restaurant as our example for selecting a key process? They were having a problem with overall delivery
time and burnt pizzas. Which steps in this process would contribute to burnt pizzas, and how might a pizza which was burnt so badly it had to be scrapped and restarted affect delivery time? It would most
likely be the steps between “place in oven” to “remove from oven”, but it might also include “add
ingredients” because certain ingredients may burn more quickly than others. This is how, based on the
Problem Statement you have made, you would narrow the scope for doing a Level 3 PFM.
For your project, the priority will be to do your best to find the problematic steps associated with your
Problem Statement. We will teach you some new tools in a later lesson to aid you in doing this. You may
have to characterize a number of steps until you get more experience at narrowing the steps that cause
problems; this is to be expected. If you have the time you should characterize the whole process.
Each step you select as a causal step in the process must be fully characterized, just as you have
previously done for the whole process. In essence you will do a “mini SIPOC” on each step of the
process as defined in the Level 2 Process Map. This can be done using a Level 3 Micro Process Map
and placing all the information on it, or it can be consolidated into an Excel spreadsheet format, or a
combination of both. If all the data and information is put onto an actual Process Map, expect the map to
be rather large physically. Depending on the scope of the process, some people dedicate a wall space
for doing this; say a 12 to 14 foot long wall. An effective approach for this is to use a roll of industrial
paper.
Process Discovery
A Level 3 Process Map contains all of the process details needed to meet your objective: all of the flows,
set points, standard operating procedures (SOPs), inputs and outputs; their specifications and if they are
classified as being controllable or non-controllable (noise). The Level 3 PFM usually contains estimates of
defects per unit (DPU), yield and rolled throughput yield (RTY) and value/non-value add. If processing
cycle times and inventory levels (materials or work queues) are important, value stream parameters are
also included.
This can be a lot of detail to manage and appropriate tracking sheets are required. We have supplied
these sheets in a paper and Excel spreadsheet format for your use. The good news is the approach and
forms for the steps are essentially the same as the format for identifying supplier and customer
requirements at the process level. A spreadsheet is a very convenient tool, and the output from the
spreadsheet can then be fed directly into a C&E Matrix and an FMEA (to be described later), also built
using spreadsheets.
You will find the work you have done up to this point, in terms of the Level 1 and 2 Process Maps and the
SIPOC, will be of use, both from knowledge of the process and actual data.
An important reminder of a previous lesson: You will recall when you were taught about project definition
it was stated that you should only try to solve the performance of one process output at any one time.
Because of the amount of detail you can get into for just one Y, trying to optimize more than one Y at a
time can become overwhelming. The good news is that by focusing on one Y in your initial project you
will have laid all the groundwork to focus on a second and a third Y for the process.
Process Inputs (X’s) and Outputs (Y’s)
You are now down at the PROCESS STEP improvement view of a process. Now you do exactly the
same thing as you did for the overall process: for each process step you list all of the inputs and outputs
and their requirements, and add performance data. This visualization shows many of the inputs and
outputs once they are characterized. By using the process and process step input and output sheets,
you get a very detailed picture of how your process works. Now you have enough data…

[Worksheet excerpt, pizza example: for the “Take Order” step the inputs Name, Address, Phone, Time,
Day and Date are classified as Noise, with requirements such as “within 10 miles”, “within area code”,
“11 AM to 1 AM”, “5 X 52” and “MM/DD,YY”; the output Order requires all fields complete. For the
“Make Pizza” step the inputs Order, Raw Ingredients and Recipe are Controllable or SOP-governed
(all fields complete, per spec sheets, per recipe chart 3-1 in Oz., per Rev 7.0), and the output Pizza is
checked on size, weight and correct ingredients; sizes of 7”, 12”, 16” and toppings of 12 meats,
2 veggies, 3 cheeses are noted as N/C.]
Identifying Waste
When we produce products or services, we engage process-based activities to transform physical
materials, ideas and information into something valued by customers. Some activities in the process
generate true value, others do not. The expenditure of resources, capital and other energies that do not
generate value is considered waste. Value generation is any activity that changes the form, fit or function
of what we are working on in a way that the customer is willing to pay for. The goal of testing for VA vs.
NVA is to remove unnecessary activity (waste) from a process.

Each process activity can be tested for its value-add contribution. Ask the following two questions to
identify non-value-added activity:
– Is the form, fit or function of the work item changed as a result of this activity?
– Is the customer willing to pay for this activity?

[Figure: a Level 3 PFM of the pizza order-taking process with each activity flagged VA or NVA. Steps
such as “writes time on scratch pad”, “rewrite order” and “verify with notes” are marked NVA.]
Hint: If an action starts with the two letters “re” it’s a good chance that it’s a form of waste, i.e. rework,
replace, review, etc.
Some non-value activities cannot be removed; i.e., data collection is required to understand and plan
production activity levels, data must be collected to comply with governmental regulations, etc. (even
though the data have no effect on the actual product or service).
On the process flow diagram we place a red X through the steps or we write NVA or VA by each step.
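The two-question VA/NVA test above is mechanical enough to sketch in code. Here is a minimal Python illustration; the step names and yes/no answers are hypothetical, loosely based on the pizza example, not taken from a real process map.

```python
# Hypothetical sketch of the two-question value-add test from the text.
# A step is value-add (VA) only if it passes BOTH tests; otherwise it is NVA.

def classify_step(changes_form_fit_function: bool, customer_will_pay: bool) -> str:
    """Apply the two VA/NVA test questions to one process activity."""
    return "VA" if (changes_form_fit_function and customer_will_pay) else "NVA"

# Illustrative steps: (name, changes form/fit/function?, customer willing to pay?)
steps = [
    ("Take order",        True,  True),
    ("Rewrite order",     False, False),   # starts with "re-": likely waste
    ("Make pizza",        True,  True),
    ("Verify with notes", False, False),
]

for name, changes, pays in steps:
    print(f"{name}: {classify_step(changes, pays)}")
```

Note how the "re-" hint from the text shows up: the rework-style steps fail both questions.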
Process Discovery
A Six Sigma Belt does not just discover which X’s are important in
a process (the vital few).
– The team considers all possible X’s that can contribute or
cause the problem observed.
– The team uses 3 primary sources of X identification:
• Process Mapping
• Fishbone Analysis
• Basic Data Analysis – Graphical and Statistical
– A List of X’s is established and compiled.
– The team then prioritizes which X’s it will explore first, and
eliminates the “obvious” low impact X’s from further
consideration.
This is an important tool for the many reasons we have already stated. Use it to your benefit and
leverage the team; this will help progress you through the methodology to accomplish your ultimate
project goal.
Process Discovery
This is the X-Y Diagram. You should have a copy of this template. If possible open it and get
familiar with it as we progress through this section.
Process Discovery
List X’s from the Fishbone Diagram in horizontal rows.

Use your Fishbone Diagram as the source and type the Inputs into this section. Use common sense;
some of the info from the Fishbone may not justify going into the X-Y inputs.
Process Discovery
Example
This is the summary worksheet. If you click on the “Summary” tab you will see this output. Take some
time to review the worksheet.

X-Y Diagram Summary (Process: laminating; Date: 5/2/2006)
Output Variables (Y’s) and Weights: broken (10), unbonded area (9), smears (8), thickness (7).
Top Input Variables by Ranking: temperature 162 (14.90%), human handling 159 (14.63%),
material properties 130 (11.96%), washer 126 (11.59%).

[The accompanying Pareto chart plots the input rankings (temperature, time, clean room cleanliness,
material properties, pressure, …) against cumulative Rank %.]
Process Discovery
Definition of FMEA

Failure Modes Effect Analysis, or FMEA [usually pronounced as F-M-E-A (individual letters) or FEMA
(as a word)], is a structured approach to:
• Predict failures and prevent their occurrence in manufacturing and other functional areas which
generate defects.
• Identify the ways in which a process can fail to meet critical customer requirements (Y).
• Estimate the Severity, Occurrence and Detection (SOD) of defects.
• Evaluate the current control plan for preventing these failures from occurring and escaping to the
customer.
• Prioritize the actions that should be taken to improve and control the process using a Risk Priority
Number (RPN).

At this point the FMEA is developed with tribal knowledge by a cross-functional team. Later, using
process data, the FMEA can be updated and better estimates of Detection and Occurrence can be
obtained. The FMEA is not a tool to eliminate X’s but rather to control them. It is only a tool to identify
potential X’s and prioritize the order in which the X’s should be evaluated.

Give me an “F”, give me an “M”……
Process Discovery
History of FMEA:
• First used in the 1960’s in the Aerospace industry during the Apollo missions.
• In 1974 the Navy developed MIL-STD-1629 regarding the use of FMEA.
• In the late 1970’s, automotive applications, driven by liability costs, began to incorporate FMEA into
the management of their processes.
• The Automotive Industry Action Group (AIAG) now maintains the FMEA standard for both Design
and Process FMEA’s.
The “edge of your seat” info on the history of the FMEA! I’m sure you will all be sharing this with
everyone tonight at the dinner table!
Types of FMEA’s
• Design FMEA (DFMEA): Performed early in the design phase to analyze product failure modes
before the product is released to production. The purpose is to analyze how failure modes affect the
system and to minimize them. The severity rating of a failure mode MUST be carried into the Process
FMEA (PFMEA).
Process Discovery
Purpose of FMEA

As a means to manage risk: FMEA’s help you manage RISK by classifying your process inputs and
monitoring them.

The FMEA…
This is an FMEA. We have provided a template for you to use.
Process Discovery
FMEA Components…#

The second column is the Name of the Process Step. The FMEA should sequentially follow the steps
documented in your Process Map; for example: Phone, Dial Number, Listen for Ring, Say Hello,
Introduce Yourself, etc.

[The FMEA form columns are: #; Process Function (Step); Potential Failure Modes (process defects);
Potential Failure Effects (Y’s); SEV; Class; Potential Causes of Failure (X’s); OCC; Current Process
Controls; DET; RPN; Recommended Actions; Responsible Person & Target Date; Taken Action; and
revised SEV, OCC, DET and RPN.]
Process Discovery
This is simply the effect of realizing the potential failure mode on the overall process. It focuses on the
outputs of each step. This information is usually obtained from your Process Map.
Process Discovery
The fifth column highlighted here is the ranking that is developed based on the team’s knowledge of the
process in conjunction with the predetermined scale.
Severity is a financial measure of the impact to the business of a failure in the output.
Ranking Severity
The Automotive Industry Action Group, a consortium of the “Big Three” (Ford, GM and Chrysler),
developed these criteria. If you don’t like them, develop criteria that fit your organization; just make sure
they are standardized so everyone uses the same scale.
High (7) – Minor disruption to the production line. The product may have to be sorted and a portion
(less than 100%) scrapped. Vehicle operable, but at a reduced level of performance. Customers will
be dissatisfied.
Moderate (6) – Minor disruption to the production line. A portion (less than 100%) may have to be
scrapped (no sorting). Vehicle/item operable, but some comfort/convenience item(s) inoperable.
Customers will experience discomfort.
Low (5) – Minor disruption to the production line. 100% of product may have to be re-worked.
Vehicle/item operable, but some comfort/convenience item(s) operable at a reduced level of
performance. Customers will experience some dissatisfaction.
Very Low (4) – Minor disruption to the production line. The product may have to be sorted and a
portion (less than 100%) re-worked. Fit/finish/squeak/rattle item does not conform. Most customers
will notice the defect.
Minor (3) – Minor disruption to the production line. A portion (less than 100%) of the product may
have to be re-worked online but out-of-station. Fit/finish/squeak/rattle item does not conform.
Average customers will notice the defect.
Very Minor (2) – Minor disruption to the production line. A portion (less than 100%) of the product
may have to be re-worked online but in-station. Fit/finish/squeak/rattle item does not conform.
Discriminating customers will notice the defect.
None (1) – No effect.
* Potential Failure Mode and Effects Analysis (FMEA), Reference Manual, 2002. Pgs 29-45. Chrysler
Corporation, Ford Motor Company, General Motors Corporation.
Process Discovery
You will need to define your own criteria… and be consistent throughout your FMEA.
The actual definitions of the severity are not so important as the fact that the team remains
consistent in its use of the definitions. Below is a sample of transactional severities.
Critical Business Unit-wide (10) – May endanger company’s ability to do business. Failure mode
affects process operation and/or involves noncompliance with government regulation.
Critical Loss – Customer Specific (9) – May endanger relationship with customer. Failure mode
affects product delivered and/or customer relationship due to process failure and/or noncompliance
with government regulation.
High (7) – Major disruption to process/production down situation. Results in near 100% rework or an
inability to process. Customer very dissatisfied.
Moderate (5) – Moderate disruption to process. Results in some rework or an inability to process.
Process is operable, but some workarounds are required. Customers experience dissatisfaction.
Low (3) – Minor disruption to process. Process can be completed with workarounds or rework at the
back end. Results in reduced level of performance. Defect is noticed and commented upon by
customers.
Minor (2) – Minor disruption to process. Process can be completed with workarounds or rework at
the back end. Results in reduced level of performance. Defect noticed internally, but not externally.
None (1) – No effect.
Shown here is an example for severity guidelines developed for a financial services company.
Process Discovery
Controllable – A factor that can be dialed into a specific setting/value. For example, Temperature or
Flow.
Procedures – A standardized set of activities leading to readiness of a step. For example, Safety
Compliance, “Lock-Out Tag-Out.”
Noise – A factor that cannot be dialed in to a specific setting/value. For example, rain in a mine.
Recall the classifications of Procedural, Controllable and Noise developed when constructing your
Process Map and Fishbone Diagram? Use those classifications from the Fishbone in the “Class”
column, highlighted here, in the FMEA.
Potential Causes of Failure (X’s)
The column “Potential Causes of the Failure”, highlighted here, refers to how the failure could occur.
This information should be obtained from the Fishbone Diagram.
Process Discovery
Occurrence refers to how frequently the specified failure is projected to occur. This information should
be obtained from Capability Studies or Historical Defect Data, in conjunction with the predetermined
scale.
Ranking Occurrence
Potential Failure Mode and Effects Analysis (FMEA), Reference Manual, 2002. Pg. 35. Chrysler
Corporation, Ford Motor Company, General Motors Corporation.
The Automotive Industry Action Group, a consortium of the “Big Three”: Ford, GM and Chrysler
developed these Occurrence rankings.
Process Discovery
Current Process Controls refers to the three types of controls that are in place to prevent a failure with
the X’s. The three types of controls are:
• SPC (Statistical Process Control)
• Poka-Yoke (Mistake Proofing)
• Detection after Failure
The column “Current Process Controls”, highlighted here, refers to the three types of controls that are
in place to prevent failures.
FMEA Components…Detection (DET)

The “Detection” column highlighted here is an assessment of the probability that the proposed type of
control will detect a subsequent failure mode.
Process Discovery
Ranking Detection
Potential Failure Mode and Effects Analysis (FMEA), AIAG Reference Manual, 2002. Pg. 35. Chrysler
Corporation, Ford Motor Company, General Motors Corporation.
The Automotive Industry Action Group, a consortium of the “Big Three”: Ford, GM and Chrysler
developed these Detection criteria.
RPN = (SEV)*(OCC)*(DET)
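The RPN formula, and the prioritization it drives, can be sketched directly. The failure modes and the Severity/Occurrence/Detection ratings below are made up for illustration; a real FMEA would take them from the team's rating scales.

```python
# Sketch: computing Risk Priority Numbers (RPN = SEV * OCC * DET) and sorting
# failure modes so the highest-risk X's are worked first. Ratings are illustrative.

failure_modes = [
    # (description, severity, occurrence, detection)
    ("Wrong phone number dialed", 5, 4, 3),
    ("Order written illegibly",   7, 6, 5),
    ("Oven temperature drifts",   8, 3, 6),
]

scored = [(desc, s * o * d) for desc, s, o, d in failure_modes]

# highest RPN first: this is the order the team should address the causes
for desc, rpn in sorted(scored, key=lambda t: -t[1]):
    print(f"RPN {rpn:3d}  {desc}")
```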
Process Discovery
FMEA Components…Actions
Responsible Person & Date refers to the name of the group or person
responsible for completing the activity and when they will complete it.
Taken Action refers to the action and effective date after it has been
completed.
The columns highlighted here are a type of post-FMEA record. Remember to update the FMEA
throughout your project; this is what we call a “Living Document” because it changes throughout your
project.
Process Discovery
FMEA Exercise
OK Team, let’s get that FMEA!
Process Discovery
Create an FMEA
Notes
Measure Phase
Six Sigma Statistics
Now we will continue in the Measure Phase with “Six Sigma Statistics”.
Overview
In this module you will learn how your processes speak to you in the form of data. If you are to
understand the behaviors of your processes, then you must learn to communicate with the process in
the language of data.

The field of statistics provides the tools and techniques to act on data, to turn data into information and
knowledge which you will then use to make decisions and to manage your processes.

The statistical tools and methods that you will need to understand and optimize your processes are not
difficult. Use of Excel spreadsheets or specific statistical analytical software has made this a relatively
easy task.

Measure Phase roadmap: Welcome to Measure; Process Discovery; Six Sigma Statistics (Basic
Statistics, Descriptive Statistics, Normal Distribution, Assessing Normality, Special Cause / Common
Cause, Graphing Techniques); Measurement System Analysis; Process Capability; Wrap Up & Action
Items.
In this module you will learn basic, yet powerful analytical approaches and tools to increase your
ability to solve problems and manage process behavior.
Relax….it won’t
be that bad!
Having an understanding of Basic Statistics can be quite valuable to an individual. Statistics however,
like anything, can be taken to the extreme.
Data is like crude oil that comes out of the ground. Crude oil is not of much use; however, if the crude
oil is refined, many useful products result, such as medicines, fuel, food products, lubricants, etc. In a
similar sense, statistics can refine data into usable “products” to aid in decision making and to help us
see and understand what is happening.
Statistics is broadly used by just about everyone today; sometimes we just don’t realize it. Things as
simple as using graphs to better understand something are a form of statistics, as are the many opinion
and political polls used today. With easy-to-use software tools to reduce the difficulty and time to do
statistical analyses, knowledge of statistics is becoming a common capability amongst people.
An understanding of Basic Statistics is also one of the differentiating features of Six Sigma, and it would
not be possible without the use of computers and programs like MINITAB™. It has been observed that
the laptop is one of the primary reasons that Six Sigma has become both popular and effective.
[Notation cheat sheet: Greek symbols such as σ denote population parameters, e.g. “the standard
deviation of population data”, defined over each and every individual value.]
Use this as a cheat sheet; don’t bother memorizing all of this. Actually, most of the Greek notation is
for population data.
Population: All the items that have the “property of interest” under study.
[Figure: samples drawn from a population.]
A population parameter is a numerical value that summarizes the data for an entire population, a
sample has a corresponding numerical value called a statistic.
The population is a collection of all the individual data of interest. It must be defined carefully, such as
all the trades completed in 2001. If for some reason there are unique subsets of trades, it may be
appropriate to define those as a unique population, such as “all sub custodial market trades completed
in 2001” or “emerging market trades”.
Sampling frames are complete lists and should be identical to a population with every element
listed only once. It sounds very similar to population… and it is. The difference is how it is used. A
sampling frame, such as the list of registered voters, could be used to represent the population of
adult general public. Maybe there are reasons why this wouldn’t be a good sampling frame.
Perhaps a sampling frame of licensed drivers would be a better frame to represent the general
public.
It is important to recognize the difference between a sample and a population because we typically are
dealing with a sample of what the potential population could be in order to make an inference. The
formulas for describing samples and populations are slightly different. In most cases we will be dealing
with the formulas for samples.
Types of Data
The nature of data is important to understand. Based on the type of data you will have the option
to utilize different analyses.
Data, or numbers, are usually abundant and available to virtually everyone in the organization. Using
data to measure, analyze, improve and control processes forms the foundation of the Six Sigma
methodology. Data turned into information, then transformed into knowledge, lowers the risks of
decisions. Your goal is to make more decisions based on data versus the typical practices of “I think”,
“I feel” and “In my opinion”.
One of your first steps in refining data into information is to recognize the type of data you are using.
There are two primary types of data: Attribute and Variable Data.
Attribute Data is also called qualitative data. Attribute Data is the lowest level of data. It is purely
binary in nature. Good or bad, yes or no type data. No analysis can be performed on Attribute
Data. Attribute Data must be converted to a form of Variable Data called Discrete Data in order to
be counted or be useful.
Discrete Data is information that can be categorized into a classification. Discrete Data is based
on counts. It is typically things counted in whole numbers. Discrete Data is data that can't be
broken down into a smaller unit to add additional meaning. Only a finite number of values is
possible and the values cannot be subdivided meaningfully. For example, there is no such thing as half
of a defect or half of a system lockup.
Continuous Data is information that can be measured on a continuum or scale. Continuous Data,
also called quantitative data can have almost any numeric value and can be meaningfully
subdivided into finer and finer increments, depending upon the precision of the measurement
system. Decimal sub-divisions are meaningful with Continuous Data. As opposed to Attribute
Data like good or bad, off or on, etc., Continuous Data can be recorded at many different points
(length, size, width, time, temperature, cost, etc.). For example 2.543 inches is a meaningful
number, whereas 2.543 defects does not make sense.
Later in the course we will study many different statistical tests but it is first important to
understand what kind of data you have.
Discrete Variables
Shown here are additional Discrete Variables. Can you think of others within your business?
Continuous Variables
Continuous Variable — Possible Values:
• The length of prison time served for individuals convicted of first degree murder — all the real
numbers between a and b, where a is the smallest amount of time served and b is the largest.
• The household income for households with incomes less than or equal to $30,000 — all the real
numbers between a and $30,000, where a is the smallest household income in the population.
• The blood glucose reading for those individuals having glucose readings equal to or greater than
200 — all real numbers between 200 and b, where b is the largest glucose reading in all such
individuals.
Shown here are additional Continuous Variables. Can you think of others within your business?
• Understanding the nature of data and how to represent it can affect the
types of statistical tests possible.
• Interval Scale – data that can be arranged in some order and for which differences in data values
are meaningful. The data can be arranged in an ordering scheme and differences can be interpreted.
• Ratio Scale – data that can be ranked and for which all arithmetic operations including division can
be performed (division by zero is of course excluded). Ratio-level data has an absolute zero, and a
value of zero indicates a complete absence of the characteristic of interest.
Shown here are the four types of scales. It is important to understand these scales as they will dictate
the type of statistical analysis that can be performed on your data.
Nominal Scale

Listed are some examples of Nominal Data. The only analysis possible is whether the values are
different or not.

Qualitative Variable — Possible nominal-level data values:
• Blood Types — A, B, AB, O
• State of Residence — Alabama, …, Wyoming
Ordinal Scale
Interval Scale
Interval Variable — Possible Scores
Ratio Scale
Continuous Data

Continuous Data provides us more opportunity for statistical analyses and is always more desirable.
In many cases Attribute Data can be converted to Continuous Data by converting it to a rate. Which is
more useful?
– 15 scratches, or a total scratch length of 9.25”?
– 22 foreign materials, or 2.5 fm/square inch?
– 200 defects, or 25 defects/hour?
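The count-to-rate conversions above are simple division over an observation window. A sketch using the same numbers; the 8-hour window and the 8.8 square inch area are assumed values chosen so the rates match the examples in the text.

```python
# Sketch: converting Attribute (count) data to Continuous data by expressing
# it as a rate. The observation window and area are illustrative assumptions.

defects = 200
hours_observed = 8.0                     # assumed observation window
defect_rate = defects / hours_observed   # defects per hour

foreign_materials = 22
area_sq_in = 8.8                         # assumed inspected area
fm_density = foreign_materials / area_sq_in  # fm per square inch

print(f"{defect_rate:.0f} defects/hour, {fm_density:.1f} fm/square inch")
```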
Descriptive Statistics
Measures of Location

While the notation is different, there is no mathematical difference between the Mean of a sample and
the Mean of a population.

[MINITAB™ output — Descriptive Statistics: Data
Variable  N    N*  Mean    SE Mean   StDev   Minimum  Q1      Median  Q3      Maximum
Data      200  0   4.9999  0.000712  0.0101  4.9700   4.9900  5.0000  5.0100  5.0200
shown alongside a histogram of the 200 data points.]
The physical center of a data set is the Median, and it is unaffected by large data values. This is why
people use the Median when discussing average salary for an American worker; a few people like Bill
Gates would otherwise distort the average number.

Median is:
• The mid-point, or 50th percentile, of a distribution of data.
• Arrange the data from low to high, or high to low.
– It is the single middle value in the ordered list if there is an odd number of observations.
– It is the average of the two middle values in the ordered list if there is an even number of
observations.

[Histogram (with Normal Curve) of Data and MINITAB™ output: N = 200, Mean = 4.9999,
StDev = 0.0101, Median = 5.0000.]
Data
Trimmed Mean is a:
Compromise between the Mean and Median.
• The trimmed mean is calculated by eliminating a specified percentage of the smallest and largest
observations from the data set and then calculating the average of the remaining observations.
• Useful for data with potential extreme values.

The trimmed Mean (highlighted above) is less susceptible to the effects of extreme scores.
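These measures of location can be computed with Python's standard library. A sketch on an illustrative data set with one extreme value; the 10% trim fraction is an assumption (different tools default to different fractions).

```python
import statistics

# Sketch: Mean, Median and a 10%-trimmed Mean on an illustrative data set.
# The 5.20 value is a deliberate outlier to show the Median's resistance to it.

data = [4.97, 4.98, 4.99, 5.00, 5.00, 5.00, 5.01, 5.01, 5.02, 5.20]

mean = statistics.mean(data)      # pulled upward by the 5.20 outlier
median = statistics.median(data)  # the mid-point: unaffected by the outlier

def trimmed_mean(xs, proportion=0.10):
    """Drop the smallest and largest `proportion` of observations, then average."""
    xs = sorted(xs)
    k = int(len(xs) * proportion)
    return statistics.mean(xs[k:len(xs) - k])

print(mean, median, trimmed_mean(data))
```

Here the trimmed Mean falls between the Mean and the Median, which is exactly the compromise the text describes.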
Mode is:
The most frequently occurring value in a distribution of data.
Mode = 5

[Histogram (with Normal Curve) of Data: Mean 5.000, StDev 0.01007, N 200.]

It is possible to have multiple Modes; when this happens it’s called a Bi-Modal Distribution. Here we
only have one Mode = 5.
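The Mode, and the multi-mode case mentioned above, can be read off with the standard library; the data sets below are illustrative.

```python
import statistics

# Sketch: the Mode is the most frequently occurring value.
data = [4.99, 5.00, 5.00, 5.00, 5.01, 5.01]
print(statistics.mode(data))          # the single most frequent value

# A bi-modal data set: multimode() returns every value tied for most frequent.
bimodal = [1, 1, 1, 2, 3, 3, 3]
print(statistics.multimode(bimodal))  # both modes are reported
```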
Range is the:
Difference between the largest observation and the smallest observation in the data set.
• A small range would indicate a small amount of variability and a large range a large amount of
variability.

The range is typically used for small data sets; it is completely efficient in estimating variation only for a
sample of 2. As your data increases, the Standard Deviation is a more appropriate measure of
variation.
Sample: s = √( Σ(xᵢ − x̄)² / (n − 1) )        Population: σ = √( Σ(xᵢ − μ)² / N )

The Standard Deviation for a sample and for a population can be equated with short- and long-term
variation. Usually a sample is taken over a short period of time, making it free from the types of
variation that can accumulate over time, so be aware.
Variance is the:
Average squared deviation of each individual data point from the
mean.
Sample: s² = Σ(xᵢ − x̄)² / (n − 1)        Population: σ² = Σ(xᵢ − μ)² / N

The Variance is the square of the Standard Deviation. It is common in statistical tests where it is
necessary to add up sources of variation to estimate the total. Standard Deviations cannot be added;
variances can.
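The sample vs. population distinction shows up directly in Python's standard library, which carries separate functions for the (n − 1) and n denominators. A sketch on an illustrative data set:

```python
import statistics

# Sketch: measures of spread. Sample formulas divide by (n - 1);
# population formulas divide by n. The data set is illustrative.

data = [4.98, 4.99, 5.00, 5.01, 5.02]

r = max(data) - min(data)          # Range: largest minus smallest observation
s = statistics.stdev(data)         # sample standard deviation, (n - 1) denominator
sigma = statistics.pstdev(data)    # population standard deviation, n denominator

# Variance is the square of the standard deviation; unlike standard deviations,
# variances from independent sources can be added together.
assert abs(statistics.variance(data) - s ** 2) < 1e-12
assert abs(statistics.pvariance(data) - sigma ** 2) < 1e-12

print(r, s, sigma)
```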
Normal Distribution
We can begin to discuss the Normal Curve and its properties once we understand the basic
concepts of central tendency and dispersion.
As we begin to assess our distributions, know that sometimes it’s actually more difficult to determine
what is affecting a process if it is Normally Distributed. When we have a Non-normal Distribution there
are usually special or more obvious causes of variation that become readily apparent upon process
investigation.
Normal Distribution

Normalizing the Normal Distribution converts the raw scores into standard Z-scores with a Mean of 0
and a Standard Deviation of 1; this practice allows us to use the Z-table.

The area under the curve between any 2 points represents the proportion of the distribution between
those points. The area between the Mean and any other point depends upon the Standard Deviation.
Convert any raw score to a Z-score using the formula: Z = (x − μ) / σ

The area under the curve between any two points represents the proportion of the distribution between
them. The concept of determining the proportion between 2 points under the standard Normal curve is
a critical component of estimating Process Capability and will be covered in detail in that module.
Empirical Rule
No matter what the shape of your distribution is, as you travel 3 Standard
Deviations from the Mean, the probability of occurrence beyond that point
begins to converge to a very low number.
The Anderson-Darling test yields a statistical assessment of Normality (called a goodness-of-fit test),
and the MINITAB™ version of the Normal probability test produces a graph to visually demonstrate just
how good that fit is.

The shape of any normal curve can be calculated based on the normal probability density function.
Tests for Normality basically compare the shape of the calculated curve to the actual distribution of your
data points. For the purposes of this training, we will focus on 2 ways in MINITAB™ to assess Normality:
– The Anderson-Darling test
– Normal probability test
Goodness-of-Fit

[Figure: Cumulative Percent plotted against the Raw Data Scale (3.0 to 5.5), comparing the curve Expected for a Normal Distribution with the Actual Data; the 20% departures are highlighted.]

The Anderson-Darling Goodness-of-Fit test assesses the magnitude of the departure of the actual data from the expected Normal distribution using an Observed minus Expected formula.
Anderson-Darling assesses how closely the actual frequency at a given value corresponds to the theoretical frequency for a Normal Distribution with the same Mean and Standard Deviation.
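As a rough illustration outside of MINITAB™, the same statistic can be computed with SciPy's `anderson` function (an assumed tooling choice, not part of the course); the sample size, Mean and Standard Deviation here mimic the "Amount" example that follows.

```python
# Sketch of an Anderson-Darling Normality check. The statistic measures the
# departure of the observed cumulative distribution from the one expected
# under Normality; smaller is better.
import numpy as np
from scipy import stats

rng = np.random.default_rng(seed=7)
data = rng.normal(loc=84.69, scale=7.913, size=70)  # invented sample

result = stats.anderson(data, dist="norm")
print("AD statistic:", round(result.statistic, 3))

# If the statistic is below the critical value at the 5% level,
# we fail to reject Normality.
idx_5pct = list(result.significance_level).index(5.0)
print("Reject Normality at 5%?", result.statistic > result.critical_values[idx_5pct])
```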
[Figure: Probability Plot of Amount (Normal): Mean 84.69, StDev 7.913, N 70, AD 0.265, P-Value 0.684. Percent (0.1 to 99.9) plotted against Amount (60 to 110).]
The graph shows the probability density of your data plotted against the expected density of a Normal curve. Notice that the y-axis (probability) does not increase linearly. Normal data will lie on a straight line (the red line) in this analysis. The graph shows you which values tend to deviate from the Normal curve.
Descriptive Statistics

Anderson-Darling Caveat

[Figure: Histogram and Normality Plot with Descriptive Statistics: Median 50.006, 3rd Quartile 53.218, Maximum 62.823; 95% Confidence Intervals: Mean 49.596 to 50.466, Median 49.663 to 50.500, StDev 4.662 to 5.278.]
In this case, both the Histogram and the Normality Plot look very “normal”. However,
because the sample size is so large, the Anderson-Darling test is very sensitive and any
slight deviation from normal will cause the p-value to be very low. Again, the topic of sensitivity will be covered in greater detail in the Analyze Phase.
For now, just assume that if N > 100 and the data look normal, then they probably are.
Answers:
1) Is Distribution A Normal? Answer > No
2) Is Distribution B Normal? Answer > No
Introduction to Graphing

Passive data collection means don't mess with the process! We are gathering data and looking for patterns in a graphical tool. If the data is questionable, so is the graph we create from it. For now utilize the data available; we will learn a tool called Measurement System Analysis later in this phase.

The purpose of Graphing is to:
• Identify potential relationships between variables.
• Identify risk in meeting the critical needs of the Customer, Business and People.
• Provide insight into the nature of the X's which may or may not control Y.
• Show the results of passive data collection.

In this section we will cover:
1. Box Plots
2. Scatter Plots
3. Dot Plots
4. Time Series Plots
5. Histograms
Data Sources

Data demographics will come out of the basic Measure Phase tools such as Process Maps, X-Y Diagrams, FMEAs and Fishbones. Put your focus on the top X's from the X-Y Diagram to focus your activities.

Data sources are suggested by many of the tools that have been covered so far:
– Process Map
– X-Y Matrix
– Fishbone Diagrams
– FMEA

Examples are:
1. Time: Shift, Day of the week, Week of the month, Season of the year
3. Operator: Training, Experience, Skill, Adherence to procedures
Graphical Concepts

The Histogram

A Histogram is a basic graphing tool that displays the relative frequency or the number of times a measured item falls within a certain cell size. The values for the measurements are shown on the horizontal axis (in …).

A Histogram displays data that have been summarized into intervals. It can be used to assess the symmetry or skewness of the data.

[Figure: Histogram of Histogram.]
Histogram Caveat

As you can see in the MINITAB™ file, the columns used to generate the Histograms above only have 20 data points. It is easy to create random samples for a Histogram simply by using "Sample From Columns…".

All the Histograms below were generated using random samples of the data from the worksheet "Graphing Data.mtw".

[Figure: Histogram of H1_20, H2_20, H3_20, H4_20: four panels of Frequency over the range 98 to 102.]

Be careful not to determine Normality simply from a Histogram plot; if the sample size is low the data may not look very Normal.
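The small-sample caveat is easy to demonstrate. This Python sketch (an illustration, not the MINITAB™ exercise) draws four random samples of 20 points from the same Normal distribution, reusing the panel names H1_20..H4_20 from the worksheet as labels, and bins them the way the panels above do.

```python
# Sketch of the Histogram caveat: random Normal samples of only 20 points
# often produce ragged, different-looking histograms, so shape alone is a
# poor Normality check at low sample sizes.
import numpy as np

rng = np.random.default_rng(seed=3)
bins = np.arange(98, 102.5, 0.5)  # similar scale to the H1_20..H4_20 panels

for name in ("H1_20", "H2_20", "H3_20", "H4_20"):
    sample = rng.normal(loc=100, scale=1, size=20)
    counts, _ = np.histogram(sample, bins=bins)
    print(name, counts)
# Each run of 20 points yields a noticeably different, lumpy shape.
```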
Variation on a Histogram
The
Histogram Using the worksheet “ Graphing Data.mtw” create a simple Histogram for
the data column called granular.
shown
here looks
to be very
Normal. Histogram of Granular
25
20
15
Frequency
10
0
44 46 48 50 52 54 56
Granular
Dot Plot

Using the worksheet "Graphing Data.mtw", create a Dot Plot. The Dot Plot can be a useful alternative to the Histogram, especially if you want to see individual values or you want to brush the data.

The Histogram for the Granular distribution obscures the granularity, whereas the Dot Plot reveals it. Also, Dot Plots allow the user to brush data points; the Histogram does not.

[Figure: Dotplot of Granular.]

If in fact there are special causes (Uncontrollable Noise or Procedural non-compliance) then they should be addressed separately and then excluded from this analysis.

Take a few minutes and create other Dot Plots using the columns in this data set.
Box Plot

A Box Plot (sometimes called a Whisker Plot) is made up of a box representing the central mass of the variation and thin lines, called whiskers, extending out on either side representing the thinning tails of the distribution. The first whisker represents the first 25% of the data in the Histogram (the light grey area). The second and third quartiles form the box, which represents fifty percent of the data, and finally the whisker on the right represents the fourth quartile. The line drawn through the box represents the Median of the data.

Box Plots summarize data about the shape, dispersion and center of the data and also help spot outliers. Box Plots require that one of the variables, X or Y, be categorical or discrete and the other be continuous. A minimum of 10 observations should be included in generating the Box Plot.

[Figure: Box Plot anatomy: Maximum Value, Upper Whisker, Q3: 75th Percentile, Q2: Median (50th Percentile), Q1: 25th Percentile, Lower Whisker; Upper Limit: Q3 + 1.5(Q3 − Q1); Lower Limit: Q1 − 1.5(Q3 − Q1).]

Extreme values, or outliers, are represented by asterisks. A value is considered an outlier if it is outside of the box (greater than Q3 or less than Q1) by more than 1.5 times (Q3 − Q1).

You can use the Box Plot to assess the symmetry of the data: if the data are fairly symmetric, the Median line will be roughly in the middle of the box and the whiskers will be similar in length. If the data are skewed, the Median may not fall in the middle of the box and one whisker will likely be noticeably longer than the other.
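The quartile and outlier arithmetic behind the Box Plot can be sketched directly. This Python illustration uses an invented data set; note that NumPy's default quartile interpolation may differ slightly from MINITAB™'s.

```python
# Sketch of Box Plot anatomy: quartiles, the 1.5 x IQR outlier fences, and
# the values that would plot as asterisks.
import numpy as np

data = np.array([43, 45, 46, 47, 48, 49, 50, 50, 51, 52, 53, 54, 56, 72.0])

q1, q2, q3 = np.percentile(data, [25, 50, 75])
iqr = q3 - q1                      # interquartile range, the height of the box
lower_fence = q1 - 1.5 * iqr       # Lower Limit: Q1 - 1.5(Q3 - Q1)
upper_fence = q3 + 1.5 * iqr       # Upper Limit: Q3 + 1.5(Q3 - Q1)
outliers = data[(data < lower_fence) | (data > upper_fence)]

print("Q1, Median, Q3:", q1, q2, q3)
print("Outliers (would plot as asterisks):", outliers)  # → [72.]
```

Here 72.0 falls above the upper fence, so it would be drawn as an asterisk beyond the whisker.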
[Figure: Box Plot of cholesterol by Day: 2-Day, 4-Day, 14-Day, with values roughly 100 to 300.]

Open the MINITAB™ Project "Measure Data Sets.mpj" and select the worksheet "Graphing Data.mtw".

The individual value plot shows the individual data points that are represented in the Box Plot.
Create the plot by following the menu path "Graph> Individual Value Plot> Multiple Y's, Simple…".

[Figure: Individual Value Plot: Data (5.0 to 12.5) for Brian, Greg, Shree.]
If the output is pass/fail, it must be plotted on the y-axis. Use the data shown to create the transposed Box Plot. The reason we do this is for consistency and accuracy.

The output Y is Pass/Fail; the Box Plot shows the spread of hydrogen content that created the results.

[Figure: Transposed Box Plot: Pass/Fail (1, 2) on the y-axis against Hydrogen Content (215.0 to 232.5).]
[Figure: Individual Value Plots of the Weibull, Normal and Bi Modal columns, Data 0 to 25, shown before and after jitter.]
Jitter Example

By using the Jitter function we will spread the data apart, making it easier to see how many data points there are.

Once your graph is created, click once on any of the data points (that action should select all the data points). Then go to MINITAB™ menu path: Editor> Edit Individual Symbols… Jitter… Increase the jitter in the x-direction to .075, click OK, then click anywhere on the graph except on the data points to see the results of the change.

[Figure: Individual Value Plot of Weibull, Normal, Bi Modal: Data 0 to 30 with jitter applied.]
Using the MINITAB™ worksheet "Graphing Data.mtw": Time Series Plots allow you to examine data over time. Depending on the shape and frequency of patterns in the plot, several X's can be found as critical or eliminated.

A Time Series Plot is created by following the MINITAB™ menu path "Graph> Time Series Plot> Simple...". It plots each data point as it is gathered over time. Some interesting occurrences can be revealed.

[Figure: Time Series Plot of Time 1: values roughly 597 to 602 over Index 1 to 100.]

MINITAB™ allows you to add a smoothed line to your time series based on a smoothing technique called Lowess.

[Figure: Time Series Plot of Time 3: values roughly 596 to 605 over Index 1 to 100.]
Notes
Measure Phase
Measurement System Analysis
Now we will continue in the Measure Phase with “Measurements System Analysis”.
Overview

Measurement System Analysis is one of those non-negotiable items! MSA is applicable in 98% of projects and it alone can have a massive effect on the success of your project and improvements within the company. In other words, LEARN IT & DO IT. It is very important.

Welcome to Measure
Process Discovery
Six Sigma Statistics
Measurement System Analysis
– Basics of MSA
– Variables MSA
– Attribute MSA
Process Capability
Wrap Up & Action Items
Introduction to MSA
So far we have learned that the heart and soul of Six Sigma is
that it is a data-driven methodology.
– How do you know that the data you have used is accurate and precise?
– How do you know if a measurement is repeatable and reproducible?

In order to improve your processes, it is necessary to collect data on the "critical to" characteristics. When there is variation in this data, it can be attributed either to the characteristic that is being measured or to the way that measurements are being taken; the latter is known as measurement error. When there is a large measurement error, it affects the data and may lead to inaccurate decision-making.
Measurement error is defined as the effect of all sources of measurement variability that cause an
observed value (measured value) to deviate from the true value.
There are several types of measurement error which affect the location and the spread of the
distribution. Accuracy, linearity and stability affect location (the average). Measurement accuracy
describes the difference between the observed average and the true average based on a master
reference value for the measurements. A linearity problem describes a change in accuracy
through the expected operating range of the measuring instrument. A stability problem suggests
that there is a lack of consistency in the measurement over time. Precision is the variability in the
measured value and is quantified like all variation by using the standard deviation of the
distribution of measurements. For estimating accuracy and precision, multiple measurements of
one single characteristic must be taken.
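The accuracy and precision estimates described above can be sketched from repeated measurements against a known reference. The numbers in this Python illustration are invented.

```python
# Sketch: estimating accuracy (bias) and precision from multiple
# measurements of one single characteristic against a master reference.
import statistics

reference = 10.00              # known master/standard value
measurements = [10.02, 9.98, 10.05, 10.01, 9.97, 10.03, 10.00, 10.04]

bias = statistics.mean(measurements) - reference   # accuracy: observed average vs. true
precision = statistics.stdev(measurements)         # precision: spread of the repeats

print(f"Bias: {bias:+.4f}")                # → +0.0125
print(f"Precision (StDev): {precision:.4f}")
```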
The primary contributors to measurement system error are repeatability and reproducibility.
Repeatability is the variation in measurements obtained by one individual measuring the same
characteristic on the same item with the same measuring instrument. Reproducibility refers to
the variation in the average of measurements of an identical characteristic taken by different
individuals using the same instrument.
Given that Reproducibility and Repeatability are important types of error, they are the object of a
specific study called a Gage Repeatability & Reproducibility study (Gage R&R). This study can be
performed on either attribute-based or variable-based measurement systems. It enables an
evaluation of the consistency in measurements among individuals after having at least two
individuals measure several parts at random on a few trials. If there are inconsistencies, then the
measurement system must be improved.
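A minimal numeric sketch of the two error types, with invented data for one part measured three times by each of two operators (a real Gage R&R uses several parts, operators and trials, and an ANOVA model):

```python
# Sketch separating Repeatability (within-operator variation) from
# Reproducibility (operator-to-operator variation in the averages).
import statistics

trials = {
    "Operator A": [5.01, 5.03, 4.99],
    "Operator B": [5.10, 5.12, 5.08],
}

# Repeatability: pooled within-operator standard deviation
within_vars = [statistics.variance(v) for v in trials.values()]
repeatability_sd = (sum(within_vars) / len(within_vars)) ** 0.5

# Reproducibility: spread of the operator averages
operator_means = [statistics.mean(v) for v in trials.values()]
reproducibility_sd = statistics.stdev(operator_means)

print("Repeatability SD: ", round(repeatability_sd, 4))
print("Reproducibility SD:", round(reproducibility_sd, 4))
```

Here each operator repeats tightly (small repeatability) but their averages disagree, so reproducibility dominates: the measurement system, not the part, differs between people.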
Measurement Purpose

Measurement is a process within itself. In order to measure something you must go through a series of tasks and activities in sequence. Usually there is some form of set-up, there is an instrument that makes the measurement, there is a way of recording the value and it may be done by multiple people. Even when you are making a judgment call about something, there is some form of setup. You become the instrument and the result of a decision is recorded some way, even if it is verbal or it is a set of actions that you take.

In order to be worth collecting, measurements must provide value - that is, they must provide us with information and ultimately, knowledge.

The question… What do I need to know? …must be answered before we begin to consider issues of measurements, metrics, statistics, or data collection systems.

Too often, organizations build complex data collection and information management systems without truly understanding how the data collected and metrics calculated actually benefit the organization.

The types and sophistication of measurement vary almost infinitely. It is becoming increasingly popular or cost effective to have computerized measurement systems. The quality of measurements also varies significantly - with those taken by computer tending to be the best. In some cases the quality of measurement is so bad that you would be just as well off to guess at what the outcome should be. You will be primarily concerned with the accuracy, precision and reproducibility of measurements to determine the usability of the data.
Purpose
The purpose of
conducting an MSA is The purpose of MSA is to assess the error due to
to mathematically measurement systems.
partition sources of
The error can be partitioned into specific sources:
variation within the
measurement system – Precision
itself. This allows us • Repeatability - within an operator or piece of equipment
to create an action • Reproducibility - operator to operator or attribute gage to
plan to reduce the attribute gage
biggest contributors of – Accuracy
measurement error. • Stability - accuracy over time
• Linearity - accuracy throughout the measurement range
• Resolution
• Bias – Off-set from true value
– Constant Bias
– Variable Bias – typically seen with electronic
equipment, amount of Bias changes with setting
levels
Measurement systems, like all things, generate some amount of variation in the results/data they output. In measuring, we are primarily concerned with 3 characteristics:

1. How accurate is the measurement? For a repeated measurement, where is the average compared to some known standard? Think of the target as the measurement system; the known standard is the bulls-eye in the center of the target. In the first example you can see the "measurements" are very dispersed; there is a lot of variability, as indicated by the Histogram curve at the bottom. But on average, the "measurements" are on target. When the average is on target, we say the measurement is accurate. However, in this example they are not very precise.

[Figure: Two targets. "Accurate but not precise": on average, the shots are in the center of the target but there is a lot of variability. "Precise but not accurate": the average is not on the center, but the variability is small.]
3. The third characteristic is how reproducible is the measurement from one individual to another? What is the accuracy and precision from person to person? Here you would expect each person that performs the measurement to be able to reproduce the same amount of accuracy and precision as that of another person performing the same measurement.
Ultimately, we make decisions based on data collected from measurement systems. If the
measurement system does not generate accurate or precise enough data, we will make the decisions
that generate errors, waste and cost. When solving a problem or optimizing a process, we must know
how good our data are and the only way to do this is to perform a Measurement System Analysis.
MSA Uses
MSA can be used to:
The measurement system always has some amount of variation and that variation is additive to
the actual amount of true variation that exists in what we are measuring. The only exception is
when the discrimination of the measurement system is so poor that it virtually sees everything the
same.
This means that you may actually be producing a better product or service than you think you are,
providing that the measurement system is accurate; meaning it does not have a bias, linearity or
stability problem. It may also mean that your customer may be making the wrong interpretations
about your product or service.
The components of variation are statistically additive. The primary contributors to measurement
system error are Repeatability and Reproducibility. Repeatability is the variation in measurements
obtained by one individual measuring the same characteristic on the same item with the same
measuring instrument. Reproducibility refers to the variation in the average of measurements of an
identical characteristic taken by different individuals using the same instrument.
Why MSA?

Why is MSA so important? MSA is what allows us to trust the data generated from our processes. When you charter a project you are taking on a significant burden which will require Statistical Analysis. What happens if you have a great project, with lots of data from measurement systems that produce data with no integrity?

Measurement System Analysis is important to:
• Study the % of variation in our process that is caused by our measurement system.
• Compare measurements between operators.
• Compare measurements between two (or more) measurement devices.
• Provide criteria to accept new measurement systems (consider new equipment).
• Evaluate a suspect gage.
• Evaluate a gage before and after repair.
• Determine true process variation.
• Evaluate effectiveness of training program.
Appropriate Measures

Sufficient means that measures are available to be measured regularly; if not, it would take too long to gather data. Relevant means that they will help to understand and isolate the problems. Representative measures mean that we can detect variation across shifts and people. Contextual means they are gathered along with other relevant information that would actually help to explain sources of variation.

Appropriate Measures are:
• Sufficient – available to be measured regularly
• Relevant – help to understand/isolate the problems
• Representative – of the process across shifts and people
• Contextual – collected with other relevant information that might explain process variability
Poor Measures

It is very common while working projects to discover that the current measurement systems are poor. Have you ever come across a situation where the data from your customer or supplier doesn't match yours? It happens often. It is likely a problem with one of the measurement systems. We have worked MSA projects across critical measurement points in various companies; it is not uncommon for more than 80% of the measurements to fail in one way or another.

Poor Measures can result from:
• Poor or non-existent operational definitions
• Difficult measures
• Poor sampling
• Lack of understanding of the definitions
• Inaccurate, insufficient or non-calibrated measurement devices

Measurement Error compromises decisions that affect:
– Customers
– Producers
– Suppliers

MSA is a Show Stopper!!!
Components of Variation

Precision: Repeatability, Reproducibility. Accuracy: Stability, Bias, Linearity.
All measurement systems have error. If you don’t know how much of the
variation you observe is contributed by your measurement system, you
cannot make confident decisions.
We are going to strive to have the measured variation be as close as possible to the true variation. In any case we want the variation from the measurement system to be as small as possible. We are now going to investigate the various components of variation of measurements.
Precision

The spread of the data is measured by Precision. This tells us how well a measure can be repeated and reproduced.

A precise metric is one that returns the same value of a given attribute every time an estimate is made. Precise data are independent of who estimates them or when the estimate is made.
Repeatability
Measurements will be Repea ta bility is the variation in measurements obtained with one
different…expect it! If mea surement instrument used several times by one appraiser
measurement are while measuring the identical characteristic on the sa m e pa rt.
always exactly the
same this is a flag,
sometimes it is Y
because the gauge
does not have the
proper resolution,
meaning the scale
doesn’t go down far Repeatability
enough to get any For example:
variation in the – Manufacturing: One person measures the purity of multiple samples
measurement. of the same vial and gets different purity measures.
– Transactional: One person evaluates a contract multiple times (over a
For example, would
period of time) and makes different determinations of errors.
you use a football field
to measure the gap in a
spark plug?
Reproducibility

Reproducibility will be present when it is possible to have more than one operator or more than one instrument measure the same part.

Reproducibility is the variation in the average of the measurements made by different appraisers using the same measuring instrument when measuring the identical characteristic on the same part.

For example:
– Manufacturing: Different people perform a purity test on samples from the same vial and get different results.
– Transactional: Different people evaluate the same contract and make different determinations.
1. Pair up with an associate.
2. One person will say start and stop to indicate how long they think 10 seconds lasts. Do this 6 times.
3. The other person will have a watch with a second hand to actually measure the duration of the estimate. Record the value where your partner can't see it.
4. Switch tasks with your partner and do it 6 times also.
5. Record all estimates. What do you notice?
Accuracy

Accuracy and the average are related. Recall in the Basic Statistics module we talked about the Mean and the variance of a distribution. Think of it this way: if the Measurement System is the distribution, then accuracy is the Mean and precision is the variance.

Accuracy is the difference between the observed average of the measurement and a reference value.
– When a metric or measurement system consistently over or under estimates the value of an attribute, it is said to be "inaccurate".

Accuracy can be assessed in several ways:
– Measurement of a known standard
– Comparison with another known measurement method
– Prediction of a theoretical value

What happens if we don't have standards, comparisons or theories? Warning: do not assume your metrology reference is gospel.

[Figure: True Average vs. the Measurement distribution; the gap between them is the Accuracy.]

However, before you invest a lot of time analyzing the data, you must ensure the data has integrity.
– The analysis should include a comparison with known reference points.
– For the example of product returns, the transaction details should add up to the same number that appears on financial reports, such as the income statement.
[Figure: ACCURATE + PRECISE = BOTH.]
Bias

Bias is a component of Accuracy. Constant Bias is when the measurement is off by a constant value. A scale is a perfect example: if the scale reads 3 lbs when there is no weight on it, then there is a 3 lb Bias. Make sense?
Stability

Linearity

[Figure: Bias (y) plotted against Reference Value (x), fitted with y = a + b·x, where y is the Bias, x is the Reference Value, a is the Intercept and b is the Slope.]

Linearity just evaluates whether any Bias is consistent throughout the measurement range of the instrument. Many times Linearity indicates a need to replace or perform maintenance on measurement equipment.
Types of MSA's

Variable Data is always preferred over Attribute because it gives us more to work with. Now we are going to review Variable MSA testing.

MSA's fall into two categories: Attribute and Variable.

Attribute:
– Pass/Fail
– Go/No Go
– Document Preparation
– Surface imperfections
– Customer Service Response

Variable:
– Continuous scale
– Discrete scale
– Critical dimensions
– Pull strength
– Warp
Variable MSA's

MSA's use a random effects model, meaning that the levels for the variance components are not fixed or assigned; they are assumed to be random.

MINITAB™ calculates a column of variance components (VarComp) which are used to calculate % Gage R&R using the ANOVA Method.

[Figure: Measured Value vs. True Value.]

Estimates for a Gage R&R study are obtained by calculating the variance components for each term and for error. Repeatability, Operator and Operator*Part components are summed to obtain a total variability due to the measuring system.

We use variance components to assess the variation contributed by each source of measurement error relative to the total variation.
% Contribution is the contribution of each source of variation to the total variation of the study. % Contribution, based on variance components, is calculated by dividing each value in VarComp by the Total Variation then multiplying the result by 100.

Use % Study Var when you are interested in comparing the measurement system variation to the total variation. % Study Var is calculated by dividing each value in Study Var by Total Variation and multiplying by 100. Study Var is calculated as 5.15 times the Standard Deviation for each source. (5.15 is used because when data are normally distributed, 99% of the data fall within 5.15 Standard Deviations.)
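The arithmetic behind these two report columns can be sketched from the variance components. The VarComp values in this Python illustration are invented, not taken from the course's MINITAB™ output.

```python
# Sketch of how %Contribution and %Study Var are derived from variance
# components. Total Variation = Gage R&R variance + Part-to-Part variance.
var_comp = {
    "Total Gage R&R": 0.0044,
    "Part-to-Part": 0.0356,
}
total_variation = sum(var_comp.values())

for source, vc in var_comp.items():
    pct_contribution = 100.0 * vc / total_variation          # ratio of variances
    study_var = 5.15 * vc ** 0.5                             # 5.15 StDevs span ~99% of a Normal
    pct_study_var = 100.0 * vc ** 0.5 / total_variation ** 0.5  # ratio of StDevs
    print(f"{source:15s} %Contribution={pct_contribution:5.1f}  "
          f"StudyVar={study_var:.4f}  %StudyVar={pct_study_var:5.1f}")
```

Note that %Contribution uses variances (so the sources add to 100%) while %Study Var uses standard deviations (so they do not).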
When the process tolerance is entered in the system, MINITAB™ calculates % Tolerance, which compares measurement system variation to the customer specification. This allows us to determine the proportion of the process tolerance that is used by the variation in the measurement system.
Recommended: 5 or more Categories.

AIAG Standards for Gage Acceptance
Gage R&R Repeat Reprod Part to Part
Part-to-Part Part 1 2 3 4 5 6 7 8 9 10
UCL=0.005936
measurement
measurementsystem systeminto intospecific
specificsources.
sources. Each
Eachcluster
cluster
0.005
ofofbars
bars represents a source of variation. Bydefault,
0.625 represents a source of variation. By default,
R=0.001817
each
each cluster will have two bars, corresponding to
0.000 LCL=0 0.620 cluster will have two bars, corresponding to
0 %Contribution
%Contribution
Operator 1 and
and%StudyVar.
%StudyVar.
2 3 If you add a tolerance
If you add a tolerance
Xbar Chart by Operator and/
and/ ororhistorical sigma,
Operator*Part
historical sigma, bars
Interaction
bars for
for %% Tolerance
Toleranceand/
Operator and/oror
0.632 1 2 3
0.631
UCL=0.6316
%Process
0.631
%Process
0.630 are
areadded.
added.
1
Mean
0.630 2
0.629
age
0.629 3
Sample M
Mean=0 6282
Mean=0.6282 0 628
0.628
Avera
0 628
0.628
0.627
0.626
InInaa good
goodmeasurement
0.627
0.626 measurementsystem,
system,thethelargest
largestcomponent
component
0.625
0.624
LCL=0.6248
ofofvariation
variation is Part-to-Part variation. Ifinstead
0.625
0.624
is Part-to-Part variation. If insteadyou
youhave
have
0
large
largeamounts
Part
amountsofofvariation
1 2 3 4
variationattributed
5 6 7 8
attributedtotoGage
9 10
GageR&R,
R&R,then
then
corrective
correctiveaction
actionisisneeded.
needed.
%Tolerance
Percen
50 0.625
0.620
0
MIN ITABTMTMprovides an R Chart and Xbar Chart by Operator.
Gage R&R Repeat Reprod Part-to-Part MIN ITAB
Part 1 2 provides
3 4 5 an 6 R7 Chart
8 9 and
10 Xbar Chart by Operator.
The
TheRRchart
chartconsists
consistsofofthe
thefollowing:
following:
R Chart by Operator By Operator
0.010 1 2 3
- The plotted points are the difference between the largest
0.630
- The plotted points are the difference between the largest
Sample Range
UCL=0.005936 and
andsmallest
smallestmeasurements
measurementson oneach
eachpart
partfor
foreach
eachoperator.
operator.
0.005
If the measurements are the same then the range = 0.
0.625
If the measurements are the same then the range = 0.
R=0.001817 - The Center Line, is the grand average for the process.
- The Center Line, is the grand average for the process.
0.000 LCL=0 - -The
0.620 TheControl
ControlLimits
Limitsrepresent
representthetheamount
amountofofvariation
variation
0 expected
Operator 1 for the subgroup
2 ranges
ranges. 3These limits are calculated
expected for the subgroup ranges. These limits are calculated
Xbar Chart by Operator using the variation within subgroups.
using the Operator*Part Interaction
variation within subgroups. Operator
0.632 1 2 3
UCL=0.6316 0.631 1
0.631
If any of the points on the graph go above 2the upper Control
0.630
Sample Mean
0.630 If any of the points on the graph go above3 the upper Control
0.629
The Xbar Chart compares the part-to-part variation to repeatability. The Xbar chart consists of the following:

[Xbar Chart by Operator: UCL = 0.6316, Mean = 0.6282, LCL = 0.6248]

The chart should ideally show lack-of-control. Lack-of-control exists when many points are above the Upper Control Limit and/or below the Lower Control Limit.

In this case there are only a few points out of control, which indicates the measurement system is inadequate.
Ideally, the lines will follow the same pattern and the part averages will vary enough that differences between parts are clear.

Pattern: Lines are virtually identical. Means: Operators are measuring the parts the same.
Pattern: Lines are not parallel or they cross. Means: The operator's ability to measure a part depends on which part is being measured.
Practical Conclusions
For this example, the measuring system contributes a great deal to the overall variation,
as confirmed by both the Gage R&R table and graphs.
The variation due to the measurement system, as a percent of study variation, is causing 92.21% of the variation seen in the process.
By AIAG Standards this gage should not be used. By all standards, the
data being produced by this gage is not valid for analysis.
[Acceptance table: % Tolerance or % Contribution and % Study Variance thresholds indicating whether the measurement system is acceptable]
Design Types

Crossed Designs are the workhorse of MSA. They are the most commonly used design in industries where it is possible to measure something more than once. Chemical and biological systems can use Crossed Designs as long as you can assume that the samples used come from a homogeneous solution and there is no reason they can be different.

Crossed Design
• A crossed design is used only in non-destructive testing and assumes that all the parts can be measured multiple times by either operators or multiple machines.
– Gives the ability to separate part-to-part variation from measurement system variation.
– Assesses repeatability and reproducibility.
– Assesses the interaction between the operator and the part.

Nested Design
• A nested design is used for destructive testing (we will learn about this in MBB training) and also situations where it is not possible to have all operators or machines measure all the parts multiple times.
– Destructive testing assumes that all the parts within a single batch are identical enough to claim they are the same.
– Nested designs are used to test measurement systems where it is not possible (or desirable) to send operators with parts to different locations.
– Do not include all possible combinations of factors.
– Uses a slightly different mathematical model than the crossed design.
Nested Designs must be used for destructive testing. In a Nested Design, each part is measured by only one operator. This is due to the fact that after destructive testing, the measured characteristic is different after the measurement process than it was at the beginning. Crash testing is an example of destructive testing.

If you need to use destructive testing, you must be able to assume that all parts within a single batch are identical enough to claim that they are the same part. If you are unable to make that assumption then part-to-part variation within a batch will mask the measurement system variation.

If you can make that assumption, then choosing between a Crossed or Nested Gage R&R Study for destructive testing depends on how your measurement process is set up. If all operators measure parts from each batch, then use Gage R&R Study (Crossed). If each batch is only measured by a single operator, then you must use Gage R&R Study (Nested). In fact, whenever operators measure unique parts, you have a Nested Design. Your Master Black Belt can assist you with the set-up of your design.
A Gage R&R, like any study, requires careful planning. The common way of doing an Attribute Gage R&R consists of having at least two people measure 20 parts at random, twice each. This will enable you to determine how consistently these people evaluate a set of samples against a known standard. If there is no consistency among the people, then the measurement system must be improved, either by defining a measurement method, training, etc. You use an Excel spreadsheet template to record your study and then to perform the calculations for the result of the study.

Gage R&R Study
– Is a set of trials conducted to assess the repeatability and reproducibility of the measurement system.
– Multiple people measure the same characteristic of the same set of multiple units multiple times (a crossed study).
– Example: 10 units are measured by 3 people. These units are then randomized and a second measure on each unit is taken.

A Blind Study is extremely desirable.
– Best scenario: operator does not know the measurement is a part of a test.
– At minimum: operators should not know which of the test parts they are currently measuring.

NO, not that kind of R&R!
The next few slides show how to create a data collection table in MINITAB™. You can also use Excel.
Here is the completed table. The trial column will not be used for the analysis and can actually be deleted.

Variables:
– Part
– Operator
– Response
Gage R&R
Graphical Output

Looking at the “Components of Variation” chart, the Part-to-Part Variation needs to be larger than Gage Variation.

If in the “Components of Variation” chart the “Gage R&R” bars are larger than the “Part-to-Part” bars, then all your measurement variation is in the measuring tool, i.e. maybe the gage needs to be replaced. The same concept applies to the “Response by Operator” chart. If there is extreme variation within operators, then the training of the operators is suspect.
[Chart callouts: Part-to-Part Variation needs to be larger than Gage Variation; Operator Error]
Session Window
This output tells us that the part to part variation exceeds the allowable tolerance. This gage is
acceptable.
Signal Averaging
Suppose the Standard Deviation for one part measured by one person
many times is 9.5.
Here we have a problem with Repeatability, not Reproducibility so we calculate what the Standard
Deviation should be in order to meet our desire of a 15% gage.
We are assuming that 15% will be acceptable for the short term until an appropriate fix can be
implemented. The 9.5 represents our estimate for Standard Deviation of population of Repeatability.
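The signal-averaging arithmetic can be sketched as follows: averaging n independent readings cuts the repeatability Standard Deviation by a factor of the square root of n, so the number of repeats needed is the squared ratio of current to target SD. The target value below is hypothetical for illustration; the slide's actual tolerance-based 15% target is not reproduced here.

```python
import math

def averages_needed(sigma_single: float, sigma_target: float) -> int:
    """Repeat measurements to average so that the standard deviation
    of the averaged reading, sigma_single / sqrt(n), meets sigma_target."""
    return math.ceil((sigma_single / sigma_target) ** 2)

sigma_single = 9.5   # repeatability SD of a single reading, from the example
sigma_target = 4.75  # hypothetical target SD, for illustration only

print(averages_needed(sigma_single, sigma_target))  # 4: averaging 4 readings halves the SD
```

The square-root relationship is why signal averaging is only a short-term fix: cutting the SD by a factor of 10 would require 100 repeat measurements per part.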
Attribute MSA

An Attribute MSA is similar in many ways to the continuous MSA, including the purposes. Do you have any visual inspections in your processes? In your experience how effective have they been?

When a Continuous MSA is not possible an Attribute MSA can be performed to evaluate the quality of the data being reported from the process.

Why not? Does everyone know what an “F” (defect) looks like? Was the lighting good in the room? Was it quiet so you could concentrate? Was the writing clear? Was 60 seconds long enough?

This is the nature of visual inspections! How many places in your process do you have visual inspection? How good do you expect them to be?
SCORING REPORT
DATE: 5/10/2006   NAME: Joe Smith   PRODUCT: My Gadget   BUSINESS: Unit 1
Attribute Legend (used in computations): 1 = pass, 2 = fail

Sample # | Known Attribute | Op1 Try1/Try2 | Op2 Try1/Try2 | Op3 Try1/Try2 | All operators agree within and between (Y/N) | All operators agree with standard (Y/N)
1  | pass | pass/pass | pass/pass | fail/fail | N | N
2  | pass | pass/pass | pass/pass | fail/fail | N | N
3  | fail | fail/fail | fail/pass | fail/fail | N | N
4  | fail | fail/fail | fail/fail | fail/fail | Y | Y
5  | fail | fail/fail | pass/fail | fail/fail | N | N
6  | pass | pass/pass | pass/pass | pass/pass | Y | Y
7  | pass | fail/fail | fail/fail | fail/fail | Y | N
8  | pass | pass/pass | pass/pass | pass/pass | Y | Y
9  | fail | pass/pass | pass/pass | pass/pass | Y | N
10 | fail | pass/pass | fail/fail | fail/fail | N | N
11 | pass | pass/pass | pass/pass | pass/pass | Y | Y
12 | pass | pass/pass | pass/pass | pass/pass | Y | Y
In order to conduct an Attribute Gage R&R first select a set of samples. These samples should be
a mix of clearly Good/Pass, clearly Bad/Fail and Marginal so we can test an operator’s ability
across different types of attributes.
For each sample an attribute or true status of the part should be documented by an expert or team
of experts; these people have to be different than the operators who will do the study. Each operator should assign a Pass or Fail to each part on two or three separate occasions.
The requirements for any sort of confidence with Attribute Data are big. Start with 50 samples; that should give you enough data. Using fewer will realistically just make things worse.
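The two agreement columns in the scoring report can be recomputed directly. This is a sketch in Python; the sample data is transcribed from the table above and the helper name `scores` is illustrative.

```python
# Sample data transcribed from the scoring report: the known attribute,
# then (Try 1, Try 2) for Operators 1, 2 and 3.
samples = [
    ("pass", [("pass", "pass"), ("pass", "pass"), ("fail", "fail")]),   # 1
    ("pass", [("pass", "pass"), ("pass", "pass"), ("fail", "fail")]),   # 2
    ("fail", [("fail", "fail"), ("fail", "pass"), ("fail", "fail")]),   # 3
    ("fail", [("fail", "fail"), ("fail", "fail"), ("fail", "fail")]),   # 4
    ("fail", [("fail", "fail"), ("pass", "fail"), ("fail", "fail")]),   # 5
    ("pass", [("pass", "pass"), ("pass", "pass"), ("pass", "pass")]),   # 6
    ("pass", [("fail", "fail"), ("fail", "fail"), ("fail", "fail")]),   # 7
    ("pass", [("pass", "pass"), ("pass", "pass"), ("pass", "pass")]),   # 8
    ("fail", [("pass", "pass"), ("pass", "pass"), ("pass", "pass")]),   # 9
    ("fail", [("pass", "pass"), ("fail", "fail"), ("fail", "fail")]),   # 10
    ("pass", [("pass", "pass"), ("pass", "pass"), ("pass", "pass")]),   # 11
    ("pass", [("pass", "pass"), ("pass", "pass"), ("pass", "pass")]),   # 12
]

def scores(samples):
    """Fraction of samples where all trials of all operators agree,
    and where they additionally agree with the known standard."""
    between = sum(1 for _, ops in samples
                  if len({t for trials in ops for t in trials}) == 1)
    standard = sum(1 for truth, ops in samples
                   if {t for trials in ops for t in trials} == {truth})
    n = len(samples)
    return between / n, standard / n

between, standard = scores(samples)
print(f"{between:.1%} agree within and between; {standard:.1%} agree with standard")
```

Running this reproduces the Y/N columns in aggregate: 7 of 12 samples (58.3%) have all operators agreeing with each other, and only 5 of 12 (41.7%) also agree with the standard.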
Repeatability and Reproducibility

[Graphic: appraiser scores plotted against the ACTUAL RANGE]

The green triangle represents the actual score of the appraiser. The range between the red squares is the Confidence Interval, which is a function of the operator's score and the size of the sample they have inspected.
Statistical Report
M&M Exercise

• Pick 50 M&Ms out of a package.
• Enter results into either the Excel template or MINITAB™ and draw conclusions.
• The instructor will represent the customer for the attribute score.

[Sample photos: 2 M&M Fail, 3 M&M Pass]
To complete this study you will need a bag of M&Ms containing 50 or more “pieces” and the Attribute Value for each piece, which means the “True” value for each piece. In addition to being the facilitator of this study you will also serve as the customer, so you will have the say as to whether the piece is actually a Pass or Fail piece. Determine this before the inspectors review the pieces. You will need to construct a sheet as shown here to keep track of the “pieces” or “parts”, in our case M&Ms; it is important to be well organized during these activities. Then the inspectors will individually judge each piece based on the customer specifications of a bright and shiny M&M with nice M's.
Notes
Measure Phase
Process Capability
Overview

• Continuous Capability
• Concept of Stability
• Attribute Capability
• Wrap Up & Action Items
Process Capability: This is the definition of Process Capability. We will now begin to learn how to assess it.
Capability Analysis provides you with a quantitative assessment of your process's ability to meet the requirements. Analysis is traditionally performed on the outputs of a process, in other words comparing the Voice of the Process to the Voice of the Customer.

[Diagram: Y = f(X), the process function. The X's (inputs) feed the process and the Y's (outputs) are measured; variation in the data is the “Voice of the Process” and the requirements are the “Voice of the Customer”. Critical X's are any variables which exert an undue influence on the outputs. Example data shown with LSL = 9.96 and USL = 10.44.]
You will learn in the lesson how the output variation width of a given process output compares with the specification width established for that output. This ratio, the output variation width divided by the specification width, is what is known as capability.

Since the specification is an essential part of this assessment, a rigorous understanding of the validity of the specification is vitally important; it also has to be accurate. This is why it is important to perform a RUMBA type analysis on process inputs and outputs.
If the process variation is larger than the difference between the upper spec limit minus the lower spec limit, our product or service output will always produce defects; it will not be capable of meeting the customer or process output requirements.

[Diagram: a capable and on-target process between LSL and USL; improvement arrows labeled “Center process” and “Reduce spread”; Target marked.]
As you have learned, variation exists in everything. There will always be variability in every process
output. You can’t eliminate it completely, but you can minimize it and control it. You can tolerate
variability if the variability is relatively small compared to the requirements and the process
demonstrates long-term stability, in other words the variability is predictable and the process
performance is on target meaning the average value is near the middle value of the requirements.
The output from a process is either: capable or not capable, centered or not centered. The degree of
capability and/or centering determines the number of defects generated. If the process is not
capable, you must find a way to reduce the variation.
And if it is not centered, it is obvious that you must find a way to shift the performance. But what do
you do if it is both incapable and not centered? It depends, but most of the time you must minimize
and get control of the variation first; this is because high variation creates high uncertainty, and you can't be sure if your efforts to move the average are valid or not. Of course, if it is just a simple adjustment to shift the average to where you want it, you would do that before addressing the variation.
Problem Solving Options – Shift the Mean

This involves finding the variables that will shift the process over to the target. This is usually the easiest option.

Our effort in a Six Sigma project that is examining a process performing at a level less than desired is to shift the Mean of performance such that all outputs are within an acceptable range.

[Diagram: LSL, USL, distribution shifted toward the target]
Move the specification limits – Obviously this implies making them wider, not narrower. Customers usually do not go for this option, but if they do…it's the easiest!

[Diagram: LSL, USL, Move Spec]
Capability Studies

Steps to Capability:
#1 Verify Customer Requirements
#2 Validate Specification Limits
#3 Collect Sample Data
#4 Determine Data Type (LT or ST)
#5 Check data for Normality
#6 Calculate Z-Score, PPM, Yield, Capability (Cp, Cpk, Pp, Ppk)
#7
Specifications must be verified before completing the Capability Analysis. It doesn't mean that you will be able to change them, but on occasion some internal specifications have been made much tighter than the customer wants.

Questions to consider:
• What is the source of the specifications?
– Customer requirements (VOC)
– Business requirements (target, benchmark)
– Compliance requirements (regulations)
– Design requirements (blueprint, system)
• Are they current? Likely to change?
• Are they understood and agreed upon?
– Operational definitions
– Deployed to the work force
Data Collection

You must know how the data was collected from the process. Capability Studies should include “all” observations (100% sampling) for a specified period.
Fill Q
Each lot is sampled as it leaves the manufacturing facility on its way to the warehouse. The results
are represented by the graphic where you see the performance data on a lot by lot basis for the
amount of fill based on the samples that were taken. Each lot has its own variability and average as
shown. The variability actually looks reasonable and we notice that the average from lot to lot is
varying as well.
What the customer eventually experiences in the amount of fluid in each bottle is the value across the
full variability of all the lots. It can now be seen and stated that the long-term variability will always be
greater than the short-term variability.
Baseline Performance
As an example, imagine you reported the process performance Baseline was based on distribution 3
in the graphic, you would mislead yourself and others that the process had excellent on target
performance. If you used distribution 2, you would be led to believe that the average performance was
near the USL and that most of the output of the process was above the spec limit. To resolve these
potential problems, it is important to always use long-term data to report the Baseline.
How do you know if the data you have is short or long-term? Here are some guidelines. A somewhat
technical interpretation of long-term data is that the process has had the opportunity to experience
most of the sources of variation that can impact it. Remembering the outputs are a function of the
inputs, what we are saying is that most of the combinations of the inputs, each with their full range of variation, have been experienced by the process. You may use these situations as guidelines.
Long-term data is a “video” of process performance and is characterized by these types of conditions:
• Many shifts
• Many batches
• Many employees
• Many services and lines
• Many suppliers
Long-term variation is larger than short-term variation because of : material differences, fluctuations in
temperature and humidity, different people performing the work, multiple suppliers providing
materials, equipment wear, etc.
As a general rule, short-term data consist of 20 to 30 data points over a relatively short period of time
and long-term data consist of 100 to 200 data points over an extended period of time.
While we have used a manufacturing example to explain all this, it is exactly the same for a service or
administrative type of process. In these types of processes, there are still different people, different
shifts, different workloads, differences in the way inputs come into the process, different software,
computers, temperatures, etc. The same exact concepts and rules apply.
You should now appreciate why, when we report process performance, we need to know what the data
is representative of. Using such data we will now demonstrate how to calculate process capability and
then we will show how it is used.
Components of Variation
In general one or more months of data are probably more long-term than short-term; two weeks or
less is probably more like short-term data.
[Scatter plot: individual measurements plotted over Time]
Stability

A Stable Process is consistent over time. Time Series Plots and Control Charts are the typical graphs used to determine stability.

Stability is established by plotting data in a Time Series Plot or in a Control Chart. If the data used in the Control Chart goes out of control, the data is not stable.

At this point in the Measure Phase there is no reason to assume the process is stable. Performing a capability study at this point effectively draws a line in the sand. If however, the process is stable, short-term data provides a more reliable estimate of true process capability.

[Time Series Plot of PC Data: 480 observations ranging roughly from 30 to 70]
Looking at the Time Series Plot shown on this slide, where would you look to determine the
entitlement of this process?
Measures of Capability

Mathematically Cpk and Ppk are the same and Cp and Pp are the same. The only difference is the source of the data, Short-term and Long-term, respectively.

– Cp and Pp: “Hope”
• What is Possible if your process is perfectly Centered
• The Best your process can be
• Process Potential (Entitlement)
Capability Formulas

[Formula graphic referencing the Sample Mean]

Note: Consider the “K” value the penalty for being off center.
LSL – Lower specification limit
USL – Upper specification limit
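The slide's formula graphic is not reproduced above, so as a sketch, the standard textbook definitions can be written out in code. The numeric inputs below are illustrative, not from the slide; note the identity Cpk = Cp * (1 - k), which is where the “K” penalty enters.

```python
def cp(usl, lsl, sigma):
    # Potential capability: spec width divided by the 6-sigma process width
    return (usl - lsl) / (6 * sigma)

def k(usl, lsl, mean, target=None):
    # Penalty for being off center; target defaults to mid-spec
    if target is None:
        target = (usl + lsl) / 2
    return abs(target - mean) / ((usl - lsl) / 2)

def cpk(usl, lsl, mean, sigma):
    # Actual capability: worst-case distance from the mean to a spec
    # limit in 3-sigma units; equals cp * (1 - k) for a mid-spec target
    return min((usl - mean) / (3 * sigma), (mean - lsl) / (3 * sigma))

# Illustrative values only (not from the slide)
USL, LSL, mean, sigma = 602, 598, 599, 0.5
print(round(cp(USL, LSL, sigma), 2))         # 1.33
print(round(cpk(USL, LSL, mean, sigma), 2))  # 0.67
```

Pp and Ppk use exactly the same arithmetic with the long-term (overall) standard deviation in place of the short-term (within) one.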
MINITAB™ Example

At this point in time we are only attempting to get a Baseline number that we can compare to at the end of problem solving. We are not using it to predict quality; we want to get a snapshot. DO NOT try and make your process STABLE BEFORE working on it! Your process is a project because there is something wrong with it, so go figure it out; don't bother playing around with stability.

Create a Capability Analysis for both suppliers; assume long-term data. Note the subgroup size for this example is 5. LSL = 598, USL = 602.
599.548 is the process Mean, which falls short of the target (600) for Supplier 1, and the left tail of the distribution falls outside the lower specification limit. From a practical standpoint what does this mean? You will have camshafts that do not meet the lower specification of 598 mm.

Process Capability of Supplier 1
  LSL = 598   Target = *   USL = 602
  Sample Mean = 599.115   Sample N = 100
  StDev (Within) = 0.559239   StDev (Overall) = 0.604106
  Potential (Within) Capability: Cp = 1.19, CPL = 0.66, CPU = 1.72, Cpk = 0.66
  Overall Capability: Pp = 1.10, PPL = 0.62, PPU = 1.59, Ppk = 0.62, Cpm = *
  Observed Performance: PPM < LSL = 30000.00, PPM > USL = 0.00, PPM Total = 30000.00
  Exp. Within Performance: PPM < LSL = 23088.05, PPM > USL = 0.12, PPM Total = 23088.18
  Exp. Overall Performance: PPM < LSL = 32467.79, PPM > USL = 0.90, PPM Total = 32468.68

Next we look at the Cp index. This tells us if we will produce units within the tolerance limits. Supplier 1's Cpk index is 0.66, which tells us they need to reduce the process variation and work on centering.
600.061 is the process Mean for Supplier 2 and is very close to the target, although both tails of the distribution fall outside of the specification limits. The Cpk index is very similar to Supplier 1, but this infers that we need to work on reducing variation. When making a comparison between Supplier 1 and 2 relative to Cpk vs Ppk we see that Supplier 2's process is more prone to shifting over time. That could be a risk to be concerned about.

Process Capability of Supplier 2
  LSL = 598   Target = *   USL = 602
  Sample Mean = 600.061   Sample N = 100
  StDev (Within) = 1.00606   StDev (Overall) = 1.14898
  Potential (Within) Capability: Cp = 0.66, CPL = 0.68, CPU = 0.64, Cpk = 0.64
  Overall Capability: Pp = 0.58, PPL = 0.60, PPU = 0.56, Ppk = 0.56, Cpm = *
  Observed Performance: PPM < LSL = 40000.00, PPM > USL = 60000.00, PPM Total = 100000.00
  Exp. Within Performance: PPM < LSL = 20251.30, PPM > USL = 26969.82, PPM Total = 47221.11
  Exp. Overall Performance: PPM < LSL = 36425.88, PPM > USL = 45746.17, PPM Total = 82172.05

Again, compare the PPM levels. What does this tell us? Hint: look at PPM < LSL.

So what do we do? Looking only at the means you may claim that Supplier 2 is the best, although Supplier 1 has greater potential as depicted by the Cp measure, and it will likely be easier to move their Mean than deal with the variation issues of Supplier 2. Therefore we will work with Supplier 1.
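As a check, the indices for both suppliers can be reproduced from the means and standard deviations reported in the session windows. This is a sketch of the standard Cp/Cpk arithmetic, not MINITAB™'s internal routine; Pp and Ppk simply substitute the Overall standard deviation for the Within one.

```python
def cp(usl, lsl, sigma):
    """Potential capability: spec width over the 6-sigma process width."""
    return (usl - lsl) / (6 * sigma)

def cpk(usl, lsl, mean, sigma):
    """Actual capability: worst-case spec distance over three sigma."""
    return min((usl - mean) / (3 * sigma), (mean - lsl) / (3 * sigma))

LSL, USL = 598, 602
suppliers = {
    # mean, StDev(Within), StDev(Overall) from the session windows
    "Supplier 1": (599.115, 0.559239, 0.604106),
    "Supplier 2": (600.061, 1.00606, 1.14898),
}

results = {}
for name, (mean, sd_w, sd_o) in suppliers.items():
    results[name] = (round(cp(USL, LSL, sd_w), 2),        # Cp
                     round(cpk(USL, LSL, mean, sd_w), 2), # Cpk
                     round(cp(USL, LSL, sd_o), 2),        # Pp
                     round(cpk(USL, LSL, mean, sd_o), 2)) # Ppk
    print(name, results[name])  # matches the session-window indices above
```

Supplier 1 comes out (1.19, 0.66, 1.10, 0.62) and Supplier 2 (0.66, 0.64, 0.58, 0.56), confirming the gap between Supplier 1's potential (Cp) and actual (Cpk) performance.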
MINITAB™ Example (cont.)

Option 1: Enter subgroup size = total number of samples.
Option 2: Go to Options, turn off Within subgroup analysis.

The default of MINITAB™ assumes long-term data. Many times you will have short-term data; be sure to adjust MINITAB™ based on Option 1 or 2 as shown here to ensure you get a proper analysis.

For Option 1 you will enter the subgroup size as the total number of data points you have in your short-term study.

For Option 2, you will turn off the within subgroup analysis found inside the Options selection.
Overall Capability

[Capability graphic: Mean = 50.19, StDev = 20.90, N = 150, AD = 11.238, P-Value < 0.005; Observed Performance PPM < LSL = 413333.33, Exp. Within Performance PPM < LSL = 2459.27, Exp. Overall Performance PPM < LSL = 234065.73]
Here in the Measure Phase stick with observed performance unless your data are Normal. There are ways to deal with Non-normal Data for predictive capability, but we'll look at that once you have removed some of the Special Causes from the process. Remember, here in the Measure Phase we get a snapshot of what we're dealing with; at this point don't worry about predictability, we'll eventually get there.
Capability Steps

When we follow the steps in performing a capability study on Attribute Data we hit a wall at step 6. Attribute Data is not considered Normal, so we will use a different mathematical method to estimate capability.

We can follow the steps for calculating capability for Continuous Data until we reach the question about data Normality…

Select Output for Improvement
#1 Verify Customer Requirements
#2 Validate Specification Limits
#3 Collect Sample Data
#4 Determine Data Type (LT or ST)
#5 Check data for Normality
#6 Calculate Z-Score, PPM, Yield, Capability (Cp, Cpk, Pp, Ppk)
#7
For Attribute Data the steps become:
#2 Validate Specification Limits
#3 Collect Sample Data
#4 Calculate DPU
#5 Find Z-Score
#6 Convert Z-Score to Cp & Cpk
#7
Z Scores

The Z Score effectively transforms the actual data into standard normal units. By referring to a standard Z table you can estimate the area under the Normal curve.
– Given an average of 50 with a Standard Deviation of 3, what is the proportion beyond the upper spec limit of 54?

[Normal curve graphic: mean 50, USL 54]
Z Table
In our case we have to look up the proportion for the Z score of 1.33. This means that approximately 9.1% of our data falls beyond the upper spec limit of 54. If we are interested in determining parts per million defective we would simply multiply the proportion .09176 by one million. In this case there are 91,760 parts per million defective.
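The lookup above can be sketched with Python's standard library, using statistics.NormalDist in place of a printed Z table:

```python
from statistics import NormalDist

mean, sd, usl = 50, 3, 54
z = round((usl - mean) / sd, 2)          # 1.33, read to two decimals as a Z table would
tail = 1 - NormalDist().cdf(z)           # proportion of output beyond the USL
ppm = round(round(tail, 5) * 1_000_000)  # parts per million defective
print(z, round(tail, 5), ppm)  # 1.33 0.09176 91760
```

This reproduces the table result: about 9.2% of the data, or 91,760 parts per million, falls beyond the upper spec limit of 54.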
Attribute Capability

[Sigma conversion table rows: 5 → 0.3, 232.7; 6 → 0.0, 3.4]

A stable process can shift and drift by as much as 1.5 Standard Deviations. Want the theory behind the 1.5? Google it! It doesn't matter.
A total of 20,000 calls came in during the month but 2,500 of them
“ dropped” before they were answered (the caller hung up).
“Cpk” is an index (a simple number) which measures how close a process is running to its specification limits, relative to the natural variability of the process.

1. Calculate DPU
2. Look up DPU value on the Z-Table
3. Find Z Score
4. Convert Z Score to Cpk, Ppk

Example:
Look up ZLT = 1.11
Convert ZLT to ZST = 1.11 + 1.5 = 2.61

A Cpk of at least 1.33 is desired and is about 4 sigma+ with a yield of 99.3790%.
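Steps 1 through 3 can be sketched for the dropped-call data above, again using statistics.NormalDist instead of a printed Z table. A table lookup returns a rounded value, so it need not match the exact computation here to every decimal.

```python
from statistics import NormalDist

calls, dropped = 20_000, 2_500
dpu = dropped / calls        # defects per unit = 0.125
process_yield = 1 - dpu      # 0.875 of calls were answered

# Exact long-term Z for this yield; a printed Z table gives a rounded value
z_lt = NormalDist().inv_cdf(process_yield)
z_st = z_lt + 1.5            # apply the 1.5 sigma shift for short-term
print(round(dpu, 3), round(z_lt, 2), round(z_st, 2))  # 0.125 1.15 2.65
```

The same DPU-to-Z conversion applies to any attribute (defective/not defective) data set.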
If you just want to know how much variation the process exhibits, a Ppk measurement is fine. Remember Cpk represents the short-term capability of the process and Ppk represents the long-term capability of the process.
With the 1.5 shift, the above Ppk process capability will be worse than the Cpk short-term capability.
Notes
Measure Phase
Wrap Up and Action Items
The Measure Phase is now complete. Get ready to apply it. This module will help you create a
plan to implement the Measure Phase for your project.
• Being rigorous, disciplined
Listed below are the Measure Deliverables that each candidate should present in a PowerPoint presentation to their mentor and project champion.
Look for the potential roadblocks and plan to address them before they
become problems:
– Team members do not have the time to collect data.
– Data presented is the best guess by functional managers.
– Process participants do not participate in the creation of the X-Y
Matrix, FMEA and Process Map.
It won’t all be
smooth
sailing…..
g
You will run into roadblocks throughout your project. Listed here are some common ones that Belts
have to deal with in the Measure Phase.
DMAIC Roadmap

[Flowchart excerpt: Champion/Process Owner; Estimate COPQ; Establish Team; Measure]

Measure Phase
rule. The way that you apply the Six Sigma problem-solving methods to a

• Detailed Process Mapping
• Identify All Process X's Causing Problems (Fishbone, Process Map)
• WHAT, WHO, WHEN, WHY, WHY NOT, HOW
• Identify the complexity of the process
• Focus on the problem solving process
• Define Characteristics of Data
• Validate Financial Benefits
• Balance and Focus Resources
Over the last decade of deploying Six Sigma it has been found that the parallel application of the
tools and techniques in a real project yields the maximum success for the rapid transfer of
knowledge. For maximum benefit you should apply what has been learned in the Measure Phase
to a Six Sigma project. Use this checklist to assist.
Notes
Measure Phase
Quiz
Now we will see what you have retained from the Measure Phase of the course. Please answer
these questions to the best of your ability without referencing the text. The answers are in the
Appendix. Please check your answers against the answers provided and review the sections in
the Measure Phase where your retention of the knowledge is less than you desire.
1. When looking at precision, the primary desire is to confirm the process measurement system has low Repeatability and ____________________. (fill in the blank)
2. The difference in Bias values across the process range are known
as_______________________. (fill in the blank)
3. There are many reasons why Basic Statistics are important to a Black Belt. The following
items are good reasons for using Basic Statistics except which one?
A. Makes inferences about the future
B. Foundation for assessing process capability
C. Data collection for streamed orientation
D. Provide a numerical description of the data especially if it´s Normally Distributed
5. A Black Belt was entering data into MINITABTM. The data being entered is the name of
the countries that his company supplies product to. This is an example of:
A. Nominal Scale Data
B. Ratio Scale Data
C. Continuous Data
D. Ordinal Scale Data
6. The most frequently occurring number in a distribution set is 7. The 7 is the sample´s?
A. Mean
B. Median
C. Mode
D. Standard Deviation
7. A fundamental rule is that Standard Deviations cannot be summed but variances can be summed.
True False
8. The main difference between Special Cause and Common Cause is? (check all that
apply)
A. Sample size impacts if Common Cause variation is found or not.
B. Special Causes are often the focus of BB projects
C. Special Causes are found in short term Process Capability
D. Common Cause variation is larger than Special Cause variation.
9. The Fishbone is a tool to generate ideas about possible causes for defects.
True False
10. The X-Y Diagram is a tool used to identify/collate potential X´s and assess their relative
impact on multiple Y´s.
True False
11. The X-Y Diagram serves an important function to a Black Belt. From the list below select
the item that best describes the importance of the X-Y Diagram.
A. To eliminate the obvious high impact independent variables
B. To help prioritize the independent variables
C. To help prioritize the dependent variables
D. To help with project scope
12. The term FMEA is an abbreviation for Failure Measures Effect Analysis.
True False
13. The FMEA tool is an important tool for a Black Belt. From the list below select the items
that describe the importance of constructing a FMEA. (check all that apply)
A. Predict failure risks and minimize their occurrence
B. Quantifies the severity, occurrence and detection of defects
C. Highlights the non-value added portions of a process
D. Identify ways how a process leads to a failure to meet customer requirements
15. After performing an MSA study, if an error occurs, the error can be categorized into which
two specific categories?
A. Precision
B. Detailed
C. Accuracy
D. Random
E. Desirability
16. The following are some good examples of what Black Belt projects should measure:
(check all that apply)
A. Primary and Secondary Metrics
B. Vital few X's in the process
C. Before and after process changes
D. All outputs of the process steps
17. The reason for performing a MSA on your system is to confirm minimal variation or
inaccuracy with your measurement systems and reduce the sources for the excessive
variation or inaccuracy.
True False
18. Accuracy can be assessed in several ways. From the list below select the least correct
accuracy assessment.
A. Measurement of a known standard
B. Comparison to another recently calibrated instrument with a proven accuracy
C. Comparison with another proven measurement technique
D. Comparison with a proven precise instrument
19. A Crossed Design Gage R&R is best used for destructive testing.
True False
Analyze Phase
Welcome to Analyze
Now that we have completed the Measure Phase we are going to jump into the Analyze Phase.
Welcome to Analyze will give you a brief look at the topics we are going to cover.
Welcome to Analyze
Overview
– Welcome to Analyze
– Inferential Statistics
– Intro to Hypothesis Testing
– Hypothesis Testing ND P1
– Hypothesis Testing ND P2
– Hypothesis Testing NND P1
– Hypothesis Testing NND P2
– Wrap Up & Action Items
[Process flow diagram with steps: Estimate COPQ, Establish Team, Measure, Collect Data, Statistically Significant? (Y/N), Practically Significant? (Y/N), Identify Root Cause, Update FMEA, Implement Control Plan to Ensure Problem Doesn't Return.]
This provides a process look at putting “Analyze” to work. By the time we complete this phase you will
have a thorough understanding of the various Analyze Phase concepts.
We will build upon the foundational work of the Define and Measure Phases by introducing
techniques to find root causes, then using experimentation and Lean Principles to find solutions to
process problems. Next you will learn techniques for sustaining and maintaining process performance
using control tools and finally placing your process knowledge into a high level Process Management
tool for controlling and monitoring process performance.
Analyze Phase
“X” Sifting
Now we will continue in the Analyze Phase with "X Sifting" – determining the impact of the
inputs to our process.
"X" Sifting
Overview

The core fundamentals of this phase are Multi-Vari Analysis and Classes and Causes.

– Welcome to Analyze
– "X" Sifting
  • Multi-Vari Analysis
  • Classes and Causes
– Inferential Statistics
– Hypothesis Testing NND P1
– Hypothesis Testing NND P2
– Wrap Up & Action Items
Multi-Vari Studies
[Funnel diagram: the many X's when we first start the project (the trivial many) → the quantity of X's we keep after reducing as we think about Y = f(X) + e → the quantity of X's remaining after we apply DMAIC leverage (the vital few).]
In the Define Phase you use tools like Process Mapping to identify all possible "X's". In the Measure
Phase you use tools to help refine all possible "X's", like the X-Y Diagram and FMEA.
In the Analyze Phase we start to “dis-assemble” the data to determine what it tells us. This is the fun
part.
“X” Sifting
Multi-Vari Definition
The Multi-Vari Chart helps in screening factors by using graphical techniques to logically subgroup
discrete X's (Independent Variables) plotted against a continuous Y (Dependent Variable). By looking at the
pattern of the graphed points, conclusions are drawn about the largest family of variation.
At this point in DMAIC, Multi-Vari Charts are intended to be used as a passive study, but later in the
process they can be used as a graphical representation where factors were intentionally changed. The
only caveat with using MINITABTM to graph the data is that the data must be balanced. Each source of
variation must have the same number of data points across time.
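The balance requirement can be checked before graphing by counting observations in every subgroup; each combination of the grouping X's must appear the same number of times. A small Python sketch (the tuples and field names are hypothetical, not MINITAB syntax):

```python
from collections import Counter

# Hypothetical Multi-Vari records: (time_period, unit, measurement).
data = [
    ("t1", "unit1", 5.01), ("t1", "unit1", 5.03),
    ("t1", "unit2", 5.02), ("t1", "unit2", 5.00),
    ("t2", "unit1", 5.07), ("t2", "unit1", 5.05),
    ("t2", "unit2", 5.04), ("t2", "unit2", 5.06),
]

# Count observations in every (time period, unit) subgroup.
counts = Counter((t, u) for t, u, _ in data)

# Balanced means every subgroup has the same number of data points.
is_balanced = len(set(counts.values())) == 1
```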
“X” Sifting
Multi-Vari Example
You are probably asking yourself what is Injection Molding? Well basically an injection molding
machine takes hard plastic pellets and melts them into a fluid. This fluid is then injected into a
mold or die, under pressure, to create products, such as piping and computer cases.
Method

Sampling plans should encompass all three types of variation: Within, Between and Temporal.

1). Create Sampling Plan
2). Gather Passive Data
3). Graph Data
4). Check to see if Variation is Exposed
5). Interpret Results

[Flow: Create Sampling Plan → Gather Passive Data → Graph Data → Is Variation Exposed? (No: collect more data / Yes: continue) → Interpret Results.]

Typically, we start with a data collection sheet that makes sense based on our knowledge of the
process. Then follow the steps. If we only see minor variation in the sample, it is time to go back
and collect additional data. When your data collection represents at least 80% of the variation
within the process then you should have enough information to evaluate the graph.
Remember for a Multi-Vari Analysis to work the output must be continuous and the sources of
variation discrete.
“X” Sifting
Sources of Variation
Within unit, between unit and temporal are the classic causes of variation. A unit can be a single
piece or a grouping of pieces depending on whether they were created at unique times. Multi-Vari
Analysis can be performed on other processes; simply identify the categorical sources of variation
you are interested in.

Within unit or Positional
– Within piece variation related to the geometry of the part.
– Variation across a single unit containing many individual parts, such as a wafer containing many computer processors.
– Location in a batch process such as plating.

Between unit or Cyclical
– Variation among consecutive pieces.
– Variation among groups of pieces.
– Variation among consecutive batches.

Temporal or Over time
– Shift-to-Shift
– Day-to-Day
– Week-to-Week
[Injection molding diagram: master injection pressure, injection pressure per cavity, % oxygen, distance to tank, fluid level, ambient temperature, die temperature, die release; four cavities (#1–#4).]
Within Unit Variation is measured by differences among the 4 widgets from a single die cycle.
For example, we could measure the wall thickness for each of the 4 widgets.
Between Unit Variation is measured by differences from sequential die cycles; for example,
comparing the average wall thickness from die cycle to die cycle.
Temporal Variation is measured over some meaningful time period. For example, we would
compare the average of all the data collected in a time period say the 8 o’clock hour to the 10
o’clock hour.
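The three families described above can be computed directly from raw data. The sketch below uses invented wall-thickness numbers (not the book's data set) to show the within-unit spread, the between-unit difference of die-cycle averages, and the temporal difference of time-period averages:

```python
import statistics

# Hypothetical wall thickness for 4 widgets per die cycle,
# two die cycles in each of two time periods.
cycles = {
    ("hour8", "cycle1"): [2.01, 2.05, 2.03, 2.02],
    ("hour8", "cycle2"): [2.04, 2.06, 2.05, 2.07],
    ("hour10", "cycle1"): [2.10, 2.12, 2.11, 2.13],
    ("hour10", "cycle2"): [2.09, 2.11, 2.10, 2.12],
}

# Within unit: spread of the 4 widgets inside one die cycle.
within = {k: max(v) - min(v) for k, v in cycles.items()}

# Between unit: difference of averages from die cycle to die cycle.
means = {k: statistics.mean(v) for k, v in cycles.items()}
between_hour8 = abs(means[("hour8", "cycle1")] - means[("hour8", "cycle2")])

# Temporal: difference of the averages of everything in each time period.
def period_mean(period):
    vals = [x for (p, _), v in cycles.items() if p == period for x in v]
    return statistics.mean(vals)

temporal = abs(period_mean("hour10") - period_mean("hour8"))
```

With these made-up numbers the temporal family dominates, mirroring how the chart is read: compare spreads within a cycle, averages between cycles, and averages between time periods.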
Certified Lean Six Sigma Black Belt Book Copyright OpenSourceSixSigma.com
241
“X” Sifting
Sampling Plan
To continue with this example, the Multi-Vari sampling plan will be to gather data for 3 die cycles
on 3 different days for 4 widgets inside the mold.

[Sampling plan grid: rows Cavity #1–#4; columns Die Cycles #1–#3 on each of Monday, Wednesday and Friday.]
“X” Sifting
Gather the list of potential X’s and assign to one of the families of
variation.
– This information can be pulled from the X-Y Diagram from the
Measure Phase.
If an X spans one or more families, assign %’s to the supposed split.
Now let's use the same information from the X-Y Diagram that was created in the Measure Phase. The
following exercise will help you assign each of the variables to a family of variation. If you find yourself
with a variable (X) that spans families, then assign percentages to the split. Use your best judgment for
the splits. Don't assume that the true X's causing variation have to come from one in the list.
Step 4 - Focus further effort on the X’s associated with the family of
largest variation.
Remember the goal is not only to figure out what it is, but what it is not!
“X” Sifting
Data Worksheet
Now create the Multi-Vari
Chart in MINITABTM.
Run Multi-Vari
“X” Sifting
To find an example of within unit variation, look at the unit with the greatest spread, like Unit 1 in
the second time period. Notice the spread of data is 0.07.

To determine temporal variation, compare the averages between time periods. It appears time
periods 3 and 2 have a difference of 0.06.

To determine between unit variation, compare the averages of adjacent units; the largest shift
appears at the second unit in the third time period.
Notice that the shifting from unit to unit is not consistent, but it certainly jumps up and down. The
question at this point should be: Does this graph represent the problem I’m working on? Do I see at
least 80% of the variation? Read the units off the Y axis or look in the worksheet. Notice the spread
of the data is 0.22 units. If the usual spread of the data is 0.25 units, then this data set represents
88% of the usual variation which tells us our sampling plan was sufficient to detect the problem.
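The 88% figure is just the ratio of the spread seen in the sample to the usual spread of the process, which can be checked with two lines of arithmetic:

```python
observed_spread = 0.22  # spread of the data in the Multi-Vari sample
usual_spread = 0.25     # typical spread of the process output

coverage = observed_spread / usual_spread  # fraction of usual variation captured
sufficient = coverage >= 0.80              # sampling plan adequate at 80% or more
```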
“X” Sifting
Let's try another example; open the MINITABTM worksheet "CallCenter.mtw". This example is a
transactional application of the tool.

In this particular case, a company with two call centers wants to compare two methods of handling
calls at each location at different times of the day. One method involves a team to resolve customer
issues, and the other method requires a single subject-matter expert to handle the call alone.

• Output (Y)
– Call Time
• Input (X)
– Call Center (GA, NV)
– Time of Day (10:00, 13:00, 17:00)
– Method (Expert, Team)
“X” Sifting
It is not necessary to
force fit any one tool to
your project. For
transactional projects
Multi-Vari may be difficult
to interpret purely
graphically. We will re-
visit this data set later
when working through
Hypothesis Testing.
Multi-Vari Exercise
“X” Sifting
MVA Solution
Do you recall the reason why Normality is an issue? Normality is required if you intend to use the
information as a predictive tool. Early in the Six Sigma process there is no reason to assume that
your data will be Normal. Let's work the problem now.

Check for normality…

[Probability Plot of Volume (Normal): Mean 514.7, StDev 6.854.]
Having a graphical summary is quite nice since it provides a picture of the data as well as the
summary statistics. The graphical summary command in MINITABTM is an alternative method to
check for Normality. Notice that the P-value in this window is the same as the previous.

Another method to check normality is…

[Graphical Summary for Volume — Anderson-Darling Normality Test: A-Squared 0.49, P-Value 0.212. Mean 514.71, StDev 6.85, Variance 46.97, Skewness -0.084725, Kurtosis -0.696960, N 144. Minimum 500.64, 1st Quartile 509.70, Median 515.32, 3rd Quartile 520.12, Maximum 529.39. 95% Confidence Intervals: Mean 513.58 to 515.84, Median 513.90 to 516.37, StDev 6.14 to 7.75.]
“X” Sifting
MVA Solution
Now it is time to perform the process capability. For subgroup size, enter 12 since all 12 bottles
are filled at the same time. Also, use 500 milliliters as the upper spec limit in order to see how bad
the capability was from a manufacturer's perspective.

Under the "Options" tab you can select the "Benchmark Z's (sigma level)" of the process, or you
can leave the default as "Capability stats". Just for fun you can run MINITABTM to generate the
Capability Analysis using 500 as the upper spec limit, then run it again as the lower spec limit and
see what happens to the statistics.
MVA Solution
“X” Sifting
Perform an MVA
The order in which you enter the factors will
produce different graphs. The “classical”
method is to use Within, Between and over-
time (Temporal) order.
MVA Solution
The graph shows the variation within a unit is consistent across all the data. The variation between
units also looks consistent across all the data. What seems to stand out is that the machine may be
set up differently from first shift to second. That should be easy to fix! What is the largest source of
variation? Within Unit Variation is the largest, Temporal is the next largest (and probably easiest to
fix) and Between Unit Variation comes in last.
[Multi-Vari Chart; panel variable: Temporal.]

This example revealed that a high-price scale could be generating significant variation.
The in-line scale weighed the bottles and either sent them forward to ship or rejected them to be
topped off. The wind generated by the positive pressure in the room blew across the scale, making
the recorded weights fluctuate unacceptably. The filling machine was actually quite good; only a few
adjustments were made once the variation from the scale was fixed. Once the variation in
the data was reduced, they were able to shift the Mean closer to the specification of 500 ml.
“X” Sifting
Classes of Distributions
By now you are convinced that Multi-Vari is a tool that helps screen X's by visualizing three
primary sources of variation. Later we will perform Hypothesis Tests based on our findings.

At this point we will review classes and causes of distributions that can also help us screen X's to
perform Hypothesis Tests.

– Normal Distribution
– Non-normality – 4 Primary Classifications:
1. Skewness
2. Multiple Modes
3. Kurtosis
4. Granularity
“X” Sifting
Normal Distribution
However, just
because a
distribution of sample data looks Normal does not mean that the variation cannot be reduced and a
new Normal Distribution created.
Non-Normal Distributions
“X” Sifting
Skewness Classification
Potential Causes of Skewness

When a distribution is not symmetrical, it is Skewed. Generally, a Skewed distribution's longest
tail points in the direction of the Skew.

[Histograms: Left Skew and Right Skew examples.]

Sources that may mix: Machine A / Machine B, Operator A / Operator B, Payment Method A /
Payment Method B, Interviewer A / Interviewer B. Sample A + Sample B = Combined.
What causes Mixed Distributions? Mixed Distributions occur when data comes from several sources
that are supposed to be the same but are not.
Note that both distributions that formed the combined Skewed Distribution started out as Normal
Distributions.
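This mixing effect is easy to simulate. The sketch below (invented means and mix proportions) combines two Normal sources and computes the sample skewness, which comes out clearly positive even though each source alone is symmetric:

```python
import random
import statistics

random.seed(7)

# Two sources that are "supposed to be the same" but are not.
machine_a = [random.gauss(10, 1) for _ in range(300)]
machine_b = [random.gauss(15, 1) for _ in range(100)]
combined = machine_a + machine_b

def skewness(xs):
    """Sample skewness: third central moment over sd cubed."""
    m = statistics.mean(xs)
    m2 = sum((x - m) ** 2 for x in xs) / len(xs)
    m3 = sum((x - m) ** 3 for x in xs) / len(xs)
    return m3 / m2 ** 1.5

skew_a = skewness(machine_a)        # near zero: symmetric on its own
skew_combined = skewness(combined)  # clearly positive: long right tail
```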
“X” Sifting
Non-Linear Relationships occur when the X and Y scales are different.

Just because your Input (X) is Normally Distributed about a Mean, the Output (Y) may not be
Normally Distributed.

[Plot: Y versus X with the Marginal Distribution of X (Normal) and the Marginal Distribution of Y (non-Normal).]
Interactions

Interactions occur when two inputs interact with each other to have a larger impact on Y than
either would by themselves.

[Plot: Room Temperature (25–35) over time for Spray On/Off and No Spray conditions.]

If you find that two inputs together have a large impact on Y but would not affect Y by themselves,
this is called an Interaction.

For instance, if you spray an aerosol can in the direction of a flame, what would happen to room
temperature? What do you see regarding these distributions?
“X” Sifting
The distribution is dependent on time. Time relationships occur when the distribution is
dependent on time; some examples are tool wear, chemical bath depletion, stock prices, etc.

[Plot: Marginal Distribution of Y drifting over Time.]

Often seen when tooling requires "warming up", tool wear, chemical bath depletions, ambient
temperature effect on tooling.
“X” Sifting
Kurtosis

Kurtosis refers to the shape of the tails:
– Leptokurtic
– Platykurtic
• Different combinations of distributions cause the resulting overall shapes.

Platykurtic distributions are flat with short tails.

Causes:
– Sorting or Selecting: scrapping product that falls outside the spec limits.
– Trends or Patterns: lack of Independence in the data (example: tool wear, chemical bath).
– Non-Linear Relationships: chemical systems.
“X” Sifting
Leptokurtic

A positive Kurtosis value indicates a Leptokurtic distribution. Distributions overlaying each other
that have very different variance can cause a Leptokurtic distribution.

Causes:
– Sorting or Selecting: scrapping product that falls outside the spec limits.
– Trends or Patterns: lack of Independence in the data (example: tool wear, chemical bath).
– Non-Linear Relationships: chemical systems.
Multiple Modes

Multiple Modes arise from such dramatic combinations of underlying sources that they show distinct
modes. They may have appeared Platykurtic, but the sources were far enough apart to see separation.
“X” Sifting
Bimodal Distributions
[Descriptive Statistics, Variable: ExtremeBiMod — Anderson-Darling Normality Test: A-Squared 22.657, P-Value 0.000. Mean 28.8144, StDev 7.5702, Variance 57.3081, Skewness 1.37767, Kurtosis 2.66E-03, N 127. Minimum 22.6294, 1st Quartile 24.2649, Median 25.2902, 3rd Quartile 26.5494, Maximum 45.3291. 95% Confidence Interval for Mu: 27.4851 to 30.1438; for Sigma: 6.7398 to 8.6359; for Median: 25.0263 to 25.7491.]
If you see an extreme outlier, it usually has its own cause or own source of variation. It's relatively
easy to isolate the cause by looking at the X Axis of the Histogram.
“X” Sifting
[Descriptive Statistics — Mean 26.2507, StDev 4.8453, Variance 23.4767, Skewness 3.17250, Kurtosis 9.11483, N 108. Minimum 22.6294, 1st Quartile 24.1285, Median 25.0534, 3rd Quartile 25.9709, Maximum 46.0000. 95% Confidence Interval for Mu: 25.3265 to 27.1750; for Sigma: 4.2740 to 5.5943; for Median: 24.8365 to 25.2971.]
Granularity

Now let's take a moment and notice the P-value in the Normal Probability Plot; it is definitely smaller
than 0.05!
“X” Sifting
Normal Example

Non-normal Distributions can give more root cause information than Normal data (the nature of why…).

Hey Honey, I found the key….
“X” Sifting
Notes
Analyze Phase
Inferential Statistics
Inferential Statistics
Overview
The core fundamentals of this phase are Inferential Statistics, Nature of Sampling and Central
Limit Theorem. We will examine the meaning of each of these and show you how to apply them.

– Welcome to Analyze
– "X" Sifting
– Inferential Statistics
  • Nature of Sampling
  • Central Limit Theorem
– Intro to Hypothesis Testing
– Hypothesis Testing ND P1
– Hypothesis Testing ND P2
– Hypothesis Testing NND P1
– Hypothesis Testing NND P2
– Wrap Up & Action Items
Nature of Inference
Inferential Statistics
1. What do you want to know?

So many questions….?
As with most things you have learned associated with Six Sigma – there are defined steps to be
taken.
Types of Error
1. Error in sampling
– Error due to differences among samples drawn at random from the population (luck of the draw).
– This is the only source of error that statistics can accommodate.
2. Bias in sampling
– Error due to lack of independence among random samples or due to systematic sampling procedures (height of horse jockeys only).
3. Error in measurement
– Error in the measurement of the samples (MSA/GR&R).
4. Lack of measurement validity
– The measurement does not actually measure what it intends to measure (placing a probe in the wrong slot; measuring temperature with a thermometer that is next to a furnace).
Inferential Statistics
Population
– EVERY data point that has ever been or ever will be generated from a given characteristic.

Sample
– A portion (or subset) of the population, either at one time or over time.

Observation
– An individual measurement.
Let’s just review a few definitions: A population is EVERY data point that has ever been or ever will
be generated from a given characteristic. A sample is a portion (or subset) of the population, either
at one time or over time. An observation is an individual measurement.
Significance
* RORI includes not only dollars and assets but the time and participation of your teams.
Inferential Statistics
The Mission
Your mission, which you have chosen to accept, is to reduce cycle time, reduce the error rate,
reduce costs, reduce investment, improve service level, improve throughput, reduce lead time,
increase productivity… change the output metric of some process, etc…
In statistical terms, this translates to the need to move the process Mean and/or reduce the process
Standard Deviation.
You’ll be making decisions about how to adjust key process input variables based on sample data,
not population data - that means you are taking some risks.
How will you know your key process output variable really changed, and is not just an unlikely
sample? The Central Limit Theorem helps us understand the risk we are taking and is the basis for
using sampling to estimate population parameters.
Imagine you have some population. The individual values of this population form some distribution.
Take a sample of some of the individual values and calculate the sample Mean.
The Central Limit Theorem says that as the sample size becomes large, this new distribution (the
sample Mean distribution) will form a Normal Distribution, no matter what the shape of the
population distribution of individuals.
Inferential Statistics
Population (individual values): 3, 5, 2, 12, 10, 1, 6, 12, 5, 6, 12, 14, 3, 6, 11, 9, 10, 10, 12

• Samples from the population, each with five observations:

Sample 1: 1, 12, 9, 7, 8 (Mean 7.4)
Sample 2: 9, 8, 5, 14, 10 (Mean 9.2)
Sample 3: 2, 3, 6, 11, 10 (Mean 6.4)

• In this example, we have taken three samples out of the population, each with five observations
in it. We computed a Mean for each sample. Note that the Means are not the same!
• Why not?
• What would happen if we kept taking more samples?
Every statistic derives from a sampling distribution. For instance, if you were to keep taking samples
from the population over and over, a distribution could be formed for the calculated Means, Medians,
Modes, Standard Deviations, etc. As you will see, the above sample distributions each have a
different statistic. The goal here is to successfully make inferences regarding the statistical data.
Create a sample of 1,000 individual rolls of a die that we will store in a variable named “Population”.
From the population, we will draw five random samples.
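For readers without MINITAB, the same exercise can be sketched in Python; this is a stand-in for the MINITAB steps, generating its own random rolls rather than the book's data:

```python
import random
import statistics

random.seed(42)

# 1,000 individual rolls of a fair die -- the "Population" column.
population = [random.randint(1, 6) for _ in range(1000)]

# Draw five random samples of 5 observations each.
samples = [random.sample(population, 5) for _ in range(5)]

pop_mean = statistics.mean(population)  # near the theoretical 3.5
sample_means = [statistics.mean(s) for s in samples]
```

The five sample Means scatter around the population Mean; that scatter is sampling error.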
Inferential Statistics
Sampling Distributions
To draw random samples from the population follow the command shown below and repeat 4 more
times for the other columns.
Sampling Error
Calculate the Mean and Standard Deviation for each column and compare the sample statistics to
the population.

Stat > Basic Statistics > Display Descriptive Statistics…

Descriptive Statistics: Population, Sample1, Sample2, Sample3, Sample4, Sample5

Variable    N     N*  Mean    SE Mean  StDev   Minimum  Q1      Median  Q3      Maximum
Population  1000  0   3.5510  0.0528   1.6692  1.0000   2.0000  4.0000  5.0000  6.0000
Sample1     5     0   3.400   0.927    2.074   1.000    1.500   3.000   5.500   6.000

Now compare the Mean and Standard Deviation of the samples of 5 observations to the population.
What do you see?
Inferential Statistics
Sampling Error
Sampling Error - Reduced
Calculate the Mean and Standard Deviation for each column and compare the sample statistics to
the population.
Sample6     10    0   3.600   0.653    2.066   1.000    1.750   3.500   6.000   6.000

Can you tell what is happening? As the sample size increases, the standard error of the Mean
decreases and the sample statistics move closer to the population values.

What do you think would happen if the sample size increased further? Let's try 30 for a sample size.
Inferential Statistics
Do you notice
anything different?
Sampling Distributions
Now instead of looking at the effect of sample size on error, we will create a sampling distribution
of averages. Follow along to generate your own random data.
Inferential Statistics
Sampling Distributions
Repeat this command to calculate the Mean of C1-C10, and store the result in Mean10.

The commands shown above will create new columns that are now averages from the columns of
random population data. We have 1000 averages of sample size 5 and 1000 averages of sample
size 10.
Create a Histogram of C1, Mean5 and Mean10.

Graph > Histogram > Simple…
Multiple Graphs… On separate graphs… Same X, including same bins
In MINITABTM follow the above commands. The Histogram being generated makes it easy to see
what happened when the sample size was increased.

Select "Same X, including same bins" to facilitate comparison.
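What the three histograms show can also be verified numerically: averages are tighter than individuals, and averages of 10 are tighter than averages of 5. A Python stand-in for the C1/Mean5/Mean10 columns (its own random data, not the book's):

```python
import random
import statistics

random.seed(0)

def die_mean(n):
    """Average of n rolls of a fair die."""
    return statistics.mean(random.randint(1, 6) for _ in range(n))

individuals = [random.randint(1, 6) for _ in range(1000)]  # like C1
mean5 = [die_mean(5) for _ in range(1000)]                 # like Mean5
mean10 = [die_mean(10) for _ in range(1000)]               # like Mean10

sd_ind = statistics.pstdev(individuals)  # about 1.71 for a fair die
sd_5 = statistics.pstdev(mean5)          # about 1.71 / sqrt(5)
sd_10 = statistics.pstdev(mean10)        # about 1.71 / sqrt(10)
```

The spread of the sample Means shrinks by roughly the square root of the sample size, which is exactly what the narrowing histograms depict.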
Inferential Statistics
Different Distributions

Everything we have gone through with sampling error and sampling distributions was leading up
to the Central Limit Theorem.

Both Individuals and Sample Means have a Mean and a Standard Deviation. Sample Means will be
Normally Distributed when the parent population is Normally Distributed, or will be approximately
Normal for samples of size 30 or more when the parent population is not Normally Distributed.
This improves with samples of larger size.

Bigger is Better!
Inferential Statistics
So What?
A Practical Example
What is the likelihood of getting a sample with a 2 second difference? This could be caused either
by implementing changes or could be a result of random sampling variation, sampling error. The
95% confidence interval exceeds the 2 second difference (delta) seen as a result. What is the delta
caused from? It could be a true difference in performance or random sampling error. This is why
you look further than only relying on point estimators.
Inferential Statistics
[Figure: distribution of individuals in the population, with theoretical distributions of sample Means for n = 2 and n = 10.]
Inferential Statistics
Standard Error
The rate of change in the standard error approaches zero at about 30 samples.

[Plot: Standard Error versus Sample Size (0 to 30).]
When comparing standard error with sample size, the rate of change in the standard error
approaches zero at about 30 samples. This is why a sample size of 30 comes up often in discussions
on sample size.
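The flattening is simply the shape of the standard error formula, sigma divided by the square root of n; tabulating it for sigma = 1 shows the shrinking payoff of extra samples:

```python
import math

sigma = 1.0
std_err = {n: sigma / math.sqrt(n) for n in (5, 10, 20, 30, 40, 100)}

# The incremental gain from extra samples keeps shrinking:
gain_10_to_20 = std_err[10] - std_err[20]  # about 0.093
gain_30_to_40 = std_err[30] - std_err[40]  # about 0.024
```

Past roughly n = 30 each additional observation buys very little reduction in standard error, which is why 30 recurs in sample-size discussions.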
This is the point at which the t and the Z distributions become nearly equivalent. If you compare
Z = 1.96 in a Z table to t at 0.975 in a t table, as the sample size approaches infinite degrees of
freedom they become equal.
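The Z side of that comparison is available in Python's standard library (t quantiles need a statistics package, so only the Z value is shown here):

```python
from statistics import NormalDist

# Two-sided 95% critical value of the standard Normal.
z_975 = NormalDist().inv_cdf(0.975)  # approximately 1.96
```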
Inferential Statistics
Notes
Analyze Phase
Introduction to Hypothesis Testing
– Hypothesis Testing NND P1
– Hypothesis Testing NND P2
– Wrap Up & Action Items
Our goal is to improve our Process Capability; this translates to the need to move the process Mean
(or proportion) and reduce the Standard Deviation.
Because it is too expensive or too impractical (not to mention theoretically impossible) to
collect population data, we will make decisions based on sample data.
Because we are dealing with sample data, there is some uncertainty about the true
population parameters.
Hypothesis Testing helps us make fact-based decisions about whether there are different population
parameters or that the differences are just due to expected sample variation.
[Capability comparison — before: Observed PPM Total 6666.67, Exp. Within PPM Total 116.45, Exp. Overall PPM Total 73271.97; after: PPM Total 0.00 in every category.]
The purpose of appropriate Hypothesis Testing is to integrate the Voice of the Process with the
Voice of the Business to make data-based decisions to resolve problems.
Hypothesis Testing can help avoid high costs of experimental efforts by using existing data. This
can be likened to:
Local store costs versus mini bar expenses.
There may be a need to eventually use experimentation, but careful data analysis can
indicate a direction for experimentation if necessary.
Recall from the discussion on classes and cause of distributions that a data set may seem Normal,
yet still be made up of multiple distributions.
Hypothesis Testing can help establish a statistical difference between factors from different
distributions.
(Plot: a distribution that appears Normal but is composed of multiple underlying distributions.)
Because we cannot test an entire population, using a sample is the closest we can get to it. Since we are using sample data and not the entire population, we need methods that allow us to infer whether the sample is a fair representation of the population.
When we use a proper sample size, Hypothesis Testing gives us a way to detect the likelihood that a sample came from a particular distribution. Sometimes the questions can be: Did our sample come from a population with a Mean of 100? Is our sample variance significantly different from the variance of the population? Is it different from a target?
Significant Difference

(Two distributions, Sample 1 and Sample 2, with Means μ1 and μ2.)
Do you see a difference between Sample 1 and Sample 2? There may be a real difference between
the samples shown; however, we may not be able to determine a statistical difference. Our
confidence is established statistically which has an effect on the necessary sample size. Our ability
to detect a difference is directly linked to sample size and in turn whether we practically care about
such a small difference.
Detecting Significance
H A : The sk y is fa lling.
We will discuss the difference between practical and statistical significance throughout this session. We can affect the outcome of a statistical test simply by changing the sample size.

Let's take a moment to explore the concept of Practical Differences versus Statistical Differences.
Detecting Significance
The difference can be either a change in the Mean or in the variance.
Hypothesis Testing
A Hypothesis Test is an a priori theory relating to differences between variables.
DICE Example
You have rolled dice before haven’t you? You know dice that you would find in a board game or in
Las Vegas.
Well, assume that we suspect a single die is "Fixed," meaning it has been altered in some form or fashion to make a certain number appear more often than it rightfully should.
Consider the example on how we would go about determining if in fact a die was loaded.
If we threw the die five times and got five one’s, what would you conclude? How sure can you be?
The probability of getting just a single one. The probability of getting five ones.

We could throw it a number of times and track how many times each face occurred. With a standard die, we would expect each face to occur 1/6 or 16.67% of the time.
If we threw the die 5 times and got 5 one’s, what would you conclude? How
sure can you be?
– Pr(1 one) = 0.1667    Pr(5 ones) = (0.1667)^5 = 0.00013
There are approximately 1.3 chances out of 10,000 that we could have gotten 5 ones with a standard die.
Therefore, we would say we are willing to take a 0.1% chance of being wrong about our hypothesis that the die was "loaded," since the results do not come close to our predicted outcome.
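The dice arithmetic above is easy to verify. A quick sketch in Python (the course itself uses MINITABTM; this is just a check of the probabilities):

```python
# Probability that a fair die shows a one on any single throw
p_one = 1 / 6

# Probability of five ones in five independent throws
p_five_ones = p_one ** 5

print(round(p_one, 4))        # 0.1667
print(round(p_five_ones, 5))  # 0.00013 -- about 1.3 chances in 10,000
```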
Hypothesis Testing
DECISIONS
Any differences between observed data and claims made under H0 may be real or due to chance.
Hypothesis Tests determine the probabilities of these differences occurring solely due to chance and
call them P-values.
The α level of a test (level of significance) represents the yardstick against which P-values are measured, and H0 is rejected if the P-value is less than the α level.

The most commonly used α levels are 5%, 10% and 1%.
There are two types of error: Type I, with an associated risk equal to alpha (the first letter in the Greek alphabet), and of course the other one was named Type II, with an associated risk equal to beta.
The formula reads: alpha is equal to the probability of making a Type 1 error, or alpha is equal to
the probability of rejecting the null hypothesis when the null hypothesis is true.
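The definition above can be demonstrated by simulation. The sketch below is a hypothetical illustration (not part of the course material): it draws many samples from a population where the null hypothesis is true and shows that a test run at α = 0.05 rejects about 5% of the time, which is exactly the Type I error rate.

```python
import random
from statistics import NormalDist, mean

random.seed(1)

alpha = 0.05
z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # ~1.96 for a two-tailed test

# Population where Ho (mu = 100) is TRUE; sigma = 10 is assumed known here so
# a simple z-test can stand in for a t-test in this illustration.
trials, rejections = 5000, 0
for _ in range(trials):
    sample = [random.gauss(100, 10) for _ in range(25)]
    z = (mean(sample) - 100) / (10 / 25 ** 0.5)
    if abs(z) > z_crit:
        rejections += 1  # a Type I error: Ho rejected although it is true

print(round(rejections / trials, 3))  # close to alpha = 0.05
```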
Alpha Risk
(Figure: tails of the distribution shaded as regions of doubt.)
Hypothesis Testing Risk
The beta risk or Type 2 Error (also called the “Consumer’s Risk”) is the probability that we could
be wrong in saying that two or more things are the same when, in fact, they are different.
Actual Conditions: Not Different (Ho is True) or Different (Ho is False).
Another way to describe beta risk is failing to recognize an improvement. Chances are the sample
size was inappropriate or the data was imprecise and/or inaccurate.
Reading the formula: Beta is equal to the probability of making a Type 2 error.
Or: Beta is equal to the probability of failing to reject the null hypothesis given that the null
hypothesis is false.
Beta Risk
(Figure: critical value of the test statistic.)
(Figure: theoretical distribution of Means when n = 30, with δ = 5 and S = 1; a large S widens the distribution.)
All samples are estimates of the population. All statistics based on samples are estimates of the
equivalent population parameters. All estimates could be wrong!
These are typical questions you will experience or hear during sampling. The most common answer is "It depends." Primarily because someone could say a sample of 30 is perfect when that may actually be too many. The point is you don't know what the right sample size is without the test.
Here is a Hypothesis Testing roadmap for Continuous Data. This is a great reference tool while you are conducting Hypothesis Tests.
(Roadmap: Continuous Data is split into Normal and Non-Normal branches, then by One Sample, Two Samples, or Two or More Samples, and by One Factor or Two Factors; tests include 2-Sample t and One-Way ANOVA. Attribute Data follows its own branch.)
While using Hypothesis Testing the following facts should be borne in mind at the conclusion stage:
– The decision is about Ho and NOT Ha.
– The conclusion statement is whether the contention of Ha was upheld.
– The null hypothesis (Ho) is on trial.
– When a decision has been made:
• Nothing has been proved.
• It is just a decision.
• All decisions can lead to errors (Types I and II).
– If the decision is to "Reject Ho," then the conclusion should read "There is sufficient evidence at the α level of significance to show that" and state the alternative hypothesis Ha.
– If the decision is to "Fail to Reject Ho," then the conclusion should read "There isn't sufficient evidence at the α level of significance to show that" and state the alternative hypothesis.
Notes
Analyze Phase
Hypothesis Testing Normal Data Part 1
Overview
The core fundamentals of this phase are Hypothesis Testing, Tests for Central Tendency, Tests for Variance and ANOVA. We will examine the meaning of each of these and show you how to apply them.

Welcome to Analyze
"X" Sifting
Inferential Statistics
Intro to Hypothesis Testing
Sample Size
Hypothesis Testing ND P1
– Testing Means
– Analyzing Results
Hypothesis Testing ND P2
Hypothesis Testing NND P1
Hypothesis Testing NND P2
Wrap Up & Action Items
T-tests are used to compare a Mean against a target and to compare Means from two different
samples and to compare paired data. When comparing multiple Means it is inappropriate to use a t-
test. Analysis of variance or ANOVA is used when it is necessary to compare more than two Means.
t-tests are used:
– To compare a Mean against a target.
• i.e., the team made improvements and wants to compare the Mean against a target to see if they met the target.

"They don't look the same to me!"
1 Sample t
Here we are looking for the region in which we can be 95% sure our true population Mean will lie. This is based on a calculated average, Standard Deviation, number of trials and a given alpha risk of .05.
A 1-sample t-test is used to compare an expected population Mean to a target.

In order for the Mean of the sample to be considered not significantly different from the target, the target must fall within the confidence interval of the sample Mean.

MINITABTM performs a one sample t-test or t-confidence interval for the Mean. Use 1-sample t to compute a confidence interval and perform a Hypothesis Test of the Mean when the population Standard Deviation, σ, is unknown, for a one- or two-tailed 1-sample t.
If you remember from earlier, 95% of the area under the curve of a Normal Distribution falls within plus
or minus 2 Standard Deviations. Confidence intervals are based on your selected alpha level, so if you
selected an alpha of 5%, then the confidence interval would be 95% which is roughly plus or minus 2
Standard Deviations. Using your eye to guesstimate you can see that the target value falls within plus or minus 2 Standard Deviations of the sampling distribution of sample size 2.
If you used a sample of 30, could you tell if the target was different? Just using your eye it appears
that the target is outside the 95% confidence interval of the Mean. Luckily, MINITABTM makes this very
easy…
Sample Size
To determine proper sample size in MINITABTM: instead of going through the dreadful hand calculations of sample size, we will use MINITABTM. Three fields must be filled in and one left blank in the sample size window; MINITABTM will solve for the field left blank.

If you want to know the sample size, you must enter the difference, which is the shift that must be detected. It is common to state the difference in terms of "generic" Standard Deviations when you do not have an estimate for the Standard Deviation of the process. For example, if you want to detect a shift of 1.5 Standard Deviations, enter that in difference and enter 1 for Standard Deviation. If you knew the Standard Deviation was 0.8, then enter it for Standard Deviation and 1.2 for the difference (which is a 1.5 Standard Deviation shift in terms of real values).
If you are unsure of the desired difference, or in many cases simply get stuck with a sample size that you didn't have a lot of control over, MINITABTM will tell you how much of a difference can be detected. You as a practitioner must be careful when drawing Practical Conclusions because it is possible to have statistical significance without practical significance. In other words, do a reality check. MINITABTM has made it easy to see an assortment of sample sizes and differences.
1-Sample t Example
3. 1-sample t-test (population Standard Deviation unknown, comparing to target).

α = 0.05  β = 0.10

4. Sample Size:
• Open the MINITABTM worksheet: Exh_Stat.MTW
• Use the C1 column: Values
– In this case, the new supplier sent 9 samples for evaluation.
– How much of a difference can be detected with this sample?
Hypothesis Testing
Follow along in
MINITABTM, as you can
see, we will be able to
detect a difference of
1.23 with the sample of
9.
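MINITABTM finds that detectable difference numerically. A standard textbook approximation for the detectable shift of a 1-sample t-test, diff ≈ (t(1−α/2, n−1) + t(power, n−1)) / √n, reproduces the 1.23 quoted here. The Python/scipy sketch below is an assumption of tooling, not part of the course:

```python
from scipy.stats import t

n, alpha, power = 9, 0.05, 0.90
df = n - 1

# Detectable difference in units of Standard Deviation (sigma entered as 1):
# sum of the two-sided critical t quantile and the power quantile, over sqrt(n)
diff = (t.ppf(1 - alpha / 2, df) + t.ppf(power, df)) / n ** 0.5

print(round(diff, 2))  # 1.23 -- matches the MINITAB result for a sample of 9
```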
Now refer to the road map for Hypothesis Testing, you must first check for Normality. In MINITABTM
select “Stats>Basic Statistics>Normality Test”. For the “Variable Fields” double-click on “Values” in
the left-hand box. Once this is complete select “OK”.
Since the P-value is greater than 0.05 we fail to reject the null hypothesis that the data are Normal.
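MINITABTM's Normality test is Anderson-Darling. To reproduce the check outside MINITABTM, scipy offers the same test (it reports critical values rather than a P-value) plus Shapiro-Wilk, which does return a P-value. The data below are hypothetical stand-ins, not the worksheet values:

```python
from scipy.stats import anderson, shapiro

# Hypothetical stand-in for the 9 supplier samples (not the Exh_Stat.MTW data)
values = [4.5, 4.6, 4.7, 4.75, 4.8, 4.85, 4.9, 5.0, 5.1]

# Anderson-Darling: Normality is plausible if the statistic stays below the
# critical value tabulated for the 5% significance level.
result = anderson(values, dist='norm')
crit_5pct = result.critical_values[list(result.significance_level).index(5.0)]
print(result.statistic < crit_5pct)

# Shapiro-Wilk returns a P-value directly: fail to reject Normality if p > 0.05.
stat, p = shapiro(values)
print(p > 0.05)
```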
(Normality test graph for the Values column, 4.2 to 5.4.)

Are the data in the Values column Normal?
1-Sample t Example
Perform the one sample t-test. In MINITABTM select "Stat>Basic Statistics>1-Sample t". From the left-hand box double-click on "Values".

Click "Graphs" – select all 3. Click "Options" – in CI enter 95.

In the "Options" button there is a selection for the alternative hypothesis; the default is "not equal," which corresponds to our hypothesis. If your alternative hypothesis were greater than or less than, you would have to change the default.
Histogram of Values: the hypothesized value of 5 is noted as Ho, the null hypothesis. Note our target Mean (represented by the red Ho) is outside our population confidence boundaries, which tells us there is a significant difference between the population and the target Mean.
Boxplot of Values (with Ho and 95% t-confidence interval for the mean), showing X̄ and Ho on a scale from 4.4 to 5.1.

Individual Value Plot of Values (with Ho and 95% t-confidence interval for the mean).
As you will see the conclusion is the same, but the Dot Plot is just another representation of data.
Session Window

One-Sample T: Values
Test of mu = 5 vs not = 5

s = sqrt( Σ (Xi − X̄)² / (n − 1) )        SE Mean = s / √n

N – sample size
Mean – calculated mathematical average
StDev – calculated individual Standard Deviation (classical method)
SE Mean – calculated Standard Deviation of the distribution of the Means
Confidence Interval: our population average will fall between 4.5989 and 4.9789.

t = (X̄ − Target) / (s / √n) = (4.79 − 5.00) / (0.247 / √9) = −2.56
T-Distribution critical values by degrees of freedom:

df     .600    .700    .800    .900    .950    .975    .990    .995
1      0.325   0.727   1.376   3.078   6.314   12.706  31.821  63.657
2      0.289   0.617   1.061   1.886   2.920   4.303   6.965   9.925
3      0.277   0.584   0.978   1.638   2.353   3.182   4.541   5.841
4      0.271   0.569   0.941   1.533   2.132   2.776   3.747   4.604
5      0.267   0.559   0.920   1.476   2.015   2.571   3.365   4.032

(Critical regions for this test: −2.306 and +2.306, centered at 0.)
If the calculated t-value lies anywhere in the critical regions, reject the null hypothesis.
– The data supports the alternative hypothesis that the estimate for the Mean of the population is not 5.0.
Here is the formula for the confidence interval. Notice we get the same results as MINITABTM.

The formula for a two-sided t-test is:

X̄ − t(α/2, n−1) · s/√n ≤ μ ≤ X̄ + t(α/2, n−1) · s/√n

or

X̄ ± t_crit · SE Mean = 4.7889 ± 2.306 * 0.0824 = 4.5989 to 4.9789
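The same numbers can be reproduced from the summary statistics alone. A Python/scipy sketch (scipy is an assumption of tooling; the course uses MINITABTM):

```python
from scipy.stats import t

n, xbar, s, target = 9, 4.7889, 0.2472, 5.00

se = s / n ** 0.5                        # SE Mean = s / sqrt(n)
t_stat = (xbar - target) / se            # test statistic
t_crit = t.ppf(0.975, n - 1)             # 2.306 for a two-sided 95% CI, df = 8
ci = (xbar - t_crit * se, xbar + t_crit * se)
p_value = 2 * t.cdf(-abs(t_stat), n - 1)

print(round(t_stat, 2))                  # -2.56
print(round(ci[0], 4), round(ci[1], 4))  # 4.5989 4.9789
print(p_value < 0.05)                    # True: reject Ho, the Mean is not 5.0
```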
1-Sample t Exercise

3. Are we on Target?

Because we used the option of "Graphs", we get a nice visualization of the data in a Histogram of ppm VOC (with Ho and 95% t-confidence interval for the mean).

Because the null hypothesis falls within the confidence interval, you know we will "fail to reject" the null hypothesis and accept that the equipment is running at the target of 32.0 ppm VOC.
(Hypothesis Testing roadmap for Continuous Data.)
2 Sample t-test

A 2-sample t-test is used to compare two Means.

Stat > Basic Statistics > 2-Sample t

MINITABTM performs an independent two-sample t-test and generates a confidence interval. Use 2-Sample t to perform a Hypothesis Test and compute a confidence interval of the difference between two population Means when the population Standard Deviations, σ's, are unknown.

Notice the difference in the hypothesis for a two-tailed vs. a one-tailed test. This terminology is only used to know which column to look down in the t-table.
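As a sketch of what the test does (hypothetical data, not the Furnace.MTW worksheet; Python/scipy assumed):

```python
from scipy.stats import ttest_ind

# Hypothetical BTU.In readings for two damper types (illustrative only)
damper1 = [9.2, 10.1, 8.7, 11.3, 9.9, 10.5, 9.4, 10.8]
damper2 = [10.0, 10.6, 9.8, 11.1, 10.4, 9.7, 10.9, 10.2]

# equal_var=True is the pooled 2-sample t-test; use equal_var=False
# (Welch's test) when a test of variances says the spreads differ.
stat, p = ttest_ind(damper1, damper2, equal_var=True)

if p < 0.05:
    print("Reject Ho: the two Means differ.")
else:
    print("Fail to reject Ho: no significant difference between the Means.")
```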
Sample Size

To determine proper sample size in MINITABTM, select "Stat>Power and Sample Size>2-Sample t" and follow the same steps that were taken for 1-Sample t. Three fields must be filled in and one left blank.
2-Sample t Example
Now in Step 4.
Open the worksheet 4 . Sa mple Size:
in MINITABTM • Open the MIN ITABTM worksheet: Furnace.MTW
called:
g the data to see how the data is coded.
• Scroll through
“Furnace
Furnace.MTW”
MTW”
• In order to work with the data in the BTU.In column, we will need
How is the data to unstack the data by damper type.
coded?
2-Sample t Example
Notice the “unstacked” data for each damper. WE NOW HAVE TWO COLUMNS.
2-Sample t Example
For the field “Sample Sizes:” enter 40 space 50 because our data set has unequal sample sizes
which is not uncommon. The smallest difference that can be detected is based on the smallest
sample size, so in this case it is: 0.734.
Probability Plot of BTU.In_1 (Normal): Mean 9.908, StDev 3.020, N 40, AD 0.475, P-Value 0.228.
The data is considered Normal since the P-value is greater than 0.05.
Probability Plot of BTU.In_2 (Normal): Mean 10.14, StDev 2.767, N 50, AD 0.190, P-Value 0.895.
This is the Normality Plot for damper 2. Is the data Normal? It is Normal, continuing down the
roadmap…
(Test for Equal Variances for BTU.In by Damper. Levene's Test: Test Statistic 0.00, P-Value 0.996.)
The P-value of 0.558 indicates that there is no statistically significant difference in variance.
Box Plot

Boxplot of BTU.In by Damper.
The Box Plots do not show much of a difference between the dampers.
Two-Sample T-Test (Variances Equal)

Ho: μ1 = μ2
Ha: μ1 ≠ (or < or >) μ2

Calculated average and Standard Deviation for each sample: s = sqrt( Σ (Xi − X̄)² / (n − 1) )

The pooled Standard Deviation weights the samples by (N1 − 1) and (N2 − 1) degrees of freedom:

Sp = sqrt( ((N1 − 1)s1² + (N2 − 1)s2²) / (N1 + N2 − 2) ),  SE Mean = Sp · sqrt(1/N1 + 1/N2)
Exercise
2. Statistical Problem:
Ho:μ1 = μ2
Ha:μ1 ≠ μ2
To unstack the data follow the steps here. This will generate two new columns of data shown on the
next page…
By unstacking the data we now have the Clor.Lev data separated by the distributor it came from:
• Clor.Lev_Post_1 = Distributor 1
• Clor.Lev_Post_2 = Distributor 2
Now let's move on to determining the correct sample size.
We want to determine the smallest difference that can be detected based on our data. In this case: 0.7339, rounded to 0.734.
The results show us a P-value of 0.154 so our data is Normal. Recall if the P-value is greater than
.05 then we will consider our data Normal.
(Probability Plots of Clor.Lev_Post_1 and Clor.Lev_Post_2.)
Look at the P-value of 0.574. This tells us that there is no statistically significant difference in the variance of these two data sets. What does this mean? We can now run a 2-sample t-test with equal variances.
(Levene's Test: Test Statistic 0.00, P-Value 0.986, for Clor.Lev_Post by Distributor.)
Boxplot of Clor.Lev_Post by Distributor

"Hmm, we're a lot alike!"
The Box Plots show very little difference between the Distributors; also note the P-value in the Session Window – there is no difference between the two Distributors.
(Hypothesis Testing roadmap for Continuous Data.)
Normality Test

(Probability Plots of Sample 1 and Sample 3, Normal. Sample 1: Mean 4.853, StDev 1.020, N 100, AD 0.374, P-Value 0.411.)

Our data sets are Normally Distributed.
We use the F-Test Statistic because our data is Normally Distributed.

F-Test: Test Statistic 0.106, P-Value 0.000
Levene's Test: Test Statistic 67.073, P-Value 0.000

P-Value is less than 0.05: our variances are not equal.

(Boxplots of Raw Data, stacked, showing the Medians of the Samples.)
This is the output from MINITABTM. Notice that even though the names of the columns in MINITABTM were Sample 1 and Sample 3, MINITABTM used factor levels 1 and 2 to differentiate the outcome. We have to interpret the meaning of the factor levels properly; it is simply the difference between the samples labeled one and three in our worksheet.
UNCHECK the "Assume equal variances" box.
You can see there is very little difference in the 2-Sample t-tests.
Boxplot of Stacked by C4 (lines indicate the Sample Means).

The Box Plot shows no difference between the Means. The overall box is smaller for the sample on the left, which is an indication of the difference in variance.
Individual Value Plot of Stacked vs C4 (lines indicate the Sample Means).

By looking at this Individual Value Plot you can notice a big spread or variance in the data.
Two-Sample T-Test (Variances Not Equal)

Ho: μ1 = μ2 (P-Value > 0.05)
Ha: μ1 ≠ (or < or >) μ2 (P-Value < 0.05)

What does the P-value of 0.996 mean? After conducting a 2-sample t-test there is no significant difference between the Means.
(Hypothesis Testing roadmap for Continuous Data.)
Paired t-test

• Use the Paired t command to compute a confidence interval and perform a Hypothesis Test of the difference between population Means when observations are paired. A paired t-procedure matches responses that are dependent or related in a pair-wise manner. This matching allows you to account for variability between the pairs (delta, δ), usually resulting in a smaller error term, thus increasing the sensitivity of the Hypothesis Test or confidence interval.
– Ho: μδ = μo
– Ha: μδ ≠ μo
• Where μδ is the population Mean of the differences and μo is the hypothesized Mean of the differences, typically zero.
Example
"Just checking your souls, er…soles!"
Example (cont.)
EXH_STAT DELTA.MTW
Paired t-test Example

"Now that's a tee test!"

In MINITABTM open "Stat>Power and Sample Size>1-Sample t". Enter the appropriate Sample Size, Power Value and Standard Deviation.

MINITABTM Session Window:

Power and Sample Size
1-Sample t Test
Testing mean = null (versus not = null)
Calculating power for mean = null + difference
Alpha = 0.05  Assumed standard deviation = 1

Sample Size  Power  Difference
10           0.9    1.15456

This means we will only be able to detect a difference of only 1.15 if the Standard Deviation is equal to 1.
Given the sample size of 10 we will be able to detect a difference of 1.15. If this were your process you would need to decide if this was good enough. In this case, is a difference of 1.15 enough to practically want to change the material used for the soles of the children's shoes?
Paired t-test Example

Probability Plot of AB Delta (Normal): Mean 0.41, StDev 0.3872, N 10, AD 0.261, P-Value 0.622.
1-Sample t
Box Plot
Analyzing the Box Plot we see that the null hypothesis falls outside the confidence interval, so we
reject the null hypothesis. The P-value is also less than 0.05. Given this we are 95% confident that
there is a difference in the wear between the two materials used for the soles of children’s shoes.
Paired T-Test

Click on "Graphs" and select the graphs you would like to generate.
Boxplot of Differences (with Ho and 95% t-confidence interval for the mean)

The P-Value from this Paired T-Test tells us the difference in materials is statistically significant.
Paired T-Test and CI: Mat-A, Mat-B

Paired T for Mat-A - Mat-B

             N    Mean       StDev     SE Mean
Mat-A        10   10.6300    2.4513    0.7752
Mat-B        10   11.0400    2.5185    0.7964
Difference   10   -0.410000  0.387155  0.122429

95% CI for mean difference: (-0.686954, -0.133046)
T-Test of mean difference = 0 (vs not = 0): T-Value = -3.35  P-Value = 0.009
As you will see the conclusions are the same, but just presented differently.
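The session window values can be reproduced from the summary statistics of the differences. A Python/scipy check (an assumption of tooling; the course uses MINITABTM):

```python
from scipy.stats import t

n, dbar, s_d = 10, -0.41, 0.387155   # N, Mean and StDev of the differences

se = s_d / n ** 0.5                  # SE Mean of the differences: 0.122429
t_stat = dbar / se
p_value = 2 * t.cdf(-abs(t_stat), n - 1)
t_crit = t.ppf(0.975, n - 1)
ci = (dbar - t_crit * se, dbar + t_crit * se)

print(round(t_stat, 2))                  # -3.35
print(round(p_value, 3))                 # 0.009
print(round(ci[0], 3), round(ci[1], 3))  # -0.687 -0.133
```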
Calc>Calculator

Histogram of TX_MX-Diff (with Ho and 95% t-confidence interval for the mean).
(Hypothesis Testing roadmap for Continuous Data.)
You have now completed Analyze Phase – Hypothesis Testing Normal Data Part 1.

Notes
Analyze Phase
Hypothesis Testing Normal Data Part 2
Overview
We are now moving into Hypothesis Testing Normal Data Part 2, where we will address Calculating Sample Size, Variance Testing and Analyzing Results.

Welcome to Analyze
"X" Sifting
Inferential Statistics
Intro to Hypothesis Testing
Hypothesis Testing ND P1
– Calculate Sample Size
Wrap Up & Action Items
Tests of Variance

Tests of Variance are used for both Normal and Non-Normal data.

Normal Data
– 1 Sample to a target
– 2 Samples – F-Test
– 3 or More Samples – Bartlett's Test

Non-Normal Data
– 2 or more samples – Levene's Test
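A sketch of these choices in Python/scipy (hypothetical data; scipy provides Bartlett's and Levene's tests directly, while the two-sample F-test is built by hand from the F distribution):

```python
from statistics import variance
from scipy.stats import f, bartlett, levene

# Hypothetical measurements; the second sample is deliberately more spread out
sample1 = [4.6, 5.1, 4.8, 5.3, 4.9, 5.0, 4.7, 5.2]
sample2 = [4.4, 5.6, 4.2, 5.8, 4.6, 5.4, 4.0, 6.0]

# F-Test for 2 samples (Normal data): ratio of sample variances vs. F distribution
f_stat = variance(sample1) / variance(sample2)
df1, df2 = len(sample1) - 1, len(sample2) - 1
p_f = 2 * min(f.cdf(f_stat, df1, df2), f.sf(f_stat, df1, df2))

# Bartlett's Test (Normal data, 2 or more samples)
stat_b, p_b = bartlett(sample1, sample2)

# Levene's Test (robust choice for Non-Normal data)
stat_l, p_l = levene(sample1, sample2)

print(p_f < 0.05, p_b < 0.05)  # True True: the variances are not equal
```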
1-Sample Variance
Use the sa mple size ca lcula tions for a 1 sa mple t-test since
they a re ra rely performed w ithout performing a 1 sa mple t-
test a s w ell.
1-Sample Variance
4 . Sa mple Size:
• O pen the M IN ITABTM w ork sheet: Ex h_Sta t.M TW
• This is the sa me file used for the 1 Sa mple t ex a mple.
– W e w ill a ssume the sa mple size is a dequa te.
(Graphical Summary: Mean 4.7889, StDev 0.2472, Variance 0.0611, Skewness -0.02863, Kurtosis -1.24215, N 9; 95% Confidence Interval for Mean 4.5989 to 4.9789; 95% Confidence Interval for Median 4.6000 to 5.0772; 95% Confidence Interval for StDev.)

What does this mean from a practical standpoint? They are easier to accomplish in a process than reducing…
3. Equal variance test (F-test, since there are only 2 factors). Check for Normality.

5. Statistical Solution:

Stat>Basic Statistics>Normality Test (use Anderson-Darling)

Ho: Data is Normal
Ha: Data is NOT Normal

According to the graph we have Normal data.
Probability Plot of Rot 1 (Normal): Mean 4.871, StDev 0.9670, N 100, AD 0.306, P-Value 0.559.
Test for Equal Variances for Rot 1

F-Test: Test Statistic 0.74, P-Value 0.298
Levene's Test: Test Statistic 0.53, P-Value 0.469

(95% Bonferroni Confidence Intervals for StDevs, by factor level.)

Use the F-Test for 2 samples of Normally Distributed data. P-Value > 0.05 (0.298): assume equal variance.
Normality Test

Probability Plot of Rot (Normal): Mean 13.78, StDev 7.712, N 18, AD 0.285, P-Value 0.586.

The P-value is > 0.05, so we can assume our data is Normally Distributed.
Test for Equal Variances for Rot (by Temp)

F-Test: Test Statistic 0.68, P-Value 0.598
Levene's Test: Test Statistic 0.05, P-Value 0.824

Ho: σ1 = σ2
Ha: σ1 ≠ σ2

(95% Bonferroni Confidence Intervals for StDevs, Temp levels 10 and 16.)

P-Value > 0.05: there is no statistically significant difference.
You can see there is no statistical difference for variance in Rot based on temperature as a factor.
Since the data is Normally Distributed and we have 2 samples, use F-Test statistic.
Use the F-Test for 2 samples of Normally Distributed data.
Another method for testing for equal variance will allow more than one factor. The Labels are the
factors. The data is the Output.
This time we have Rot as the response and Temp and Oxygen as the factors.
This graph shows a test of equal variance which displays Bonferroni 95% confidence intervals for the response Standard Deviation at each level. As you will see, the Bartlett's and Levene's tests are displayed in the same graph.

Test for Equal Variances for Rot (Temp, Oxygen)

Bartlett's Test: Test Statistic 2.71, P-Value 0.744
Levene's Test: Test Statistic 0.37, P-Value 0.858

A P-value > 0.05 shows an insignificant difference between variances.

Use Bartlett's Test when the data is Normal; it matches the Session Window output (Test statistic = 2.71, p-value = 0.744).
Does the Session Window have the same P-values as the Graphical Analysis?
First we want to do a graphical summary of the two samples from the two suppliers.
In "Variables:" enter 'ppm VOC'.
The P-value is greater than 0.05 for both Anderson-Darling Normality Tests so we conclude the
samples are from Normally Distributed populations because we “failed to reject” the null hypothesis
that the data sets are from Normal Distributions.
Continue to
determine if
they are of
equal variance.
For "Response:" enter 'ppm VOC'. Note MINITABTM defaults to a 95% confidence interval, which is exactly the level we want to test for this problem.
(Test for Equal Variances for ppm VOC by RM Supplier: F-test and Levene's Test shown with 95% Bonferroni Confidence Intervals for StDevs.)

The P-value of the F-test…
(Hypothesis Testing roadmap for Continuous Data.)
Purpose of ANOVA
Analysis of Variance (ANOVA) is used to investigate and model the relationship between a response variable and one or more independent variables.
Is the between group variation large enough to be distinguished from the within group variation?
[Diagram: Total (Overall) Variation across two group distributions with Means μ1 and μ2, showing the between-group difference (δ).]
Calculating ANOVA
Where:
G = the number of groups (levels in the study)
x_ij = the individual in the jth group
n_j = the number of individuals in the jth group or level
X̿ = the grand Mean
X̄_j = the Mean of the jth group or level

Between Group Variation (δ): Σ_j n_j (X̄_j − X̿)²
Total (Overall) Variation: Σ_j Σ_i (x_ij − X̿)²
Within Group Variation: Σ_j Σ_i (x_ij − X̄_j)²

Calculating ANOVA

1 − (1 − α)^k
The reason we don’t use a t-test to evaluate a series of Means is because the alpha risk increases as the number of Means increases. If we had 7 pairs of Means and an alpha of 0.05, our actual alpha risk could be as high as 30%. Notice we did not say it was 30%, only that it could be as high as 30%, which is quite unacceptable.
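The inflation of alpha risk described above can be checked directly with the 1 − (1 − α)^k formula. A minimal sketch (the seven comparisons and α = 0.05 come from the example above):

```python
# Family-wise alpha risk when making k independent comparisons,
# each at significance level alpha: 1 - (1 - alpha)^k
alpha = 0.05
k = 7  # seven pairs of Means, as in the example above

family_risk = 1 - (1 - alpha) ** k
print(f"Risk of at least one false alarm: {family_risk:.3f}")  # about 0.302
```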
Three Samples
We have three potential suppliers that claim to have equal levels of quality. Supplier B provides a considerably lower purchase price than either of the other two vendors. We would like to choose the lowest cost supplier but we must ensure that we do not affect the quality of our raw material.
We would like to test the data to determine whether there is a difference between the three suppliers.
[Graphs: Probability Plots (Normal) for the three suppliers. Supplier A: StDev 0.4401, N 5, AD 0.246, P-Value 0.568. Supplier B: Mean 3.968, StDev 0.2051, N 5, AD 0.314, P-Value 0.385. Supplier C: Mean 4.03, StDev 0.4177, N 5, AD 0.148, P-Value 0.910.]
[Graph: Test for Equal Variances for the three suppliers, with 95% Bonferroni Confidence Intervals for StDevs. Bartlett's Test: P-Value 0.348. Levene's Test: Test Statistic 0.59, P-Value 0.568.]
ANOVA in MINITABTM
Stat>ANOVA>One-Way Unstacked

[Graph: Individual value plot of the Data (approximately 3.0 to 4.2) for Supplier A, Supplier B, and Supplier C.]
Looking at the P-value the conclusion is we fail to reject the null hypothesis. According to the data
there is no significant difference between the Means of the 3 suppliers.
[Roadmap: Normal Data, P-value > .05, No Difference.]

Stat>ANOVA>One-Way

ANOVA
Before looking up the F critical value you must first know what the degrees of freedom are. The ANOVA test statistic uses variance between the Means divided by variance within the groups. Therefore, the numerator degrees of freedom would be 3 suppliers minus 1, or 2 degrees of freedom. The denominator would be 5 samples minus 1 (for each supplier) multiplied by 3 suppliers, or 12 degrees of freedom. As you can see the critical F value is 3.89, and since the calculated F of 1.40 is not close to the critical value we fail to reject the null hypothesis.
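The same comparison of calculated F against critical F can be sketched outside Minitab with scipy. The supplier measurements below are hypothetical stand-ins (the book's raw data is not reproduced here), but the critical value for 2 and 12 degrees of freedom is the 3.89 quoted above:

```python
from scipy import stats

# Hypothetical measurements, five per supplier (placeholders, not the book's data)
supplier_a = [3.2, 3.5, 3.9, 3.6, 4.1]
supplier_b = [3.8, 4.0, 3.9, 4.1, 4.0]
supplier_c = [3.6, 4.3, 4.0, 4.2, 4.0]

f_calc, p_value = stats.f_oneway(supplier_a, supplier_b, supplier_c)

# Critical F at alpha = 0.05 with 3 - 1 = 2 numerator and
# 3 * (5 - 1) = 12 denominator degrees of freedom
f_crit = stats.f.ppf(0.95, dfn=2, dfd=12)
print(f"F-calc = {f_calc:.2f}, F-critical = {f_crit:.2f}")  # F-critical = 3.89
```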
Test for Equal Variances: Suppliers vs ID

One-way ANOVA: Suppliers versus ID

Analysis of Variance for Supplier
Source      DF      SS      MS     F      P
ID           2   0.384   0.192  1.40  0.284
Error       12   1.641   0.137
Total       14   2.025

                                     Individual 95% CIs For Mean
                                     Based on Pooled StDev
Level       N    Mean   StDev   ----------+---------+---------+------
Supplier A  5  3.6640  0.4401   (-----------*-----------)
Supplier B  5  3.9680  0.2051           (-----------*-----------)
Supplier C  5  4.0300  0.4177            (-----------*-----------)
                                ----------+---------+---------+------
Pooled StDev = 0.3698              3.60      3.90      4.20

[Table: excerpt of the F distribution table. With 2 numerator and 12 denominator degrees of freedom the critical F value is 3.89, which is compared to the calculated F of 1.40.]
Sample Size
Let’s check on how much difference we can see with a sample of 5.

Will having a sample of 5 show a difference? After crunching the numbers, a sample of 5 can only detect a difference of 2.56 Standard Deviations, which means that the Means would have to be at least 2.56 Standard Deviations apart before we could see a difference. To alleviate this problem a larger sample should be used. With a larger sample you would be able to have a more sensitive reading for the Means and the variance.

Power and Sample Size
One-way ANOVA
Alpha = 0.05  Assumed Standard Deviation = 1  Number of Levels = 3
Sample Size  Power  SS Means  Maximum Difference
          5    0.9   3.29659             2.56772
The sample size is for each level.
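The power figure Minitab reports can be reproduced from the noncentral F distribution. A sketch, assuming the quoted values (n = 5 per level, 3 levels, σ = 1, SS of the Means 3.29659) and that power is computed with noncentrality λ = n × SS(Means)/σ²:

```python
from scipy import stats

# Values quoted in the Minitab "Power and Sample Size" output above
n, levels, alpha = 5, 3, 0.05
ss_means = 3.29659            # sum of squared deviations of the level Means
sigma = 1.0                   # assumed Standard Deviation

# Noncentrality parameter of the F distribution under the alternative
nc = n * ss_means / sigma**2
dfn, dfd = levels - 1, levels * (n - 1)

f_crit = stats.f.ppf(1 - alpha, dfn, dfd)
power = 1 - stats.ncf.cdf(f_crit, dfn, dfd, nc)
print(f"Power = {power:.3f}")  # approximately 0.9, as in the table above
```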
ANOVA Assumptions
1. Observations are adequately described by the model.
2. Errors are Normally and independently distributed.
3. Homogeneity of variance among factor levels.
Residual Plots
To generate the residual plots in MINITABTM select “Stat>ANOVA>One-way Unstacked>Graphs”,
then select “Individual value plot” and check all three types of plots.
Stat>ANOVA>One-Way Unstacked>Graphs
Histogram of Residuals

[Graph: Histogram of the Residuals (responses are Supplier A, Supplier B, Supplier C), residuals ranging from about -0.6 to 0.6.]

A Normality plot of the residuals should follow a straight line. The results of our example look good; the Normality assumption is satisfied.
[Graph: Normal Probability Plot of the Residuals (responses are Supplier A, Supplier B, Supplier C), residuals from about -1.0 to 1.0 falling close to a straight line.]
2-Sample t Example
For the field “Sample Sizes:” enter 40 space 50 because our data set has unequal sample sizes, which is not uncommon. The smallest difference that can be detected is based on the smallest sample size; in this case it is 0.734.
[Graph: Residuals Versus the Fitted Values (responses are Supplier A, Supplier B, Supplier C), residuals from -0.50 to 0.75 across fitted values 3.65 to 4.05, showing a random pattern.]
ANOVA Exercise
In “Variables:” enter ‘ppm VOC’.
In “By Variables:” enter ‘Shift’.
[Graphs: Graphical Summary for ppm VOC by Shift.
Shift = 2: Anderson-Darling Normality Test A-Squared 0.37, P-Value 0.334; Mean 34.625, StDev 5.041, Variance 25.411, Skewness -0.74123, Kurtosis 1.37039, N 8; Minimum 25.000, 1st Quartile 31.750, Median 35.500, 3rd Quartile 37.000, Maximum 42.000; 95% CI for Mean (30.411, 38.839), for Median (30.614, 37.322), for StDev (3.333, 10.260).
Shift = 3: Anderson-Darling Normality Test A-Squared 0.24, P-Value 0.658; Mean 28.000, StDev 6.525, Variance 42.571, Skewness 0.06172, Kurtosis -1.10012, N 8; Minimum 19.000, 1st Quartile 22.000, Median 28.000, 3rd Quartile 32.750, Maximum 38.000; 95% CI for Mean (22.545, 33.455), for Median (20.871, 33.322), for StDev (4.314, 13.279).]
[Graph: Test for Equal Variances for ppm VOC by Shift, with 95% Bonferroni Confidence Intervals for StDevs. Bartlett's Test: Test Statistic 0.63, P-Value 0.729. Levene's Test: Test Statistic 0.85, P-Value 0.440.]
Since our residuals look Normally Distributed and randomly patterned, we will assume our analysis is
correct.
[Graphs: four-in-one residual plots for the ANOVA; Normal probability plot of the residuals, residuals versus the fitted values, histogram of the residuals, and residuals versus observation order.]
Since the P-value of the ANOVA test is less than 0.05, we “reject” the null hypothesis that the Mean
product quality as measured in ppm VOC is the same from all shifts.
We “accept” the alternate hypothesis that the Mean product quality is different from at least one shift.
You have now completed Analyze Phase – Hypothesis Testing Normal Data Part 2.
Notes
Analyze Phase
Hypothesis Testing Non-Normal Data
Part 1
Overview
The core fundamentals of this phase are Equal Variance Tests and Tests for Medians. We will examine the meaning of each of these and show you how to apply them.

[Roadmap: Welcome to Analyze; “X” Sifting; Inferential Statistics; Intro to Hypothesis Testing; Hypothesis Testing ND P1; Hypothesis Testing ND P2; Hypothesis Testing NND P1: Equal Variance Tests, Tests for Medians; Hypothesis Testing NND P2; Wrap Up & Action Items.]
At this point we have covered the tests for determining significance for Normal Data. We will continue
to follow the roadmap to complete the test for Non-Normal Data with Continuous Data.
Later in the module we will use another roadmap that was designed for Discrete data.
Recall that Discrete data does not follow a Normal Distribution, but because it is not
Continuous Data, there are a separate set of tests to properly analyze the data.
1 Sample t
Why do we care if a data set is Normally Distributed?
When it is necessary to make inferences about the true nature of the
population based on random samples drawn from the population.
When the two indices of interest (X-Bar and s) depend on the data
being Normally Distributed.
For problem solving purposes, because we don’t want to make a bad decision: having Normal Data is so critical that with EVERY statistical test, the first thing we do is check for Normality of the data.
Recall the four primary causes for Non-normal data:
Skewness – Natural and Artificial Limits
Mixed Distributions - Multiple Modes
Kurtosis
Granularity
We will focus on skewness for the remaining tests for Continuous Data.
[Roadmap: Continuous Data, Non-Normal.]
Now we will continue down the Non-Normal side of the roadmap. Notice this slide is primarily for tests
of Medians.
Sample Size
– Ho: σ1 = σ2 = σ3 …
– Ha: At least one is different.
You have already seen this command in the last module, this is simply the application for Non-
Normal data. The question is: are any of the Standard Deviations or variances statistically different?
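For Non-Normal Data this hypothesis is usually checked with Levene's test, which does not rely on Normality. A minimal sketch outside Minitab with made-up skewed samples (the data and group names are illustrative only):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)
# Three made-up skewed samples standing in for process data
group1 = rng.exponential(scale=1.0, size=30)
group2 = rng.exponential(scale=1.0, size=30)
group3 = rng.exponential(scale=1.5, size=30)

# center='median' gives the robust Brown-Forsythe variant of Levene's test
stat, p_value = stats.levene(group1, group2, group3, center="median")
if p_value < 0.05:
    print("At least one variance is different")
else:
    print("No significant difference in variances")
```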
P-Value < 0.05 (0.000): assume the data is not Normally distributed.

[Graph: probability plot of Rot 2, from the data file EXH_AOV.MTW.]
[Graph: Test for Equal Variances for Rot 2 by Factors2 (levels 1 and 2), with 95% Bonferroni Confidence Intervals for StDevs. F-Test: Test Statistic 1.75, P-Value 0.053. Levene's Test: Test Statistic 0.03, P-Value 0.860.]
When testing more than 2 samples with a Normal distribution, use Bartlett's Test to determine whether multiple Normal distributions have equal variance. Levene's Test is our focus for this module when working with Non-Normal distributions.
For the Data to be Normal the P-value must be greater than 0.05.
Based on the P-value, the variables being analyzed are Non-normal Data. As you can see the data illustrates a P-value of 0.247, which is more than 0.05. As a result, there is no significant difference between the variances of CallperWk1 and CallperWk2.
Nonparametric Tests
Non-parametric Hypothesis Testing works the same way as parametric testing: evaluate the P-value in the same manner.

[Diagram: a target Median compared with the sample Medians X̃1 and X̃2.]
MINITABTM’s Nonparametrics
1-Sample Sign: performs a one-sample sign test of the Median and calculates the corresponding
point estimate and confidence interval. Use this test as an alternative to one-sample Z and one-
sample t-tests.
1-Sample Wilcoxon: performs a one-sample Wilcoxon signed rank test of the Median and calculates the corresponding point estimate and confidence interval (more discriminating or efficient than the sign test). Use this test as a nonparametric alternative to one-sample Z and one-sample t-tests.
Mann-Whitney: performs a Hypothesis Test of the equality of two population Medians and
calculates the corresponding point estimate and confidence interval. Use this test as a
nonparametric alternative to the two-sample t-test.
Kruskal-Wallis: performs a Hypothesis Test of the equality of population Medians for a one-way design. This test is more powerful than Mood’s Median (the confidence interval is narrower, on average) for analyzing data from many populations, but is less robust to outliers. Use this test as an alternative to the one-way ANOVA.
Mood’s Median Test: performs a Hypothesis Test of the equality of population Medians in a one-
way design. Test is similar to the Kruskal-Wallis Test. Also referred to as the Median test or sign
scores test. Use as an alternative to the one-way ANOVA.
1-Sample Example
4. Sample Size:
This data set has 500 samples (well in excess of necessary sample size).
The Statistical Problem is: The null hypothesis is that the Median is equal to 63 and the
alternative hypothesis is the Median is not equal to 63.
Open the MINITABTM Data File: “DISTRIB1.MTW”. Next you have a choice of either performing a
1-Sample Sign Test or 1-Sample Wilcoxon Test because both will test the Median against a
target. For this example we will perform a 1-Sample Sign Test.
1-Sample Example
Sign Test for Median: Pos Skew

Sign test of Median = 63.00 versus not = 63.00

            N  Below  Equal  Above       P  Median
Pos Skew  500     37      0    463  0.0000   65.70
As you can see the P-value is less than 0.05, so we must reject the null hypothesis, which means we have data that supports the alternative hypothesis that the Median is different than 63. The actual Median of 65.70 is shown in the Session Window. Since the Median is greater than the target value, it seems the new process is not as good as we may have hoped.
Perform the same steps as the 1-Sample Sign to use the 1-sample Wilcoxon.
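Under the hood, the 1-Sample Sign Test reduces to a binomial test on the counts above and below the hypothesized Median. A sketch using the counts from the Session Window above (37 below, 463 above, ties dropped):

```python
from scipy import stats

below, above = 37, 463  # counts from the Session Window above

# Under H0 (Median = 63), each observation is equally likely to fall
# above or below 63, so the count above follows Binomial(n, 0.5)
result = stats.binomtest(above, n=below + above, p=0.5)
print(f"P-value = {result.pvalue:.4g}")  # far below 0.05: reject H0
```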
1-Sample Example
For the 1-sample sign test, select a confidence interval level of 95%. As you can see this yields a resulting interval of 65.26 to 66.50. The NLI means a non-linear interpolation method was used to estimate the confidence intervals. As you can see the confidence interval is very narrow.
Since the target of 63 is not within the confidence interval, reject the null hypothesis.
As you will see the confidence interval is even tighter for the Wilcoxon test. Therefore we reject
the null, the Median is higher than the target of 63. Unfortunately, the Median was higher than
the target which is not the desired direction.
HYPOTESTSTUD.MPJ
Stat>Nonparametrics>1-Sample Sign
The Black Belt in this case agrees the Mine Manager is achieving his target of 2.1 tons/day.
We agree!
Mann-Whitney Example
The Mann-Whitney test is used to test if the Medians for 2 samples are different.

2. Ho: M1 = M2
   Ha: M1 ≠ M2
3. Mann-Whitney test.
4. There are 200 data points for each machine, well over the minimum sample necessary.
Mann-Whitney Example
5. Statistical Conclusion

[Graph: Probability Plot of Mach A (Normal). Mean 16.73, StDev 5.284, N 200, AD 0.630, P-Value 0.099.]

When looking at the Probability Plots, one machine yields a P-value less than .05 and the other is Normal. The good news is when performing a Nonparametric Test of 2 Samples, only one has to be Normal. With that, perform the Mann-Whitney test. Since zero (the difference between the 2 Medians) is not contained within the confidence interval we reject the null hypothesis. Also, the last line in the Session Window, where it says “…is significant at 0.0010”, is the equivalent of a P-value for the Mann-Whitney test.

6. Practical Conclusion: The Medians of the machines are different.

Stat>Nonparametric>Mann-Whitney…

If the samples are the same, zero would be included within the confidence interval.

Mann-Whitney Test and CI: Mach A, Mach B

            N  Median
Mach A    200  14.841
Mach B    200  16.346

Point estimate for ETA1-ETA2 is -1.604
95.0 Percent CI for ETA1-ETA2 is (-2.635,-0.594)
W = 36509.0
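The same test is available outside Minitab. A sketch with small made-up samples (the values are illustrative stand-ins, not the Mach A/Mach B data):

```python
from scipy import stats

# Illustrative cycle-time samples from two machines (not the book's data)
mach_a = [14.2, 15.1, 13.8, 14.9, 15.4, 14.0, 13.5, 14.7]
mach_b = [16.1, 15.8, 16.9, 17.2, 15.5, 16.4, 17.0, 16.2]

# Two-sided test of H0: the two population Medians are equal
u_stat, p_value = stats.mannwhitneyu(mach_a, mach_b, alternative="two-sided")
print(f"P-value = {p_value:.4f}")
if p_value < 0.05:
    print("Reject H0: the Medians are different")
```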
Exercise
The final 2 tests are the Mood’s Median and the Kruskal-Wallis.
Certified Lean Six Sigma Black Belt Book Copyright OpenSourceSixSigma.com
381
[Graph: Summary for Recovery, Location = Savannah. Anderson-Darling Normality Test: A-Squared 0.81, P-Value 0.032; Mean 87.660, StDev 7.944, Variance 63.113, Skewness -0.15286, Kurtosis -1.11764, N 25; Minimum 75.300, 1st Quartile 79.000, Median 87.500, 3rd Quartile 96.550, Maximum 99.200; 95% CI for Mean (84.381, 90.939), for Median (86.179, 90.080), for StDev (6.203, 11.052).]
Notice evidence of outliers in at least 2 of the 3 populations. You could do a Box Plot to get a clearer idea about the outliers.
[Graphs: Summary for Recovery, Location = Bangor. Anderson-Darling Normality Test: A-Squared 0.72, P-Value 0.045; Mean 93.042, StDev 5.918, Variance 35.017, Skewness -1.81758, Kurtosis 4.66838, N 13; Minimum 76.630, 1st Quartile 90.600, Median 94.800, 3rd Quartile 97.350, Maximum 99.700; 95% CI for Mean (89.466, 96.617), for Median (90.637, 97.036), for StDev (4.243, 9.768).

Summary for Recovery, Location = Ankhar. Anderson-Darling Normality Test: A-Squared 0.86, P-Value 0.022; Mean 88.302, StDev 6.929, Variance 48.008, Skewness -0.105610, Kurtosis 0.182123, N 20; Minimum 73.500, 1st Quartile 85.150, Median 88.425, 3rd Quartile 89.700, Maximum 99.450; 95% CI for Mean (85.059, 91.545), for Median (86.735, 89.299), for StDev (5.269, 10.120).]
[Graph: Test for Equal Variances for Recovery by Location (Ankhar, Bangor, Savannah), with 95% Bonferroni Confidence Intervals for StDevs. Bartlett's Test: Test Statistic 1.33, P-Value 0.514. Levene's Test: Test Statistic 1.02, P-Value 0.367.]
Statistical Solution: Since the P-value of the Mood’s Median test is less than 0.05, we reject the null hypothesis.

Practical Solution: Bangor has the highest recovery of all three facilities.

We observe the confidence intervals for the Medians of the 3 populations. Note there is no overlap of the 95% confidence intervals for Bangor, so we visually know the P-value is below 0.05.

Mood Median Test: Recovery versus Location
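Mood's Median test is also available outside Minitab as scipy's median test. The recovery figures below are illustrative stand-ins for the three locations (not the book's data):

```python
from scipy import stats

# Illustrative recovery percentages for three locations (not the book's data)
savannah = [87.5, 79.0, 96.5, 75.3, 99.2, 86.2, 90.1, 88.4]
bangor = [94.8, 90.6, 97.3, 99.7, 92.5, 95.1, 96.0, 93.8]
ankhar = [88.4, 85.1, 89.7, 73.5, 99.4, 86.7, 88.0, 87.2]

# Counts above/below the grand Median form a contingency table,
# which is then tested with a chi-square statistic
stat, p_value, grand_median, table = stats.median_test(savannah, bangor, ankhar)
print(f"Grand Median = {grand_median:.1f}, P-value = {p_value:.3f}")
```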
Kruskal-Wallis Test
Using the same data set, analyze using the Kruskal-Wallis test.
H = 6.86 DF = 2 P = 0.032
H = 6.87 DF = 2 P = 0.032 (adjusted for ties)
When comparing the Kruskal-Wallis test to the Mood’s Median test, the Kruskal-Wallis test is more powerful, though less robust to outliers. In this case the Kruskal-Wallis Test illustrated the same conclusion.
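The Kruskal-Wallis version of the same comparison can be sketched the same way, again with illustrative data (not the book's Recovery values):

```python
from scipy import stats

# Illustrative recovery percentages for three locations (not the book's data)
savannah = [87.5, 79.0, 96.5, 75.3, 99.2, 86.2, 90.1, 88.4]
bangor = [94.8, 90.6, 97.3, 99.7, 92.5, 95.1, 96.0, 93.8]
ankhar = [88.4, 85.1, 89.7, 73.5, 99.4, 86.7, 88.0, 87.2]

# H0: all population Medians are equal
h_stat, p_value = stats.kruskal(savannah, bangor, ankhar)
print(f"H = {h_stat:.2f}, P-value = {p_value:.3f}")
```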
Exercise
Unequal Variance
Example
This is an example of comparable products. To view these graphs open the data set “Var_Comp.mtw”.

Model A and Model B are similar in nature (not exact), but are manufactured in the same plant.
– Check for Normality: Var_Comp.mtw
[Graphs: Normality plots for Model A (8.5 to 12.0) and Model B (-5.0 to 10.0).]
Does Model B have a larger variance than Model A? The Median for Model B is much lower. How can we capitalize on our knowledge of the process? Let’s look at the data demographics to help us explain the differences between the two processes.
[Graph: Test for Equal Variances for Model A and Model B, with 95% Bonferroni Confidence Intervals for StDevs. Test Statistic 4.47, P-Value 0.049.]
Data Demographics
What clues can explain the difference in variances? This example illustrates how Non-normal Data
can have significant informational content as revealed through data demographics. Sometimes this
is all that is needed to draw conclusions.
[Graph: Dotplot of Model A, Model B over the range -0.0 to 11.2, with the Medians marked.]
Now let’s look at the MINITABTM Session Window. As you can see the P-value is greater than 0.05.

Next we are going to check for variance. Before performing a Test for Equal Variance, should the data be stacked?

Therefore we fail to reject the null hypothesis; there is no difference between a potential Black Belt’s degree and performance.
You have now completed Analyze Phase – Hypothesis Testing Non-Normal Data Part 1.
Notes
Analyze Phase
Hypothesis Testing Non-Normal Data
Part 2
Overview
The core fundamentals of this phase are Tests for Proportions and Contingency Tables. We will examine the meaning of each of these and show you how to apply them.

[Roadmap: Welcome to Analyze; “X” Sifting; Inferential Statistics; Intro to Hypothesis Testing; Hypothesis Testing ND P1; Hypothesis Testing ND P2; Hypothesis Testing NND P1; Hypothesis Testing NND P2: Tests for Proportions, Contingency Tables; Wrap Up & Action Items.]
[Roadmap: Attribute Data. One Factor: One Sample, Two Samples. Two Factors: Two or More Samples.]
We will now continue with the roadmap for Attribute Data. Since Attribute Data is Non-normal by
definition, it belongs in this module on Non-normal Data.
For Continuous Data:
– Capability analysis: a minimum of 30 samples
– Hypothesis Testing: depends on the practical difference to be detected and the inherent variation in the process.

For Attribute Data:
– Capability analysis: a lot of samples
– Hypothesis Testing: a lot, but depends on the practical difference to be detected.

The hypotheses:
– H0: p = p0
– Ha: p ≠ p0

Z_obs = (p̂ − p0) / √( p0(1 − p0) / n )
Now let’s try an example:

4. Sample size:

Take note of how quickly the sample size increases as the alternative proportion goes up. It would require 1402 samples to tell a difference between 98% and 99% accuracy. Our sample of 500 will do because the alternative hypothesis is 96% according to the proportion formula.

After you analyze the data you will see the statistical conclusion is to reject the null hypothesis. What is the Practical Conclusion? (The process is not performing to the desired accuracy of 99%.)

As you can see the Sample Size should be at least 4073 to prove our hypothesis. Yes, you get your bonus since .80 is not within the confidence interval. Because the improvement was 84%, the sample size was sufficient.

Answer: Use an alternative proportion of .82 and a hypothesized proportion of .80; n = 4073. Either you’d better ship a lot of stuff or you’d better improve the process more than just 2%!
Now let us calculate if we receive our bonus…

Out of the 2000 shipments, 1680 were accurate. Was the sample size sufficient?

p̂ = X / n = 1680 / 2000 = 0.84
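The bonus question can be checked with an exact binomial test; the counts (1680 accurate out of 2000 shipments, hypothesized proportion .80) come from the example above:

```python
from scipy import stats

# H0: p = 0.80 versus Ha: p != 0.80, with 1680 accurate of 2000 shipments
result = stats.binomtest(1680, n=2000, p=0.80)
print(f"p-hat = {1680 / 2000:.2f}, P-value = {result.pvalue:.5f}")

# 95% confidence interval for the true proportion
ci = result.proportion_ci(confidence_level=0.95)
print(f"95% CI: ({ci.low:.3f}, {ci.high:.3f})")  # 0.80 falls outside: reject H0
```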
Z_obs = (p̂1 − p̂2 − D) / √( p̂1(1 − p̂1)/n1 + p̂2(1 − p̂2)/n2 )

This is compared to Z critical = Z_α/2.
α    δ    p1    p2    n
5%  .01  0.79  0.8  ___________
5%  .01  0.81  0.8  ___________
5%  .02  0.08  0.1  ___________
5%  .02  0.12  0.1  ___________
5%  .01  0.47  0.5  ___________
5%  .01  0.53  0.5  ___________

Answers: 34,247; 32,986; 4,301; 5,142; 5,831; 5,831
Power and Sample Size

Test for Two Proportions

Testing proportion 1 = proportion 2 (versus not =)
Calculating power for proportion 2 = 0.95
Alpha = 0.05

              Sample  Target
Proportion 1    Size   Power  Actual Power
0.85             188     0.9      0.901451

The sample size is for each group.
A sample of at least 188 is necessary for each group to be able to detect a 10% difference. If you have reason to believe your improved process has only improved to 90% and you would like to be able to prove that improvement is occurring, the sample size of 188 is not appropriate. Recalculate using .90 for proportion 2 and leave proportion 1 at .85. It would require a sample size of 918 for each sample!
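The 188-per-group figure can be reproduced with the textbook Normal-approximation sample-size formula for two proportions (α = 0.05 two-sided, power 0.9); this is a standard approximation that lands on the same answer here:

```python
from math import ceil, sqrt
from scipy import stats

p1, p2 = 0.85, 0.95
alpha, power = 0.05, 0.90
z_a = stats.norm.ppf(1 - alpha / 2)   # about 1.96
z_b = stats.norm.ppf(power)           # about 1.28

# Classic two-proportion sample-size formula:
# pooled variance under H0, unpooled variance under Ha
p_bar = (p1 + p2) / 2
n = (z_a * sqrt(2 * p_bar * (1 - p_bar))
     + z_b * sqrt(p1 * (1 - p1) + p2 * (1 - p2))) ** 2 / (p2 - p1) ** 2
print(f"n per group = {ceil(n)}")  # 188
```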
The data shown was gathered for two processes. The following data were taken:

                     Total Samples  Accurate
Before Improvement             600       510
After Improvement              225       212

Calculate proportions:

Before Improvement: 600 samples, 510 accurate: p̂1 = X1 / n1 = 510 / 600 = 0.85
After Improvement: 225 samples, 212 accurate: p̂2 = X2 / n2 = 212 / 225 = 0.942

Difference = p(1) - p(2)
Estimate for difference: -0.0922222
95% CI for difference: (-0.134005, -0.0504399)
Test for difference = 0 (vs not = 0): Z = -4.33  P-Value = 0.000
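The Z = -4.33 in the Session Window can be reproduced by hand with the unpooled Z_obs formula shown earlier, using the before/after counts:

```python
from math import sqrt
from scipy import stats

x1, n1, x2, n2 = 510, 600, 212, 225
p1, p2 = x1 / n1, x2 / n2

# Unpooled standard error, matching the Z_obs formula for two proportions
se = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
z = (p1 - p2 - 0) / se            # D = 0 under the null hypothesis
p_value = 2 * stats.norm.sf(abs(z))
print(f"Z = {z:.2f}, P-value = {p_value:.5f}")  # Z = -4.33, P well below 0.05
```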
1. Who is worse?
2. Is the sample size large enough?

Boris: p̂1 = X1 / n1 = 47 / 356 = 0.132
Igor:  p̂2 = X2 / n2 = 99 / 571 = 0.173
Results: As you can see we fail to reject the null hypothesis with the data given. One conclusion is the sample size is not large enough. It would take a minimum sample of 1673 to distinguish the sample proportions for Boris and Igor.

Now let’s see what the minimum sample size will be…

Stat>Power and Sample Size>2 Proportions

              Sample  Target
Proportion 1    Size   Power  Actual Power
0.17            1673     0.9      0.900078
Contingency Tables
Contingency Tables are used to simultaneously compare more than two sample proportions with each other.

Statisticians have shown that the following statistic forms a chi-square distribution when H0 is true:

Σ (observed − expected)² / expected

where “observed” is the sample frequency, “expected” is the calculated frequency based on the null hypothesis, and the summation is over all cells in the table.

That? …oh, that’s my contingency table!
Chi-square Test

χ² = Σ_{i=1..r} Σ_{j=1..c} (O_ij − E_ij)² / E_ij

Where:
O = the observed value (from sample data)
E = the expected value
E_ij = (F_row × F_col) / F_total
r = number of rows
c = number of columns
F_row = total frequency for that row
F_col = total frequency for that column
F_total = total frequency for the table

χ²_critical = χ²_{α,ν}  (from the Chi-Square Table)
ν = degrees of freedom [(r-1)(c-1)]
Wow! Can you believe this is the math behind a Contingency Table? Thank goodness for MINITABTM. Now let’s do an example.
Note the data gathered in the table. Curley isn’t looking too good right now (as if he ever did).
0.306 × 45 = 13.8
0.694 × 38 = 26.4

(observed - expected)² / expected
The final step is to create a summary table including the observed chi-squared.
Critical Value:
• Like any other Hypothesis Test, compare the observed statistic with the critical statistic. We decide α = 0.05; what else do we need to know?
• For a chi-square distribution, we need to specify ν. In a contingency table: ν = (r - 1)(c - 1), where r = # of rows and c = # of columns.
• In our example, we have 2 rows and 3 columns, so ν = 2.
• What is the critical chi-square? For a Contingency Table, all the risk is in the right-hand tail (i.e. a one-tail test); look it up in MINITABTM (Calc>Probability Distributions>Chisquare…).

χ²_crit = 5.99

[Graph: chi-square distribution with the Accept region to the left of χ²_crit = 5.99 and the Reject region to the right; χ²_obs = 7.02 falls in the Reject region.]
Using M IN ITABTM
As you can see the data confirms: to reject the null hypothesis.
Chi-Square Test (Stat>Tables>Chi-Square Test)

Expected counts are printed below observed counts

         Moe    Larry   Curley   Total
1          5        8       20      33
        7.64    11.61    13.75
       0.912    1.123    2.841
2         20       30       25      75
       17.36    26.39    31.25
       0.401    0.494    1.250
Total     25       38       45     108
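As a cross-check outside MINITAB™, here is a sketch of the same test in Python using `scipy.stats.chi2_contingency` on the observed counts above:

```python
import numpy as np
from scipy.stats import chi2_contingency

# Observed counts: rows are the two outcome categories, columns are Moe, Larry, Curley.
observed = np.array([[5, 8, 20],
                     [20, 30, 25]])

chi2_obs, p_value, dof, expected = chi2_contingency(observed)

print(round(chi2_obs, 2))  # observed chi-square, about 7.02
print(dof)                 # (2 - 1)(3 - 1) = 2
print(p_value < 0.05)      # True: below alpha, so reject the null hypothesis
```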
Quotations Exercise
• You are the quotations manager and your team thinks that the
reason you don’t get a contract depends on its complexity.
• You determine a way to measure complexity and classify lost
contracts as follows:
Secondly, in MINITAB™ perform a Chi-Square Test:
Stat>Tables>Chi-Square Test
Overview
You have now completed Analyze Phase – Hypothesis Testing Non-Normal Data Part 2.
Notes
Analyze Phase
Wrap Up and Action Items
• Embracing change
• Continuous learning
• Being tenacious and courageous
• Making data-based decisions
• Being rigorous
• Thinking outside of the box
Each “player” in the Six Sigma process must be A ROLE MODEL for the Six Sigma culture.
A Six Sigma Black Belt tends to take on many roles; these behaviors will help you through the journey.
Analyze Deliverables
Sample size is dependent on the type of data.
• Listed below are the Analyze Phase deliverables that each candidate will present in a PowerPoint presentation at the beginning of the Control Phase training.
• At this point you should all understand what is necessary to provide
these deliverables in your presentation.
– Team Members (Team Meeting Attendance)
– Primary Metric
– Secondary Metric(s)
– Data Demographics
– Hypothesis Testing (applicable tools)
– Modeling (applicable tools)
– Strategy to reduce X’s
– Project Plan
– Issues and Barriers

It’s your show!
DMAIC Roadmap

[Flowchart: Define (Estimate COPQ, Establish Team) → Measure (Collect Data) → Analyze Phase: Statistically Significant? → Practically Significant? → Root Cause? → Identify Root Cause → Update FMEA, looping back to data collection when the answer is No.]
General Questions
• Are there any issues or barriers that prevent you from completing this phase?
• Do you have adequate resources to complete the project?
This is a template that should be used with each project to assure you take the proper steps –
remember, Six Sigma is very much about taking steps. Lots of them and in the correct order.
WHAT | WHO | WHEN | WHY | WHY NOT | HOW

You’re on your way!
Notes
Analyze Phase
Quiz
Now we will see what you have retained from the Analyze Phase of the course. Please answer
these questions to the best of your ability without referencing the text. The answers are in the
Appendix. Please check your answers against the answers provided and review the sections in
the Analyze Phase where your retention of the knowledge is less than you desire.
1. The Multi-Vari Chart was originally designed to show variation from 3 primary sources:
Within unit, Between unit, and Temporal (or over time).
True False
2. One Six Sigma tool that helps to screen factors by using graphical techniques to logically subgroup multiple Discrete X´s plotted against a Continuous Y is known as a ________________________ Chart. (fill in the blank)
4. As the sample size becomes large, the new distribution of Means will form a Normal Distribution, no matter what the shape of the population distribution of individuals is. This concept is known as the Central Limit Theorem.
True False
5. Which of the following statements are true regarding Hypothesis Testing? (check all that
apply)
A. A Hypothesis Test is an a priori theory relating to differences between variables
B. A statistical test or Hypothesis Test is performed to prove or disprove the theory
C. A Hypothesis Test converts the Practical Problem into a Statistical Problem.
D. A Hypothesis Test illustrates short-term results
6. What are the four primary causes for Non-normal Data? (check all that apply)
A. Skewness
B. Mixed Distributions
C. Kurtosis
D. Formulosis
E. Granularity
7. When a data set is Normally Distributed, making inferences about the true nature of the
population based on random samples drawn from the population is an example of using
Non-normal Data.
True False
8. From the list below, which is the best example of a Mann-Whitney Test? (check all that
apply)
A. Determine if one of a few machines has a different Mean cycle time
B. Determine if one of a few machines has a different Median cycle time
C. Determine if document A and document B have different Mean cycle times
D. Determine if document A and document B have different Median cycle times
10. Having Unequal variance is a result of similar distributions having: (check all that apply)
A. Extreme tails
B. Outliers
C. Multiple Modes
D. Having the tails of the distribution equal each other
11. Conducting a Capability Analysis using Attribute Data should contain a lot of samples to be statistically sound.
True False
12. Contingency Tables are used to: (check all that apply)
A. Illustrate one tail proportion
B. Compare more than two sample proportions with each other
C. Contrast the outliers under the tail
D. Analyze the ´´what if´´ scenario
13. Contingency Tables are used to test for association (or dependency) between two or more classifications.
True False
14. To conduct a proper Capability Analysis using Continuous Data, what is the minimum
recommended number of samples to use? (check all that apply)
A. 15
B. 20
C. 30
D. 50
15. For a Skewed Distribution, the appropriate statistic to describe the central tendency is:
(check all that apply)
A. Mean
B. Median
C. Mode
16. A Non-parametric Test makes assumptions that the data are from Normal Populations.
True False
17. If the results from a Hypothesis Test are located in the ´´Region of Doubt´´ area, what can be concluded? (check all that apply)
A. Failure to reject the Null Hypothesis
B. Failure to accept the Null Hypothesis
C. The test was conducted improperly
D. Rejection of the alpha
19. To conduct a proper Hypothesis Test there are six recommended steps to follow.
True False
Improve Phase
Welcome to Improve
Now that we have completed the Analyze Phase we are going to jump into the Improve Phase. In
Welcome to Improve we will give you a brief look at the topics we are going to cover.
Welcome to Improve
Overview
Welcome to Improve
Process Modeling: Regression
Advanced Process Modeling: MLR
Designing Experiments
Experimental Methods

Well, now that the Analyze Phase is over, on to a more difficult phase. The good news is… you’ll hardly ever use this stuff, so pay close attention!

We will examine the meaning of each of these and show you how to apply them.
DMAIC Roadmap

[Flowchart: Champion/Process Owner → Define (Estimate COPQ, Establish Team) → Measure → …]
We are currently in the Improve Phase and by now you may be quite sick of Six Sigma, really! In this module we are going to look at additional approaches to process modeling; it’s actually quite fun in a weird sort of way!
Welcome to Improve

[Improve Phase flowchart: Analysis Complete → Validate New Process → Implement New Process]
After completing the Improve Phase you will be able to put to use the steps as depicted here.
Improve Phase
Process Modeling Regression
Now we will continue in the Improve Phase with “Process Modeling: Regression”.
Overview

Welcome to Improve
Process Modeling: Regression
  – Correlation
  – Introduction to Regression
  – Simple Linear Regression
Advanced Process Modeling: MLR
Designing Experiments
Experimental Methods
Full Factorial Experiments
Fractional Factorial Experiments
Wrap Up & Action Items
In this module of Process Modeling we will study Correlation, Introduction to Regression and
Simple Linear Regression. These are some powerful tools in our data analysis tool box.
We will examine the meaning of each of these and show you how to apply them.
Correlation
The primary purpose of linear correlation analysis is to measure the strength of the linear association between two variables (X and Y). You have already seen correlation graphically when you created a Scatter Plot.

The correlation is positive when Y tends to increase and negative when Y tends to decrease. If the ordered pairs (X, Y) tend to follow a straight line path, there is a linear correlation. The preciseness of the shift in Y as X increases determines the strength of the linear correlation.

To conduct the study you need:
- Bivariate Data – two pieces of data that are variable
- Bivariate data is comprised of ordered pairs (X, Y)
- X is the independent variable
- Y is the dependent variable
Correlation Coefficient
The correlation
Th l ti coefficient
ffi i t off th
the population,
l ti R
R, iis estimated
ti t d bby the
th sample
l
correlation coefficient, r:
The null hypothesis for correlation is: there is no correlation, the alternative is there is correlation.
The correlation coefficient (always) assumes a value between –1 and +1.
The graphics shown here are labeled as the type and magnitude of their correlation: Strong,
Moderate or Weak correlation.
Limitations of Correlation

To properly understand regression you must first understand correlation. Once a relationship is described, a regression can be performed.

• A strong positive or negative correlation between X and Y does not indicate causality.
• Correlation provides an indication of the strength but does not provide us with an exact numerical relationship (i.e. Y = f(X)).
• The magnitude of the correlation coefficient is somewhat relative and should be used with caution.
• Just like any other statistic, you need to assess whether the correlation coefficient is statistically significant, as well as practically significant.
• As usual, statistical significance is judged by comparing a p-value with the chosen degree of alpha risk.
• Guidelines for practical significance are as follows:
  – If | r | > 0.80, the relationship is practically significant
  – If | r | < 0.20, the relationship is not practically significant

[Scale for r: area of negative linear correlation from –1.0 to –0.2; no linear correlation from –0.2 to +0.2; area of positive linear correlation from +0.2 to +1.0]

Correlation provides an indication of the strength but does not provide us with an exact numerical relationship. Regression, however, provides that: specifically a Y = f(X) equation. Just like any other statistic, be sure to assess whether the correlation coefficient is both statistically significant and practically significant.
Correlation Example

The correlation coefficient r:
• Is a positive value if one variable increases as the other variable increases.
• Is a negative value if one variable decreases as the other increases.

Correlation Formula:

r = Σ(Xi − X̄)(Yi − Ȳ) / √[ Σ(Xi − X̄)² · Σ(Yi − Ȳ)² ]

Payton carries   Payton yards
196     679
311    1390
339    1852
333    1359
369    1610
317    1460
339    1222
148     596
314    1421
381    1684
324    1551
321    1333
146     586
We will use some data from a National Football League player, Walter Payton of the Chicago
Bears. Open MINITABTM worksheet “RB Stats Correlation.mtw” as shown here.
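Outside MINITAB™, the same coefficient can be computed directly from the formula above; here is a sketch in Python using the carries/yards pairs from the table:

```python
import numpy as np
from scipy.stats import pearsonr

carries = np.array([196, 311, 339, 333, 369, 317, 339, 148, 314, 381, 324, 321, 146])
yards = np.array([679, 1390, 1852, 1359, 1610, 1460, 1222, 596, 1421, 1684, 1551, 1333, 586])

# Direct implementation of the correlation formula shown above.
xd = carries - carries.mean()
yd = yards - yards.mean()
r_manual = (xd * yd).sum() / np.sqrt((xd ** 2).sum() * (yd ** 2).sum())

# Library equivalent, which also returns the p-value for Ho: no correlation.
r, p_value = pearsonr(carries, yards)

print(round(r, 3))     # about 0.935, matching the MINITAB output
print(p_value < 0.05)  # True: reject the null hypothesis
```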
Correlation Analysis

Correlations: Payton carries, Payton yards
Pearson correlation of Payton carries and Payton yards = 0.935
P-Value = 0.000

The P-value is low (0.000), so we reject the null hypothesis and conclude that there is significant correlation between Payton’s carries and the number of yards.
Regression Analysis

The regression equation from MINITAB™ is the BEST FIT for the plotted data.

Prediction Equations:
Y = a + bx (Linear or 1st order model)
Y = a + bx + cx² (Quadratic or 2nd order model)
Y = a + bx + cx² + dx³ (Cubic or 3rd order model)
Y = a(b^x) (Exponential)

Correlation ONLY tells us the strength of a relationship while Regression gives the mathematical relationship or the prediction model.
[Fitted Line Plot: payton yards = −163.5 + 4.916 payton carries; S = 153.985, R-Sq = 87.3%, R-Sq(adj) = 86.2%]
There are two ways to perform a Simple Regression. One is the Fitted Line Plot which will give a
Scatter Plot with a Fitted Line and will generate a limited regression equation in the Session Window
of MINITABTM as shown above.
Follow the MINITABTM command prompt shown here, double-click “payton yards” for Response (Y)
and double-click “payton carries” for the Predictor (X) and click “OK” which will produce this output.
Let’s look at the Regression Analysis Statistical Output. The difference between R squared and
adjusted R squared is not terribly important in Simple Regression.
In Multiple Regression where there are many X’s it becomes more important which you will see
in the next module.
The Regression Analysis generates a prediction model based on the best fit line through the data, represented by the equation shown here.

Regression Analysis: Payton yards versus Payton carries
The regression equation is
Payton yards = -163.497 + 4.91622 Payton carries
(Constant; Coefficient; Level of X)

To predict the number of yards that Payton would run if he had 250 carries, simply fill in that value in the prediction equation and solve:

Payton yards = -163.497 + 4.91622(250) = 1,065.6
You could also make a fairly accurate estimate by using the Fitted Line Plot. Compare to the Fitted Line.

[Fitted Line Plot: payton yards = −163.5 + 4.916 payton carries; S = 153.985, R-Sq = 87.3%, R-Sq(adj) = 86.2%; reading across from 250 carries gives approximately 1067 yards.]
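The same least-squares fit and prediction can be sketched in Python with NumPy, using the carries/yards data from the worksheet:

```python
import numpy as np

carries = np.array([196, 311, 339, 333, 369, 317, 339, 148, 314, 381, 324, 321, 146])
yards = np.array([679, 1390, 1852, 1359, 1610, 1460, 1222, 596, 1421, 1684, 1551, 1333, 586])

# Least-squares fit of the 1st order model Y = a + bX.
slope, intercept = np.polyfit(carries, yards, deg=1)
print(round(intercept, 1), round(slope, 3))  # about -163.5 and 4.916

# Predict the yards for 250 carries by plugging into the equation.
predicted = intercept + slope * 250
print(round(predicted, 1))  # about 1065.6
```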
Quadratic and Cubic – check the r² value against the linear model to determine whether the difference in the variance explained by the equation is significant.

MINITAB™ will also generate both quadratic and cubic fits. Select the appropriate variables for (Y) and (X), and for the type of Regression Model choose “Quadratic” or “Cubic”.
[Quadratic Fitted Line Plot: payton yards = −199.7 + 5.239 payton carries − 0.00064 payton carries**2; S = 161.474, R-Sq = 87.3%, R-Sq(adj) = 84.8%]

If the R-Sq value improves significantly, or if the assumptions of the residuals are better met as a result of utilizing the fitted equation, use it.

[Cubic Fitted Line Plot: payton yards = 2188 − 24.71 payton carries + 0.1147 payton carries**2 − 0.000141 payton carries**3; S = 164.218, R-Sq = 88.2%, R-Sq(adj) = 84.3%]
Use the best fitting equation by looking at the R-Sq value. If it improves significantly, or if the
assumptions of the residuals are better met as a result of utilizing the quadratic or cubic equation
you should use it.
Here there is no big difference so we will stick with the linear model.
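The explained-variance comparison across the three models can also be sketched outside MINITAB™; here R² is computed as 1 − SS_residual/SS_total for each polynomial degree:

```python
import numpy as np

carries = np.array([196, 311, 339, 333, 369, 317, 339, 148, 314, 381, 324, 321, 146])
yards = np.array([679, 1390, 1852, 1359, 1610, 1460, 1222, 596, 1421, 1684, 1551, 1333, 586])

ss_total = ((yards - yards.mean()) ** 2).sum()
r_squared = {}
for degree in (1, 2, 3):  # linear, quadratic, cubic
    coeffs = np.polyfit(carries, yards, deg=degree)
    fitted = np.polyval(coeffs, carries)
    ss_resid = ((yards - fitted) ** 2).sum()
    r_squared[degree] = 1 - ss_resid / ss_total

for degree, r2 in r_squared.items():
    print(degree, round(r2 * 100, 1))  # the linear model is about 87.3%
```

Nested polynomial fits can only increase R², which is why the adjusted R-Sq (penalized for extra terms) is the better basis for choosing a model.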
Residuals

As in ANOVA, the residuals should:
– Be normally distributed (normal plot of residuals)
– Be independent of each other
  • no patterns (random)
  • data must be time ordered (residuals vs. order graph)
– Have a constant variance (visual; see the residuals versus fits chart: there should be approximately the same number of residuals above and below the line, equally spread)
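A sketch of the residual calculation in Python, using the linear fit to the Payton data; the standardization here is the simple residual divided by its standard deviation, whereas MINITAB™ additionally adjusts each residual for leverage:

```python
import numpy as np

carries = np.array([196, 311, 339, 333, 369, 317, 339, 148, 314, 381, 324, 321, 146])
yards = np.array([679, 1390, 1852, 1359, 1610, 1460, 1222, 596, 1421, 1684, 1551, 1333, 586])

slope, intercept = np.polyfit(carries, yards, deg=1)
residuals = yards - (intercept + slope * carries)

# Simple standardization; values beyond about +/-2 are usually considered large.
standardized = residuals / residuals.std(ddof=2)
large = np.abs(standardized) > 2
print(np.count_nonzero(large))  # number of flagged observations
```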
Residuals (cont.)

Residual Plots can be generated from both the Fitted Line Plot and the Regression selection in MINITAB™. Here we produced the graph by selecting the “Four in one” option.
[Four-in-one residual plots: Normal Probability Plot of the residuals, Standardized Residuals Versus the Fitted Values (independence assumption), Histogram of the Residuals, and Residuals Versus the Order of the Data.]
Residual Analysis

Standardized residuals greater than 2 and less than -2 are usually considered large, and MINITAB™ labels these observations.

Stat>Regression>Regression

Regression Analysis: payton yards versus payton carries
The regression equation is
payton yards = - 163 + 4.92 payton carries

Predictor   Coef      SE Coef   T      P
Constant    -163.5    172.0     -0.95  0.362
payton c    4.9162    0.5645    8.71   0.000

S = 154.0   R-Sq = 87.3%   R-Sq(adj) = 86.2%
To view a normal probability plot in MINITAB™ select “Stat>Regression>Fitted Line Plot”; there are four graph options to choose from. On the residuals versus fits chart, check whether there is a “funnel effect” where the residuals get bigger and bigger as the Fitted Value gets bigger or smaller.

[Normal Probability Plot of the Residuals (response is payton yards): checks the normally distributed response assumption.]
[Standardized Residuals Versus the Fitted Values plot.]
Independence Assumption

[Residuals Versus the Order of the Data (response is payton yards): should show no trends either up or down and should have approximately the same number of points above and below the line (approximately constant variance).]

Residuals versus the order of the data is used to evaluate the Independence Assumption. It should not show trends either up or down and should have approximately the same number of points above and below the line.
[Scatterplot of dorsett yards vs dorsett carries.]
[Fitted Line Plot: dorsett yards = −160.1 + 4.993 dorsett carries; S = 79.3033, R-Sq = 95.0%, R-Sq(adj) = 94.5%]
If Dorsett carries the football 325 times, the predicted value would be determined as follows: Dorsett would carry the football for 1782.825 yards, approximately!
[Four-in-one residual plots for the Dorsett model: Normal Probability Plot (N = 12, AD = 0.309, P-Value = 0.510), Standardized Residuals Versus the Fitted Values, Histogram of the Residuals, and Residuals Versus the Order of the Data.]
Notes
Improve Phase
Advanced Process Modeling
Now we will continue with the Improve Phase “Advanced Process Modeling MLR”.
Overview
W
W elcom
elcomee to
to Im
Improve
prove
Review
Review Corr./
Corr./ Regression
Regression
Process
Process M
Modeling:
odeling: Regression
Regression
N
Non-Linear
on-Linear Regression
Regression
Adva
Advanced
nced Process
Process M
Modeling:
odeling:
M
MLR
LR
Transforming
Transforming Process
Process Data
Data
Designing
Designing Ex
Experim
periments
ents
Multiple
Multiple Regression
Regression
Ex
Ex perim
perimenta
entall M
Methods
ethods
Full
Full Fa
Factoria
ctoriall Ex
Ex perim
periments
ents
Fra
Fractiona
ctionall Fa
Factoria
ctoriall
Ex
Ex perim
periments
ents
W
W ra
rapp Up
Up &
& Action
Action Item
Itemss
We will examine the meaning of each of these and show you how to apply them.

Recall the Simple Linear Regression and Correlation tools presented earlier in the Analyze Phase. These essential tools describe the relationship between two variables: an independent or input factor and, typically, an output response. Causation is NOT proved by these tools; they demonstrate only that a relationship exists.
Correlation Review

Correlation is used to measure the linear relationship between two continuous variables (bi-variate data). The Pearson correlation coefficient “r” will always fall between –1 and +1. A correlation of –1 indicates a strong negative relationship: as one factor increases, the other decreases. A correlation of +1 indicates a strong positive relationship: as one factor increases, so does the other.

P-Value > 0.05: Ho, no relationship
P-Value < 0.05: Ha, there is a relationship

[Scale for “r”: strong negative correlation near –1.0, no correlation near 0, strong positive correlation near +1.0]

The Pearson coefficient, represented here as “r”, shows the strength of the relationship; a value of zero indicates NO relationship. The P-value establishes the statistical confidence that a relationship exists, while the Pearson correlation coefficient shows the “strength” of that relationship. For example, with the P-value threshold set at .05, a relationship between the two factors tested is accepted with at least 95% confidence.
Correlation Review

Correlation only tells us the strength of a linear relationship, not the numerical relationship. The last step in proper analysis of continuous data is to determine the regression equation. The regression equation can mathematically predict Y for any given X. The regression equation from MINITAB™ is the best fit for the plotted data.

Prediction Equations:
Y = a + bx (Linear or 1st order model)
Y = a + bx + cx² (Quadratic or 2nd order model)
Y = a + bx + cx² + dx³ (Cubic or 3rd order model)
Y = a(b^x) (Exponential)

Correlation shows the potency of a linear relationship; the mathematical relationship is given by the prediction equation from regression. These correlations and regressions are not proven causal relationships; we are attempting to establish statistical commonality. Regression equations, whether simple linear, quadratic or exponential, predict the output (Y). More complex relationships are coming.
Simple Regression
– One X, One Y
– Analyze in MINITAB™ using
  • Stat>Regression>Fitted Line Plot or
  • Stat>Regression>Regression

Multiple Regression
– Two or More X’s, One Y
– Analyze in MINITAB™ using
  • Stat>Regression>Best Subsets
  • Stat>Regression>Regression

In both cases the R-sq value estimates the amount of variation explained by the model.

Simple Regressions have one X, referred to as the regressor or predictor; in Multiple Regression, multiple X’s explain the output or response variable. The strength of the regression is quantified by R squared: the proportion of the overall variation in the output (Y) explained by the regression equation.
To conclude that a Linear Regression exists, most practitioners require a statistical confidence of 95% or above. If satisfactory conclusions cannot be drawn, step 6 becomes essential: here we contemplate a potential Non-linear Regression, but only if we cannot find a regression equation that explains, statistically and practically, the variation of the output as a function of the input; we also analyze the model error for correctness. Step 7, depicted in subsequent slides, validates the residuals, a necessity for a valid model.
Recalling the tools learned throughout the Analyze Phase, presented here is a simple Regression example examining a piece of equipment used in the mining industry. This data set is an evaluation of ore concentrators: how the agitation of the equipment relates to the output of PGM concentrate.

[Scatterplot of PGM concentrate (g/ton) vs Agitator RPM.]

Opening the MINITAB™ worksheet named “Concentrator.MTW” will show how the output is always applied to the Y axis (dependent) and the input is always applied to the X axis (independent).
Example Correlation

Pearson correlation of PGM concentrate (g/ton) and Agitator RPM = 0.847
P-Value = 0.001

[Fitted Line Plot: PGM concentrate (g/ton) = 1.119 + 1.333 Agitator RPM; S = 9.08220, R-Sq = 71.8%, R-Sq(adj) = 69.0%]
Example Regression Line

[Fitted Line Plot: PGM concentrate (g/ton) = 30.53 − 1.460 Agitator RPM + 0.05586 Agitator RPM**2; S = 7.61499, R-Sq = 82.2%, R-Sq(adj) = 78.2%]
Notice how the new line is more appropriate for our data; this is the essence of choosing a Non-linear Regression, in this case a Quadratic Regression. The model option can be selected simply by clicking “Quadratic”. The curvature better fits the plotted points. Can you see the difference?
The Session Window presents the Analysis of Variance; note the estimated Standard Deviation of errors, which is lower for the Non-linear model.

Analysis of Variance
Source      DF   SS       MS       F      P
Regression  2    2404.04  1202.02  20.73  0.000
Error       9    521.89   57.99
Total       11   2925.93

Sequential Analysis of Variance
Source     DF   SS       F      P
Linear     1    2101.07  25.47  0.001
Quadratic  1    302.97   5.22   0.048
Standard Deviation was referenced earlier in the Measure Phase; take a look if necessary. Now consider the model error. Don’t be perplexed: model error has many components. Dependency of the output on other input variables, as well as measurement system error on both the output and the inputs, can be causes. The MINITAB™ Session Window displays these Regression Analyses; feel free to use them.

The recommendation here is to use standardized residuals and the “Four in one” option for plotting. Click “Graphs” in the dialog; appropriate modeling and analysis of the residuals concludes the seventh step.
To identify Non-linear Relationships, graphically examine the variation of output to input on a Scatter Plot; the Non-linear Relationship is often self-evident. Using step four of the Regression Analysis methodology, unusual observations prompt us to look deeper at the Fitted Line Plots to see what best fits the historical data. To detect Non-linearity, look carefully at the Residuals vs. Fitted Values graph of a Linear Regression: clustering and/or trends in the data could point to a Non-linear Regression. Relying on a team or an expert who has prior knowledge can also provide much information.
This example will demonstrate how to use confidence and prediction intervals.

What percent discount should be offered to achieve a minimum 10% response from the mailing? The discount is in sales coupons being sent in the mail.

Clip ’em!
Open the MINITAB™ file called “Mailing Response vs. Discount.mtw”. This shows transactions by a retail store chain; in essence, data relating the discount amount to the response of customers to the mailed coupons. With the input variable displayed in C1 and the output displayed in C2, Belts need to establish the discount rate that will yield a 10% response from the customers mailed. The measured % response is the % of customers who received the mailings and used the coupons to buy merchandise.
[Scatterplot of % response from mailing vs % discount; note the curvature.]
Now we test for a Linear Relationship by running a Correlation; the results of the analysis give strong confidence because the P-value falls under .05. Do you notice the Pearson Correlation Coefficient is almost 1.0, indicating a strong correlation?
Following right along the methodology, the next step is to consider a Non-linear Regression Analysis. Note there are no unusual observations. Even though the R squared values are high, a Non-linear fit may be better based on the Fitted Line Plot.

Regression Analysis: % response from mailing versus % discount
The regression equation is
% response from mailing = - 11.2 + 1.83 % discount

Predictor    Coef      SE Coef  T      P
Constant     -11.215   2.541    -4.41  0.001
% discount   1.8301    0.1179   15.52  0.000

S = 5.60971   R-Sq = 94.5%   R-Sq(adj) = 94.1%

Analysis of Variance
Source          DF   SS      MS      F       P
Regression      1    7580.0  7580.0  240.87  0.000
Residual Error  14   440.6   31.5
Total           15   8020.5

[Fitted Line Plot of the linear model: S = 5.60971, R-Sq = 94.5%]
[Fitted Line Plot: % response from mailing = −0.416 + 0.1526 % discount + 0.04166 % discount**2; S = 2.91382, R-Sq = 98.6%, R-Sq(adj) = 98.4%. The Non-linear fit increased R-Sq to 98.6% from 94.5% in the Linear Regression.]
Polynomial Regression Analysis: % response from mailing versus % discount
The regression equation is
% response from mailing = - 0.416 + 0.1526 % discount + 0.04166 % discount**2
S = 2.91382 R-Sq = 98.6% R-Sq(adj) = 98.4%
Analysis of Variance
Source DF SS MS F P
Regression 2 7910.14 3955.07 465.83 0.000
Error 13 110.37 8.49
Total 15 8020.51
Sequential Analysis of Variance
Source DF SS F P
Linear 1 7579.95 240.87 0.000
Quadratic 1 330.19 38.89 0.000
We are satisfied! The application of a Non-linear Regression Model shows an increased R-squared.
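As a rough point-estimate check of the quadratic model (ignoring the prediction intervals evaluated next), the fitted equation can be solved for a 10% response; this sketch uses NumPy’s polynomial root finder:

```python
import numpy as np

# Fitted quadratic: % response = -0.416 + 0.1526 d + 0.04166 d**2, with d = % discount.
# Setting the response equal to 10 and solving 0.04166 d**2 + 0.1526 d - 10.416 = 0.
roots = np.roots([0.04166, 0.1526, -0.416 - 10])
discount = max(roots.real)  # keep the physically meaningful (positive) root
print(round(discount, 1))   # about 14.1 percent as the point estimate
```

Because a new mailing must fall within the prediction interval, the discount actually offered would need to be higher than this point estimate.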
In order to answer the original question it is necessary to evaluate the confidence and prediction intervals. What percent discount should be offered to achieve a 10% response from the mailing?

…Options

A powerful option is the Fitted Line Plot analysis: click “Options” after running “Stat>Regression>Fitted Line Plot”. Now select “Display confidence interval” and “Display prediction interval” and leave the Confidence Level at 95%.
Taking a look at what has changed in the MINITABTM Fitted Line Plot window by selecting both interval options, Confidence and Prediction: each has a color code, the red for the 95% Confidence Interval and the green for the 95% Prediction Interval. Manually draw a horizontal line at 10%. The curvature of the equation usually causes the Confidence Intervals to flare out at the extreme ends. The Prediction Interval is the range where a new observation is expected to fall; in this case, where the horizontal line at 10% intersects the lower prediction band tells us what discount is needed.

[Fitted Line Plot: % response from mailing = - 0.416 + 0.1526 % discount + 0.04166 % discount**2, with the Regression line, 95% CI and 95% PI bands; S = 2.91382, R-Sq = 98.6%, R-Sq(adj) = 98.4%.]
Considering the question of yielding 10% or more, finding the regression equation is of less importance than estimating where new data ought to fall within the relationship. The prediction intervals provide a degree of confidence in how the customers will respond; this estimate is of great importance.
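To make the prediction-interval idea concrete, here is a rough Python sketch using the fitted equation and S reported above. This is NOT the exact MINITAB calculation: it ignores the precise leverage term, and the t critical value (13 error df, 95% two-sided) and the simplified 1/n leverage are assumptions stated in the comments, so the interval is only approximate.

```python
import math

# Rough prediction-interval sketch for the fitted quadratic above.
def y_hat(discount):
    return -0.416 + 0.1526 * discount + 0.04166 * discount ** 2

S = 2.91382        # standard error of the regression, from the output above
t_crit = 2.160     # assumed t(0.975, 13 df) critical value

d = 18.0           # candidate % discount
half_width = t_crit * S * math.sqrt(1 + 1 / 16)   # crude leverage of about 1/n
lower, upper = y_hat(d) - half_width, y_hat(d) + half_width
print(f"predicted response at {d:.0f}% discount: {y_hat(d):.1f}%")
print(f"approximate 95% PI: {lower:.1f}% to {upper:.1f}%")
```

The lower prediction bound near an 18% discount sits close to 10%, which is consistent with the conclusion the text draws about offering an 18% discount.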
Residual Analysis

To complete the example, the Residual Analysis validates the assumptions for Regression Analysis. Confirming the validity by taking our residuals into consideration and completing step seven is next. A high R-squared tells us how much of the output variation the model explains, but from that information alone we cannot draw the final conclusion: the store should give a discount of 18% and see if they achieve their 10% response from the customers mailed.

[Residual Plots for % response from mailing: Normal Probability Plot of the Residuals, Residuals Versus the Fitted Values, and Residuals Versus the Order of the Data (observations 1-16), all on the Standardized Residual scale.]
Now does the present data for the response fit the equation as predicted?

The majority of Belts find data that is not normally distributed. We have learned to handle such data with Non-linear Regression, but another approach is to transform the data so that Linear Regression applies. Outputs or inputs can be transformed, and many people will wonder "what's the point?" Simplicity is very valuable.
The standard transformation ladder is xtrans = x^p, with log(x) used when p = 0:

Transformation     p
Square             2
No Change          1
Square Root        0.5
Logarithm          0
Reciprocal Root   -0.5
Reciprocal        -1
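The ladder above can be expressed as one small function; a minimal Python sketch (illustrative only, since the book performs the transformation in MINITAB's Calculator):

```python
import math

# Power-ladder transforms from the table above: x**p, with p = 0 meaning log(x).
def transform(x, p):
    """Apply the power transformation with exponent p (log when p == 0)."""
    if p == 0:
        return math.log(x)
    return x ** p

x = 16.0
print(transform(x, 2))     # square          -> 256.0
print(transform(x, 1))     # no change       -> 16.0
print(transform(x, 0.5))   # square root     -> 4.0
print(transform(x, 0))     # logarithm       -> ln(16), about 2.77
print(transform(x, -1))    # reciprocal      -> 0.0625
```

A positive skew is typically pulled toward normality by exponents below 1 (square root, log), while a negative skew calls for exponents above 1.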
Effect of Transformation

Using a mathematical function we have transformed this data. The left histogram (Before Transform) shows a positively skewed distribution; after the square root function (x^0.50, or the square root of x) was applied to the data, the distribution became normal.

[Graphics: Before Transform and After Transform histograms and probability plots, with the non-normal "before" data symbolized by a low P-value; Box-Cox plot of Lambda for the positively skewed data, with Lambda near 0.5.]
Before executing the transformation, make sure the word "number" is highlighted, and the new column will then appear within the function in the "Expression:" box. The transformed data will show alongside the unchanged data, provided you click the "OK" button.
When using MINITABTM, for the majority of commands the order of columns is unimportant. The output should resemble this view. Confirm the new data set found in C3 is normally distributed.

[Probability Plot of Square Root (Normal): N = 100, AD = 0.265, P-Value = 0.687. The transformed data is Normally Distributed.]
Model error (residuals) is impacted by the addition of measurement error for all the input variables.
In review, we only perform Regression on historical data; Regression is not applied to experimental data. So far we have covered Regression involving one input and one output. Multiple Linear Regression applies when we want to model one output with more than one input at the same time. Recall that R-squared measures the amount of output variation explained by the input you selected; if you haven't identified enough of the output variation, more inputs may be needed. In the equations on this page we assume that in Multiple Linear Regression each input is independent of the others; no correlation exists among them. Each independent input has its own slope, and the epsilon at the end of the equation reminds us that every Regression has model error.

With many different input variables on hand and only one output, it can be tedious to find whether variation comes from one particular input; a Matrix Plot can greatly speed up the process and show which input is impacting the output the most. After narrowing the field of variables, use the Best Subsets command to complete the Multiple Linear Regression; we identify the best candidate model by examining R-squared, R-squared adjusted, the number of predictors, S and Mallows' Cp. Following this we must iteratively confirm the inputs are statistically significant. Only then do we have a valid model, and especially for Multiple Linear Regression we MUST confirm the model is valid by examining the residuals of the final Regression.
When comparing and verifying models consider the following:
1. There should be a reasonably small difference between R2 and R2-adjusted (much less than 10% difference).
2. When more terms are included in the model, does the adjusted R2 increase?
3. Use the statistic Mallows' Cp. It should be small and less than the number of terms in the model.
4. Models with smaller S (standard deviation of error for the model) are desired.
5. Simpler models should be weighed against models with multiple predictors (independent variables).
6. The best technique is to use MINITABTM's Best Subsets command.
Using "Best Subsets Regression", MINITABTM provides multiple statistics; it is in our best interest to choose the least confusing Multiple Linear Regression model using these particular guidelines.
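The R-squared and adjusted R-squared columns that Best Subsets reports can be computed by hand; here is a minimal Python sketch of comparing nested models on hypothetical data (NOT the book's flight-speed data, and a simplification of what MINITAB's Best Subsets actually searches).

```python
import numpy as np

# Hypothetical data: one output driven by 2 of 3 candidate inputs; the
# third input is pure noise, so it should not help the adjusted R-squared.
rng = np.random.default_rng(7)
n = 30
X = rng.normal(size=(n, 3))
y = 5 + 2 * X[:, 0] - 1.5 * X[:, 1] + rng.normal(0.0, 1.0, n)

def fit_stats(X, y):
    """Least-squares fit with intercept; return R-sq and adjusted R-sq."""
    A = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    r2 = 1.0 - (resid @ resid) / (((y - y.mean()) ** 2).sum())
    p = X.shape[1]
    r2_adj = 1.0 - (1.0 - r2) * (len(y) - 1) / (len(y) - p - 1)
    return r2, r2_adj

for cols in ([0], [0, 1], [0, 1, 2]):      # a poor man's "best subsets"
    r2, r2_adj = fit_stats(X[:, cols], y)
    print(f"Vars={len(cols)}  R-Sq={r2:.3f}  R-Sq(adj)={r2_adj:.3f}")
```

R-squared can only rise as terms are added, which is why the guidelines lean on adjusted R-squared, S and Mallows' Cp to penalize needless terms.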
Best Subsets Regression: flight speed versus the candidate predictors Altitude, Turbine Angle, Fuel/Air ratio, ICR and Temp (an X marks each predictor included in a model):

             Mallows
Vars  R-Sq  R-Sq(adj)  C-p    S
 1    72.1    71.1     38.4   28.054   X
 1    39.4    37.2    112.8   41.358   X
 2    85.9    84.8      9.0   20.316   X X
 2    82.0    80.6     17.9   22.958   X X
 3    87.5    85.9      7.5   19.561   X X X
 3    86.5    84.9      9.6   20.267   X X X
 4    89.1    87.3      5.7   18.589   X X X X
 4    88.1    86.1      8.2   19.481   X X X X
 5    89.9    87.7      6.0   18.309   X X X X X

In MINITABTM the "Best Subsets Regression" command is efficient and powerful: it fits all inputs against a single output. Place all inputs of interest in the "Free predictors:" box and the output column of data in the "Response:" box on the right of your screen. This is very simple; the evaluation is done for you and the results are given in rows: 1st column - number of variables; 2nd column - R-squared; 3rd column - R-squared adjusted; 4th column - Mallows Cp; 5th column - standard deviation of the model error (S); and the remaining columns - the input variables included in each model.
[Best Subsets output repeated, with the list of all the Predictors (X's): Altitude, Turbine Angle, Fuel/Air ratio, ICR, Temp.]

What model would you select? Let's consider the 5 predictor model:
• Highest R-Sq(adj)
• Lowest Mallows Cp
• Lowest S
• However, there are many terms.

In choosing the correct model our attention goes to the bottom 5-term Linear Regression. Are the terms all statistically significant?
Stat>Regression>Regression>Options

Let's go back to "Stat>Regression>Regression" again and click on the "Options" button. Place the output in the "Response:" box and the inputs in the "Predictors:" box.
Predictor        Coef      SE Coef   T       P      VIF
Constant        770.4      229.7     3.35    0.003
Altitude          0.15318    0.06605 2.32    0.030  2.3
Turbine Angle     5.806      2.843   2.04    0.053  1.4
Fuel/Air ratio    8.696      3.327   2.61    0.016  3.2
ICR             -52.269      6.157  -8.49    0.000  2.6
Temp              4.107      3.114   1.32    0.200  5.4

S = 18.3088   R-Sq = 89.9%   R-Sq(adj) = 87.7%

The VIF for Temp indicates it should be removed from the model. Go back to the Best Subsets analysis and select the best model that does not include the predictor Temp.

Variance Inflation Factor (VIF) detects correlation among predictors:
• VIF = 1 indicates no relation among predictors
• VIF > 1 indicates predictors are correlated to some degree
• VIF between 5 and 10 indicates regression coefficients are poorly estimated and are unacceptable.
Do you notice anything different here? A new column has appeared, labeled VIF, which shows whether high correlation among inputs exists. Temp has a high VIF, so we will remove it.
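The VIF column follows the formula VIF_j = 1 / (1 - R_j^2), where R_j^2 comes from regressing predictor j on the other predictors. A short Python sketch on hypothetical predictors (NOT the book's flight data; here x3 is deliberately built to correlate with x1):

```python
import numpy as np

# Hypothetical predictor matrix with deliberate collinearity between x1 and x3.
rng = np.random.default_rng(3)
n = 50
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
x3 = 0.9 * x1 + 0.1 * rng.normal(size=n)
X = np.column_stack([x1, x2, x3])

def vif(X, j):
    """VIF of predictor j: regress column j on the remaining columns."""
    others = np.delete(X, j, axis=1)
    A = np.column_stack([np.ones(len(X)), others])
    beta, *_ = np.linalg.lstsq(A, X[:, j], rcond=None)
    resid = X[:, j] - A @ beta
    r2 = 1.0 - (resid @ resid) / (((X[:, j] - X[:, j].mean()) ** 2).sum())
    return 1.0 / (1.0 - r2)

for j in range(X.shape[1]):
    print(f"VIF x{j + 1}: {vif(X, j):.2f}")   # x1 and x3 come out large
```

The independent predictor x2 yields a VIF near 1, while the collinear pair x1/x3 produces VIFs well above the 5-to-10 danger zone described above.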
To start step four we want to take into account the Regression Model that does not include TEMP.
We have satisfied the Best Subsets model; we need not rerun this command.
Regression Analysis: Flight Speed versus Turbine Angle, Fuel/Air ratio, ICR

Here we have removed Altitude from the "Predictors:" box, and the Regression output now shows the Turbine Angle is not statistically significant. The P-value for Turbine Angle indicates it should be removed (p > 0.05), so re-run the Regression.
Shown here is the entire Regression output for a complete discussion of the final Multiple Linear Regression model. We have 2 predictor variables and both are statistically significant.

Now that we have a final model, it is VITAL to confirm the residuals are acceptable and the model is valid. How do we do this? With the residual plots and the appropriate commands to analyze them.
[Residual Plots for Flight Speed: Normal Probability Plot of the Residuals, Residuals Versus the Fitted Values (fitted values 450-650), Histogram of the Residuals, and Residuals Versus the Order of the Data (observations 2-28), all on the Standardized Residual scale.]
Notes
Improve Phase
Designing Experiments
Now we are going to continue with the Improve Phase “Designing Experiments”.
Designing Experiments

Overview

Within this module we will provide an introduction to Design of Experiments, explain what they are, how they work and when to use them.

Welcome to Improve
Process Modeling: Regression
Advanced Process Modeling: MLR
Designing Experiments
  Reasons for Experiments
  Graphical Analysis
  DOE Methodology
Experimental Methods
Full Factorial Experiments
Fractional Factorial Experiments
Wrap Up & Action Items
Designing Experiments
iers Cu ts
Suppl st o pu SIPO C
O
m n
ut
ersI
VO C
Con
pu
Project Scope
trac Emplo
tors yees
st
P-M a p, X Y, FM EA
(X1) (X2) (X3) (X4) (X8) (X11) (X9) Ca pa bility
(X6) (X7) (X5) (X10)
Box Plot, Sca tter
(X3) (X4) (X1) (X11) Plots, Regression
(X5) (X8)
(X2)
(X11)
(X4)
This is a recurring theme: by using these tools we filter the variables that cause defects. In the Improve Phase of the Six Sigma methodology we encounter many kinds of designed experiments: transactional, manufacturing and research.

Designed Experiments help the Belt to understand the cause and effect between the process output or outputs of interest and the vital few inputs. Some of these causes and effects may include the impact of interactions, often referred to as synergistic or cancelling effects.
Designing Experiments

The objective is to minimize the response. The physical model is not important for our business objective. The DOE Model will focus in the region of interest.
Designing Experiments

Design of Experiments (DOE) is a scientific method of planning and conducting an experiment that will yield the true cause-and-effect relationship between the X variables and the Y variables of interest.

DOE allows the experimenter to study the effect of many input variables that may influence the product or process simultaneously, as well as possible interaction effects (for example synergistic effects).

The end result of many experiments is to describe the results as a mathematical function: y = f(x).

The goal of DOE is to find a design that will produce the information required at a minimum cost. Properly designed DOE's are more efficient experiments.

Design of Experiments shows the cause-and-effect relationship of the variables of interest, X and Y. The designed experiments are identified within the Analyze Phase and then executed in the Improve Phase. DOE tightly controls the input variables and carefully monitors the uncontrollable variables.
Let's assume a Belt has found in the Analyze Phase that pressure and temperature impact his process, and no one knows what yield is achieved for the possible temperature and pressure combinations. One Factor at a Time (OFAT) is an experimental style but not a planned experiment or DOE. The graphic shows yield contours for a process that are unknown to the experimenter.

Trial  Temp  Press  Yield
1      125   30     74
2      125   31     80
3      125   32     85
4      125   33     92
5      125   34     86
6      130   33     85
7      120   33     90

[Contour plot: yield contours (75, 80, 85, 90, 95) over temperature and pressure (psi), with the seven trials marked and the optimum identified with OFAT near 92%.]

If a Belt inefficiently experimented One Factor at a Time (referred to as OFAT), one variable would be selected to change first while the other was held constant. The curves shown on the graph represent constant process yield, visible only if the Belt knew the theoretical relationships of all the variables and the process output. These contour lines are familiar if you have ever hiked in the mountains and looked at an elevation map showing contours of constant elevation. As a test we decided to increase temperature to achieve a higher yield. After achieving a maximum yield with temperature, we then decided to change the other factor, pressure. We then came to the conclusion the maximum yield is near 92% because it was the highest yield noted in our 7 trials.

With the Six Sigma methodology, we use DOE, which would have found a higher yield using equations. Many sources state that OFAT experimentation is inefficient when compared with DOE methods. Some people call it hit or miss; luck has a lot to do with results using OFAT methods.
Certified Lean Six Sigma Black Belt Book Copyright OpenSourceSixSigma.com
472
Designing Experiments
Fractional Factorials or screening designs are used when the process or product knowledge is low.
We may have a long list of possible input variables (often referred to as factors) and need to screen
them down to a more reasonable or workable level.
Full Factorials are used when it is necessary to fully understand the effects of interactions and when
there are between 2 to 5 input variables.
Response surface methods (not typically applicable) are used to optimize a response typically when
the response surface has significant curvature.
Value Chain
Designing Experiments
Back when we had to calculate the effects of experiments by hand it was much simpler to use coded
variables. Also when you look at the prediction equation generated you could easily tell which
variable had the largest effect. Coding also helps us explain some of the math involved in DOE.
The representation here is a two-level, three-factor (2^3) design showing a treatment combination table using coded input level settings. The table has 8 experimental runs.

Consider a 2^3 design on a catapult…

Run      A            B           C        Response
Number   Start Angle  Stop Angle  Fulcrum  Meters Traveled
1        -1           -1          -1       2.10
2         1           -1          -1       0.90
3        -1            1          -1       3.35
4         1            1          -1       1.50
5        -1           -1           1       5.15
6         1           -1           1       2.40
7        -1            1           1       8.20
8         1            1           1       4.55

[Cube graphic: the eight responses (2.1, 0.9, 3.35, 1.5, 5.15, 2.4, 8.2, 4.55) at the corners of the Start Angle x Stop Angle x Fulcrum cube.]

Run 5 shows start angle low, stop angle low and fulcrum high.
Designing Experiments

MINITABTM generates various plots; the cube plot is one. Open the MINITABTM worksheet "Catapult.mtw".

Stat>DOE>Factorial>Factorial Plots … Cube, select response and factors

This graph is used by the experimenter to visualize how the response data is distributed across the experimental space. This cube plot is a 2 cubed design for a catapult using three variables: Start Angle, Stop Angle and Fulcrum. Here we used coded variable level settings (-1 and 1).

[Cube Plot (fitted means) for Distance: 2.10, 0.90, 3.35, 1.50, 5.15, 2.40, 8.20, 4.55 at the corners of the Start Angle x Stop Angle x Fulcrum cube.]

How do you read or interpret this plot? What are these values? Make sense?
The Main Effects Plots shown here display the effect that the input values have on the output
response.
The y axis is the same for each of the plots so they can be compared side by side.
Which has the steepest slope? Which has the largest impact on the output?
Answer: Fulcrum
Designing Experiments

Avg. distance at Low Setting of Start Angle: (2.10 + 3.35 + 5.15 + 8.20)/4 = 18.80/4 = 4.70
Avg. distance at High Setting of Start Angle: (0.90 + 1.50 + 2.40 + 4.55)/4 = 9.35/4 = 2.34

[Main Effects Plot (data means) for Distance: mean Distance versus Start Angle, Stop Angle and Fulcrum, each at coded levels -1 and 1.]

Run #  Start Angle  Stop Angle  Fulcrum  Distance
1         -1           -1         -1       2.10
2          1           -1         -1       0.90
3         -1            1         -1       3.35
4          1            1         -1       1.50
5         -1           -1          1       5.15
6          1           -1          1       2.40
7         -1            1          1       8.20
8          1            1          1       4.55
In order to create the Main Effects Plot we must be able to calculate the average response at the low and high levels for each Main Effect. The coded values are used to show which responses must be used to calculate the average.

Let's review what is happening here. How many experimental runs were operated with the start angle at the high level (1)? The answer is 4: run numbers 2, 4, 6 and 8. If we average the 4 distances (the process output) from those runs, we see the average distance when the start angle was at the high level was 2.34 meters. The second dot from the left in the Main Effects Plot shows the distance of 2.34 with the start angle at the high level.
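The averaging behind the Main Effects Plot can be reproduced directly from the table. A short Python sketch (illustrative only, not part of the MINITABTM workflow):

```python
# Catapult runs from the table above: (Start Angle, Stop Angle, Fulcrum, Distance)
runs = [(-1, -1, -1, 2.10), ( 1, -1, -1, 0.90),
        (-1,  1, -1, 3.35), ( 1,  1, -1, 1.50),
        (-1, -1,  1, 5.15), ( 1, -1,  1, 2.40),
        (-1,  1,  1, 8.20), ( 1,  1,  1, 4.55)]

def level_mean(factor_index, level):
    """Average distance over the 4 runs where this factor sits at this level."""
    ys = [r[3] for r in runs if r[factor_index] == level]
    return sum(ys) / len(ys)

for i, name in enumerate(["Start Angle", "Stop Angle", "Fulcrum"]):
    low, high = level_mean(i, -1), level_mean(i, 1)
    print(f"{name}: low mean={low:.2f}  high mean={high:.2f}  effect={high - low:.2f}")
# Fulcrum shows the largest effect, matching the answer given above.
```

Each main effect is simply the high-level mean minus the low-level mean, which is exactly what the plot's two dots per factor represent.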
Interaction Definition

Interactions occur when variables act together to impact the output of the process. Interaction plots are constructed by plotting both variables together on the same graph. They take the form of the graph below. Note that in this graph the relationship between variable "A" and Y changes as the level of variable "B" changes. When "B" is at its high (+) level, variable "A" has almost no effect on Y. When "B" is at its low (-) level, A has a strong effect on Y. The feature of interactions is non-parallelism between the two lines.

[Interaction plot: output Y versus A (from - to +), with one line for B- and one for B+. When B changes from low to high at one end of A, the output drops very little; at the other end of A, the output changes dramatically.]
Designing Experiments

[Interaction plots showing a Strong Interaction, a Moderate Interaction and a Reversal: Y versus A (from - to +), with lines for B- (High) and B+ (Low).]

A common misunderstanding is that the lines must actually cross each other for an interaction to exist, but that's NOT true. The lines may cross at some level OUTSIDE of the experimental region, but we really don't know that. Parallel lines show absolutely no interaction and in all likelihood will never cross.
Let's review what is happening here. The dot indicated by the green arrow is the mean distance when the fulcrum is at the low level (indicated by a -1) and the start angle is at the high level (indicated by a 1). Experimental runs 2 and 4 had the process running at those conditions, so the distance from those two experimental runs is averaged and plotted at a value of 1.2 on the vertical axis. You can note the red dotted line shown is for when the start angle is at the high level, as indicated by a 1.
Designing Experiments

[Interaction Plot (data means) for Distance: the black line represents the effect of Fulcrum on Y when Start Angle is at its low level.]

MINITABTM will also plot the mirror images, just in case it is easier to interpret with the variables flipped. If you care to create the mirror image of the interaction plots, while creating interaction plots click on "Options" and place a checkmark by "Draw full interaction plot matrix". These mirror images present the same data but visually may be easier to understand.
[Interaction Plot (data means) for Distance: the full interaction plot matrix, with Start Angle and Stop Angle panels at coded levels -1 and 1.]
Designing Experiments

DOE Methodology

It is easy to generate full factorial designs in MINITABTM. Follow the command path shown here.

These are the designs that MINITABTM will create. They are color coded using Red, Yellow and Green: Green are the "go" designs, Yellow are the "use caution" designs and Red are the "stop, wait and think" designs. The meaning is similar to street lights.
Designing Experiments
Let’s create a three factor full factorial design using the MINITABTM command shown at the top of the
graphic above. This design we selected will give us all possible experimental combinations of 3 factors
using 2 levels for each factor.
factor
Be sure to have changed the number of factors as seen in the upper left to “3”. Also be sure not to forget
to click on the “Full factorial” line within the Designs box shown in the lower right of the graphic.
Designing Experiments
One warning to you as a new Belt using MINITABTM: never copy, paste, delete or move columns within the first 7 columns, or MINITABTM may not recognize the design you are attempting to use.

Is our experiment done? Not at all. The process must now be run at the 8 experimental sets of conditions shown above, and the output or outputs of interest must be recorded in columns to the right of the first 7 columns shown. After we have collected the data we will then analyze the experiment. Remember the 11 Step DOE methodology from earlier?
Designing Experiments

Notes

Improve Phase

Experimental Methods

Designing Experiments
Experimental Methods
  Methodology
  Considerations
  Steps
Full Factorial Experiments
Fractional Factorial Experiments
Wrap Up & Action Items
DOE Methodology
In this module we will describe the 11 step DOE methodology, some basic concepts, and lots of fun and exciting terminology. Once again, great content for dinner conversation later tonight!
Experimental Methods
So you’ve decided to use Designed Experiments. Shown here are 10 basic project management
considerations before running any experiment. This is obviously not an exhaustive list, but certainly
some important questions to consider and answer.
What is behind some of these questions? Let’s briefly discuss a few aspects individually.
1. Access to a process is necessary for proper monitoring and execution of a project. If restricted access exists for whatever reason, then a workaround must exist.
2.If the team members or subject matter experts aren’t fully involved, then potential conflicts or
unrealistic designs may be awaiting you for a poor experiment.
3.If the Process Owners and stakeholders are unknown to you before execution of an experiment rude
awakenings such as cancellations, scheduling conflicts and other nightmares can occur.
4. No one wants to be told what will happen to the process they are managing, so if you don't involve them in the experimental design, even if that only means reviewing the team's designed experiment, how do you expect cooperation?
5.If the Process Owners don’t understand what your DOE is, how can they assist you?
6.Does your DOE intend to make a wide range of quality product or potentially produce an
unacceptable product in the quest to improve the process? If the Process Owner has never known
what your DOE intentions were, how can they not be upset if they are surprised by the results of the
DOE?
7.Time and money impact scheduling, randomization, testing concerns. All of these must be
considered especially when using the actual process.
8.It is often desirable to run DOE’s in a pilot plant or facility but this is not often the case. If a pilot
facility is to be used, do the results match the process when translated outside of the laboratory?
9. Noise variables cannot be controlled, by definition, but if ambient weather is considered to have an effect on your process, why would you execute an experiment when a cold or warm front is passing through your area? This is one example of designing around a known disturbance.
10. Manage your project to know if the DOE is intended to stretch the boundaries of conceived product creation or to work well within a small experimental area.
There are many considerations. Often learning comes through experience, so if you are unsure about your future experiment in this project or another, consult with mentors or Six Sigma Belts.
Experimental Methods

Technical Considerations

What are the objectives/goals for the experiment?
1. What factors are important? (narrowed from Analyze Phase)
2. What is the operating range for each factor?
3. How can I minimize both the cost of the DOE and the cost of running the process?
4. How much change in the process do we require?
5. How close to optimal does the process currently run?
6. Are we tackling a centering or variation problem?
7. What impact to the process while running the DOE?
8. What is the cost of competing DOE designs?
9. What do you know about the process interactions?

[Graphic: process capability distributions from 1 Sigma through 6 Sigma against the USL.]

These technical considerations need to be answered before running an experiment. Making sense of all of them at present is not necessary.
Experimental Methods
• Objective must include the critical characteristics and the desired outcome.
– If the experiment and project are tackling recurring issues, consider a different critical characteristic.
• The characteristic may require a different physical phenomenon
being measured or with a differing measurement system.
• The measurement system precision and accuracy may influence the
specific output to be measured.
• Identify the desired experimental outcome.
1. Eliminate Root Cause
2. Reduce Variation
3. Achieve a target
4. Maximize Output Response
5. Minimize Output Response
6. Robust process or product
Step 3: knowing that a DOE is going to be performed, does it make sense to go the extra mile? Let's get our money's worth by measuring more than one output if it could benefit us in any way.
Experimental Methods
• Use the Analyze Phase and subject matter experts to select these factors.
• All factors must be independent of each other.
• Consider past results from previous experiments.
• Test the most likely candidates first.
• Factors not included in the designed experiment should be held constant
and recorded.
• Noise or uncontrollable factors (typically environmental conditions) should be monitored, and the experimental design may be impacted (see Step 6).

The inputs selected by the team following the Six Sigma methodology are dwell time (sec), temperature of solution (deg F) and concentration of solution (% solids). Noise factors of ambient temperature and humidity were recorded and monitored.
Step 5 is to choose the levels for the input variables. The factor levels must be considered to create the desired change in the output response as identified in Step 3. Poor choices for input variable level settings could very well render an experiment useless, so be smart.

5. Choose the Levels for the Input Variables
• Factor levels must be considered to create the desired change in Output Response identified in Step 3.
• Do NOT create unsafe conditions or go beyond the feasibility of the process.
  – This does NOT mean constraining Input Variable levels to the current process range.
  – Be wary if operating near the extremes or operating limits.
• Realize some experimental runs may produce unacceptable product or process results. These results must be weighed against the risk of future production.
• Even when designing your experiment with coded levels for the factors, the team MUST be aware of what the levels mean in the process language.
• Factor levels can be impacted by the Experimental Objective in Step 2.
  – Screening experiments have wider settings for factors.
  – Full Factorials have narrower settings than screening experiments.
  – Response surface Designed Experiments have quite narrow settings.
Experimental Methods

Be aware you do not want to set the factor levels too low either. We could be shown no difference in the output-to-input relationship.

["-" and "+" Factor Settings graphic.]
Experimental Methods

Assume this graphic was a sketch generated from our basic understanding of the theory. We don't know exactly what factor setting would produce the output response, but we do know the general shape of the curve. Notice that we stayed away from the sharp peak: it is very easy to slide off such a steep peak, and unless your process controls are very tight it would be better to find the nice robust region where the output response is high but flat, meaning the factor settings can change a bit without much effect on the output response.

[Graphic: Output Response versus Factor Settings from "-" to "+", overlaid with experimental noise.]

The experiment is using coded levels:
Dwell time: +1 (20 sec); -1 (10 sec)
Temp of sol'n: +1 (80 deg F); -1 (100 deg F)
Conc. of sol'n: +1 (40%); -1 (20%)

You might think we have spent too much time on just setting the levels for the input variables or factors in your experiment. However, consider the learning of others who have had to go back to their Process Owners or Champions and explain that no factors were deemed statistically significant because the design was inadequate. If the concern of spending too much time comes up, also consider how many defects are produced when the statistical significance is deemed inadequate.
Experimental Methods

Remembering that noise variables can't be controlled but must be managed around, blocking is a technique for managing your experiment around noise variables considered important. Remember, you are interested in understanding the effects and interactions of your controlled variables, so you want statistical confidence.

Randomization has an impact on your statistical confidence because your experimental noise is spread across the runs. What would happen if another unknown significant variable changed halfway through our experiment? It is possible that an unknown significant variable, such as machine warm-up, would get confused with the C variable, because without randomization all the low levels would be generated first and then all the high levels.
Experimental Methods

Determining sample size is very similar to what we did in the Analyze Phase; there are a few distinctions, determined by Step 4. Sample size must be determined.

2-Level Factorial Design

MINITABTM then shows:

Alpha = 0.05   Assumed standard deviation = 1

Center Points  Effect  Reps  Total Runs  Target Power  Actual Power
0              2       2     16          0.9           0.936743

What the heck is a Rep?
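The power figure MINITABTM reports can be reproduced by hand. The sketch below is our own reconstruction, not MINITABTM's documented algorithm: it assumes the effect estimate (difference of the high-level and low-level means) has variance 4σ²/N for N total runs, and that the error degrees of freedom come from replication after fitting the full 2^k model. `power_2level_factorial` is a hypothetical helper name.

```python
from scipy import stats

def power_2level_factorial(effect, sigma, n_factors, reps, alpha=0.05):
    """Power to detect a given effect in a replicated 2^k full factorial.

    Assumes Var(effect estimate) = 4*sigma^2/N for N total runs, and that
    the error degrees of freedom come from fitting the full 2^k model.
    """
    runs = reps * 2 ** n_factors           # e.g. 2 reps * 2^3 = 16 runs
    df_error = runs - 2 ** n_factors       # replication supplies the error df
    nc = effect / (2 * sigma / runs ** 0.5)    # noncentrality of the t statistic
    t_crit = stats.t.ppf(1 - alpha / 2, df_error)
    return (1 - stats.nct.cdf(t_crit, df_error, nc)
            + stats.nct.cdf(-t_crit, df_error, nc))

# The scenario from the table: effect = 2, sd = 1, 2 reps of a 2^3 design.
print(round(power_2level_factorial(effect=2, sigma=1, n_factors=3, reps=2), 3))
```

Under these assumptions the result lands in the neighborhood of the 0.936743 shown in the table above.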
Experimental Methods
A replication is
NOT a duplicate or
a repeat. Look at
the two designs
shown here. The
first is a single
g
replicate design,
which means there
is only one value
for each unique experimental run. The terminology is a bit confusing, but don’t worry.
The replicated design has double the runs. The design is fully randomized whenever possible so this is
not the order in which it is run.
Notice how experimental run #1 and #9 have the three factors which are start angle, stop angle and
fulcrum, running with the same combination of levels and then experimental run #9 is a replicate of run
#1.
Experimental Methods

Recall from the Analyze Phase the Multi-Vari tool described the three families of variation. Consider these families of variation to determine how to sample with replication for an experiment.
– Within Unit or Positional
  • Within-piece variation related to the geometry of the part.
  • Variation across a single unit containing many individual parts, such as a wafer containing many computer processors.
  • Location in a batch process such as plating.
– Between Unit or Cyclical
  • Variation among consecutive pieces.
  • Variation among groups of pieces.
  • Variation among consecutive batches.
– Temporal or Over Time
  • Shift-to-Shift
  • Day-to-Day
  • Week-to-Week

• Discuss the experimental scope, time and cost with the Process Owners prior to the experiment.
• Some team members must be present during the entire experiment.
• After the experiment has started, are you getting the output responses you expected?
  – If not, quickly evaluate for Noise or other factors and consider stopping or canceling the experiment.
• Use a log book to make notes of observations, other factor settings, etc.
• Communicate with the operators, technicians and staff about the experimental details and why the experiment is being run, before running the experiment.
  – This communication can prevent "helping" by the operators, technicians, etc. that might damage your experimental design.
• Alert the laboratory or quality technicians if your experiment will increase the number of samples arriving during the experiment.
Experimental Methods
• After finding the Practical Results from Step 9, verify the results:
– Set the factors at the Practical Results found with Step 9 and see
if the process output responds as expected. This verification
replicates the result of the experiment.
– Do not forget your model has some error.
11. Implement Solutions

You will probably not fully appreciate all the comments in the modules of this phase until you have designed, managed, executed and analyzed a few real-life experiments for yourself.
Experimental Methods
Notes
Improve Phase
Full Factorial Experiments

In this module we will discuss the Full Factorial in detail.

Welcome to Improve
Process Modeling: Regression
Advanced Process Modeling: MLR
Designing Experiments
Experimental Methods
Full Factorial Experiments
  Mathematical Models
  Balance and Orthogonality
  Fit and Diagnose Model
  Center Points
Fractional Factorial Experiments
Wrap Up & Action Items
Two level Full Factorial designs are the most powerful and efficient set of experiments.
This may look similar to regression, but the important difference is that DOE is considered true
cause and effect because of the controlled nature of experimentation. This is an important tool in
manufacturing environments.
The only difference between the model equation and the prediction equation shown is that the prediction equation is simplified for describing the data gathered in the experiment and using it to predict future events. Just because you end up with a prediction equation in an experiment does not mean it is a good predictive model. We will discuss this further when we introduce Center Points.
[Surface plots of % Reacted over the factors Cn, Ct and T]
Linear Models are usually sufficient for most industrial experimental objectives. This goes back to
the difference between a physical model and a DOE model. Just because we know by theory that
the model should not be linear, it may express itself as sufficiently Linear in the particular design
space.
People can get confused between the concept of curvature and twisted response planes. We do not have enough information (not enough levels for each variable) to describe true curvature. Take a piece of paper, which will represent 2 input variables. Lift opposite corners. That is a graphical representation of an interaction: the response plane (paper) is twisted. Now lift the paper to eye level and rotate it until the projection looks like a curved line. We are simply looking at the projection of the twisted plane with Linear Models. There may be true curvature in the real world; we simply can't describe it with a Linear Model.

HOWEVER, in most manufacturing processes the Linear Model is very powerful because of the constrained design space. Draw a box on the paper and hold it up by two opposite corners. Depending on how much twist you give the paper and how big the box is, you will either see a curve or not in the defined space.
The surface plot on the left has no significant interaction, but both Main Effects are significant. The
surface plot on the right shows a significant interaction with T and Cn.
Here is a surface plot of true curvature in a Quadratic Model. This shape is referred to as a saddle
for obvious reasons.
Quadratic Models can be obtained with designs not described in this module.
Quadratic Models explain curvature, maximums, minimums and twisted
maximums and minimums when interactions are active.
– The following is the quadratic prediction model used in some response
surface models not covered in this training.
– The simpler 2k models do not include enough information to generate
the Quadratic Model.
[Surface plot: saddle-shaped quadratic response over factors A and B]
Treatment Combinations

Minuses and plusses can be used to indicate low and high factor level settings; center points are indicated with zeros.
The design matrix for 2^k factorials is shown in standard order (not randomized).
– The low level is indicated by a "-" and the high level by a "+".
– This order is commonly referred to as Yates standard order, after Dr. Frank Yates.
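The standard-order pattern is easy to generate: the first factor alternates every run, the second every two runs, the third every four, and so on. A minimal sketch (`yates_standard_order` is our own helper name, not a MINITABTM function):

```python
def yates_standard_order(k):
    """Return the 2^k design matrix in Yates standard order.

    Bit j of the run index gives the level of factor j, so factor 1
    alternates every run, factor 2 every two runs, factor 3 every four.
    """
    return [tuple(1 if (i >> j) & 1 else -1 for j in range(k))
            for i in range(2 ** k)]

# The 2^3 design: 8 runs, -1 = low level, +1 = high level.
for row in yates_standard_order(3):
    print(row)
```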
Here we have standard notation for a 2 to the 4 design and above, using 2 cubes, a common representation: one cube for the low level of the 4th factor and one for the high.
This table created with the factors is referred to as a table of contrasts. The contrast columns are the minus ones and plus ones in the factor columns. In order to calculate contrast columns for interactions, we need the contrast columns for the main factors.
Warning, whatever you do, do not change the names of the columns by simply typing over the
names. MINITABTM creates a model that it uses for the analysis later. If it can’t find the column
names used to generate the worksheet, it will give an error message.
Balanced Design

Factorial Designs should be balanced for proper interpretation of the mathematical equation.

An experiment is balanced when each factor has the same number of experimental runs at both high and low levels. Summing the signs of the column contrasts should yield zero; in this example, there are 2 minuses and 2 plusses in each column.

Balance simplifies the math necessary to analyze the experiment.
– If you always use the designs MINITABTM provides, they will always be balanced.

       A  B
   1   -  -
   2   +  -
   3   -  +
   4   +  +
 ∑Xi   0  0
MINITABTM creates balanced, orthogonal designs. If they aren’t changed, this isn’t a problem.
Orthogonal Design

An Orthogonal Design allows each effect in an experiment to be measured independently; the contrast columns are vectors at 90 degrees to each other. If every contrast-column product for all possible variable pairs sums to zero, the design is orthogonal.

With an Orthogonal Design, if an interaction is found to be significant, it is because of the data and not the experimental design.
– If you always use the designs MINITABTM provides, they will always be orthogonal and balanced.

       A  B  C  AB  AC  BC
   1   -  -  +  +   -   -
   2   +  -  -  -   -   +
   3   -  +  -  -   +   -
   4   +  +  +  +   +   +
 ∑XiXy = 0 for every pair of columns
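Both properties can be checked mechanically on the 4-run table of contrasts above: balance means every factor column sums to zero, and orthogonality means every pairwise elementwise product of columns also sums to zero. A small sketch:

```python
import numpy as np

# The 4-run table of contrasts from the text: columns A, B, C.
design = np.array([
    [-1, -1,  1],
    [ 1, -1, -1],
    [-1,  1, -1],
    [ 1,  1,  1],
])

# Balance: the signs in every factor column sum to zero.
assert all(design[:, j].sum() == 0 for j in range(design.shape[1]))

# Orthogonality: every pairwise elementwise product also sums to zero.
for i in range(design.shape[1]):
    for j in range(i + 1, design.shape[1]):
        assert (design[:, i] * design[:, j]).sum() == 0

print("balanced and orthogonal")
```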
In an empty column, type in 'Yield' where we will place the experimental results. Column C8 was selected in this example.

Do NOT edit, copy, paste or alter anything in the first 7 columns or MINITABTM will not understand the worksheet.
[Normal plot and Pareto Chart of the Standardized Effects (response is Yield, Alpha = .05); factors: A = Temp, B = Conc, C = Supplier]

Any significant effects will be plotted off the straight line, labeled and highlighted in red. This method is referred to as the Daniels method in some literature.

The Pareto Chart of the Standardized Effects graphically shows which effects are significant based on the selected alpha level. Any effect that goes beyond the red line is considered significant.
At this point, Temperature and the interaction of Temperature with Supplier are the significant effects.

Look for the factorial fit information in the Session Window. Under the factorial fit, any effect that has a P-value less than 0.05 (for an alpha of 0.05) is considered significant. We interpret this the same way as we interpret any other statistical test. Notice that all three methods of determining what effects belong in the final model fit agree.

Factorial Fit: Yield versus Temp, Conc, Supplier
Estimated Effects and Coefficients for Yield (coded units)
Term Effect Coef SE Coef T P

What does this tell us…
We need to create some factor plots before evaluating the residuals. Follow the MINITABTM path shown here: Stat > DOE > Factorial > Factorial Plots. Anytime there is a significant interaction, it is useful to plot it. Plot both "Main Effects Plot" and "Interaction Plot" in this example.
[Main Effects Plot and Interaction Plot (data means) for Yield: near-flat lines indicate effects that are not significant; a line can tilt slightly and still be insignificant. Supplier is insignificant.]

[Residual Plots for Yield: Normal Probability Plot, Residuals vs. the Fitted Values, Histogram of the Residuals, Residuals vs. the Order of the Data]
The Residuals Versus Variables plots are most important when deciding what level to set an insignificant factor. A typical guideline is a difference of a factor of 3 in the spread of the Residuals between the low and high levels of an insignificant input variable.

In this case Concentration was not significant, but we still need to make a decision on how to set it for the process. The low level for Concentration has a smaller spread of Residuals, but there is not a difference of 3:1. Other considerations for setting the variable are cost and reducing cycle time.

[Residuals Versus Temp and Residuals Versus Conc plots (response is Yield)]
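The 3:1 guideline is just a ratio of spreads, so it is easy to compute. The residuals below are made-up illustrative numbers (the high-level set is exactly twice the low-level set), not values from the Yield experiment:

```python
import numpy as np

# Hypothetical standardized residuals at the low and high levels of an
# insignificant factor -- illustrative only, not the text's data.
resid_low  = np.array([-0.4, 0.3, -0.2, 0.5, -0.1, 0.2, -0.3, 0.1])
resid_high = np.array([-0.8, 0.6, -0.4, 1.0, -0.2, 0.4, -0.6, 0.2])

ratio = resid_high.std(ddof=1) / resid_low.std(ddof=1)
print(f"spread ratio (high/low): {ratio:.1f}")

# Guideline from the text: only treat the spreads as practically
# different when the ratio reaches about 3:1.
if ratio >= 3:
    print("prefer the level with the smaller residual spread")
else:
    print("spreads similar -- set the factor based on cost or cycle time")
```

Here the ratio is 2:1, short of the guideline, so the factor would be set on cost or cycle-time grounds.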
The Response Optimizer in MINITABTM is a great tool to visually determine where to set the input variables to achieve the desired output response. Play with it for a while and see what you get. The more you play around with these things, the better your understanding will be of how it works.
Practical Solution:
Temp 45C
Concentration 5%
Supplier B
Center Points

As you can see in the graphic, there may be an unknown hump in the Response Curve; adding the Center Point allows us to calculate an additional statistic.

A Center Point is an additional experimental run made at the physical center of the design.
– Center Points do not change the model to quadratic.
– They allow a check for adequacy of the linear model.
The Center Point provides a check to see if it is valid to say that the output response is linear through the center of the design space. If a straight line connecting the high and low levels passes through the center of the design, the model is adequate to predict inside the design space.
– "Curvature" is the statistic used to interpret the adequacy of the Linear Model.
– If curvature is significant, the P-value will be less than 0.05.
If there is significant curvature in the model, all we know is that the response is not linear. Do NOT predict outside the design space.
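One common form of the curvature check (our assumption; consult a DOE text for the exact statistic MINITABTM reports) compares the mean of the factorial corner runs with the mean of the center-point runs, tested against pure error from the replicated center points. In the made-up data below the center mean equals the corner mean exactly, so curvature is zero:

```python
import numpy as np

# Hypothetical yields, chosen so the center-point mean matches the
# corner-run mean exactly (i.e. no curvature).
corner = np.array([55.0, 61.0, 54.0, 62.0, 56.0, 60.0, 55.0, 61.0])
center = np.array([58.0, 57.5, 58.5])

n_f, n_c = len(corner), len(center)
diff = corner.mean() - center.mean()

# Sum of squares for curvature, tested against pure error from the
# replicated center points.
ss_curvature = n_f * n_c * diff ** 2 / (n_f + n_c)
ms_pure_error = center.var(ddof=1)

f_stat = ss_curvature / ms_pure_error
print(f"curvature F statistic: {f_stat:.2f}")
```

A large F (small P-value) would say the straight line does not pass through the center of the design space, i.e. the linear model is inadequate.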
In this example we will walk through the 11 step DOE methodology for a panel cleaning machine, using Center Points in the analysis. The manufacturing firm is attempting to start up a new panel cleaning machine and would like to get it running quickly. They have experience with this type of machine, but they do not have experience with this particular model of equipment.
Na2S2O8 is Sodium Persulfate; please use that any time you see that notation.

3. Select the Output (Response) Variables
• Width of conductor is the only response.

4. Select the Input (Independent) Variables
• Dwell Time
• Temperature
• Na2S2O8
• The experts believe that ambient temperature and humidity will have no effect on the process. Monitors will be placed in the room to record temperature and humidity.
You actually know the answer already: the sample size is the same as in the previous example, since they were both 2 cubed designs. Look at your worksheet and find the Center Point runs.

Notice the Center Points are uniformly distributed through the design. Why are the Center Points uniformly distributed and not random?
Center Points not only tell us something about how well the linear model works, but are also a reality check for our data. By eyeballing the Center Point data as our experiment progresses we can see if anything has affected our experiment that we were not expecting. If your Center Points are dramatically different from each other, you've got a problem -- somewhere. They should be fairly close in magnitude, at least within normal variation.
MINITABTM will place the Center Points randomly in the worksheet. The next few slides will demonstrate how to move the Center Points so they are uniformly distributed.

1. Create a 3 factor design with 3 Center Points and 2 replicates; be sure to randomize the design.

Your design should look different than the one in the illustration because, more likely than not, different random seeds generated the designs. It is possible that our designs are the same, but trying to calculate the odds of that occurring is not worth the bother. You should have 19 rows in your design, so if you do not, go back and fix it.
[Normal plot and Pareto Chart of the Standardized Effects (response is Width, Alpha = .05); factors: A = Dwell Time, B = Temp, C = Na2S2O8]

The significant effects are Na2S2O8 (Sodium Persulfate), Temp, Dwell Time and the Temp by Na2S2O8 (BC) interaction.
Notice that all three methods of determining what effects belong in the final
model fit agree.
When working with 2 level designs you will always have 1 degree of freedom for each effect (including interactions), calculated as 2 levels minus 1. In the ANOVA table for Main Effects we have 3 degrees of freedom for the 3 Main Effects placed in the model. There is one degree of freedom for the Temperature by Sodium Persulfate interaction.

The Residual error is broken into 2 sources. The 3 degrees of freedom for lack of fit come from the 3 interaction effects that were removed from the model because they were not significant in explaining the variation of the data. The 10 degrees of freedom for pure error come from replication: the 8 runs from the original design generate 8 degrees of freedom (in this case 2 replicates minus 1 equals 1 degree of freedom for each run in the design). Add the 2 degrees of freedom from the Center Points (3 Center Points minus 1 equals 2) and we have a total of 10 degrees of freedom for pure error. Pure error can be defined as the failure of things treated alike (the replicates) to act alike.
The SS or Sum of Squares calculations are simply an unscaled or unadjusted measure of dispersion or spread of the data. Seq or Sequential Sum of Squares and Adj or Adjusted Sum of Squares are the same for DOE analyses. (There may be differences in Regression Analysis.)

Adj MS or Adjusted Mean Square takes the Sum of Squares number and scales it using the number of degrees of freedom for that calculation. Mean Squares are the equivalent of variance.
Here we use the F statistic. An F statistic is simply variance divided by variance. In the case of
DOE it is the Variance of an effect divided by the variance due to residual error. In this platform,
MINITABTM sums the sum of the squares for certain elements of the model to report in the ANOVA
table instead of keeping them separate. The F statistic with respect to the Main Effects is calculated
by taking 199.779 and dividing by 1.348 which equals 148.18. The associated P-value is 0.000
which is less than 0.05 so our conclusion is that the model is significant.
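The F arithmetic above can be checked directly. The sketch below uses the mean squares quoted in the text and assumes the residual degrees of freedom are 13 (3 lack of fit plus 10 pure error, as described above); the small rounding difference from the text's 148.18 comes from the quoted MS values being rounded:

```python
from scipy import stats

# Mean squares quoted in the text's ANOVA discussion.
ms_main_effects = 199.779   # Adj MS for Main Effects
ms_residual = 1.348         # Adj MS for residual error

f_stat = ms_main_effects / ms_residual

# 3 df for the Main Effects; 13 residual df = 3 lack of fit + 10 pure error
# (our reading of the df breakdown above).
p_value = stats.f.sf(f_stat, 3, 13)

print("F =", round(f_stat, 2), " p =", p_value)
```

The P-value is far below 0.05, matching the conclusion that the model is significant.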
Notice in this example the curvature is not significant, which means our assumption of linearity is good. Also, the p-value for lack of fit is not significant. That means the effects we removed from the model really do not belong in the model. If there was significant lack of fit, that would indicate that some of the effects that were removed from the model actually belong in the model.
The last thing to discuss here is the prediction equation. Please note the coefficients for the prediction equation are based on uncoded units. In other words, you can use this equation directly in real units. Let's do an example next.
Prediction Equation

Determine the predicted value when:
– Dwell Time = 4.2 minutes
– Temp = 75C
– Sodium Persulfate = 2.0

Take a few minutes to study the equation above. It really is simply "plug and chug": insert these values into the equation and do the math. Please note, we have taken liberties with rounding numbers! You won't actually have to do this by hand because that is exactly what the Response Optimizer does in MINITABTM.
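The "plug and chug" can be sketched as a small function. The actual coefficients of the fitted prediction equation appear only in the course graphic, so every number below is a hypothetical placeholder; substitute the uncoded coefficients from your own factorial fit:

```python
# Hypothetical placeholder coefficients -- NOT the fitted values from the
# text's experiment.  b_temp_na is the Temp-by-Na2S2O8 interaction term.
b0, b_dwell, b_temp, b_na, b_temp_na = 10.0, 1.5, 0.20, 2.0, 0.05

def predict_width(dwell, temp, na2s2o8):
    """Plug-and-chug evaluation of an uncoded-units prediction equation."""
    return (b0 + b_dwell * dwell + b_temp * temp
            + b_na * na2s2o8 + b_temp_na * temp * na2s2o8)

# The settings from the exercise above.
print(predict_width(dwell=4.2, temp=75, na2s2o8=2.0))
```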
[Interaction Plot (data means) for Width and cube plot of fitted Width values over Dwell Time, Temp and Na2S2O8, with corner and center points marked]
[Residual Plots for Width: Normal Probability Plot, Residuals vs. the Fitted Values, Histogram of the Residuals, Residuals vs. the Observation Order]
[Residuals Versus Dwell Time, Temp and Na2S2O8 plots (response is Width)]

As depicted here, the Residuals Versus Factor Plots do NOT show any differences in the variation of the data from the low to the high values.
9. Draw Practical Solutions: Stat > DOE > Factorial > Response Optimizer

Here we will use the Response Optimizer to draw some Practical Conclusions. Play with the Response Optimizer and see what you can do, remembering that the original objective was to hit a target of 40 +/- 5 for the width.

[Response Optimizer output showing the predicted Width]
Imagine if you were working with gold or platinum. What effect could that have on the bottom line?
Look at another graphical tool you can use in MINITABTM to visualize the solution: there is another MINITABTM function that will show the complete solution set for targeted values.

Stat > DOE > Factorial > Overlaid Contour Plot

As shown here we generate Overlaid Contour Plots of Width over Temp and Na2S2O8, one with Dwell Time at its low setting and one at its middle setting. The areas shown in white are the solution set.
Notes
Improve Phase
Fractional Factorial Experiments
Now we will continue the Improve Phase with "Fractional Factorial Experiments".
Welcome to Improve
Process Modeling: Regression
Advanced Process Modeling: MLR
Designing Experiments
Experimental Methods
Full Factorial Experiments
Fractional Factorial Experiments
  Designs
  Creation
  Generators
  Confounding & Resolution
Wrap Up & Action Items
Fractional Factorial Designs are a powerful sub-set of Factorial Designs. As the name implies, you
may expect they are some fraction of the original Factorial Designs – and you’d be correct. The
question is what fraction?
We've shown two 4 factor designs side by side so you can contrast them. Notice the Fractional Factorial Design requires only a fraction of the experimental runs to evaluate 4 input factors. In this case, it is a half fraction. As with most things in life there is a price to be paid for reducing the number of runs required, which we will go through in detail in this module.
Fractional Factorial designs are also used to study Main Effects and 2-way interactions if the
experimenter and team has good process knowledge and can assume higher order interactions are
negligible. There is the cost in a nutshell. In exchange for reducing the overall experiment’s size you will
give up the ability to evaluate higher order interactions. It turns out this is a pretty good assumption in
many cases. We’ll talk about this more later.
Fractional Factorial designs are also used to reduce the time and cost of experiments because the
number of runs are lowered. As the number of factors increases, the number of runs required to run a
full 2k factorial experiment also increases (even without repeats or replicates) as you already know.
3 factors: requires 8 runs
4 factors: requires 16 runs
5 factors: requires 32 runs etc….
The number of runs required for a Fractional Factorial will depend on how many factors are included in the design and how much fractioning can be tolerated based on the facts of the process.

Fractionals are also used as an initial experiment that can be augmented with another fraction to reduce confounding and estimate factors of interest. We'll define this as we advance through the module.
How many runs if no repeats or replicates? Simply do the math: 2 to the (5 minus 1) is the same as 2 to the fourth, which is 16 runs.

What Fractional Design is this? Since this design uses only half the number of runs as a Full Factorial with 5 factors (32 runs), it is a half fraction.
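The run-count arithmetic generalizes: a full factorial with k 2-level factors needs 2^k runs, and a 2^(k-p) fractional design cuts that by a factor of 2^p. A trivial sketch (helper names are our own):

```python
def full_factorial_runs(k):
    """Runs in a full 2-level factorial with k factors (no replicates)."""
    return 2 ** k

def fractional_runs(k, p):
    """Runs in a 2^(k-p) fractional factorial."""
    return 2 ** (k - p)

for k in (3, 4, 5):
    print(k, "factors:", full_factorial_runs(k), "runs")

# The half fraction discussed in the text: 2^(5-1) = 16 runs instead of 32.
print(fractional_runs(5, 1))
```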
Recall the 2x2x2 full 3-factor, 2-level Factorial Design. Suppose we needed to investigate a fourth factor but we could NOT increase the number of runs because of time or cost.

Select the highest order interaction to represent the levels of the fourth factor: the ABC interaction will determine the levels for factor D. When we replace the ABC interaction with factor D, we say the ABC 3-way interaction was aliased, or confounded, with D. This experiment maintains balance and orthogonality.
– The first experimental run in the first row indicates the experiment is executed with factor D at the low level while running all 3 other factors at the low level.
                               Factor D
  A    B    C   AxB  AxC  BxC  AxBxC
 -1   -1   -1    1    1    1    -1
  1   -1   -1   -1   -1    1     1
 -1    1   -1   -1    1   -1     1
  1    1   -1    1   -1   -1    -1
 -1   -1    1    1   -1   -1     1
  1   -1    1   -1    1   -1    -1
 -1    1    1   -1   -1    1    -1
  1    1    1    1    1    1     1
Why is the design, shown as orange rows, called a “half” fraction? This is the design
just created on the previous slide. This is a half fraction since a full 2x2x2x2 factorial
would take 16 runs. With the half fraction we can estimate the effects of 4 factors in 8
runs. What is the cost? We lose the ability to study the higher order interaction
independently!
[Two cubes over factors A, B and C: the left cube for D at "-", the right cube for D at "+". The top line repeats the design from the previous slide.]
Remember that D is confounded with the ABC interaction in this half-fractional
design.
Design Generators
Don't worry – MINITABTM will take care of this! THANK YOU MINITABTM!!!!
This graph helps us visually draw the conclusion of the data that we already have. We have
highlighted in green two boxes and this can very simply be filled in by the data expressed by the
generator; A times B times C equals D.
Design Generator D = ABC
• Because of the Design Generator we can now fill out the D column.
  – For each row of D, multiply the values in the columns of A, B and C together and create the column.
• You may correctly suspect some 2-factor interactions are confounded.
• Create contrast columns for AD, BD, CD using a similar technique to the one used to create the column for D.
A B C AB AC BC D AD BD CD
-1 -1 -1 1 1 1
1 -1 -1 -1 -1 1
-1 1 -1 -1 1 -1
1 1 -1 1 -1 -1
-1 -1 1 1 -1 -1
1 -1 1 -1 1 -1
-1 1 1 -1 -1 1
1 1 1 1 1 1
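The exercise above can be sketched directly from the generator. Starting from the 2^3 base design, each D value is the product A×B×C, and the interaction contrast columns are elementwise products; the confounding then shows itself, since for example AD = A(ABC) = BC on every run:

```python
# Build the half-fraction worksheet from the design generator D = ABC.
base = [(a, b, c)
        for c in (-1, 1) for b in (-1, 1) for a in (-1, 1)]  # standard order

rows = []
for a, b, c in base:
    d = a * b * c                      # the design generator D = ABC
    rows.append({"A": a, "B": b, "C": c, "D": d,
                 "AD": a * d, "BD": b * d, "CD": c * d,
                 "AB": a * b, "AC": a * c, "BC": b * c})

for r in rows:
    print(r["A"], r["B"], r["C"], "| D =", r["D"])

# The confounding is visible directly: each 2-way column involving D
# equals another 2-way column (AD == BC, BD == AC, CD == AB on every run).
assert all(r["AD"] == r["BC"] for r in rows)
```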
Notice after the design structure an alias structure is indicated. The line under the alias structure showing A plus BCD means the A Main Effect is confounded with the 3-way interaction BCD. Later we can see the AB 2-way interaction is confounded with the CD 2-way interaction, meaning that if the interaction is statistically significant we cannot distinguish whether it is a result of the AB or CD interaction or a combination.
So What is "Confounding"?

Experimental Resolution

In the nomenclature 2R^(k-p), remember the subscript R references the Resolution. This useful visual aid helps you remember the definitions of the Confounding designated by the Resolution.

Resolution IV: hold up four fingers. The Confounding is Main Effects with 3-way interactions, or 2-way interactions with other 2-way interactions.

Resolution V: hold up five fingers, one on one hand and four on the other. This illustrates the Confounding of Main Effects with 4-way interactions, or 2-way interactions with 3-way interactions.

The visual aid is shown through Resolution V.
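Resolution can also be computed: it is the length of the shortest "word" in the design's defining relation, which contains every product of the generator words (with repeated letters cancelling, since X·X = I). A sketch, with helper names of our own:

```python
from itertools import combinations

def defining_relation(generator_words):
    """All words of the defining relation: every product of the
    generator words, with repeated letters cancelling (X*X = I)."""
    words = set()
    for r in range(1, len(generator_words) + 1):
        for combo in combinations(generator_words, r):
            counts = {}
            for word in combo:
                for ch in word:
                    counts[ch] = counts.get(ch, 0) + 1
            words.add("".join(sorted(ch for ch, n in counts.items() if n % 2)))
    return words

def resolution(generator_words):
    return min(len(w) for w in defining_relation(generator_words))

print(resolution(["ABCD"]))          # D = ABC gives I = ABCD: Resolution IV
print(resolution(["ABDE", "ACDF"]))  # a 2^(6-2) example with E = ABD, F = ACD
```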
Example of a very useful Fractional Design often used for screening designs.

[Cube diagram of the 2^(5-1) design over factors A, B, C, D and E]

Run   A   B   C   D   E
 1   -1  -1  -1  -1   1
 2    1  -1  -1  -1  -1
 3   -1   1  -1  -1  -1
 4    1   1  -1  -1   1
 5   -1  -1   1  -1  -1
 6    1  -1   1  -1   1
 7   -1   1   1  -1   1
 8    1   1   1  -1  -1
 9   -1  -1  -1   1  -1
10    1  -1  -1   1   1
11   -1   1  -1   1   1
12    1   1  -1   1  -1
13   -1  -1   1   1   1
14    1  -1   1   1  -1
15   -1   1   1   1  -1
16    1   1   1   1   1

Pros:
• 5 factors (Main Effects)
• 10 2-way interactions
• Main Effects only confounded with rare 4-way interactions

Cons:
• 16 trials to get 5 Main Effects
• 2nd order interactions are confounded with 3rd order
DOE Methodology
We have included a copy of the methodology here for you to use when following our practical
example for Fractional Factorials.
This is a 2 to the (8 minus 4) power design with Resolution IV. This design has 16 runs, as you see in the graphic, with all eight factors at two levels.
Take a look at what Confounding exists before you jump into analysis.
[Pareto Chart of the Effects (response is Y, Alpha = .10). Terms in decreasing order of effect: E, A, AC, H, B, AF, AE, AD, AG, C, AH, AB, G, F, D, with the reference line at 0.26. Factors are named A through H. Lenth's PSE = 0.129375.]
S = 0.175232   R-Sq = 99.98%   R-Sq(adj) = 99.96%
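The Pareto chart of the effects relies on Lenth's pseudo standard error (PSE) to judge significance when there are no replicates. A minimal sketch of the calculation follows; the effect values here are made up for illustration, not taken from the chart above.

```python
import statistics

# Lenth's PSE: a robust noise estimate built from the effect estimates
# themselves, so no replicate runs are needed.
def lenth_pse(effects):
    # Initial scale estimate from the median absolute effect.
    s0 = 1.5 * statistics.median(abs(e) for e in effects)
    # Keep only effects small enough to look like noise, then re-estimate.
    noise = [abs(e) for e in effects if abs(e) < 2.5 * s0]
    return 1.5 * statistics.median(noise)

# Two clearly active effects plus noise-level effects (illustrative values).
effects = [12.1, 7.9, 2.4, 0.21, -0.15, 0.10, -0.08, 0.12]
pse = lenth_pse(effects)   # the large effects do not inflate the estimate
print(pse)
```

Because the median is resistant to the few large, active effects, the PSE reflects only the noise-level terms, which is what the reference line on the Pareto chart is built from.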
Analysis of Variance for Y (coded units)
Source DF Seq SS Adj SS Adj MS F P
Main Effects 4 921.55 921.545 230.386 7502.91 0.000
2-Way Interactions 3 331.20 331.198 110.399 3595.34 0.000
Residual Error 8 0.25 0.246 0.031
Total 15 1252.99
[Residual Plots for Y: a Normal Probability Plot of the residuals (N = 16, AD = 0.532, P-Value = 0.146), Residuals Versus the Fitted Values, a Histogram of the Residuals and Residuals Versus the Order of the Data.]
No, no unusual observations here…
It can be difficult to optimize the solutions and get the Practical Solution desired.
Using the Response Optimizer within MINITAB™ helps us find the Practical Solution of setting the factors left in the model all at the high level, or +1.
We win, we win…!!
11. Implement Solutions
Work with the Process Owners and develop the Control Plans to sustain your success.
Notes
Improve Phase
Wrap Up and Action Items
Congratulations on completing the training portion of the Improve Phase. Now comes the
exciting and challenging part…implementing what you have learned to real world projects.
• Listed below are the Improve Phase deliverables that each candidate
will present in a Power Point presentation at the beginning of the
Control Phase training.
• At this point you should all understand what is necessary to provide
these deliverables in your presentation.
– Team Members (Team Meeting Attendance)
– Primary Metric
– Secondary Metric(s)
– Experiment Justification
– Experiment Plan / Objective
– Experiment Results
– Project Plan
– Issues and Barriers
It’s your show!
Before beginning the Control Phase you should prepare a clear presentation that addresses each
topic shown here.
• Being tenacious, courageous
Look for the potential roadblocks and plan to address them before they
become problems:
– Lack of data
– Data presented is the best guess by functional managers
– Team members do not have the time to collect data
– Process participants do not participate in the analysis planning
– Lack of access to the process
Each phase will have roadblocks. Many will be similar throughout your project.
DMAIC Roadmap
[DMAIC Roadmap graphic: Champion/Process Owner → Estimate COPQ → Establish Team → Measure…]
The objective of the Improve Phase is simple – utilize advanced statistical methods to identify contributing variables or, more appropriately, optimize variables to create a desired output.
Improve Phase
Over 80% of projects will realize their solutions in the Analyze Phase (Analysis Complete). Designed Experiments can be extremely effective when used properly to identify the few vital X's.
• How much of the problem have you explained with these X's?
These are questions that the participant should be able to answer in clear, understandable language
at the end of this phase.
WHAT   WHO   WHEN   WHY   WHY NOT   HOW
Over the last decade of deploying Six Sigma it has been found that the parallel application of the
tools and techniques in a real project yields the maximum success for the rapid transfer of
knowledge. It is imperative that you complete this and submit your plan for action for review with
your mentors. Thanks and good luck!
You have now completed Improve Phase – Wrap Up and Action Items.
Notes
Improve Phase
Quiz
Now we will see what you have retained from the Improve Phase of the course. Please answer
these questions to the best of your ability without referencing the text. The answers are in the
Appendix. Please check your answers against the answers provided and review the sections in
the Improve Phase where your retention of the knowledge is less than you desire.
1. Multiple Regressions are best used for?
A. Non-linear relationships between an X and a Y.
B. Uncertainty in the slope of the linear relationship between an X and a Y.
C. Relationships between Y and two or more X’s.
D. Replacing the use of a Designed Experiment.
2. Which relationships can be modeled with a Regression Equation? (check all that apply)
A. Simple Linear
B. Quadratic
C. Cubic
D. Multiple Linear
E. Logarithmic
3. Which statements are true about Multiple Regressions? (check all that apply)
A. Multiple Regressions are a form of experimentation.
B. The X's are assumed to be independent of each other.
C. The X’s are assumed to not be correlated.
D. The residuals or errors are assumed to be Normally Distributed.
E. Interactions are NOT included in Multiple Linear Regressions.
F. R2 and the statistical confidence of the coefficients are impacted by the measurement
error of the inputs or X’s.
5. The reasons for running experiments include the desire for problem solving, screening factors and (check all that apply)
A. Physically model a process
B. Screening factors among possibilities
C. Achieving a robust design
D. Provide Regression Analysis
E. Understand the impact of an improved Measurement System
6. Which Experimental Design typically is most associated with the fewest number of input
variables or factors in the design?
A. Fractional Factorial Design
B. Full Factorial Design
C. Simple Linear Regression
D. Response Surface Design
7. The 11 step methodology recommended for performing a DOE has which item as the first
step?
A. Select the output response variable(s)
B. Select the Experimental Design
C. Select the input variables
D. Define the Practical Problem
8. How many experimental runs exist in a full factorial 2-level design for 5 factors with 2
replicates for the Corner Points and no Center Points?
A. 10
B. 16
C. 32
D. 34
E. 64
9. Which statements are true about Full Factorials? (check all that apply):
A. Full Factorials are used when 5 or fewer factors are involved.
B. Full Factorials are better for optimizing a process than Fractional Factorials.
C. Full Factorials are used instead of Fractional Factorials if interactions need to be fully
understood.
D. Full Factorials are used for screening factors if the Analyze Phase was unable to
narrow the critical factors sufficiently.
E. Full Factorials never have Center Points in the design.
10. Examples of the first step in the recommended 11 step methodology for a DOE include:
(check all that apply)
A. Consider the cost of a DOE.
B. The root cause for the defective product characteristic needs to be found.
C. The variation needs to be affected by the input factors.
D. The response time to calls needs to be reduced.
E. The DOE effect on the project timeline needs to be considered.
11. What is the best reason for not selecting too large of a difference among the factor
levels in the Experimental Design?
A. The process output must not change too much.
B. The process may show little change if curvature exists and the local maximum of the
process output is between the large differences of factor levels chosen.
C. The experimental factors have rarely been operating in such a wide range.
D. The experiment must have Center Points if the factor levels are wide.
12. Which statements are correct about Experimental Designs? (check all that apply):
A. An Experimental Design cannot be orthogonal if not balanced.
B. An Experimental Design can be a balanced design but not orthogonal although it is
encouraged to use only balanced and orthogonal designs.
C. The use of blocking can be used for accounting of the impact of Noise variables.
D. Center Points are not recommended unless the experimenter is attempting to
optimize the process.
E. A resolution IV design has only 4-way interactions confounded with Main Effects.
14. Executing the Experimental Design implies which correct statements? (check all that
apply)
A. The experiment can only be run on a product or service that the customer will not
experience.
B. If the experiment is well documented with the operators, it is not recommended to
have team members present during the experiment to save on space and allow for
uninhibited movement of the process.
C. If the experiment is going to start in a week, contact the Process Owners to work out
the needs before the experiment.
D. Use a log book and note any unusual observations during the experiment.
15. Statistical significance is the only important criterion for factors being included in the experiment's mathematical model.
True False
16. The last step of the recommended 11 step DOE methodology is:
A. Draw Practical Solutions
B. Implement solutions
C. Discuss results with the Process Owner
D. Plan the next Design of Experiment required
17. In 2-level factorial Experimental Designs, the total number of degrees of freedom is equal
to:
A. The number of experimental runs minus 2
B. The number of experimental runs minus 1
C. The number of experimental runs
D. The number of experimental runs minus the number of main factors in the
mathematical model
E. The number of residuals
18. If an Experimental Design has 3 factors with no replicates and 5 Center Points in the full
factorial 2-level design, the total number of experimental runs is best described as:
A. 13
B. 8 if the experiment can have two sets of conditions run simultaneously
C. 15
D. 30 if the number of blocks is 2
19. Which statements are correct about these 2-level factorial designs? (check all that apply)
A. A design with III resolution will not have Main Effects confounded with 2-way
interactions.
B. A design with IV resolution will not have Main Effects confounded with 2-way
interactions.
C. A design with V resolution will have 2-way interactions confounded with 3-way interactions.
D. A design with V resolution has no Main Effects confounded with any interactions.
E. A design with V resolution has no Main Effects confounded with other Main Effects.
F. A design with III resolution has no Main Effects confounded with other Main Effects.
Control Phase
Welcome to Control
Now that we have completed the Improve Phase we are going to jump into the Control Phase.
Welcome to Control will give you a brief look at the topics we are going to cover.
Welcome to Control
Overview
These are the modules
we will cover in the
Control Phase as we attempt to ensure that the gains we have made with our project remain in place. We will examine the meaning of each of these and show you how to apply them.

Welcome to Control
Advanced Experiments
Advanced Capability
Lean Controls
Defect Controls
Statistical Process Control (SPC)
Six Sigma Control Plans
Wrap Up & Action Items
DMAIC Roadmap
[DMAIC Roadmap graphic: Champion/Process Owner → Estimate COPQ → Establish Team → Measure … Improvement Selected → Align Systems and Structures → Go to Next Project, with the Control Phase highlighted.]
Control Phase
Lean Controls
Lean Controls
Overview
You can see in this section of the course we will look at the Vision of Lean, Lean Tools and
Sustaining Project Success.
We will examine the meaning of each of these and show you how to apply them.
Welcome to Control
Advanced Experiments
Advanced Capability
Lean Controls
– Vision of Lean Supporting Six Sigma
– Lean Tool Highlights
– Project Sustained Success
Defect Controls
Statistical Process Control (SPC)
Six Sigma Control Plans
Wrap Up & Action Items
Lean Controls
You’ve begun the process of sustaining your project after finding the “vital few” X’s to your project.
In the last module with Advanced Process Capability, we discussed removing some of the Special
Causes causing spread from outliers in the process performance.
This module gives more tools from the Lean toolbox to stabilize your process.
Belts, after some practice, often consider this module's set of tools a way to improve processes that are totally “out of control” or of such poor Process Capability that they must be cleaned up before applying the Six Sigma methodology.
The tools we are going to review within this module can be used to help control a process. They can be utilized at any time in an improvement effort, not just control. These Lean concepts can be applied to help reduce variation, affect outliers or clean up a process before, during or at the conclusion of a project.
Lean Controls
The Continuous Goal… Sustaining Results: Kaizen and Kanban.
We cannot sustain Kanban without Kaizen. We cannot sustain a visual factory without 5S.
The specifics of MUDA were discussed in the Define Phase:
Lean Controls
The Goal
Remember that any project needs to be sustained. Muda (pronounced like mooo dah) are wastes that can reappear if the following Lean tools are not used. The goal is to have your Belts move on to other projects and not be used as firefighters.
Don't forget the goal -- Sustaining your Project which eliminates MUDA! With this in mind, we will introduce and review some of the Lean tools used to sustain your project success.
5S - Workplace Organization
The term “5S” derives from the Japanese words for five practices leading to a clean and manageable work area. The five “S” are: 'Seiri' means to separate needed tools, parts and instructions from unneeded materials and to remove the latter. 'Seiton' means to neatly arrange and identify parts and tools for ease of use. 'Seiso' means to conduct a cleanup campaign. 'Seiketsu' means to conduct seiri, seiton and seiso at frequent, indeed daily, intervals to maintain a workplace in perfect condition. 'Shitsuke' means to form the habit of always following the first four S's.

• 5S means the workplace is clean, there is a place for everything and everything is in its place.
• 5S is the starting point for implementing improvements to a process.
• To ensure your gains are sustainable, you must start with a firm foundation.
• Its strength is contingent upon the employees and company being committed to maintaining it.
On the next page the Japanese words are translated to English. Simply put, 5S means the workplace is clean, there is a place for everything and everything is in its place. 5S will create a workplace that is suitable for, and will stimulate, high quality and high productivity work. Additionally it will make the workplace more comfortable and a place in which you can take pride.
Developed in Japan, this method assumes no effective, quality job can be done without a clean and safe environment and without behavioral rules.
5S allows you to set up a well adapted and functional work environment, ruled by simple yet effective rules. 5S deployment is done in a logical and progressive way. The first three S's are workplace actions, while the last two are sustaining and progress actions.
It is recommended to start implementing 5S in a well chosen pilot workspace or pilot process and
spread to the others step by step.
Lean Controls
Seiri = Sorting
Eliminate everything not required for the current work, keeping only the bare essentials.
Seiton = Straightening
Arrange items in a way that they are easily visible and accessible.
Seiso = Shining
Clean everything and find ways to keep it clean. Make cleaning a part of your everyday
work.
Seiketsu = Standardizing
Create rules by which the first three S’s are maintained.
Shitsuke = Sustaining
Keep 5S activities from unraveling.
Lean Controls
For items that are useful, there is also a method for determining how and where they should be
stored to help you achieve a clean and orderly workplace.
Lean Controls
[Graphic: items are classed A, B or C by frequency of use versus the distance at which they are stored.]
After you have determined the usefulness of an item, set three classes for determining where to store an
item based on the frequency of use and the distance to travel to get the item. “A” is for things which are
to be kept close at hand because the frequency of use is high. “B” is if the item is used infrequently but approximately on a weekly basis. Do not put it on your work surface; rather, keep it within easy walking distance, i.e. on a bookshelf or in a nearby cabinet, usually in the same room you are in. For “C” items it is acceptable to store them in a somewhat remote place, meaning a few minutes' walk away.
By rigorously applying the sort action and the prescribed method, you will find that the remainder of the
5S items will be quite easy to accomplish. It is very difficult to order a large number of items in a given
space and the amount of cleaning increases with the number of items. Your workplace should only
contain those items needed on a daily to weekly basis to perform your job.
Lean Controls
Lean Controls
Exercise:
– Can you come up with any opportunities for “VISUAL” aids in your project?
– What visual aids exist to manage your process?
Lean Controls
Standardized work does not happen without the visual factory, which can be further described with:
Availability of required tools (5S). Operators cannot be expected to maintain standard work if required to locate needed tools.
The steps in developing CTQ’s are identifying the customer, capturing the Voice of the Customer and
finally validating the CTQ’s.
Lean Controls
What is Kaizen?
A Kaizen event is very similar to a Six Sigma project. A Six Sigma project is actually a Kaizen. By involving your project team or others in an area to assist with implementing the Lean Controls or concepts, you will increase the buy-in of the team, which will affect your project's sustainability.
Measurable Process. Without standardized work, we really wouldn't have a consistent process to measure. Cycle times would vary, assembly methods would vary, batches of materials would be mixed, etc…
Analysis Tools. There are improvement projects in each organization which cannot be solved by an operator. This is why we teach the analysis tools in the breakthrough strategy of Six Sigma.
Operator Support. The organization needs to understand that its future lies in the success of the value-adding employees. Our roles as Belts are to convince operators that we are here for them -- they will then be there for us.
A Kaizen event can be small or large in scope. Kaizens are improvements with the purpose of constantly improving a process. Some Kaizens are very small changes, like a new jig or placement of a product; others are more involved projects. Kaizens are Six Sigma projects with business impact.
Lean Controls
What is Kanban?
This is a building block. A Kanban needs to be supported by the previous steps we have reviewed. If Kanbans are abused they will actually backfire and affect the process in a negative manner.
Lean Controls
• Material handlers must be trained in the organization of the transportation system.
As we have indicated, if you do NOT have 5S, visual factory, standardized work and ongoing Kaizens, Kanbans cannot succeed.
It is not possible to implement a viable Kanban system without a strong support structure made up of the prerequisites. One of the most difficult concepts for people to integrate is the simplicity of the Lean tools… and to keep the discipline. Benchmarks have organizations using up to seven years to implement a successful Kanban System all the way through the supplier and customer supply chain.
Lean Controls
1. The TEAM should 5S the project area and begin integrating visual factory indicators.
– Indications of the need for 5S are:
– Outliers in your project metric
– Loss of initial gains from project findings
4. Project Scope dictates how far up the Lean tools ladder you need to implement measures to sustain any project success from your DMAIC efforts.
The 5 Lean concepts are an excellent method for Belts to sustain their project success. If you have outliers, declining benefits or dropping process capability, you need to consider the concepts presented in this module.
Class Exercise
Lean Controls
Notes
Control Phase
Defect Controls
Defect Controls
Overview
Welcome to Control
Advanced Experiments
Advanced Capability
Lean Controls
Defect Controls
– Realistic Tolerance and Six Sigma Design
– Process Automation or Interruption
– Poka-Yoke
Statistical Process Control (SPC)
Six Sigma Control Plans
Wrap Up & Action Items
In an effort to put in place Defect Controls we will examine Tolerances, Process Automation and
Poka-Yoke.
We will examine the meaning of each of these and show you how to apply them.
With Defect Prevention we want to ensure that the improvements created during the project stay in place.
Defect Controls
The best approach to Defect Prevention is to design Six Sigma right into the process.
[Graphic: the distribution of X, through the relationship Y = F(x), produces the distribution of Y relative to the specification on Y.]
When designing the part or process, specifications on X are set such that the target capability on Y is achieved.
Both the target and tolerance of the X must be addressed in the spec limits.
6σ Product/Process Design
[Graphic: the same relationship Y = F(x), now shown with upper and lower prediction intervals around the regression line, connecting the specification on Y to the distribution of X.]
Defect Controls
[Graphic: a regression of output versus input with 95% prediction interval bands.]
What is the tolerance range for the input?
If you want 6σ performance, you will remember to tighten the output's specification to select the tolerance range of the input.
Usually we use the prediction band provided by MINITAB™. This is controllable by manipulation of the confidence intervals: 90%, 95%, 99%, etc. Play with adjusting the prediction bands to see the effect it has.
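How such a prediction band arises can be sketched by hand. This is a minimal illustration, not MINITAB output: the x/y data are made up, and the t value is the standard table value for 6 degrees of freedom.

```python
import math

# Fit a simple linear regression and compute a 95% prediction interval
# for a new input value x0 (the band MINITAB labels "95% PI").
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [18.9, 17.2, 15.8, 14.1, 12.2, 10.4, 9.1, 7.3]   # illustrative data

n = len(x)
xbar, ybar = sum(x) / n, sum(y) / n
sxx = sum((xi - xbar) ** 2 for xi in x)
b1 = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / sxx
b0 = ybar - b1 * xbar

sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
s = math.sqrt(sse / (n - 2))        # residual standard error

t = 2.447                           # t(0.975, df = n - 2 = 6) from tables
x0 = 4.5
half_width = t * s * math.sqrt(1 + 1 / n + (x0 - xbar) ** 2 / sxx)
yhat = b0 + b1 * x0
print(f"predicted Y at x0: {yhat:.2f}, 95% PI: +/-{half_width:.2f}")
```

The band is narrowest at the mean of the inputs and widens toward the extremes, which is why the tolerance you can claim for the input depends on where along the regression you read it.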
Regression Plot: Y = 2.32891 - 0.282622X, R-Sq = 96.1%
[Graphic: two regression plots of Output versus Input with prediction bands. Note: the high output spec connects with the top line in both cases.]
Using the top output spec determines the high or low tolerance for the input, depending on the slope of the regression.
Defect Controls
Regression Plot: Y = -4.7E-01 + 0.811312X, R-Sq = 90.4%
[Graphic: a regression plot with wide 95% prediction interval bands.]
Poor correlation does not allow for tighter tolerancing.
5 – 6 σ Full Automation
Full Automation: systems that monitor the process and automatically adjust critical X's to correct settings.
Automation can be an option as well, since it removes the human element and its inherent variation. Use caution, though: many times people jump into automation prematurely. If you automate a poor process, what will that do for you?
Defect Controls
4 – 5 σ Process Interruption
Defect Controls
Example:
• A Black Belt is working on launching a new electric drive unit on a transfer system.
– One common failure mode of the system is a bearing failure on the main motor shaft.
– It was determined that a high press fit at bearing installation was causing these failures.
– The root cause of the problem turned out to be undersized bearings from the supplier.
• Until the supplier could be brought into control or replaced, the team implemented a press load monitor at the bearing press with an indicator.
– If the monitor detects a press load higher than the set point, it shuts down the press and will not allow the unit to be removed from the press until an interlock key is turned and the ram reset in the manual mode.
– Only the line lead person and the supervisor have keys to the interlock.
– The non-conforming part is automatically marked with red dye.
Process Interruption
3 – 5 σ Mistake Proofing
Mistake Proofing is great because it is usually inexpensive and very effective. Consider the many everyday examples of Mistake Proofing. You cannot fit the diesel gas hose into an unleaded vehicle gas tank. Pretty straightforward, right?
Mistake Proofing is best defined as:
– Using wisdom, ingenuity or serendipity to create devices allowing a 100% defect free step 100% of the time.
Poka-Yoke is the Japanese term for mistake proofing: to avoid (“yokeru”) inadvertent errors (“poka”).
[Graphic: eight numbered everyday examples. See if you can find the Poka-Yokes!]
Defect Controls
This clearly highlights the difference between the two approaches. What are the benefits of the Source Inspection method?
Traditional Inspection: a worker or machine error either goes unaddressed (“don't do anything”) or results in a defective unit sorted out at another step.
Source Inspection: “KEEP ERRORS FROM TURNING INTO DEFECTS” – the operation is shut down (stopped) when the error occurs.
Defect Controls
The very best approaches make creating a defect impossible, recall the gas hose example, you
can not put diesel fuel into an unleaded gas tank unless you really try hard or have a hammer.
Contact Method
– Physical or energy contact with the product
• Limit switches
• Photo-electric beams
Fixed Value Method
– Number of parts to be attached/assembled etc. is constant
Motion-Step Method
– Number of steps done in the operation
• Limit switches
Examples: (1) guide pins of different sizes, (2) error detection and alarms, (3) limit switches, (4) counters.
Defect Controls
To see a much more in-depth review of improving product or service quality by preventing defects you MUST review the book shown here. A comprehensive 240 Poka-Yoke examples are shown and can be applied to many industries. The Poka-Yokes are meant to address errors from processing, assembly, mounting, insertion, measurement, dimensioning, labeling, inspection, painting, printing, misalignment and many other causes.
Defect Controls
Involve everyone in defect prevention:
– Establish process capability through SPC
– Establish and adhere to standard procedures
– Make daily improvements
– Invent Mistake-proofing devices
Class Exercise
You have 30 minutes!
Defect Controls
Notes
Control Phase
Statistical Process Control
We will now continue in the Control Phase with “Statistical Process Control or SPC”.
Overview
Welcome to Control
Advanced Experiments
Advanced Capability
Lean Controls
Defect Controls
Statistical Process Control (SPC)
– Elements and Purpose
– Methodology
– Special Cause Tests
– Examples
Six Sigma Control Plans
Wrap Up & Action Items
Statistical techniques can be used to monitor and manage process performance. Process
performance, as we have learned, is determined by the behavior of the inputs acting upon it in the
form of Y=f(X). As a result it must be well understood that we can only monitor the performance of a
process output. Many people have applied Statistical Process Control (SPC) to only the process
outputs. Because they were using SPC, their expectations were high regarding a new potential level
of performance and control over their processes. However, because they only applied SPC to the
outputs, they were soon disappointed. When you apply SPC techniques to outputs, it is
appropriately called Statistical Process Monitoring or SPM.
You of course know that you can only control an output by controlling the inputs that exert an
influence on that output. This is not to say that applying SPC techniques to an output is bad, there
are valid reasons for doing this. Six Sigma has helped us all to better understand where to apply
such control techniques.
In addition to controlling inputs and monitoring outputs, control charts are used to determine the
Baseline performance of a process, evaluate measurement systems, compare multiple processes,
compare processes before and after a change, etc. Control Charts can be used in many situations
that relate to process characterization, analysis and performance.
To better understand the role of SPC techniques in Six Sigma, we will first investigate some of the
factors that influence processes, then review how simple probability makes SPC work and finally
look at various approaches to monitoring and controlling a process.
Using rational subgroups is a common way to assure that this does not happen. A rational subgroup is a
sample of a process characteristic in which all the items in the sample were produced under very similar
conditions and in a relatively short time period. Rational subgroups are usually small in size, typically
consisting of 3 to 5 units to make up the sample. It is important that rational subgroups consist of units
that were produced as closely as possible to each other, especially if you want to detect patterns, shifts
and drifts. If a machine is drilling 30 holes a minute and you wanted to collect a sample of hole sizes, a
good rational subgroup would consist of 4 consecutively drilled holes. The selection of rational subgroups
enables you to accurately distinguish Special Cause variation from Common Cause variation.
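The drilled-hole example can be sketched in a few lines; the measurements below are illustrative, not from the text. Consecutive holes are grouped into rational subgroups of 4, and each subgroup's mean and range capture Common Cause variation within a short time window.

```python
# A stream of hole diameters from a machine drilling 30 holes a minute
# (made-up values). Consecutive units form the rational subgroups.
holes = [5.02, 4.98, 5.01, 5.00, 5.05, 4.97, 5.03, 4.99,
         5.01, 5.04, 4.96, 5.00]

# Group 4 consecutively drilled holes into each rational subgroup.
subgroups = [holes[i:i + 4] for i in range(0, len(holes), 4)]

# Subgroup means track shifts and drifts; subgroup ranges track
# short-term (Common Cause) variation.
means = [sum(g) / len(g) for g in subgroups]
ranges = [max(g) - min(g) for g in subgroups]
```

Plotting the means and ranges over time is exactly what an Xbar-R chart does; the within-subgroup ranges set the yardstick against which between-subgroup shifts are judged.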
Make sure that your samples are not biased in any way, meaning that they are randomly selected. For
example, do not plot only the first shift’s data if you are running multiple shifts. Don’t look at only one
vendor’s material if you want to know how the overall process is really running. Finally, don’t concentrate
on a specific time to collect your samples; like just before the lunch break.
If your process consists of multiple machines, operators or other process activities that produce streams
of the same output characteristic you want to control, it would be best to use separate Control Charts for
each of the output streams.
If the process is stable and in control, the sample observations will be randomly distributed around the
average. Observations will not show any trends or shifts and will not have any significant outliers from the
random distribution around the average. This type of behavior is to be expected from a normally operating
process and that is why it is called Common Cause variation. Unless you are intentionally trying to
optimize the performance of a process to reduce variation or change the average, as in a typical Six
Sigma project, you should not make any adjustments or alterations to the process if it is demonstrating
only Common Cause variation. That can be a big time saver since it prevents “wild goose chases.”
• An I-MR Chart combines a Control Chart of the average moving range with the Individuals Chart.
• You can use Individuals Charts to track the process level and to detect the presence of Special Causes when the sample size is 1.
• Seeing both charts together allows you to track both the process level and process variation at the same time, providing greater sensitivity that can help detect the presence of Special Causes.
[Figure: Individuals Chart (observations 1-30 plotted with the Xbar center line, UCL and LCL) above an MR Chart (moving ranges plotted with the Rbar center line, UCL and LCL)]
Individuals (I) and Moving Range (MR) Charts are used when each measurement represents one
batch. The subgroup size is equal to one when I-MR Charts are used. These charts are very
simple to prepare and use. The graphic shows the Individuals Chart where the individual
measurement values are plotted with the Center Line being the average of the individual
measurements. The Moving Range Chart shows the range between two subsequent
measurements.
There are certain situations when opportunities to collect data are limited or when grouping the
data into subgroups simply doesn't make practical sense. Perhaps the most obvious of these
cases is when each individual measurement is already a rational subgroup. This might happen
when each measurement represents one batch, when the measurements are widely spaced in
time or when only one measurement is available in evaluating the process. Such situations include
destructive testing, inventory turns, monthly revenue figures and chemical tests of a characteristic
in a large container of material.
All of these situations indicate a subgroup size of one. Because this chart is dealing with individual
measurements, it is not as sensitive as the X-Bar Chart in detecting process changes.
If each of your observations consists of a subgroup of data, rather than just individual
measurements, an Xbar-R Chart provides greater sensitivity. Failure to form rational
subgroups correctly will make your Xbar-R Charts dangerously wrong.
[Figure: Xbar Chart (subgroups 1-30 plotted with the Xbarbar center line, UCL and LCL) above an R Chart (subgroup ranges plotted with the Rbar center line, UCL and LCL)]
These charts are most effective when they are used together. Each chart individually shows only a
portion of the information concerning the process characteristic. The upper chart shows how the
process average (central tendency) changes. The lower chart shows how the variation of the process
has changed.
It is important to control both the process average and the variation separately because different
corrective or improvement actions are usually required to effect a change in each of these two
parameters.
The R Chart must be in control in order to interpret the averages chart because the Control Limits are
calculated considering both process variation and center. When the R Chart is not in control, the
control limits on the averages chart will be inaccurate and may falsely indicate an out of control
condition. In this case, the lack of control will be due to unstable variation rather than actual changes
in the averages.
XBar and RBar Charts are often more sensitive than I-MR, but are frequently done incorrectly. The
most common error is failure to perform rational sub-grouping correctly.
A rational subgroup is simply a group of items made under conditions that are as nearly identical as
possible. Five consecutive items, made on the same machine, with the same setup, the same raw
materials and the same operator, are a rational subgroup. Five items made at the same time on
different machines are not a rational subgroup. Failure to form rational subgroups correctly will make
your XBar-R Charts dangerously wrong.
[Figure: U Chart (samples 1-30) plotting DPU with the Ubar center line and varying UCL and LCL]
The U Chart plots defects per unit data collected from subgroups of equal or unequal sizes. The "U"
in U Charts stands for defects per Unit. U Charts plot the proportion of defects that are occurring.
The U Chart and the C Chart are very similar. They both look at defects, but the U Chart does not
need a constant sample size like the C Chart does. The Control Limits on the U Chart vary with the
sample size and therefore they are not uniform, similar to the P Chart which we will describe next.
Counting defects on forms is a common use for the U Chart. For example, defects on insurance
claim forms are a problem for hospitals. Every claim form has to be checked and corrected before
going to the insurance company. When completing a claim form, a particular hospital must fill in 13
fields to indicate the patient’s name, social security number, DRG codes and other pertinent data. A
blank or incorrect field is a defect.
A hospital measured their invoicing performance by calculating the number of defects per unit for
each day’s processing of claims forms. The graph demonstrates their performance on a U Chart.
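Under the standard U Chart formulas (center line u-bar, with limits u-bar +/- 3*sqrt(u-bar/n) where n varies by subgroup), the hospital's daily limits could be sketched as follows. This is a minimal illustration; the function name and the sample data are our own, not from the text.

```python
# Sketch of U Chart calculations for defects-per-unit data with
# varying subgroup sizes. u-bar is total defects over total units;
# each subgroup gets its own limits because the sizes differ.
from math import sqrt

def u_chart_limits(defects, sizes):
    """Return u_bar and a per-subgroup list of (LCL, UCL) pairs."""
    u_bar = sum(defects) / sum(sizes)
    limits = []
    for n in sizes:
        half_width = 3 * sqrt(u_bar / n)
        # LCL is floored at zero; a defect rate cannot be negative
        limits.append((max(0.0, u_bar - half_width), u_bar + half_width))
    return u_bar, limits

# Illustrative data: defective fields found per day, and claim forms checked per day
defects = [12, 8, 15, 9]
sizes = [100, 80, 120, 90]
u_bar, limits = u_chart_limits(defects, sizes)
```

Note how each day's Control Limits widen as that day's sample size shrinks, which is why U Chart limits look uneven when subgroup sizes vary.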
[Figure: P Chart (samples 1-30) plotting Proportion Defective (P) with the Pbar center line and varying UCL and LCL]
The P Chart plots the proportion of nonconforming units collected from subgroups of equal or
unequal size (percent defective). The proportion of defective units observed is obtained by dividing
the number of defective units observed in the sample by the number of units sampled. The P Chart's
name comes from plotting the Proportion of defectives. When using samples of different sizes, the
upper and lower Control Limits will not remain the same - they will look uneven as exhibited in the
graphic. These varying Control Chart limits are effectively managed by Control Charting software.
A common application of a P Chart is when the data is in the form of a percentage and the sample
size for the percentage has the chance to be different from one sample to the next. An example
would be the number of patients that arrive late each day for their dental appointments. Another
example is the number of forms processed daily that had to be reworked due to defects. In both of
these examples, the total quantity would vary from day to day.
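The standard P Chart calculation (center line p-bar, limits p-bar +/- 3*sqrt(p-bar*(1 - p-bar)/n) with n varying per subgroup) can be sketched as below. The function name and data are illustrative only.

```python
# Sketch of P Chart calculations for proportion-defective data with
# varying sample sizes, e.g. late dental patients out of each day's appointments.
from math import sqrt

def p_chart_limits(defectives, sizes):
    """Return p_bar and a per-subgroup list of (LCL, UCL) pairs."""
    p_bar = sum(defectives) / sum(sizes)
    limits = []
    for n in sizes:
        half_width = 3 * sqrt(p_bar * (1 - p_bar) / n)
        # proportions are naturally bounded by 0 and 1
        limits.append((max(0.0, p_bar - half_width), min(1.0, p_bar + half_width)))
    return p_bar, limits

# Illustrative data: late arrivals per day, and appointments per day
p_bar, limits = p_chart_limits([4, 7, 3, 6], [40, 55, 35, 50])
```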
SPC on X's or Y's with fully trained operators and staff who respect the rules. Once a
chart signals a problem everyone understands the rules of SPC and agrees to shut down
for special cause identification. (Cpk > certain level).
SPC on X's or Y's with fully trained operators. The operators have been trained and
understand the rules of SPC, but management will not empower them to stop for
investigation.
S.O.P. is implemented to attempt to detect the defects. This action is not sustainable
short-term or long-term.
The second most effective control is called a type 2 corrective action. This a control applied to the
process which will detect when an error condition has occurred and will stop the process or shut
down the equipment so that the defect will not move forward. This is the “detection” application of
the Poka-Yoke method.
The third most effective form of control is to use SPC on the X’s with appropriate monitoring on the
Ys. To be effective, employees must be fully trained, they must respect the rules and management
must empower the employees to take action. Once a chart signals a problem, everyone understands
the rules of SPC and agrees to take emergency action for special cause identification and
elimination.
The fourth most effective corrective action is the implementation of a short-term containment which
is likely to detect the defect caused by the error condition. Containments are typically audits or 100%
inspection.
Finally, you can prepare and implement an S.O.P. (standard operating procedure) to attempt to
manage the process activities and to detect process defects. This action is not sustainable, either
short-term or long-term.
Do not do SPC for the sake of just saying that you do SPC. It will quickly deteriorate to a waste of
time and a very valuable process tool will be rejected from future use by anyone who was
associated with the improper use of SPC.
Using the correct level of control for an improvement to a process will increase the acceptance of
changes/solutions you may wish to make and it will sustain your improvement for the long-term.
SPC is used to detect special cause variation telling us the process is "out of control" but does NOT tell us why.
SPC has its uses because it is known that every process has variation, categorized as Special Cause and
Common Cause variation. Special Cause variation is unnatural variability because of assignable causes
or pattern changes. SPC is a powerful tool to monitor and improve the variation of a process. This
powerful tool is often an aspect used in visual factories. If a supervisor, operator or staff member is able to
quickly monitor how the process is operating by looking at the key inputs or outputs of the process, this
would exemplify a visual factory.
SPC is used to detect Special Causes in order to have those operating the process find and remove the
Special Cause. When a Special Cause has been detected, the process is considered to be “out of
control”.
SPC gives an ongoing look at the Process Capability. It is not a capability measurement but it is a visual
indication of the continued Process Capability of your process.
[Figure: Individuals chart (observations 1-28) with Special Cause variation detected beyond the Control Limits; the Process Center (usually the mean) is X=29.06, with Control Limits UCL=55.24 and LCL=2.87]
Control Charts were first developed by Dr. Shewhart in the early 20th century in the U.S. Control
Charts are a graphical and visual plot of a process charted over time, like a Time Series
Chart. From a visual management aspect, a Time Plot is more powerful than knowledge of the last
measurement. These charts are meant to indicate change in a process. All SPC charts have a
Central Line and Control Limits to aid in detecting Special Cause variation.
Notice, again, we never discussed showing or considering specifications. We are advising you to
never have specification limits on a Control Chart because of the confusion often generated.
Remember we want to control and maintain the newly improved process based on
the recently improved past. These Control Charts and their limits are the Voice of the Process, not
the Voice of the Customer, which is the specification limits.
Control Charts indicate when a process is "out of control" or exhibiting special cause
variation but NOT why!
SPC charts allow workers and supervision to maintain improved process performance
from Six Sigma projects.
Control limits describe the process variability and are unrelated to customer
specifications. (Voice of the Process instead of Voice of the Customer)
– An undesirable situation is having control limits wider than customer
specification limits. This will exist for poorly performing processes with a Cp
less than 1.0
[Figure: Control Chart schematic - Mean, Upper and Lower Control Limits plotted over a Process Sequence/Time Scale; points within the limits reflect Common Cause variation (process is "In Control"), points beyond them reflect Special Cause variation (process is "Out of Control")]
SPC monitors the consistency of processes producing products and services. A primary SPC tool is
the Control Chart - a graphical representation for specific quantitative measurements of a process
input or output. In the Control Chart, these quantitative measurements are compared to decision
rules calculated based on probabilities from the actual measurement of process performance.
The comparison between the decision rules and the performance data detects any unusual variation
in the process that could indicate a problem with the process. Several different descriptive statistics
can be used in Control Charts. In addition, there are several different types of Control Charts that can
test for different causes, such as how quickly major vs. minor shifts in process averages are detected.
Control Charts are Time Series Charts of all the data points with one addition. The Standard
Deviation for the data is calculated for the data and two additional lines are added to the chart. These
lines are placed +/- 3 Standard Deviations away from the Mean and are called the Upper Control
Limit (UCL) and the Lower Control Limit (LCL). Now the chart has three zones: (1) the zone between
the UCL and the LCL, which is called the zone of Common Cause variation, (2) the zone above the
UCL, which is a zone of Special Cause variation and (3) another zone of Special Cause variation below
the LCL.
Control Charts graphically highlight data points that do not fit the normal level of expected variation.
This is mathematically defined as being more than +/- 3 Standard Deviations from the Mean. It's all
based on probabilities. We will now demonstrate how this is determined.
Certified Lean Six Sigma Black Belt Book Copyright OpenSourceSixSigma.com
[Figure: normal curve with outliers beyond +/- 3 Standard Deviations; +/- 1, 2 and 3 Standard Deviations cover 68%, 95% and 99.7% of the distribution respectively]
Control Charts provide you with two basic functions; one is to provide time based information on the
performance of the process which makes it possible to track events affecting the process and the
second is to alert you when Special Cause variation occurs. Control Charts graphically highlight data
points that do not fit the normal level of variation expected. It is standard that the Common Cause
variation level is defined as +/- 3 Standard Deviations from the Mean. These limits are also known as
the UCL and LCL respectively.
Recall the "area under the curve" discussion in the lesson on Basic Statistics, remembering that +/- one
Standard Deviation represented 68% of the distribution, +/- 2 was 95% and +/- 3 was 99.7%. You also
learned from a probability perspective that you would expect the output of a process to have a
99.7% chance of being between +/- 3 Standard Deviations. You also learned that the sum of all probability
must equal 100%. There is only a 0.3% chance (100% - 99.7%) that a data point will be beyond +/- 3
Standard Deviations. In fact, since we are talking about two zones, one above the +3 Standard
Deviations and one below it, we have to split 0.3% in two, meaning that there is only a 0.15% chance of
being in one of the zones.
There is only a .0015 (.15%) probability that a data point will fall in one of these zones, above the UCL or
below the LCL. That is a very small probability as compared to the .997 (99.7%) probability that the data
point will be between the UCL and the LCL. What this means is there must have been something special
happen to cause a data point to be that far from the Mean, like a change in vendor, a mistake, etc. This
is why the term Special Cause or assignable cause variation applies. The probability that a data point
was this far from the rest of the population is so low that something special or assignable happened.
Outliers are just that; they have a low probability of occurring, meaning we have lost control of our
process. This simple, quantitative approach using probability is the essence of all Control Charts.
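These tail probabilities can be checked with the standard normal CDF using only the standard library. This is a minimal sketch; the helper name phi is our own. Note the exact two-tail value is about 0.27%, which the text rounds to 0.3%.

```python
# Probability of a point falling beyond the +/- 3 sigma Control Limits,
# computed from the standard normal distribution.
from math import erf, sqrt

def phi(z):
    """Standard normal cumulative distribution function via the error function."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

p_outside = 2 * (1 - phi(3))   # both tails combined, about 0.0027 (0.27%)
p_one_tail = 1 - phi(3)        # one zone only, about 0.00135 (0.135%)
```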
Size of Subgroups
[Figure: Lots 1-5 each sampled as short-term studies within one long-term study]
Let’s consider if you were tracking delivery times for quotes on new business with an SPC chart. If
you decided to not include averaging across product categories, you might find product categories
are assignable causes but you might not find them as Special Causes since you've included them
in the subgroups as part of your rationalization.
You really want to have subgroups with only Common Cause variation so if other sources of
variation are detected, the sources will be easily found instead of buried within your definition of
subgroups.
Frequency of Sampling
Sampling Frequency is a balance between cost of sampling and testing versus cost of not detecting
shifts in mean or variation.
Process knowledge is an input to frequency of samples after the subgroup size has been decided.
- If a process shifts but cannot be detected because of too infrequent sampling, the
customer suffers
- If choice is given of large subgroup samples infrequently or smaller subgroups
more frequently, most choose to get information more frequently.
- In some processes with automated sampling and testing, frequent sampling is
easy.
If undecided as to sample frequency, sample more frequently to confirm detection of process shifts
and reduce frequency if process variation is still detectable.
A rule of thumb also states "sample a process at least 10X more frequently than the frequency of 'out of
control' conditions".
Sometimes it can be a struggle to decide how often to sample your process when monitoring results. Unless the
measurement is automated, inexpensive, recorded with computers and able to be charted with
SPC software without operator involvement, the frequency of sampling is an issue.
Let's reemphasize some points. First, you do NOT want to under sample and lose the ability to
find Special Cause variation easily. Second, do not be afraid to sample more frequently and then
reduce the frequency if it is clear Special Causes are found frequently.
Sampling too little will not allow for sufficient detection of shifts in the process because of special causes.
[Figure: I Chart of Sample_3 output sampled every half hour (UCL=7.385, X=6.129, LCL=4.815), compared with Individuals Charts of the same process sampled less frequently (X=5.85), illustrating how infrequent sampling can miss process shifts]
There are two categories of Control Charts for Continuous Data: charts for controlling the process
average and charts for controlling the process variation. Generally, the two categories are combined.
The principal types of Control Charts used in Six Sigma are: charts for Individual Values and Moving
Ranges (I-MR), charts for Averages and Ranges (XBar-R), charts for Averages and Standard
Deviations (XBar-S) and Exponentially Weighted Moving Average charts (EWMA).
Although it is preferable to monitor and control products, services and supporting processes with
Continuous Data, there will be times when Continuous Data is not available or there is a need to
measure and control processes with higher level metrics, such as defects per unit. There are many
examples where process measurements are in the form of Attribute Data. Fortunately, there are
control tools that can be used to monitor these characteristics and to control the critical process
inputs and outputs that are measured with Attribute Data.
Attribute Data, also called Discrete Data, reflects only one of two conditions: conforming or non-
conforming, pass or fail, go or no go. Four principal types of Control Charts are used to monitor and
control characteristics measured in Attribute Data: the p (proportion nonconforming), np (number
nonconforming), c (number of non-conformities) and u (non-conformities per unit) charts. These
charts are an aid to decision making. With Control Limits, they help us filter the probable noise by
adequately reflecting the Voice of the Process.
A defective is defined as an entire unit that fails to meet acceptance criteria, regardless of the
number of defects in the unit. A defect is defined as the failure to meet any one of the many
acceptance criteria. Any unit with at least one defect may be considered to be a defective.
Sometimes more than one defect is allowed, up to some maximum number, before the product is
considered to be defective.
Type of Chart – When do you need it?
• Pre-Control – Set-up is critical, or cost of setup scrap is high. Use for outputs.
• Exponentially Weighted Moving Average – Small shift needs to be detected, often because of autocorrelation of the output results. Used only for individuals or averages of Outputs. Infrequently used because of calculation complexity.
• Cumulative Sum – Same reasons as EWMA (Exponentially Weighted Moving Average) except the past data is as important as present data. Less common.
• C – When you want to track the number of defects per subgroup of units produced; sample size is constant.
The P Chart is the most common type of Attribute Control Chart.
Control Charts indicate special causes being either assignable causes or patterns.
The following rules are applicable for both variable and Attribute Data to detect
special causes.
These four rules are the only applicable tests for Range (R), Moving Range (MR), or
Standard Deviation (S) charts:
– One point more than 3 Standard Deviations from the center line.
– 6 points in a row all either increasing or all decreasing.
– 14 points in a row alternating up and down.
– 9 points in a row on the same side of the center line.
These remaining four rules are only for variable data to detect special causes:
– 2 out of 3 points greater than 2 Standard Deviations from the center line on the
same side.
– 4 out of 5 points greater than 1 Standard Deviation from the center line on the
same side.
– 15 points in a row all within one Standard Deviation of either side of the center
line.
– 8 points in a row all greater than one Standard Deviation of either side of the
center line.
Remember Control Charts are used to monitor process performance and to detect Special Causes
due to assignable causes or patterns. The standardized rules of your organization may differ slightly
in some of the numbers. For example, some organizations use 7 or 8 points in a row on the
same side of the Center Line. We will soon show you how to find what your MINITABTM version has
for defaults for the Special Cause tests.
There are typically 8 available tests for detecting Special Cause variation. Only 4 of the 8 Special
Cause tests can be used on Range, Moving Range or Standard Deviation charts, which are used to
monitor "within" variation.
If you are unsure of what is meant by these specific rule definitions, do not worry. The next few pages
will specifically explain how to interpret these rules.
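To show the detection logic, two of these rules can be sketched in code. This is an illustrative sketch only (function names and data are our own); the run length defaults to the 9-in-a-row value listed above, which your organization may set to 7 or 8.

```python
# Sketch of two Special Cause tests against a Control Chart's
# center line and estimated standard deviation.

def rule_1(points, center, sigma):
    """Indices of points more than 3 Standard Deviations from the center line."""
    return [i for i, x in enumerate(points) if abs(x - center) > 3 * sigma]

def rule_same_side(points, center, run=9):
    """Indices that complete a run of `run` consecutive points on one side of center."""
    hits, streak, side = [], 0, 0
    for i, x in enumerate(points):
        s = 1 if x > center else (-1 if x < center else 0)
        if s != 0 and s == side:
            streak += 1          # run continues on the same side
        elif s != 0:
            streak = 1           # run restarts on the other side
        else:
            streak = 0           # a point exactly on the center line breaks the run
        side = s
        if streak >= run:
            hits.append(i)
    return hits
```

In practice SPC software such as MINITABTM applies these tests automatically and flags the violating point with the test number.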
• If implementing SPC manually without software initially, the most visually obvious violations are
more easily detected. SPC on manually filled charts is commonplace for initial use of defect
prevention techniques.
• These 3 rules are visually the most easily detected by personnel.
– One point more than 3 Standard Deviations from the center line.
– 6 points in a row all either increasing or all decreasing.
– 15 points in a row all within one Standard Deviation of either side of the center line.
• Dr. Shewhart, who worked with the Western Electric Co., is credited with the following 4 rules,
referred to as the Western Electric Rules.
– One point more than 3 Standard Deviations from the center line.
– 8 points in a row on the same side of the center line.
– 2 out of 3 points greater than 2 Standard Deviations from the center line on the same side.
– 4 out of 5 points greater than 1 Standard Deviation from the center line on the same side.
• You might notice the Western Electric rules vary slightly. The important thing is to be consistent in
your organization and decide what rules you will use to detect special causes.
• VERY few organizations use all 8 rules for detecting special causes.
If a Belt is using MINITABTM, you must be aware of the default settings
for the rules. You can alter your program defaults with:
Tools>Options>Control Charts and Quality Tools>Define Tests
This would be changed to 8 if you prefer the Western Electric Rules.
When a Belt is using MINITABTM, the default tests can be set when
running SPC on the variable or Attribute Data.
Tools>Options>Control Charts and Quality Tools>Tests to Perform
A Belt can always change which tests are selected for any individual
SPC chart.
As promised, we will now closely review the definition of the Special Cause tests. The first test is one
point more than 3 sigmas from the center line (Test 1: One point beyond zone A). This is the MOST
common special cause test used in SPC charts.
If you want to see the MINITABTM output on the left, execute the MINITABTM command "Stat,
Control Charts, Variable Charts for Individuals, Individuals" and then select the "I chart options and
Tests tab". Remember, your numbers may vary in the slide and those are set in the defaults as
you were shown recently in this module. From now on, we will assume your rules are the same as
shown in this module. If not, just adjust the conclusions.
This rule obviously needs the time order when plotting on the SPC charts to be valid. Typically,
these charts plot increasing time from left to right with the most recent point on the right hand side of
the chart. Do not make the mistake of seeing six points in a line as indicating an out of control condition.
Note on the example shown on the right, a straight line shows seven points but it takes that many in
order to have six consecutive points increasing. This rule would be violated no matter what zone the
points occur in.
Have you noticed that MINITABTM will automatically place a number by the point that violates the
Special Cause rule and that number tells you which of the Special Cause tests has been violated.
In this example shown on the right, the Special Cause rule was violated two times.
Looking at the five consecutive points more than 1 sigma from the Center Line and on
the same side, do NOT make the wrong assumption that the rule would not be violated if one of the
four points was actually more than 2 sigma from the Center Line.
The seventh Special Cause test looks for 15 points in a row all within one sigma of the Center Line
(Test 7: Fifteen points in a row in zone C, both sides of center line). This test is indicating a dramatic
improvement of the variation in the process. You might think this is a good thing and it certainly is.
However, a process showing this behavior should be investigated to find the source of the reduced
variation so the improvement can be sustained in the future.
Do not be confused if some of the points are more than 2 sigma away from the Center Line. If
you reread the rule, it just states the points must be more than one sigma from the Center Line.
This is a reference for you in case you really want to get into the nitty-gritty. The formulas shown here
are the basis for Control Charts.
Xbar = (sum of the xi) / k        MRbar = (sum of the Ri) / (k - 1)
UCLx = Xbar + E2*MRbar        LCLx = Xbar - E2*MRbar
UCLmr = D4*MRbar        LCLmr = D3*MRbar
σ (st. dev. estimate) = MRbar / d2
Where:
Xbar: Average of the individuals, becomes the centerline on the Individuals chart
xi: Individual data points
k: Number of individual data points (giving k - 1 moving ranges)
Ri: Moving range between individuals, generally calculated using the difference between
each successive pair of readings
MRbar: The average moving range, the centerline on the range chart
UCLx: Upper control limit on individuals chart
LCLx: Lower control limit on individuals chart
UCLmr: Upper control limit on moving range chart
LCLmr: Lower control limit on moving range chart (does not apply for sample sizes below 7)
E2, D3, D4, d2: Constants that vary according to the sample size used in obtaining the moving range
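These I-MR formulas can be computed directly, as in this minimal sketch. The function name and data are our own; the constants are the standard published values for a moving range of two points (E2 = 2.66, D4 = 3.267, D3 = 0, d2 = 1.128).

```python
# Sketch of I-MR Control Chart limit calculations using a moving
# range of 2 consecutive points and the standard n=2 constants.

def imr_limits(xs):
    """Return I-MR chart limits and a sigma estimate for individual data."""
    mrs = [abs(b - a) for a, b in zip(xs, xs[1:])]   # k - 1 moving ranges
    x_bar = sum(xs) / len(xs)
    mr_bar = sum(mrs) / len(mrs)
    return {
        "UCLx": x_bar + 2.66 * mr_bar,    # E2 = 2.66
        "LCLx": x_bar - 2.66 * mr_bar,
        "UCLmr": 3.267 * mr_bar,          # D4 = 3.267
        "LCLmr": 0.0,                     # D3 = 0 for n = 2
        "sigma_hat": mr_bar / 1.128,      # d2 = 1.128
    }
```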
Xdoublebar = (sum of the subgroup averages Xbar-i) / k        Rbar = (sum of the Ri) / k
UCLx = Xdoublebar + A2*Rbar        LCLx = Xdoublebar - A2*Rbar
UCLr = D4*Rbar        LCLr = D3*Rbar
σ (st. dev. estimate) = Rbar / d2
Where:
Xdoublebar: Average of the subgroup averages, it becomes the centerline of the control chart
Xbar-i: Average of each subgroup
k: Number of subgroups
Ri: Range of each subgroup (Maximum observation - Minimum observation)
Rbar: The average range of the subgroups, the centerline on the range chart
UCLx: Upper control limit on averages chart
LCLx: Lower control limit on averages chart
UCLr: Upper control limit on range chart
LCLr: Lower control limit on range chart
A2, D3, D4, d2: Constants that vary according to the subgroup sample size
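The Xbar-R formulas above can likewise be sketched in code. The function name and data are illustrative; the constants hard-coded below are the standard published values for a subgroup size of n = 5 (A2 = 0.577, D4 = 2.114, D3 = 0, d2 = 2.326).

```python
# Sketch of Xbar-R Control Chart limit calculations for rational
# subgroups of size 5, using the standard n=5 constants.

def xbar_r_limits(subgroups):
    """Return Xbar-R chart limits and a sigma estimate from a list of subgroups."""
    A2, D3, D4, d2 = 0.577, 0.0, 2.114, 2.326   # constants for n = 5
    xbars = [sum(g) / len(g) for g in subgroups]
    ranges = [max(g) - min(g) for g in subgroups]
    xbarbar = sum(xbars) / len(xbars)           # centerline of the averages chart
    rbar = sum(ranges) / len(ranges)            # centerline of the range chart
    return {
        "UCLx": xbarbar + A2 * rbar, "LCLx": xbarbar - A2 * rbar,
        "UCLr": D4 * rbar, "LCLr": D3 * rbar,
        "sigma_hat": rbar / d2,
    }
```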
Yet another reference just in case anyone wants to do this stuff manually… have fun!
Xdoublebar = (sum of the subgroup averages Xbar-i) / k        Sbar = (sum of the si) / k
UCLx = Xdoublebar + A3*Sbar        LCLx = Xdoublebar - A3*Sbar
UCLs = B4*Sbar        LCLs = B3*Sbar
σ (st. dev. estimate) = Sbar / c4
Where:
Xdoublebar: Average of the subgroup averages, it becomes the centerline of the control chart
Xbar-i: Average of each subgroup
k: Number of subgroups
si: Standard deviation of each subgroup
Sbar: The average standard deviation of the subgroups, the centerline on the S chart
UCLx: Upper control limit on averages chart
LCLx: Lower control limit on averages chart
UCLs: Upper control limit on S chart
LCLs: Lower control limit on S chart
A3, B3, B4, c4: Constants that vary according to the subgroup sample size
We are now moving to the formula summaries for the attribute SPC Charts. These formulas are fairly
basic. The upper and lower Control Limits are equidistant from the Mean % defective unless you
reach a natural limit of 100% or 0%. Remember the P Chart is for tracking the proportion or % defective.
Calculate the parameters of the P Control Charts with the following:
Pbar = total number of defective units / total number of units sampled
UCLp = Pbar + 3*sqrt(Pbar*(1 - Pbar)/n)        LCLp = Pbar - 3*sqrt(Pbar*(1 - Pbar)/n)
The nP Chart's formulas resemble the P Chart's. This chart tracks the number of defective items in a
subgroup.
The U Chart is also basic in construction and is used to monitor the number of defects per unit.
The C Control Charts are a nice way of monitoring the number of defects in sampled subgroups:
UCLc = Cbar + 3*sqrt(Cbar)        LCLc = Cbar - 3*sqrt(Cbar)
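A C Chart calculation of this kind can be sketched in a few lines; the function name and defect counts are our own. The LCL is floored at zero since a defect count cannot be negative.

```python
# Sketch of C Chart limits for defect counts from constant-size subgroups.
from math import sqrt

def c_chart_limits(counts):
    """Return (Cbar, LCL, UCL) for a list of defect counts per subgroup."""
    c_bar = sum(counts) / len(counts)
    half_width = 3 * sqrt(c_bar)
    return c_bar, max(0.0, c_bar - half_width), c_bar + half_width

# Illustrative data: defects found in six equally sized subgroups
c_bar, lcl, ucl = c_chart_limits([7, 4, 9, 6, 5, 8])
```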
This EWMA can be considered a smoothing monitoring system with Control Limits. It is rarely used
without computers or automated calculations. The items plotted are NOT the actual measurements
but the weighted measurements. The exponentially weighted moving average is useful for considering
past and historical data and is most commonly used for individual measurements, although it has been
used for averages of subgroups.
The CUSUM is an even more difficult technique to handle with manual calculations. We aren't even
showing the math behind this rarely used chart. Following the Control Chart selection route shown
earlier, we remember the CUSUM is used when historical information is as important as present data.
Pre-Control Charts

Pre-Control Charts use limits relative to the specification limits. This is the first and ONLY chart on which you will see specification limits plotted for statistical process control. This is the most basic type of chart and an unsophisticated use of process control.

(Pre-Control zone diagram: the Red Zones lie outside the specification limits and signal that the process is out of control and should be stopped.)
Pre-Control Charts are often used for startups with high scrap cost or low production volumes between setups. Because they work like a stoplight, Pre-Control Charts are the easiest type of SPC for operators or staff to use. Remember, Pre-Control Charts are to be used ONLY for outputs of a process. Another approach to using Pre-Control Charts is to use process capability to set the limits where yellow and red meet.
Qualifying a Process

• To qualify a process, five consecutive parts must fall within the green zone
• The process should be re-qualified after tool changes, adjustments, new operators, material changes, etc.
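The five-in-green qualification rule above can be sketched as a simple running check (the zone labels and function name are illustrative, not from the text):

```python
def qualified(zones, run_length=5):
    """Return True once `run_length` consecutive parts fall in the green zone."""
    streak = 0
    for zone in zones:
        # Any part outside the green zone resets the consecutive count.
        streak = streak + 1 if zone == "green" else 0
        if streak >= run_length:
            return True
    return False
```

A yellow or red part anywhere in the run resets the count, so the process only qualifies on five greens in a row.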
SPC is an exciting tool, but we must not get enamored with it. The power of SPC is not to find the Center Line and Control Limits; its effectiveness at reducing long-term variation comes from responding immediately to out of control or Special Cause indications.

• The power of SPC isn't to find out what the Center Line and Control Limits are.
• The power is to react to the Out of Control (OOC) indications with your Out of Control Action Plans (OCAP) for the process involved. These actions are your corrective actions to correct the output or input to achieve proper conditions.

(Individual SPC chart for Response Time: X̄ = 18.38, UCL = 39.76, LCL = −3.01; one violation is flagged, with the annotation "additional person on phone bank".)

• SPC requires immediate response to a special cause indication.
• SPC also requires no "sub-optimizing" by those operating the process.
  – Variability will increase if operators adjust on every point that is not at the center line. ONLY respond when an Out of Control or special cause indication is detected.
  – Training is required to interpret the charts and respond to them.
SPC can actually be harmful if those operating the process respond to process variation by sub-optimizing. A basic rule of SPC: if the process is not out of control as indicated by the rules, do not make any adjustments. Studies have shown that an operator who responds to every off-center measurement will actually produce worse variation than a process not altered at all. Remember, being off the Center Line is NOT a sign of out of control, because Common Cause variation always exists. Training is required to use and interpret the charts, not to mention training for you as a Belt to properly create an SPC chart.
Attribute SPC Example

Practical Problem: A project has been launched to get rework reduced to less than 25% of paychecks. Rework includes contacting a manager about overtime hours to be paid. The project made some progress, but the team decides to implement SPC to sustain the gains and track % defective. Please analyze the file "paycheck2.mtw" and determine the Control Limits and Center Line.

Steps 3 and 5 of the methodology are the primary focus for this example:
– Select the appropriate control chart and special cause tests to employ
– Calculate the Center Line and Control Limits

Looking at the data set, we see 20 weeks of data. The sample size is constant at 250. The number of defectives in each sample is in column C3.

Paycheck2.mtw
We will confirm what rules for special causes are included in our Control Chart analysis. Remember to click on the Options and Tests tab to clarify the rules for detecting special causes.

…Chart Options>Tests

We will confirm what rules for special causes are included in our Control Chart analysis. The top 3 were selected.
(P Chart of Empl_w_Errors: P̄ = 0.2038, UCL = 0.2802, LCL = 0.1274, plotted over 20 samples.)
Now we must see if the next few weeks show special cause in the results. The sample size remained at 250 and the defective checks were 61, 64, and 77.

Remember, we calculated the Control Limits from the first 20 weeks. We must now enter 3 new weeks and NOT have MINITAB™ calculate new Control Limits, which it will do automatically if we do not follow this technique. We are executing Steps 6-8:
– Step 6: Plot process X or Y on the newly created control chart
– Step 7: Check for Out-Of-Control (OOC) conditions after each point
– Step 8: Interpret findings, investigate special cause variation, and make improvements following the Out of Control Action Plan (OCAP)
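As a rough cross-check of the Minitab limits and the three new weeks (a sketch of the underlying arithmetic, not Minitab's computation; variable names are illustrative):

```python
import math

# Centerline and constant subgroup size from the first 20 weeks
p_bar, n = 0.2038, 250
half = 3 * math.sqrt(p_bar * (1 - p_bar) / n)
ucl, lcl = p_bar + half, p_bar - half   # approx. 0.2802 and 0.1274, matching the chart

# Weeks 21-23: defective checks out of 250
new_defectives = [61, 64, 77]
flags = [d / n > ucl or d / n < lcl for d in new_defectives]
# Only the third week (77/250 = 0.308) exceeds the UCL, the special cause on the chart.
```

This reproduces the behavior on the updated chart: weeks 21 and 22 stay inside the limits, and week 23 triggers the out-of-control signal.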
…Chart Options>Parameters

(Updated P Chart of Empl_w_Errors with weeks 21-23 added: P̄ = 0.2038, UCL = 0.2802, LCL = 0.1274; the last sample is flagged as a special cause.)

Because of the special cause, the process must refer to the OCAP, or Out of Control Action Plan, which states what root causes need to be investigated and what actions are taken to get the process back in control.
After the corrective actions were taken, wait until the next sample is taken to see if the process has changed and no longer shows special cause.
– If still out of control, refer to the OCAP and take further action to improve the process. DO NOT make any more changes if the process shows back in control after the next reading.
  • Even if the next reading seems higher than the center line! Don't cause more variability.

If process changes are documented after this project was closed, the Control Limits should be recalculated as in step 9 of the SPC methodology.
Practical Problem: A job shop drills holes for its largest customer as a final step to deliver a highly engineered fastener. This shop uses five drill presses and gathers data every hour, with one sample from each press representing a subgroup. The data is gathered in columns C3-C7.

Steps 3 and 5 of the methodology are the primary focus for this example:
– Select the appropriate Control Chart and special cause tests to employ
– Calculate the Center Line and Control Limits

Holediameter.mtw

Let's walk through another example of using SPC within MINITAB™, in this case with Continuous Data. Open the MINITAB™ worksheet called "hole diameter", select the appropriate type of Control Chart, and calculate the Center Line and Control Limits.
We will confirm what rules for special causes are included in our Control Chart analysis. Remember to click on the Options and Tests tab to clarify the rules for detecting special causes.

…Xbar-R Chart Options>Tests

We will confirm what rules for special causes are included in our Control Chart analysis. The top 2 of 3 were selected.

Also confirm the Rbar method is used for estimating Standard Deviation.

Stat>Control Charts>Variable Charts for Subgroups>Xbar-R>Xbar-R Chart Options>Estimate
No special causes were detected in the Xbar Chart. The average hole diameter was 26.33. The UCL was 33.07 and the LCL was 19.59.

(Xbar-R Chart of Part1, ..., Part5, plotted over 46 subgroups: on the Xbar chart, X̄ = 26.33, UCL = 33.07, LCL = 19.59; on the R chart, R̄ = 11.69, UCL = 24.72, LCL = 0, with one flagged point on the R chart.)
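Using the standard table constants for subgroups of size 5 (A2 = 0.577, D3 = 0, D4 = 2.114), the chart limits can be reproduced from X̄ and R̄ to within rounding of the Minitab values; this is a sketch of the arithmetic, not Minitab's internal computation:

```python
# Xbar-R limits for subgroups of size n = 5 (table constants, rounded to 3 decimals)
A2, D3, D4 = 0.577, 0.0, 2.114

xbar, rbar = 26.33, 11.69   # grand mean and average range from the chart

ucl_x = xbar + A2 * rbar    # approx. 33.07
lcl_x = xbar - A2 * rbar    # approx. 19.59
ucl_r = D4 * rbar           # approx. 24.71
lcl_r = D3 * rbar           # 0
```

Because the constants are rounded, the results agree with the Minitab output only to within about a hundredth, which is expected.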
Now we will use the Control Chart to monitor the next 2 hours and see if we are still in control.

Remember, we calculated the Control Limits from the initial data. We must now enter 2 more hours and NOT have MINITAB™ calculate new Control Limits, which it will do automatically if we do not follow this step. We are executing Steps 6-8:
– Step 6: Plot process X or Y on the newly created Control Chart
– Step 7: Check for Out-Of-Control (OOC) conditions after each point
– Step 8: Interpret findings, investigate special cause variation, and make improvements following the Out of Control Action Plan (OCAP)
(Updated Xbar-R chart with the 2 new subgroups added, using the parameters entered above: X̄ = 26.33, R̄ = 11.69; no new points fall outside the limits.)
Because there are no special causes, the process does not refer to the OCAP, or Out of Control Action Plan, and NO actions are taken.
If process changes are documented after this project was closed, the Control Limits should be recalculated as in step 9 of the SPC methodology.
• Step 9 of the methodology refers to recalculating SPC limits.
• Processes should see improvement in variation after usage of SPC.
• Reduction in variation or a known process shift should result in Center Line and Control Limit recalculations.
  – Statistical confidence in the changes can be confirmed with Hypothesis Testing from the Analyze Phase.
• Consider a periodic time frame for checking Limits and Center Lines.
  – 3, 6, or 12 months are typical, dependent on resources and priorities.
  – A set frequency allows for process changes to be captured.
• Incentives to recalculate limits include avoiding false special cause detection with poorly monitored processes.
• These recommendations are true for both Variable and Attribute data.
The extra lines can be helpful if users are using MINITAB™ for the SPC.
Notes
Control Phase
Six Sigma Control Plans
Now we are going to continue in the Control Phase with “Six Sigma Control Plans”.
Overview

The last physical result of the Control Phase is the Control Plan. This module will discuss a technique for selecting among the various solutions you might want from all of the defect reduction techniques found earlier in this phase. We will also discuss elements of a Control Plan to aid you and your organization in sustaining your project's results. We will examine the meaning of each of these and show you how to apply them.

(Control Phase roadmap: Welcome to Control; Advanced Experiments; Advanced Capability; Lean Controls; Defect Controls; Statistical Process Control (SPC); Solution Selection; Six Sigma Control Plans; Control Plan Elements; Wrap Up & Action Items.)
The Control Phase allows the Belt and the team to tackle other processes in the future.
– The elements of the Control Phase help document how to maintain the process.

We have discussed all of the tools to improve and sustain your project success. However, you might have many options, or too many options, to implement for final monitoring or controls. This module will aid you in defect reduction selection.

Another objective of this module is to understand the elements of a good Control Plan needed to sustain your gains.
Selecting Solutions

The tool for selecting defect prevention methods is unnecessary for just a few changes to the process.
– Many projects with smaller scopes have few, but vital, control methods put into the process.

Selecting solutions comes down to a business decision. The impact, cost, and timeliness of the improvement are all important. These improvement possibilities must be balanced against the business needs. A cost-benefit analysis is always a good tool to assist in determining the priorities.
Recall our discussion of the progression of a Six Sigma project: Practical Problem – Statistical Problem – Statistical Solution – Practical Solution. Consider the Practical Solutions from a business decision point of view.

Impact Considerations
Cost Considerations
Time Considerations

The clock's ticking...
Implementing this familiar tool to prioritize proposed improvements is based on the three selection criteria of time, cost, and impact.
– All the process outputs are rated in terms of their relative importance to the process
  • The outputs of interest will be the same as those in your X-Y Matrix.
  • The relative rankings of importance of the outputs are the same numbers from the updated X-Y Matrix.
– Each potential improvement is rated against the three criteria of time, cost, and impact using a standardized rating scale
– The highest overall rated improvements are the best choices for implementation
This should resemble the X-Y Matrix. This tool is of no use if you have only one or two improvement efforts to consider. The outputs listed above in most cases resemble those of your original X-Y Matrix, but you might have another business output added.
The significance rating is the relative ranking of outputs. If one output is rated a 10 and it is twice the importance of a second output, the rating for the second output would be a 5. The improvements, usually impacting the X's, are listed, and the impact of each item on the left is rated against each output. The overall impact rating for one improvement is the sum of the individual impact ratings multiplied by their respective significance ratings of the outputs impacted. Items on the left having impacts on multiple outputs will have a higher overall impact rating. The cost and timing ratings are then multiplied by the overall impact rating.

The improvements with the highest overall ratings are the first to get consideration. Impact ratings range from zero to seven, where an impact of zero means no impact. The cost and timing ratings are also rated zero to seven, with zero being prohibitive in the cost or timing category.
Primary and Secondary Metrics of your Project.
– List each of the Y's across the horizontal axis
– Rate the importance of the process Y's on a scale of 1 to 10
  • 1 is not very important, 10 is critical
  • The Significance rankings must match your updated X-Y Matrix rankings
The recommended cost ratings from zero to seven are shown here. In many companies, expenditures that are not capitalized are usually desired because they are smaller and are merely expensed. Your business may have different strategies or need for cash, so consider your business' situation.

Cost to Implement Ratings
7 – Improvement costs are minimal, both upfront and ongoing.
6 – Improvement costs are low and can be expensed with no capital authorization; recurring expenses are low.
5 – Improvement costs are low and can be expensed with no capital authorization; recurring expenses are higher.
4 – Medium capital priority because of relative ranking of return on investment.
3 – Low capital priority because of relative ranking of return on investment.
2 – High capital and ongoing expenses make a low priority for capital investment.
1 – High capital and/or expenses without acceptable return on investment.
0 – Significant capital and ongoing expenses without alignment with business priorities.
These time ratings are ranked from zero to seven. You might wonder why we suggest that something that would take a year or more gets a zero rating, meaning the improvement should not be considered. Many businesses have product cycle times of less than a year, so improvements taking that long are ill-advised.
Example of Completed Solution Selection Matrix

The matrix below rates six potential improvements against four process outputs (the output labels, only partially legible here, include "Coffee is hot and rich tasting" and "healthy choices" available) with Significance Ratings of 10, 9, 8, and 9.

  #  Potential Improvement                Impact Ratings   Overall Impact   Cost   Time   Overall Rating
  1  Hotel staff monitors room            2, 2, 6, 0       86               7      7      4214
  2  Mgmt visits/leaves ph #              2, 0, 4, 0       52               7      7      2548
  3  Replace old coffee makers/coffee     0, 7, 0, 0       63               3      6      1134
  4  Menus provided with nutrition info   0, 0, 0, 4       36               5      5      900
  5  Comp. gen. "quiet time" scheduled    6, 0, 0, 0       60               3      3      540
  6  Dietician approves menus             0, 0, 0, 7       63               5      2      630

Improvements with the higher overall rating should be given first priority. Keep in mind that long time frame capital investments, etc., should have parallel efforts to keep delays from further occurring.
This is just an example of a completed solution selection matrix. Remember that a cost or time rating of zero would eliminate the improvement from consideration by your project. Your ratings of the solutions should involve your whole team to capture their knowledge and understanding of the final priorities.

Again, the higher overall ratings are the improvements to be considered first. Do NOT forget about the potential to run improvements in parallel. Running projects of complexity might need the experience of a trained project manager. Often projects need to be managed with Gantt charts or timelines showing critical milestones.
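The matrix arithmetic can be sketched as follows, checked against row 1 of the example (impact ratings 2, 2, 6, 0 against significance ratings 10, 9, 8, 9, with cost and time ratings of 7 each); the function name is illustrative:

```python
def overall_rating(impacts, significances, cost, time):
    """Overall impact = sum of (impact x significance) across outputs;
    overall rating = overall impact x cost rating x time rating."""
    overall_impact = sum(i * s for i, s in zip(impacts, significances))
    return overall_impact, overall_impact * cost * time

impact, rating = overall_rating([2, 2, 6, 0], [10, 9, 8, 9], cost=7, time=7)
# impact == 86 and rating == 4214, matching row 1 of the example matrix
```

Note how an improvement touching several outputs (row 1) outranks a higher single-output impact (rows 3 and 6) once the cost and time multipliers are applied.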
Once you've decided on defect reduction solutions, you need to plan those solutions. A plan means more than the proverbial back-of-the-envelope solution and should include timelines, critical milestones, project review dates, and specific actions noted for success in your solution implementation. Many people use Excel or MS Project, but many options exist to plan your project closing with these future sustaining plans.

Implementation Plans should emphasize the need to:
– Organize the tasks and resources
– Establish realistic time frames and deadlines
– Identify actions necessary to ensure success

Components of an Implementation Plan include:
– Work breakdown structure
– Influence strategy for priorities and resourcing
– Risk management plan
– Audit results for completion and risks

All solutions must be part of the Control Plan document.

We have a plan, don't we? The team working on the project!!!!
The 5 elements of a Control Plan include the documentation, monitoring, response, and training plans, and aligning systems and structures.

(Control Plan diagram: the Documentation Plan, Monitoring Plan, Response Plan, Training Plan, and Aligning Systems & Structures together hold the process owners accountable to maintain the new level of process performance.)
Training Plan

• Typically some of the training is conducted by the project team
  – Qualified trainers
• Typically owned by a training department or process owner
• Those who are responsible for conducting the on-going training must be identified

(Training Plan template columns: Training Module; Who Will Create Modules; Schedule for Module Completion; Who Will Be Trained; Schedule for Ongoing Training; Trainer(s); Integration into New Employee Training; Final Location of Employee Training Manuals.)
Documentation Plan

Documentation is necessary to ensure that what has been learned from the project is shared and institutionalized:
– Used to aid implementation of solutions
– Used for on-going training

(Documentation Plan template columns include the items and documents involved, whether an update/modification is necessary, and the immediate and review responsibilities.)
Monitoring Plan

Tests:
– When to Sample
  • After training
  • Regular intervals
  • Random intervals (often in an auditing sense)
– How to Sample
– How to Measure
(Monitoring Plan in FMEA format, with columns: Process Function (Step); Potential Failure Modes (process defects); Potential Failure Effects (Y's); SEV; Potential Causes of Failure (X's); OCC; Current Process Controls; DET; RPN; Recommended Actions; Responsible Person & Target Date; Actions Taken; and revised SEV, OCC, DET, RPN.)

Monitoring Plan
– Multi-variable tables
Response Plan

• Detailed documentation of the response supports on-going continuous improvement.

(Response Plan template columns: Process; Signal; Date; Current Situation; Detailed Situation; Investigation of Cause; Code of Cause.)

Aligning Systems and Structures

• Reinforce commitment to desired behaviors:
  – Job descriptions
  – Incentive compensation
  – Incentive programs, contests, etc.

Now that's a Control Plan!
Now for the last few questions to ask if you have been progressing on a real-world project while taking this training. First, has your project achieved success in the primary metric without compromising your secondary metrics? Second, have you been faithfully updating your metric charts and keeping your Process Owner and project Champion updated on your team's activities? If not, then start NOW.
Remember a basic change management idea you learned in the Define Phase. If you get
involvement of team members who work in the process and keep the project Champion and
Process Owner updated as to results, then you have the greatest chance of success.
You have now completed Control Phase – Six Sigma Control Plans.
Notes
Control Phase
Wrap Up and Action Items
Gooooaaallllll!!
Organizational Change
• Accept responsibility
• Monitoring
• Responding
• Managing
• Embracing change & continuous learning
• Sharing best practices
• Potential for horizontal replication or expansion of results
DMAIC Roadmap

(Roadmap swim lanes: the Champion/Process Owner estimates COPQ and establishes the team; the Measure lane assesses stability, capability, and measurement systems; in the Control Phase, once the improvement is selected, controls are implemented and the Belt goes on to the next project.)
Control Questions
Step One: Process Enhancement And Control
Results
• How do the results of the improvement(s) match the requirements of the business
case and improvement goals?
• What are the vital few X’s?
• How will you control or redesign these X’s?
• Is there a process control plan in place?
• Has the control plan been handed off to the process owner?
Step Two: Capability Analysis for X and Y
Process Capability
• How are you monitoring the Y’s?
Step Three: Standardization And Continuous Improvement
• How are you going to ensure that this problem does not return?
• Is the learning transferable across the business?
• What is the action plan for spreading the best practice?
• Is there a project documentation file?
• How is this referenced in process procedures and product drawings?
• What is the mechanism to ensure this is not reinvented in the future?
Step Four: Document what you have learned
• Is there an updated FMEA?
• Is the control plan fully documented and implemented ?
• What are the financial implications?
• Are there any spin-off projects?
• What lessons have you learned?
General Questions
• Are there any issues/barriers preventing the completion of the project?
• Do the Champion, the Belt and Finance all agree that this project is complete?
WHAT   WHO   WHEN   WHY   WHY NOT   HOW
Test validation plan for a specific time
Calculate benefits for breakthrough
Implement change across project team
Process map of improved process
Finalize Key Input Variables (KPIV) to meet goal
Prioritize risks of output failure
Control plan for output
Control plan for inputs
Chart a plan to accomplish the desired state of the culture
Mistake proofing plan for inputs or outputs
Implementation plan for effective procedures
Knowledge transfer between Belt, PO, and team members
Knowledge sharing between businesses and divisions
Lean project control plan
Establish continuous or attribute metrics for Cpk
Identify actual versus apparent Cpk
Finalize problem solving strategy
Complete RPN assessment with revised frequency and controls
Show improvement in RPN through action items
Repeat same process for secondary metrics
Summary
At this point, you should:
It’s a Wrap
Congratulations you
have completed
Certified Lean Six Sigma
Black Belt
Training!!!
Control Phase
Quiz
Now we will see what you have retained from the Control Phase of the course. Please answer
these questions to the best of your ability without referencing the text. The answers are in the
Appendix. Please check your answers against the answers provided and review the sections in
the Control Phase where your retention of the knowledge is less than you desire.
2. If the Belt has found a good, statistically significant model from the last Full Factorial Design, what is the main reason a steepest ascent design would be considered in the project?
A. 4 factors were found to be statistically significant.
B. The desired process output was not yet found within the original design space.
C. The project target was achieved but the project wants to further improve the process.
D. The DOE indicated curvature because Center Points were included and the local, desired maximum was within the original design space.
3. Advanced Capability Analysis for defects per unit is not possible within MINITABTM.
True False
5. The Lean toolbox including items such as 5S, Visual Factory management and Kanbans
can best be described to ________ a process in the Control Phase.
A. remove labor for
B. overly lengthen the Six Sigma project for
C. confuse
D. stabilize
6. How does the idea of MUDA from Lean Principles best fit with the Six Sigma
methodology?
A. MUDA means waste which is indicating defects are occurring in the process.
B. Lean is Six Sigma that originated in SE Asia.
C. MUDA is an abbreviation for Six Sigma tools.
D. MUDA is the technique of finding the best practices.
8. If excess inventory is one reason for Special Causes in the Six Sigma project, which item in Lean Principles can best help improve the Process Capability and sustainability of the project?
A. Kanban
B. SPC
C. 5S
D. Value Stream Mapping
E. Operator support
9. Kanbans work best with pull systems for determining which products or services are produced.
True False
10. __________ (fill in the blank) are signals telling a process to process a product or
service.
A. Kaizen
B. Kanban
C. Andon
D. Poka-Yoke
E. Gemba
11. Since Kanbans are used to control how much inventory exists, it is a quick fix to improve
the inventory.
True False
12. Which are examples of Defect Prevention to consider in your execution of the Control
Phase of your project? (check all that apply)
A. Poka-Yoke or Mistake Proofing
B. Monte Carlo Simulation
C. FMEA
D. Robust product design
E. Negotiate new specification limits from customers
13. Which items listed below will cause tolerance specification limits to tighten for an input statistically affecting the output of interest? (check all that apply)
A. A gauge with a worsening precision.
B. The measuring instrument for the output has improving precision.
C. Other unknown significant Noise factors are increasingly varying.
D. The input has a new automated controller to minimize variation of the input from the desired setting.
14. Every process has causes of variation commonly known as: (check all that apply)
A. Common
B. Insignificant
C. Special
D. Uneducated
15. SPC is an excellent tool for telling us why a process is exhibiting Special Cause
variation.
True False
Glossary
Affinity Diagram - A technique for organizing individual pieces of information into groups or broader categories.
ANOVA - Analysis of Variance – A statistical test for identifying significant differences between process or
system treatments or conditions. It is done by comparing the variances around the means of the conditions
being compared.
Attribute Data - Data which takes on one of a set of discrete values, such as pass or fail, yes or no.
Average - Also called the mean, it is the arithmetic average of all of the sample values. It is calculated by adding all of the sample values together and dividing by the number of elements (n) in the sample.
Bar Chart - A graphical method which depicts how data fall into different categories.
Black Belt - An individual who receives approximately four weeks training in DMAIC, analytical problem solving,
and change management methods. A Black Belt is a full time six sigma team leader solving problems under the
direction of a Champion.
Breakthrough Improvement - A rate of improvement at or near 70% over baseline performance of the as-is process characteristic.
Capability - A comparison of the required operation width of a process or system to its actual performance width. Expressed as a percentage (yield), a defect rate (dpm, dpmo), an index (Cp, Cpk, Pp, Ppk), or as a sigma score (Z).
Cause and Effect Diagram - Fishbone Diagram - A pictorial diagram in the shape of a fishbone showing all
possible variables that could affect a given process output measure.
Central Tendency - A measure of the point about which a group of values is clustered; two measures of central
tendency are the mean, and the median.
Champion - A Champion recognizes, defines, assigns and supports the successful completion of six sigma projects; they are accountable for the results of the project and the business roadmap to achieve six sigma within their span of control.
Common Causes of Variation - Those sources of variability in a process which are truly random, i.e., inherent
in the process itself.
Complexity - The level of difficulty to build, solve or understand something based on the number of inputs, interactions and uncertainty involved.
Control Limits - Upper and lower bounds in a control chart that are determined by the process itself. They can
be used to detect special or common causes of variation. They are usually set at ±3 standard deviations from
the central tendency.
Cost of Poor Quality (COPQ) - The costs associated with any activity that is not doing the right thing right the first time. It is the financial quantification of any waste that is not integral to the product or service which your company provides.
Glossary
Cp - A capability measure defined as the ratio of the specification width to short-term process performance width.

Cpk - An adjusted short-term capability index that reduces the capability score in proportion to the offset of the process center from the specification target.
Critical to Quality (CTQ) - Any characteristic that is critical to the perceived quality of the product, process or
system. See Significant Y.
Critical X - An input to a process or system that exerts a significant influence on any one or all of the key outputs of a process.
Customer - Anyone who uses or consumes a product or service, whether internal or external to the providing
organization or provider.
Cycle Time - The total amount of elapsed time expended from the time a task, product or service is started
until it is completed.
Deployment (Six Sigma) - The planning, launch, training and implementation management of a six sigma
initiative within a company.
Design of Experiments (DOE) - Generally, it is the discipline of using an efficient, structured, and proven approach to interrogating a process or system for the purpose of maximizing the gain in process or system knowledge.
Design for Six Sigma (DFSS) - The use of six sigma thinking, tools and methods applied to the design of
products and services to improve the initial release performance, ongoing reliability, and life-cycle cost.
DMAIC - The acronym for core phases of the six sigma methodology used to solve process and business
problems through data and analytical methods. See define, measure, analyze, improve and control.
DPMO - Defects per million opportunities – The total number of defects observed divided by the total number
of opportunities, expressed in parts per million. Sometimes called Defects per Million (DPM).
DPU - Defects per unit - The total number of defects detected in some number of units divided by the total
number of those units.
Entitlement - The best demonstrated performance for an existing configuration of a process or system. It is an empirical demonstration of what level of improvement can potentially be reached.
Failure Mode and Effects Analysis (FMEA) - A procedure used to identify, assess, and mitigate risks
associated with potential product, system, or process failure modes.
Finance Representative - An individual who provides an independent evaluation of a six sigma project in
terms of hard and/or soft savings. They are a project support resource to both Champions and Project
Leaders.
Glossary
Flowchart - A graphic model of the flow of activities, material, and/or information that occurs during a process.
Gage R&R - Quantitative assessment of how much variation (repeatability and reproducibility) is in a measurement
system compared to the total variation of the process or system.
Green Belt - An individual who receives approximately two weeks of training in DMAIC, analytical problem solving,
and change management methods. A Green Belt is a part time six sigma position that applies six sigma to their
local area, doing smaller-scoped projects and providing support to Black Belt projects.
Hidden Factory or Operation - Corrective and non-value-added work required to produce a unit of output that is
generally not recognized as an unnecessary generator of waste in the form of resources, materials and cost.
Histogram - A bar chart that depicts the frequencies (by the height of the plotted bars) of numerical or
measurement categories.
Implementation Team - A cross-functional executive team representing various areas of the company. Its charter
is to drive the implementation of six sigma by defining and documenting practices, methods and operating
policies.
Input - A resource consumed, utilized, or added to a process or system. Synonymous with X, characteristic, and
input variable.
Input-Process-Output (IPO) Diagram - A visual representation of a process or system where inputs are
represented by input arrows to a box (representing the process or system) and outputs are shown using arrows
emanating out of the box.
Ishikawa Diagram - See cause and effect diagram and fishbone diagram.
Least Squares - A method of curve-fitting that defines the best fit as the one that minimizes the sum of the squared
deviations of the data points from the fitted curve.
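A minimal Python sketch of the least-squares idea for the simple straight-line case (the helper name fit_line is hypothetical):

```python
# Fit y = slope * x + intercept by minimizing the sum of
# squared vertical deviations (ordinary least squares).
def fit_line(xs, ys):
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Points lying exactly on y = 2x recover slope 2 and intercept 0.
print(fit_line([1, 2, 3, 4], [2, 4, 6, 8]))  # (2.0, 0.0)
```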
Long-term Variation - The observed variation of an input or output characteristic which has had the opportunity to
experience the majority of the variation effects that influence it.
Lower Control Limit (LCL) - For control charts: the limit above which the subgroup statistics must remain for the
process to be in control. Typically, 3 standard deviations below the central tendency.
Lower Specification Limit (LSL) - The lowest value of a characteristic which is acceptable.
Master Black Belt - An individual who has received training beyond a Black Belt and serves as the technical,
go-to expert on technical and project issues in six sigma. Master Black Belts teach and mentor other six sigma
Belts, support their projects and support Champions.
Measurement - The act of obtaining knowledge about an event or characteristic through measured quantification
or assignment to categories.
Measurement Accuracy - For a repeated measurement, it is a comparison of the average of the measurements
to some known standard.
Measurement Precision - For a repeated measurement, it is the amount of variation that exists in the measured
values.
Measurement Systems Analysis (MSA) - An assessment of the accuracy and precision of a method of obtaining
measurements. See also Gage R&R.
Median - The middle value of a data set when the values are arranged in either ascending or descending order.
Metric - A measure that is considered to be a key indicator of performance. It should be linked to goals or
objectives and carefully monitored.
Nominal Group Technique - A structured method that a team can use to generate and rank a list of ideas or
items.
Non-Value Added (NVA) - Any activity performed in producing a product or delivering a service that does not add
value, where value is defined as changing the form, fit or function of the product or service and is something for
which the customer is willing to pay.
Normal Distribution - The distribution characterized by the smooth, bell-shaped curve. Synonymous with
Gaussian Distribution.
Objective Statement - A succinct statement of the goals, timing and expectations of a six sigma improvement
project.
Opportunities - The number of characteristics, parameters or features of a product or service that can be classified
as acceptable or unacceptable.
Out of Control - A process is said to be out of control if it exhibits variations larger than its control limits or shows a
non-random pattern of variation.
Output - A resource or item or characteristic that is the product of a process or system. See also Y, CTQ.
Pareto Chart - A bar chart for attribute (or categorical) data in which categories are presented in descending
order of frequency.
Pareto Principle - The general principle originally proposed by Vilfredo Pareto (1848-1923) that the majority of
influence on an outcome is exerted by a minority of input factors.
Problem Statement - A succinct statement of a business situation which is used to bound and describe the
problem the six sigma project is attempting to solve.
Process - A set of activities and material and/or information flow which transforms a set of inputs into outputs for
the purpose of producing a product, providing a service or performing a task.
Process Characterization - The act of thoroughly understanding a process, including the specific relationship(s)
between its outputs and the inputs, and its performance and capability.
Process Certification - Establishing documented evidence that a process will consistently produce required
outcome or meet required specifications.
Process Member - An individual who performs activities within a process to deliver a process output, a product
or a service to a customer.
Process Owner - Process Owners have responsibility for process performance and resources. They provide
support, resources and functional expertise to six sigma projects. They are accountable for implementing
developed six sigma solutions into their process.
Quality Function Deployment (QFD) - A systematic process used to integrate customer requirements into
every aspect of the design and delivery of products and services.
Range - A measure of the variability in a data set. It is the difference between the largest and smallest values
in a data set.
Regression Analysis - A statistical technique for determining the mathematical relation between a measured
quantity and the variables it depends on. Includes Simple and Multiple Linear Regression.
Repeatability (of a Measurement) - The extent to which repeated measurements of a particular object with a
particular instrument produce the same value. See also Gage R&R.
Reproducibility (of a Measurement) - The extent to which repeated measurements of a particular object by
different individuals produce the same value. See also Gage R&R.
Risk Priority Number (RPN) - In Failure Mode Effects Analysis -- the aggregate score of a failure mode
including its severity, frequency of occurrence, and ability to be detected.
Rolled Throughput Yield (RTY) - The probability of a unit going through all process steps or system
characteristics with zero defects.
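Since RTY is the product of each step's first-pass yield, it can be computed directly; a small Python sketch (function name is illustrative):

```python
import math

def rolled_throughput_yield(step_yields):
    """Probability that a unit passes every step defect-free:
    the product of the individual step yields."""
    return math.prod(step_yields)

# Three steps at 95%, 98% and 90% first-pass yield:
print(rolled_throughput_yield([0.95, 0.98, 0.90]))  # roughly 0.838
```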
R.U.M.B.A. - An acronym used to describe a method to determine the validity of customer requirements. It
stands for Reasonable, Understandable, Measurable, Believable, and Achievable.
Run Chart - A basic graphical tool that charts a characteristic’s performance over time.
Scatter Plot - A chart in which one variable is plotted against another to determine the relationship, if any,
between the two.
Screening Experiment - A type of experiment to identify the subset of significant factors from among a large
group of potential factors.
Short Term Variation - The amount of variation observed in a characteristic which has not had the opportunity
to experience all the sources of variation from the inputs acting on it.
Sigma Score (Z) - A commonly used measure of process capability that represents the number of short-term
standard deviations between the center of a process and the closest specification limit. Sometimes referred to
as sigma level, or simply Sigma.
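The Z calculation under the definition above can be sketched in Python, assuming the process mean, the closest specification limit, and the short-term standard deviation are known (names are illustrative):

```python
def sigma_score(mean, closest_spec_limit, short_term_sd):
    """Z: number of short-term standard deviations between the
    process center and the closest specification limit."""
    return abs(closest_spec_limit - mean) / short_term_sd

# Process centered at 100, USL at 106, short-term sd of 2:
print(sigma_score(100, 106, 2))  # 3.0
```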
Significant Y - An output of a process that exerts a significant influence on the success of the process or the
customer.
Six Sigma Leader - An individual who leads the implementation of Six Sigma, coordinating all of the necessary
activities, assures optimal results are obtained and keeps everyone informed of progress made.
Six Sigma Project - A well-defined effort that states a business problem in quantifiable terms and with known
improvement expectations.
Six Sigma (System) - A proven set of analytical tools, project management techniques, reporting methods and
management techniques combined to form a powerful problem solving and business improvement methodology.
Special Cause Variation - Those non-random causes of variation that can be detected by the use of control charts
and good process documentation.
Stability (of a Process) - A process is said to be stable if it shows no recognizable pattern of change and no
special causes of variation are present.
Standard Deviation - One of the most common measures of variability in a data set or in a population. It is the
square root of the variance.
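The square-root relationship between standard deviation and variance can be checked with Python's standard library (population versions shown; the sample data is made up):

```python
import statistics

data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
variance = statistics.pvariance(data)  # population variance
std_dev = statistics.pstdev(data)      # population standard deviation
print(variance, std_dev)  # 4.0 2.0
```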
Statistical Problem - A problem that is addressed with facts and data analysis methods.
Statistical Process Control (SPC) - The use of basic graphical and statistical methods for measuring, analyzing,
and controlling the variation of a process for the purpose of continuously improving the process. A process is said to
be in a state of statistical control when it exhibits only random variation.
Statistical Solution - A data driven solution with known confidence/risk levels, as opposed to a qualitative, “I think”
solution.
Supplier - An individual or entity responsible for providing an input to a process in the form of resources or
information.
TSSW - Thinking the six sigma way – A mental model for improvement which perceives outcomes through a cause
and effect relationship combined with six sigma concepts to solve everyday and business problems.
Two-Level Design - An experiment where all factors are set at one of two levels, denoted as low and high
(-1 and +1).
Upper Control Limit (UCL) for Control Charts - The upper limit below which a process statistic must remain to be
in control. Typically this value is 3 standard deviations above the central tendency.
Upper Specification Limit (USL) - The highest value of a characteristic which is acceptable.
Variability - A generic term that refers to the property of a characteristic, process or system to take on different
values when it is repeated.
Variable Data - Data which is continuous, which can be meaningfully subdivided, i.e. can have decimal
subdivisions.
Variance - A specifically defined mathematical measure of variability in a data set or population. It is the square of
the standard deviation.
VOB - Voice of the business – Represents the needs of the business and the key stakeholders of the
business. It is usually items such as profitability, revenue, growth, market share, etc.
VOC - Voice of the customer – Represents the expressed and non-expressed needs, wants and desires of the
recipient of a process output, a product or a service. It is usually expressed as specifications, requirements or
expectations.
VOP - Voice of the process – Represents the performance and capability of a process to achieve both
business and customer needs. It is usually expressed in some form of an efficiency and/or effectiveness
metric.
Waste - Waste represents material, effort and time that does not add value in the eyes of key stakeholders
(Customers, Employees, Investors).
X - An input characteristic to a process or system. In six sigma it is usually used in the expression of Y=f(X),
where the output (Y) is a function of the inputs (X).
Y - An output characteristic of a process. In six sigma it is usually used in the expression of Y=f(X), where the
output (Y) is a function of the inputs (X).
Yellow Belt - An individual who receives approximately one week of training in problem solving and process
optimization methods. Yellow Belts participate in Process Management activities, participate on Green and
Black Belt projects and apply concepts to their work area and their job.
Appendix
Quiz Answers
The Quiz questions at the end of each phase are intended to be a sampling of the topics covered
and to provide a guide for assessing your level of knowledge retention. OpenSourceSixSigma.com
provides a Certified Lean Six Sigma Black Belt Assessment that is comprehensive in its coverage
of the topics addressed in this course. It contains 100 questions and exercises fully covering the
subject matter for Lean Six Sigma Black Belts. We suggest you consider this CLSSBB
Assessment package should you choose to pursue certification in Lean Six Sigma.
1. C. How tightly all the various outcomes are clustered around the average
2. Standard Deviation
3. A. Features
B. Delivery
D. Integrity
E. Expense
4. True
5. E. Awareness
7. False
8. Change Agent
10. Brainstorming
12. Secondary
14. True
17. False
19. True
20. False
1. Reproducibility
2. Linearity
4. True
5. A. Nominal Scale Data
6. C. Mode
7. True
9. True
10. True
12. False
14. True
15. A. Precision
C. Accuracy
17. True
19. False
1. False
2. Multi-Vari
3. D. Error in measurement
4. True
5. A. A Hypothesis Test is an a priori theory relating to differences between variables
B. A statistical test or Hypothesis Test is performed to prove or disprove the theory
C. A Hypothesis Test converts the Practical Problem into a Statistical Problem.
6. A. Skewness
B. Mixed Distributions
C. Kurtosis
E. Granularity
7. False
9. True
10. D. Having the tails of the distribution equal each other
11. True
12. B. Compare more than two sample proportions with each other
13. True
14. C. 30
15. B. Median
16. False
18. True
19. True
2. A. Simple Linear
B. Quadratic
C. Cubic
D. Multiple Linear
E. Logarithmic
4. A. Independent of the transform, the upper specification will be a larger number than the
lower specification when transformed.
D. The process data is transformed but not the specification limits.
8. E. 64
10. B. The root cause for the defective product characteristic needs to be found.
C. The variation needs to be affected by the input factors.
D. The response time to calls needs to be reduced.
11. B. The process may show little change if curvature exists and the local maximum of the
process output is between the large differences of factor levels chosen.
13. False
14. C. If the experiment is going to start in a week, contact the Process Owners to work out
the needs before the experiment.
D. Use a log book and note any unusual observations during the experiment.
15. False
16. B. Implement solutions
18. A. 13
19. B. A design with IV resolution will not have Main Effects confounded with 2-way
interactions.
C. A design with V resolution will have 2-way interactions confounded with 3-way
interactions.
E. A design with V resolution has no Main Effects confounded with other Main Effects
F. A design with III resolution has no Main Effects confounded with other Main Effects
1. B. It attempts to find the optimum region outside the original design space.
2. B. The desired process output was not yet found within the original design space.
3. False
5. D. stabilize
6. A. MUDA means waste, which indicates that defects are occurring in the process.
7. True
8. A. Kanban
9. True
10. B. Kanban
11. False
13. A. A gauge with a worsening precision.
C. Other unknown significant Noise factors are increasingly varying.
14. A. Common
C. Special
15. False