Six Sigma Black Belt Book
LEAN SIX SIGMA BELT SERIES
OpenSourceSixSigma.com
Legal Notice
INDIVIDUAL COPY
This Book is an Open Source Six Sigma™ copyrighted publication and is for individual use only. This publication may not be republished, electronically or physically reproduced, distributed, changed, posted to a website, an intranet or a file sharing system, or otherwise distributed in any form or manner without advance written permission from Open Source Six Sigma. Minitab is a Registered Trademark of Minitab Inc.
FBI Anti Piracy Warning: The unauthorized reproduction or
distribution of this copyrighted work is illegal. Criminal
copyright infringement, including infringement without
monetary gain, is investigated by the FBI and is punishable by
up to 5 years in federal prison and a fine of $250,000.
For reprint permission, to request additional copies, or to request customized versions of this publication, contact Open Source Six Sigma.
Open Source Six Sigma
6200 East Thomas Road Suite 203
Scottsdale, Arizona, United States of America 85251
Toll Free: 1 800 504 4511
International: +1 480 361 9983
Email: OSSS@OpenSourceSixSigma.com
Website: www.OpenSourceSixSigma.com
Table of Contents
Page
Define Phase
Understanding Six Sigma…………………………………………..………………..….…….… 1
Six Sigma Fundamentals………………………………..…………..………………..……..…. 22
Selecting Projects……………………………………….………………………..……..……… 42
Elements of Waste……………………………………..…………...……………………………64
Wrap Up and Action Items……………………...………………………………………….……77
Define Phase Quiz……………………………..…………………………………………………83
Measure Phase
Welcome to Measure……………………………………………………………….……..….....86
Process Discovery………………………………………..………………………………………89
Six Sigma Statistics…………………………………..….………………………………….….138
Measurement System Analysis……………………….……………………………………....171
Process Capability ……………………………………...…………………………… ……….203
Wrap Up and Action Items …………………………………………………………………….224
Measure Phase Quiz………………………………………………………….………………..230
Analyze Phase
Welcome to Analyze……………………………………………………………………… .…..233
“X” Sifting………………………………….………………...……………………….……….….236
Inferential Statistics……………………………………………..……………..………….…….262
Introduction to Hypothesis Testing……………………………..……….…………………….277
Hypothesis Testing Normal Data Part 1………………………………..…….………………291
Hypothesis Testing Normal Data Part 2 …………………….………………………….……334
Hypothesis Testing Non-Normal Data Part 1………………….….…………………….……364
Hypothesis Testing Non-Normal Data Part 2……………….…………….………………….390
Wrap Up and Action Items …………………………………………..………………....……..409
Analyze Phase Quiz…………………………………………….………………………………415
Improve Phase
Welcome to Improve……………………………………..………………………………...…..418
Process Modeling Regression……………………………………………………………….421
Advanced Process Modeling………………………….……………………………………….440
Designing Experiments…………………………………………………………………………467
Experimental Methods………………………………………….………………………………482
Full Factorial Experiments…………………………………………………………………..…497
Fractional Factorial Experiments……………………………………………………….……..526
Wrap Up and Action Items…………………………..…………………………………………546
Improve Phase Quiz……………………………………………………………………………552
Control Phase
Welcome to Control………………………………………..……………………………………556
Lean Controls……………………………………………………………………………………559
Defect Controls……………………………………………………………………….…………574
Statistical Process Control…………………………….……………………………………….586
Six Sigma Control Plans………………………………..………………………………………626
Wrap Up and Action Items…………………….…………………………………………….…649
Control Phase Quiz…………………………...………………………………………..……….659
Glossary
Define Phase
Understanding Six Sigma
This course has been designed to build your knowledge and capability to improve the performance of processes and, subsequently, the performance of the business of which you are a part. The focus of the course is process centric. Your role in process performance improvement will be through the use of the methodologies of Six Sigma, Lean and Process Management. By taking this course you will gain a well-rounded and firm grasp of many of the tools of these methodologies. We firmly believe this is one of the most effective classes you will ever take, and it is our commitment to assure that this is the case.
What if you had a few professional basketball players in the room – would that widen or narrow the variation?
(Normal Distribution curve spanning -6 to +6 Standard Deviations)
The higher the sigma level, the better the performance. Six Sigma refers to a process having six Standard Deviations between the average of the process center and the closest specification limit or service level.
This pictorial depicts the percentage of data which falls between Standard Deviations within a Normal
Distribution. Those data points at the outer edge of the bell curve represent the greatest variation in our
process. They are the ones causing customer dissatisfaction and we want to eliminate them.
Key Six Sigma metrics:
– Defects
– Defects per unit (DPU)
– Parts per million (PPM)
– Defects per million opportunities (DPMO)
– Sigma (s)
(Chart: these metrics compared across processes running at 1 to 5 Sigma)
Above are some key metrics used in Six Sigma. We will discuss each in detail as we go through the
course.
Source: Journal for Quality and Participation, Strategy and Planning Analysis
The Six Sigma Methodology is made up of five stages: Define, Measure, Analyze, Improve and
Control.
Each has highly defined steps to assure a level of discipline in seeking a solution to any variation or
defect present in a process.
Customer Value (Responsiveness, Cost, Quality, Delivery) = EBIT = Management (Enabler), Product Design, Process Yield, Process Speed, System Uptime, Functional Support
Six Sigma has not created new tools. It is the use and flow of the tools that is important. How they
are applied makes all the difference.
Six Sigma is also a business strategy that provides new knowledge and capability to employees so they can better organize the process activity of the business, solve business problems and make better decisions. Using Six Sigma is now a common way to solve business problems and remove waste, resulting in significant profitability improvements. In addition to improving profitability, customer and employee satisfaction are also improved.
Six Sigma is a process measurement and management system that enables employees and companies to take a process oriented view of the entire business. Using the various concepts embedded in Six Sigma, key processes are identified, the outputs of these processes are prioritized, the capability is determined, improvements are made, if necessary, and a management structure is put in place to assure the ongoing success of the business.
People interested in truly learning Six Sigma should be mentored and supported by seasoned
Belts who truly understand how Six Sigma works.
Sweet Fruit (5+ Sigma): Design for Six Sigma
Bulk of Fruit (3–5 Sigma): Process Characterization and Optimization
Ground Fruit (1–2 Sigma): Simplify and Standardize
General Electric: First, what it is not. It is not a secret society, a slogan or a cliché. Six Sigma is a highly disciplined process that helps us focus on developing and delivering near-perfect products and services. The central idea behind Six Sigma is that if you can measure how many "defects" you have in a process, you can systematically figure out how to eliminate them and get as close to "zero defects" as possible. Six Sigma has changed the DNA of GE — it is now the way we work — in everything we do and in every product we design.
Honeywell: Six Sigma refers to our overall strategy to improve growth and productivity as well as a measurement of quality. As a strategy, Six Sigma is a way for us to achieve performance breakthroughs. It applies to every function in our company, not just those on the factory floor. That means Marketing, Finance, Product Development, Business Services, Engineering and all the other functions in our businesses are included.
Lockheed Martin: We’ve just begun to scratch the surface with the cost-saving initiative called Six
Sigma and already we’ve generated $64 million in savings with just the first 40 projects. Six Sigma
uses data gathering and statistical analysis to pinpoint sources of error in the organization or products
and determines precise ways to reduce the error.
Simplistically, Six Sigma was a program that was generated around targeting a process Mean (average) six Standard Deviations away from the closest specification limit. By using the process Standard Deviation to determine the location of the Mean, the results could be predicted at 3.4 defects per million by the use of statistics.
• 1984: Bob Galvin of Motorola edicted the first objectives of Six Sigma
– 10x levels of improvement in service and quality by 1989
– 100x improvement by 1991
– Six Sigma capability by 1992
– Bill Smith, an engineer from Motorola, is the person credited as the father of Six Sigma
• 1984: Texas Instruments and ABB work closely with Motorola to further develop Six Sigma
• 1994: Application experts leave Motorola
• 1995: AlliedSignal begins Six Sigma initiative as directed by Larry Bossidy
– Captured the interest of Wall Street
• 1995: General Electric, led by Jack Welch, began the most widespread undertaking of Six Sigma ever attempted
• 1997 to present: Six Sigma spans industries worldwide
There is an allowance for the process Mean to shift 1.5 Standard Deviations. This number is an academic, esoteric and somewhat controversial issue not worth debating here. We will get into a discussion of this number later in the course.
Today the Define Phase is an important aspect of the methodology. Motorola was a mature culture from a process perspective and didn't necessarily have a need for the Define Phase. Most organizations today DEFINITELY need it to properly approach improvement projects. As you will learn, properly defining a problem or an opportunity is key to putting you on the right track to solve it or take advantage of it.
(Flowchart: project selection. The Process Owner estimates COPQ and recommends a project focus; if the project is approved, the project is chartered, a team is created and chartered, and the effort moves into Measure; the team then identifies, prioritizes and selects solutions to control or eliminate the X's causing problems.)
Listed below are the types of Define Phase deliverables that will be reviewed in this course. By the end of this course, you should understand what would be necessary to provide these deliverables in a presentation.
Wherever there are processes, Six Sigma can improve their performance.
(Diagram: LSL – Target – USL; each specification limit represents a requirement.)
Conventional strategy was to create a product or service that met certain specifications.
Assumed that if products and services were of good quality then their
performance standards were correct.
Rework was required to ensure final quality.
Efforts were overlooked and unquantified (time, money, equipment
usage, etc).
The conventional strategy was to create a product or service that met certain specifications. It was assumed that if products and services were of good quality, then their performance standards were correct, irrespective of how they were met.
Using this strategy often required rework to ensure final quality, or the rejection and trashing of some products, and the efforts to accomplish this "inspect in quality" were largely overlooked and unquantified.
You will see more about these issues when we investigate the Hidden Factory.
Problem Solving Strategy
Y = f(Xi)
This simply states that Y is a function of the X's. In other words, Y is dictated by the X's.
Y = f(X) is a key concept that you must fully understand and remember. It is a fundamental principle
to the Six Sigma methodology. In its simplest form it is called “cause and effect”. In its more robust
mathematical form it is called “Y is equal to a function of X”. In the mathematical sense it is data
driven and precise, as you would expect in a Six Sigma approach. Six Sigma will always refer to an
output or the result as a Y and will always refer to an input that is associated with or creates the
output as an X.
Another way of saying this is that the output is dependent on the inputs that create it through the
blending that occurs from the activities in the process. Since the output is dependent on the inputs
we cannot directly control it, we can only monitor it.
Example
Y = f(Xi)
Which process variables (causes) have critical impact on the output (effect)?
Y=f(x) is a transfer function tool to determine what input variables (X’s) affect the output responses
(Y’s). The observed output is a function of the inputs. The difficulty lies in determining which X’s
are critical to describe the behavior of the Y’s.
In the Measure Phase we will introduce a tool to manage the long list of input variables and their relationship to the output responses. It is the X-Y Matrix or Input-Output Matrix.
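The cause-and-effect idea can be sketched in code. In this minimal, hypothetical example (the function and its inputs are not from the book), cycle time plays the role of Y and three process inputs play the role of the X's; in a real project the function is unknown and must be discovered from data:

```python
# Y = f(X1, X2, X3): the output is dictated by the inputs.
# Hypothetical example: cycle time (Y) as a function of three process X's.
def cycle_time(batch_size: float, machine_rate: float, queue_delay: float) -> float:
    # We cannot set Y directly; we can only monitor it and control the X's.
    return batch_size / machine_rate + queue_delay

y = cycle_time(batch_size=100, machine_rate=25, queue_delay=1.5)
print(y)  # 5.5 -- changing any X changes the Y
```

Notice that the only way to move Y is to change one of the X's, which is exactly why the methodology hunts for the Critical X's.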
Y=f(X) Exercise
Exercise: Espresso = f(X1, X2, X3, X4, … Xn)
Notes
As you go through the application of DMAIC you will have a goal to find the root causes to the
problem you are solving. Remember that a vital component of problem solving is cause and effect
thinking or Y=f(X). To aid you in doing so, you should create a visual model of this goal as a funnel – a funnel that takes in a large number of the "trivial many contributors" and narrows them to the "vital few contributors" by the time they leave the bottom.
At the top of the funnel you are faced with all possible causes - the “vital few” mixed in with the
“trivial many.” When you work an improvement effort or project, you must start with this type of
thinking. You will use various tools and techniques to brainstorm possible causes of performance
problems and operational issues based on data from the process. In summary, you will be applying
an appropriate set of “analytical methods” and the “Y is a function of X” thinking, to transform data
into the useful knowledge needed to find the solution to the problem. It is a mathematical fact that 80 percent of a problem is related to six or fewer causes, the X's. In most cases it is between one and three.
The goal is to find the one to three Critical X’s from the many potential causes when we start an
improvement project. In a nutshell, this is how the Six Sigma methodology works.
Breakthrough Strategy
(Chart: performance over time, from Bad to Good. A 6-Sigma Breakthrough moves the process from the Old Standard, with wide control limits (UCL/LCL), to a New Standard at a better level of performance with much tighter control limits.)
By utilizing the DMAIC problem solving methodology to identify and optimize the vital few variables, we will realize sustainable breakthrough performance as opposed to incremental improvements or, even worse, temporary and non-sustainable improvement.
The image above shows how after applying the Six Sigma tools, variation stays within the specification
limits.
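The breakthrough idea can be sketched numerically. In the hypothetical before/after samples below (all values invented for illustration), the New Standard settles at a better level with much tighter control limits, taken here as Mean ± 3 Standard Deviations:

```python
# Control limits (Mean +/- 3 Standard Deviations) before and after breakthrough.
import statistics

old_standard = [14.2, 15.1, 13.8, 15.6, 14.9, 13.5, 15.3, 14.4]  # hypothetical
new_standard = [10.1, 10.3, 9.9, 10.2, 10.0, 10.1, 9.8, 10.2]    # hypothetical

def control_limits(data):
    m, s = statistics.mean(data), statistics.stdev(data)
    return m - 3 * s, m + 3 * s  # (LCL, UCL)

lcl_old, ucl_old = control_limits(old_standard)
lcl_new, ucl_new = control_limits(new_standard)
print(ucl_old - lcl_old)  # wide limits around the Old Standard
print(ucl_new - lcl_new)  # much tighter limits around the New Standard
```

The narrowed spread between LCL and UCL is what "breakthrough" rather than incremental improvement looks like in the data.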
The foundation of Six Sigma requires focus on the voices of the Customer, the Business, and the Employee, which provides:
– VOC is Customer Driven
– VOB is Profit Driven
– VOE is Process Driven
Awareness of the needs that are critical to the quality (CTQ) of our products and
services
Identification of the gaps between “what is” and “what should be”
Identification of the process defects that contribute to the “gap”
Knowledge of which processes are “most broken”
Enlightenment as to the unacceptable costs of poor quality (COPQ)
Six Sigma puts a strong emphasis on the customer because they are the ones assessing our performance, and they respond by either continuing to purchase our products and services or… by NOT!
So, while the customer is the primary concern, we must keep in mind the Voice of the Business – how do we meet the business's needs so we stay in business? And we must keep in mind the Voice of the Employee – how do we meet employees' needs such that they remain employed by our firm and remain inspired and productive?
(Diagram: the Six Sigma role hierarchy – Executive Leadership, Champion/Process Owner, Master Black Belt (MBB), Black Belts, Green Belts, Yellow Belts.)
Just like a winning sports team, various people who have specific positions or roles have defined
responsibilities. Six Sigma is similar - each person is trained to be able to understand and perform the
responsibilities of their role. The end result is a knowledgeable and well coordinated winning business
team.
The division of training and skill will be delivered across the organization in such a way as to provide a
specialist: it is based on an assistant structure much as you would find in the medical field between a
Doctor, 1st year Intern, Nurse, etc. The following slides discuss these roles in more detail.
In addition to the roles described herein, all other employees are expected to have essential Six Sigma
skills for process improvement and to provide assistance and support for the goals of Six Sigma and the
company.
Six Sigma has been designed to provide a structure with various skill levels and knowledge for all
members of the organization. Each group has well defined roles and responsibilities and communication
links. When all individuals are actively applying Six Sigma principles, the company operates and performs
at a higher level. This leads to increased profitability and greater employee and customer satisfaction.
Executive Leadership
Not all Six Sigma deployments are driven from the top by executive leadership. The data is clear,
however, that those deployments that are driven by executive management are much more successful
than those that are not.
Makes decision to implement the Six Sigma initiative and develop accountability
method
Sets meaningful goals and objectives for the corporation
Sets performance expectations for the corporation
Ensures continuous improvement in the process
Eliminates barriers
The executive leadership owns the vision for the business; they provide sponsorship and set expectations for the results from Six Sigma. They enable the organization to apply Six Sigma and then monitor the progress against expectations.
Champion/Process Owner
Champions are responsible for functional business activities and for providing business deliverables to either internal or external customers. They are in a position to be able to recognize problem areas of the business, define improvement projects, assign projects to appropriate individuals, review projects and support their completion. They are also responsible for a business roadmap and employee training plan to achieve the goals and objectives of Six Sigma within their area of accountability.
MBBs should be well versed in all aspects of Six Sigma, from technical applications to Project Management. MBBs need to have the ability to influence change and motivate others.
A Master Black Belt is a technical expert, a “go to” person for the Six Sigma methodology. Master
Black Belts mentor Black Belts and Green Belts through their projects and support Champions. In
addition to applying Six Sigma, Master Black Belts are capable of teaching others in the practices
and tools.
Black Belt
Black Belts are application experts and work projects within the business. They should be well versed with the Six Sigma technologies and have the ability to drive results.
A Black Belt is a project team leader, working full time to solve problems under the direction of a Champion, and with technical support from the Master Black Belt. Black Belts work on projects that are relatively complex and require significant focus to resolve. Most Black Belts conduct an
average of 4 to 6 projects a year -- projects that usually have a high financial return for the
company.
Green Belt
Green Belts are practitioners of Six Sigma Methodology and typically work within their
functional areas or support larger Black Belt Projects.
Green Belts are capable of solving problems within their local span of control. Green Belts remain in
their current positions, but apply the concepts and principles of Six Sigma to their job environment.
Green Belts usually address less complex problems than Black Belts and perform at least two projects
per year. They may also be a part of a Black Belt’s team, helping to complete the Black Belt project.
Yellow Belt
Training as a Six Sigma Belt can be one of the most rewarding undertakings of your career and
one of the most difficult.
You can expect to experience:
Organizational Behaviors
All players in the Six Sigma process must be willing to step up and act according to the Six Sigma
set of behaviors.
Leadership by example: “walk the talk”
Six Sigma is a system of improvement. It develops people skills and capability for the participants. It consists of a proven set of analytical tools, project-management techniques, reporting methods and management methods, combined to form a powerful problem-solving and business-improvement methodology. It solves problems, resulting in increased revenue and profit, and business growth.
The strategy of Six Sigma is a data-driven, structured approach to managing processes, quantifying
problems, and removing waste by reducing variation and eliminating defects.
The tactics of Six Sigma are the use of process exploration and analysis tools to solve the equation
of Y = f(X) and to translate this into a controllable practical solution.
As a performance goal, a Six Sigma process produces less than 3.4 defects per million
opportunities. As a business goal, Six Sigma can achieve 40% or more improvement in the
profitability of a company. It is a philosophy that every process can be improved, at breakthrough
levels.
Notes
Define Phase
Six Sigma Fundamentals
The output of the Define Phase is a well developed and articulated project. It has been correctly
stated that 50% of the success of a project is dependent on how well the effort has been defined.
Process Metrics
Selecting Projects
What is a Process?
Why have a process focus?
– So we can understand how and why work gets done
– To characterize customer & supplier relationships
– To manage for maximum customer satisfaction while utilizing
minimum resources
– To see the process from start to finish as it is currently being
performed
– Blame the process, not the people
What is a Process? Many people perform a process every day, but do you really think of it as a process? Our definition of a process is a repetitive and systematic series of steps or activities where inputs are modified to achieve a value-added output.
Examples of Processes
We go through processes every day. Below are some examples of processes. Can you think of other processes within your daily environment?
Injection molding Recruiting staff
Decanting solutions Processing invoices
Filling vial/bottles Conducting research
Crushing ore Opening accounts
Refining oil Reconciling accounts
Turning screws Filling out a timesheet
Building custom homes Distributing mail
Paving roads Backing up files
Changing a tire Issuing purchase orders
Process Maps
Process Mapping, also called flowcharting, is a technique to visualize the tasks, activities and steps necessary to produce a product or a service. The preferred method for describing a process is to identify it with a generic name, show the workflow with a Process Map and describe its purpose with an operational description. Remember that a process is a blending of inputs to produce some desired output. The intent of each task, activity and step is to add value.
• The purpose of Process Maps is to:
– Identify the complexity of the process
– Communicate the focus of problem solving
• Process Maps are living documents and must be changed as the process is changed
– They represent what is currently happening, not what you think is happening
– They should be created by the people who are closest to the process
The individual processes are linked together to see the total effort and flow for meeting business and
customer needs. In order to improve or to correctly manage a process, you must be able to describe it
in a way that can be easily understood. Process Mapping is the most important and powerful tool you
will use to improve the effectiveness and efficiency of a process.
Standard symbols for process mapping (available in Microsoft Office™, Visio™, iGrafx™, SigmaFlow™ and other products):
There may be several interpretations of some of the process mapping symbols; however, just about everyone uses these primary symbols to document processes. As you become more practiced you will find additional symbols useful, i.e. reports, data storage, etc. For now we will start with just these symbols.
At a minimum a high level Process Map must include start and stop points, all process steps, all decision points and directional flow. Also be sure to include Value Categories such as Value Added (Customer Focus) and Value Enabling (External Stakeholder focus).
One of the deliverables from the Define Phase is a high level process map; at a minimum it must include:
– Start and stop points
– All process steps
– All decision points
– Directional flow
– Value categories as defined below
• Value Added:
– Physically transforms the "thing" going through the process
– Must be done right the first time
– Meaningful from the customer's perspective (is the customer willing to pay for it?)
• Value Enabling:
– Satisfies requirements of non-paying external stakeholders (government regulations)
• Non-Value Added:
– Everything else
(Example Process Map: a call-center case-handling process, with decision points such as "CALL or WALK-IN?", phone data capture, case creation with case type and date/time, queries to internal HRSC SMEs, research and call-back handling, and case closure with date/time stamps.)
A cross-functional ("swim-lane") Process Map is used where the process involves several different departments in the company. (Example: Vendor, Accounting, Financial Accounting and General Accounting lanes for a payments process – producing an invoice, filling out an ACH enrollment form, receiving payment, matching against the database, review and transfer in a journal entry, and bank reconciliation.)
1. Create a high level process map; use enough detail to make it useful.
• It is helpful to use rectangular post-its for process steps and square ones turned to a diamond for decision points.
2. Color code the value added (green) and non-value added (red) steps.
3. Be prepared to discuss this with your mentor.
An important element of Six Sigma is understanding your customer. This is called VOC, or Voice of the Customer. Doing this allows you to find all of the necessary information that is relevant between your product/process and the customer, better known as CTQ's (Critical to Quality). The CTQ's are the customer requirements for satisfaction with your product or service.
There are four steps that can help you in understanding your customer. These steps focus on the customer's perspective of features, your company's integrity, delivery mechanisms and perceived value versus cost. The customer's perspective has to be foremost in the mind of the Six Sigma Belt throughout the project cycle.
1. Features
• Does the process provide what the customers expect and need?
• How do you know?
2. Integrity
• Is the relationship with the customer centered on trust?
• How do you know?
3. Delivery
• Does the process meet the customer's time frame?
• How do you know?
4. Expense
• Does the customer perceive value for cost?
• How do you know?
What is a Customer?
Value Chain
The relationship from one process to the next in an organization creates a "value chain" of suppliers and receivers of process outputs.
Each process has a contribution and accountability to the next to satisfy the external customer.
External customers' needs and requirements are best met when all process owners work cooperatively in the value chain.
Careful – each move has many impacts!
The disconnect from Design and Production in some organizations is a good example. If Production
is not fed the proper information from Design how can Production properly build a product?
Every activity (process) must be linked to move from raw materials to a finished product on a store
shelf.
What is a CTQ?
• Critical to Quality (CTQ's) are measures that we use to capture VOC properly (also referred to in some literature as CTC's – critical to customer).
• CTQ's can be vague and difficult to define.
– The customer may identify a requirement that is difficult to measure directly, so it will be necessary to break down what is meant by the customer into identifiable and measurable terms.
Example: Making an Online Purchase. Reliability – the correct amount of money is taken from the account.
Developing CTQ's
The steps in developing CTQ's are identifying the customer, capturing the Voice of the Customer and finally validating the CTQ's.
Step 1 – Identify Customers
• Listing
• Segmentation
• Prioritization
Step 2 – Capture VOC
• Review existing performance
• Determine gaps in what you need to know
• Select tools that provide data on gaps
• Collect data on the gaps
Step 3 – Validate CTQ's
• Translate VOC to CTQ's
• Prioritize the CTQ's
• Set Specified Requirements
• Confirm CTQ's with customer
Another important tool from this phase is COPQ, Cost of Poor Quality. COPQ represents the financial opportunity of your team's improvement efforts. Those opportunities are tied to either hard or soft savings. COPQ is a symptom measured in loss of profit (financial quantification) that results from errors (defects) and other inefficiencies in our processes. This is what we are seeking to eliminate!
• COPQ stands for Cost of Poor Quality
• As a Six Sigma Belt, one of your tasks will be to estimate COPQ for your process
• Through your process exploration and project definition work you will develop a refined estimate of the COPQ in your project
• This project COPQ represents the financial opportunity of your team's improvement effort (VOB)
• Calculating COPQ is iterative and will change as you learn more about the process
(No, not that kind of cop queue!)
You will use the concept of COPQ to quantify the benefits of an improvement effort and also to
determine where you might want to investigate improvement opportunities.
Prevention costs are typically costs associated with ensuring product quality; they are viewed as an investment that companies make to ensure product quality. The final element is appraisal costs, which are tied to product inspection and auditing.
This idea of COPQ was defined by Joseph Juran and is a great point of reference for gaining further understanding.
Over time and with Six Sigma, COPQ has migrated towards the reduction of waste. Waste is a better
term, because it includes poor quality and all other costs that are not integral to the product or service
your company provides. Waste does not add value in the eyes of customers, employees or investors.
COPQ - Categories
Internal COPQ: Quality Control Department; Inspection; Quarantined Inventory; etc.
External COPQ: Warranty; Customer Complaint Related Travel; Customer Charge Back Costs; etc.
Prevention: Error Proofing Devices; Supplier Certification; Design for Six Sigma; etc.
Detection: Supplier Audits; Sorting Incoming Parts; Repaired Material; etc.
COPQ - Iceberg

Even worse are the intangible Costs of Poor Quality. These are typically 20 to 35% of sales. If you average the intangible and tangible costs together, it is not uncommon for a company to be spending 25% of its revenue on COPQ or waste.
Implementing Lean fundamentals can also help identify areas of COPQ. Lean will be discussed later.
While hard savings are always more desirable because they are easier to quantify, it is also necessary to think about soft savings.
COPQ Exercise
Notes
• Better: DPU, DPMO, RTY (there are others, but they derive from these
basic three)
• Faster: Cycle Time
• Cheaper: COPQ
If you make the process better by eliminating defects, you will make it faster.

If you choose to make the process faster, you will have to eliminate defects to be as fast as you can be.

If you make the process better or faster, you will necessarily make it cheaper.

The metrics for all Six Sigma projects fall into one of these three categories.
The previous slides have been discussing process management and the concepts behind a process perspective. Now we begin to discuss process improvement and the metrics used.
– It is not simply the “touch time” of the value-added portion of the process

What is the cycle time of the process you mapped? Is there any variation in the cycle time? Why?

Cycle time includes any wait or queue time for either people or products.
DPU, or Defects per Unit, quantifies individual defects on a unit and not just defective units. A returned unit or transaction can be defective and have more than one defect.

Defect: A physical count of all errors on a unit, regardless of the disposition of the unit.

Six Sigma methods quantify individual defects and not just defectives.
– Defects account for all errors on a unit
• A unit may have multiple defects
• An incorrect invoice may have the wrong amount due and the wrong due date
– Defectives simply classifies the unit as bad
• It doesn’t matter how many defects there are
• The invoice is wrong; the causes are unknown
– A unit:
• Is the measure of volume of output from your area.
• Is observable and countable. It has a discrete start and stop point.
• Is an individual measurement and not an average of measurements.

EXAMPLES: An online transaction has errors (typed wrong card number, internet failed). In this case one online transaction had 2 defects (DPU = 2): two defects, one defective.

A mobile computer that has 1 broken video screen, 2 broken keyboard keys and 1 dead battery has a total of 4 defects. (DPU = 4)
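The two examples can be expressed as a quick calculation. This is a minimal sketch; the helper name `dpu` is my own, and the defect counts come from the examples above.

```python
# Sketch: DPU counts every defect on every unit, not just whether a
# unit is defective. Counts are taken from the examples in the text.

def dpu(total_defects, total_units):
    """Defects per Unit = total defects observed / total units processed."""
    return total_defects / total_units

# One online transaction with 2 defects (wrong card number, failed connection)
print(dpu(2, 1))    # 2.0

# One mobile computer with 4 defects (screen, two keys, battery)
print(dpu(4, 1))    # 4.0
```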
Is a process that produces 1 DPU better or worse than a process that generates 4 DPU? If you assume equal weight on the defects, obviously a process that generates 1 DPU is better; however, cost and severity should be considered. Still, the only way you can model or predict a process is to count all the defects.
Traditional metrics, when chosen poorly, can lead the team in a direction that is not consistent with the focus of the business. One of the metrics we must be concerned about is FTY - First Time Yield. It is very possible to have 100% FTY and spend tremendous amounts in excess repairs and rework.
Instead of relying on FTY - First Time Yield, a more efficient metric to use is RTY-Rolled Throughput
Yield. RTY has a direct correlation (relationship) to Cost of Poor Quality.
In the few organizations where data is readily available, the RTY can be calculated using actual defect
data. The data provided by this calculation would be a binomial distribution since the lowest yield
possible would be zero.
As depicted here, RTY is the multiplied yield of each subsequent operation throughout a process (X1 *
X2 * X3…)
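The multiplication just described can be sketched in a few lines. The three step yields below are illustrative values, not figures from the text.

```python
# Sketch of Rolled Throughput Yield as the product of step yields
# (X1 * X2 * X3 ...); step yields here are illustrative.
import math

def rolled_throughput_yield(step_yields):
    return math.prod(step_yields)

rty = rolled_throughput_yield([0.98, 0.95, 0.99])
print(round(rty, 4))  # 0.9217
```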
RTY Estimate
Sadly, in most companies there is not enough data to calculate RTY over the long term, and installing the data collection practices required to provide such data would not be cost effective. In those instances, it is necessary to utilize a prediction of RTY in the form of e^(-dpu) (e to the negative dpu).

• In many organizations the long-term data required to calculate RTY is not available; we can, however, estimate RTY using a known DPU as long as certain conditions are met.
• The Poisson distribution generally holds true for the random distribution of defects in a unit of product and is the basis for the estimation.
– The best estimate of the proportion of units containing no defects, or RTY, is:

RTY = e^(-dpu)

The mathematical constant e is the base of the natural logarithm.
e ≈ 2.71828 18284 59045 23536 02874 7135

When using the e^(-dpu) equation to calculate the probability of a product or service moving through the entire process without a defect, there are several things that must be held for consideration. While this would seem to be a constraint, it is appropriate to note that if a process has in excess of 10% defects, there is little need to concern yourself with the RTY.
In such extreme cases, it would be much more prudent to correct the problem at hand before worrying
about how to calculate yield.
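The Poisson-based estimate can be computed directly; `rty_estimate` is a hypothetical helper name for the RTY = e^(-dpu) relation described above.

```python
# RTY estimate from a known DPU: the probability of a unit passing
# through the whole process with zero defects, assuming Poisson defects.
import math

def rty_estimate(dpu):
    return math.exp(-dpu)

print(round(rty_estimate(0.02), 6))  # 0.980199, i.e. about 98.02%
print(round(rty_estimate(1.0), 3))   # 0.368
```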
Poisson vs Binomial (r=0, n=1)

Probability of a defect   Yield (Binomial)   Yield (Poisson)   % Overestimated
0.0                       100%               100%              0%
0.1                       90%                90%               0%
0.2                       80%                82%               2%
0.3                       70%                74%               4%
0.4                       60%                67%               7%
0.5                       50%                61%               11%
0.6                       40%                55%               15%
0.7                       30%                50%               20%
0.8                       20%                45%               25%
0.9                       10%                41%               31%
1.0                       0%                 37%               37%
Binomial:
n = number of units
r = number of predicted defects
p = probability of a defect occurrence
q = 1 - p

For low defect rates (p < 0.1), the Poisson approximates the Binomial fairly well.
Our goal is to predict yield. For process improvement, the “yield” of interest is the ability of a process
to produce zero defects (r=0). Question: What happens to the Poisson equation when r=0?
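The comparison in the table can be reproduced numerically. This sketch takes the Binomial zero-defect yield for n=1, which is simply 1 - p, against the Poisson estimate e^(-p).

```python
# For r=0 and n=1 the Binomial yield is 1 - p; the Poisson estimate is
# e^(-p). The Poisson overestimates yield as p grows, matching the table.
import math

for p in (0.1, 0.5, 1.0):
    binomial_yield = 1 - p
    poisson_yield = math.exp(-p)
    print(p, round(binomial_yield, 2), round(poisson_yield, 2))
# 0.1 0.9 0.9
# 0.5 0.5 0.61
# 1.0 0.0 0.37
```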
Deriving RTY from DPU - Modeling

Basic Question: What is the likelihood of producing a unit with zero defects?

• For the unit shown above the following data was gathered:
– 60 defects observed
– 60 units processed
• What is the DPU?
• What is the probability that any given opportunity will be a defect?
• What is the probability that any given opportunity will NOT be a defect?
• What is the probability that all 10 opportunities on a single unit will be defect-free?

The probability that any opportunity is a defect = # defects / (# units x # opportunities per unit).

RTY for DPU = 1:

Opportunities   P(defect)   P(no defect)   RTY (Prob defect-free unit)
10              0.1         0.9            0.34867844
100             0.01        0.99           0.366032341
1000            0.001       0.999          0.367695425
10000           0.0001      0.9999         0.367861046
100000          0.00001     0.99999        0.367877602
1000000         0.000001    0.999999       0.367879257

To what value is the P(0) converging? If we extend the concept to an infinite number of opportunities, all at a DPU of 1.0, we will approach the value of 0.368.

Note: Ultimately, this means that you need the ability to track all the individual defects which occur per unit via your data collection system.
The point of this slide is to demonstrate the mathematical model used to predict the probability of an outcome of interest. It has little practical purpose other than to acquaint the Six Sigma Belt with the math behind the tool they are learning and let them understand that there is a logical basis for the equation.
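The convergence toward 0.368 can be reproduced with the same arithmetic the table uses: a DPU of 1.0 spread across n opportunities gives P(defect-free unit) = (1 - 1/n)^n.

```python
# Reproducing the convergence: (1 - 1/n)^n approaches e^-1 ≈ 0.368
# as the number of opportunities per unit grows, at a fixed DPU of 1.0.
for n in (10, 100, 1000, 1000000):
    rty = (1 - 1.0 / n) ** n
    print(n, round(rty, 8))
# 10 0.34867844
# 100 0.36603234
# 1000 0.36769542
# 1000000 0.36787926
```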
The DPU for a given operation can be calculated by dividing the number of defects found in the operation by the number of units entering the operational step.

100 parts built
2 defects identified and corrected
dpu = 0.02

So RTY for this step would be e^(-0.02) (0.980199) or 98.02%.
RTY_TOT = 0.90

RTY1 = 0.98   RTY2 = 0.98   RTY3 = 0.98   RTY4 = 0.98   RTY5 = 0.98
dpu = .02     dpu = .02     dpu = .02     dpu = .02     dpu = .02

dpu_TOT = .1
If the process had only 5 process steps with the same yield, the process RTY would be: 0.98 * 0.98 * 0.98 * 0.98 * 0.98 = 0.903921 or 90.39%. Since our metric of primary concern is the COPQ of this process, we can say that nearly 10% of the time we will be spending dollars in excess of the pre-determined standard or value-added amount to which this process is entitled.

As the number of steps in a process increases, we continue to multiply the yield from each step to find the overall process yield. For the sake of simplicity, let’s say we are calculating the RTY for a process with 8 steps. Each step in our process has a yield of .98. Again, there will be a direct correlation between the RTY and the dollars spent to correct errors in our process.
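Both multi-step figures reduce to a single power of the per-step yield, a quick sketch:

```python
# Process RTY with identical step yields: yield ** number_of_steps.
print(round(0.98 ** 5, 6))  # 0.903921 -> the five-step example, 90.39%
print(round(0.98 ** 8, 6))  # 0.850763 -> the eight-step example, 85.08%
```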
Product A: FTY = 80%
Product B: FTY = 80%

As you have seen, there are many factors behind the final number for FTY. That’s where we need to look for process improvements. Let’s look at the DPU of each product, assuming equal opportunities and margin, and answer the questions.

Now we have a better idea of:
“What does a defect cost?”
“What product should get the focus?”

Product A: dpu = 200 / 100 = 2
Product B: dpu = 100 / 100 = 1

Now, can you tell which to work on? The product with the highest DPU? …think again!
Explain COPQ
Notes
Define Phase
Selecting Projects
Selecting Projects
Overview
The core fundamentals of this phase are Selecting Projects, Refining and Defining, and Financial Evaluation.

Understanding Six Sigma
Six Sigma Fundamentals
Selecting Projects
 – Selecting Projects
 – Refining & Defining
 – Financial Evaluation

The output of the Define Phase is a well developed and articulated project. It has been correctly stated that 50% of the success of a project is dependent on how well the effort has been defined.
Selecting Projects

Project Charter – The project charter is a more detailed version of the business case. This document further focuses the improvement effort. It can be characterized by two primary sections: basic project information and simple project performance metrics.

Document: Project Charter
Responsible Party: Six Sigma Belt
Resources: Champion (Process Owner) & Master Black Belt
Frequency of Update: Ongoing
Selecting Projects
The Starting Point is defined by the Champion or Process Owner and the Business Case is the output.
– These are some examples of business metrics or Key Performance Indicators, commonly referred to as KPI’s.
– The tree diagram is used to facilitate the process of breaking down the metric of interest.

Level 1: EBIT
Level 2: Cycle time, Defects, Cost, Revenue, Complaints, Compliance, Safety

What metric should you focus on? It depends. What is the project focus? What are your organization’s strategic goals? Are Costs of Sales preventing growth? Are customer complaints resulting in lost earnings? Are excess cycle times and yield issues eroding market share? Is the fastest growing division of the business the refurbishing department?
It depends, because the motivations of organizations vary so much, and all projects should be directly aligned with the organization’s objectives. Answer the questions: What metrics is my department not meeting? What is causing us pain?
Selecting Projects
Be sure to start with higher level metrics, whether they are measured at the Corporate Level,
Division Level or Department Level, projects should track to the Metrics of interest within a given
area. Primary Business Measures or Key Performance Indicators (KPI’s) serve as indicators of the
success of a critical objective.
Primary Business Measure → Business Activities (Measure) → Business Processes (Measure)

Post business measures (product/service) are lower-level metrics and must focus on the end product.
Selecting Projects
Primary Business Measure → Business Activities (Measure) → Business Processes (Measure)

Y = f(x1, x2, x3 … xn)
1st Call Resolution = f(Calls, Operators, Resolutions … xn)

Business measures are a function of activities. These activities are usually created or enforced by direct supervision of functional managers. Activities are usually made up of a series of processes or specific processes.
Business Case Components - Processes

Primary Business Measure → Business Activities (Measure) → Business Processes (Measure)

Y = f(x1, x2, x3 … xn)
Resolutions = f(New Customers, Existing Customers, Defective Products … xn)

The processes represent the final stage of the matrix, where multiple steps result in the delivery of some output for the customer. These deliverables are set by the business and customer and are captured within the Voice of the Customer, Voice of the Business or Voice of the Employee. What makes up these processes are the X’s that determine the performance of the Y, which is where the actual breakthrough projects should be focused.
Selecting Projects
Let’s get
down to
business!
As you review this statement remember the following format of what needs to be in a Business Case: WHAT is wrong, WHERE and WHEN is it occurring, what is the BASELINE magnitude at which it is occurring, and what is it COSTING me?
You must take caution to avoid under-writing a Business Case. Your natural tendency is to write too
simplistically because you are already familiar with the problem. You must remember that if you are to
enlist support and resources to solve your problem, others will have to understand the context and the
significance in order to support you.
The Business Case cannot include any speculation about the cause of the problem or what actions will
be taken to solve the problem. It’s important that you don’t attempt to solve the problem or bias the
solution at this stage. The data and the Six Sigma methodology will find the true causes and solutions
to the problem.
Selecting Projects
You need to make sure that your own Business Case captures the units of pain, the business measures, the performance and the gaps. If this template does not seem to be clicking, use your own or just free-form your Business Case, ensuring that it is well articulated and quantified.
Using the Excel file ‘Define Templates.xls’, Business Case, perform this exercise.
Selecting Projects
Components:
• The Problem
• Project Scope
• Project Metrics
y & Secondary
• Primary y
• Graphical Display of Project Metrics
• Primary & Secondary
• Standard project information
• Project, Belt & Process Owner
names
• Start date & desired End date
• Division or Business Unit
• Supporting Master Black Belt
(Mentor)
• Team Members
The Project Charter is an important document – it is the initial communication of the project. The first
phases of the Six Sigma methodology are Define and Measure. These are known as
“Characterization” phases that focus primarily on understanding and measuring the problem at hand.
Therefore some of the information in the Project Charter, such as primary and secondary metrics, can change several times. By the time the Measure Phase is wrapping up, the Project Charter should be in its final form, meaning defects and the metrics for measuring them are clear and agreed upon.
As you can see some of the information in the Project Charter is self explanatory, especially the first
section. We are going to focus on establishing the Problem Statement and determining Objective
Statement, scope and the primary and secondary metrics.
Project Charter - Definitions

• Problem Statement - Articulates the pain of the defect or error in the process.
• Primary Metric – The actual measure of the defect or error in the process.
• Charts – Graphical displays of the Primary and Secondary Metrics over a period of time.
Selecting Projects
Pareto Analysis
Assisting you in determining which inputs are having the greatest impact on your process is the Pareto Analysis approach.

Pareto Analysis: A bar graph used to arrange information in such a way that priorities for process improvement can be established.
Selecting Projects
(Pareto chart data, three levels:)

Level 1 – Scrap:
Category    A        B       C
Cost        150000   30000   25000
Percent     73.2     14.6    12.2
Cum %       73.2     87.8    100.0

Level 2 – Department:
Department  J       M       F       W       Other
Cost        95000   23000   19000   17500   5000
Percent     59.6    14.4    11.9    11.0    3.1
Cum %       59.6    74.0    85.9    96.9    100.0

Level 3 – Part:
Part        Z101    Z876    X492
Cost        75000   15000   5000
Percent     78.9    15.8    5.3
Cum %       78.9    94.7    100.0
The Pareto Charts are often referred to as levels. For instance the first graph is called the first level,
the next the second level and so on.
Start high and drill down. Let’s look at how we interpret this and what it means.
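The percent and cumulative-percent arithmetic behind each level can be sketched in a few lines; the category names and costs below are the Level 1 scrap figures, and the sorting logic is a minimal illustration.

```python
# Level 1 Pareto computation: sort categories by cost descending, then
# report each category's percent of total and the running cumulative percent.
scrap = {"A": 150000, "B": 30000, "C": 25000}

total = sum(scrap.values())
cum = 0.0
for name, cost in sorted(scrap.items(), key=lambda kv: -kv[1]):
    pct = 100 * cost / total
    cum += pct
    print(f"{name}: {cost} ({pct:.1f}%, cumulative {cum:.1f}%)")
# A: 150000 (73.2%, cumulative 73.2%)
# B: 30000 (14.6%, cumulative 87.8%)
# C: 25000 (12.2%, cumulative 100.0%)
```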
Department J makes up 60% of the scrap, and part Z101 makes up nearly 80% of Department J’s scrap cost. You may be eager to jump into trying to fix Department J.
Selecting Projects
An indication that your project has too broad of a focus is when your Pareto looks flat. It’s telling you that there is no one or two inputs that are impacting your process; multiple inputs are having similar effects.

You need to reduce the scope of the project to get to a more granular level.

(Flat Pareto by FAILURE MODE: six failure modes with counts 495, 489, 478, 472, 468 and 455, each contributing roughly 16 to 17% of the total, with no dominant bar.)
Selecting Projects
This gives a better picture of which product category produces the highest defect count.

PRODUCT CATEGORIES   PLATINUM-BUS   PLATINUM-IND   GREEN-BUS   GREEN-IND   Other
Count                1238           450            362         201         106
Percent              52.5           19.1           15.4        8.5         4.5
Cum %                52.5           71.6           87.0        95.5        100.0
Now we’ve got something to work with. Notice the 80% area: draw a line from the 80% mark across to the cumulative percent line (Red Line) in the graph, as shown here.
Now you are beginning to see what needs work to improve the performance of your project.
Selecting Projects
Remember to keep focused on finding the biggest bang for the buck.

TRAVEL     CAR     HOTEL   AIR
Count      428     420     390
Percent    34.6    33.9    31.5
Cum %      34.6    68.5    100.0
This does not mean there is NO opportunity for improvements to be had; it simply means nothing obvious is sticking out at this level.
So keep looking.
Selecting Projects
Moving on to the next element of the Project Charter… Using the Excel file ‘Define Templates.xls’, Project Charter, perform the following exercise.

Since we will be narrowing in on the defect through the Measure Phase, it is common for the primary metric to change several times while we struggle to understand what is happening in our process of interest.

Establishing the Primary Metric:

The primary metric is a very important measure in the Six Sigma project; this metric is a quantified measure of the defect or primary issue of the project.

We can only have one Primary Metric. Recall the equation y = f(x): once your defect is located, Y will be your defect, and your primary metric will measure it.

– Quantified measure of the defect
– Serves as the indicator of project success
– Links to the KPI or Primary Business measure
– Only one primary metric per project
The primary metric also serves as the gauge for when we can claim victory with the project.
Selecting Projects
Selecting Projects
The financial evaluation establishes the value of the project.
Standard financial principals should be followed at the beginning and end of the project to provide a
true measure of the improvement’s effect on the organization.
A financial representative of the firm should establish guidelines on how savings will be calculated
throughout the Six Sigma deployment.
Whatever your organization’s protocol may be, these aspects should be accounted for within any improvement project.

There are two types of Impact: “One-Off” and Sustainable.
Selecting Projects
• Benefits should be measured in accordance with Generally Accepted Accounting Principles (GAAP).
A. Projects directly impact the Income Statement or Cash Flow Statement.
B. Projects impact the Balance Sheet (working capital).
Selecting Projects
It is highly recommended that you follow the involvement governance shown here.
Benefits Capture - Summary
It’s a wrap!
Selecting Projects
The Benefits Calculation Template facilitates and aligns with the aspects discussed for Project
Accounting.
Selecting Projects
Notes
Define Phase
Elements of Waste
Elements of Waste
Overview
7 Components of Waste
5S
Definition of Lean
Elements of Waste
Lean – History
Lean Manufacturing has been going on for a very long time; however, the phrase is credited to James Womack in 1990. A small list of accomplishments is noted in the slide above, primarily focused on higher-volume manufacturing.
Forms of waste include: Wasted capital (inventory), wasted material (scrap), wasted time (cycle time),
wasted human effort (inefficiency, rework) and wasted energy (energy inefficiency). Lean is a
prescriptive methodology for relatively fast improvements across a variety of processes, from
administrative to manufacturing applications. Lean enables your company to identify waste where it
exists. It also provides the tools to make improvements on the spot.
Elements of Waste
Lean removes many forms of waste so that Six Sigma can focus on eliminating variability. Variation leads to defects, which are a major source of waste. Six Sigma is a method to make processes more capable through the reduction of variation; thus the symbiotic relationship between the two methodologies.
Lean brings these opportunities for savings back into focus with specific approaches to finding
and eliminating waste.
Elements of Waste
Overproduction

Examples are:
✓ Over-ordering materials
✓ Duplication of effort/reports

Producing more parts than necessary to satisfy the customer’s quantity demand, thus leading to idle capital invested in inventory.

Producing parts at a rate faster than required such that a work-in-process queue is created – again, idle capital.
Elements of Waste
Correction
Examples are:
✓ Misspelled words in communications

Inventory
Examples are:
✓ Over-ordering materials consumed in-house

Inventory is a drain on an organization’s overhead. The greater the inventory, the higher the overhead costs become. If quality issues arise and inventory is not minimized, defective material is hidden in finished goods.
To remain flexible to customer requirements and to control product variation, we must minimize
inventory. Excess inventory masks unacceptable change-over times, excessive downtime,
operator inefficiency and a lack of organizational sense of urgency to produce product.
Elements of Waste
Motion

Motion is the unnecessary movement of people and equipment.
– This includes looking for things like documents or parts, as well as movement that is straining.

Examples are:
✓ Extra steps
✓ Having to look for something
Any movement of people or machinery that does not contribute added value to the product, i.e.
programming delay times and excessive walking distance between operations.
Overprocessing

Waste of Over-processing relates to over-processing anything that may not be adding value in the eyes of the customer.

Examples are:
✓ Sign-offs
✓ Communications, reports, emails, contracts, etc. that contain more than the necessary points (briefer is better)
✓ Voice mails that are too long
Processing work that has no connection to advancing the line or improving the quality of the product. Examples include typing memos that could be hand written, or painting components or fixtures internal to the equipment.
Elements of Waste
Conveyance
Examples are:
9Distance traveled
Conveyance is incidental, required action that does not directly contribute value to the product. Perhaps the product must be moved; however, the time and expense incurred do not produce product or service characteristics that customers see.
It’s vital to avoid conveyance unless it is supplying items when and where they are needed (i.e.
just-in-time delivery).
Waiting
Examples are:
Idle time between operations or events, i.e. an employee waiting for machine cycle to finish or a
machine waiting for the operator to load new parts.
Elements of Waste
– Overproduction ___________________
– Correction ___________________
– Inventory ___________________
– Motion ___________________
– Overprocessing ___________________
– Conveyance ___________________
– Waiting ___________________
Notes
Elements of Waste
5S – The Basics
Seiso – Clean
Seiketsu – Purity
Shitsuke - Commitment
The term “5S” derives from the Japanese words for five practices leading to a clean and manageable
work area. The five “S” are:
‘Seiri' means to separate needed tools, parts and instructions from unneeded materials and to
remove the latter.
'Seiton' means to neatly arrange and identify parts and tools for ease of use.
'Seiso' means to conduct a cleanup campaign.
'Seiketsu' means to conduct seiri, seiton and seiso at frequent, indeed daily, intervals to maintain a workplace in perfect condition.
'Shitsuke' means to form the habit of always following the first four S’s.
Simply put, 5S means the workplace is clean, there is a place for everything and everything is in its
place. The 5S will create a work place that is suitable for and will stimulate high quality and high
productivity work. Additionally it will make the workplace more comfortable and a place of which you
can be proud.
Developed in Japan, this method assumes that no effective, quality job can be done without a clean and safe environment and without behavioral rules.
The 5S approach allows you to set up a well adapted and functional work environment, ruled by
simple yet effective rules. 5S deployment is done in a logical and progressive way. The first three S’s
are workplace actions, while the last two are sustaining and progress actions.
It is recommended to start implementing 5S in a well chosen pilot workspace or pilot process and
spread to the others step by step.
Elements of Waste
English Translation
There have been many attempts to find five English “S” words that maintain the original intent of 5S from Japanese. Listed below are typical English words used to translate:
1. Sort (Seiri)
2. Straighten or Systematically Arrange (Seiton)
3. Shine or Spic and Span (Seiso)
4. Standardize (Seiketsu)
5. Sustain or Self-Discipline (Shitsuke)
Sort – Identify necessary items and remove unnecessary ones; use time management.
Straighten – Neatly arrange and identify items for ease of use.
Shine – Visual sweep of areas; eliminate dirt, dust and scrap. Make the workplace shine.
Standardize – Work to standards, maintain standards, wear safety equipment.
Self-Discipline – Make 5S a strong habit. Make problems appear and solve them.
Regardless of which “S” words you use, the intent is clear: Organize the workplace, keep it neat and
clean, maintain standardized conditions and instill the discipline required to enable each individual to
achieve and maintain a world class work environment.
Elements of Waste
5S Exercise
• Sort ____________________
• Straighten ____________________
• Shine ____________________
• Standardize ____________________
• Self-Discipline ____________________
Notes
Elements of Waste
Describe 5S
Notes
Define Phase
Wrap Up and Action Items
Now we will conclude the Define Phase with “Wrap Up and Action Items”.
• Making data-based decisions
Look for the potential roadblocks and plan to address them before
they become problems:
– N o historical data exists to support the project.
– Team members do not have the time to collect data.
– Data presented is the best guess by functional managers.
– Data is communicated from poor systems.
– The project is scoped too broadly.
– The team creates the “ideal” Process Map rather than the “as is” Process Map.
DMAIC Roadmap

(Flowchart: the Champion/Process Owner estimates the COPQ and establishes the team; if the project focus is not approved, the COPQ is re-estimated and a new project focus recommended; once approved, the team creates the Team Charter. The Measure Phase then proves or disproves the impact the x’s have on the problem.)
Define Questions
Step One: Project Selection, Project Definition And Stakeholder Identification
Project Charter
• What is the problem statement? Objective?
• Is the business case developed?
• What is the primary metric?
• What are the secondary metrics?
• Why did you choose these?
• What are the benefits?
• Have the benefits been quantified? If not, when will this be done?
Date:____________________________
• Who is the customer (internal/external)?
• Has the COPQ been identified?
• Has the controller’s office been involved in these calculations?
• Who are the members on your team?
• Does anyone require additional training to be fully effective on the team?
Voice of the Customer (VOC) and SIPOC defined
• Voice of the customer identified?
• Key issues with stakeholders identified?
• VOC requirements identified?
• Business Case data gathered, verified and displayed?
Step Two: Process Exploration
Processes Defined and High Level Process Map
• Are the critical processes defined and decision points identified?
• Are all the key attributes of the process defined?
• Do you have a high level process map?
• Who was involved in its development?
General Questions
• Are there any issues/barriers that prevent you from completing this phase?
• Do you have adequate resources to complete the project?
• Have you completed your initial Define report out presentation?
These are some additional questions to ensure all the deliverables are achieved.
Notes
Measure Phase
Welcome to Measure
Now that we have completed the Define Phase, we are going to jump into the Measure Phase. “Welcome to Measure” will give you a brief look at the topics we are going to cover.
Welcome to Measure
Overview
• Process Discovery
• Six Sigma Statistics
• Measurement System Analysis
• Process Capability
• Wrap Up & Action Items
DMAIC Roadmap
[Roadmap diagram: the Champion/Process Owner determines the appropriate project focus; in Define the COPQ is estimated and the team established; the Measure Phase follows, and the financial impact is verified.]
Here is the overview of the DMAIC process. Within Measure we are going to start getting into details about
process performance, measurement systems and variable prioritization.
Welcome to Measure
[Measure flowchart: confirm the measurement system is repeatable and reproducible, then select the vital few X’s causing problems (X-Y Matrix, FMEA).]
This provides a process look at putting “Measure” to work. By the time we complete this phase you will have a thorough understanding of the various Measure Phase concepts.
Measure Phase
Process Discovery
Overview
• Welcome to Measure
• Process Discovery
  – Detailed Process Mapping
  – Cause and Effect Diagrams
  – FMEA
• Six Sigma Statistics
• Measurement System Analysis
• Process Capability
• Wrap Up & Action Items
The purpose of this module is highlighted above. We will review tools to help facilitate Process
Discovery.
This will be a lengthy step as it requires a full characterization of your selected process.
On the next lesson page we will help you develop a visual and mental model that will give you leverage in finding the causes to any problem.
Process Discovery
[Fishbone Diagram: the Y, or Problem Condition, sits at the head; the X’s (Causes) branch off legs grouped under categories such as Material, Measurement and Environment.]
You will need to use brainstorming techniques to identify all possible problems and their causes.
Brainstorming techniques work because the knowledge and ideas of two or more persons is
always greater than that of any one individual.
Brainstorming will generate a large number of ideas or possibilities in a relatively short time.
Brainstorming tools are meant for teams but can also be used at the individual level.
Brainstorming will be a primary input for other improvement and analytical tools that you will use.
You will learn two excellent brainstorming techniques, cause and effect diagrams and affinity
diagrams. Cause and effect diagrams are also called Fishbone Diagrams because of their
appearance and sometimes called Ishikawa diagrams after their inventor.
In a brainstorming session, ideas are expressed by the individuals in the session and written down
without debate or challenge. The general steps of a brainstorming session are:
Process Discovery
A cause and effect diagram is a composition of lines and words representing a meaningful
relationship between an effect, or condition, and its causes. To focus the effort and facilitate thought,
the legs of the diagram are given categorical headings. Two common templates for the headings are
for product related and transactional related efforts. Transactional is meant for processes where
there is no traditional or physical product; rather it is more like an administrative process.
Transactional processes are characterized as processes dealing with forms, ideas, people,
decisions and services. You would most likely use the product template for determining the cause of
burnt pizza and use the transactional template if you were trying to reduce order defects from the
order taking process. A third approach is to identify all categories as you best perceive them.
When performing a cause and effect diagram, keep drilling down, always asking why, until you find
the root causes of the problem. Start with one category and stay with it until you have exhausted all
possible inputs and then move to the next category. The next step is to rank each potential cause by its likelihood of being the root cause: rank the most likely as a 1, the second most likely as a 2 and so on. This may take some time; you may even have to create sub-sections like 2a, 2b, 2c, etc., then come back to reorder the sub-sections into the larger ranking. This is your first attempt at really
finding the Y=f(X); remember the funnel? The top X’s have the potential to be the Critical X’s, those
X’s which exert the most influence on the output Y.
Finally you will need to determine if each cause is a Control or a Noise factor. This, as you know, is a
requirement for the characterization of the process. Next we will explain the meaning and methods
of using some of the common categories.
Process Discovery
The People category groups root causes related to people, staffing, and
organizations:
Examples of questions to ask:
• Are people trained, do they have the right skills?
• Is there person to person variation?
• Are people over-worked?
The Method category groups root causes related to how the work is done, the
way the process is actually conducted:
Examples of questions to ask:
• How is this performed?
• Are procedures correct?
• What might be unusual?
The Materials category groups root causes related to parts, supplies, forms or
information needed to execute a process:
Process Discovery
The Equipment category groups root causes related to tools used in the process:
Examples of questions to ask:
• Have machines been serviced recently, what is the uptime?
• Have tools been properly maintained?
• Is there variation?
The Environment (a.k.a. Mother Nature) category groups root causes related to
our work environment, market conditions, and regulatory issues.
Examples of questions to ask:
• Is the workplace safe and comfortable?
• Are outside regulations impacting the business?
• Does the company culture aid the process?
For each of the X’s identified in the Fishbone diagram classify them
as follows:
– Controllable – C (Knowledge)
– Procedural – P (People, Systems)
– Noise – N (External or Uncontrollable)
WHICH X’s CAUSE DEFECTS?
The Cause and Effect Diagram is an organized way to approach brainstorming. This approach allows
us to further organize ourselves by classifying the X’s into controllable, procedural or noise types.
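The classification and ranking steps described above can be captured as simple data. Here is a minimal sketch; the example causes, categories, types and ranks for the burnt-pizza problem are hypothetical illustrations, not values from the text:

```python
# Hypothetical fishbone output for the burnt-pizza example: each potential X
# is tagged Controllable (C), Procedural (P) or Noise (N) and ranked by
# likelihood of being the root cause (1 = most likely).
causes = [
    {"x": "Oven temperature setting", "category": "Equipment", "type": "C", "rank": 1},
    {"x": "Observation frequency",    "category": "Method",    "type": "P", "rank": 2},
    {"x": "Ingredient moisture",      "category": "Material",  "type": "N", "rank": 3},
    {"x": "New-hire training",        "category": "People",    "type": "P", "rank": 4},
]

# Sort by rank so the top candidates for Critical X's come first.
for cause in sorted(causes, key=lambda c: c["rank"]):
    print(f'{cause["rank"]}. [{cause["type"]}] {cause["category"]}: {cause["x"]}')
```

A list like this feeds directly into the X-Y Matrix and FMEA described later, where the top-ranked X’s are evaluated further.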
Process Discovery
[Fishbone Diagram for Chemical Purity, with inputs on each branch classified: Capability (C), Adherence to procedure (P), Specifications (C), Startup inspection (P), Room Humidity (N), Column Capability (C), Handling (P), RM Supply in Market (N), Nozzle type (C), Purification Method (P), Shipping Methods (C), Temp controller (C), Data collection/feedback (P).]
This example of the Cause and Effect Diagram is of chemical purity. Notice how the input variables for
each branch are classified as Controllable, Procedural and Noise.
Below is a Cause & Effect Diagram for surface flaws. The next few
slides will demonstrate how to create it in MINITAB™.
The Fishbone Diagram shown here for surface flaws was generated in MINITAB™. We will now
review the various steps for creating a Cause and Effect Diagram using the MINITAB™
statistical software package.
Process Discovery
Open the MINITAB™ Project “Measure Data Sets.mpj” and select the worksheet
Surfaceflaws.mtw.
Take a few moments to study the worksheet. Notice the first 6 columns are the classic bones for a
Fishbone. Each subsequent column is labeled for one of the X’s listed in one of the first six columns
and are the secondary bones.
After you have entered the Labels, click on the first field under the “Causes” column to bring up the
list of branches on the left hand side. Next double-click the first branch name on the left hand side to
move “C1 Man” underneath “Causes”.
Process Discovery
To continue identifying
the secondary
branches, select the
button, “Sub…” to the
right of the “Label”
column.
In order to adjust the Fishbone Diagram so the main cause titles are not rolled, grab the line with your mouse and move the entire bone.
Process Discovery
[Linear Process Map: Start → Step A → Step B → Step C → Step D → Finish, with an inspection step.]
Process Mapping, also called flowcharting, is a technique to visualize the tasks, activities and steps
necessary to produce a product or a service. The preferred method for describing a process is to
identify it with a generic name, show the workflow with a Process Map and describe its purpose with
an operational description.
Remember that a process is a blending of inputs to produce some desired output. The intent of each
task, activity and step is to add value, as perceived by the customer, to the product or service we are
producing. You cannot discover if this is the case until you have adequately mapped the process.
Individual maps developed by Process Members form the basis of Process Management. The
individual processes are linked together to see the total effort and flow for meeting business and
customer needs.
In order to improve or to correctly manage a process, you must be able to describe it in a way that
can be easily understood, that is why the first activity of the Measure Phase is to adequately describe
the process under investigation. Process Mapping is the most important and powerful tool you will
use to improve the effectiveness and efficiency of a process.
Process Discovery
Process Mapping
Then there is the third view: “what it should be”. This is the result of process improvement activities. It
is precisely what you will be doing to the key process you have selected during the weeks between
classes. As a result of your project you will either have created the “what it should be” or will be well
on your way to getting there. In order to find the “what it should be” process, you have to learn
process mapping and literally “walk” the process via a team method to document how it works. This is a much easier task than you might suspect, as you will learn over the next several lessons.
Process Discovery
There may be several interpretations of some of the Process Mapping symbols; however, just
about everyone uses these primary symbols to document processes. As you become more
practiced you will find additional symbols useful, e.g. reports, data storage, etc. For now we will
start with just these symbols.
Process Discovery
Level 1 – The Macro Process Map, sometimes called a Management level or viewpoint.
[Level 1 Macro Process Map for pizza: Customer Hungry → Calls for Order → Take Order → Make Pizza → Cook Pizza → Box Correct Pizza → Deliver Pizza → Customer Eats.

Level 2 Process Map: Take Order from Cashier → Add Ingredients → Place in Oven → Observe Frequently → Check if Done? (No: keep observing; Scrap and Start New Pizza if burnt) → Remove from Oven → Pizza Correct? (No: Scrap) → Tape Order on Box → Put on Box → Place in Delivery Rack.]
Before Process Mapping starts, you have to learn about the different level of detail on a Process
Map and the different types of Process Maps. Fortunately these have been well categorized and
are easy to understand.
There are three different levels of Process Maps. You will need to use all three levels and you most
likely will use them in order from the macro map to the micro map. The macro map contains the
least level of detail, with increasing detail as you get to the micro map. You should think of and use
the level of Process Maps in a way similar to the way you would use road maps. For example, if
you want to find a country, you look at the world map. If you want to find a city in that country, you
look at the country map. If you want to find a street address in the city, you use a city map. This is
the general rule or approach for using Process Maps.
The Macro Process Map, what is called the Level 1 Map, shows the big picture; you will use this to
orient yourself to the way a product or service is created. It will also help you to better see which
major step of the process is most likely related to the problem you have and it will put the various
processes that you are associated with in the context of the larger whole. A Level 1 PFM,
sometimes called the “management” level, is a high-level process map having the following
characteristics:
Process Discovery
Probably not, you are going to need a Level 3 Map called the Micro Process Map. It is also known as the improvement view of a process. There is, however, a lot of value in the Level 2 Map, because it is helping you to “see” and understand how work gets done, who does it, etc. It is a necessary stepping stone to arriving at improved performance.
Next we will introduce the four different types of Process Maps. You will want to use different
types of Process Maps, to better help see, understand and communicate the way processes
behave.
The four types are the Linear Flow Map, the Swim Lane Map, the SIPOC Map (pronounced “sipoc”) and the Value Stream Map. While they all show how work gets done, they emphasize different aspects of process flow and provide you with alternative ways to understand the behavior of the process so you can do something about it. The Linear Flow Map is the most traditional and is usually where most start the mapping effort.

The Swim Lane Map adds another dimension of knowledge to the picture of the process: now you can see which department, area or person is responsible. The value of the Swim Lane Map is that it shows you who or which department is responsible for the steps in a process. This can provide powerful insights into the way a process performs. A timeline can be added to show how long it takes each group to perform their work. Also, each time work moves across a swim lane, there is a “Supplier – Customer” interaction. This is usually where bottlenecks and queues form. You can use the various types of maps in the form of any of the three levels of a Process Map.
Process Discovery
[Linear Process Map for Door Manufacturing, including steps: Begin, Prep doors, Inspect, Pre-cleaning, Mark for door handle drilling, Install into work jig, Light sanding, Inspect finish (Return for rework / Rework), Drill holes, De-burr and smooth hole, Apply part number, Move to finishing, Apply stain and dry, Final cleaning, Final Inspect (Scratch repair / Scrap), End.

Swim Lane Process Map for Capital Equipment, with lanes for Business Unit, I.T., Finance, Corporate/Top Mgt, Procurement and Supplier: Define needs → Prepare paperwork (CAAR & installation request) → Review & approve CAAR in each function → Acquire equipment → Supplier ships → Configure & install → Issue payment → Supplier paid → Receive & use. Stage durations: 21, 6, 15, 5, 17, 7, 71 and 50 days.]
The SIPOC diagram is especially useful after you have been able to construct either a Level 1 or Level 2 Map because it facilitates your gathering of other pertinent data that is affecting the process in a systematic way. It will help you to better see and understand all of the influences affecting the behavior and performance of the process.

[Level 1 Process Map for Customer Order Process: Call for an Order → Answer Phone → Write Order → Confirm Order → Sets Price → Address & Phone → Order to Cook. Data gathered includes: name, phone number, time, day and date, drink types & quantities, other products, volume, order transaction, delivery info, correct price.]

You may also add a requirements section to both the supplier side and the customer side to capture the expectations for the inputs and the outputs of the process. Doing a SIPOC is a great building block to creating the Level 3 Micro Process Map. The two really complement each other and give you the power to make improvements to the process.
Process Discovery
[Value stream data at this Process Map level: queue times of 2.65 days, 20.47 days, 16.9 days, 1.60 days and 7.57 days.]
Read the following background for the exercise: You have been concerned
about your ability to arrive at work on time and also the amount of time it takes
from the time your alarm goes off until you arrive at work. To help you better
understand both the variation in arrival times and the total time, you decide to create a Level 1 Macro Process Map. For purposes of this exercise, the start is when your alarm goes off the first time and the end is when you arrive at your work station.
Task 1 – Mentally think about the various tasks and activities that you routinely
do from the defined start to the end points of the exercise.
Task 2 – Using a pencil and paper, create a linear process map at the macro level, but with enough detail that you can see all the major steps of your process.
Task 3 – From the Linear Process Map, create a swim lane style Process Map.
For the lanes you may use the different phases of your process, such as the
wake up phase, getting prepared, driving, etc.
Process Discovery
Process Mapping follows a general order, but sometimes you may find it necessary, even advisable, to deviate somewhat. However, you will find this a good path to follow as it has proven itself to generate significant results. On the lessons ahead we will always show you where you are at in this sequence of tasks for Process Mapping.

[Process Mapping sequence: Select the process → Determine approach to map the process → Complete Level 1 PFM worksheet → Create Level 1 PFM → Define the scope for the Level 2 PFM → Create the Level 2 PFM → Perform SIPOC → Identify all X’s and Y’s → Identify customer requirements → Identify supplier requirements → Create a Level 3 PFM → Add performance data → Identify VA/NVA steps.]

Before we begin our Process Mapping we will first start you off with how to determine the approach to mapping the process.
Basically there are two approaches: the individual and the team approach.
If you decide to do the individual approach, here are a few key factors: You must pretend that you are the
product or service flowing through the process and you are trying to “experience” all of the tasks that
happen through the various steps.
You must start by talking to the manager of the area and/or the process owner. This is where you will
develop the Level 1 Macro Process Map. While you are talking to him, you will need to receive permission
to talk to the various members of the process in order to get the detailed information you need.
Process Discovery
Process Mapping works best with a team approach. The logistics of performing the mapping with a team are somewhat different, but overall it takes less time, the quality of the output is higher and you will have more “buy-in” into the results. Input should come from individuals familiar with all stages of the process.

Using the Team Approach:
1. Start with the Level 1 Macro Process Map.
2. Meet with process owner(s) / manager(s). Create a Level 1 Map and obtain approval to call a process mapping meeting with process members (see team workshop instructions for details on running the meeting).
3. Bring key members of the process into the process flow workshop. If the process is large in scope, hold individual workshops for each subsection of the total process. Start with the beginning steps. Organize the meeting to use the “post-it note” approach to gather individual tasks and activities, based on the macro map, that comprise the process.
4. Immediately assemble the information that has been provided into a Process Map.
5. Verify the PFM by discussing it with process owners and by observing the actual process from beginning to end.
Where appropriate the team should include line individuals, supervisors, design engineers, process
engineers, process technicians, maintenance, etc. The team process mapping workshop is where it
all comes together.
In summary, after adding to and agreeing to the Macro Process Map, the team process mapping
approach is performed using multiple post-it notes where each person writes one task per note and,
when finished, places them onto a wall which contains a large scale Macro Process Map.
This is a very fast way to get a lot of information including how long it takes to do a particular task.
Using the Value Stream Analysis techniques which you will study later, you will use this data to improve the process. We will now discuss the development of the various levels of Process Mapping.
Process Discovery
A Macro Process Map can be useful when reporting project status to management. A macro-map can
show the scope of the project, so management can adjust their expectations accordingly. Remember, only major process steps are included. For example, a step listed as “Plating” in a manufacturing Macro Process Map might actually consist of many steps: pre-clean, anodic cleaning, cathodic
activation, pre-plate, electro-deposition, reverse-plate, rinse and spin-dry, etc. The plating step in the
macro-map will then be detailed in the Level 2 Process Map.
Exercise – Generate a Level 1 PFM
Process Discovery
If necessary, you may look at the example for the pizza order entry process.

1. Identify a generic name for the process:
2. Mentally “walk” through the major steps of the process and write them down:

Example:
1. Identify a generic name for the process: (i.e. customer order process).
2. Mentally “walk” through the major steps of the process and write them down: (Receive the order via phone call from the customer, calculate the price, create a build order and provide the order to the chef).
Process Discovery
the details. If the efficiency or effectiveness of the process could be significantly improved by a broad summary analysis, the improvement would be done already. If you map the process at an actionable level, you can identify the source of inefficiencies and defects. But you need to be careful about mapping too little an area and missing your problem cause, or mapping too large an area in detail, thereby wasting your valuable time.

[The Level 2 Process Map of the pizza process is shown for reference.]

The rules for determining the Level 2 Process Map scope:
• From your Macro Process Map, select the area which represents your problem.
• Map this area at a Level 2.
• Start and end at natural starting and stopping points for a process; in other words, you have the complete associated process.
Process Discovery
Building a SIPOC
The tool name prompts the team to consider the suppliers (the 'S' in SIPOC) of your process, the
inputs (the 'I') to the process, the process (the 'P') your team is improving, the outputs (the 'O') of
the process and the customers (the 'C') that receive the process outputs.
Requirements of the customers can be appended to the end of the SIPOC for further detail and
requirements are easily added for the suppliers as well.
The SIPOC tool is particularly useful in identifying:
Who supplies inputs to the process?
What are all of the inputs to the process we are aware of? (Later in the DMAIC methodology you will use other tools which will find still more inputs; remember Y=f(X), and if we are going to improve Y, we are going to have to find all the X’s.)
What specifications are placed on the inputs?
What are all of the outputs of the process?
Who are the true customers of the process?
What are the requirements of the customers?
You can actually begin with the Level 1 PFM that has 4 to 8 high-level steps, but a Level 2 PFM is even
of more value. When creating a SIPOC with a process mapping team, the recommended method is again a wall exercise similar to your other process mapping workshop. Create an area that will allow the team to place post-it note additions to the 8.5 x 11 sheets with the letters S, I, P, O and C on them, with a copy of the Process Map below the sheet with the letter P on it.
Hold a process flow workshop with key members. (Note: If the process is large in scope, hold an
individual workshop for each subsection of the total process, starting with the beginning steps).
The preferred order of the steps is as follows:
1. Identify the outputs of this overall process.
2. Identify the customers who will receive the outputs of the process.
3. Identify customers’ preliminary requirements.
4. Identify the inputs required for the process.
5. Identify suppliers of the required inputs that are necessary for the process to function.
6. Identify the preliminary requirements of the inputs for the process to function properly.
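The six steps above can be sketched as a simple record built in the recommended order. This is a minimal illustration; the pizza order-taking entries are assumptions for the example, not content from the text:

```python
# A minimal SIPOC record for the pizza order-taking example; entries are
# illustrative. Built in the recommended order: outputs first, then
# customers, customer requirements, inputs, suppliers, supplier requirements.
sipoc = {
    "process": "Customer order process",
    "outputs": ["Build order", "Price quote"],                       # step 1
    "customers": ["Chef (internal)", "Customer (external)"],          # step 2
    "customer_requirements": ["Correct order", "Correct price"],      # step 3
    "inputs": ["Phone call", "Menu", "Price list"],                   # step 4
    "suppliers": ["Customer", "Store manager"],                       # step 5
    "supplier_requirements": ["Clear order details", "Current prices"],  # step 6
}

# Lay the record out in S-I-P-O-C order, as on the workshop wall.
for letter, key in [("S", "suppliers"), ("I", "inputs"), ("P", "process"),
                    ("O", "outputs"), ("C", "customers")]:
    print(letter, "-", sipoc[key])
```

Capturing the wall exercise in a structure like this makes it easy to transfer the results into the Excel forms described next.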
Process Discovery
The Excel spreadsheet is somewhat self-explanatory. You will use a similar form for identifying the
supplier requirements. Start by writing in the process name followed by the process operational
definition. The operational definition is a short paragraph which states why the process exists, what it
does and what its value proposition is. Always take sufficient time to write this such that anyone who
reads it will be able to understand the process. Then list each of the outputs, the Y’s, and write in the
customer’s name who receives this output, categorized as an internal or external customer.
Next are the requirements data. To specify and measure something, it must have a unit of measure;
called a metric. As an example, the metric for the speed of your car is miles per hour, for your weight it is
pounds, for time it is hours or minutes and so on. You may know what the LSL and USL are but you may
not have a target value. A target is the value the customer prefers all the output to be centered at;
essentially, the average of the distribution. Sometimes it is stated as “1 hour +/- 5 minutes”. One hour is
the target, the LSL is 55 minutes and the USL is 65 minutes. A target may not be specified by the
customer; if not, put in what the average would be. You will want to minimize the variation from this value.
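The “1 hour +/- 5 minutes” example can be made concrete: the target is 60 minutes, the LSL 55 and the USL 65. A minimal sketch, where the sample delivery times are made-up data:

```python
# Spec from the text: "1 hour +/- 5 minutes" -> target 60, LSL 55, USL 65.
LSL, TARGET, USL = 55, 60, 65

def within_spec(minutes):
    """True if a measured time meets the customer requirement."""
    return LSL <= minutes <= USL

# Illustrative measurements (made-up data):
times = [58, 61, 66, 54, 60]
defects = [t for t in times if not within_spec(t)]
print(f"{len(defects)} of {len(times)} observations out of spec: {defects}")
# prints: 2 of 5 observations out of spec: [66, 54]
```

Anything outside the LSL/USL window is a defect against the requirement; minimizing variation around the target reduces how often this happens.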
You will learn more about measurement, but for now you must know that if something is required, you
must have a way to measure it as specified in column 9. Column 10 is how often the measurement is
made and column 11 is the current value for the measurement data. Column 12 is for identifying if this is
a value or non value added activity; more on that later. And finally column 13 is for any comments you
want to make about the output.
You will come back to this form and rank the significance of the outputs in terms of importance to identify the CTQ’s.
Process Discovery
[Form column headings: Metric, LSL, Target, USL, Measurement System (How is it Measured), Frequency of Measurement, Performance Level Data, VA or NVA, Comments.]
Later you will come back to this form and rank the importance of the inputs to the success of your
process and eventually you will have found the Critical X’s.
Process Discovery
It is important to distinguish which category an input falls into. You know through Y=f(X), that if it is a
Critical X, by definition, that you must control it. Also, if you believe that an input is or needs to be controlled, then you have automatically implied there are requirements placed on it and that it must be measured. You must always think and ask whether an input is or should be controlled or if it is uncontrolled.
Read the following background for the exercise: You will use your selected key process for this exercise (if more than one person in the class is part of the same process you may do it as a small group). You may not have all the pertinent detail to correctly identify all supplier requirements; that is OK, do the best you can. This will give you a starting template when you go back to do your workplace assignment. Use the process input identification and analysis form for this exercise.

Task 1 – Identify a generic name for the process.
Task 2 – Write an operational description for the process.
Task 3 – Complete the remainder of the form except the Value – Non-value added column.
Task 4 – Report out to the class when called upon.
Process Discovery
[Level 2 Process Map of the pizza process, with inputs such as pizza dough identified.]

[Output Identification and Analysis form columns: Output Data (Process Output – Name (Y); Customer (Name); Internal/External), Requirements Data (Metric, LSL, Target, USL), Measurement Data (Measurement System (How is it Measured), Frequency of Measurement, Performance Level Data), Value Data (VA or NVA) and General Data/Information (Comments).]

[Input Identification and Analysis form columns: Input Data (Process Input – Name (X); Supplier (Name); Controlled (C) or Noise (N); Internal/External), Requirements Data (Metric, LSL, Target, USL), Measurement Data (Measurement System (How is it Measured), Frequency of Measurement, Performance Level Data), Value Data (VA or NVA) and General Data/Information (Comments).]
You have a decision at this point to continue with a complete characterization of the process you have
documented at a Level 2 in order to fully build the process management system or to narrow the effort
by focusing on those steps that are contributing to the problem you want solved.
Usually just a few of the process steps are the root cause areas for any given higher level process
output problem. If your desire is the latter, there are some other Measure Phase actions and tools you
will have to use to narrow the number of potential X’s and subsequently the number of process steps.
To narrow the scope so it is relevant to your problem, consider the following: Remember using the pizza
restaurant as our example for selecting a key process? They were having a problem with overall delivery
time and burnt pizzas. Which steps in this process would contribute to burnt pizzas, and how might a pizza which was burnt so badly it had to be scrapped and restarted affect delivery time? It would most
likely be the steps between “place in oven” to “remove from oven”, but it might also include “add
ingredients” because certain ingredients may burn more quickly than others. This is how, based on the
Problem Statement you have made, you would narrow the scope for doing a Level 3 PFM.
For your project, the priority will be to do your best to find the problematic steps associated with your
Problem Statement. We will teach you some new tools in a later lesson to aid you in doing this. You may
have to characterize a number of steps until you get more experience at narrowing the steps that cause
problems; this is to be expected. If you have the time you should characterize the whole process.
Each step you select as a causal step in the process must be fully characterized, just as you have
previously done for the whole process. In essence you will do a “mini SIPOC” on each step of the
process as defined in the Level 2 Process Map. This can be done using a Level 3 Micro Process Map
and placing all the information on it, or it can be consolidated into an Excel spreadsheet format, or a
combination of both. If all the data and information is put onto an actual Process Map, expect the map to
be rather large physically. Depending on the scope of the process, some people dedicate a wall space
for doing this; say a 12 to 14 foot long wall. An effective approach for this is to use a roll of industrial
paper.
Process Discovery
A Level 3 Process Map contains all of the process details needed to meet your objective: all of the flows,
set points, standard operating procedures (SOPs), inputs and outputs; their specifications and if they are
classified as being controllable or non-controllable (noise). The Level 3 PFM usually contains estimates of
defects per unit (DPU), yield and rolled throughput yield (RTY) and value/non-value add. If processing
cycle times and inventory levels (materials or work queues) are important, value stream parameters are
also included.
This can be a lot of detail to manage and appropriate tracking sheets are required. We have supplied
these sheets in a paper and Excel spreadsheet format for your use. The good news is the approach and
forms for the steps are essentially the same as the format for identifying supplier and customer
requirements at the process level. A spreadsheet is a very convenient tool, and the output from the
spreadsheet can then be fed directly into a C&E Matrix and an FMEA (to be described later), also built
using spreadsheets.
You will find the work you have done up to this point, in terms of the Level 1 and 2 Process Maps and the
SIPOC, will be of use, both from knowledge of the process and actual data.
An important reminder of a previous lesson: You will recall when you were taught about project definition
it was stated that you should only try to solve the performance of one process output at any one time.
Because of the amount of detail you can get into for just one Y, trying to optimize more than one Y at a
time can become overwhelming. The good news is that by focusing on one Y in your initial project you
will have laid all the groundwork to focus on a second and a third Y for the process.
Process Inputs (X’s) and Outputs (Y’s)
You are now down at the PROCESS STEP improvement view of a process. Now you do exactly the
same thing as you did for the overall process: for each process step you list all of the inputs and outputs
and their requirements, and add performance data. This visualization shows many of the inputs and
outputs once they are characterized. By using the process and process step input and output sheets,
you get a very detailed picture of how your process works. Now you have enough data…

[Worksheet excerpt, pizza example: for the “Take Order” step the inputs Name, Address, Phone, Time,
Day and Date are classified as Noise, with requirements such as “within 10 miles”, “within area code”,
“11 AM to 1 AM”, “5 X 52” and “MM/DD,YY”; the output Order requires all fields complete. For the
“Make Pizza” step the inputs Order, Raw Ingredients and Recipe are Controllable or SOP-governed
(all fields complete, per spec sheets, per recipe chart 3-1 in Oz., per Rev 7.0), and the output Pizza is
checked on size, weight and correct ingredients; sizes of 7”, 12”, 16” and toppings of 12 meats,
2 veggies, 3 cheeses are noted as N/C.]
Identifying Waste
When we produce products or services, we engage process-based activities to transform physical
materials, ideas and information into something valued by customers. Some activities in the process
generate true value, others do not. The expenditure of resources, capital and other energies that do not
generate value is considered waste. Value generation is any activity that changes the form, fit or function
of what we are working on in a way that the customer is willing to pay for. The goal of testing for VA vs.
NVA is to remove unnecessary activity (waste) from a process.

Each process activity can be tested for its value-add contribution. Ask the following two questions to
identify non-value-added activity:
– Is the form, fit or function of the work item changed as a result of this activity?
– Is the customer willing to pay for this activity?

[Figure: a Level 3 PFM of the pizza order-taking process with each activity flagged VA or NVA. Steps
such as “writes time on scratch pad”, “rewrite order” and “verify with notes” are marked NVA.]
Hint: If an action starts with the two letters “re” it’s a good chance that it’s a form of waste, i.e. rework,
replace, review, etc.
Some non-value activities cannot be removed; i.e., data collection is required to understand and plan
production activity levels, data must be collected to comply with governmental regulations, etc. (even
though the data have no effect on the actual product or service).
On the process flow diagram we place a red X through the steps or we write NVA or VA by each step.
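The two-question VA/NVA test above is mechanical enough to sketch in code. Here is a minimal Python illustration; the step names and yes/no answers are hypothetical, loosely based on the pizza example, not taken from a real process map.

```python
# Hypothetical sketch of the two-question value-add test from the text.
# A step is value-add (VA) only if it passes BOTH tests; otherwise it is NVA.

def classify_step(changes_form_fit_function: bool, customer_will_pay: bool) -> str:
    """Apply the two VA/NVA test questions to one process activity."""
    return "VA" if (changes_form_fit_function and customer_will_pay) else "NVA"

# Illustrative steps: (name, changes form/fit/function?, customer willing to pay?)
steps = [
    ("Take order",        True,  True),
    ("Rewrite order",     False, False),   # starts with "re-": likely waste
    ("Make pizza",        True,  True),
    ("Verify with notes", False, False),
]

for name, changes, pays in steps:
    print(f"{name}: {classify_step(changes, pays)}")
```

Note how the "re-" hint from the text shows up: the rework-style steps fail both questions.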
Process Discovery
A Six Sigma Belt does not just discover which X’s are important in
a process (the vital few).
– The team considers all possible X’s that can contribute or
cause the problem observed.
– The team uses 3 primary sources of X identification:
• Process Mapping
• Fishbone Analysis
• Basic Data Analysis – Graphical and Statistical
– A List of X’s is established and compiled.
– The team then prioritizes which X’s it will explore first, and
eliminates the “obvious” low impact X’s from further
consideration.
This is an important tool for the many reasons we have already stated. Use it to your benefit and
leverage the team; this will help progress you through the methodology to accomplish your ultimate
project goal.
Process Discovery
This is the X-Y Diagram. You should have a copy of this template. If possible open it and get
familiar with it as we progress through this section.
Process Discovery
List X’s from the Fishbone Diagram in horizontal rows.

Use your Fishbone Diagram as the source and type the Inputs into this section. Use common sense;
some of the info from the Fishbone may not justify going into the X-Y inputs.
Process Discovery
Example
This is the summary worksheet. If you click on the “Summary” tab you will see this output. Take some
time to review the worksheet.

X-Y Diagram Summary (Process: laminating; Date: 5/2/2006)
Output Variables (Y’s) and Weights: broken (10), unbonded area (9), smears (8), thickness (7).
Top Input Variables by Ranking: temperature 162 (14.90%), human handling 159 (14.63%),
material properties 130 (11.96%), washer 126 (11.59%).

[The accompanying Pareto chart plots the input rankings (temperature, time, clean room cleanliness,
material properties, pressure, …) against cumulative Rank %.]
Process Discovery
Definition of FMEA

Failure Modes Effect Analysis, or FMEA [usually pronounced as F-M-E-A (individual letters) or FEMA
(as a word)], is a structured approach to:
• Predict failures and prevent their occurrence in manufacturing and other functional areas which
generate defects.
• Identify the ways in which a process can fail to meet critical customer requirements (Y).
• Estimate the Severity, Occurrence and Detection (SOD) of defects.
• Evaluate the current control plan for preventing these failures from occurring and escaping to the
customer.
• Prioritize the actions that should be taken to improve and control the process using a Risk Priority
Number (RPN).

At this point the FMEA is developed with tribal knowledge by a cross-functional team. Later, using
process data, the FMEA can be updated and better estimates of Detection and Occurrence can be
obtained. The FMEA is not a tool to eliminate X’s but rather to control them. It is only a tool to identify
potential X’s and prioritize the order in which the X’s should be evaluated.

Give me an “F”, give me an “M”……
Process Discovery
History of FMEA:
• First used in the 1960’s in the Aerospace industry during the Apollo missions.
• In 1974 the Navy developed MIL-STD-1629 regarding the use of FMEA.
• In the late 1970’s, automotive applications, driven by liability costs, began to incorporate FMEA into
the management of their processes.
• The Automotive Industry Action Group (AIAG) now maintains the FMEA standard for both Design
and Process FMEA’s.
The “edge of your seat” info on the history of the FMEA! I’m sure you will all be sharing this with
everyone tonight at the dinner table!
Types of FMEA’s
• Design FMEA (DFMEA): Performed early in the design phase to analyze product failure modes
before the product is released to production. The purpose is to analyze how failure modes affect the
system and to minimize them. The severity rating of a failure mode MUST be carried into the Process
FMEA (PFMEA).
Process Discovery
Purpose of FMEA

As a means to manage risk: FMEA’s help you manage RISK by classifying your process inputs and
monitoring them.

The FMEA…
This is an FMEA. We have provided a template for you to use.
Process Discovery
FMEA Components…#

The second column is the Name of the Process Step. The FMEA should sequentially follow the steps
documented in your Process Map; for example: Phone, Dial Number, Listen for Ring, Say Hello,
Introduce Yourself, etc.

[The FMEA form columns are: #; Process Function (Step); Potential Failure Modes (process defects);
Potential Failure Effects (Y’s); SEV; Class; Potential Causes of Failure (X’s); OCC; Current Process
Controls; DET; RPN; Recommended Actions; Responsible Person & Target Date; Taken Action; and
revised SEV, OCC, DET and RPN.]
Process Discovery
This is simply the effect of realizing the potential failure mode on the overall process. It focuses on the
outputs of each step. This information is usually obtained from your Process Map.
Process Discovery
The fifth column highlighted here is the ranking that is developed based on the team’s knowledge of the
process in conjunction with the predetermined scale.
Severity is a financial measure of the impact to the business of a failure in the output.
Ranking Severity
The Automotive Industry Action Group, a consortium of the “Big Three” (Ford, GM and Chrysler),
developed these criteria. If you don’t like them, develop criteria that fit your organization; just make sure
they are standardized so everyone uses the same scale.
High (7) – Minor disruption to the production line. The product may have to be sorted and a portion
(less than 100%) scrapped. Vehicle operable, but at a reduced level of performance. Customers will
be dissatisfied.
Moderate (6) – Minor disruption to the production line. A portion (less than 100%) may have to be
scrapped (no sorting). Vehicle/item operable, but some comfort/convenience item(s) inoperable.
Customers will experience discomfort.
Low (5) – Minor disruption to the production line. 100% of product may have to be re-worked.
Vehicle/item operable, but some comfort/convenience item(s) operable at a reduced level of
performance. Customers will experience some dissatisfaction.
Very Low (4) – Minor disruption to the production line. The product may have to be sorted and a
portion (less than 100%) re-worked. Fit/finish/squeak/rattle item does not conform. Most customers
will notice the defect.
Minor (3) – Minor disruption to the production line. A portion (less than 100%) of the product may
have to be re-worked online but out-of-station. Fit/finish/squeak/rattle item does not conform.
Average customers will notice the defect.
Very Minor (2) – Minor disruption to the production line. A portion (less than 100%) of the product
may have to be re-worked online but in-station. Fit/finish/squeak/rattle item does not conform.
Discriminating customers will notice the defect.
None (1) – No effect.
* Potential Failure Mode and Effects Analysis (FMEA), Reference Manual, 2002. Pgs 29-45. Chrysler
Corporation, Ford Motor Company, General Motors Corporation.
Process Discovery
You will need to define your own criteria… and be consistent throughout your FMEA.
The actual definitions of the severity are not so important as the fact that the team remains
consistent in its use of the definitions. Below is a sample of transactional severities.
Critical Business Unit-wide (10) – May endanger company’s ability to do business. Failure mode
affects process operation and/or involves noncompliance with government regulation.
Critical Loss – Customer Specific (9) – May endanger relationship with customer. Failure mode
affects product delivered and/or customer relationship due to process failure and/or noncompliance
with government regulation.
High (7) – Major disruption to process/production down situation. Results in near 100% rework or an
inability to process. Customer very dissatisfied.
Moderate (5) – Moderate disruption to process. Results in some rework or an inability to process.
Process is operable, but some workarounds are required. Customers experience dissatisfaction.
Low (3) – Minor disruption to process. Process can be completed with workarounds or rework at the
back end. Results in reduced level of performance. Defect is noticed and commented upon by
customers.
Minor (2) – Minor disruption to process. Process can be completed with workarounds or rework at
the back end. Results in reduced level of performance. Defect noticed internally, but not externally.
None (1) – No effect.
Shown here is an example for severity guidelines developed for a financial services company.
Process Discovery
Controllable – A factor that can be dialed into a specific setting/value. For example, Temperature or
Flow.
Procedures – A standardized set of activities leading to readiness of a step. For example, Safety
Compliance, “Lock-Out Tag-Out.”
Noise – A factor that cannot be dialed in to a specific setting/value. For example, rain in a mine.
Recall the classifications of Procedural, Controllable and Noise developed when constructing your
Process Map and Fishbone Diagram? Use those classifications from the Fishbone in the “Class”
column, highlighted here, in the FMEA.
Potential Causes of Failure (X’s)
The column “Potential Causes of the Failure”, highlighted here, refers to how the failure could occur.
This information should be obtained from the Fishbone Diagram.
Process Discovery
Occurrence refers to how frequently the specified failure is projected to occur. This information should
be obtained from Capability Studies or Historical Defect Data, in conjunction with the predetermined
scale.
Ranking Occurrence
Potential Failure Mode and Effects Analysis (FMEA), Reference Manual, 2002. Pg. 35. Chrysler
Corporation, Ford Motor Company, General Motors Corporation.
The Automotive Industry Action Group, a consortium of the “Big Three”: Ford, GM and Chrysler
developed these Occurrence rankings.
Process Discovery
Current Process Controls refers to the three types of controls that are in place to prevent a failure with
the X’s. The three types of controls are:
• SPC (Statistical Process Control)
• Poka-Yoke (Mistake Proofing)
• Detection after Failure
The column “Current Process Controls”, highlighted here, refers to the three types of controls that are
in place to prevent failures.
FMEA Components…Detection (DET)

The “Detection” column highlighted here is an assessment of the probability that the proposed type of
control will detect a subsequent failure mode.
Process Discovery
Ranking Detection
Potential Failure Mode and Effects Analysis (FMEA), AIAG Reference Manual, 2002. Pg. 35. Chrysler
Corporation, Ford Motor Company, General Motors Corporation.
The Automotive Industry Action Group, a consortium of the “Big Three”: Ford, GM and Chrysler
developed these Detection criteria.
RPN = (SEV)*(OCC)*(DET)
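The RPN formula, and the prioritization it drives, can be sketched directly. The failure modes and the Severity/Occurrence/Detection ratings below are made up for illustration; a real FMEA would take them from the team's rating scales.

```python
# Sketch: computing Risk Priority Numbers (RPN = SEV * OCC * DET) and sorting
# failure modes so the highest-risk X's are worked first. Ratings are illustrative.

failure_modes = [
    # (description, severity, occurrence, detection)
    ("Wrong phone number dialed", 5, 4, 3),
    ("Order written illegibly",   7, 6, 5),
    ("Oven temperature drifts",   8, 3, 6),
]

scored = [(desc, s * o * d) for desc, s, o, d in failure_modes]

# highest RPN first: this is the order the team should address the causes
for desc, rpn in sorted(scored, key=lambda t: -t[1]):
    print(f"RPN {rpn:3d}  {desc}")
```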
Process Discovery
FMEA Components…Actions
Responsible Person & Date refers to the name of the group or person
responsible for completing the activity and when they will complete it.
Taken Action refers to the action and effective date after it has been
completed.
The columns highlighted here are a type of post-FMEA record. Remember to update the FMEA
throughout your project; this is what we call a “Living Document” because it changes throughout your
project.
Process Discovery
FMEA Exercise
OK Team, let’s get that FMEA!
Process Discovery
Create an FMEA
Notes
Measure Phase
Six Sigma Statistics
Now we will continue in the Measure Phase with “Six Sigma Statistics”.
Overview
In this module you will learn how your processes speak to you in the form of data. If you are to
understand the behaviors of your processes, then you must learn to communicate with the process in
the language of data.

The field of statistics provides the tools and techniques to act on data, to turn data into information and
knowledge which you will then use to make decisions and to manage your processes.

The statistical tools and methods that you will need to understand and optimize your processes are not
difficult. Use of Excel spreadsheets or specific statistical analytical software has made this a relatively
easy task.

Measure Phase roadmap: Welcome to Measure; Process Discovery; Six Sigma Statistics (Basic
Statistics, Descriptive Statistics, Normal Distribution, Assessing Normality, Special Cause / Common
Cause, Graphing Techniques); Measurement System Analysis; Process Capability; Wrap Up & Action
Items.
In this module you will learn basic, yet powerful analytical approaches and tools to increase your
ability to solve problems and manage process behavior.
Relax….it won’t
be that bad!
Having an understanding of Basic Statistics can be quite valuable to an individual. Statistics however,
like anything, can be taken to the extreme.
Data is like crude oil that comes out of the ground. Crude oil is not of much use; however, if the crude
oil is refined, many useful products result, such as medicines, fuel, food products, lubricants, etc. In a
similar sense, statistics can refine data into usable “products” to aid in decision making and to help us
see and understand what is happening.
Statistics is broadly used by just about everyone today; sometimes we just don’t realize it. Things as
simple as using graphs to better understand something are a form of statistics, as are the many opinion
and political polls used today. With easy-to-use software tools to reduce the difficulty and time to do
statistical analyses, knowledge of statistics is becoming a common capability amongst people.
An understanding of Basic Statistics is also one of the differentiating features of Six Sigma, and it would
not be possible without the use of computers and programs like MINITAB™. It has been observed that
the laptop is one of the primary reasons that Six Sigma has become both popular and effective.
[Notation cheat sheet: Greek symbols such as σ denote population parameters, e.g. “the standard
deviation of population data”, defined over each and every individual value.]
Use this as a cheat sheet; don’t bother memorizing all of this. Actually, most of the Greek notation is
for population data.
Population: All the items that have the “property of interest” under study.
[Figure: samples drawn from a population.]
A population parameter is a numerical value that summarizes the data for an entire population, a
sample has a corresponding numerical value called a statistic.
The population is a collection of all the individual data of interest. It must be defined carefully, such as
all the trades completed in 2001. If for some reason there are unique subsets of trades, it may be
appropriate to define those as a unique population, such as “all sub custodial market trades completed
in 2001” or “emerging market trades”.
Sampling frames are complete lists and should be identical to a population with every element
listed only once. It sounds very similar to population… and it is. The difference is how it is used. A
sampling frame, such as the list of registered voters, could be used to represent the population of
adult general public. Maybe there are reasons why this wouldn’t be a good sampling frame.
Perhaps a sampling frame of licensed drivers would be a better frame to represent the general
public.
It is important to recognize the difference between a sample and a population because we typically are
dealing with a sample of what the potential population could be in order to make an inference. The
formulas for describing samples and populations are slightly different. In most cases we will be dealing
with the formulas for samples.
Types of Data
The nature of data is important to understand. Based on the type of data you will have the option
to utilize different analyses.
Data, or numbers, are usually abundant and available to virtually everyone in the organization. Using
data to measure, analyze, improve and control processes forms the foundation of the Six Sigma
methodology. Data turned into information, then transformed into knowledge, lowers the risks of
decisions. Your goal is to make more decisions based on data versus the typical practices of “I think”,
“I feel” and “In my opinion”.
One of your first steps in refining data into information is to recognize the type of data you are using.
There are two primary types of data: Attribute and Variable Data.
Attribute Data is also called qualitative data. Attribute Data is the lowest level of data. It is purely
binary in nature. Good or bad, yes or no type data. No analysis can be performed on Attribute
Data. Attribute Data must be converted to a form of Variable Data called Discrete Data in order to
be counted or be useful.
Discrete Data is information that can be categorized into a classification. Discrete Data is based
on counts. It is typically things counted in whole numbers. Discrete Data is data that can't be
broken down into a smaller unit to add additional meaning. Only a finite number of values is
possible and the values cannot be subdivided meaningfully. For example, there is no such thing as half
of a defect or half of a system lockup.
Continuous Data is information that can be measured on a continuum or scale. Continuous Data,
also called quantitative data can have almost any numeric value and can be meaningfully
subdivided into finer and finer increments, depending upon the precision of the measurement
system. Decimal sub-divisions are meaningful with Continuous Data. As opposed to Attribute
Data like good or bad, off or on, etc., Continuous Data can be recorded at many different points
(length, size, width, time, temperature, cost, etc.). For example 2.543 inches is a meaningful
number, whereas 2.543 defects does not make sense.
Later in the course we will study many different statistical tests but it is first important to
understand what kind of data you have.
Discrete Variables
Shown here are additional Discrete Variables. Can you think of others within your business?
Continuous Variables
Continuous Variable — Possible Values:
• The length of prison time served for individuals convicted of first degree murder — all the real
numbers between a and b, where a is the smallest amount of time served and b is the largest.
• The household income for households with incomes less than or equal to $30,000 — all the real
numbers between a and $30,000, where a is the smallest household income in the population.
• The blood glucose reading for those individuals having glucose readings equal to or greater than
200 — all real numbers between 200 and b, where b is the largest glucose reading in all such
individuals.
Shown here are additional Continuous Variables. Can you think of others within your business?
• Understanding the nature of data and how to represent it can affect the
types of statistical tests possible.
• Interval Scale – data that can be arranged in some order and for which differences in data values
are meaningful. The data can be arranged in an ordering scheme and differences can be interpreted.
• Ratio Scale – data that can be ranked and for which all arithmetic operations including division can
be performed (division by zero is of course excluded). Ratio-level data has an absolute zero, and a
value of zero indicates a complete absence of the characteristic of interest.
Shown here are the four types of scales. It is important to understand these scales as they will dictate
the type of statistical analysis that can be performed on your data.
Nominal Scale

Listed are some examples of Nominal Data. The only analysis possible is whether the values are
different or not.

Qualitative Variable — Possible nominal-level data values:
• Blood Types — A, B, AB, O
• State of Residence — Alabama, …, Wyoming
Ordinal Scale
Interval Scale
Interval Variable — Possible Scores
Ratio Scale
Continuous Data

Continuous Data provides us more opportunity for statistical analyses and is always more desirable.
In many cases Attribute Data can be converted to Continuous Data by converting it to a rate. Which is
more useful?
– 15 scratches, or a total scratch length of 9.25”?
– 22 foreign materials, or 2.5 fm/square inch?
– 200 defects, or 25 defects/hour?
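The count-to-rate conversions above are simple division over an observation window. A sketch using the same numbers; the 8-hour window and the 8.8 square inch area are assumed values chosen so the rates match the examples in the text.

```python
# Sketch: converting Attribute (count) data to Continuous data by expressing
# it as a rate. The observation window and area are illustrative assumptions.

defects = 200
hours_observed = 8.0                     # assumed observation window
defect_rate = defects / hours_observed   # defects per hour

foreign_materials = 22
area_sq_in = 8.8                         # assumed inspected area
fm_density = foreign_materials / area_sq_in  # fm per square inch

print(f"{defect_rate:.0f} defects/hour, {fm_density:.1f} fm/square inch")
```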
Descriptive Statistics
Measures of Location

While the notation is different, there is no mathematical difference between the Mean of a sample and
the Mean of a population.

[MINITAB™ output — Descriptive Statistics: Data
Variable  N    N*  Mean    SE Mean   StDev   Minimum  Q1      Median  Q3      Maximum
Data      200  0   4.9999  0.000712  0.0101  4.9700   4.9900  5.0000  5.0100  5.0200
shown alongside a histogram of the 200 data points.]
The physical center of a data set is the Median, and it is unaffected by large data values. This is why
people use the Median when discussing average salary for an American worker; a few people like Bill
Gates would otherwise distort the average number.

Median is:
• The mid-point, or 50th percentile, of a distribution of data.
• Arrange the data from low to high, or high to low.
– It is the single middle value in the ordered list if there is an odd number of observations.
– It is the average of the two middle values in the ordered list if there is an even number of
observations.

[Histogram (with Normal Curve) of Data and MINITAB™ output: N = 200, Mean = 4.9999,
StDev = 0.0101, Median = 5.0000.]
Data
Trimmed Mean is a:
Compromise between the Mean and Median.
• The trimmed mean is calculated by eliminating a specified percentage of the smallest and largest
observations from the data set and then calculating the average of the remaining observations.
• Useful for data with potential extreme values.

The trimmed Mean (highlighted above) is less susceptible to the effects of extreme scores.
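These measures of location can be computed with Python's standard library. A sketch on an illustrative data set with one extreme value; the 10% trim fraction is an assumption (different tools default to different fractions).

```python
import statistics

# Sketch: Mean, Median and a 10%-trimmed Mean on an illustrative data set.
# The 5.20 value is a deliberate outlier to show the Median's resistance to it.

data = [4.97, 4.98, 4.99, 5.00, 5.00, 5.00, 5.01, 5.01, 5.02, 5.20]

mean = statistics.mean(data)      # pulled upward by the 5.20 outlier
median = statistics.median(data)  # the mid-point: unaffected by the outlier

def trimmed_mean(xs, proportion=0.10):
    """Drop the smallest and largest `proportion` of observations, then average."""
    xs = sorted(xs)
    k = int(len(xs) * proportion)
    return statistics.mean(xs[k:len(xs) - k])

print(mean, median, trimmed_mean(data))
```

Here the trimmed Mean falls between the Mean and the Median, which is exactly the compromise the text describes.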
Mode is:
The most frequently occurring value in a distribution of data.
Mode = 5

[Histogram (with Normal Curve) of Data: Mean 5.000, StDev 0.01007, N 200.]

It is possible to have multiple Modes; when this happens it’s called a Bi-Modal Distribution. Here we
only have one Mode = 5.
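The Mode, and the multi-mode case mentioned above, can be read off with the standard library; the data sets below are illustrative.

```python
import statistics

# Sketch: the Mode is the most frequently occurring value.
data = [4.99, 5.00, 5.00, 5.00, 5.01, 5.01]
print(statistics.mode(data))          # the single most frequent value

# A bi-modal data set: multimode() returns every value tied for most frequent.
bimodal = [1, 1, 1, 2, 3, 3, 3]
print(statistics.multimode(bimodal))  # both modes are reported
```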
Range is the:
Difference between the largest observation and the smallest observation in the data set.
• A small range would indicate a small amount of variability and a large range a large amount of
variability.

The range is typically used for small data sets; it is completely efficient in estimating variation only for a
sample of 2. As your data increases, the Standard Deviation is a more appropriate measure of
variation.
Sample: s = √( Σ(xᵢ − x̄)² / (n − 1) )        Population: σ = √( Σ(xᵢ − μ)² / N )

The Standard Deviation for a sample and for a population can be equated with short- and long-term
variation. Usually a sample is taken over a short period of time, making it free from the types of
variation that can accumulate over time, so be aware.
Variance is the:
Average squared deviation of each individual data point from the
mean.
Sample: s² = Σ(xᵢ − x̄)² / (n − 1)        Population: σ² = Σ(xᵢ − μ)² / N

The Variance is the square of the Standard Deviation. It is common in statistical tests where it is
necessary to add up sources of variation to estimate the total. Standard Deviations cannot be added;
variances can.
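The sample vs. population distinction shows up directly in Python's standard library, which carries separate functions for the (n − 1) and n denominators. A sketch on an illustrative data set:

```python
import statistics

# Sketch: measures of spread. Sample formulas divide by (n - 1);
# population formulas divide by n. The data set is illustrative.

data = [4.98, 4.99, 5.00, 5.01, 5.02]

r = max(data) - min(data)          # Range: largest minus smallest observation
s = statistics.stdev(data)         # sample standard deviation, (n - 1) denominator
sigma = statistics.pstdev(data)    # population standard deviation, n denominator

# Variance is the square of the standard deviation; unlike standard deviations,
# variances from independent sources can be added together.
assert abs(statistics.variance(data) - s ** 2) < 1e-12
assert abs(statistics.pvariance(data) - sigma ** 2) < 1e-12

print(r, s, sigma)
```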
Normal Distribution
We can begin to discuss the Normal Curve and its properties once we understand the basic
concepts of central tendency and dispersion.
As we begin to assess our distributions, know that sometimes it’s actually more difficult to determine
what is affecting a process if it is Normally Distributed. When we have a Non-normal Distribution there
are usually special or more obvious causes of variation that become readily apparent upon process
investigation.
Normal Distribution

Normalizing the Normal Distribution converts the raw scores into standard Z-scores with a Mean of 0
and a Standard Deviation of 1; this practice allows us to use the Z-table.

The area under the curve between any 2 points represents the proportion of the distribution between
those points. The area between the Mean and any other point depends upon the Standard Deviation.
Convert any raw score to a Z-score using the formula: Z = (x − μ) / σ

The area under the curve between any two points represents the proportion of the distribution between
them. The concept of determining the proportion between 2 points under the standard Normal curve is
a critical component of estimating Process Capability and will be covered in detail in that module.
Empirical Rule
No matter what the shape of your distribution is, as you travel 3 Standard
Deviations from the Mean, the probability of occurrence beyond that point
begins to converge to a very low number.
The Anderson-Darling test yields a statistical assessment of Normality (called a goodness-of-fit test),
and the MINITAB™ version of the Normal probability test produces a graph to visually demonstrate just
how good that fit is.

The shape of any normal curve can be calculated based on the normal probability density function.
Tests for Normality basically compare the shape of the calculated curve to the actual distribution of your
data points. For the purposes of this training, we will focus on 2 ways in MINITAB™ to assess Normality:
– The Anderson-Darling test
– Normal probability test
Goodness-of-Fit

[Figure: Cumulative Percent plotted against the Raw Data Scale (3.0 to 5.5), comparing the curve Expected for a Normal Distribution with the Actual Data; the 20% departures are highlighted.]

The Anderson-Darling Goodness-of-Fit test assesses the magnitude of the departure of the actual data from the expected Normal distribution using an Observed minus Expected formula.
Anderson-Darling assesses how closely the actual frequency at a given value corresponds to the theoretical frequency for a Normal Distribution with the same Mean and Standard Deviation.
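As a rough illustration outside of MINITAB™, the same statistic can be computed with SciPy's `anderson` function (an assumed tooling choice, not part of the course); the sample size, Mean and Standard Deviation here mimic the "Amount" example that follows.

```python
# Sketch of an Anderson-Darling Normality check. The statistic measures the
# departure of the observed cumulative distribution from the one expected
# under Normality; smaller is better.
import numpy as np
from scipy import stats

rng = np.random.default_rng(seed=7)
data = rng.normal(loc=84.69, scale=7.913, size=70)  # invented sample

result = stats.anderson(data, dist="norm")
print("AD statistic:", round(result.statistic, 3))

# If the statistic is below the critical value at the 5% level,
# we fail to reject Normality.
idx_5pct = list(result.significance_level).index(5.0)
print("Reject Normality at 5%?", result.statistic > result.critical_values[idx_5pct])
```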
[Figure: Probability Plot of Amount (Normal): Mean 84.69, StDev 7.913, N 70, AD 0.265, P-Value 0.684. Percent (0.1 to 99.9) plotted against Amount (60 to 110).]
The graph shows the probability density of your data plotted against the expected density of a Normal curve. Notice that the y-axis (probability) does not increase linearly. Normal data will lie on a straight line (the red line) in this analysis. The graph shows you which values tend to deviate from the Normal curve.
Descriptive Statistics

Anderson-Darling Caveat

[Figure: Histogram and Normality Plot with Descriptive Statistics: Median 50.006, 3rd Quartile 53.218, Maximum 62.823; 95% Confidence Intervals: Mean 49.596 to 50.466, Median 49.663 to 50.500, StDev 4.662 to 5.278.]
In this case, both the Histogram and the Normality Plot look very “normal”. However,
because the sample size is so large, the Anderson-Darling test is very sensitive and any
slight deviation from normal will cause the p-value to be very low. Again, the topic of sensitivity will be covered in greater detail in the Analyze Phase.
For now, just assume that if N > 100 and the data look normal, then they probably are.
Answers:
1) Is Distribution A Normal? Answer > No
2) Is Distribution B Normal? Answer > No
Introduction to Graphing

Passive data collection means don't mess with the process! We are gathering data and looking for patterns in a graphical tool. If the data is questionable, so is the graph we create from it. For now utilize the data available; we will learn a tool called Measurement System Analysis later in this phase.

The purpose of Graphing is to:
• Identify potential relationships between variables.
• Identify risk in meeting the critical needs of the Customer, Business and People.
• Provide insight into the nature of the X's which may or may not control Y.
• Show the results of passive data collection.

In this section we will cover:
1. Box Plots
2. Scatter Plots
3. Dot Plots
4. Time Series Plots
5. Histograms
Data Sources

Data demographics will come out of the basic Measure Phase tools such as Process Maps, X-Y Diagrams, FMEAs and Fishbones. Put your focus on the top X's from the X-Y Diagram to focus your activities.

Data sources are suggested by many of the tools that have been covered so far:
– Process Map
– X-Y Matrix
– Fishbone Diagrams
– FMEA

Examples are:
1. Time: Shift, Day of the week, Week of the month, Season of the year
3. Operator: Training, Experience, Skill, Adherence to procedures
Graphical Concepts

The Histogram

A Histogram is a basic graphing tool that displays the relative frequency or the number of times a measured item falls within a certain cell size. The values for the measurements are shown on the horizontal axis (in …).

A Histogram displays data that have been summarized into intervals. It can be used to assess the symmetry or skewness of the data.

[Figure: Histogram of Histogram.]
Histogram Caveat

As you can see in the MINITAB™ file, the columns used to generate the Histograms above only have 20 data points. It is easy to create random samples for a Histogram simply by using "Sample From Columns…".

All the Histograms below were generated using random samples of the data from the worksheet "Graphing Data.mtw".

[Figure: Histogram of H1_20, H2_20, H3_20, H4_20: four panels of Frequency over the range 98 to 102.]

Be careful not to determine Normality simply from a Histogram plot; if the sample size is low the data may not look very Normal.
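The small-sample caveat is easy to demonstrate. This Python sketch (an illustration, not the MINITAB™ exercise) draws four random samples of 20 points from the same Normal distribution, reusing the panel names H1_20..H4_20 from the worksheet as labels, and bins them the way the panels above do.

```python
# Sketch of the Histogram caveat: random Normal samples of only 20 points
# often produce ragged, different-looking histograms, so shape alone is a
# poor Normality check at low sample sizes.
import numpy as np

rng = np.random.default_rng(seed=3)
bins = np.arange(98, 102.5, 0.5)  # similar scale to the H1_20..H4_20 panels

for name in ("H1_20", "H2_20", "H3_20", "H4_20"):
    sample = rng.normal(loc=100, scale=1, size=20)
    counts, _ = np.histogram(sample, bins=bins)
    print(name, counts)
# Each run of 20 points yields a noticeably different, lumpy shape.
```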
Variation on a Histogram
The
Histogram Using the worksheet “ Graphing Data.mtw” create a simple Histogram for
the data column called granular.
shown
here looks
to be very
Normal. Histogram of Granular
25
20
15
Frequency
10
0
44 46 48 50 52 54 56
Granular
Dot Plot

Using the worksheet "Graphing Data.mtw", create a Dot Plot. The Dot Plot can be a useful alternative to the Histogram, especially if you want to see individual values or you want to brush the data.

The Histogram for the Granular distribution obscures the granularity, whereas the Dot Plot reveals it. Also, Dot Plots allow the user to brush data points; the Histogram does not.

[Figure: Dotplot of Granular.]

If in fact there are special causes (Uncontrollable Noise or Procedural non-compliance) then they should be addressed separately and then excluded from this analysis.

Take a few minutes and create other Dot Plots using the columns in this data set.
Box Plot

A Box Plot (sometimes called a Whisker Plot) is made up of a box representing the central mass of the variation and thin lines, called whiskers, extending out on either side representing the thinning tails of the distribution. The first whisker represents the first 25% of the data in the Histogram (the light grey area). The second and third quartiles form the box, which represents fifty percent of the data, and finally the whisker on the right represents the fourth quartile. The line drawn through the box represents the Median of the data.

Box Plots summarize data about the shape, dispersion and center of the data and also help spot outliers. Box Plots require that one of the variables, X or Y, be categorical or discrete and the other be continuous. A minimum of 10 observations should be included in generating the Box Plot.

[Figure: Box Plot anatomy: Maximum Value, Upper Whisker, Q3: 75th Percentile, Q2: Median (50th Percentile), Q1: 25th Percentile, Lower Whisker; Upper Limit: Q3 + 1.5(Q3 − Q1); Lower Limit: Q1 − 1.5(Q3 − Q1).]

Extreme values, or outliers, are represented by asterisks. A value is considered an outlier if it is outside of the box (greater than Q3 or less than Q1) by more than 1.5 times (Q3 − Q1).

You can use the Box Plot to assess the symmetry of the data: if the data are fairly symmetric, the Median line will be roughly in the middle of the box and the whiskers will be similar in length. If the data are skewed, the Median may not fall in the middle of the box and one whisker will likely be noticeably longer than the other.
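The quartile and outlier arithmetic behind the Box Plot can be sketched directly. This Python illustration uses an invented data set; note that NumPy's default quartile interpolation may differ slightly from MINITAB™'s.

```python
# Sketch of Box Plot anatomy: quartiles, the 1.5 x IQR outlier fences, and
# the values that would plot as asterisks.
import numpy as np

data = np.array([43, 45, 46, 47, 48, 49, 50, 50, 51, 52, 53, 54, 56, 72.0])

q1, q2, q3 = np.percentile(data, [25, 50, 75])
iqr = q3 - q1                      # interquartile range, the height of the box
lower_fence = q1 - 1.5 * iqr       # Lower Limit: Q1 - 1.5(Q3 - Q1)
upper_fence = q3 + 1.5 * iqr       # Upper Limit: Q3 + 1.5(Q3 - Q1)
outliers = data[(data < lower_fence) | (data > upper_fence)]

print("Q1, Median, Q3:", q1, q2, q3)
print("Outliers (would plot as asterisks):", outliers)  # → [72.]
```

Here 72.0 falls above the upper fence, so it would be drawn as an asterisk beyond the whisker.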
[Figure: Box Plot of cholesterol by Day: 2-Day, 4-Day, 14-Day, with values roughly 100 to 300.]

Open the MINITAB™ Project "Measure Data Sets.mpj" and select the worksheet "Graphing Data.mtw".

The individual value plot shows the individual data points that are represented in the Box Plot.
Create the plot by following the menu path "Graph> Individual Value Plot> Multiple Y's, Simple…".

[Figure: Individual Value Plot: Data (5.0 to 12.5) for Brian, Greg, Shree.]
If the output is pass/fail, it must be plotted on the y-axis. Use the data shown to create the transposed Box Plot. The reason we do this is for consistency and accuracy.

The output Y is Pass/Fail; the Box Plot shows the spread of hydrogen content that created the results.

[Figure: Transposed Box Plot: Pass/Fail (1, 2) on the y-axis against Hydrogen Content (215.0 to 232.5).]
[Figure: Individual Value Plots of the Weibull, Normal and Bi Modal columns, Data 0 to 25, shown before and after jitter.]
Jitter Example

By using the Jitter function we will spread the data apart, making it easier to see how many data points there are.

Once your graph is created, click once on any of the data points (that action should select all the data points). Then go to MINITAB™ menu path: Editor> Edit Individual Symbols… Jitter… Increase the jitter in the x-direction to .075, click OK, then click anywhere on the graph except on the data points to see the results of the change.

[Figure: Individual Value Plot of Weibull, Normal, Bi Modal: Data 0 to 30 with jitter applied.]
Using the MINITAB™ worksheet "Graphing Data.mtw": Time Series Plots allow you to examine data over time. Depending on the shape and frequency of patterns in the plot, several X's can be found as critical or eliminated.

A Time Series Plot is created by following the MINITAB™ menu path "Graph> Time Series Plot> Simple...". It plots each data point as it is gathered over time. Some interesting occurrences can be revealed.

[Figure: Time Series Plot of Time 1: values roughly 597 to 602 over Index 1 to 100.]

MINITAB™ allows you to add a smoothed line to your time series based on a smoothing technique called Lowess.

[Figure: Time Series Plot of Time 3: values roughly 596 to 605 over Index 1 to 100.]
Notes
Measure Phase
Measurement System Analysis
Now we will continue in the Measure Phase with “Measurements System Analysis”.
Overview

Measurement System Analysis is one of those non-negotiable items! MSA is applicable in 98% of projects and it alone can have a massive effect on the success of your project and improvements within the company. In other words, LEARN IT & DO IT. It is very important.

Welcome to Measure
Process Discovery
Six Sigma Statistics
Measurement System Analysis
– Basics of MSA
– Variables MSA
– Attribute MSA
Process Capability
Wrap Up & Action Items
Introduction to MSA
So far we have learned that the heart and soul of Six Sigma is
that it is a data-driven methodology.
– How do you know that the data you have used is accurate and precise?
– How do you know if a measurement is repeatable and reproducible?

In order to improve your processes, it is necessary to collect data on the "critical to" characteristics. When there is variation in this data, it can be attributed either to the characteristic that is being measured or to the way that measurements are being taken; the latter is known as measurement error. When there is a large measurement error, it affects the data and may lead to inaccurate decision-making.
Measurement error is defined as the effect of all sources of measurement variability that cause an
observed value (measured value) to deviate from the true value.
There are several types of measurement error which affect the location and the spread of the
distribution. Accuracy, linearity and stability affect location (the average). Measurement accuracy
describes the difference between the observed average and the true average based on a master
reference value for the measurements. A linearity problem describes a change in accuracy
through the expected operating range of the measuring instrument. A stability problem suggests
that there is a lack of consistency in the measurement over time. Precision is the variability in the
measured value and is quantified like all variation by using the standard deviation of the
distribution of measurements. For estimating accuracy and precision, multiple measurements of
one single characteristic must be taken.
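The accuracy and precision estimates described above can be sketched from repeated measurements against a known reference. The numbers in this Python illustration are invented.

```python
# Sketch: estimating accuracy (bias) and precision from multiple
# measurements of one single characteristic against a master reference.
import statistics

reference = 10.00              # known master/standard value
measurements = [10.02, 9.98, 10.05, 10.01, 9.97, 10.03, 10.00, 10.04]

bias = statistics.mean(measurements) - reference   # accuracy: observed average vs. true
precision = statistics.stdev(measurements)         # precision: spread of the repeats

print(f"Bias: {bias:+.4f}")                # → +0.0125
print(f"Precision (StDev): {precision:.4f}")
```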
The primary contributors to measurement system error are repeatability and reproducibility.
Repeatability is the variation in measurements obtained by one individual measuring the same
characteristic on the same item with the same measuring instrument. Reproducibility refers to
the variation in the average of measurements of an identical characteristic taken by different
individuals using the same instrument.
Given that Reproducibility and Repeatability are important types of error, they are the object of a
specific study called a Gage Repeatability & Reproducibility study (Gage R&R). This study can be
performed on either attribute-based or variable-based measurement systems. It enables an
evaluation of the consistency in measurements among individuals after having at least two
individuals measure several parts at random on a few trials. If there are inconsistencies, then the
measurement system must be improved.
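A minimal numeric sketch of the two error types, with invented data for one part measured three times by each of two operators (a real Gage R&R uses several parts, operators and trials, and an ANOVA model):

```python
# Sketch separating Repeatability (within-operator variation) from
# Reproducibility (operator-to-operator variation in the averages).
import statistics

trials = {
    "Operator A": [5.01, 5.03, 4.99],
    "Operator B": [5.10, 5.12, 5.08],
}

# Repeatability: pooled within-operator standard deviation
within_vars = [statistics.variance(v) for v in trials.values()]
repeatability_sd = (sum(within_vars) / len(within_vars)) ** 0.5

# Reproducibility: spread of the operator averages
operator_means = [statistics.mean(v) for v in trials.values()]
reproducibility_sd = statistics.stdev(operator_means)

print("Repeatability SD: ", round(repeatability_sd, 4))
print("Reproducibility SD:", round(reproducibility_sd, 4))
```

Here each operator repeats tightly (small repeatability) but their averages disagree, so reproducibility dominates: the measurement system, not the part, differs between people.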
Measurement Purpose

Measurement is a process within itself. In order to measure something you must go through a series of tasks and activities in sequence. Usually there is some form of set-up, there is an instrument that makes the measurement, there is a way of recording the value and it may be done by multiple people. Even when you are making a judgment call about something, there is some form of setup. You become the instrument and the result of a decision is recorded some way, even if it is verbal or it is a set of actions that you take.

In order to be worth collecting, measurements must provide value - that is, they must provide us with information and ultimately, knowledge.

The question… What do I need to know? …must be answered before we begin to consider issues of measurements, metrics, statistics, or data collection systems.

Too often, organizations build complex data collection and information management systems without truly understanding how the data collected and metrics calculated actually benefit the organization.

The types and sophistication of measurement vary almost infinitely. It is becoming increasingly popular or cost effective to have computerized measurement systems. The quality of measurements also varies significantly - with those taken by computer tending to be the best. In some cases the quality of measurement is so bad that you would be just as well off to guess at what the outcome should be. You will be primarily concerned with the accuracy, precision and reproducibility of measurements to determine the usability of the data.
Purpose
The purpose of
conducting an MSA is The purpose of MSA is to assess the error due to
to mathematically measurement systems.
partition sources of
The error can be partitioned into specific sources:
variation within the
measurement system – Precision
itself. This allows us • Repeatability - within an operator or piece of equipment
to create an action • Reproducibility - operator to operator or attribute gage to
plan to reduce the attribute gage
biggest contributors of – Accuracy
measurement error. • Stability - accuracy over time
• Linearity - accuracy throughout the measurement range
• Resolution
• Bias – Off-set from true value
– Constant Bias
– Variable Bias – typically seen with electronic
equipment, amount of Bias changes with setting
levels
Measurement systems, like all things, generate some amount of variation in the results/data they output. In measuring, we are primarily concerned with 3 characteristics:

1. How accurate is the measurement? For a repeated measurement, where is the average compared to some known standard? Think of the target as the measurement system; the known standard is the bulls-eye in the center of the target. In the first example you can see the "measurements" are very dispersed; there is a lot of variability, as indicated by the Histogram curve at the bottom. But on average, the "measurements" are on target. When the average is on target, we say the measurement is accurate. However, in this example they are not very precise.

[Figure: Two targets. "Accurate but not precise": on average, the shots are in the center of the target but there is a lot of variability. "Precise but not accurate": the average is not on the center, but the variability is small.]
3. The third characteristic is how reproducible is the measurement from one individual to another? What is the accuracy and precision from person to person? Here you would expect each person that performs the measurement to be able to reproduce the same amount of accuracy and precision as that of another person performing the same measurement.
Ultimately, we make decisions based on data collected from measurement systems. If the
measurement system does not generate accurate or precise enough data, we will make the decisions
that generate errors, waste and cost. When solving a problem or optimizing a process, we must know
how good our data are and the only way to do this is to perform a Measurement System Analysis.
MSA Uses
MSA can be used to:
The measurement system always has some amount of variation and that variation is additive to
the actual amount of true variation that exists in what we are measuring. The only exception is
when the discrimination of the measurement system is so poor that it virtually sees everything the
same.
This means that you may actually be producing a better product or service than you think you are,
providing that the measurement system is accurate; meaning it does not have a bias, linearity or
stability problem. It may also mean that your customer may be making the wrong interpretations
about your product or service.
The components of variation are statistically additive. The primary contributors to measurement
system error are Repeatability and Reproducibility. Repeatability is the variation in measurements
obtained by one individual measuring the same characteristic on the same item with the same
measuring instrument. Reproducibility refers to the variation in the average of measurements of an
identical characteristic taken by different individuals using the same instrument.
Why MSA?

Why is MSA so important? MSA is what allows us to trust the data generated from our processes. When you charter a project you are taking on a significant burden which will require Statistical Analysis. What happens if you have a great project, with lots of data from measurement systems that produce data with no integrity?

Measurement System Analysis is important to:
• Study the % of variation in our process that is caused by our measurement system.
• Compare measurements between operators.
• Compare measurements between two (or more) measurement devices.
• Provide criteria to accept new measurement systems (consider new equipment).
• Evaluate a suspect gage.
• Evaluate a gage before and after repair.
• Determine true process variation.
• Evaluate effectiveness of training program.
Appropriate Measures

Sufficient means that measures are available to be measured regularly; if not, it would take too long to gather data. Relevant means that they will help to understand and isolate the problems. Representative measures mean that we can detect variation across shifts and people. Contextual means they are gathered along with other relevant information that would actually help to explain sources of variation.

Appropriate Measures are:
• Sufficient – available to be measured regularly
• Relevant – help to understand/isolate the problems
• Representative – of the process across shifts and people
• Contextual – collected with other relevant information that might explain process variability
Poor Measures

It is very common while working projects to discover that the current measurement systems are poor. Have you ever come across a situation where the data from your customer or supplier doesn't match yours? It happens often. It is likely a problem with one of the measurement systems. We have worked MSA projects across critical measurement points in various companies; it is not uncommon for more than 80% of the measurements to fail in one way or another.

Poor Measures can result from:
• Poor or non-existent operational definitions
• Difficult measures
• Poor sampling
• Lack of understanding of the definitions
• Inaccurate, insufficient or non-calibrated measurement devices

Measurement Error compromises decisions that affect:
– Customers
– Producers
– Suppliers

MSA is a Show Stopper!!!
Components of Variation

Precision: Repeatability, Reproducibility. Accuracy: Stability, Bias, Linearity.
All measurement systems have error. If you don’t know how much of the
variation you observe is contributed by your measurement system, you
cannot make confident decisions.
We are going to strive to have the measured variation be as close as possible to the true variation. In any case we want the variation from the measurement system to be as small as possible. We are now going to investigate the various components of variation of measurements.
Precision

The spread of the data is measured by Precision. This tells us how well a measure can be repeated and reproduced.

A precise metric is one that returns the same value of a given attribute every time an estimate is made. Precise data are independent of who estimates them or when the estimate is made.
Repeatability
Measurements will be Repea ta bility is the variation in measurements obtained with one
different…expect it! If mea surement instrument used several times by one appraiser
measurement are while measuring the identical characteristic on the sa m e pa rt.
always exactly the
same this is a flag,
sometimes it is Y
because the gauge
does not have the
proper resolution,
meaning the scale
doesn’t go down far Repeatability
enough to get any For example:
variation in the – Manufacturing: One person measures the purity of multiple samples
measurement. of the same vial and gets different purity measures.
– Transactional: One person evaluates a contract multiple times (over a
For example, would
period of time) and makes different determinations of errors.
you use a football field
to measure the gap in a
spark plug?
Reproducibility

Reproducibility will be present when it is possible to have more than one operator or more than one instrument measure the same part.

Reproducibility is the variation in the average of the measurements made by different appraisers using the same measuring instrument when measuring the identical characteristic on the same part.

For example:
– Manufacturing: Different people perform a purity test on samples from the same vial and get different results.
– Transactional: Different people evaluate the same contract and make different determinations.
1. Pair up with an associate.
2. One person will say start and stop to indicate how long they think 10 seconds lasts. Do this 6 times.
3. The other person will have a watch with a second hand to actually measure the duration of the estimate. Record the value where your partner can't see it.
4. Switch tasks with your partner and do it 6 times also.
5. Record all estimates. What do you notice?
Accuracy

Accuracy and the average are related. Recall in the Basic Statistics module we talked about the Mean and the variance of a distribution. Think of it this way: if the Measurement System is the distribution, then accuracy is the Mean and precision is the variance.

Accuracy is the difference between the observed average of the measurement and a reference value.
– When a metric or measurement system consistently over or under estimates the value of an attribute, it is said to be "inaccurate".

Accuracy can be assessed in several ways:
– Measurement of a known standard
– Comparison with another known measurement method
– Prediction of a theoretical value

What happens if we don't have standards, comparisons or theories? Warning: do not assume your metrology reference is gospel.

[Figure: True Average vs. the Measurement distribution; the gap between them is the Accuracy.]

However, before you invest a lot of time analyzing the data, you must ensure the data has integrity.
– The analysis should include a comparison with known reference points.
– For the example of product returns, the transaction details should add up to the same number that appears on financial reports, such as the income statement.
[Figure: ACCURATE + PRECISE = BOTH.]
Bias

Bias is a component of Accuracy. Constant Bias is when the measurement is off by a constant value. A scale is a perfect example: if the scale reads 3 lbs when there is no weight on it, then there is a 3 lb Bias. Make sense?
Stability

Linearity

[Figure: Bias (y) plotted against Reference Value (x), fitted with y = a + b·x, where y is the Bias, x is the Reference Value, a is the Intercept and b is the Slope.]

Linearity just evaluates whether any Bias is consistent throughout the measurement range of the instrument. Many times Linearity indicates a need to replace or perform maintenance on measurement equipment.
Types of MSA's

Variable Data is always preferred over Attribute because it gives us more to work with. Now we are going to review Variable MSA testing.

MSA's fall into two categories: Attribute and Variable.

Attribute:
– Pass/Fail
– Go/No Go
– Document Preparation
– Surface imperfections
– Customer Service Response

Variable:
– Continuous scale
– Discrete scale
– Critical dimensions
– Pull strength
– Warp
Variable MSA's

MSA's use a random effects model, meaning that the levels for the variance components are not fixed or assigned; they are assumed to be random.

MINITAB™ calculates a column of variance components (VarComp) which are used to calculate % Gage R&R using the ANOVA Method.

[Figure: Measured Value vs. True Value.]

Estimates for a Gage R&R study are obtained by calculating the variance components for each term and for error. Repeatability, Operator and Operator*Part components are summed to obtain a total variability due to the measuring system.

We use variance components to assess the variation contributed by each source of measurement error relative to the total variation.
% Contribution is the contribution of each source of variation to the total variation of the study. % Contribution, based on variance components, is calculated by dividing each value in VarComp by the Total Variation then multiplying the result by 100.

Use % Study Var when you are interested in comparing the measurement system variation to the total variation. % Study Var is calculated by dividing each value in Study Var by Total Variation and multiplying by 100. Study Var is calculated as 5.15 times the Standard Deviation for each source. (5.15 is used because when data are normally distributed, 99% of the data fall within 5.15 Standard Deviations.)
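The arithmetic behind these two report columns can be sketched from the variance components. The VarComp values in this Python illustration are invented, not taken from the course's MINITAB™ output.

```python
# Sketch of how %Contribution and %Study Var are derived from variance
# components. Total Variation = Gage R&R variance + Part-to-Part variance.
var_comp = {
    "Total Gage R&R": 0.0044,
    "Part-to-Part": 0.0356,
}
total_variation = sum(var_comp.values())

for source, vc in var_comp.items():
    pct_contribution = 100.0 * vc / total_variation          # ratio of variances
    study_var = 5.15 * vc ** 0.5                             # 5.15 StDevs span ~99% of a Normal
    pct_study_var = 100.0 * vc ** 0.5 / total_variation ** 0.5  # ratio of StDevs
    print(f"{source:15s} %Contribution={pct_contribution:5.1f}  "
          f"StudyVar={study_var:.4f}  %StudyVar={pct_study_var:5.1f}")
```

Note that %Contribution uses variances (so the sources add to 100%) while %Study Var uses standard deviations (so they do not).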
When the process tolerance is entered in the system, MINITAB™ calculates % Tolerance, which compares measurement system variation to the customer specification. This allows us to determine the proportion of the process tolerance that is used by the variation in the measurement system.
Recommended: 5 or more Categories.

AIAG Standards for Gage Acceptance
Gage R&R Repeat Reprod Part to Part
Part-to-Part Part 1 2 3 4 5 6 7 8 9 10
UCL=0.005936
measurement
measurementsystem systeminto intospecific
specificsources.
sources. Each
Eachcluster
cluster
0.005
ofofbars
bars represents a source of variation. Bydefault,
0.625 represents a source of variation. By default,
R=0.001817
each
each cluster will have two bars, corresponding to
0.000 LCL=0 0.620 cluster will have two bars, corresponding to
0 %Contribution
%Contribution
Operator 1 and
and%StudyVar.
%StudyVar.
2 3 If you add a tolerance
If you add a tolerance
Xbar Chart by Operator and/
and/ ororhistorical sigma,
Operator*Part
historical sigma, bars
Interaction
bars for
for %% Tolerance
Toleranceand/
Operator and/oror
0.632 1 2 3
0.631
UCL=0.6316
%Process
0.631
%Process
0.630 are
areadded.
added.
1
Mean
0.630 2
0.629
age
0.629 3
Sample M
Mean=0 6282
Mean=0.6282 0 628
0.628
Avera
0 628
0.628
0.627
0.626
InInaa good
goodmeasurement
0.627
0.626 measurementsystem,
system,thethelargest
largestcomponent
component
0.625
0.624
LCL=0.6248
ofofvariation
variation is Part-to-Part variation. Ifinstead
0.625
0.624
is Part-to-Part variation. If insteadyou
youhave
have
0
large
largeamounts
Part
amountsofofvariation
1 2 3 4
variationattributed
5 6 7 8
attributedtotoGage
9 10
GageR&R,
R&R,then
then
corrective
correctiveaction
actionisisneeded.
needed.
%Tolerance
Percen
50 0.625
0.620
0
MIN ITABTMTMprovides an R Chart and Xbar Chart by Operator.
Gage R&R Repeat Reprod Part-to-Part MIN ITAB
Part 1 2 provides
3 4 5 an 6 R7 Chart
8 9 and
10 Xbar Chart by Operator.
The
TheRRchart
chartconsists
consistsofofthe
thefollowing:
following:
R Chart by Operator By Operator
0.010 1 2 3
- The plotted points are the difference between the largest
0.630
- The plotted points are the difference between the largest
Sample Range
UCL=0.005936 and
andsmallest
smallestmeasurements
measurementson oneach
eachpart
partfor
foreach
eachoperator.
operator.
0.005
If the measurements are the same then the range = 0.
0.625
If the measurements are the same then the range = 0.
R=0.001817 - The Center Line, is the grand average for the process.
- The Center Line, is the grand average for the process.
0.000 LCL=0 - -The
0.620 TheControl
ControlLimits
Limitsrepresent
representthetheamount
amountofofvariation
variation
0 expected
Operator 1 for the subgroup
2 ranges
ranges. 3These limits are calculated
expected for the subgroup ranges. These limits are calculated
Xbar Chart by Operator using the variation within subgroups.
using the Operator*Part Interaction
variation within subgroups. Operator
0.632 1 2 3
UCL=0.6316 0.631 1
0.631
If any of the points on the graph go above 2the upper Control
0.630
Sample Mean
0.630 If any of the points on the graph go above3 the upper Control
0.629
The Xbar Chart compares the part-to-part variation to repeatability. The Xbar chart consists of the following:

[Xbar Chart by Operator: UCL = 0.6316, Mean = 0.6282, LCL = 0.6248]

The chart should ideally show lack-of-control. Lack-of-control exists when many points are above the Upper Control Limit and/or below the Lower Control Limit.

In this case there are only a few points out of control, which indicates the measurement system is inadequate.
Ideally, the lines will follow the same pattern and the part averages will vary enough that differences between parts are clear.

Pattern: Lines are virtually identical. Means: Operators are measuring the parts the same.
Pattern: Lines are not parallel or they cross. Means: The operator's ability to measure a part depends on which part is being measured.
Practical Conclusions
For this example, the measuring system contributes a great deal to the overall variation,
as confirmed by both the Gage R&R table and graphs.
The variation due to the measurement system, as a percent of study variation, is causing 92.21% of the variation seen in the process.
By AIAG Standards this gage should not be used. By all standards, the
data being produced by this gage is not valid for analysis.
[Acceptance table: % Tolerance or % Contribution and % Study Variance thresholds indicating whether the measurement system is acceptable]
Design Types

Crossed Designs are the workhorse of MSA. They are the most commonly used design in industries where it is possible to measure something more than once. Chemical and biological systems can use Crossed Designs as long as you can assume that the samples used come from a homogeneous solution and there is no reason they can be different.

Crossed Design
• A crossed design is used only in non-destructive testing and assumes that all the parts can be measured multiple times by either operators or multiple machines.
– Gives the ability to separate part-to-part variation from measurement system variation.
– Assesses repeatability and reproducibility.
– Assesses the interaction between the operator and the part.

Nested Design
• A nested design is used for destructive testing (we will learn about this in MBB training) and also situations where it is not possible to have all operators or machines measure all the parts multiple times.
– Destructive testing assumes that all the parts within a single batch are identical enough to claim they are the same.
– Nested designs are used to test measurement systems where it is not possible (or desirable) to send operators with parts to different locations.
– Do not include all possible combinations of factors.
– Uses a slightly different mathematical model than the crossed design.
Nested Designs must be used for destructive testing. In a Nested Design, each part is measured by only one operator. This is due to the fact that after destructive testing, the measured characteristic is different after the measurement process than it was at the beginning. Crash testing is an example of destructive testing.

If you need to use destructive testing, you must be able to assume that all parts within a single batch are identical enough to claim that they are the same part. If you are unable to make that assumption then part-to-part variation within a batch will mask the measurement system variation.

If you can make that assumption, then choosing between a Crossed or Nested Gage R&R Study for destructive testing depends on how your measurement process is set up. If all operators measure parts from each batch, then use Gage R&R Study (Crossed). If each batch is only measured by a single operator, then you must use Gage R&R Study (Nested). In fact, whenever operators measure unique parts, you have a Nested Design. Your Master Black Belt can assist you with the set-up of your design.
A Gage R&R, like any study, requires careful planning. The common way of doing an Attribute Gage R&R consists of having at least two people measure 20 parts at random, twice each. This will enable you to determine how consistently these people evaluate a set of samples against a known standard. If there is no consistency among the people, then the measurement system must be improved, either by defining a measurement method, training, etc. You use an Excel spreadsheet template to record your study and then to perform the calculations for the result of the study.

Gage R&R Study
– Is a set of trials conducted to assess the repeatability and reproducibility of the measurement system.
– Multiple people measure the same characteristic of the same set of multiple units multiple times (a crossed study).
– Example: 10 units are measured by 3 people. These units are then randomized and a second measure on each unit is taken.

A Blind Study is extremely desirable.
– Best scenario: operator does not know the measurement is a part of a test.
– At minimum: operators should not know which of the test parts they are currently measuring.

NO, not that kind of R&R!
The next few slides show how to create a data collection table in MINITAB™. You can also use Excel.
Here is the completed table. The trial column will not be used for the analysis and can actually be deleted.

Variables:
– Part
– Operator
– Response
Gage R&R
Graphical Output

Looking at the “Components of Variation” chart, the Part-to-Part Variation needs to be larger than Gage Variation.

If in the “Components of Variation” chart the “Gage R&R” bars are larger than the “Part-to-Part” bars, then all your measurement variation is in the measuring tool, i.e. maybe the gage needs to be replaced. The same concept applies to the “Response by Operator” chart. If there is extreme variation within operators, then the training of the operators is suspect.
[Chart callouts: Part-to-Part Variation needs to be larger than Gage Variation; Operator Error]
Session Window
This output tells us that the part to part variation exceeds the allowable tolerance. This gage is
acceptable.
Signal Averaging
Suppose the Standard Deviation for one part measured by one person
many times is 9.5.
Here we have a problem with Repeatability, not Reproducibility so we calculate what the Standard
Deviation should be in order to meet our desire of a 15% gage.
We are assuming that 15% will be acceptable for the short term until an appropriate fix can be
implemented. The 9.5 represents our estimate for Standard Deviation of population of Repeatability.
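The signal-averaging arithmetic can be sketched as follows: averaging n independent readings cuts the repeatability Standard Deviation by a factor of the square root of n, so the number of repeats needed is the squared ratio of current to target SD. The target value below is hypothetical for illustration; the slide's actual tolerance-based 15% target is not reproduced here.

```python
import math

def averages_needed(sigma_single: float, sigma_target: float) -> int:
    """Repeat measurements to average so that the standard deviation
    of the averaged reading, sigma_single / sqrt(n), meets sigma_target."""
    return math.ceil((sigma_single / sigma_target) ** 2)

sigma_single = 9.5   # repeatability SD of a single reading, from the example
sigma_target = 4.75  # hypothetical target SD, for illustration only

print(averages_needed(sigma_single, sigma_target))  # 4: averaging 4 readings halves the SD
```

The square-root relationship is why signal averaging is only a short-term fix: cutting the SD by a factor of 10 would require 100 repeat measurements per part.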
Attribute MSA

An Attribute MSA is similar in many ways to the continuous MSA, including the purposes. Do you have any visual inspections in your processes? In your experience how effective have they been?

When a Continuous MSA is not possible an Attribute MSA can be performed to evaluate the quality of the data being reported from the process.

Why not? Does everyone know what an “F” (defect) looks like? Was the lighting good in the room? Was it quiet so you could concentrate? Was the writing clear? Was 60 seconds long enough?

This is the nature of visual inspections! How many places in your process do you have visual inspection? How good do you expect them to be?
SCORING REPORT
DATE: 5/10/2006   NAME: Joe Smith   PRODUCT: My Gadget   BUSINESS: Unit 1
Attribute Legend (used in computations): 1 = pass, 2 = fail

Sample # | Known Attribute | Op1 Try1/Try2 | Op2 Try1/Try2 | Op3 Try1/Try2 | All operators agree within and between (Y/N) | All operators agree with standard (Y/N)
1  | pass | pass/pass | pass/pass | fail/fail | N | N
2  | pass | pass/pass | pass/pass | fail/fail | N | N
3  | fail | fail/fail | fail/pass | fail/fail | N | N
4  | fail | fail/fail | fail/fail | fail/fail | Y | Y
5  | fail | fail/fail | pass/fail | fail/fail | N | N
6  | pass | pass/pass | pass/pass | pass/pass | Y | Y
7  | pass | fail/fail | fail/fail | fail/fail | Y | N
8  | pass | pass/pass | pass/pass | pass/pass | Y | Y
9  | fail | pass/pass | pass/pass | pass/pass | Y | N
10 | fail | pass/pass | fail/fail | fail/fail | N | N
11 | pass | pass/pass | pass/pass | pass/pass | Y | Y
12 | pass | pass/pass | pass/pass | pass/pass | Y | Y
In order to conduct an Attribute Gage R&R first select a set of samples. These samples should be
a mix of clearly Good/Pass, clearly Bad/Fail and Marginal so we can test an operator’s ability
across different types of attributes.
For each sample an attribute or true status of the part should be documented by an expert or team
of experts; these people have to be different than the operators who will do the study. Each operator should assign a Pass or Fail to each part on two or three separate occasions.
The requirements for any sort of confidence with Attribute Data are big. Start with 50 samples; that should give you enough data. Using fewer will realistically just make things worse.
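The two agreement columns in the scoring report can be recomputed directly. This is a sketch in Python; the sample data is transcribed from the table above and the helper name `scores` is illustrative.

```python
# Sample data transcribed from the scoring report: the known attribute,
# then (Try 1, Try 2) for Operators 1, 2 and 3.
samples = [
    ("pass", [("pass", "pass"), ("pass", "pass"), ("fail", "fail")]),   # 1
    ("pass", [("pass", "pass"), ("pass", "pass"), ("fail", "fail")]),   # 2
    ("fail", [("fail", "fail"), ("fail", "pass"), ("fail", "fail")]),   # 3
    ("fail", [("fail", "fail"), ("fail", "fail"), ("fail", "fail")]),   # 4
    ("fail", [("fail", "fail"), ("pass", "fail"), ("fail", "fail")]),   # 5
    ("pass", [("pass", "pass"), ("pass", "pass"), ("pass", "pass")]),   # 6
    ("pass", [("fail", "fail"), ("fail", "fail"), ("fail", "fail")]),   # 7
    ("pass", [("pass", "pass"), ("pass", "pass"), ("pass", "pass")]),   # 8
    ("fail", [("pass", "pass"), ("pass", "pass"), ("pass", "pass")]),   # 9
    ("fail", [("pass", "pass"), ("fail", "fail"), ("fail", "fail")]),   # 10
    ("pass", [("pass", "pass"), ("pass", "pass"), ("pass", "pass")]),   # 11
    ("pass", [("pass", "pass"), ("pass", "pass"), ("pass", "pass")]),   # 12
]

def scores(samples):
    """Fraction of samples where all trials of all operators agree,
    and where they additionally agree with the known standard."""
    between = sum(1 for _, ops in samples
                  if len({t for trials in ops for t in trials}) == 1)
    standard = sum(1 for truth, ops in samples
                   if {t for trials in ops for t in trials} == {truth})
    n = len(samples)
    return between / n, standard / n

between, standard = scores(samples)
print(f"{between:.1%} agree within and between; {standard:.1%} agree with standard")
```

Running this reproduces the Y/N columns in aggregate: 7 of 12 samples (58.3%) have all operators agreeing with each other, and only 5 of 12 (41.7%) also agree with the standard.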
Repeatability and Reproducibility

[Graphic: appraiser scores plotted against the ACTUAL RANGE]

The green triangle represents the actual score of the appraiser. The range between the red squares is the Confidence Interval, which is a function of the operator's score and the size of the sample they have inspected.
Statistical Report
M&M Exercise

• Pick 50 M&Ms out of a package.
• Enter results into either the Excel template or MINITAB™ and draw conclusions.
• The instructor will represent the customer for the attribute score.

[Sample photos: 2 M&M Fail, 3 M&M Pass]
To complete this study you will need a bag of M&Ms containing 50 or more “pieces” and the Attribute Value for each piece, which means the “True” value for each piece. In addition to being the facilitator of this study you will also serve as the customer, so you will have the say as to whether the piece is actually a Pass or Fail piece. Determine this before the inspectors review the pieces. You will need to construct a sheet as shown here to keep track of the “pieces” or “parts”, in our case M&Ms; it is important to be well organized during these activities. Then the inspectors will individually judge each piece based on the customer specifications of a bright and shiny M&M with nice M's.
Notes
Measure Phase
Process Capability
Overview

• Continuous Capability
• Concept of Stability
• Attribute Capability
• Wrap Up & Action Items
Process Capability: This is the definition of Process Capability. We will now begin to learn how to assess it.
Capability Analysis provides you with a quantitative assessment of your process's ability to meet the requirements. Analysis is traditionally performed on the outputs of a process, in other words comparing the Voice of the Process to the Voice of the Customer.

[Diagram: Y = f(X), the process function. The X's (inputs) feed the process and the Y's (outputs) are measured; variation in the data is the “Voice of the Process” and the requirements are the “Voice of the Customer”. Critical X's are any variables which exert an undue influence on the outputs. Example data shown with LSL = 9.96 and USL = 10.44.]
You will learn in the lesson how the output variation width of a given process output compares with the specification width established for that output. This ratio, the output variation width divided by the specification width, is what is known as capability.

Since the specification is an essential part of this assessment, a rigorous understanding of the validity of the specification is vitally important; it also has to be accurate. This is why it is important to perform a RUMBA type analysis on process inputs and outputs.
If the process variation is larger than the difference between the upper spec limit minus the lower spec limit, our product or service output will always produce defects; it will not be capable of meeting the customer or process output requirements.

[Diagram: a capable and on-target process between LSL and USL; improvement arrows labeled “Center process” and “Reduce spread”; Target marked.]
As you have learned, variation exists in everything. There will always be variability in every process
output. You can’t eliminate it completely, but you can minimize it and control it. You can tolerate
variability if the variability is relatively small compared to the requirements and the process
demonstrates long-term stability, in other words the variability is predictable and the process
performance is on target meaning the average value is near the middle value of the requirements.
The output from a process is either: capable or not capable, centered or not centered. The degree of
capability and/or centering determines the number of defects generated. If the process is not
capable, you must find a way to reduce the variation.
And if it is not centered, it is obvious that you must find a way to shift the performance. But what do
you do if it is both incapable and not centered? It depends, but most of the time you must minimize
and get control of the variation first; this is because high variation creates high uncertainty, and you can't be sure if your efforts to move the average are valid or not. Of course, if it is just a simple adjustment to shift the average to where you want it, you would do that before addressing the variation.
Problem Solving Options – Shift the Mean

This involves finding the variables that will shift the process over to the target. This is usually the easiest option.

Our effort in a Six Sigma project that is examining a process performing at a level less than desired is to shift the Mean of performance such that all outputs are within an acceptable range.

[Diagram: LSL, USL, distribution shifted toward the target]
Move the specification limits – Obviously this implies making them wider, not narrower. Customers usually do not go for this option, but if they do…it's the easiest!

[Diagram: LSL, USL, Move Spec]
Capability Studies

Steps to Capability:
#1 Verify Customer Requirements
#2 Validate Specification Limits
#3 Collect Sample Data
#4 Determine Data Type (LT or ST)
#5 Check data for Normality
#6 Calculate Z-Score, PPM, Yield, Capability (Cp, Cpk, Pp, Ppk)
#7
Specifications must be verified before completing the Capability Analysis. It doesn't mean that you will be able to change them, but on occasion some internal specifications have been made much tighter than the customer wants.

Questions to consider:
• What is the source of the specifications?
– Customer requirements (VOC)
– Business requirements (target, benchmark)
– Compliance requirements (regulations)
– Design requirements (blueprint, system)
• Are they current? Likely to change?
• Are they understood and agreed upon?
– Operational definitions
– Deployed to the work force
Data Collection

You must know how the data was collected from the process. Capability Studies should include “all” observations (100% sampling) for a specified period.
Fill Q
Each lot is sampled as it leaves the manufacturing facility on its way to the warehouse. The results
are represented by the graphic where you see the performance data on a lot by lot basis for the
amount of fill based on the samples that were taken. Each lot has its own variability and average as
shown. The variability actually looks reasonable and we notice that the average from lot to lot is
varying as well.
What the customer eventually experiences in the amount of fluid in each bottle is the value across the
full variability of all the lots. It can now be seen and stated that the long-term variability will always be
greater than the short-term variability.
Baseline Performance
As an example, imagine you reported the process performance Baseline was based on distribution 3
in the graphic, you would mislead yourself and others that the process had excellent on target
performance. If you used distribution 2, you would be led to believe that the average performance was
near the USL and that most of the output of the process was above the spec limit. To resolve these
potential problems, it is important to always use long-term data to report the Baseline.
How do you know if the data you have is short or long-term? Here are some guidelines. A somewhat
technical interpretation of long-term data is that the process has had the opportunity to experience
most of the sources of variation that can impact it. Remembering the outputs are a function of the
inputs, what we are saying is that most of the combinations of the inputs, each with their full range of variation, have been experienced by the process. You may use these situations as guidelines.
Long-term data is a “video” of process performance and is characterized by these types of conditions:
• Many shifts
• Many batches
• Many employees
• Many services and lines
• Many suppliers
Long-term variation is larger than short-term variation because of : material differences, fluctuations in
temperature and humidity, different people performing the work, multiple suppliers providing
materials, equipment wear, etc.
As a general rule, short-term data consist of 20 to 30 data points over a relatively short period of time
and long-term data consist of 100 to 200 data points over an extended period of time.
While we have used a manufacturing example to explain all this, it is exactly the same for a service or
administrative type of process. In these types of processes, there are still different people, different
shifts, different workloads, differences in the way inputs come into the process, different software,
computers, temperatures, etc. The same exact concepts and rules apply.
You should now appreciate why, when we report process performance, we need to know what the data
is representative of. Using such data we will now demonstrate how to calculate process capability and
then we will show how it is used.
Components of Variation
In general one or more months of data are probably more long-term than short-term; two weeks or
less is probably more like short-term data.
[Scatter plot: individual measurements plotted over Time]
Stability

A Stable Process is consistent over time. Time Series Plots and Control Charts are the typical graphs used to determine stability.

Stability is established by plotting data in a Time Series Plot or in a Control Chart. If the data used in the Control Chart goes out of control, the data is not stable.

At this point in the Measure Phase there is no reason to assume the process is stable. Performing a capability study at this point effectively draws a line in the sand. If however, the process is stable, short-term data provides a more reliable estimate of true process capability.

[Time Series Plot of PC Data: 480 observations ranging roughly from 30 to 70]
Looking at the Time Series Plot shown on this slide, where would you look to determine the
entitlement of this process?
Measures of Capability

Mathematically Cpk and Ppk are the same and Cp and Pp are the same. The only difference is the source of the data, Short-term and Long-term, respectively.

– Cp and Pp: “Hope”
• What is Possible if your process is perfectly Centered
• The Best your process can be
• Process Potential (Entitlement)
Capability Formulas

[Formula graphic referencing the Sample Mean]

Note: Consider the “K” value the penalty for being off center.
LSL – Lower specification limit
USL – Upper specification limit
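The slide's formula graphic is not reproduced above, so as a sketch, the standard textbook definitions can be written out in code. The numeric inputs below are illustrative, not from the slide; note the identity Cpk = Cp * (1 - k), which is where the “K” penalty enters.

```python
def cp(usl, lsl, sigma):
    # Potential capability: spec width divided by the 6-sigma process width
    return (usl - lsl) / (6 * sigma)

def k(usl, lsl, mean, target=None):
    # Penalty for being off center; target defaults to mid-spec
    if target is None:
        target = (usl + lsl) / 2
    return abs(target - mean) / ((usl - lsl) / 2)

def cpk(usl, lsl, mean, sigma):
    # Actual capability: worst-case distance from the mean to a spec
    # limit in 3-sigma units; equals cp * (1 - k) for a mid-spec target
    return min((usl - mean) / (3 * sigma), (mean - lsl) / (3 * sigma))

# Illustrative values only (not from the slide)
USL, LSL, mean, sigma = 602, 598, 599, 0.5
print(round(cp(USL, LSL, sigma), 2))         # 1.33
print(round(cpk(USL, LSL, mean, sigma), 2))  # 0.67
```

Pp and Ppk use exactly the same arithmetic with the long-term (overall) standard deviation in place of the short-term (within) one.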
MINITAB™ Example

At this point in time we are only attempting to get a Baseline number that we can compare to at the end of problem solving. We are not using it to predict quality; we want to get a snapshot. DO NOT try and make your process STABLE BEFORE working on it! Your process is a project because there is something wrong with it, so go figure it out; don't bother playing around with stability.

Create a Capability Analysis for both suppliers; assume long-term data. Note the subgroup size for this example is 5. LSL = 598, USL = 602.
599.548 is the process Mean, which falls short of the target (600) for Supplier 1, and the left tail of the distribution falls outside the lower specification limit. From a practical standpoint what does this mean? You will have camshafts that do not meet the lower specification of 598 mm.

Process Capability of Supplier 1
  LSL = 598   Target = *   USL = 602
  Sample Mean = 599.115   Sample N = 100
  StDev (Within) = 0.559239   StDev (Overall) = 0.604106
  Potential (Within) Capability: Cp = 1.19, CPL = 0.66, CPU = 1.72, Cpk = 0.66
  Overall Capability: Pp = 1.10, PPL = 0.62, PPU = 1.59, Ppk = 0.62, Cpm = *
  Observed Performance: PPM < LSL = 30000.00, PPM > USL = 0.00, PPM Total = 30000.00
  Exp. Within Performance: PPM < LSL = 23088.05, PPM > USL = 0.12, PPM Total = 23088.18
  Exp. Overall Performance: PPM < LSL = 32467.79, PPM > USL = 0.90, PPM Total = 32468.68

Next we look at the Cp index. This tells us if we will produce units within the tolerance limits. Supplier 1's Cpk index is 0.66, which tells us they need to reduce the process variation and work on centering.
600.061 is the process Mean for Supplier 2 and is very close to the target, although both tails of the distribution fall outside of the specification limits. The Cpk index is very similar to Supplier 1, but this infers that we need to work on reducing variation. When making a comparison between Supplier 1 and 2 relative to Cpk vs Ppk we see that Supplier 2's process is more prone to shifting over time. That could be a risk to be concerned about.

Process Capability of Supplier 2
  LSL = 598   Target = *   USL = 602
  Sample Mean = 600.061   Sample N = 100
  StDev (Within) = 1.00606   StDev (Overall) = 1.14898
  Potential (Within) Capability: Cp = 0.66, CPL = 0.68, CPU = 0.64, Cpk = 0.64
  Overall Capability: Pp = 0.58, PPL = 0.60, PPU = 0.56, Ppk = 0.56, Cpm = *
  Observed Performance: PPM < LSL = 40000.00, PPM > USL = 60000.00, PPM Total = 100000.00
  Exp. Within Performance: PPM < LSL = 20251.30, PPM > USL = 26969.82, PPM Total = 47221.11
  Exp. Overall Performance: PPM < LSL = 36425.88, PPM > USL = 45746.17, PPM Total = 82172.05

Again, compare the PPM levels. What does this tell us? Hint: look at PPM < LSL.

So what do we do? Looking only at the means you may claim that Supplier 2 is the best, although Supplier 1 has greater potential as depicted by the Cp measure, and it will likely be easier to move their Mean than deal with the variation issues of Supplier 2. Therefore we will work with Supplier 1.
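As a check, the indices for both suppliers can be reproduced from the means and standard deviations reported in the session windows. This is a sketch of the standard Cp/Cpk arithmetic, not MINITAB™'s internal routine; Pp and Ppk simply substitute the Overall standard deviation for the Within one.

```python
def cp(usl, lsl, sigma):
    """Potential capability: spec width over the 6-sigma process width."""
    return (usl - lsl) / (6 * sigma)

def cpk(usl, lsl, mean, sigma):
    """Actual capability: worst-case spec distance over three sigma."""
    return min((usl - mean) / (3 * sigma), (mean - lsl) / (3 * sigma))

LSL, USL = 598, 602
suppliers = {
    # mean, StDev(Within), StDev(Overall) from the session windows
    "Supplier 1": (599.115, 0.559239, 0.604106),
    "Supplier 2": (600.061, 1.00606, 1.14898),
}

results = {}
for name, (mean, sd_w, sd_o) in suppliers.items():
    results[name] = (round(cp(USL, LSL, sd_w), 2),        # Cp
                     round(cpk(USL, LSL, mean, sd_w), 2), # Cpk
                     round(cp(USL, LSL, sd_o), 2),        # Pp
                     round(cpk(USL, LSL, mean, sd_o), 2)) # Ppk
    print(name, results[name])  # matches the session-window indices above
```

Supplier 1 comes out (1.19, 0.66, 1.10, 0.62) and Supplier 2 (0.66, 0.64, 0.58, 0.56), confirming the gap between Supplier 1's potential (Cp) and actual (Cpk) performance.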
MINITAB™ Example (cont.)

Option 1: Enter subgroup size = total number of samples.
Option 2: Go to Options, turn off Within subgroup analysis.

The default of MINITAB™ assumes long-term data. Many times you will have short-term data; be sure to adjust MINITAB™ based on Option 1 or 2 as shown here to ensure you get a proper analysis.

For Option 1 you will enter the subgroup size as the total number of data points you have in your short-term study.

For Option 2, you will turn off the within subgroup analysis found inside the Options selection.
Overall Capability

[Capability graphic: Mean = 50.19, StDev = 20.90, N = 150, AD = 11.238, P-Value < 0.005; Observed Performance PPM < LSL = 413333.33, Exp. Within Performance PPM < LSL = 2459.27, Exp. Overall Performance PPM < LSL = 234065.73]
Here in the Measure Phase stick with observed performance unless your data are Normal. There are ways to deal with Non-normal Data for predictive capability, but we'll look at that once you have removed some of the Special Causes from the process. Remember, here in the Measure Phase we get a snapshot of what we're dealing with; at this point don't worry about predictability, we'll eventually get there.
Capability Steps

When we follow the steps in performing a capability study on Attribute Data we hit a wall at step 6. Attribute Data is not considered Normal, so we will use a different mathematical method to estimate capability.

We can follow the steps for calculating capability for Continuous Data until we reach the question about data Normality…

Select Output for Improvement
#1 Verify Customer Requirements
#2 Validate Specification Limits
#3 Collect Sample Data
#4 Determine Data Type (LT or ST)
#5 Check data for Normality
#6 Calculate Z-Score, PPM, Yield, Capability (Cp, Cpk, Pp, Ppk)
#7
For Attribute Data the steps become:
#2 Validate Specification Limits
#3 Collect Sample Data
#4 Calculate DPU
#5 Find Z-Score
#6 Convert Z-Score to Cp & Cpk
#7
Z Scores

The Z Score effectively transforms the actual data into standard normal units. By referring to a standard Z table you can estimate the area under the Normal curve.
– Given an average of 50 with a Standard Deviation of 3, what is the proportion beyond the upper spec limit of 54?

[Normal curve graphic: mean 50, USL 54]
Z Table
In our case we have to look up the proportion for the Z score of 1.33. This means that approximately 9.1% of our data falls beyond the upper spec limit of 54. If we are interested in determining parts per million defective we would simply multiply the proportion .09176 by one million. In this case there are 91,760 parts per million defective.
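The lookup above can be sketched with Python's standard library, using statistics.NormalDist in place of a printed Z table:

```python
from statistics import NormalDist

mean, sd, usl = 50, 3, 54
z = round((usl - mean) / sd, 2)          # 1.33, read to two decimals as a Z table would
tail = 1 - NormalDist().cdf(z)           # proportion of output beyond the USL
ppm = round(round(tail, 5) * 1_000_000)  # parts per million defective
print(z, round(tail, 5), ppm)  # 1.33 0.09176 91760
```

This reproduces the table result: about 9.2% of the data, or 91,760 parts per million, falls beyond the upper spec limit of 54.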
Attribute Capability

[Sigma conversion table rows: 5 → 0.3, 232.7; 6 → 0.0, 3.4]

A stable process can shift and drift by as much as 1.5 Standard Deviations. Want the theory behind the 1.5? Google it! It doesn't matter.
A total of 20,000 calls came in during the month but 2,500 of them
“ dropped” before they were answered (the caller hung up).
“Cpk” is an index (a simple number) which measures how close a process is running to its specification limits, relative to the natural variability of the process.

1. Calculate DPU
2. Look up DPU value on the Z-Table
3. Find Z Score
4. Convert Z Score to Cpk, Ppk

Example:
Look up ZLT = 1.11
Convert ZLT to ZST = 1.11 + 1.5 = 2.61

A Cpk of at least 1.33 is desired and is about 4 sigma+ with a yield of 99.3790%.
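Steps 1 through 3 can be sketched for the dropped-call data above, again using statistics.NormalDist instead of a printed Z table. A table lookup returns a rounded value, so it need not match the exact computation here to every decimal.

```python
from statistics import NormalDist

calls, dropped = 20_000, 2_500
dpu = dropped / calls        # defects per unit = 0.125
process_yield = 1 - dpu      # 0.875 of calls were answered

# Exact long-term Z for this yield; a printed Z table gives a rounded value
z_lt = NormalDist().inv_cdf(process_yield)
z_st = z_lt + 1.5            # apply the 1.5 sigma shift for short-term
print(round(dpu, 3), round(z_lt, 2), round(z_st, 2))  # 0.125 1.15 2.65
```

The same DPU-to-Z conversion applies to any attribute (defective/not defective) data set.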
If you just want to know how much variation the process exhibits, a Ppk measurement is fine. Remember Cpk represents the short-term capability of the process and Ppk represents the long-term capability of the process.
With the 1.5 shift, the above Ppk process capability will be worse than the Cpk short-term capability.
Notes
Measure Phase
Wrap Up and Action Items
The Measure Phase is now complete. Get ready to apply it. This module will help you create a
plan to implement the Measure Phase for your project.
• Being rigorous, disciplined
Listed below are the Measure Deliverables that each candidate should present in a PowerPoint presentation to their mentor and project champion.
Look for the potential roadblocks and plan to address them before they
become problems:
– Team members do not have the time to collect data.
– Data presented is the best guess by functional managers.
– Process participants do not participate in the creation of the X-Y
Matrix, FMEA and Process Map.
It won’t all be
smooth
sailing…..
g
You will run into roadblocks throughout your project. Listed here are some common ones that Belts
have to deal with in the Measure Phase.
DMAIC Roadmap

[Flowchart excerpt: Champion/Process Owner; Estimate COPQ; Establish Team; Measure]

Measure Phase
rule. The way that you apply the Six Sigma problem-solving methods to a

• Detailed Process Mapping
• Identify All Process X's Causing Problems (Fishbone, Process Map)
• WHAT, WHO, WHEN, WHY, WHY NOT, HOW
• Identify the complexity of the process
• Focus on the problem solving process
• Define Characteristics of Data
• Validate Financial Benefits
• Balance and Focus Resources
Over the last decade of deploying Six Sigma it has been found that the parallel application of the
tools and techniques in a real project yields the maximum success for the rapid transfer of
knowledge. For maximum benefit you should apply what has been learned in the Measure Phase
to a Six Sigma project. Use this checklist to assist.
Notes
Measure Phase
Quiz
Now we will see what you have retained from the Measure Phase of the course. Please answer
these questions to the best of your ability without referencing the text. The answers are in the
Appendix. Please check your answers against the answers provided and review the sections in
the Measure Phase where your retention of the knowledge is less than you desire.
1. When looking at precision, the primary desire is to confirm the process measurement system has low Repeatability and ____________________. (fill in the blank)
2. The difference in Bias values across the process range are known
as_______________________. (fill in the blank)
3. There are many reasons why Basic Statistics are important to a Black Belt. The following
items are good reasons for using Basic Statistics except which one?
A. Makes inferences about the future
B. Foundation for assessing process capability
C. Data collection for streamed orientation
D. Provide a numerical description of the data especially if it´s Normally Distributed
5. A Black Belt was entering data into MINITABTM. The data being entered is the name of
the countries that his company supplies product to. This is an example of:
A. Nominal Scale Data
B. Ratio Scale Data
C. Continuous Data
D. Ordinal Scale Data
6. The most frequently occurring number in a distribution set is 7. The 7 is the sample´s?
A. Mean
B. Median
C. Mode
D. Standard Deviation
7. A fundamental rule is that Standard Deviations cannot be summed but variances can be summed.
True False
8. The main difference between Special Cause and Common Cause is? (check all that
apply)
A. Sample size impacts if Common Cause variation is found or not.
B. Special Causes are often the focus of BB projects
C. Special Causes are found in short term Process Capability
D. Common Cause variation is larger than Special Cause variation.
9. The Fishbone is a tool to generate ideas about possible causes for defects.
True False
10. The X-Y Diagram is a tool used to identify/collate potential X´s and assess their relative
impact on multiple Y´s.
True False
11. The X-Y Diagram serves an important function to a Black Belt. From the list below select
the item that best describes the importance of the X-Y Diagram.
A. To eliminate the obvious high impact independent variables
B. To help prioritize the independent variables
C. To help prioritize the dependent variables
D. To help with project scope
12. The term FMEA is an abbreviation for Failure Measures Effect Analysis.
True False
13. The FMEA tool is an important tool for a Black Belt. From the list below select the items
that describe the importance of constructing a FMEA. (check all that apply)
A. Predict failure risks and minimize their occurrence
B. Quantifies the severity, occurrence and detection of defects
C. Highlights the non-value added portions of a process
D. Identify ways how a process leads to a failure to meet customer requirements
15. After performing an MSA study, if an error occurs, the error can be categorized into which
two specific categories?
A. Precision
B. Detailed
C. Accuracy
D. Random
E. Desirability
16. The following are some good examples of what Black Belt projects should measure:
(check all that apply)
A. Primary and Secondary Metrics
B. Vital few X's in the process
C. Before and after process changes
D. All outputs of the process steps
17. The reason for performing a MSA on your system is to confirm minimal variation or
inaccuracy with your measurement systems and reduce the sources for the excessive
variation or inaccuracy.
True False
18. Accuracy can be assessed in several ways. From the list below select the least correct
accuracy assessment.
A. Measurement of a known standard
B. Comparison to another recently calibrated instrument with a proven accuracy
C. Comparison with another proven measurement technique
D. Comparison with a proven precise instrument
19. A Crossed Design Gage R&R is best used for destructive testing.
True False
Analyze Phase
Welcome to Analyze
Now that we have completed the Measure Phase we are going to jump into the Analyze Phase.
Welcome to Analyze will give you a brief look at the topics we are going to cover.
Welcome to Analyze
Overview
– Welcome to Analyze
– Inferential Statistics
– Intro to Hypothesis Testing
– Hypothesis Testing ND P1
– Hypothesis Testing ND P2
– Hypothesis Testing NND P1
– Hypothesis Testing NND P2
– Wrap Up & Action Items
[Process flow diagram with steps: Estimate COPQ, Establish Team, Measure, Collect Data, Statistically Significant? (Y/N), Practically Significant? (Y/N), Identify Root Cause, Update FMEA, Implement Control Plan to Ensure Problem Doesn't Return.]
This provides a process look at putting “Analyze” to work. By the time we complete this phase you will
have a thorough understanding of the various Analyze Phase concepts.
We will build upon the foundational work of the Define and Measure Phases by introducing
techniques to find root causes, then using experimentation and Lean Principles to find solutions to
process problems. Next you will learn techniques for sustaining and maintaining process performance
using control tools and finally placing your process knowledge into a high level Process Management
tool for controlling and monitoring process performance.
Analyze Phase
“X” Sifting
Now we will continue in the Analyze Phase with "X Sifting" – determining the impact of the
inputs to our process.
"X" Sifting
Overview

The core fundamentals of this phase are Multi-Vari Analysis and Classes and Causes.

– Welcome to Analyze
– "X" Sifting
  • Multi-Vari Analysis
  • Classes and Causes
– Inferential Statistics
– Hypothesis Testing NND P1
– Hypothesis Testing NND P2
– Wrap Up & Action Items
Multi-Vari Studies
[Funnel diagram: the many X's when we first start the project (the trivial many) → the quantity of X's we keep after reducing as we think about Y = f(X) + e → the quantity of X's remaining after we apply DMAIC leverage (the vital few).]
In the Define Phase you use tools like Process Mapping to identify all possible "X's". In the Measure
Phase you use tools to help refine all possible "X's", like the X-Y Diagram and FMEA.
In the Analyze Phase we start to “dis-assemble” the data to determine what it tells us. This is the fun
part.
“X” Sifting
Multi-Vari Definition
The Multi-Vari Chart helps in screening factors by using graphical techniques to logically subgroup
discrete X's (Independent Variables) plotted against a continuous Y (Dependent Variable). By looking at the
pattern of the graphed points, conclusions are drawn about the largest family of variation.
At this point in DMAIC, Multi-Vari Charts are intended to be used as a passive study, but later in the
process they can be used as a graphical representation where factors were intentionally changed. The
only caveat with using MINITABTM to graph the data is that the data must be balanced. Each source of
variation must have the same number of data points across time.
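The balance requirement can be checked before graphing by counting observations in every subgroup; each combination of the grouping X's must appear the same number of times. A small Python sketch (the tuples and field names are hypothetical, not MINITAB syntax):

```python
from collections import Counter

# Hypothetical Multi-Vari records: (time_period, unit, measurement).
data = [
    ("t1", "unit1", 5.01), ("t1", "unit1", 5.03),
    ("t1", "unit2", 5.02), ("t1", "unit2", 5.00),
    ("t2", "unit1", 5.07), ("t2", "unit1", 5.05),
    ("t2", "unit2", 5.04), ("t2", "unit2", 5.06),
]

# Count observations in every (time period, unit) subgroup.
counts = Counter((t, u) for t, u, _ in data)

# Balanced means every subgroup has the same number of data points.
is_balanced = len(set(counts.values())) == 1
```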
“X” Sifting
Multi-Vari Example
You are probably asking yourself what is Injection Molding? Well basically an injection molding
machine takes hard plastic pellets and melts them into a fluid. This fluid is then injected into a
mold or die, under pressure, to create products, such as piping and computer cases.
Method

Sampling plans should encompass all three types of variation: Within, Between and Temporal.

1). Create Sampling Plan
2). Gather Passive Data
3). Graph Data
4). Check to see if Variation is Exposed
5). Interpret Results

[Flow: Create Sampling Plan → Gather Passive Data → Graph Data → Is Variation Exposed? (No: collect more data / Yes: continue) → Interpret Results.]

Typically, we start with a data collection sheet that makes sense based on our knowledge of the
process. Then follow the steps. If we only see minor variation in the sample, it is time to go back
and collect additional data. When your data collection represents at least 80% of the variation
within the process then you should have enough information to evaluate the graph.
Remember for a Multi-Vari Analysis to work the output must be continuous and the sources of
variation discrete.
“X” Sifting
Sources of Variation
Within unit, between unit and temporal are the classic causes of variation. A unit can be a single
piece or a grouping of pieces depending on whether they were created at unique times. Multi-Vari
Analysis can be performed on other processes; simply identify the categorical sources of variation
you are interested in.

Within unit or Positional
– Within piece variation related to the geometry of the part.
– Variation across a single unit containing many individual parts, such as a wafer containing many computer processors.
– Location in a batch process such as plating.

Between unit or Cyclical
– Variation among consecutive pieces.
– Variation among groups of pieces.
– Variation among consecutive batches.

Temporal or Over time
– Shift-to-Shift
– Day-to-Day
– Week-to-Week
[Injection molding diagram: master injection pressure, injection pressure per cavity, % oxygen, distance to tank, fluid level, ambient temperature, die temperature, die release; four cavities (#1–#4).]
Within Unit Variation is measured by differences among the 4 widgets from a single die cycle.
For example, we could measure the wall thickness for each of the 4 widgets.
Between Unit Variation is measured by differences from sequential die cycles; for example,
comparing the average wall thickness from die cycle to die cycle.
Temporal Variation is measured over some meaningful time period. For example, we would
compare the average of all the data collected in a time period say the 8 o’clock hour to the 10
o’clock hour.
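The three families described above can be computed directly from raw data. The sketch below uses invented wall-thickness numbers (not the book's data set) to show the within-unit spread, the between-unit difference of die-cycle averages, and the temporal difference of time-period averages:

```python
import statistics

# Hypothetical wall thickness for 4 widgets per die cycle,
# two die cycles in each of two time periods.
cycles = {
    ("hour8", "cycle1"): [2.01, 2.05, 2.03, 2.02],
    ("hour8", "cycle2"): [2.04, 2.06, 2.05, 2.07],
    ("hour10", "cycle1"): [2.10, 2.12, 2.11, 2.13],
    ("hour10", "cycle2"): [2.09, 2.11, 2.10, 2.12],
}

# Within unit: spread of the 4 widgets inside one die cycle.
within = {k: max(v) - min(v) for k, v in cycles.items()}

# Between unit: difference of averages from die cycle to die cycle.
means = {k: statistics.mean(v) for k, v in cycles.items()}
between_hour8 = abs(means[("hour8", "cycle1")] - means[("hour8", "cycle2")])

# Temporal: difference of the averages of everything in each time period.
def period_mean(period):
    vals = [x for (p, _), v in cycles.items() if p == period for x in v]
    return statistics.mean(vals)

temporal = abs(period_mean("hour10") - period_mean("hour8"))
```

With these made-up numbers the temporal family dominates, mirroring how the chart is read: compare spreads within a cycle, averages between cycles, and averages between time periods.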
Certified Lean Six Sigma Black Belt Book Copyright OpenSourceSixSigma.com
241
“X” Sifting
Sampling Plan
To continue with this example, the Multi-Vari sampling plan will be to gather data for 3 die cycles
on 3 different days for 4 widgets inside the mold.

[Sampling plan grid: rows Cavity #1–#4; columns Die Cycles #1–#3 on each of Monday, Wednesday and Friday.]
“X” Sifting
Gather the list of potential X’s and assign to one of the families of
variation.
– This information can be pulled from the X-Y Diagram from the
Measure Phase.
If an X spans one or more families, assign %’s to the supposed split.
Now let's use the same information from the X-Y Diagram that was created in the Measure Phase. The
following exercise will help you assign each of the variables to a family of variation. If you find yourself
with a variable (X) that spans families, then assign percentages to the split. Use your best judgment for
the splits. Don't assume that the true X's causing variation have to come from one in the list.
Step 4 - Focus further effort on the X’s associated with the family of
largest variation.
Remember the goal is not only to figure out what it is, but what it is not!
“X” Sifting
Data Worksheet
Now create the Multi-Vari
Chart in MINITABTM.
Run Multi-Vari
“X” Sifting
To find an example of within unit variation, look at the unit with the greatest spread, like Unit 1 in
the second time period. Notice the spread of data is 0.07.

To determine temporal variation, compare the averages between time periods. It appears time
periods 3 and 2 have a difference of 0.06.

To determine between unit variation, compare the averages of adjacent units; the largest shift
appears at the second unit in the third time period.
Notice that the shifting from unit to unit is not consistent, but it certainly jumps up and down. The
question at this point should be: Does this graph represent the problem I’m working on? Do I see at
least 80% of the variation? Read the units off the Y axis or look in the worksheet. Notice the spread
of the data is 0.22 units. If the usual spread of the data is 0.25 units, then this data set represents
88% of the usual variation which tells us our sampling plan was sufficient to detect the problem.
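The 88% figure is just the ratio of the spread seen in the sample to the usual spread of the process, which can be checked with two lines of arithmetic:

```python
observed_spread = 0.22  # spread of the data in the Multi-Vari sample
usual_spread = 0.25     # typical spread of the process output

coverage = observed_spread / usual_spread  # fraction of usual variation captured
sufficient = coverage >= 0.80              # sampling plan adequate at 80% or more
```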
“X” Sifting
Let's try another example; open the MINITABTM worksheet "CallCenter.mtw". This example is a
transactional application of the tool.

In this particular case, a company with two call centers wants to compare two methods of handling
calls at each location at different times of the day. One method involves a team to resolve customer
issues, and the other method requires a single subject-matter expert to handle the call alone.

• Output (Y)
– Call Time
• Input (X)
– Call Center (GA, NV)
– Time of Day (10:00, 13:00, 17:00)
– Method (Expert, Team)
“X” Sifting
It is not necessary to
force fit any one tool to
your project. For
transactional projects
Multi-Vari may be difficult
to interpret purely
graphically. We will re-
visit this data set later
when working through
Hypothesis Testing.
Multi-Vari Exercise
“X” Sifting
MVA Solution
Do you recall the reason why Normality is an issue? Normality is required if you intend to use the
information as a predictive tool. Early in the Six Sigma process there is no reason to assume that
your data will be Normal. Let's work the problem now.

Check for normality…

[Probability Plot of Volume (Normal): Mean 514.7, StDev 6.854.]
Having a graphical summary is quite nice since it provides a picture of the data as well as the
summary statistics. The graphical summary command in MINITABTM is an alternative method to
check for Normality. Notice that the P-value in this window is the same as the previous.

Another method to check normality is…

[Graphical Summary for Volume — Anderson-Darling Normality Test: A-Squared 0.49, P-Value 0.212. Mean 514.71, StDev 6.85, Variance 46.97, Skewness -0.084725, Kurtosis -0.696960, N 144. Minimum 500.64, 1st Quartile 509.70, Median 515.32, 3rd Quartile 520.12, Maximum 529.39. 95% Confidence Intervals: Mean 513.58 to 515.84, Median 513.90 to 516.37, StDev 6.14 to 7.75.]
“X” Sifting
MVA Solution
Now it is time to perform the process capability. For subgroup size, enter 12 since all 12 bottles
are filled at the same time. Also, use 500 milliliters as the upper spec limit in order to see how bad
the capability was from a manufacturer's perspective.

Under the "Options" tab you can select the "Benchmark Z's (sigma level)" of the process, or you
can leave the default as "Capability stats". Just for fun you can run MINITABTM to generate the
Capability Analysis using 500 as the upper spec limit, then run it again as the lower spec limit and
see what happens to the statistics.
MVA Solution
“X” Sifting
Perform an MVA
The order in which you enter the factors will
produce different graphs. The “classical”
method is to use Within, Between and over-
time (Temporal) order.
MVA Solution
The graph shows the variation within a unit is consistent across all the data. The variation between
units also looks consistent across all the data. What seems to stand out is that the machine may be
set up differently from first shift to second. That should be easy to fix! What is the largest source of
variation? Within Unit Variation is the largest, Temporal is the next largest (and probably easiest to
fix) and Between Unit Variation comes in last.
[Multi-Vari Chart; panel variable: Temporal.]

This example revealed that a high-price scale could be generating significant variation.
The in-line scale weighed the bottles and either sent them forward to ship or rejected them to be
topped off. The wind generated by the positive pressure in the room blew across the scale, making
the recorded weights fluctuate unacceptably. The filling machine was actually quite good; only a few
adjustments were made once the variation from the scale was fixed. Once the variation in
the data was reduced, they were able to shift the Mean closer to the specification of 500 ml.
“X” Sifting
Classes of Distributions
By now you are convinced that Multi-Vari is a tool that helps screen X's by visualizing three
primary sources of variation. Later we will perform Hypothesis Tests based on our findings.

At this point we will review classes and causes of distributions that can also help us screen X's to
perform Hypothesis Tests.

– Normal Distribution
– Non-normality – 4 Primary Classifications:
1. Skewness
2. Multiple Modes
3. Kurtosis
4. Granularity
“X” Sifting
Normal Distribution
However, just
because a
distribution of sample data looks Normal does not mean that the variation cannot be reduced and a
new Normal Distribution created.
Non-Normal Distributions
“X” Sifting
Skewness Classification
Potential Causes of Skewness

When a distribution is not symmetrical, it is Skewed. Generally, a Skewed distribution's longest
tail points in the direction of the Skew.

[Histograms: Left Skew and Right Skew examples.]

Sources that may mix: Machine A / Machine B, Operator A / Operator B, Payment Method A /
Payment Method B, Interviewer A / Interviewer B. Sample A + Sample B = Combined.
What causes Mixed Distributions? Mixed Distributions occur when data comes from several sources
that are supposed to be the same but are not.
Note that both distributions that formed the combined Skewed Distribution started out as Normal
Distributions.
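This mixing effect is easy to simulate. The sketch below (invented means and mix proportions) combines two Normal sources and computes the sample skewness, which comes out clearly positive even though each source alone is symmetric:

```python
import random
import statistics

random.seed(7)

# Two sources that are "supposed to be the same" but are not.
machine_a = [random.gauss(10, 1) for _ in range(300)]
machine_b = [random.gauss(15, 1) for _ in range(100)]
combined = machine_a + machine_b

def skewness(xs):
    """Sample skewness: third central moment over sd cubed."""
    m = statistics.mean(xs)
    m2 = sum((x - m) ** 2 for x in xs) / len(xs)
    m3 = sum((x - m) ** 3 for x in xs) / len(xs)
    return m3 / m2 ** 1.5

skew_a = skewness(machine_a)        # near zero: symmetric on its own
skew_combined = skewness(combined)  # clearly positive: long right tail
```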
“X” Sifting
Non-Linear Relationships occur when the X and Y scales are different.

Just because your Input (X) is Normally Distributed about a Mean, the Output (Y) may not be
Normally Distributed.

[Plot: Y versus X with the Marginal Distribution of X (Normal) and the Marginal Distribution of Y (non-Normal).]
Interactions

Interactions occur when two inputs interact with each other to have a larger impact on Y than
either would by themselves.

[Plot: Room Temperature (25–35) over time for Spray On/Off and No Spray conditions.]

If you find that two inputs together have a large impact on Y but would not affect Y by themselves,
this is called an Interaction.

For instance, if you spray an aerosol can in the direction of a flame, what would happen to room
temperature? What do you see regarding these distributions?
“X” Sifting
The distribution is dependent on time. Time relationships occur when the distribution is
dependent on time; some examples are tool wear, chemical bath depletion, stock prices, etc.

[Plot: Marginal Distribution of Y drifting over Time.]

Often seen when tooling requires "warming up", tool wear, chemical bath depletions, ambient
temperature effect on tooling.
“X” Sifting
Kurtosis

Kurtosis refers to the shape of the tails:
– Leptokurtic
– Platykurtic
• Different combinations of distributions cause the resulting overall shapes.

Platykurtic distributions are flat with short tails.

Causes:
– Sorting or Selecting: scrapping product that falls outside the spec limits.
– Trends or Patterns: lack of Independence in the data (example: tool wear, chemical bath).
– Non-Linear Relationships: chemical systems.
“X” Sifting
Leptokurtic

A positive Kurtosis value indicates a Leptokurtic distribution. Distributions overlaying each other
that have very different variance can cause a Leptokurtic distribution.

Causes:
– Sorting or Selecting: scrapping product that falls outside the spec limits.
– Trends or Patterns: lack of Independence in the data (example: tool wear, chemical bath).
– Non-Linear Relationships: chemical systems.
Multiple Modes

Multiple Modes arise from such dramatic combinations of underlying sources that they show distinct
modes. They may have appeared Platykurtic, but the sources were far enough apart to see separation.
“X” Sifting
Bimodal Distributions
[Descriptive Statistics, Variable: ExtremeBiMod — Anderson-Darling Normality Test: A-Squared 22.657, P-Value 0.000. Mean 28.8144, StDev 7.5702, Variance 57.3081, Skewness 1.37767, Kurtosis 2.66E-03, N 127. Minimum 22.6294, 1st Quartile 24.2649, Median 25.2902, 3rd Quartile 26.5494, Maximum 45.3291. 95% Confidence Interval for Mu: 27.4851 to 30.1438; for Sigma: 6.7398 to 8.6359; for Median: 25.0263 to 25.7491.]
If you see an extreme outlier, it usually has its own cause or own source of variation. It's relatively
easy to isolate the cause by looking at the X Axis of the Histogram.
“X” Sifting
[Descriptive Statistics — Mean 26.2507, StDev 4.8453, Variance 23.4767, Skewness 3.17250, Kurtosis 9.11483, N 108. Minimum 22.6294, 1st Quartile 24.1285, Median 25.0534, 3rd Quartile 25.9709, Maximum 46.0000. 95% Confidence Interval for Mu: 25.3265 to 27.1750; for Sigma: 4.2740 to 5.5943; for Median: 24.8365 to 25.2971.]
Granularity

Now let's take a moment and notice the P-value in the Normal Probability Plot; it is definitely smaller
than 0.05!
“X” Sifting
Normal Example

Non-normal Distributions can give more root cause information than Normal data (the nature of why…).

Hey Honey, I found the key….
“X” Sifting
Notes
Analyze Phase
Inferential Statistics
Inferential Statistics
Overview
The core fundamentals of this phase are Inferential Statistics, Nature of Sampling and Central
Limit Theorem. We will examine the meaning of each of these and show you how to apply them.

– Welcome to Analyze
– "X" Sifting
– Inferential Statistics
  • Nature of Sampling
  • Central Limit Theorem
– Intro to Hypothesis Testing
– Hypothesis Testing ND P1
– Hypothesis Testing ND P2
– Hypothesis Testing NND P1
– Hypothesis Testing NND P2
– Wrap Up & Action Items
Nature of Inference
Inferential Statistics
1. What do you want to know?

So many questions….?
As with most things you have learned associated with Six Sigma – there are defined steps to be
taken.
Types of Error
1. Error in sampling
– Error due to differences among samples drawn at random from the population (luck of the draw).
– This is the only source of error that statistics can accommodate.
2. Bias in sampling
– Error due to lack of independence among random samples or due to systematic sampling procedures (height of horse jockeys only).
3. Error in measurement
– Error in the measurement of the samples (MSA/GR&R).
4. Lack of measurement validity
– The measurement does not actually measure what it intends to measure (placing a probe in the wrong slot; measuring temperature with a thermometer that is next to a furnace).
Inferential Statistics
Population
– EVERY data point that has ever been or ever will be generated from a given characteristic.

Sample
– A portion (or subset) of the population, either at one time or over time.

Observation
– An individual measurement.
Let’s just review a few definitions: A population is EVERY data point that has ever been or ever will
be generated from a given characteristic. A sample is a portion (or subset) of the population, either
at one time or over time. An observation is an individual measurement.
Significance
* RORI includes not only dollars and assets but the time and participation of your teams.
Inferential Statistics
The Mission
Your mission, which you have chosen to accept, is to reduce cycle time, reduce the error rate,
reduce costs, reduce investment, improve service level, improve throughput, reduce lead time,
increase productivity… change the output metric of some process, etc…
In statistical terms, this translates to the need to move the process Mean and/or reduce the process
Standard Deviation.
You’ll be making decisions about how to adjust key process input variables based on sample data,
not population data - that means you are taking some risks.
How will you know your key process output variable really changed, and is not just an unlikely
sample? The Central Limit Theorem helps us understand the risk we are taking and is the basis for
using sampling to estimate population parameters.
Imagine you have some population. The individual values of this population form some distribution.
Take a sample of some of the individual values and calculate the sample Mean.
The Central Limit Theorem says that as the sample size becomes large, this new distribution (the
sample Mean distribution) will form a Normal Distribution, no matter what the shape of the
population distribution of individuals.
Inferential Statistics
Population (individual values): 3, 5, 2, 12, 10, 1, 6, 12, 5, 6, 12, 14, 3, 6, 11, 9, 10, 10, 12

• Samples from the population, each with five observations:

Sample 1: 1, 12, 9, 7, 8 (Mean 7.4)
Sample 2: 9, 8, 5, 14, 10 (Mean 9.2)
Sample 3: 2, 3, 6, 11, 10 (Mean 6.4)

• In this example, we have taken three samples out of the population, each with five observations
in it. We computed a Mean for each sample. Note that the Means are not the same!
• Why not?
• What would happen if we kept taking more samples?
Every statistic derives from a sampling distribution. For instance, if you were to keep taking samples
from the population over and over, a distribution could be formed for the calculated Means, Medians,
Modes, Standard Deviations, etc. As you will see, the above sample distributions each have a
different statistic. The goal here is to successfully make inferences regarding the statistical data.
Create a sample of 1,000 individual rolls of a die that we will store in a variable named “Population”.
From the population, we will draw five random samples.
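For readers without MINITAB, the same exercise can be sketched in Python; this is a stand-in for the MINITAB steps, generating its own random rolls rather than the book's data:

```python
import random
import statistics

random.seed(42)

# 1,000 individual rolls of a fair die -- the "Population" column.
population = [random.randint(1, 6) for _ in range(1000)]

# Draw five random samples of 5 observations each.
samples = [random.sample(population, 5) for _ in range(5)]

pop_mean = statistics.mean(population)  # near the theoretical 3.5
sample_means = [statistics.mean(s) for s in samples]
```

The five sample Means scatter around the population Mean; that scatter is sampling error.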
Inferential Statistics
Sampling Distributions
To draw random samples from the population follow the command shown below and repeat 4 more
times for the other columns.
Sampling Error
Calculate the Mean and Standard Deviation for each column and compare the sample statistics to
the population.

Stat > Basic Statistics > Display Descriptive Statistics…

Descriptive Statistics: Population, Sample1, Sample2, Sample3, Sample4, Sample5

Variable    N     N*  Mean    SE Mean  StDev   Minimum  Q1      Median  Q3      Maximum
Population  1000  0   3.5510  0.0528   1.6692  1.0000   2.0000  4.0000  5.0000  6.0000
Sample1     5     0   3.400   0.927    2.074   1.000    1.500   3.000   5.500   6.000

Now compare the Mean and Standard Deviation of the samples of 5 observations to the population.
What do you see?
Inferential Statistics
Sampling Error
Sampling Error - Reduced
Calculate the Mean and Standard Deviation for each column and compare the sample statistics to
the population.
Sample6     10    0   3.600   0.653    2.066   1.000    1.750   3.500   6.000   6.000

Can you tell what is happening? As the sample size increases, the standard error of the Mean
decreases and the sample statistics move closer to the population values.

What do you think would happen if the sample size increased further? Let's try 30 for a sample size.
Inferential Statistics
Do you notice
anything different?
Sampling Distributions
Now instead of looking at the effect of sample size on error, we will create a sampling distribution
of averages. Follow along to generate your own random data.
Inferential Statistics
Sampling Distributions
Repeat this command to calculate the Mean of C1-C10, and store the result in Mean10.

The commands shown above will create new columns that are now averages from the columns of
random population data. We have 1000 averages of sample size 5 and 1000 averages of sample
size 10.
Create a Histogram of C1, Mean5 and Mean10.

Graph > Histogram > Simple…
Multiple Graphs… On separate graphs… Same X, including same bins
In MINITABTM follow the above commands. The Histogram being generated makes it easy to see
what happened when the sample size was increased.

Select "Same X, including same bins" to facilitate comparison.
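What the three histograms show can also be verified numerically: averages are tighter than individuals, and averages of 10 are tighter than averages of 5. A Python stand-in for the C1/Mean5/Mean10 columns (its own random data, not the book's):

```python
import random
import statistics

random.seed(0)

def die_mean(n):
    """Average of n rolls of a fair die."""
    return statistics.mean(random.randint(1, 6) for _ in range(n))

individuals = [random.randint(1, 6) for _ in range(1000)]  # like C1
mean5 = [die_mean(5) for _ in range(1000)]                 # like Mean5
mean10 = [die_mean(10) for _ in range(1000)]               # like Mean10

sd_ind = statistics.pstdev(individuals)  # about 1.71 for a fair die
sd_5 = statistics.pstdev(mean5)          # about 1.71 / sqrt(5)
sd_10 = statistics.pstdev(mean10)        # about 1.71 / sqrt(10)
```

The spread of the sample Means shrinks by roughly the square root of the sample size, which is exactly what the narrowing histograms depict.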
Inferential Statistics
Different Distributions

Everything we have gone through with sampling error and sampling distributions was leading up
to the Central Limit Theorem.

Both Individuals and Sample Means have a Mean and a Standard Deviation. Sample Means will be
Normally Distributed when the parent population is Normally Distributed, or will be approximately
Normal for samples of size 30 or more when the parent population is not Normally Distributed.
This improves with samples of larger size.

Bigger is Better!
Inferential Statistics
So What?
A Practical Example
What is the likelihood of getting a sample with a 2 second difference? This could be caused either
by implementing changes or could be a result of random sampling variation, sampling error. The
95% confidence interval exceeds the 2 second difference (delta) seen as a result. What is the delta
caused from? It could be a true difference in performance or random sampling error. This is why
you look further than only relying on point estimators.
Inferential Statistics
[Figure: distribution of individuals in the population, with theoretical distributions of sample Means for n = 2 and n = 10.]
Inferential Statistics
Standard Error
The rate of change in the standard error approaches zero at about 30 samples.

[Plot: Standard Error versus Sample Size (0 to 30).]
When comparing standard error with sample size, the rate of change in the standard error
approaches zero at about 30 samples. This is why a sample size of 30 comes up often in discussions
on sample size.
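The flattening is simply the shape of the standard error formula, sigma divided by the square root of n; tabulating it for sigma = 1 shows the shrinking payoff of extra samples:

```python
import math

sigma = 1.0
std_err = {n: sigma / math.sqrt(n) for n in (5, 10, 20, 30, 40, 100)}

# The incremental gain from extra samples keeps shrinking:
gain_10_to_20 = std_err[10] - std_err[20]  # about 0.093
gain_30_to_40 = std_err[30] - std_err[40]  # about 0.024
```

Past roughly n = 30 each additional observation buys very little reduction in standard error, which is why 30 recurs in sample-size discussions.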
This is the point at which the t and the Z distributions become nearly equivalent. If you compare
Z = 1.96 in a Z table to t at 0.975 in a t table, as the sample size approaches infinite degrees of
freedom they become equal.
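The Z side of that comparison is available in Python's standard library (t quantiles need a statistics package, so only the Z value is shown here):

```python
from statistics import NormalDist

# Two-sided 95% critical value of the standard Normal.
z_975 = NormalDist().inv_cdf(0.975)  # approximately 1.96
```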
Inferential Statistics
Notes
Analyze Phase
Introduction to Hypothesis Testing
– Hypothesis Testing NND P1
– Hypothesis Testing NND P2
– Wrap Up & Action Items
Our goal is to improve our Process Capability; this translates to the need to move the process Mean
(or proportion) and reduce the Standard Deviation.
Because it is too expensive or too impractical (not to mention theoretically impossible) to
collect population data, we will make decisions based on sample data.
Because we are dealing with sample data, there is some uncertainty about the true
population parameters.
Hypothesis Testing helps us make fact-based decisions about whether there are different population
parameters or that the differences are just due to expected sample variation.
[Capability comparison — before: Observed PPM Total 6666.67, Exp. Within PPM Total 116.45, Exp. Overall PPM Total 73271.97; after: PPM Total 0.00 in every category.]
The purpose of appropriate Hypothesis Testing is to integrate the Voice of the Process with the
Voice of the Business to make data-based decisions to resolve problems.
Hypothesis Testing can help avoid high costs of experimental efforts by using existing data. This
can be likened to:
Local store costs versus mini bar expenses.
There may be a need to eventually use experimentation, but careful data analysis can
indicate a direction for experimentation if necessary.
Recall from the discussion on classes and cause of distributions that a data set may seem Normal,
yet still be made up of multiple distributions.
Hypothesis Testing can help establish a statistical difference between factors from different
distributions.
(Plot: a distribution that appears Normal but is composed of multiple underlying distributions.)
Because we cannot test an entire population, using a sample is the closest we can get to it. Since we are using sample data and not the entire population, we need methods that allow us to infer whether the sample is a fair representation of the population.
When we use a proper sample size, Hypothesis Testing gives us a way to detect the likelihood that a sample came from a particular distribution. Sometimes the questions can be: Did our sample come from a population with a Mean of 100? Is our sample variance significantly different from the variance of the population? Is it different from a target?
Significant Difference

(Two distributions, Sample 1 and Sample 2, with Means μ1 and μ2.)
Do you see a difference between Sample 1 and Sample 2? There may be a real difference between
the samples shown; however, we may not be able to determine a statistical difference. Our
confidence is established statistically which has an effect on the necessary sample size. Our ability
to detect a difference is directly linked to sample size and in turn whether we practically care about
such a small difference.
Detecting Significance
H A : The sk y is fa lling.
We will discuss the difference between practical and statistical significance throughout this session. We can affect the outcome of a statistical test simply by changing the sample size.

Let's take a moment to explore the concept of Practical Differences versus Statistical Differences.
Detecting Significance
The difference can be either a change in the Mean or in the variance.
Hypothesis Testing
A Hypothesis Test is an a priori theory relating to differences between variables.
DICE Example
You have rolled dice before haven’t you? You know dice that you would find in a board game or in
Las Vegas.
Well, assume that we suspect a single die is "Fixed," meaning it has been altered in some form or fashion to make a certain number appear more often than it rightfully should.
Consider the example on how we would go about determining if in fact a die was loaded.
If we threw the die five times and got five one’s, what would you conclude? How sure can you be?
The probability of getting just a single one. The probability of getting five ones.

We could throw it a number of times and track how many times each face occurred. With a standard die, we would expect each face to occur 1/6 or 16.67% of the time.
If we threw the die 5 times and got 5 one’s, what would you conclude? How
sure can you be?
– Pr(1 one) = 0.1667    Pr(5 ones) = (0.1667)^5 = 0.00013
There are approximately 1.3 chances out of 10,000 that we could have gotten 5 ones with a standard die.
Therefore, we would say we are willing to take a 0.1% chance of being wrong about our hypothesis that the die was "loaded," since the results do not come close to our predicted outcome.
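The dice arithmetic above is easy to verify. A quick sketch in Python (the course itself uses MINITABTM; this is just a check of the probabilities):

```python
# Probability that a fair die shows a one on any single throw
p_one = 1 / 6

# Probability of five ones in five independent throws
p_five_ones = p_one ** 5

print(round(p_one, 4))        # 0.1667
print(round(p_five_ones, 5))  # 0.00013 -- about 1.3 chances in 10,000
```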
Hypothesis Testing
DECISIONS
Any differences between observed data and claims made under H0 may be real or due to chance.
Hypothesis Tests determine the probabilities of these differences occurring solely due to chance and
call them P-values.
The α level of a test (level of significance) represents the yardstick against which P-values are measured, and H0 is rejected if the P-value is less than the α level.

The most commonly used α levels are 5%, 10% and 1%.
There are two types of error: Type I, with an associated risk equal to alpha (the first letter in the Greek alphabet), and of course the other one was named Type II, with an associated risk equal to beta.
The formula reads: alpha is equal to the probability of making a Type 1 error, or alpha is equal to
the probability of rejecting the null hypothesis when the null hypothesis is true.
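The definition above can be demonstrated by simulation. The sketch below is a hypothetical illustration (not part of the course material): it draws many samples from a population where the null hypothesis is true and shows that a test run at α = 0.05 rejects about 5% of the time, which is exactly the Type I error rate.

```python
import random
from statistics import NormalDist, mean

random.seed(1)

alpha = 0.05
z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # ~1.96 for a two-tailed test

# Population where Ho (mu = 100) is TRUE; sigma = 10 is assumed known here so
# a simple z-test can stand in for a t-test in this illustration.
trials, rejections = 5000, 0
for _ in range(trials):
    sample = [random.gauss(100, 10) for _ in range(25)]
    z = (mean(sample) - 100) / (10 / 25 ** 0.5)
    if abs(z) > z_crit:
        rejections += 1  # a Type I error: Ho rejected although it is true

print(round(rejections / trials, 3))  # close to alpha = 0.05
```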
Alpha Risk
(Figure: tails of the distribution shaded as regions of doubt.)
Hypothesis Testing Risk
The beta risk or Type 2 Error (also called the “Consumer’s Risk”) is the probability that we could
be wrong in saying that two or more things are the same when, in fact, they are different.
Actual Conditions: Not Different (Ho is True) or Different (Ho is False).
Another way to describe beta risk is failing to recognize an improvement. Chances are the sample
size was inappropriate or the data was imprecise and/or inaccurate.
Reading the formula: Beta is equal to the probability of making a Type 2 error.
Or: Beta is equal to the probability of failing to reject the null hypothesis given that the null
hypothesis is false.
Beta Risk
(Figure: critical value of the test statistic.)
(Figure: theoretical distribution of Means when n = 30, with δ = 5 and S = 1; a large S widens the distribution.)
All samples are estimates of the population. All statistics based on samples are estimates of the
equivalent population parameters. All estimates could be wrong!
These are typical questions you will experience or hear during sampling. The most common answer is "It depends." Primarily because someone could say a sample of 30 is perfect when that may actually be too many. The point is you don't know what the right sample size is without the test.
Here is a Hypothesis Testing roadmap for Continuous Data. This is a great reference tool while you are conducting Hypothesis Tests.
(Roadmap: Continuous Data is split into Normal and Non-Normal branches, then by One Sample, Two Samples, or Two or More Samples, and by One Factor or Two Factors; tests include 2-Sample t and One-Way ANOVA. Attribute Data follows its own branch.)
While using Hypothesis Testing the following facts should be borne in mind at the conclusion stage:
– The decision is about Ho and NOT Ha.
– The conclusion statement is whether the contention of Ha was upheld.
– The null hypothesis (Ho) is on trial.
– When a decision has been made:
• Nothing has been proved.
• It is just a decision.
• All decisions can lead to errors (Types I and II).
– If the decision is to "Reject Ho," then the conclusion should read "There is sufficient evidence at the α level of significance to show that" and state the alternative hypothesis Ha.
– If the decision is to "Fail to Reject Ho," then the conclusion should read "There isn't sufficient evidence at the α level of significance to show that" and state the alternative hypothesis.
Notes
Analyze Phase
Hypothesis Testing Normal Data Part 1
Overview
The core fundamentals of this phase are Hypothesis Testing, Tests for Central Tendency, Tests for Variance and ANOVA. We will examine the meaning of each of these and show you how to apply them.

Welcome to Analyze
"X" Sifting
Inferential Statistics
Intro to Hypothesis Testing
Sample Size
Hypothesis Testing ND P1
– Testing Means
– Analyzing Results
Hypothesis Testing ND P2
Hypothesis Testing NND P1
Hypothesis Testing NND P2
Wrap Up & Action Items
T-tests are used to compare a Mean against a target and to compare Means from two different
samples and to compare paired data. When comparing multiple Means it is inappropriate to use a t-
test. Analysis of variance or ANOVA is used when it is necessary to compare more than two Means.
t-tests are used:
– To compare a Mean against a target.
• i.e., the team made improvements and wants to compare the Mean against a target to see if they met the target.

"They don't look the same to me!"
1 Sample t
Here we are looking for the region in which we can be 95% sure our true population Mean will lie. This is based on a calculated average, Standard Deviation, number of trials and a given alpha risk of .05.
A 1-sample t-test is used to compare an expected population Mean to a target.

In order for the Mean of the sample to be considered not significantly different from the target, the target must fall within the confidence interval of the sample Mean.

MINITABTM performs a one sample t-test or t-confidence interval for the Mean. Use 1-sample t to compute a confidence interval and perform a Hypothesis Test of the Mean when the population Standard Deviation, σ, is unknown, for a one- or two-tailed 1-sample t.
If you remember from earlier, 95% of the area under the curve of a Normal Distribution falls within plus
or minus 2 Standard Deviations. Confidence intervals are based on your selected alpha level, so if you
selected an alpha of 5%, then the confidence interval would be 95% which is roughly plus or minus 2
Standard Deviations. Using your eye to guesstimate you can see that the target value falls within plus or minus 2 Standard Deviations of the sampling distribution of sample size 2.
If you used a sample of 30, could you tell if the target was different? Just using your eye it appears
that the target is outside the 95% confidence interval of the Mean. Luckily, MINITABTM makes this very
easy…
Sample Size
To determine proper sample size in MINITABTM: instead of going through the dreadful hand calculations of sample size, we will use MINITABTM. Three fields must be filled in and one left blank in the sample size window; MINITABTM will solve for the field left blank.

If you want to know the sample size, you must enter the difference, which is the shift that must be detected. It is common to state the difference in terms of "generic" Standard Deviations when you do not have an estimate for the Standard Deviation of the process. For example, if you want to detect a shift of 1.5 Standard Deviations, enter that in difference and enter 1 for Standard Deviation. If you knew the Standard Deviation was 0.8, then enter it for Standard Deviation and 1.2 for the difference (which is a 1.5 Standard Deviation shift in terms of real values).
If you are unsure of the desired difference, or in many cases simply get stuck with a sample size that you didn't have a lot of control over, MINITABTM will tell you how much of a difference can be detected. You as a practitioner must be careful when drawing Practical Conclusions because it is possible to have statistical significance without practical significance. In other words, do a reality check. MINITABTM has made it easy to see an assortment of sample sizes and differences.
1-Sample t Example
3. 1-sample t-test (population Standard Deviation unknown, comparing to target).

α = 0.05  β = 0.10

4. Sample Size:
• Open the MINITABTM worksheet: Exh_Stat.MTW
• Use the C1 column: Values
– In this case, the new supplier sent 9 samples for evaluation.
– How much of a difference can be detected with this sample?
Hypothesis Testing
Follow along in
MINITABTM, as you can
see, we will be able to
detect a difference of
1.23 with the sample of
9.
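MINITABTM finds that detectable difference numerically. A standard textbook approximation for the detectable shift of a 1-sample t-test, diff ≈ (t(1−α/2, n−1) + t(power, n−1)) / √n, reproduces the 1.23 quoted here. The Python/scipy sketch below is an assumption of tooling, not part of the course:

```python
from scipy.stats import t

n, alpha, power = 9, 0.05, 0.90
df = n - 1

# Detectable difference in units of Standard Deviation (sigma entered as 1):
# sum of the two-sided critical t quantile and the power quantile, over sqrt(n)
diff = (t.ppf(1 - alpha / 2, df) + t.ppf(power, df)) / n ** 0.5

print(round(diff, 2))  # 1.23 -- matches the MINITAB result for a sample of 9
```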
Now refer to the road map for Hypothesis Testing, you must first check for Normality. In MINITABTM
select “Stats>Basic Statistics>Normality Test”. For the “Variable Fields” double-click on “Values” in
the left-hand box. Once this is complete select “OK”.
Since the P-value is greater than 0.05 we fail to reject the null hypothesis that the data are Normal.
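MINITABTM's Normality test is Anderson-Darling. To reproduce the check outside MINITABTM, scipy offers the same test (it reports critical values rather than a P-value) plus Shapiro-Wilk, which does return a P-value. The data below are hypothetical stand-ins, not the worksheet values:

```python
from scipy.stats import anderson, shapiro

# Hypothetical stand-in for the 9 supplier samples (not the Exh_Stat.MTW data)
values = [4.5, 4.6, 4.7, 4.75, 4.8, 4.85, 4.9, 5.0, 5.1]

# Anderson-Darling: Normality is plausible if the statistic stays below the
# critical value tabulated for the 5% significance level.
result = anderson(values, dist='norm')
crit_5pct = result.critical_values[list(result.significance_level).index(5.0)]
print(result.statistic < crit_5pct)

# Shapiro-Wilk returns a P-value directly: fail to reject Normality if p > 0.05.
stat, p = shapiro(values)
print(p > 0.05)
```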
(Normality test graph for the Values column, 4.2 to 5.4.)

Are the data in the Values column Normal?
1-Sample t Example
Perform the one sample t-test. In MINITABTM select "Stat>Basic Statistics>1-Sample t". From the left-hand box double-click on "Values".

Click "Graphs" – select all 3. Click "Options" – in CI enter 95.

In the "Options" button there is a selection for the alternative hypothesis; the default is "not equal," which corresponds to our hypothesis. If your alternative hypothesis were greater than or less than, you would have to change the default.
Histogram of Values: the hypothesized value of 5 is noted as Ho, the null hypothesis. Note our target Mean (represented by the red Ho) is outside our population confidence boundaries, which tells us there is a significant difference between the population and the target Mean.
Boxplot of Values (with Ho and 95% t-confidence interval for the mean), showing X̄ and Ho on a scale from 4.4 to 5.1.

Individual Value Plot of Values (with Ho and 95% t-confidence interval for the mean).
As you will see the conclusion is the same, but the Dot Plot is just another representation of data.
Session Window

One-Sample T: Values
Test of mu = 5 vs not = 5

s = sqrt( Σ (Xi − X̄)² / (n − 1) )        SE Mean = s / √n

N – sample size
Mean – calculated mathematical average
StDev – calculated individual Standard Deviation (classical method)
SE Mean – calculated Standard Deviation of the distribution of the Means
Confidence Interval: our population average will fall between 4.5989 and 4.9789.

t = (X̄ − Target) / (s / √n) = (4.79 − 5.00) / (0.247 / √9) = −2.56
T-Distribution critical values by degrees of freedom:

df     .600    .700    .800    .900    .950    .975    .990    .995
1      0.325   0.727   1.376   3.078   6.314   12.706  31.821  63.657
2      0.289   0.617   1.061   1.886   2.920   4.303   6.965   9.925
3      0.277   0.584   0.978   1.638   2.353   3.182   4.541   5.841
4      0.271   0.569   0.941   1.533   2.132   2.776   3.747   4.604
5      0.267   0.559   0.920   1.476   2.015   2.571   3.365   4.032

(Critical regions for this test: −2.306 and +2.306, centered at 0.)
If the calculated t-value lies anywhere in the critical regions, reject the null hypothesis.
– The data supports the alternative hypothesis that the estimate for the Mean of the population is not 5.0.
Here is the formula for the confidence interval. Notice we get the same results as MINITABTM.

The formula for a two-sided t-test is:

X̄ − t(α/2, n−1) · s/√n ≤ μ ≤ X̄ + t(α/2, n−1) · s/√n

or

X̄ ± t_crit · SE Mean = 4.7889 ± 2.306 * 0.0824 = 4.5989 to 4.9789
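The same numbers can be reproduced from the summary statistics alone. A Python/scipy sketch (scipy is an assumption of tooling; the course uses MINITABTM):

```python
from scipy.stats import t

n, xbar, s, target = 9, 4.7889, 0.2472, 5.00

se = s / n ** 0.5                        # SE Mean = s / sqrt(n)
t_stat = (xbar - target) / se            # test statistic
t_crit = t.ppf(0.975, n - 1)             # 2.306 for a two-sided 95% CI, df = 8
ci = (xbar - t_crit * se, xbar + t_crit * se)
p_value = 2 * t.cdf(-abs(t_stat), n - 1)

print(round(t_stat, 2))                  # -2.56
print(round(ci[0], 4), round(ci[1], 4))  # 4.5989 4.9789
print(p_value < 0.05)                    # True: reject Ho, the Mean is not 5.0
```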
1-Sample t Exercise

3. Are we on Target?

Because we used the option of "Graphs", we get a nice visualization of the data in a Histogram of ppm VOC (with Ho and 95% t-confidence interval for the mean).

Because the null hypothesis falls within the confidence interval, you know we will "fail to reject" the null hypothesis and accept that the equipment is running at the target of 32.0 ppm VOC.
(Hypothesis Testing roadmap for Continuous Data.)
2 Sample t-test

A 2-sample t-test is used to compare two Means.

Stat > Basic Statistics > 2-Sample t

MINITABTM performs an independent two-sample t-test and generates a confidence interval. Use 2-Sample t to perform a Hypothesis Test and compute a confidence interval of the difference between two population Means when the population Standard Deviations, σ's, are unknown.

Notice the difference in the hypothesis for a two-tailed vs. a one-tailed test. This terminology is only used to know which column to look down in the t-table.
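As a sketch of what the test does (hypothetical data, not the Furnace.MTW worksheet; Python/scipy assumed):

```python
from scipy.stats import ttest_ind

# Hypothetical BTU.In readings for two damper types (illustrative only)
damper1 = [9.2, 10.1, 8.7, 11.3, 9.9, 10.5, 9.4, 10.8]
damper2 = [10.0, 10.6, 9.8, 11.1, 10.4, 9.7, 10.9, 10.2]

# equal_var=True is the pooled 2-sample t-test; use equal_var=False
# (Welch's test) when a test of variances says the spreads differ.
stat, p = ttest_ind(damper1, damper2, equal_var=True)

if p < 0.05:
    print("Reject Ho: the two Means differ.")
else:
    print("Fail to reject Ho: no significant difference between the Means.")
```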
Sample Size

To determine proper sample size in MINITABTM, select "Stat>Power and Sample Size>2-Sample t" and follow the same steps that were taken for 1-Sample t. Three fields must be filled in and one left blank.
2-Sample t Example
Now in Step 4.
Open the worksheet 4 . Sa mple Size:
in MINITABTM • Open the MIN ITABTM worksheet: Furnace.MTW
called:
g the data to see how the data is coded.
• Scroll through
“Furnace
Furnace.MTW”
MTW”
• In order to work with the data in the BTU.In column, we will need
How is the data to unstack the data by damper type.
coded?
2-Sample t Example
Notice the “unstacked” data for each damper. WE NOW HAVE TWO COLUMNS.
2-Sample t Example
For the field “Sample Sizes:” enter 40 space 50 because our data set has unequal sample sizes
which is not uncommon. The smallest difference that can be detected is based on the smallest
sample size, so in this case it is: 0.734.
Probability Plot of BTU.In_1 (Normal): Mean 9.908, StDev 3.020, N 40, AD 0.475, P-Value 0.228.
The data is considered Normal since the P-value is greater than 0.05.
Probability Plot of BTU.In_2 (Normal): Mean 10.14, StDev 2.767, N 50, AD 0.190, P-Value 0.895.
This is the Normality Plot for damper 2. Is the data Normal? It is Normal, continuing down the
roadmap…
(Test for Equal Variances for BTU.In by Damper. Levene's Test: Test Statistic 0.00, P-Value 0.996.)
The P-value of 0.558 indicates that there is no statistically significant difference in variance.
Box Plot

Boxplot of BTU.In by Damper.
The Box Plots do not show much of a difference between the dampers.
Two-Sample T-Test (Variances Equal)

Ho: μ1 = μ2
Ha: μ1 ≠ (or < or >) μ2

Calculated average and Standard Deviation for each sample: s = sqrt( Σ (Xi − X̄)² / (n − 1) )

The pooled Standard Deviation weights the samples by (N1 − 1) and (N2 − 1) degrees of freedom:

Sp = sqrt( ((N1 − 1)s1² + (N2 − 1)s2²) / (N1 + N2 − 2) ),  SE Mean = Sp · sqrt(1/N1 + 1/N2)
Exercise
2. Statistical Problem:
Ho:μ1 = μ2
Ha:μ1 ≠ μ2
To unstack the data follow the steps here. This will generate two new columns of data shown on the
next page…
By unstacking the data we now have the Clor.Lev data separated by the distributor it came from:
• Clor.Lev_Post_1 = Distributor 1
• Clor.Lev_Post_2 = Distributor 2
Now let's move on to determining the correct sample size.
We want to determine the smallest difference that can be detected based on our data. In this case: 0.7339, rounded to 0.734.
The results show us a P-value of 0.154 so our data is Normal. Recall if the P-value is greater than
.05 then we will consider our data Normal.
(Probability Plots of Clor.Lev_Post_1 and Clor.Lev_Post_2.)
Look at the P-value of 0.574. This tells us that there is no statistically significant difference in the variance of these two data sets. What does this mean? We can now run a 2-sample t-test with equal variances.
(Levene's Test: Test Statistic 0.00, P-Value 0.986, for Clor.Lev_Post by Distributor.)
Boxplot of Clor.Lev_Post by Distributor

"Hmm, we're a lot alike!"
The Box Plots show very little difference between the Distributors; also note the P-value in the Session Window – there is no difference between the two Distributors.
(Hypothesis Testing roadmap for Continuous Data.)
Normality Test

(Probability Plots of Sample 1 and Sample 3, Normal. Sample 1: Mean 4.853, StDev 1.020, N 100, AD 0.374, P-Value 0.411.)

Our data sets are Normally Distributed.
We use the F-Test Statistic because our data is Normally Distributed.

F-Test: Test Statistic 0.106, P-Value 0.000
Levene's Test: Test Statistic 67.073, P-Value 0.000

P-Value is less than 0.05: our variances are not equal.

(Boxplots of Raw Data, stacked, showing the Medians of the Samples.)
This is the output from MINITABTM. Notice that even though the names of the columns in MINITABTM were Sample 1 and Sample 3, MINITABTM used factor levels 1 and 2 to differentiate the outcome. We have to interpret the meaning of the factor levels properly; it is simply the difference between the samples labeled one and three in our worksheet.
UNCHECK the "Assume equal variances" box.
You can see there is very little difference in the 2-Sample t-tests.
Boxplot of Stacked by C4 (lines indicate the Sample Means).

The Box Plot shows no difference between the Means. The overall box is smaller for the sample on the left, which is an indication of the difference in variance.
Individual Value Plot of Stacked vs C4 (lines indicate the Sample Means).

By looking at this Individual Value Plot you can notice a big spread or variance in the data.
Two-Sample T-Test (Variances Not Equal)

Ho: μ1 = μ2 (P-Value > 0.05)
Ha: μ1 ≠ (or < or >) μ2 (P-Value < 0.05)

What does the P-value of 0.996 mean? After conducting a 2-sample t-test there is no significant difference between the Means.
(Hypothesis Testing roadmap for Continuous Data.)
Paired t-test

• Use the Paired t command to compute a confidence interval and perform a Hypothesis Test of the difference between population Means when observations are paired. A paired t-procedure matches responses that are dependent or related in a pair-wise manner. This matching allows you to account for variability between the pairs (delta, δ), usually resulting in a smaller error term, thus increasing the sensitivity of the Hypothesis Test or confidence interval.
– Ho: μδ = μo
– Ha: μδ ≠ μo
• Where μδ is the population Mean of the differences and μo is the hypothesized Mean of the differences, typically zero.
Example
"Just checking your souls, er…soles!"
Example (cont.)
EXH_STAT DELTA.MTW
Paired t-test Example

"Now that's a tee test!"

In MINITABTM open "Stat>Power and Sample Size>1-Sample t". Enter the appropriate Sample Size, Power Value and Standard Deviation.

MINITABTM Session Window:

Power and Sample Size
1-Sample t Test
Testing mean = null (versus not = null)
Calculating power for mean = null + difference
Alpha = 0.05  Assumed standard deviation = 1

Sample Size  Power  Difference
10           0.9    1.15456

This means we will only be able to detect a difference of only 1.15 if the Standard Deviation is equal to 1.
Given the sample size of 10 we will be able to detect a difference of 1.15. If this were your process you would need to decide if this was good enough. In this case, is a difference of 1.15 enough to practically want to change the material used for the soles of the children's shoes?
Paired t-test Example

Probability Plot of AB Delta (Normal): Mean 0.41, StDev 0.3872, N 10, AD 0.261, P-Value 0.622.
1-Sample t
Box Plot
Analyzing the Box Plot we see that the null hypothesis falls outside the confidence interval, so we
reject the null hypothesis. The P-value is also less than 0.05. Given this we are 95% confident that
there is a difference in the wear between the two materials used for the soles of children’s shoes.
Paired T-Test

Click on "Graphs" and select the graphs you would like to generate.
Boxplot of Differences (with Ho and 95% t-confidence interval for the mean)

The P-Value from this Paired T-Test tells us the difference in materials is statistically significant.
Paired T-Test and CI: Mat-A, Mat-B

Paired T for Mat-A - Mat-B

             N    Mean       StDev     SE Mean
Mat-A        10   10.6300    2.4513    0.7752
Mat-B        10   11.0400    2.5185    0.7964
Difference   10   -0.410000  0.387155  0.122429

95% CI for mean difference: (-0.686954, -0.133046)
T-Test of mean difference = 0 (vs not = 0): T-Value = -3.35  P-Value = 0.009
As you will see the conclusions are the same, but just presented differently.
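The session window values can be reproduced from the summary statistics of the differences. A Python/scipy check (an assumption of tooling; the course uses MINITABTM):

```python
from scipy.stats import t

n, dbar, s_d = 10, -0.41, 0.387155   # N, Mean and StDev of the differences

se = s_d / n ** 0.5                  # SE Mean of the differences: 0.122429
t_stat = dbar / se
p_value = 2 * t.cdf(-abs(t_stat), n - 1)
t_crit = t.ppf(0.975, n - 1)
ci = (dbar - t_crit * se, dbar + t_crit * se)

print(round(t_stat, 2))                  # -3.35
print(round(p_value, 3))                 # 0.009
print(round(ci[0], 3), round(ci[1], 3))  # -0.687 -0.133
```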
Calc>Calculator

Histogram of TX_MX-Diff (with Ho and 95% t-confidence interval for the mean).
(Hypothesis Testing roadmap for Continuous Data.)
You have now completed Analyze Phase – Hypothesis Testing Normal Data Part 1.

Notes
Analyze Phase
Hypothesis Testing Normal Data Part 2
Overview
We are now moving into Hypothesis Testing Normal Data Part 2, where we will address Calculating Sample Size, Variance Testing and Analyzing Results.

Welcome to Analyze
"X" Sifting
Inferential Statistics
Intro to Hypothesis Testing
Hypothesis Testing ND P1
– Calculate Sample Size
Wrap Up & Action Items
Tests of Variance

Tests of Variance are used for both Normal and Non-Normal data.

Normal Data
– 1 Sample to a target
– 2 Samples – F-Test
– 3 or More Samples – Bartlett's Test

Non-Normal Data
– 2 or more samples – Levene's Test
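A sketch of these choices in Python/scipy (hypothetical data; scipy provides Bartlett's and Levene's tests directly, while the two-sample F-test is built by hand from the F distribution):

```python
from statistics import variance
from scipy.stats import f, bartlett, levene

# Hypothetical measurements; the second sample is deliberately more spread out
sample1 = [4.6, 5.1, 4.8, 5.3, 4.9, 5.0, 4.7, 5.2]
sample2 = [4.4, 5.6, 4.2, 5.8, 4.6, 5.4, 4.0, 6.0]

# F-Test for 2 samples (Normal data): ratio of sample variances vs. F distribution
f_stat = variance(sample1) / variance(sample2)
df1, df2 = len(sample1) - 1, len(sample2) - 1
p_f = 2 * min(f.cdf(f_stat, df1, df2), f.sf(f_stat, df1, df2))

# Bartlett's Test (Normal data, 2 or more samples)
stat_b, p_b = bartlett(sample1, sample2)

# Levene's Test (robust choice for Non-Normal data)
stat_l, p_l = levene(sample1, sample2)

print(p_f < 0.05, p_b < 0.05)  # True True: the variances are not equal
```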
1-Sample Variance
Use the sa mple size ca lcula tions for a 1 sa mple t-test since
they a re ra rely performed w ithout performing a 1 sa mple t-
test a s w ell.
1-Sample Variance
4 . Sa mple Size:
• O pen the M IN ITABTM w ork sheet: Ex h_Sta t.M TW
• This is the sa me file used for the 1 Sa mple t ex a mple.
– W e w ill a ssume the sa mple size is a dequa te.
(Graphical Summary: Mean 4.7889, StDev 0.2472, Variance 0.0611, Skewness -0.02863, Kurtosis -1.24215, N 9; 95% Confidence Interval for Mean 4.5989 to 4.9789; 95% Confidence Interval for Median 4.6000 to 5.0772; 95% Confidence Interval for StDev.)

What does this mean from a practical standpoint? They are easier to accomplish in a process than reducing…
3. Equal variance test (F-test, since there are only 2 factors). Check for Normality.

5. Statistical Solution:

Stat>Basic Statistics>Normality Test (use Anderson-Darling)

Ho: Data is Normal
Ha: Data is NOT Normal

According to the graph we have Normal data.
Probability Plot of Rot 1 (Normal): Mean 4.871, StDev 0.9670, N 100, AD 0.306, P-Value 0.559.
Test for Equal Variances for Rot 1

F-Test: Test Statistic 0.74, P-Value 0.298
Levene's Test: Test Statistic 0.53, P-Value 0.469

(95% Bonferroni Confidence Intervals for StDevs, by factor level.)

Use the F-Test for 2 samples of Normally Distributed data. P-Value > 0.05 (0.298): assume equal variance.
Normality Test

Probability Plot of Rot (Normal): Mean 13.78, StDev 7.712, N 18, AD 0.285, P-Value 0.586.

The P-value is > 0.05, so we can assume our data is Normally Distributed.
Test for Equal Variances for Rot (by Temp)

F-Test: Test Statistic 0.68, P-Value 0.598
Levene's Test: Test Statistic 0.05, P-Value 0.824

Ho: σ1 = σ2
Ha: σ1 ≠ σ2

(95% Bonferroni Confidence Intervals for StDevs, Temp levels 10 and 16.)

P-Value > 0.05: there is no statistically significant difference.
You can see there is no statistical difference for variance in Rot based on temperature as a factor.
Since the data is Normally Distributed and we have 2 samples, use F-Test statistic.
Use the F-Test for 2 samples of Normally Distributed data.
Another method for testing for equal variance will allow more than one factor. The Labels are the
factors. The data is the Output.
This time we have Rot as the response and Temp and Oxygen as the factors.
This graph shows a test of equal variance which displays Bonferroni 95% confidence intervals for the response Standard Deviation at each level. As you will see, the Bartlett's and Levene's tests are displayed in the same graph.

Test for Equal Variances for Rot (Temp, Oxygen)

Bartlett's Test: Test Statistic 2.71, P-Value 0.744
Levene's Test: Test Statistic 0.37, P-Value 0.858

A P-value > 0.05 shows an insignificant difference between variances.

Use Bartlett's Test when the data is Normal; it matches the Session Window output (Test statistic = 2.71, p-value = 0.744).
Does the Session Window have the same P-values as the Graphical Analysis?
First we want to do a graphical summary of the two samples from the two suppliers.
In "Variables:" enter 'ppm VOC'.
The P-value is greater than 0.05 for both Anderson-Darling Normality Tests so we conclude the
samples are from Normally Distributed populations because we “failed to reject” the null hypothesis
that the data sets are from Normal Distributions.
Continue to
determine if
they are of
equal variance.
For "Response:" enter 'ppm VOC'. Note MINITABTM defaults to a 95% confidence interval, which is exactly the level we want to test for this problem.
(Test for Equal Variances for ppm VOC by RM Supplier: F-test and Levene's Test shown with 95% Bonferroni Confidence Intervals for StDevs.)

The P-value of the F-test…
(Hypothesis Testing roadmap for Continuous Data.)
Purpose of ANOVA
Analysis of Variance (ANOVA) is used to investigate and model the relationship between a response variable and one or more independent variables.
Is the between group variation large enough to be distinguished from the within group variation?
[Diagram: Total (Overall) Variation across two group distributions with Means μ1 and μ2, showing the between-group difference (δ).]
Calculating ANOVA
Where:
G = the number of groups (levels in the study)
x_ij = the individual in the jth group
n_j = the number of individuals in the jth group or level
X̿ = the grand Mean
X̄_j = the Mean of the jth group or level

Between Group Variation (δ): Σ_j n_j (X̄_j − X̿)²
Total (Overall) Variation: Σ_j Σ_i (x_ij − X̿)²
Within Group Variation: Σ_j Σ_i (x_ij − X̄_j)²

Calculating ANOVA

1 − (1 − α)^k
The reason we don’t use a t-test to evaluate a series of Means is because the alpha risk increases as the number of Means increases. If we had 7 pairs of Means and an alpha of 0.05, our actual alpha risk could be as high as 30%. Notice we did not say it was 30%, only that it could be as high as 30%, which is quite unacceptable.
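The inflation of alpha risk described above can be checked directly with the 1 − (1 − α)^k formula. A minimal sketch (the seven comparisons and α = 0.05 come from the example above):

```python
# Family-wise alpha risk when making k independent comparisons,
# each at significance level alpha: 1 - (1 - alpha)^k
alpha = 0.05
k = 7  # seven pairs of Means, as in the example above

family_risk = 1 - (1 - alpha) ** k
print(f"Risk of at least one false alarm: {family_risk:.3f}")  # about 0.302
```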
Three Samples
We have three potential suppliers that claim to have equal levels of quality. Supplier B provides a considerably lower purchase price than either of the other two vendors. We would like to choose the lowest cost supplier but we must ensure that we do not affect the quality of our raw material.
We would like to test the data to determine whether there is a difference between the three suppliers.
[Graphs: Probability Plots (Normal) for the three suppliers. Supplier A: StDev 0.4401, N 5, AD 0.246, P-Value 0.568. Supplier B: Mean 3.968, StDev 0.2051, N 5, AD 0.314, P-Value 0.385. Supplier C: Mean 4.03, StDev 0.4177, N 5, AD 0.148, P-Value 0.910.]
[Graph: Test for Equal Variances for the three suppliers, with 95% Bonferroni Confidence Intervals for StDevs. Bartlett's Test: P-Value 0.348. Levene's Test: Test Statistic 0.59, P-Value 0.568.]
ANOVA in MINITABTM
Stat>ANOVA>One-Way Unstacked

[Graph: Individual value plot of the Data (approximately 3.0 to 4.2) for Supplier A, Supplier B, and Supplier C.]
Looking at the P-value the conclusion is we fail to reject the null hypothesis. According to the data
there is no significant difference between the Means of the 3 suppliers.
[Roadmap: Normal Data, P-value > .05, No Difference.]

Stat>ANOVA>One-Way

ANOVA
Before looking up the F critical value you must first know what the degrees of freedom are. The ANOVA test statistic uses variance between the Means divided by variance within the groups. Therefore, the numerator degrees of freedom would be 3 suppliers minus 1, or 2 degrees of freedom. The denominator would be 5 samples minus 1 (for each supplier) multiplied by 3 suppliers, or 12 degrees of freedom. As you can see the critical F value is 3.89, and since the calculated F of 1.40 is not close to the critical value we fail to reject the null hypothesis.
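The same comparison of calculated F against critical F can be sketched outside Minitab with scipy. The supplier measurements below are hypothetical stand-ins (the book's raw data is not reproduced here), but the critical value for 2 and 12 degrees of freedom is the 3.89 quoted above:

```python
from scipy import stats

# Hypothetical measurements, five per supplier (placeholders, not the book's data)
supplier_a = [3.2, 3.5, 3.9, 3.6, 4.1]
supplier_b = [3.8, 4.0, 3.9, 4.1, 4.0]
supplier_c = [3.6, 4.3, 4.0, 4.2, 4.0]

f_calc, p_value = stats.f_oneway(supplier_a, supplier_b, supplier_c)

# Critical F at alpha = 0.05 with 3 - 1 = 2 numerator and
# 3 * (5 - 1) = 12 denominator degrees of freedom
f_crit = stats.f.ppf(0.95, dfn=2, dfd=12)
print(f"F-calc = {f_calc:.2f}, F-critical = {f_crit:.2f}")  # F-critical = 3.89
```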
Test for Equal Variances: Suppliers vs ID

One-way ANOVA: Suppliers versus ID

Analysis of Variance for Supplier
Source      DF      SS      MS     F      P
ID           2   0.384   0.192  1.40  0.284
Error       12   1.641   0.137
Total       14   2.025

                                     Individual 95% CIs For Mean
                                     Based on Pooled StDev
Level       N    Mean   StDev   ----------+---------+---------+------
Supplier A  5  3.6640  0.4401   (-----------*-----------)
Supplier B  5  3.9680  0.2051           (-----------*-----------)
Supplier C  5  4.0300  0.4177            (-----------*-----------)
                                ----------+---------+---------+------
Pooled StDev = 0.3698              3.60      3.90      4.20

[Table: excerpt of the F distribution table. With 2 numerator and 12 denominator degrees of freedom the critical F value is 3.89, which is compared to the calculated F of 1.40.]
Sample Size
Let’s check on how much difference we can see with a sample of 5.

Will having a sample of 5 show a difference? After crunching the numbers, a sample of 5 can only detect a difference of 2.56 Standard Deviations, which means that the Means would have to be at least 2.56 Standard Deviations apart before we could see a difference. To alleviate this problem a larger sample should be used. With a larger sample you would be able to have a more sensitive reading for the Means and the variance.

Power and Sample Size
One-way ANOVA
Alpha = 0.05  Assumed Standard Deviation = 1  Number of Levels = 3
Sample Size  Power  SS Means  Maximum Difference
          5    0.9   3.29659             2.56772
The sample size is for each level.
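The power figure Minitab reports can be reproduced from the noncentral F distribution. A sketch, assuming the quoted values (n = 5 per level, 3 levels, σ = 1, SS of the Means 3.29659) and that power is computed with noncentrality λ = n × SS(Means)/σ²:

```python
from scipy import stats

# Values quoted in the Minitab "Power and Sample Size" output above
n, levels, alpha = 5, 3, 0.05
ss_means = 3.29659            # sum of squared deviations of the level Means
sigma = 1.0                   # assumed Standard Deviation

# Noncentrality parameter of the F distribution under the alternative
nc = n * ss_means / sigma**2
dfn, dfd = levels - 1, levels * (n - 1)

f_crit = stats.f.ppf(1 - alpha, dfn, dfd)
power = 1 - stats.ncf.cdf(f_crit, dfn, dfd, nc)
print(f"Power = {power:.3f}")  # approximately 0.9, as in the table above
```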
ANOVA Assumptions
1. Observations are adequately described by the model.
2. Errors are Normally and independently distributed.
3. Homogeneity of variance among factor levels.
Residual Plots
To generate the residual plots in MINITABTM select “Stat>ANOVA>One-way Unstacked>Graphs”,
then select “Individual value plot” and check all three types of plots.
Stat>ANOVA>One-Way Unstacked>Graphs
Histogram of Residuals

[Graph: Histogram of the Residuals (responses are Supplier A, Supplier B, Supplier C), residuals ranging from about -0.6 to 0.6.]

A Normality plot of the residuals should follow a straight line. The results of our example look good; the Normality assumption is satisfied.
[Graph: Normal Probability Plot of the Residuals (responses are Supplier A, Supplier B, Supplier C), residuals from about -1.0 to 1.0 falling close to a straight line.]
2-Sample t Example
For the field “Sample Sizes:” enter 40 space 50 because our data set has unequal sample sizes, which is not uncommon. The smallest difference that can be detected is based on the smallest sample size; in this case it is 0.734.
[Graph: Residuals Versus the Fitted Values (responses are Supplier A, Supplier B, Supplier C), residuals from -0.50 to 0.75 across fitted values 3.65 to 4.05, showing a random pattern.]
ANOVA Exercise
In “Variables:” enter ‘ppm VOC’.
In “By Variables:” enter ‘Shift’.
[Graphs: Graphical Summary for ppm VOC by Shift.
Shift = 2: Anderson-Darling Normality Test A-Squared 0.37, P-Value 0.334; Mean 34.625, StDev 5.041, Variance 25.411, Skewness -0.74123, Kurtosis 1.37039, N 8; Minimum 25.000, 1st Quartile 31.750, Median 35.500, 3rd Quartile 37.000, Maximum 42.000; 95% CI for Mean (30.411, 38.839), for Median (30.614, 37.322), for StDev (3.333, 10.260).
Shift = 3: Anderson-Darling Normality Test A-Squared 0.24, P-Value 0.658; Mean 28.000, StDev 6.525, Variance 42.571, Skewness 0.06172, Kurtosis -1.10012, N 8; Minimum 19.000, 1st Quartile 22.000, Median 28.000, 3rd Quartile 32.750, Maximum 38.000; 95% CI for Mean (22.545, 33.455), for Median (20.871, 33.322), for StDev (4.314, 13.279).]
[Graph: Test for Equal Variances for ppm VOC by Shift, with 95% Bonferroni Confidence Intervals for StDevs. Bartlett's Test: Test Statistic 0.63, P-Value 0.729. Levene's Test: Test Statistic 0.85, P-Value 0.440.]
Since our residuals look Normally Distributed and randomly patterned, we will assume our analysis is
correct.
[Graphs: four-in-one residual plots for the ANOVA; Normal probability plot of the residuals, residuals versus the fitted values, histogram of the residuals, and residuals versus observation order.]
Since the P-value of the ANOVA test is less than 0.05, we “reject” the null hypothesis that the Mean
product quality as measured in ppm VOC is the same from all shifts.
We “accept” the alternate hypothesis that the Mean product quality is different from at least one shift.
You have now completed Analyze Phase – Hypothesis Testing Normal Data Part 2.
Notes
Analyze Phase
Hypothesis Testing Non-Normal Data
Part 1
Overview
The core fundamentals of this phase are Equal Variance Tests and Tests for Medians. We will examine the meaning of each of these and show you how to apply them.

[Roadmap: Welcome to Analyze; “X” Sifting; Inferential Statistics; Intro to Hypothesis Testing; Hypothesis Testing ND P1; Hypothesis Testing ND P2; Hypothesis Testing NND P1: Equal Variance Tests, Tests for Medians; Hypothesis Testing NND P2; Wrap Up & Action Items.]
At this point we have covered the tests for determining significance for Normal Data. We will continue
to follow the roadmap to complete the test for Non-Normal Data with Continuous Data.
Later in the module we will use another roadmap that was designed for Discrete data.
Recall that Discrete data does not follow a Normal Distribution, but because it is not
Continuous Data, there are a separate set of tests to properly analyze the data.
1 Sample t
Why do we care if a data set is Normally Distributed?
When it is necessary to make inferences about the true nature of the
population based on random samples drawn from the population.
When the two indices of interest (X-Bar and s) depend on the data
being Normally Distributed.
For problem solving purposes, because we don’t want to make a bad decision: having Normal Data is so critical that with EVERY statistical test, the first thing we do is check for Normality of the data.
Recall the four primary causes for Non-normal data:
Skewness – Natural and Artificial Limits
Mixed Distributions - Multiple Modes
Kurtosis
Granularity
We will focus on skewness for the remaining tests for Continuous Data.
[Roadmap: Continuous Data, Non-Normal.]
Now we will continue down the Non-Normal side of the roadmap. Notice this slide is primarily for tests
of Medians.
Sample Size
– Ho: σ1 = σ2 = σ3 …
– Ha: At least one is different.
You have already seen this command in the last module, this is simply the application for Non-
Normal data. The question is: are any of the Standard Deviations or variances statistically different?
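For Non-Normal Data this hypothesis is usually checked with Levene's test, which does not rely on Normality. A minimal sketch outside Minitab with made-up skewed samples (the data and group names are illustrative only):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)
# Three made-up skewed samples standing in for process data
group1 = rng.exponential(scale=1.0, size=30)
group2 = rng.exponential(scale=1.0, size=30)
group3 = rng.exponential(scale=1.5, size=30)

# center='median' gives the robust Brown-Forsythe variant of Levene's test
stat, p_value = stats.levene(group1, group2, group3, center="median")
if p_value < 0.05:
    print("At least one variance is different")
else:
    print("No significant difference in variances")
```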
P-Value < 0.05 (0.000): assume the data is not Normally distributed.

[Graph: probability plot of Rot 2, from the data file EXH_AOV.MTW.]
[Graph: Test for Equal Variances for Rot 2 by Factors2 (levels 1 and 2), with 95% Bonferroni Confidence Intervals for StDevs. F-Test: Test Statistic 1.75, P-Value 0.053. Levene's Test: Test Statistic 0.03, P-Value 0.860.]
When testing more than 2 samples with a Normal distribution, use Bartlett's Test to determine whether multiple Normal distributions have equal variance. Levene's Test is our focus for this module when working with Non-Normal distributions.
For the Data to be Normal the P-value must be greater than 0.05.
Based on the P-value, the variables being analyzed are Non-normal Data. As you can see the data illustrates a P-value of 0.247, which is more than 0.05. As a result, there is no significant difference between the variances of CallperWk1 and CallperWk2.
Nonparametric Tests
Non-parametric Hypothesis Testing works the same way as parametric testing: evaluate the P-value in the same manner.

[Diagram: a target Median compared with the sample Medians X̃1 and X̃2.]
MINITABTM’s Nonparametrics
1-Sample Sign: performs a one-sample sign test of the Median and calculates the corresponding
point estimate and confidence interval. Use this test as an alternative to one-sample Z and one-
sample t-tests.
1-Sample Wilcoxon: performs a one-sample Wilcoxon signed rank test of the Median and calculates the corresponding point estimate and confidence interval (more discriminating or efficient than the sign test). Use this test as a nonparametric alternative to one-sample Z and one-sample t-tests.
Mann-Whitney: performs a Hypothesis Test of the equality of two population Medians and
calculates the corresponding point estimate and confidence interval. Use this test as a
nonparametric alternative to the two-sample t-test.
Kruskal-Wallis: performs a Hypothesis Test of the equality of population Medians for a one-way design. This test is more powerful than Mood’s Median (the confidence interval is narrower, on average) for analyzing data from many populations, but is less robust to outliers. Use this test as an alternative to the one-way ANOVA.
Mood’s Median Test: performs a Hypothesis Test of the equality of population Medians in a one-
way design. Test is similar to the Kruskal-Wallis Test. Also referred to as the Median test or sign
scores test. Use as an alternative to the one-way ANOVA.
1-Sample Example
4. Sample Size:
This data set has 500 samples (well in excess of necessary sample size).
The Statistical Problem is: The null hypothesis is that the Median is equal to 63 and the
alternative hypothesis is the Median is not equal to 63.
Open the MINITABTM Data File: “DISTRIB1.MTW”. Next you have a choice of either performing a
1-Sample Sign Test or 1-Sample Wilcoxon Test because both will test the Median against a
target. For this example we will perform a 1-Sample Sign Test.
1-Sample Example
Sign Test for Median: Pos Skew

Sign test of Median = 63.00 versus not = 63.00

            N  Below  Equal  Above       P  Median
Pos Skew  500     37      0    463  0.0000   65.70
As you can see the P-value is less than 0.05, so we must reject the null hypothesis, which means we have data that supports the alternative hypothesis that the Median is different than 63. The actual Median of 65.70 is shown in the Session Window. Since the Median is greater than the target value, it seems the new process is not as good as we may have hoped.
Perform the same steps as the 1-Sample Sign to use the 1-sample Wilcoxon.
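Under the hood, the 1-Sample Sign Test reduces to a binomial test on the counts above and below the hypothesized Median. A sketch using the counts from the Session Window above (37 below, 463 above, ties dropped):

```python
from scipy import stats

below, above = 37, 463  # counts from the Session Window above

# Under H0 (Median = 63), each observation is equally likely to fall
# above or below 63, so the count above follows Binomial(n, 0.5)
result = stats.binomtest(above, n=below + above, p=0.5)
print(f"P-value = {result.pvalue:.4g}")  # far below 0.05: reject H0
```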
1-Sample Example
For the 1-sample sign test, select a confidence interval level of 95%. As you can see this yields a resulting interval of 65.26 to 66.50. The NLI means a non-linear interpolation method was used to estimate the confidence intervals. As you can see the confidence interval is very narrow.
Since the target of 63 is not within the confidence interval, reject the null hypothesis.
As you will see the confidence interval is even tighter for the Wilcoxon test. Therefore we reject
the null, the Median is higher than the target of 63. Unfortunately, the Median was higher than
the target which is not the desired direction.
HYPOTESTSTUD.MPJ
Stat>Nonparametrics>1-Sample Sign
The Black Belt in this case agrees the Mine Manager is achieving his target of 2.1 tons/day.
We agree!
Mann-Whitney Example
The Mann-Whitney test is used to test if the Medians for 2 samples are different.

2. Ho: M1 = M2
   Ha: M1 ≠ M2
3. Mann-Whitney test.
4. There are 200 data points for each machine, well over the minimum sample necessary.
Mann-Whitney Example
5. Statistical Conclusion

[Graph: Probability Plot of Mach A (Normal). Mean 16.73, StDev 5.284, N 200, AD 0.630, P-Value 0.099.]

When looking at the Probability Plots, one machine yields a P-value less than .05 and the other is Normal. The good news is when performing a Nonparametric Test of 2 Samples, only one has to be Normal. With that, perform the Mann-Whitney test. Since zero (the difference between the 2 Medians) is not contained within the confidence interval we reject the null hypothesis. Also, the last line in the Session Window, where it says “…is significant at 0.0010”, is the equivalent of a P-value for the Mann-Whitney test.

6. Practical Conclusion: The Medians of the machines are different.

Stat>Nonparametric>Mann-Whitney…

If the samples are the same, zero would be included within the confidence interval.

Mann-Whitney Test and CI: Mach A, Mach B

            N  Median
Mach A    200  14.841
Mach B    200  16.346

Point estimate for ETA1-ETA2 is -1.604
95.0 Percent CI for ETA1-ETA2 is (-2.635,-0.594)
W = 36509.0
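The same test is available outside Minitab. A sketch with small made-up samples (the values are illustrative stand-ins, not the Mach A/Mach B data):

```python
from scipy import stats

# Illustrative cycle-time samples from two machines (not the book's data)
mach_a = [14.2, 15.1, 13.8, 14.9, 15.4, 14.0, 13.5, 14.7]
mach_b = [16.1, 15.8, 16.9, 17.2, 15.5, 16.4, 17.0, 16.2]

# Two-sided test of H0: the two population Medians are equal
u_stat, p_value = stats.mannwhitneyu(mach_a, mach_b, alternative="two-sided")
print(f"P-value = {p_value:.4f}")
if p_value < 0.05:
    print("Reject H0: the Medians are different")
```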
Exercise
The final 2 tests are the Mood’s Median and the Kruskal-Wallis.
Certified Lean Six Sigma Black Belt Book Copyright OpenSourceSixSigma.com
381
[Graph: Summary for Recovery, Location = Savannah. Anderson-Darling Normality Test: A-Squared 0.81, P-Value 0.032; Mean 87.660, StDev 7.944, Variance 63.113, Skewness -0.15286, Kurtosis -1.11764, N 25; Minimum 75.300, 1st Quartile 79.000, Median 87.500, 3rd Quartile 96.550, Maximum 99.200; 95% CI for Mean (84.381, 90.939), for Median (86.179, 90.080), for StDev (6.203, 11.052).]
Notice evidence of outliers in at least 2 of the 3 populations. You could do a Box Plot to get a clearer idea about the outliers.
[Graphs: Summary for Recovery, Location = Bangor. Anderson-Darling Normality Test: A-Squared 0.72, P-Value 0.045; Mean 93.042, StDev 5.918, Variance 35.017, Skewness -1.81758, Kurtosis 4.66838, N 13; Minimum 76.630, 1st Quartile 90.600, Median 94.800, 3rd Quartile 97.350, Maximum 99.700; 95% CI for Mean (89.466, 96.617), for Median (90.637, 97.036), for StDev (4.243, 9.768).

Summary for Recovery, Location = Ankhar. Anderson-Darling Normality Test: A-Squared 0.86, P-Value 0.022; Mean 88.302, StDev 6.929, Variance 48.008, Skewness -0.105610, Kurtosis 0.182123, N 20; Minimum 73.500, 1st Quartile 85.150, Median 88.425, 3rd Quartile 89.700, Maximum 99.450; 95% CI for Mean (85.059, 91.545), for Median (86.735, 89.299), for StDev (5.269, 10.120).]
[Graph: Test for Equal Variances for Recovery by Location (Ankhar, Bangor, Savannah), with 95% Bonferroni Confidence Intervals for StDevs. Bartlett's Test: Test Statistic 1.33, P-Value 0.514. Levene's Test: Test Statistic 1.02, P-Value 0.367.]
Statistical Solution: Since the P-value of the Mood’s Median test is less than 0.05, we reject the null hypothesis.

Practical Solution: Bangor has the highest recovery of all three facilities.

We observe the confidence intervals for the Medians of the 3 populations. Note there is no overlap of the 95% confidence intervals for Bangor, so we visually know the P-value is below 0.05.

Mood Median Test: Recovery versus Location
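Mood's Median test is also available outside Minitab as scipy's median test. The recovery figures below are illustrative stand-ins for the three locations (not the book's data):

```python
from scipy import stats

# Illustrative recovery percentages for three locations (not the book's data)
savannah = [87.5, 79.0, 96.5, 75.3, 99.2, 86.2, 90.1, 88.4]
bangor = [94.8, 90.6, 97.3, 99.7, 92.5, 95.1, 96.0, 93.8]
ankhar = [88.4, 85.1, 89.7, 73.5, 99.4, 86.7, 88.0, 87.2]

# Counts above/below the grand Median form a contingency table,
# which is then tested with a chi-square statistic
stat, p_value, grand_median, table = stats.median_test(savannah, bangor, ankhar)
print(f"Grand Median = {grand_median:.1f}, P-value = {p_value:.3f}")
```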
Kruskal-Wallis Test
Using the same data set, analyze using the Kruskal-Wallis test.
H = 6.86 DF = 2 P = 0.032
H = 6.87 DF = 2 P = 0.032 (adjusted for ties)
When comparing the Kruskal-Wallis test to the Mood’s Median test, the Kruskal-Wallis test is more powerful, though less robust to outliers. In this case the Kruskal-Wallis Test illustrated the same conclusion.
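The Kruskal-Wallis version of the same comparison can be sketched the same way, again with illustrative data (not the book's Recovery values):

```python
from scipy import stats

# Illustrative recovery percentages for three locations (not the book's data)
savannah = [87.5, 79.0, 96.5, 75.3, 99.2, 86.2, 90.1, 88.4]
bangor = [94.8, 90.6, 97.3, 99.7, 92.5, 95.1, 96.0, 93.8]
ankhar = [88.4, 85.1, 89.7, 73.5, 99.4, 86.7, 88.0, 87.2]

# H0: all population Medians are equal
h_stat, p_value = stats.kruskal(savannah, bangor, ankhar)
print(f"H = {h_stat:.2f}, P-value = {p_value:.3f}")
```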
Exercise
Unequal Variance
Example
This is an example of comparable products. To view these graphs open the data set “Var_Comp.mtw”.

Model A and Model B are similar in nature (not exact), but are manufactured in the same plant.
– Check for Normality: Var_Comp.mtw
[Graphs: Normality plots for Model A (8.5 to 12.0) and Model B (-5.0 to 10.0).]
Does Model B have a larger variance than Model A? The Median for Model B is much lower. How can we capitalize on our knowledge of the process? Let’s look at the data demographics to help us explain the differences between the two processes.
[Graph: Test for Equal Variances for Model A and Model B, with 95% Bonferroni Confidence Intervals for StDevs. Test Statistic 4.47, P-Value 0.049.]
Data Demographics
What clues can explain the difference in variances? This example illustrates how Non-normal Data
can have significant informational content as revealed through data demographics. Sometimes this
is all that is needed to draw conclusions.
[Graph: Dotplot of Model A, Model B over the range -0.0 to 11.2, with the Medians marked.]
Now let’s look at the MINITABTM Session Window. As you can see the P-value is greater than 0.05.

Next we are going to check for variance. Before performing a Test for Equal Variance, should the data be stacked?

Therefore we fail to reject the null hypothesis; there is no difference between a potential Black Belt’s degree and performance.
You have now completed Analyze Phase – Hypothesis Testing Non-Normal Data Part 1.
Notes
Analyze Phase
Hypothesis Testing Non-Normal Data
Part 2
Overview
The core fundamentals of this phase are Tests for Proportions and Contingency Tables. We will examine the meaning of each of these and show you how to apply them.

[Roadmap: Welcome to Analyze; “X” Sifting; Inferential Statistics; Intro to Hypothesis Testing; Hypothesis Testing ND P1; Hypothesis Testing ND P2; Hypothesis Testing NND P1; Hypothesis Testing NND P2: Tests for Proportions, Contingency Tables; Wrap Up & Action Items.]
[Roadmap: Attribute Data. One Factor: One Sample, Two Samples. Two Factors: Two or More Samples.]
We will now continue with the roadmap for Attribute Data. Since Attribute Data is Non-normal by
definition, it belongs in this module on Non-normal Data.
For Continuous Data:
– Capability analysis: a minimum of 30 samples
– Hypothesis Testing: depends on the practical difference to be detected and the inherent variation in the process.

For Attribute Data:
– Capability analysis: a lot of samples
– Hypothesis Testing: a lot, but depends on the practical difference to be detected.

The hypotheses:
– H0: p = p0
– Ha: p ≠ p0

Z_obs = (p̂ − p0) / √( p0(1 − p0) / n )
Now let’s try an example:

4. Sample size:

Take note of how quickly the sample size increases as the alternative proportion goes up. It would require 1402 samples to tell a difference between 98% and 99% accuracy. Our sample of 500 will do because the alternative hypothesis is 96% according to the proportion formula.

After you analyze the data you will see the statistical conclusion is to reject the null hypothesis. What is the Practical Conclusion? (The process is not performing to the desired accuracy of 99%.)

As you can see the Sample Size should be at least 4073 to prove our hypothesis. Yes, you get your bonus since .80 is not within the confidence interval. Because the improvement was 84%, the sample size was sufficient.

Answer: Use an alternative proportion of .82 and a hypothesized proportion of .80; n = 4073. Either you’d better ship a lot of stuff or you’d better improve the process more than just 2%!
Now let us calculate if we receive our bonus…

Out of the 2000 shipments, 1680 were accurate. Was the sample size sufficient?

p̂ = X / n = 1680 / 2000 = 0.84
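The bonus question can be checked with an exact binomial test; the counts (1680 accurate out of 2000 shipments, hypothesized proportion .80) come from the example above:

```python
from scipy import stats

# H0: p = 0.80 versus Ha: p != 0.80, with 1680 accurate of 2000 shipments
result = stats.binomtest(1680, n=2000, p=0.80)
print(f"p-hat = {1680 / 2000:.2f}, P-value = {result.pvalue:.5f}")

# 95% confidence interval for the true proportion
ci = result.proportion_ci(confidence_level=0.95)
print(f"95% CI: ({ci.low:.3f}, {ci.high:.3f})")  # 0.80 falls outside: reject H0
```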
Z_obs = (p̂1 − p̂2 − D) / √( p̂1(1 − p̂1)/n1 + p̂2(1 − p̂2)/n2 )

This is compared to Z critical = Z_α/2.
α    δ    p1    p2    n
5%  .01  0.79  0.8  ___________
5%  .01  0.81  0.8  ___________
5%  .02  0.08  0.1  ___________
5%  .02  0.12  0.1  ___________
5%  .01  0.47  0.5  ___________
5%  .01  0.53  0.5  ___________

Answers: 34,247; 32,986; 4,301; 5,142; 5,831; 5,831
Power and Sample Size

Test for Two Proportions

Testing proportion 1 = proportion 2 (versus not =)
Calculating power for proportion 2 = 0.95
Alpha = 0.05

              Sample  Target
Proportion 1    Size   Power  Actual Power
0.85             188     0.9      0.901451

The sample size is for each group.
A sample of at least 188 is necessary for each group to be able to detect a 10% difference. If you have reason to believe your improved process has only improved to 90% and you would like to be able to prove that improvement is occurring, the sample size of 188 is not appropriate. Recalculate using .90 for proportion 2 and leave proportion 1 at .85. It would require a sample size of 918 for each sample!
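The 188-per-group figure can be reproduced with the textbook Normal-approximation sample-size formula for two proportions (α = 0.05 two-sided, power 0.9); this is a standard approximation that lands on the same answer here:

```python
from math import ceil, sqrt
from scipy import stats

p1, p2 = 0.85, 0.95
alpha, power = 0.05, 0.90
z_a = stats.norm.ppf(1 - alpha / 2)   # about 1.96
z_b = stats.norm.ppf(power)           # about 1.28

# Classic two-proportion sample-size formula:
# pooled variance under H0, unpooled variance under Ha
p_bar = (p1 + p2) / 2
n = (z_a * sqrt(2 * p_bar * (1 - p_bar))
     + z_b * sqrt(p1 * (1 - p1) + p2 * (1 - p2))) ** 2 / (p2 - p1) ** 2
print(f"n per group = {ceil(n)}")  # 188
```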
The data shown was gathered for two processes. The following data were taken:

                     Total Samples  Accurate
Before Improvement             600       510
After Improvement              225       212

Calculate proportions:

Before Improvement: 600 samples, 510 accurate: p̂1 = X1 / n1 = 510 / 600 = 0.85
After Improvement: 225 samples, 212 accurate: p̂2 = X2 / n2 = 212 / 225 = 0.942

Difference = p(1) - p(2)
Estimate for difference: -0.0922222
95% CI for difference: (-0.134005, -0.0504399)
Test for difference = 0 (vs not = 0): Z = -4.33  P-Value = 0.000
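The Z = -4.33 in the Session Window can be reproduced by hand with the unpooled Z_obs formula shown earlier, using the before/after counts:

```python
from math import sqrt
from scipy import stats

x1, n1, x2, n2 = 510, 600, 212, 225
p1, p2 = x1 / n1, x2 / n2

# Unpooled standard error, matching the Z_obs formula for two proportions
se = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
z = (p1 - p2 - 0) / se            # D = 0 under the null hypothesis
p_value = 2 * stats.norm.sf(abs(z))
print(f"Z = {z:.2f}, P-value = {p_value:.5f}")  # Z = -4.33, P well below 0.05
```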
1. Who is worse?
2. Is the sample size large enough?

Boris: p̂1 = X1 / n1 = 47 / 356 = 0.132
Igor:  p̂2 = X2 / n2 = 99 / 571 = 0.173
Results: As you can see we fail to reject the null hypothesis with the data given. One conclusion is the sample size is not large enough. It would take a minimum sample of 1673 to distinguish the sample proportions for Boris and Igor.

Now let’s see what the minimum sample size will be…

Stat>Power and Sample Size>2 Proportions

              Sample  Target
Proportion 1    Size   Power  Actual Power
0.17            1673     0.9      0.900078
Contingency Tables
Contingency Tables are used to simultaneously compare more than two sample proportions with each other.

Statisticians have shown that the following statistic forms a chi-square distribution when H0 is true:

Σ (observed − expected)² / expected

where “observed” is the sample frequency, “expected” is the calculated frequency based on the null hypothesis, and the summation is over all cells in the table.

That? …oh, that’s my contingency table!
Chi-square Test

χ² = Σ_{i=1..r} Σ_{j=1..c} (O_ij − E_ij)² / E_ij

Where:
O = the observed value (from sample data)
E = the expected value
E_ij = (F_row × F_col) / F_total
r = number of rows
c = number of columns
F_row = total frequency for that row
F_col = total frequency for that column
F_total = total frequency for the table

χ²_critical = χ²_{α,ν}  (from the Chi-Square Table)
ν = degrees of freedom [(r-1)(c-1)]
Wow! Can you believe this is the math behind a Contingency Table? Thank goodness for MINITABTM. Now let’s do an example.
Note the data gathered in the table. Curley isn’t looking too good right now (as if he ever did).
0.306 × 45 = 13.8
0.694 × 38 = 26.4

(observed - expected)² / expected
The final step is to create a summary table including the observed chi-squared.
Critical Value:
• Like any other Hypothesis Test, compare the observed statistic with the critical statistic. We decide α = 0.05; what else do we need to know?
• For a chi-square distribution, we need to specify ν. In a contingency table: ν = (r - 1)(c - 1), where r = # of rows and c = # of columns.
• In our example, we have 2 rows and 3 columns, so ν = 2.
• What is the critical chi-square? For a Contingency Table, all the risk is in the right-hand tail (i.e. a one-tail test); look it up in MINITABTM (Calc>Probability Distributions>Chisquare…).

χ²_crit = 5.99

[Graph: chi-square distribution with the Accept region to the left of χ²_crit = 5.99 and the Reject region to the right; χ²_obs = 7.02 falls in the Reject region.]
Using M IN ITABTM
As you can see the data confirms: to reject the null hypothesis.
Chi-Square Test (Stat>Tables>Chi-Square Test)

Expected counts are printed below observed counts

         Moe    Larry   Curley   Total
1          5        8       20      33
        7.64    11.61    13.75
       0.912    1.123    2.841
2         20       30       25      75
       17.36    26.39    31.25
       0.401    0.494    1.250
Total     25       38       45     108
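As a cross-check outside MINITAB™, here is a sketch of the same test in Python using `scipy.stats.chi2_contingency` on the observed counts above:

```python
import numpy as np
from scipy.stats import chi2_contingency

# Observed counts: rows are the two outcome categories, columns are Moe, Larry, Curley.
observed = np.array([[5, 8, 20],
                     [20, 30, 25]])

chi2_obs, p_value, dof, expected = chi2_contingency(observed)

print(round(chi2_obs, 2))  # observed chi-square, about 7.02
print(dof)                 # (2 - 1)(3 - 1) = 2
print(p_value < 0.05)      # True: below alpha, so reject the null hypothesis
```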
Quotations Exercise
• You are the quotations manager and your team thinks that the
reason you don’t get a contract depends on its complexity.
• You determine a way to measure complexity and classify lost
contracts as follows:
Secondly, in MINITAB™ perform a Chi-Square Test:
Stat>Tables>Chi-Square Test
Overview
You have now completed Analyze Phase – Hypothesis Testing Non-Normal Data Part 2.
Notes
Analyze Phase
Wrap Up and Action Items
• Embracing change
• Continuous learning
• Being tenacious and courageous
• Making data-based decisions
• Being rigorous
• Thinking outside of the box
Each “player” in the Six Sigma process must be A ROLE MODEL for the Six Sigma culture.
A Six Sigma Black Belt tends to take on many roles; these behaviors will help you through the journey.
Analyze Deliverables
Sample size is dependent on the type of data.
• Listed below are the Analyze Phase deliverables that each candidate will present in a PowerPoint presentation at the beginning of the Control Phase training.
• At this point you should all understand what is necessary to provide
these deliverables in your presentation.
– Team Members (Team Meeting Attendance)
– Primary Metric
– Secondary Metric(s)
– Data Demographics
– Hypothesis Testing (applicable tools)
– Modeling (applicable tools)
– Strategy to reduce X’s
– Project Plan
– Issues and Barriers

It’s your show!
DMAIC Roadmap

[Flowchart: Define (Estimate COPQ, Establish Team) → Measure (Collect Data) → Analyze Phase: Statistically Significant? → Practically Significant? → Root Cause? → Identify Root Cause → Update FMEA, looping back to data collection when the answer is No.]
General Questions
• Are there any issues or barriers that prevent you from completing this phase?
• Do you have adequate resources to complete the project?
This is a template that should be used with each project to assure you take the proper steps –
remember, Six Sigma is very much about taking steps. Lots of them and in the correct order.
WHAT | WHO | WHEN | WHY | WHY NOT | HOW

You’re on your way!
Notes
Analyze Phase
Quiz
Now we will see what you have retained from the Analyze Phase of the course. Please answer
these questions to the best of your ability without referencing the text. The answers are in the
Appendix. Please check your answers against the answers provided and review the sections in
the Analyze Phase where your retention of the knowledge is less than you desire.
1. The Multi-Vari Chart was originally designed to show variation from 3 primary sources:
Within unit, Between unit, and Temporal (or over time).
True False
2. One Six Sigma tool that helps to screen factors by using graphical techniques to logically subgroup multiple Discrete X´s plotted against a Continuous Y is known as a ________________________ Chart. (fill in the blank)
4. As the sample size becomes large, the new distribution of Means will form a Normal Distribution, no matter what the shape of the population distribution of individuals is. This concept is known as the Central Limit Theorem.
True False
5. Which of the following statements are true regarding Hypothesis Testing? (check all that
apply)
A. A Hypothesis Test is an a priori theory relating to differences between variables
B. A statistical test or Hypothesis Test is performed to prove or disprove the theory
C. A Hypothesis Test converts the Practical Problem into a Statistical Problem.
D. A Hypothesis Test illustrates short-term results
6. What are the four primary causes for Non-normal Data? (check all that apply)
A. Skewness
B. Mixed Distributions
C. Kurtosis
D. Formulosis
E. Granularity
7. When a data set is Normally Distributed, making inferences about the true nature of the
population based on random samples drawn from the population is an example of using
Non-normal Data.
True False
8. From the list below, which is the best example of a Mann-Whitney Test? (check all that
apply)
A. Determine if one of a few machines has a different Mean cycle time
B. Determine if one of a few machines has a different Median cycle time
C. Determine if document A and document B have different Mean cycle times
D. Determine if document A and document B have different Median cycle times
10. Having Unequal variance is a result of similar distributions having: (check all that apply)
A. Extreme tails
B. Outliers
C. Multiple Modes
D. Having the tails of the distribution equal each other
11. Conducting a Capability Analysis using Attribute Data should contain a lot of samples to be statistically sound.
True False
12. Contingency Tables are used to: (check all that apply)
A. Illustrate one tail proportion
B. Compare more than two sample proportions with each other
C. Contrast the outliers under the tail
D. Analyze the ´´what if´´ scenario
13. Contingency Tables are used to test for association (or dependency) between two or more classifications.
True False
14. To conduct a proper Capability Analysis using Continuous Data, what is the minimum
recommended number of samples to use? (check all that apply)
A. 15
B. 20
C. 30
D. 50
15. For a Skewed Distribution, the appropriate statistic to describe the central tendency is:
(check all that apply)
A. Mean
B. Median
C. Mode
16. A Non-parametric Test makes assumptions that the data are from Normal Populations.
True False
17. If the results from a Hypothesis Test are located in the ´´Region of Doubt´´ area, what can be concluded? (check all that apply)
A. Failure to reject the Null Hypothesis
B. Failure to accept the Null Hypothesis
C. The test was conducted improperly
D. Rejection of the alpha
19. To conduct a proper Hypothesis Test there are six recommended steps to follow.
True False
Improve Phase
Welcome to Improve
Now that we have completed the Analyze Phase we are going to jump into the Improve Phase. In
Welcome to Improve we will give you a brief look at the topics we are going to cover.
Welcome to Improve
Overview
Welcome to Improve
Process Modeling: Regression
Advanced Process Modeling: MLR
Designing Experiments
Experimental Methods

Well, now that the Analyze Phase is over, on to a more difficult phase. The good news is… you’ll hardly ever use this stuff, so pay close attention!

We will examine the meaning of each of these and show you how to apply them.
DMAIC Roadmap

[Flowchart: Champion/Process Owner → Define (Estimate COPQ, Establish Team) → Measure → …]
We are currently in the Improve Phase and by now you may be quite sick of Six Sigma, really! In this module we are going to look at additional approaches to process modeling; it’s actually quite fun in a weird sort of way!
Welcome to Improve

[Improve Phase flowchart: Analysis Complete → Validate New Process → Implement New Process]
After completing the Improve Phase you will be able to put to use the steps as depicted here.
Improve Phase
Process Modeling Regression
Now we will continue in the Improve Phase with “Process Modeling: Regression”.
Overview

Welcome to Improve
Process Modeling: Regression
  – Correlation
  – Introduction to Regression
  – Simple Linear Regression
Advanced Process Modeling: MLR
Designing Experiments
Experimental Methods
Full Factorial Experiments
Fractional Factorial Experiments
Wrap Up & Action Items
In this module of Process Modeling we will study Correlation, Introduction to Regression and
Simple Linear Regression. These are some powerful tools in our data analysis tool box.
We will examine the meaning of each of these and show you how to apply them.
Correlation
The primary purpose of linear correlation analysis is to measure the strength of the linear association between two variables (X and Y). You have already seen correlation graphically when you created a Scatter Plot.

The correlation is positive when Y tends to increase and negative when Y tends to decrease. If the ordered pairs (X, Y) tend to follow a straight line path, there is a linear correlation. The preciseness of the shift in Y as X increases determines the strength of the linear correlation.

To conduct the study you need:
- Bivariate Data – two pieces of data that are variable
- Bivariate data is comprised of ordered pairs (X, Y)
- X is the independent variable
- Y is the dependent variable
Correlation Coefficient
The correlation
Th l ti coefficient
ffi i t off th
the population,
l ti R
R, iis estimated
ti t d bby the
th sample
l
correlation coefficient, r:
The null hypothesis for correlation is: there is no correlation, the alternative is there is correlation.
The correlation coefficient (always) assumes a value between –1 and +1.
The graphics shown here are labeled as the type and magnitude of their correlation: Strong,
Moderate or Weak correlation.
Limitations of Correlation

To properly understand regression you must first understand correlation. Once a relationship is described, a regression can be performed.

• A strong positive or negative correlation between X and Y does not indicate causality.
• Correlation provides an indication of the strength but does not provide us with an exact numerical relationship (i.e. Y = f(X)).
• The magnitude of the correlation coefficient is somewhat relative and should be used with caution.
• Just like any other statistic, you need to assess whether the correlation coefficient is statistically significant, as well as practically significant.
• As usual, statistical significance is judged by comparing a p-value with the chosen degree of alpha risk.
• Guidelines for practical significance are as follows:
  – If | r | > 0.80, the relationship is practically significant
  – If | r | < 0.20, the relationship is not practically significant

[Scale for r: area of negative linear correlation from –1.0 to –0.2; no linear correlation from –0.2 to +0.2; area of positive linear correlation from +0.2 to +1.0]

Correlation provides an indication of the strength but does not provide us with an exact numerical relationship. Regression, however, provides that: specifically a Y = f(X) equation. Just like any other statistic, be sure to assess whether the correlation coefficient is both statistically significant and practically significant.
Correlation Example

The correlation coefficient r:
• Is a positive value if one variable increases as the other variable increases.
• Is a negative value if one variable decreases as the other increases.

Correlation Formula:

r = Σ(Xi − X̄)(Yi − Ȳ) / √[ Σ(Xi − X̄)² · Σ(Yi − Ȳ)² ]

Payton carries   Payton yards
196     679
311    1390
339    1852
333    1359
369    1610
317    1460
339    1222
148     596
314    1421
381    1684
324    1551
321    1333
146     586
We will use some data from a National Football League player, Walter Payton of the Chicago
Bears. Open MINITABTM worksheet “RB Stats Correlation.mtw” as shown here.
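Outside MINITAB™, the same coefficient can be computed directly from the formula above; here is a sketch in Python using the carries/yards pairs from the table:

```python
import numpy as np
from scipy.stats import pearsonr

carries = np.array([196, 311, 339, 333, 369, 317, 339, 148, 314, 381, 324, 321, 146])
yards = np.array([679, 1390, 1852, 1359, 1610, 1460, 1222, 596, 1421, 1684, 1551, 1333, 586])

# Direct implementation of the correlation formula shown above.
xd = carries - carries.mean()
yd = yards - yards.mean()
r_manual = (xd * yd).sum() / np.sqrt((xd ** 2).sum() * (yd ** 2).sum())

# Library equivalent, which also returns the p-value for Ho: no correlation.
r, p_value = pearsonr(carries, yards)

print(round(r, 3))     # about 0.935, matching the MINITAB output
print(p_value < 0.05)  # True: reject the null hypothesis
```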
Correlation Analysis

Correlations: Payton carries, Payton yards
Pearson correlation of Payton carries and Payton yards = 0.935
P-Value = 0.000

The P-value is low (0.000), so we reject the null hypothesis and conclude that there is significant correlation between Payton’s carries and the number of yards.
Regression Analysis

The regression equation from MINITAB™ is the BEST FIT for the plotted data.

Prediction Equations:
Y = a + bx (Linear or 1st order model)
Y = a + bx + cx² (Quadratic or 2nd order model)
Y = a + bx + cx² + dx³ (Cubic or 3rd order model)
Y = a(b^x) (Exponential)

Correlation ONLY tells us the strength of a relationship while Regression gives the mathematical relationship or the prediction model.
[Fitted Line Plot: payton yards = −163.5 + 4.916 payton carries; S = 153.985, R-Sq = 87.3%, R-Sq(adj) = 86.2%]
There are two ways to perform a Simple Regression. One is the Fitted Line Plot which will give a
Scatter Plot with a Fitted Line and will generate a limited regression equation in the Session Window
of MINITABTM as shown above.
Follow the MINITABTM command prompt shown here, double-click “payton yards” for Response (Y)
and double-click “payton carries” for the Predictor (X) and click “OK” which will produce this output.
Let’s look at the Regression Analysis Statistical Output. The difference between R squared and
adjusted R squared is not terribly important in Simple Regression.
In Multiple Regression where there are many X’s it becomes more important which you will see
in the next module.
The Regression Analysis generates a prediction model based on the best fit line through the data, represented by the equation shown here.

Regression Analysis: Payton yards versus Payton carries
The regression equation is
Payton yards = -163.497 + 4.91622 Payton carries
(Constant; Coefficient; Level of X)

To predict the number of yards that Payton would run if he had 250 carries, simply fill in that value in the prediction equation and solve:

Payton yards = -163.497 + 4.91622(250) = 1,065.6
You could also make a fairly accurate estimate by using the Fitted Line Plot. Compare to the Fitted Line.

[Fitted Line Plot: payton yards = −163.5 + 4.916 payton carries; S = 153.985, R-Sq = 87.3%, R-Sq(adj) = 86.2%; reading across from 250 carries gives approximately 1067 yards.]
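The same least-squares fit and prediction can be sketched in Python with NumPy, using the carries/yards data from the worksheet:

```python
import numpy as np

carries = np.array([196, 311, 339, 333, 369, 317, 339, 148, 314, 381, 324, 321, 146])
yards = np.array([679, 1390, 1852, 1359, 1610, 1460, 1222, 596, 1421, 1684, 1551, 1333, 586])

# Least-squares fit of the 1st order model Y = a + bX.
slope, intercept = np.polyfit(carries, yards, deg=1)
print(round(intercept, 1), round(slope, 3))  # about -163.5 and 4.916

# Predict the yards for 250 carries by plugging into the equation.
predicted = intercept + slope * 250
print(round(predicted, 1))  # about 1065.6
```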
Quadratic and Cubic – check the r² value against the linear model to determine whether the difference in the variance explained by the equation is significant.

MINITAB™ will also generate both quadratic and cubic fits. Select the appropriate variables for (Y) and (X), and for the type of Regression Model choose “Quadratic” or “Cubic”.
[Quadratic Fitted Line Plot: payton yards = −199.7 + 5.239 payton carries − 0.00064 payton carries**2; S = 161.474, R-Sq = 87.3%, R-Sq(adj) = 84.8%]

If the R-Sq value improves significantly, or if the assumptions of the residuals are better met as a result of utilizing the fitted equation, use it.

[Cubic Fitted Line Plot: payton yards = 2188 − 24.71 payton carries + 0.1147 payton carries**2 − 0.000141 payton carries**3; S = 164.218, R-Sq = 88.2%, R-Sq(adj) = 84.3%]
Use the best fitting equation by looking at the R-Sq value. If it improves significantly, or if the
assumptions of the residuals are better met as a result of utilizing the quadratic or cubic equation
you should use it.
Here there is no big difference so we will stick with the linear model.
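The explained-variance comparison across the three models can also be sketched outside MINITAB™; here R² is computed as 1 − SS_residual/SS_total for each polynomial degree:

```python
import numpy as np

carries = np.array([196, 311, 339, 333, 369, 317, 339, 148, 314, 381, 324, 321, 146])
yards = np.array([679, 1390, 1852, 1359, 1610, 1460, 1222, 596, 1421, 1684, 1551, 1333, 586])

ss_total = ((yards - yards.mean()) ** 2).sum()
r_squared = {}
for degree in (1, 2, 3):  # linear, quadratic, cubic
    coeffs = np.polyfit(carries, yards, deg=degree)
    fitted = np.polyval(coeffs, carries)
    ss_resid = ((yards - fitted) ** 2).sum()
    r_squared[degree] = 1 - ss_resid / ss_total

for degree, r2 in r_squared.items():
    print(degree, round(r2 * 100, 1))  # the linear model is about 87.3%
```

Nested polynomial fits can only increase R², which is why the adjusted R-Sq (penalized for extra terms) is the better basis for choosing a model.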
Residuals

As in ANOVA, the residuals should:
– Be normally distributed (normal plot of residuals)
– Be independent of each other
  • no patterns (random)
  • data must be time ordered (residuals vs. order graph)
– Have a constant variance (visual; see the residuals versus fits chart: there should be approximately the same number of residuals above and below the line, equally spread)
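A sketch of the residual calculation in Python, using the linear fit to the Payton data; the standardization here is the simple residual divided by its standard deviation, whereas MINITAB™ additionally adjusts each residual for leverage:

```python
import numpy as np

carries = np.array([196, 311, 339, 333, 369, 317, 339, 148, 314, 381, 324, 321, 146])
yards = np.array([679, 1390, 1852, 1359, 1610, 1460, 1222, 596, 1421, 1684, 1551, 1333, 586])

slope, intercept = np.polyfit(carries, yards, deg=1)
residuals = yards - (intercept + slope * carries)

# Simple standardization; values beyond about +/-2 are usually considered large.
standardized = residuals / residuals.std(ddof=2)
large = np.abs(standardized) > 2
print(np.count_nonzero(large))  # number of flagged observations
```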
Residuals (cont.)

Residual Plots can be generated from both the Fitted Line Plot and the Regression selection in MINITAB™. Here we produced the graph by selecting the “Four in one” option.
[Four-in-one residual plots: Normal Probability Plot of the residuals, Standardized Residuals Versus the Fitted Values (independence assumption), Histogram of the Residuals, and Residuals Versus the Order of the Data.]
Residual Analysis

Standardized residuals greater than 2 and less than -2 are usually considered large, and MINITAB™ labels these observations.

Stat>Regression>Regression

Regression Analysis: payton yards versus payton carries
The regression equation is
payton yards = - 163 + 4.92 payton carries

Predictor   Coef      SE Coef   T      P
Constant    -163.5    172.0     -0.95  0.362
payton c    4.9162    0.5645    8.71   0.000

S = 154.0   R-Sq = 87.3%   R-Sq(adj) = 86.2%
To view a normal probability plot in MINITAB™ select “Stat>Regression>Fitted Line Plot”; there are four graph options to choose from. On the residuals versus fits chart, check whether there is a “funnel effect” where the residuals get bigger and bigger as the Fitted Value gets bigger or smaller.

[Normal Probability Plot of the Residuals (response is payton yards): checks the normally distributed response assumption.]
[Standardized Residuals Versus the Fitted Values plot.]
Independence Assumption

[Residuals Versus the Order of the Data (response is payton yards): should show no trends either up or down and should have approximately the same number of points above and below the line (approximately constant variance).]

Residuals versus the order of the data is used to evaluate the Independence Assumption. It should not show trends either up or down and should have approximately the same number of points above and below the line.
[Scatterplot of dorsett yards vs dorsett carries.]
[Fitted Line Plot: dorsett yards = −160.1 + 4.993 dorsett carries; S = 79.3033, R-Sq = 95.0%, R-Sq(adj) = 94.5%]
If Dorsett carries the football 325 times, the predicted value would be determined as follows: Dorsett would carry the football for 1782.825 yards, approximately!
[Four-in-one residual plots for the Dorsett model: Normal Probability Plot (N = 12, AD = 0.309, P-Value = 0.510), Standardized Residuals Versus the Fitted Values, Histogram of the Residuals, and Residuals Versus the Order of the Data.]
Notes
Improve Phase
Advanced Process Modeling
Now we will continue with the Improve Phase “Advanced Process Modeling MLR”.
Overview
W
W elcom
elcomee to
to Im
Improve
prove
Review
Review Corr./
Corr./ Regression
Regression
Process
Process M
Modeling:
odeling: Regression
Regression
N
Non-Linear
on-Linear Regression
Regression
Adva
Advanced
nced Process
Process M
Modeling:
odeling:
M
MLR
LR
Transforming
Transforming Process
Process Data
Data
Designing
Designing Ex
Experim
periments
ents
Multiple
Multiple Regression
Regression
Ex
Ex perim
perimenta
entall M
Methods
ethods
Full
Full Fa
Factoria
ctoriall Ex
Ex perim
periments
ents
Fra
Fractiona
ctionall Fa
Factoria
ctoriall
Ex
Ex perim
periments
ents
W
W ra
rapp Up
Up &
& Action
Action Item
Itemss
We will examine the meaning of each of these and show you how to apply them.

Recall the Simple Linear Regression and Correlation tools presented earlier in the Analyze Phase. These essential tools describe the relationship between two variables: an independent or input factor and, typically, an output response. Causation is NOT proved by these tools; they demonstrate only that a relationship exists.
Correlation Review

Correlation is used to measure the linear relationship between two continuous variables (bi-variate data). The Pearson correlation coefficient “r” will always fall between –1 and +1. A correlation of –1 indicates a strong negative relationship: as one factor increases, the other decreases. A correlation of +1 indicates a strong positive relationship: as one factor increases, so does the other.

P-Value > 0.05: Ho, no relationship
P-Value < 0.05: Ha, there is a relationship

[Scale for “r”: strong negative correlation near –1.0, no correlation near 0, strong positive correlation near +1.0]

The Pearson coefficient, represented here as “r”, shows the strength of the relationship; a value of zero indicates NO relationship. The P-value establishes the statistical confidence that a relationship exists, while the Pearson correlation coefficient shows the “strength” of that relationship. For example, with the P-value threshold set at .05, a relationship between the two factors tested is accepted with at least 95% confidence.
Correlation Review

Correlation only tells us the strength of a linear relationship, not the numerical relationship. The last step in proper analysis of continuous data is to determine the regression equation. The regression equation can mathematically predict Y for any given X. The regression equation from MINITAB™ is the best fit for the plotted data.

Prediction Equations:
Y = a + bx (Linear or 1st order model)
Y = a + bx + cx² (Quadratic or 2nd order model)
Y = a + bx + cx² + dx³ (Cubic or 3rd order model)
Y = a(b^x) (Exponential)

Correlation shows the potency of a linear relationship; the mathematical relationship is given by the prediction equation from regression. These correlations and regressions are not proven causal relationships; we are attempting to establish statistical commonality. Regression equations, whether simple linear, quadratic or exponential, predict the output (Y). More complex relationships are coming.
Simple Regression
– One X, One Y
– Analyze in MINITAB™ using
  • Stat>Regression>Fitted Line Plot or
  • Stat>Regression>Regression

Multiple Regression
– Two or More X’s, One Y
– Analyze in MINITAB™ using
  • Stat>Regression>Best Subsets
  • Stat>Regression>Regression

In both cases the R-sq value estimates the amount of variation explained by the model.

Simple Regressions have one X, referred to as the regressor or predictor; in Multiple Regression, multiple X’s explain the output or response variable. The strength of the regression is quantified by R squared: the proportion of the overall variation in the output (Y) explained by the regression equation.
To conclude that a Linear Regression exists, most practitioners require a statistical confidence of 95% or above. If satisfactory conclusions cannot be drawn, step 6 becomes essential: here we contemplate a potential Non-linear Regression, but only if we cannot find a regression equation that explains, statistically and practically, the variation of the output as a function of the input; we also analyze the model error for correctness. Step 7, depicted in subsequent slides, validates the residuals, a necessity for a valid model.
Recalling the tools learned throughout the Analyze Phase, presented here is a simple Regression example examining a piece of equipment used in the mining industry. This data set is an evaluation of ore concentrators: how the agitation of the equipment relates to the output of PGM concentrate.

[Scatterplot of PGM concentrate (g/ton) vs Agitator RPM.]

Opening the MINITAB™ worksheet named “Concentrator.MTW” will show how the output is always applied to the Y axis (dependent) and the input is always applied to the X axis (independent).
Example Correlation

Pearson correlation of PGM concentrate (g/ton) and Agitator RPM = 0.847
P-Value = 0.001

[Fitted Line Plot: PGM concentrate (g/ton) = 1.119 + 1.333 Agitator RPM; S = 9.08220, R-Sq = 71.8%, R-Sq(adj) = 69.0%]
Example Regression Line

[Fitted Line Plot: PGM concentrate (g/ton) = 30.53 − 1.460 Agitator RPM + 0.05586 Agitator RPM**2; S = 7.61499, R-Sq = 82.2%, R-Sq(adj) = 78.2%]
Notice how the new line is more appropriate for our data; this is the essence of choosing a Non-linear Regression, in this case a Quadratic Regression. The model option can be selected simply by clicking “Quadratic”. The curvature better fits the plotted points. Can you see the difference?
The Session Window presents the Analysis of Variance; note the estimated Standard Deviation of errors, which is lower for the Non-linear model.

Analysis of Variance
Source      DF   SS       MS       F      P
Regression  2    2404.04  1202.02  20.73  0.000
Error       9    521.89   57.99
Total       11   2925.93

Sequential Analysis of Variance
Source     DF   SS       F      P
Linear     1    2101.07  25.47  0.001
Quadratic  1    302.97   5.22   0.048
Standard Deviation was referenced earlier in the Measure Phase; take a look if necessary. Now consider the model error. Don’t be perplexed: model error has many components. Dependency of the output on other input variables, as well as measurement system error on both the output and the inputs, can be causes. The MINITAB™ Session Window displays these Regression Analyses; feel free to use them.

The recommendation here is to use standardized residuals and the “Four in one” option for plotting. Click “Graphs” in the dialog; appropriate modeling and analysis of the residuals concludes the seventh step.
To identify Non-linear Relationships, graphically examine the variation of output to input on a Scatter Plot; the Non-linear Relationship is often self-evident. Using step four of the Regression Analysis methodology, unusual observations prompt us to look deeper at the Fitted Line Plots to see what best fits the historical data. To detect Non-linearity, look carefully at the Residuals vs. Fitted Values graph of a Linear Regression: clustering and/or trends in the data could point to a Non-linear Regression. Relying on a team or an expert who has prior knowledge can also provide much information.
This example will demonstrate how to use confidence and prediction intervals.

What percent discount should be offered to achieve a minimum 10% response from the mailing? The discount is in sales coupons being sent in the mail.

Clip ’em!
Open the MINITAB™ file called “Mailing Response vs. Discount.mtw”. This shows transactions by a retail store chain; in essence, data relating the discount amount to the response of customers to the mailed coupons. With the input variable displayed in C1 and the output displayed in C2, Belts need to establish the discount rate that will yield a 10% response from the customers mailed. The measured % response is the % of customers who received the mailings and used the coupons to buy merchandise.
[Scatterplot of % response from mailing vs % discount; note the curvature.]
Now we test for a Linear Relationship by running a Correlation; the results of the analysis give strong confidence because the P-value falls under .05. Do you notice the Pearson Correlation Coefficient is almost 1.0, indicating a strong correlation?
Following right along the methodology, the next step is to consider a Non-linear Regression Analysis. Note there are no unusual observations. Even though the R squared values are high, a Non-linear fit may be better based on the Fitted Line Plot.

Regression Analysis: % response from mailing versus % discount
The regression equation is
% response from mailing = - 11.2 + 1.83 % discount

Predictor    Coef      SE Coef  T      P
Constant     -11.215   2.541    -4.41  0.001
% discount   1.8301    0.1179   15.52  0.000

S = 5.60971   R-Sq = 94.5%   R-Sq(adj) = 94.1%

Analysis of Variance
Source          DF   SS      MS      F       P
Regression      1    7580.0  7580.0  240.87  0.000
Residual Error  14   440.6   31.5
Total           15   8020.5

[Fitted Line Plot of the linear model: S = 5.60971, R-Sq = 94.5%]
[Fitted Line Plot: % response from mailing = −0.416 + 0.1526 % discount + 0.04166 % discount**2; S = 2.91382, R-Sq = 98.6%, R-Sq(adj) = 98.4%. The Non-linear fit increased R-Sq to 98.6% from 94.5% in the Linear Regression.]
Polynomial Regression Analysis: % response from mailing versus % discount
The regression equation is
% response from mailing = - 0.416 + 0.1526 % discount + 0.04166 % discount**2
S = 2.91382 R-Sq = 98.6% R-Sq(adj) = 98.4%
Analysis of Variance
Source DF SS MS F P
Regression 2 7910.14 3955.07 465.83 0.000
Error 13 110.37 8.49
Total 15 8020.51
Sequential Analysis of Variance
Source DF SS F P
Linear 1 7579.95 240.87 0.000
Quadratic 1 330.19 38.89 0.000
We are satisfied! The application of a Non-linear Regression Model shows an increased R-squared.
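As a rough point-estimate check of the quadratic model (ignoring the prediction intervals evaluated next), the fitted equation can be solved for a 10% response; this sketch uses NumPy’s polynomial root finder:

```python
import numpy as np

# Fitted quadratic: % response = -0.416 + 0.1526 d + 0.04166 d**2, with d = % discount.
# Setting the response equal to 10 and solving 0.04166 d**2 + 0.1526 d - 10.416 = 0.
roots = np.roots([0.04166, 0.1526, -0.416 - 10])
discount = max(roots.real)  # keep the physically meaningful (positive) root
print(round(discount, 1))   # about 14.1 percent as the point estimate
```

Because a new mailing must fall within the prediction interval, the discount actually offered would need to be higher than this point estimate.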
In order to answer the original question it is necessary to evaluate the confidence and prediction intervals. What percent discount should be offered to achieve a 10% response from the mailing?

…Options

A powerful option is the Fitted Line Plot analysis: click “Options” after running “Stat>Regression>Fitted Line Plot”. Now select “Display confidence interval” and “Display prediction interval” and leave the Confidence Level at 95%.
Taking a look at what has changed in the MINITABTM Fitted Line Plot window by selecting both interval options, Confidence and Prediction: each has a color code, the red for the 95% Confidence Interval and the green for the 95% Prediction Interval. Manually draw a horizontal line at 10%. The curvature of the equation usually causes the Confidence Intervals to flare out at the extreme ends. The Prediction Interval is the range where a new observation is expected to fall; in this case, where the horizontal line at 10% intersects the lower prediction band tells us what discount is needed.

[Fitted Line Plot: % response from mailing = - 0.416 + 0.1526 % discount + 0.04166 % discount**2, with the Regression line, 95% CI and 95% PI bands; S = 2.91382, R-Sq = 98.6%, R-Sq(adj) = 98.4%.]
Considering the question of yielding 10% or more, finding the regression equation is of less importance than estimating where new data ought to fall within the relationship. The prediction intervals provide a degree of confidence in how the customers will respond; this estimate is of great importance.
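To make the prediction-interval idea concrete, here is a rough Python sketch using the fitted equation and S reported above. This is NOT the exact MINITAB calculation: it ignores the precise leverage term, and the t critical value (13 error df, 95% two-sided) and the simplified 1/n leverage are assumptions stated in the comments, so the interval is only approximate.

```python
import math

# Rough prediction-interval sketch for the fitted quadratic above.
def y_hat(discount):
    return -0.416 + 0.1526 * discount + 0.04166 * discount ** 2

S = 2.91382        # standard error of the regression, from the output above
t_crit = 2.160     # assumed t(0.975, 13 df) critical value

d = 18.0           # candidate % discount
half_width = t_crit * S * math.sqrt(1 + 1 / 16)   # crude leverage of about 1/n
lower, upper = y_hat(d) - half_width, y_hat(d) + half_width
print(f"predicted response at {d:.0f}% discount: {y_hat(d):.1f}%")
print(f"approximate 95% PI: {lower:.1f}% to {upper:.1f}%")
```

The lower prediction bound near an 18% discount sits close to 10%, which is consistent with the conclusion the text draws about offering an 18% discount.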
Residual Analysis

To complete the example, the Residual Analysis validates the assumptions for Regression Analysis. Confirming the validity by taking our residuals into consideration and completing step seven is next. A high R-squared tells us how much of the output variation the model explains, but from that information alone we cannot draw the final conclusion: the store should give a discount of 18% and see if they achieve their 10% response from the customers mailed.

[Residual Plots for % response from mailing: Normal Probability Plot of the Residuals, Residuals Versus the Fitted Values, and Residuals Versus the Order of the Data (observations 1-16), all on the Standardized Residual scale.]
Now does the present data for the response fit the equation as predicted?

The majority of Belts find data that is not normally distributed. We have learned to handle such data with Non-linear Regression, but another approach is to transform the data so that Linear Regression applies. Outputs or inputs can be transformed, and many people will wonder "what's the point?" Simplicity is very valuable.
The standard transformation ladder is xtrans = x^p, with log(x) used when p = 0:

Transformation     p
Square             2
No Change          1
Square Root        0.5
Logarithm          0
Reciprocal Root   -0.5
Reciprocal        -1
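The ladder above can be expressed as one small function; a minimal Python sketch (illustrative only, since the book performs the transformation in MINITAB's Calculator):

```python
import math

# Power-ladder transforms from the table above: x**p, with p = 0 meaning log(x).
def transform(x, p):
    """Apply the power transformation with exponent p (log when p == 0)."""
    if p == 0:
        return math.log(x)
    return x ** p

x = 16.0
print(transform(x, 2))     # square          -> 256.0
print(transform(x, 1))     # no change       -> 16.0
print(transform(x, 0.5))   # square root     -> 4.0
print(transform(x, 0))     # logarithm       -> ln(16), about 2.77
print(transform(x, -1))    # reciprocal      -> 0.0625
```

A positive skew is typically pulled toward normality by exponents below 1 (square root, log), while a negative skew calls for exponents above 1.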
Effect of Transformation

Using a mathematical function we have transformed this data. The left histogram (Before Transform) shows a positively skewed distribution; after the square root function (x^0.50, or the square root of x) was applied to the data, the distribution became normal.

[Graphics: Before Transform and After Transform histograms and probability plots, with the non-normal "before" data symbolized by a low P-value; Box-Cox plot of Lambda for the positively skewed data, with Lambda near 0.5.]
Before executing the transformation, make sure the word "number" is highlighted, and the new column will then appear within the function in the "Expression:" box. The transformed data will show alongside the unchanged data, provided you click the "OK" button.
When using MINITABTM, for the majority of commands the order of columns is unimportant. The output should resemble this view. Confirm the new data set found in C3 is normally distributed.

[Probability Plot of Square Root (Normal): N = 100, AD = 0.265, P-Value = 0.687. The transformed data is Normally Distributed.]
Model error (residuals) is impacted by the addition of measurement error for all the input variables.
In review, we only perform Regression on historical data; Regression is not applied to experimental data. So far we have covered Regression involving one input and one output. Multiple Linear Regression applies when we want to model one output with more than one input at the same time. Recall that R-squared measures the amount of output variation explained by the input you selected; if you haven't identified enough of the output variation, more inputs may be needed. In the equations on this page we assume that in Multiple Linear Regression each input is independent of the others; no correlation exists among them. Each independent input has its own slope, and the epsilon at the end of the equation reminds us that every Regression has model error.

With many different input variables on hand and only one output, it can be tedious to find whether variation comes from one particular input; a Matrix Plot can greatly speed up the process and show which input is impacting the output the most. After narrowing the field of variables, use the Best Subsets command to complete the Multiple Linear Regression; we identify the best candidate model by examining R-squared, R-squared adjusted, the number of predictors, S and Mallows' Cp. Following this we must iteratively confirm the inputs are statistically significant. Only then do we have a valid model, and especially for Multiple Linear Regression we MUST confirm the model is valid by examining the residuals of the final Regression.
When comparing and verifying models consider the following:
1. There should be a reasonably small difference between R2 and R2-adjusted (much less than 10% difference).
2. When more terms are included in the model, does the adjusted R2 increase?
3. Use the statistic Mallows' Cp. It should be small and less than the number of terms in the model.
4. Models with smaller S (standard deviation of error for the model) are desired.
5. Simpler models should be weighed against models with multiple predictors (independent variables).
6. The best technique is to use MINITABTM's Best Subsets command.
Using "Best Subsets Regression", MINITABTM provides multiple statistics; it is in our best interest to choose the least confusing Multiple Linear Regression model using these particular guidelines.
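The R-squared and adjusted R-squared columns that Best Subsets reports can be computed by hand; here is a minimal Python sketch of comparing nested models on hypothetical data (NOT the book's flight-speed data, and a simplification of what MINITAB's Best Subsets actually searches).

```python
import numpy as np

# Hypothetical data: one output driven by 2 of 3 candidate inputs; the
# third input is pure noise, so it should not help the adjusted R-squared.
rng = np.random.default_rng(7)
n = 30
X = rng.normal(size=(n, 3))
y = 5 + 2 * X[:, 0] - 1.5 * X[:, 1] + rng.normal(0.0, 1.0, n)

def fit_stats(X, y):
    """Least-squares fit with intercept; return R-sq and adjusted R-sq."""
    A = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    r2 = 1.0 - (resid @ resid) / (((y - y.mean()) ** 2).sum())
    p = X.shape[1]
    r2_adj = 1.0 - (1.0 - r2) * (len(y) - 1) / (len(y) - p - 1)
    return r2, r2_adj

for cols in ([0], [0, 1], [0, 1, 2]):      # a poor man's "best subsets"
    r2, r2_adj = fit_stats(X[:, cols], y)
    print(f"Vars={len(cols)}  R-Sq={r2:.3f}  R-Sq(adj)={r2_adj:.3f}")
```

R-squared can only rise as terms are added, which is why the guidelines lean on adjusted R-squared, S and Mallows' Cp to penalize needless terms.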
Best Subsets Regression: flight speed versus the candidate predictors Altitude, Turbine Angle, Fuel/Air ratio, ICR and Temp (an X marks each predictor included in a model):

             Mallows
Vars  R-Sq  R-Sq(adj)  C-p    S
 1    72.1    71.1     38.4   28.054   X
 1    39.4    37.2    112.8   41.358   X
 2    85.9    84.8      9.0   20.316   X X
 2    82.0    80.6     17.9   22.958   X X
 3    87.5    85.9      7.5   19.561   X X X
 3    86.5    84.9      9.6   20.267   X X X
 4    89.1    87.3      5.7   18.589   X X X X
 4    88.1    86.1      8.2   19.481   X X X X
 5    89.9    87.7      6.0   18.309   X X X X X

In MINITABTM the "Best Subsets Regression" command is efficient and powerful: it fits all inputs against a single output. Place all inputs of interest in the "Free predictors:" box and the output column of data in the "Response:" box on the right of your screen. This is very simple; the evaluation is done for you and the results are given in rows: 1st column - number of variables; 2nd column - R-squared; 3rd column - R-squared adjusted; 4th column - Mallows Cp; 5th column - standard deviation of the model error (S); and the remaining columns - the input variables included in each model.
[Best Subsets output repeated, with the list of all the Predictors (X's): Altitude, Turbine Angle, Fuel/Air ratio, ICR, Temp.]

What model would you select? Let's consider the 5 predictor model:
• Highest R-Sq(adj)
• Lowest Mallows Cp
• Lowest S
• However, there are many terms.

In choosing the correct model our attention goes to the bottom 5-term Linear Regression. Are the terms all statistically significant?
Stat>Regression>Regression>Options

Let's go back to "Stat>Regression>Regression" again and click on the "Options" button. Place the output in the "Response:" box and the inputs in the "Predictors:" box.
Predictor        Coef      SE Coef   T       P      VIF
Constant        770.4      229.7     3.35    0.003
Altitude          0.15318    0.06605 2.32    0.030  2.3
Turbine Angle     5.806      2.843   2.04    0.053  1.4
Fuel/Air ratio    8.696      3.327   2.61    0.016  3.2
ICR             -52.269      6.157  -8.49    0.000  2.6
Temp              4.107      3.114   1.32    0.200  5.4

S = 18.3088   R-Sq = 89.9%   R-Sq(adj) = 87.7%

The VIF for Temp indicates it should be removed from the model. Go back to the Best Subsets analysis and select the best model that does not include the predictor Temp.

Variance Inflation Factor (VIF) detects correlation among predictors:
• VIF = 1 indicates no relation among predictors
• VIF > 1 indicates predictors are correlated to some degree
• VIF between 5 and 10 indicates regression coefficients are poorly estimated and are unacceptable.
Do you notice anything different here? A new column has appeared, labeled VIF, which shows whether high correlation among inputs exists. Temp has a high VIF, so we will remove it.
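The VIF column follows the formula VIF_j = 1 / (1 - R_j^2), where R_j^2 comes from regressing predictor j on the other predictors. A short Python sketch on hypothetical predictors (NOT the book's flight data; here x3 is deliberately built to correlate with x1):

```python
import numpy as np

# Hypothetical predictor matrix with deliberate collinearity between x1 and x3.
rng = np.random.default_rng(3)
n = 50
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
x3 = 0.9 * x1 + 0.1 * rng.normal(size=n)
X = np.column_stack([x1, x2, x3])

def vif(X, j):
    """VIF of predictor j: regress column j on the remaining columns."""
    others = np.delete(X, j, axis=1)
    A = np.column_stack([np.ones(len(X)), others])
    beta, *_ = np.linalg.lstsq(A, X[:, j], rcond=None)
    resid = X[:, j] - A @ beta
    r2 = 1.0 - (resid @ resid) / (((X[:, j] - X[:, j].mean()) ** 2).sum())
    return 1.0 / (1.0 - r2)

for j in range(X.shape[1]):
    print(f"VIF x{j + 1}: {vif(X, j):.2f}")   # x1 and x3 come out large
```

The independent predictor x2 yields a VIF near 1, while the collinear pair x1/x3 produces VIFs well above the 5-to-10 danger zone described above.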
To start step four we want to take into account the Regression Model that does not include TEMP.
We have satisfied the Best Subsets model; we need not rerun this command.
Regression Analysis: Flight Speed versus Turbine Angle, Fuel/Air ratio, ICR

Here we have removed Altitude from the "Predictors:" box, and the Regression output now shows the Turbine Angle is not statistically significant. The P-value for Turbine Angle indicates it should be removed (p > 0.05), so re-run the Regression.
Shown here is the entire Regression output for a complete discussion of the final Multiple Linear Regression model. We have 2 predictor variables and both are statistically significant.

Now that we have a final model, it is VITAL to confirm the residuals are acceptable and the model is valid. How do we do this? With the residual plots and the appropriate commands to analyze them.
[Residual Plots for Flight Speed: Normal Probability Plot of the Residuals, Residuals Versus the Fitted Values (fitted values 450-650), Histogram of the Residuals, and Residuals Versus the Order of the Data (observations 2-28), all on the Standardized Residual scale.]
Notes
Improve Phase
Designing Experiments
Now we are going to continue with the Improve Phase “Designing Experiments”.
Designing Experiments

Overview

Within this module we will provide an introduction to Design of Experiments, explain what they are, how they work and when to use them.

Welcome to Improve
Process Modeling: Regression
Advanced Process Modeling: MLR
Designing Experiments
  Reasons for Experiments
  Graphical Analysis
  DOE Methodology
Experimental Methods
Full Factorial Experiments
Fractional Factorial Experiments
Wrap Up & Action Items
Designing Experiments
iers Cu ts
Suppl st o pu SIPO C
O
m n
ut
ersI
VO C
Con
pu
Project Scope
trac Emplo
tors yees
st
P-M a p, X Y, FM EA
(X1) (X2) (X3) (X4) (X8) (X11) (X9) Ca pa bility
(X6) (X7) (X5) (X10)
Box Plot, Sca tter
(X3) (X4) (X1) (X11) Plots, Regression
(X5) (X8)
(X2)
(X11)
(X4)
This is a recurring theme: by using these tools we filter the variables that cause defects. In the Improve Phase of the Six Sigma methodology we encounter many kinds of designed experiments: transactional, manufacturing and research.

Designed Experiments help the Belt to understand the cause and effect between the process output or outputs of interest and the vital few inputs. Some of these causes and effects may include the impact of interactions, often referred to as synergistic or cancelling effects.
Designing Experiments

The objective is to minimize the response. The physical model is not important for our business objective. The DOE Model will focus in the region of interest.
Designing Experiments

Design of Experiments (DOE) is a scientific method of planning and conducting an experiment that will yield the true cause-and-effect relationship between the X variables and the Y variables of interest.

DOE allows the experimenter to study the effect of many input variables that may influence the product or process simultaneously, as well as possible interaction effects (for example synergistic effects).

The end result of many experiments is to describe the results as a mathematical function: y = f(x).

The goal of DOE is to find a design that will produce the information required at a minimum cost. Properly designed DOE's are more efficient experiments.

Design of Experiments shows the cause-and-effect relationship of the variables of interest, X and Y. The designed experiments are identified within the Analyze Phase and then executed in the Improve Phase. DOE tightly controls the input variables and carefully monitors the uncontrollable variables.
Let's assume a Belt has found in the Analyze Phase that pressure and temperature impact his process, and no one knows what yield is achieved for the possible temperature and pressure combinations. One Factor at a Time (OFAT) is an experimental style but not a planned experiment or DOE. The graphic shows yield contours for a process that are unknown to the experimenter.

Trial  Temp  Press  Yield
1      125   30     74
2      125   31     80
3      125   32     85
4      125   33     92
5      125   34     86
6      130   33     85
7      120   33     90

[Contour plot: yield contours (75, 80, 85, 90, 95) over temperature and pressure (psi), with the seven trials marked and the optimum identified with OFAT near 92%.]

If a Belt inefficiently experimented One Factor at a Time (referred to as OFAT), one variable would be selected to change first while the other was held constant. The curves shown on the graph represent constant process yield, visible only if the Belt knew the theoretical relationships of all the variables and the process output. These contour lines are familiar if you have ever hiked in the mountains and looked at an elevation map showing contours of constant elevation. As a test we decided to increase temperature to achieve a higher yield. After achieving a maximum yield with temperature, we then decided to change the other factor, pressure. We then came to the conclusion the maximum yield is near 92% because it was the highest yield noted in our 7 trials.

With the Six Sigma methodology, we use DOE, which would have found a higher yield using equations. Many sources state that OFAT experimentation is inefficient when compared with DOE methods. Some people call it hit or miss; luck has a lot to do with results using OFAT methods.
Certified Lean Six Sigma Black Belt Book Copyright OpenSourceSixSigma.com
472
Designing Experiments
Fractional Factorials or screening designs are used when the process or product knowledge is low.
We may have a long list of possible input variables (often referred to as factors) and need to screen
them down to a more reasonable or workable level.
Full Factorials are used when it is necessary to fully understand the effects of interactions and when
there are between 2 to 5 input variables.
Response surface methods (not typically applicable) are used to optimize a response typically when
the response surface has significant curvature.
Value Chain
Designing Experiments
Back when we had to calculate the effects of experiments by hand it was much simpler to use coded
variables. Also when you look at the prediction equation generated you could easily tell which
variable had the largest effect. Coding also helps us explain some of the math involved in DOE.
The representation here is a two-level, three-factor (2^3) design showing a treatment combination table using coded input level settings. The table has 8 experimental runs.

Consider a 2^3 design on a catapult…

Run      A            B           C        Response
Number   Start Angle  Stop Angle  Fulcrum  Meters Traveled
1        -1           -1          -1       2.10
2         1           -1          -1       0.90
3        -1            1          -1       3.35
4         1            1          -1       1.50
5        -1           -1           1       5.15
6         1           -1           1       2.40
7        -1            1           1       8.20
8         1            1           1       4.55

[Cube graphic: the eight responses (2.1, 0.9, 3.35, 1.5, 5.15, 2.4, 8.2, 4.55) at the corners of the Start Angle x Stop Angle x Fulcrum cube.]

Run 5 shows start angle low, stop angle low and fulcrum high.
Designing Experiments

MINITABTM generates various plots; the cube plot is one. Open the MINITABTM worksheet "Catapult.mtw".

Stat>DOE>Factorial>Factorial Plots … Cube, select response and factors

This graph is used by the experimenter to visualize how the response data is distributed across the experimental space. This cube plot is a 2 cubed design for a catapult using three variables: Start Angle, Stop Angle and Fulcrum. Here we used coded variable level settings (-1 and 1).

[Cube Plot (fitted means) for Distance: 2.10, 0.90, 3.35, 1.50, 5.15, 2.40, 8.20, 4.55 at the corners of the Start Angle x Stop Angle x Fulcrum cube.]

How do you read or interpret this plot? What are these values? Make sense?
The Main Effects Plots shown here display the effect that the input values have on the output
response.
The y axis is the same for each of the plots so they can be compared side by side.
Which has the steepest slope? Which has the largest impact on the output?
Answer: Fulcrum
Designing Experiments

Avg. distance at Low Setting of Start Angle: (2.10 + 3.35 + 5.15 + 8.20)/4 = 18.80/4 = 4.70
Avg. distance at High Setting of Start Angle: (0.90 + 1.50 + 2.40 + 4.55)/4 = 9.35/4 = 2.34

[Main Effects Plot (data means) for Distance: mean Distance versus Start Angle, Stop Angle and Fulcrum, each at coded levels -1 and 1.]

Run #  Start Angle  Stop Angle  Fulcrum  Distance
1         -1           -1         -1       2.10
2          1           -1         -1       0.90
3         -1            1         -1       3.35
4          1            1         -1       1.50
5         -1           -1          1       5.15
6          1           -1          1       2.40
7         -1            1          1       8.20
8          1            1          1       4.55
In order to create the Main Effects Plot we must be able to calculate the average response at the low and high levels for each Main Effect. The coded values are used to show which responses must be used to calculate the average.

Let's review what is happening here. How many experimental runs were operated with the start angle at the high level (1)? The answer is 4: run numbers 2, 4, 6 and 8. If we average the 4 distances (the process output) from those runs, we see the average distance when the start angle was at the high level was 2.34 meters. The second dot from the left in the Main Effects Plot shows the distance of 2.34 with the start angle at the high level.
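The averaging behind the Main Effects Plot can be reproduced directly from the table. A short Python sketch (illustrative only, not part of the MINITABTM workflow):

```python
# Catapult runs from the table above: (Start Angle, Stop Angle, Fulcrum, Distance)
runs = [(-1, -1, -1, 2.10), ( 1, -1, -1, 0.90),
        (-1,  1, -1, 3.35), ( 1,  1, -1, 1.50),
        (-1, -1,  1, 5.15), ( 1, -1,  1, 2.40),
        (-1,  1,  1, 8.20), ( 1,  1,  1, 4.55)]

def level_mean(factor_index, level):
    """Average distance over the 4 runs where this factor sits at this level."""
    ys = [r[3] for r in runs if r[factor_index] == level]
    return sum(ys) / len(ys)

for i, name in enumerate(["Start Angle", "Stop Angle", "Fulcrum"]):
    low, high = level_mean(i, -1), level_mean(i, 1)
    print(f"{name}: low mean={low:.2f}  high mean={high:.2f}  effect={high - low:.2f}")
# Fulcrum shows the largest effect, matching the answer given above.
```

Each main effect is simply the high-level mean minus the low-level mean, which is exactly what the plot's two dots per factor represent.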
Interaction Definition

Interactions occur when variables act together to impact the output of the process. Interaction plots are constructed by plotting both variables together on the same graph. They take the form of the graph below. Note that in this graph the relationship between variable "A" and Y changes as the level of variable "B" changes. When "B" is at its high (+) level, variable "A" has almost no effect on Y. When "B" is at its low (-) level, A has a strong effect on Y. The feature of interactions is non-parallelism between the two lines.

[Interaction plot: output Y versus A (from - to +), with one line for B- and one for B+. When B changes from low to high at one end of A, the output drops very little; at the other end of A, the output changes dramatically.]
Designing Experiments

[Interaction plots showing a Strong Interaction, a Moderate Interaction and a Reversal: Y versus A (from - to +), with lines for B- (High) and B+ (Low).]

A common misunderstanding is that the lines must actually cross each other for an interaction to exist, but that's NOT true. The lines may cross at some level OUTSIDE of the experimental region, but we really don't know that. Parallel lines show absolutely no interaction and in all likelihood will never cross.
Let's review what is happening here. The dot indicated by the green arrow is the mean distance when the fulcrum is at the low level (indicated by a -1) and the start angle is at the high level (indicated by a 1). Experimental runs 2 and 4 had the process running at those conditions, so the distance from those two experimental runs is averaged and plotted at a value of 1.2 on the vertical axis. You can note the red dotted line shown is for when the start angle is at the high level, as indicated by a 1.
Designing Experiments

[Interaction Plot (data means) for Distance: the black line represents the effect of Fulcrum on Y when Start Angle is at its low level.]

MINITABTM will also plot the mirror images, just in case it is easier to interpret with the variables flipped. If you care to create the mirror image of the interaction plots, while creating interaction plots click on "Options" and place a checkmark by "Draw full interaction plot matrix". These mirror images present the same data but visually may be easier to understand.
[Interaction Plot (data means) for Distance: the full interaction plot matrix, with Start Angle and Stop Angle panels at coded levels -1 and 1.]
Designing Experiments

DOE Methodology

It is easy to generate full factorial designs in MINITABTM. Follow the command path shown here.

These are the designs that MINITABTM will create. They are color coded using Red, Yellow and Green: Green are the "go" designs, Yellow are the "use caution" designs and Red are the "stop, wait and think" designs. The meaning is similar to street lights.
Designing Experiments
Let’s create a three factor full factorial design using the MINITABTM command shown at the top of the
graphic above. This design we selected will give us all possible experimental combinations of 3 factors
using 2 levels for each factor.
factor
Be sure to have changed the number of factors as seen in the upper left to “3”. Also be sure not to forget
to click on the “Full factorial” line within the Designs box shown in the lower right of the graphic.
Designing Experiments
One warning to you as a new Belt using MINITABTM: never copy, paste, delete or move columns within the first 7 columns, or MINITABTM may not recognize the design you are attempting to use.

Is our experiment done? Not at all. The process must now be run at the 8 experimental sets of conditions shown above, and the output or outputs of interest must be recorded in columns to the right of the first 7 columns shown. After we have collected the data we will then analyze the experiment. Remember the 11 Step DOE methodology from earlier?
Designing Experiments

Notes

Improve Phase

Experimental Methods

Designing Experiments
Experimental Methods
  Methodology
  Considerations
  Steps
Full Factorial Experiments
Fractional Factorial Experiments
Wrap Up & Action Items
DOE Methodology
In this module we will describe the 11 step DOE methodology, some basic concepts, and lots of fun and exciting terminology. Once again, great content for dinner conversation later tonight!
Experimental Methods
So you’ve decided to use Designed Experiments. Shown here are 10 basic project management
considerations before running any experiment. This is obviously not an exhaustive list, but certainly
some important questions to consider and answer.
What is behind some of these questions? Let’s briefly discuss a few aspects individually.
1. Access to a process is necessary for proper monitoring and execution of a project. If restricted access exists for whatever reason, then a workaround must exist.
2.If the team members or subject matter experts aren’t fully involved, then potential conflicts or
unrealistic designs may be awaiting you for a poor experiment.
3.If the Process Owners and stakeholders are unknown to you before execution of an experiment rude
awakenings such as cancellations, scheduling conflicts and other nightmares can occur.
4. No one wants to be told what will happen to the process they are managing, so if you don't involve them in the experimental design, even if that only means reviewing the team's designed experiment, how do you expect cooperation?
5.If the Process Owners don’t understand what your DOE is, how can they assist you?
6.Does your DOE intend to make a wide range of quality product or potentially produce an
unacceptable product in the quest to improve the process? If the Process Owner has never known
what your DOE intentions were, how can they not be upset if they are surprised by the results of the
DOE?
7.Time and money impact scheduling, randomization, testing concerns. All of these must be
considered especially when using the actual process.
8.It is often desirable to run DOE’s in a pilot plant or facility but this is not often the case. If a pilot
facility is to be used, do the results match the process when translated outside of the laboratory?
9. Noise variables cannot be controlled, by definition, but if ambient weather is considered to have an effect on your process, why would you execute an experiment when a cold or warm front is passing through your area? This is one example of designing around a known disturbance.
10. Manage your project to know if the DOE is intended to stretch the boundaries of conceived product creation or to work well within a small experimental area.
There are many considerations. Often learning comes through experience, so if you are unsure about your future experiment in this project or another, consult with mentors or Six Sigma Belts.
Experimental Methods

Technical Considerations

What are the objectives/goals for the experiment?
1. What factors are important? (narrowed from Analyze Phase)
2. What is the operating range for each factor?
3. How can I minimize both the cost of the DOE and the cost of running the process?
4. How much change in the process do we require?
5. How close to optimal does the process currently run?
6. Are we tackling a centering or variation problem?
7. What impact to the process while running the DOE?
8. What is the cost of competing DOE designs?
9. What do you know about the process interactions?

[Graphic: process capability distributions from 1 Sigma through 6 Sigma against the USL.]

These technical considerations need to be answered before running an experiment. Making sense of all of them at present is not necessary.
Experimental Methods
• Objective must include the critical characteristics and the desired outcome.
– If the experiment and project are tackling recurring issues, consider a different critical characteristic.
• The characteristic may require a different physical phenomenon
being measured or with a differing measurement system.
• The measurement system precision and accuracy may influence the
specific output to be measured.
• Identify the desired experimental outcome.
1. Eliminate Root Cause
2. Reduce Variation
3. Achieve a target
4. Maximize Output Response
5. Minimize Output Response
6. Robust process or product
Step 3: knowing that a DOE is going to be performed, does it make sense to go the extra mile? Let's get our money's worth by measuring more than one output if it could benefit us in any way.
Experimental Methods
• Use the Analyze Phase and subject matter experts to select these factors.
• All factors must be independent of each other.
• Consider past results from previous experiments.
• Test the most likely candidates first.
• Factors not included in the designed experiment should be held constant
and recorded.
• Noise or uncontrollable factors (typically environmental conditions) should be monitored, and the experimental design may be impacted (see Step 6).

The inputs selected by the team following the Six Sigma methodology are dwell time (sec), temperature of solution (deg F) and concentration of solution (% solids). Noise factors of ambient temperature and humidity were recorded and monitored.
Step 5 is to choose the levels for the input variables. The factor levels must be considered to create the desired change in the output response as identified in Step 3. Poor choices for input variable level settings could very well render an experiment useless, so be smart.

5. Choose the Levels for the Input Variables
• Factor levels must be considered to create the desired change in Output Response identified in Step 3.
• Do NOT create unsafe conditions or go beyond the feasibility of the process.
  – This does NOT mean constraining Input Variable levels to the current process range.
  – Be wary if operating near the extremes or operating limits.
• Realize some experimental runs may produce unacceptable product or process results. These results must be weighed against the risk of future production.
• Even when designing your experiment with coded levels for the factors, the team MUST be aware of what the levels mean in the process language.
• Factor levels can be impacted by the Experimental Objective in Step 2.
  – Screening experiments have wider settings for factors.
  – Full Factorials have narrower settings than screening experiments.
  – Response surface Designed Experiments have quite narrow settings.
Experimental Methods

Be aware you do not want to set the factor levels too low either. We could be shown no difference in the output-to-input relationship.

["-" and "+" Factor Settings graphic.]
Experimental Methods

Assume this graphic was a sketch generated from our basic understanding of the theory. We don't know exactly what factor setting would produce the output response, but we do know the general shape of the curve. Notice that we stayed away from the sharp peak: it is very easy to slide off such a steep peak, and unless your process controls are very tight it would be better to find the nice robust region where the output response is high but flat, meaning the factor settings can change a bit without much effect on the output response.

[Graphic: Output Response versus Factor Settings from "-" to "+", overlaid with experimental noise.]

The experiment is using coded levels:
Dwell time: +1 (20 sec); -1 (10 sec)
Temp of sol'n: +1 (80 deg F); -1 (100 deg F)
Conc. of sol'n: +1 (40%); -1 (20%)

You might think we have spent too much time on just setting the levels for the input variables or factors in your experiment. However, consider the learning of others who have had to go back to their Process Owners or Champions and explain that no factors were deemed statistically significant because the design was inadequate. If the concern of spending too much time comes up, also consider how many defects are produced when the statistical significance is deemed inadequate.
Experimental Methods

Remembering that noise variables can't be controlled but must be managed around, blocking is a technique for managing your experiment around noise variables considered important. Remember, you are interested in understanding the effects and interactions of your controlled variables, so you want statistical confidence.

Randomization has an impact on your statistical confidence because your experimental noise is spread across the runs. What would happen if another unknown significant variable changed halfway through our experiment? It is possible that an unknown significant variable, such as machine warm-up, would get confused with the C variable, because without randomization all the low levels would be generated first and then all the high levels.
Experimental Methods

Determining sample size is very similar to what we did in the Analyze Phase; there are a few distinctions, determined by Step 4. Sample size must be determined.

2-Level Factorial Design

MINITABTM then shows:

Alpha = 0.05   Assumed standard deviation = 1

Center Points  Effect  Reps  Total Runs  Target Power  Actual Power
0              2       2     16          0.9           0.936743

What the heck is a Rep?
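The power figure MINITABTM reports can be reproduced by hand. The sketch below is our own reconstruction, not MINITABTM's documented algorithm: it assumes the effect estimate (difference of the high-level and low-level means) has variance 4σ²/N for N total runs, and that the error degrees of freedom come from replication after fitting the full 2^k model. `power_2level_factorial` is a hypothetical helper name.

```python
from scipy import stats

def power_2level_factorial(effect, sigma, n_factors, reps, alpha=0.05):
    """Power to detect a given effect in a replicated 2^k full factorial.

    Assumes Var(effect estimate) = 4*sigma^2/N for N total runs, and that
    the error degrees of freedom come from fitting the full 2^k model.
    """
    runs = reps * 2 ** n_factors           # e.g. 2 reps * 2^3 = 16 runs
    df_error = runs - 2 ** n_factors       # replication supplies the error df
    nc = effect / (2 * sigma / runs ** 0.5)    # noncentrality of the t statistic
    t_crit = stats.t.ppf(1 - alpha / 2, df_error)
    return (1 - stats.nct.cdf(t_crit, df_error, nc)
            + stats.nct.cdf(-t_crit, df_error, nc))

# The scenario from the table: effect = 2, sd = 1, 2 reps of a 2^3 design.
print(round(power_2level_factorial(effect=2, sigma=1, n_factors=3, reps=2), 3))
```

Under these assumptions the result lands in the neighborhood of the 0.936743 shown in the table above.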
Experimental Methods
A replication is
NOT a duplicate or
a repeat. Look at
the two designs
shown here. The
first is a single
g
replicate design,
which means there
is only one value
for each unique experimental run. The terminology is a bit confusing, but don’t worry.
The replicated design has double the runs. The design is fully randomized whenever possible so this is
not the order in which it is run.
Notice how experimental run #1 and #9 have the three factors which are start angle, stop angle and
fulcrum, running with the same combination of levels and then experimental run #9 is a replicate of run
#1.
Experimental Methods

Recall from the Analyze Phase the Multi-Vari tool described the three families of variation. Consider these families of variation to determine how to sample with replication for an experiment.
– Within Unit or Positional
  • Within-piece variation related to the geometry of the part.
  • Variation across a single unit containing many individual parts, such as a wafer containing many computer processors.
  • Location in a batch process such as plating.
– Between Unit or Cyclical
  • Variation among consecutive pieces.
  • Variation among groups of pieces.
  • Variation among consecutive batches.
– Temporal or Over Time
  • Shift-to-Shift
  • Day-to-Day
  • Week-to-Week

• Discuss the experimental scope, time and cost with the Process Owners prior to the experiment.
• Some team members must be present during the entire experiment.
• After the experiment has started, are you getting the output responses you expected?
  – If not, quickly evaluate for Noise or other factors and consider stopping or canceling the experiment.
• Use a log book to make notes of observations, other factor settings, etc.
• Communicate with the operators, technicians and staff about the experimental details and why the experiment is being run, before running the experiment.
  – This communication can prevent "helping" by the operators, technicians, etc. that might damage your experimental design.
• Alert the laboratory or quality technicians if your experiment will increase the number of samples arriving during the experiment.
Experimental Methods
• After finding the Practical Results from Step 9, verify the results:
– Set the factors at the Practical Results found with Step 9 and see
if the process output responds as expected. This verification
replicates the result of the experiment.
– Do not forget your model has some error.
11. Implement Solutions

You will probably not fully appreciate all the comments in the modules of this phase until you have designed, managed, executed and analyzed a few real-life experiments for yourself.
Experimental Methods
Notes
Improve Phase
Full Factorial Experiments

In this module we will discuss the Full Factorial in detail.

Welcome to Improve
Process Modeling: Regression
Advanced Process Modeling: MLR
Designing Experiments
Experimental Methods
Full Factorial Experiments
  Mathematical Models
  Balance and Orthogonality
  Fit and Diagnose Model
  Center Points
Fractional Factorial Experiments
Wrap Up & Action Items
Two level Full Factorial designs are the most powerful and efficient set of experiments.
This may look similar to regression, but the important difference is that DOE is considered true
cause and effect because of the controlled nature of experimentation. This is an important tool in
manufacturing environments.
The only difference between the model equation and the prediction equation shown is that the prediction equation is simplified for describing the data gathered in the experiment and using it to predict future events. Just because you end up with a prediction equation in an experiment does not mean it is a good predictive model. We will discuss this further when we introduce Center Points.
[Surface plots of % Reacted over the factors Cn, Ct and T]
Linear Models are usually sufficient for most industrial experimental objectives. This goes back to
the difference between a physical model and a DOE model. Just because we know by theory that
the model should not be linear, it may express itself as sufficiently Linear in the particular design
space.
People can get confused between the concept of curvature and twisted response planes. We do not have enough information (not enough levels for each variable) to describe true curvature. Take a piece of paper, which will represent 2 input variables. Lift opposite corners. That is a graphical representation of an interaction: the response plane (paper) is twisted. Now lift the paper to eye level and rotate it until the projection looks like a curved line. We are simply looking at the projection of the twisted plane with Linear Models. There may be true curvature in the real world; we simply can't describe it with a Linear Model.

HOWEVER, in most manufacturing processes the Linear Model is very powerful because of the constrained design space. Draw a box on the paper and hold it up by two opposite corners. Depending on how much twist you give the paper and how big the box is, you will either see a curve or not in the defined space.
The surface plot on the left has no significant interaction, but both Main Effects are significant. The
surface plot on the right shows a significant interaction with T and Cn.
Here is a surface plot of true curvature in a Quadratic Model. This shape is referred to as a saddle
for obvious reasons.
Quadratic Models can be obtained with designs not described in this module.
Quadratic Models explain curvature, maximums, minimums and twisted
maximums and minimums when interactions are active.
– The following is the quadratic prediction model used in some response
surface models not covered in this training.
– The simpler 2k models do not include enough information to generate
the Quadratic Model.
[Surface plot: saddle-shaped quadratic response over factors A and B]
Treatment Combinations

Minuses and plusses can be used to indicate low and high factor level settings; center points are indicated with zeros.
The design matrix for 2^k factorials is shown in standard order (not randomized).
– The low level is indicated by a "-" and the high level by a "+".
– This order is commonly referred to as Yates standard order, after Dr. Frank Yates.
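The standard-order pattern is easy to generate: the first factor alternates every run, the second every two runs, the third every four, and so on. A minimal sketch (`yates_standard_order` is our own helper name, not a MINITABTM function):

```python
def yates_standard_order(k):
    """Return the 2^k design matrix in Yates standard order.

    Bit j of the run index gives the level of factor j, so factor 1
    alternates every run, factor 2 every two runs, factor 3 every four.
    """
    return [tuple(1 if (i >> j) & 1 else -1 for j in range(k))
            for i in range(2 ** k)]

# The 2^3 design: 8 runs, -1 = low level, +1 = high level.
for row in yates_standard_order(3):
    print(row)
```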
Here we have standard notation for a 2 to the 4 design and above, using 2 cubes, a common representation: one cube for the low level of the 4th factor and one for the high.
This table created with the factors is referred to as a table of contrasts. The contrast columns are the minus ones and plus ones in the factor columns. In order to calculate contrast columns for interactions, we need the contrast columns for the main factors.
Warning, whatever you do, do not change the names of the columns by simply typing over the
names. MINITABTM creates a model that it uses for the analysis later. If it can’t find the column
names used to generate the worksheet, it will give an error message.
Balanced Design

Factorial Designs should be balanced for proper interpretation of the mathematical equation.

An experiment is balanced when each factor has the same number of experimental runs at both high and low levels. Summing the signs of the column contrasts should yield zero; in this example, there are 2 minuses and 2 plusses in each column.

Balance simplifies the math necessary to analyze the experiment.
– If you always use the designs MINITABTM provides, they will always be balanced.

       A  B
   1   -  -
   2   +  -
   3   -  +
   4   +  +
 ∑Xi   0  0
MINITABTM creates balanced, orthogonal designs. If they aren’t changed, this isn’t a problem.
Orthogonal Design

An Orthogonal Design allows each effect in an experiment to be measured independently; the contrast columns are vectors at 90 degrees to each other. If every contrast-column product for all possible variable pairs sums to zero, the design is orthogonal.

With an Orthogonal Design, if an interaction is found to be significant, it is because of the data and not the experimental design.
– If you always use the designs MINITABTM provides, they will always be orthogonal and balanced.

       A  B  C  AB  AC  BC
   1   -  -  +  +   -   -
   2   +  -  -  -   -   +
   3   -  +  -  -   +   -
   4   +  +  +  +   +   +
 ∑XiXy = 0 for every pair of columns
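Both properties can be checked mechanically on the 4-run table of contrasts above: balance means every factor column sums to zero, and orthogonality means every pairwise elementwise product of columns also sums to zero. A small sketch:

```python
import numpy as np

# The 4-run table of contrasts from the text: columns A, B, C.
design = np.array([
    [-1, -1,  1],
    [ 1, -1, -1],
    [-1,  1, -1],
    [ 1,  1,  1],
])

# Balance: the signs in every factor column sum to zero.
assert all(design[:, j].sum() == 0 for j in range(design.shape[1]))

# Orthogonality: every pairwise elementwise product also sums to zero.
for i in range(design.shape[1]):
    for j in range(i + 1, design.shape[1]):
        assert (design[:, i] * design[:, j]).sum() == 0

print("balanced and orthogonal")
```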
In an empty column, type in 'Yield' where we will place the experimental results. Column C8 was selected in this example.

Do NOT edit, copy, paste or alter anything in the first 7 columns or MINITABTM will not understand the worksheet.
[Normal plot and Pareto Chart of the Standardized Effects (response is Yield, Alpha = .05); factors: A = Temp, B = Conc, C = Supplier]

Any significant effects will be plotted off the straight line, labeled and highlighted in red. This method is referred to as the Daniels method in some literature.

The Pareto Chart of the Standardized Effects graphically shows which effects are significant based on the selected alpha level. Any effect that goes beyond the red line is considered significant.
At this point, Temperature and the interaction of Temperature with Supplier are the significant effects.

Look for the factorial fit information in the Session Window. Under the factorial fit, any effect that has a P-value less than 0.05 (for an alpha of 0.05) is considered significant. We interpret this the same way as we interpret any other statistical test. Notice that all three methods of determining what effects belong in the final model fit agree.

Factorial Fit: Yield versus Temp, Conc, Supplier
Estimated Effects and Coefficients for Yield (coded units)
Term Effect Coef SE Coef T P

What does this tell us…
We need to create some factor plots before evaluating the residuals. Follow the MINITABTM path shown here: Stat > DOE > Factorial > Factorial Plots. Anytime there is a significant interaction, it is useful to plot it. Plot both "Main Effects Plot" and "Interaction Plot" in this example.
[Main Effects Plot and Interaction Plot (data means) for Yield: near-flat lines indicate effects that are not significant; a line can tilt slightly and still be insignificant. Supplier is insignificant.]

[Residual Plots for Yield: Normal Probability Plot, Residuals vs. the Fitted Values, Histogram of the Residuals, Residuals vs. the Order of the Data]
The Residuals Versus Variables plots are most important when deciding what level to set an insignificant factor. A typical guideline is a difference of a factor of 3 in the spread of the Residuals between the low and high levels of an insignificant input variable.

In this case Concentration was not significant, but we still need to make a decision on how to set it for the process. The low level for Concentration has a smaller spread of Residuals, but there is not a difference of 3:1. Other considerations for setting the variable are cost and reducing cycle time.

[Residuals Versus Temp and Residuals Versus Conc plots (response is Yield)]
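The 3:1 guideline is just a ratio of spreads, so it is easy to compute. The residuals below are made-up illustrative numbers (the high-level set is exactly twice the low-level set), not values from the Yield experiment:

```python
import numpy as np

# Hypothetical standardized residuals at the low and high levels of an
# insignificant factor -- illustrative only, not the text's data.
resid_low  = np.array([-0.4, 0.3, -0.2, 0.5, -0.1, 0.2, -0.3, 0.1])
resid_high = np.array([-0.8, 0.6, -0.4, 1.0, -0.2, 0.4, -0.6, 0.2])

ratio = resid_high.std(ddof=1) / resid_low.std(ddof=1)
print(f"spread ratio (high/low): {ratio:.1f}")

# Guideline from the text: only treat the spreads as practically
# different when the ratio reaches about 3:1.
if ratio >= 3:
    print("prefer the level with the smaller residual spread")
else:
    print("spreads similar -- set the factor based on cost or cycle time")
```

Here the ratio is 2:1, short of the guideline, so the factor would be set on cost or cycle-time grounds.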
The Response Optimizer in MINITABTM is a great tool to visually determine where to set the input variables to achieve the desired output response. Play with it for a while and see what you get. The more you play around with these things, the better your understanding will be of how it works.
Practical Solution:
Temp 45C
Concentration 5%
Supplier B
Center Points

As you can see in the graphic, there may be an unknown hump in the Response Curve; adding the Center Point allows us to calculate an additional statistic.

A Center Point is an additional experimental run made at the physical center of the design.
– Center Points do not change the model to quadratic.
– They allow a check for adequacy of the linear model.
The Center Point provides a check to see if it is valid to say that the output response is linear through the center of the design space. If a straight line connecting the high and low levels passes through the center of the design, the model is adequate to predict inside the design space.
– "Curvature" is the statistic used to interpret the adequacy of the Linear Model.
– If curvature is significant, the P-value will be less than 0.05.
If there is significant curvature in the model, all we know is that the response is not linear. Do NOT predict outside the design space.
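One common form of the curvature check (our assumption; consult a DOE text for the exact statistic MINITABTM reports) compares the mean of the factorial corner runs with the mean of the center-point runs, tested against pure error from the replicated center points. In the made-up data below the center mean equals the corner mean exactly, so curvature is zero:

```python
import numpy as np

# Hypothetical yields, chosen so the center-point mean matches the
# corner-run mean exactly (i.e. no curvature).
corner = np.array([55.0, 61.0, 54.0, 62.0, 56.0, 60.0, 55.0, 61.0])
center = np.array([58.0, 57.5, 58.5])

n_f, n_c = len(corner), len(center)
diff = corner.mean() - center.mean()

# Sum of squares for curvature, tested against pure error from the
# replicated center points.
ss_curvature = n_f * n_c * diff ** 2 / (n_f + n_c)
ms_pure_error = center.var(ddof=1)

f_stat = ss_curvature / ms_pure_error
print(f"curvature F statistic: {f_stat:.2f}")
```

A large F (small P-value) would say the straight line does not pass through the center of the design space, i.e. the linear model is inadequate.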
In this example we will walk through the 11 step DOE methodology for a panel cleaning machine, using Center Points in the analysis. The manufacturing firm is attempting to start up a new panel cleaning machine and would like to get it running quickly. They have experience with this type of machine, but they do not have experience with this particular model of equipment.
Na2S2O8 is Sodium Persulfate; please use that any time you see that notation.

3. Select the Output (Response) Variables
• Width of conductor is the only response.

4. Select the Input (Independent) Variables
• Dwell Time
• Temperature
• Na2S2O8
• The experts believe that ambient temperature and humidity will have no effect on the process. Monitors will be placed in the room to record temperature and humidity.
You actually know the answer already: the sample size is the same as in the previous example, since they were both 2 cubed designs. Look at your worksheet and find the Center Point runs.

Notice the Center Points are uniformly distributed through the design. Why are the Center Points uniformly distributed and not random?
Center Points not only tell us something about how well the linear model works, but are also a reality check for our data. By eyeballing the Center Point data as our experiment progresses we can see if anything has affected our experiment that we were not expecting. If your Center Points are dramatically different from each other, you've got a problem -- somewhere. They should be fairly close in magnitude, at least within normal variation.
MINITABTM will place the Center Points randomly in the worksheet. The next few slides will demonstrate how to move the Center Points so they are uniformly distributed.

1. Create a 3 factor design with 3 Center Points and 2 replicates; be sure to randomize the design.

Your design should look different than the one in the illustration because, more likely than not, different random seeds generated the designs. It is possible that our designs are the same, but trying to calculate the odds of that occurring is not worth the bother. You should have 19 rows in your design, so if you do not, go back and fix it.
[Normal plot and Pareto Chart of the Standardized Effects (response is Width, Alpha = .05); factors: A = Dwell Time, B = Temp, C = Na2S2O8]

The significant effects are Na2S2O8 (Sodium Persulfate), Temp, Dwell Time and the Temp by Na2S2O8 (BC) interaction.
Notice that all three methods of determining what effects belong in the final
model fit agree.
When working with 2 level designs you will always have 1 degree of freedom for each effect (including interactions), calculated as 2 levels minus 1. In the ANOVA table for Main Effects we have 3 degrees of freedom for the 3 Main Effects placed in the model. There is one degree of freedom for the Temperature by Sodium Persulfate interaction.

The Residual error is broken into 2 sources. The 3 degrees of freedom for lack of fit come from the 3 interaction effects that were removed from the model because they were not significant in explaining the variation of the data. The 10 degrees of freedom for pure error come from replication: the 8 runs from the original design generate 8 degrees of freedom (in this case 2 replicates minus 1 equals 1 degree of freedom for each run in the design). Add the 2 degrees of freedom from the Center Points (3 Center Points minus 1 equals 2) and we have a total of 10 degrees of freedom for pure error. Pure error can be defined as the failure of things treated alike (the replicates) to act alike.
The SS or Sum of Squares calculations are simply an unscaled or unadjusted measure of dispersion or spread of the data. Seq or Sequential Sum of Squares and Adj or Adjusted Sum of Squares are the same for DOE analyses. (There may be differences in Regression Analysis.)

Adj MS or Adjusted Mean Square takes the Sum of Squares number and scales it using the number of degrees of freedom for that calculation. Mean Squares are the equivalent of variance.
Here we use the F statistic. An F statistic is simply variance divided by variance. In the case of
DOE it is the Variance of an effect divided by the variance due to residual error. In this platform,
MINITABTM sums the sum of the squares for certain elements of the model to report in the ANOVA
table instead of keeping them separate. The F statistic with respect to the Main Effects is calculated
by taking 199.779 and dividing by 1.348 which equals 148.18. The associated P-value is 0.000
which is less than 0.05 so our conclusion is that the model is significant.
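The F arithmetic above can be checked directly. The sketch below uses the mean squares quoted in the text and assumes the residual degrees of freedom are 13 (3 lack of fit plus 10 pure error, as described above); the small rounding difference from the text's 148.18 comes from the quoted MS values being rounded:

```python
from scipy import stats

# Mean squares quoted in the text's ANOVA discussion.
ms_main_effects = 199.779   # Adj MS for Main Effects
ms_residual = 1.348         # Adj MS for residual error

f_stat = ms_main_effects / ms_residual

# 3 df for the Main Effects; 13 residual df = 3 lack of fit + 10 pure error
# (our reading of the df breakdown above).
p_value = stats.f.sf(f_stat, 3, 13)

print("F =", round(f_stat, 2), " p =", p_value)
```

The P-value is far below 0.05, matching the conclusion that the model is significant.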
Notice in this example the curvature is not significant, which means our assumption of linearity is good. Also, the p-value for lack of fit is not significant. That means the effects we removed from the model really do not belong in the model. If there was significant lack of fit, that would indicate that some of the effects that were removed from the model actually belong in the model.
The last thing to discuss here is the prediction equation. Please note the coefficients for the prediction equation are based on uncoded units. In other words, you can use this equation directly in real units. Let's do an example next.
Prediction Equation

Determine the predicted value when:
– Dwell Time = 4.2 minutes
– Temp = 75C
– Sodium Persulfate = 2.0

Take a few minutes to study the equation above. It really is simply "plug and chug": insert these values into the equation and do the math. Please note, we have taken liberties with rounding numbers! You won't actually have to do this by hand because that is exactly what the Response Optimizer does in MINITABTM.
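The "plug and chug" can be sketched as a small function. The actual coefficients of the fitted prediction equation appear only in the course graphic, so every number below is a hypothetical placeholder; substitute the uncoded coefficients from your own factorial fit:

```python
# Hypothetical placeholder coefficients -- NOT the fitted values from the
# text's experiment.  b_temp_na is the Temp-by-Na2S2O8 interaction term.
b0, b_dwell, b_temp, b_na, b_temp_na = 10.0, 1.5, 0.20, 2.0, 0.05

def predict_width(dwell, temp, na2s2o8):
    """Plug-and-chug evaluation of an uncoded-units prediction equation."""
    return (b0 + b_dwell * dwell + b_temp * temp
            + b_na * na2s2o8 + b_temp_na * temp * na2s2o8)

# The settings from the exercise above.
print(predict_width(dwell=4.2, temp=75, na2s2o8=2.0))
```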
[Interaction Plot (data means) for Width and cube plot of fitted Width values over Dwell Time, Temp and Na2S2O8, with corner and center points marked]
[Residual Plots for Width: Normal Probability Plot, Residuals vs. the Fitted Values, Histogram of the Residuals, Residuals vs. the Observation Order]
[Residuals Versus Dwell Time, Temp and Na2S2O8 plots (response is Width)]

As depicted here, the Residuals Versus Factor Plots do NOT show any differences in the variation of the data from the low to the high values.
9. Draw Practical Solutions: Stat > DOE > Factorial > Response Optimizer

Here we will use the Response Optimizer to draw some Practical Conclusions. Play with the Response Optimizer and see what you can do, remembering that the original objective was to hit a target of 40 +/- 5 for the width.

[Response Optimizer output showing the predicted Width]
Imagine if you were working with gold or platinum. What effect could that have on the bottom line?
Look at another graphical tool you can use in MINITABTM to visualize the solution: there is another MINITABTM function that will show the complete solution set for targeted values.

Stat > DOE > Factorial > Overlaid Contour Plot

As shown here we generate Overlaid Contour Plots of Width over Temp and Na2S2O8, one with Dwell Time at its low setting and one at its middle setting. The areas shown in white are the solution set.
Notes
Improve Phase
Fractional Factorial Experiments
Now we will continue the Improve Phase with "Fractional Factorial Experiments".
Welcome to Improve
Process Modeling: Regression
Advanced Process Modeling: MLR
Designing Experiments
Experimental Methods
Full Factorial Experiments
Fractional Factorial Experiments
  Designs
  Creation
  Generators
  Confounding & Resolution
Wrap Up & Action Items
Fractional Factorial Designs are a powerful sub-set of Factorial Designs. As the name implies, you
may expect they are some fraction of the original Factorial Designs – and you’d be correct. The
question is what fraction?
We've shown two 4 factor designs side by side so you can contrast them. Notice the Fractional Factorial Design requires only a fraction of the experimental runs to evaluate 4 input factors. In this case, it is a half fraction. As with most things in life there is a price to be paid for reducing the number of runs required, which we will go through in detail in this module.
Fractional Factorial designs are also used to study Main Effects and 2-way interactions if the
experimenter and team has good process knowledge and can assume higher order interactions are
negligible. There is the cost in a nutshell. In exchange for reducing the overall experiment’s size you will
give up the ability to evaluate higher order interactions. It turns out this is a pretty good assumption in
many cases. We’ll talk about this more later.
Fractional Factorial designs are also used to reduce the time and cost of experiments because the
number of runs are lowered. As the number of factors increases, the number of runs required to run a
full 2k factorial experiment also increases (even without repeats or replicates) as you already know.
3 factors: requires 8 runs
4 factors: requires 16 runs
5 factors: requires 32 runs etc….
The number of runs required for a Fractional Factorial will depend on how many factors are included in the design and how much fractioning can be tolerated based on the facts of the process.

Fractionals are also used as an initial experiment that can be augmented with another fraction to reduce confounding and estimate factors of interest. We'll define this as we advance through the module.
How many runs if no repeats or replicates? Simply do the math: 2 to the (5 minus 1) is the same as 2 to the fourth, which is 16 runs.

What Fractional Design is this? Since this design uses only half the number of runs as a Full Factorial with 5 factors (32 runs), it is a half fraction.
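The run-count arithmetic generalizes: a full factorial with k 2-level factors needs 2^k runs, and a 2^(k-p) fractional design cuts that by a factor of 2^p. A trivial sketch (helper names are our own):

```python
def full_factorial_runs(k):
    """Runs in a full 2-level factorial with k factors (no replicates)."""
    return 2 ** k

def fractional_runs(k, p):
    """Runs in a 2^(k-p) fractional factorial."""
    return 2 ** (k - p)

for k in (3, 4, 5):
    print(k, "factors:", full_factorial_runs(k), "runs")

# The half fraction discussed in the text: 2^(5-1) = 16 runs instead of 32.
print(fractional_runs(5, 1))
```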
Recall the 2x2x2 full 3-factor, 2-level Factorial Design. Suppose we needed to investigate a fourth factor but we could NOT increase the number of runs because of time or cost.

Select the highest order interaction to represent the levels of the fourth factor: the ABC interaction will determine the levels for factor D. When we replace the ABC interaction with factor D, we say the ABC 3-way interaction was aliased, or confounded, with D. This experiment maintains balance and orthogonality.
– The first experimental run in the first row indicates the experiment is executed with factor D at the low level while running all 3 other factors at the low level.
                               Factor D
  A    B    C   AxB  AxC  BxC  AxBxC
 -1   -1   -1    1    1    1    -1
  1   -1   -1   -1   -1    1     1
 -1    1   -1   -1    1   -1     1
  1    1   -1    1   -1   -1    -1
 -1   -1    1    1   -1   -1     1
  1   -1    1   -1    1   -1    -1
 -1    1    1   -1   -1    1    -1
  1    1    1    1    1    1     1
Why is the design, shown as orange rows, called a “half” fraction? This is the design
just created on the previous slide. This is a half fraction since a full 2x2x2x2 factorial
would take 16 runs. With the half fraction we can estimate the effects of 4 factors in 8
runs. What is the cost? We lose the ability to study the higher order interaction
independently!
[Two cubes over factors A, B and C: the left cube for D at "-", the right cube for D at "+". The top line repeats the design from the previous slide.]
Remember that D is confounded with the ABC interaction in this half-fractional
design.
Design Generators
Don't worry – MINITABTM will take care of this! THANK YOU MINITABTM!!!!
This graph helps us visually draw the conclusion of the data that we already have. We have
highlighted in green two boxes and this can very simply be filled in by the data expressed by the
generator; A times B times C equals D.
Design Generator D = ABC
• Because of the Design Generator we can now fill out the D column.
  – For each row of D, multiply the values in the columns of A, B and C together and create the column.
• You may correctly suspect some 2-factor interactions are confounded.
• Create contrast columns for AD, BD, CD using a similar technique to the one used to create the column for D.
A B C AB AC BC D AD BD CD
-1 -1 -1 1 1 1
1 -1 -1 -1 -1 1
-1 1 -1 -1 1 -1
1 1 -1 1 -1 -1
-1 -1 1 1 -1 -1
1 -1 1 -1 1 -1
-1 1 1 -1 -1 1
1 1 1 1 1 1
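The exercise above can be sketched directly from the generator. Starting from the 2^3 base design, each D value is the product A×B×C, and the interaction contrast columns are elementwise products; the confounding then shows itself, since for example AD = A(ABC) = BC on every run:

```python
# Build the half-fraction worksheet from the design generator D = ABC.
base = [(a, b, c)
        for c in (-1, 1) for b in (-1, 1) for a in (-1, 1)]  # standard order

rows = []
for a, b, c in base:
    d = a * b * c                      # the design generator D = ABC
    rows.append({"A": a, "B": b, "C": c, "D": d,
                 "AD": a * d, "BD": b * d, "CD": c * d,
                 "AB": a * b, "AC": a * c, "BC": b * c})

for r in rows:
    print(r["A"], r["B"], r["C"], "| D =", r["D"])

# The confounding is visible directly: each 2-way column involving D
# equals another 2-way column (AD == BC, BD == AC, CD == AB on every run).
assert all(r["AD"] == r["BC"] for r in rows)
```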
Notice after the design structure an alias structure is indicated. The line under the alias structure showing A plus BCD means the A Main Effect is confounded with the 3-way interaction BCD. Later we can see the AB 2-way interaction is confounded with the CD 2-way interaction, meaning that if the interaction is statistically significant we cannot distinguish whether it is a result of the AB or CD interaction or a combination.
So What is "Confounding"?

Experimental Resolution

In the nomenclature 2R^(k-p), remember the subscript R references the Resolution. This useful visual aid helps you remember the definitions of the Confounding designated by the Resolution.

Resolution IV: hold up four fingers. The Confounding is Main Effects with 3-way interactions, or 2-way interactions with other 2-way interactions.

Resolution V: hold up five fingers, one on one hand and four on the other. This illustrates the Confounding of Main Effects with 4-way interactions, or 2-way interactions with 3-way interactions.

The visual aid is shown through Resolution V.
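Resolution can also be computed: it is the length of the shortest "word" in the design's defining relation, which contains every product of the generator words (with repeated letters cancelling, since X·X = I). A sketch, with helper names of our own:

```python
from itertools import combinations

def defining_relation(generator_words):
    """All words of the defining relation: every product of the
    generator words, with repeated letters cancelling (X*X = I)."""
    words = set()
    for r in range(1, len(generator_words) + 1):
        for combo in combinations(generator_words, r):
            counts = {}
            for word in combo:
                for ch in word:
                    counts[ch] = counts.get(ch, 0) + 1
            words.add("".join(sorted(ch for ch, n in counts.items() if n % 2)))
    return words

def resolution(generator_words):
    return min(len(w) for w in defining_relation(generator_words))

print(resolution(["ABCD"]))          # D = ABC gives I = ABCD: Resolution IV
print(resolution(["ABDE", "ACDF"]))  # a 2^(6-2) example with E = ABD, F = ACD
```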
Example of a very useful Fractional Design often used for screening designs.

[Cube diagram of the 2^(5-1) design over factors A, B, C, D and E]

Run   A   B   C   D   E
 1   -1  -1  -1  -1   1
 2    1  -1  -1  -1  -1
 3   -1   1  -1  -1  -1
 4    1   1  -1  -1   1
 5   -1  -1   1  -1  -1
 6    1  -1   1  -1   1
 7   -1   1   1  -1   1
 8    1   1   1  -1  -1
 9   -1  -1  -1   1  -1
10    1  -1  -1   1   1
11   -1   1  -1   1   1
12    1   1  -1   1  -1
13   -1  -1   1   1   1
14    1  -1   1   1  -1
15   -1   1   1   1  -1
16    1   1   1   1   1

Pros:
• 5 factors (Main Effects)
• 10 2-way interactions
• Main Effects only confounded with rare 4-way interactions

Cons:
• 16 trials to get 5 Main Effects
• 2nd order interactions are confounded with 3rd order
DOE Methodology
We have included a copy of the methodology here for you to use when following our practical
example for Fractional Factorials.
This is a 2 to the (8 minus 4) power design with Resolution IV. This design has 16 runs, as you see in the graphic, with all eight factors at two levels.
Take a look at what Confounding exists before you jump into analysis.
[Pareto Chart of the Effects (response is Y, Alpha = .10). Terms in decreasing order of effect: E, A, AC, H, B, AF, AE, AD, AG, C, AH, AB, G, F, D, with the reference line at 0.26. Factors are named A through H. Lenth's PSE = 0.129375.]
S = 0.175232   R-Sq = 99.98%   R-Sq(adj) = 99.96%
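The Pareto chart of the effects relies on Lenth's pseudo standard error (PSE) to judge significance when there are no replicates. A minimal sketch of the calculation follows; the effect values here are made up for illustration, not taken from the chart above.

```python
import statistics

# Lenth's PSE: a robust noise estimate built from the effect estimates
# themselves, so no replicate runs are needed.
def lenth_pse(effects):
    # Initial scale estimate from the median absolute effect.
    s0 = 1.5 * statistics.median(abs(e) for e in effects)
    # Keep only effects small enough to look like noise, then re-estimate.
    noise = [abs(e) for e in effects if abs(e) < 2.5 * s0]
    return 1.5 * statistics.median(noise)

# Two clearly active effects plus noise-level effects (illustrative values).
effects = [12.1, 7.9, 2.4, 0.21, -0.15, 0.10, -0.08, 0.12]
pse = lenth_pse(effects)   # the large effects do not inflate the estimate
print(pse)
```

Because the median is resistant to the few large, active effects, the PSE reflects only the noise-level terms, which is what the reference line on the Pareto chart is built from.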
Analysis of Variance for Y (coded units)
Source DF Seq SS Adj SS Adj MS F P
Main Effects 4 921.55 921.545 230.386 7502.91 0.000
2-Way Interactions 3 331.20 331.198 110.399 3595.34 0.000
Residual Error 8 0.25 0.246 0.031
Total 15 1252.99
[Residual Plots for Y: a Normal Probability Plot of the residuals (N = 16, AD = 0.532, P-Value = 0.146), Residuals Versus the Fitted Values, a Histogram of the Residuals and Residuals Versus the Order of the Data.]
No, no unusual observations here…
It can be difficult to optimize the solutions and get the Practical Solution desired.
Using the Response Optimizer within MINITAB™ helps us find the Practical Solution of setting the factors left in the model all at the high level, or +1.
We win, we win…!!
11. Implement Solutions
Work with the Process Owners and develop the Control Plans to sustain your success.
Notes
Improve Phase
Wrap Up and Action Items
Congratulations on completing the training portion of the Improve Phase. Now comes the
exciting and challenging part…implementing what you have learned to real world projects.
• Listed below are the Improve Phase deliverables that each candidate
will present in a Power Point presentation at the beginning of the
Control Phase training.
• At this point you should all understand what is necessary to provide
these deliverables in your presentation.
– Team Members (Team Meeting Attendance)
– Primary Metric
– Secondary Metric(s)
– Experiment Justification
– Experiment Plan / Objective
– Experiment Results
– Project Plan
– Issues and Barriers
It’s your show!
Before beginning the Control Phase you should prepare a clear presentation that addresses each
topic shown here.
• Being tenacious, courageous
Look for the potential roadblocks and plan to address them before they
become problems:
– Lack of data
– Data presented is the best guess by functional managers
– Team members do not have the time to collect data
– Process participants do not participate in the analysis planning
– Lack of access to the process
Each phase will have roadblocks. Many will be similar throughout your project.
DMAIC Roadmap
[DMAIC Roadmap graphic: Champion/Process Owner → Estimate COPQ → Establish Team → Measure…]
The objective of the Improve Phase is simple – utilize advanced statistical methods to identify contributing variables or, more appropriately, optimize variables to create a desired output.
Improve Phase
Over 80% of projects will realize their solutions in the Analyze Phase (Analysis Complete). Designed Experiments can be extremely effective when used properly to identify the few vital X's.
• How much of the problem have you explained with these X's?
These are questions that the participant should be able to answer in clear, understandable language
at the end of this phase.
WHAT   WHO   WHEN   WHY   WHY NOT   HOW
Over the last decade of deploying Six Sigma it has been found that the parallel application of the
tools and techniques in a real project yields the maximum success for the rapid transfer of
knowledge. It is imperative that you complete this and submit your plan for action for review with
your mentors. Thanks and good luck!
You have now completed Improve Phase – Wrap Up and Action Items.
Notes
Improve Phase
Quiz
Now we will see what you have retained from the Improve Phase of the course. Please answer
these questions to the best of your ability without referencing the text. The answers are in the
Appendix. Please check your answers against the answers provided and review the sections in
the Improve Phase where your retention of the knowledge is less than you desire.
1. Multiple Regressions are best used for?
A. Non-linear relationships between an X and a Y.
B. Uncertainty in the slope of the linear relationship between an X and a Y.
C. Relationships between Y and two or more X’s.
D. Replacing the use of a Designed Experiment.
2. Which relationships can be modeled with a Regression Equation? (check all that apply)
A. Simple Linear
B. Quadratic
C. Cubic
D. Multiple Linear
E. Logarithmic
3. Which statements are true about Multiple Regressions? (check all that apply)
A. Multiple Regressions are a form of experimentation.
B. The X's are assumed to be independent of each other.
C. The X’s are assumed to not be correlated.
D. The residuals or errors are assumed to be Normally Distributed.
E. Interactions are NOT included in Multiple Linear Regressions.
F. R2 and the statistical confidence of the coefficients are impacted by the measurement
error of the inputs or X’s.
5. The reasons for running experiments include the desire for problem solving, screening factors and (check all that apply)
A. Physically model a process
B. Screening factors among possibilities
C. Achieving a robust design
D. Provide Regression Analysis
E. Understand the impact of an improved Measurement System
6. Which Experimental Design typically is most associated with the fewest number of input
variables or factors in the design?
A. Fractional Factorial Design
B. Full Factorial Design
C. Simple Linear Regression
D. Response Surface Design
7. The 11 step methodology recommended for performing a DOE has which item as the first
step?
A. Select the output response variable(s)
B. Select the Experimental Design
C. Select the input variables
D. Define the Practical Problem
8. How many experimental runs exist in a full factorial 2-level design for 5 factors with 2
replicates for the Corner Points and no Center Points?
A. 10
B. 16
C. 32
D. 34
E. 64
9. Which statements are true about Full Factorials? (check all that apply):
A. Full Factorials are used when 5 or fewer factors are involved.
B. Full Factorials are better for optimizing a process than Fractional Factorials.
C. Full Factorials are used instead of Fractional Factorials if interactions need to be fully
understood.
D. Full Factorials are used for screening factors if the Analyze Phase was unable to
narrow the critical factors sufficiently.
E. Full Factorials never have Center Points in the design.
10. Examples of the first step in the recommended 11 step methodology for a DOE include:
(check all that apply)
A. Consider the cost of a DOE.
B. The root cause for the defective product characteristic needs to be found.
C. The variation needs to be affected by the input factors.
D. The response time to calls needs to be reduced.
E. The DOE effect on the project timeline needs to be considered.
11. What is the best reason for not selecting too large of a difference among the factor
levels in the Experimental Design?
A. The process output must not change too much.
B. The process may show little change if curvature exists and the local maximum of the
process output is between the large differences of factor levels chosen.
C. The experimental factors have rarely been operating in such a wide range.
D. The experiment must have Center Points if the factor levels are wide.
12. Which statements are correct about Experimental Designs? (check all that apply):
A. An Experimental Design cannot be orthogonal if not balanced.
B. An Experimental Design can be a balanced design but not orthogonal although it is
encouraged to use only balanced and orthogonal designs.
C. The use of blocking can be used for accounting of the impact of Noise variables.
D. Center Points are not recommended unless the experimenter is attempting to
optimize the process.
E. A resolution IV design has only 4-way interactions confounded with Main Effects.
14. Executing the Experimental Design implies which correct statements? (check all that
apply)
A. The experiment can only be run on a product or service that the customer will not
experience.
B. If the experiment is well documented with the operators, it is not recommended to
have team members present during the experiment to save on space and allow for
uninhibited movement of the process.
C. If the experiment is going to start in a week, contact the Process Owners to work out
the needs before the experiment.
D. Use a log book and note any unusual observations during the experiment.
15. Statistical significance is the only important criterion for factors being included in the experiment's mathematical model.
True False
16. The last step of the recommended 11 step DOE methodology is:
A. Draw Practical Solutions
B. Implement solutions
C. Discuss results with the Process Owner
D. Plan the next Design of Experiment required
17. In 2-level factorial Experimental Designs, the total number of degrees of freedom is equal
to:
A. The number of experimental runs minus 2
B. The number of experimental runs minus 1
C. The number of experimental runs
D. The number of experimental runs minus the number of main factors in the
mathematical model
E. The number of residuals
18. If an Experimental Design has 3 factors with no replicates and 5 Center Points in the full
factorial 2-level design, the total number of experimental runs is best described as:
A. 13
B. 8 if the experiment can have two sets of conditions run simultaneously
C. 15
D. 30 if the number of blocks is 2
19. Which statements are correct about these 2-level factorial designs? (check all that apply)
A. A design with III resolution will not have Main Effects confounded with 2-way
interactions.
B. A design with IV resolution will not have Main Effects confounded with 2-way
interactions.
C. A design with V resolution will have 2-way interactions confounded with 3-way interactions.
D. A design with V resolution has no Main Effects confounded with any interactions.
E. A design with V resolution has no Main Effects confounded with other Main Effects.
F. A design with III resolution has no Main Effects confounded with other Main Effects.
Control Phase
Welcome to Control
Now that we have completed the Improve Phase we are going to jump into the Control Phase.
Welcome to Control will give you a brief look at the topics we are going to cover.
Welcome to Control
Overview
These are the modules
we will cover in the
Control Phase as we attempt to ensure that the gains we have made with our project remain in place. We will examine the meaning of each of these and show you how to apply them.

Welcome to Control
Advanced Experiments
Advanced Capability
Lean Controls
Defect Controls
Statistical Process Control (SPC)
Six Sigma Control Plans
Wrap Up & Action Items
DMAIC Roadmap
[DMAIC Roadmap graphic: Champion/Process Owner → Estimate COPQ → Establish Team → Measure … Improvement Selected → Align Systems and Structures → Go to Next Project, with the Control Phase highlighted.]
Control Phase
Lean Controls
Lean Controls
Overview
You can see in this section of the course we will look at the Vision of Lean, Lean Tools and
Sustaining Project Success.
We will examine the meaning of each of these and show you how to apply them.
Welcome to Control
Advanced Experiments
Advanced Capability
Lean Controls
– Vision of Lean Supporting Six Sigma
– Lean Tool Highlights
– Project Sustained Success
Defect Controls
Statistical Process Control (SPC)
Six Sigma Control Plans
Wrap Up & Action Items
Lean Controls
You’ve begun the process of sustaining your project after finding the “vital few” X’s to your project.
In the last module with Advanced Process Capability, we discussed removing some of the Special
Causes causing spread from outliers in the process performance.
This module gives more tools from the Lean toolbox to stabilize your process.
Belts, after some practice, often consider this module's set of tools a way to improve processes that are totally “out of control” or of such poor Process Capability that they must be cleaned up before applying the Six Sigma methodology.
The tools we are going to review within this module can be used to help control a process. They can be utilized at any time in an improvement effort, not just control. These Lean concepts can be applied to help reduce variation, affect outliers or clean up a process before, during or at the conclusion of a project.
Lean Controls
The Continuous Goal… Sustaining Results: Kaizen and Kanban.
We cannot sustain Kanban without Kaizen. We cannot sustain a visual factory without 5S.
The specifics of MUDA were discussed in the Define Phase:
Lean Controls
The Goal
Remember that any project needs to be sustained. Muda (pronounced like mooo dah) are wastes that can reappear if the following Lean tools are not used. The goal is to have your Belts move on to other projects and not be used as firefighters.
Don't forget the goal -- Sustaining your Project which eliminates MUDA! With this in mind, we will introduce and review some of the Lean tools used to sustain your project success.
5S - Workplace Organization
The term “5S” derives from the Japanese words for five practices leading to a clean and manageable work area. The five “S” are: 'Seiri' means to separate needed tools, parts and instructions from unneeded materials and to remove the latter. 'Seiton' means to neatly arrange and identify parts and tools for ease of use. 'Seiso' means to conduct a cleanup campaign. 'Seiketsu' means to conduct seiri, seiton and seiso at frequent, indeed daily, intervals to maintain a workplace in perfect condition. 'Shitsuke' means to form the habit of always following the first four S's.

• 5S means the workplace is clean, there is a place for everything and everything is in its place.
• 5S is the starting point for implementing improvements to a process.
• To ensure your gains are sustainable, you must start with a firm foundation.
• Its strength is contingent upon the employees and company being committed to maintaining it.
On the next page the Japanese words are translated to English. Simply put, 5S means the workplace is clean, there is a place for everything and everything is in its place. 5S will create a workplace that is suitable for, and will stimulate, high quality and high productivity work. Additionally it will make the workplace more comfortable and a place in which you can take pride.
Developed in Japan, this method assumes no effective, quality job can be done without a clean and safe environment and without behavioral rules.
5S allows you to set up a well adapted and functional work environment, ruled by simple yet effective rules. 5S deployment is done in a logical and progressive way. The first three S's are workplace actions, while the last two are sustaining and progress actions.
It is recommended to start implementing 5S in a well chosen pilot workspace or pilot process and
spread to the others step by step.
Lean Controls
Seiri = Sorting
Eliminate everything not required for the current work, keeping only the bare essentials.
Seiton = Straightening
Arrange items in a way that they are easily visible and accessible.
Seiso = Shining
Clean everything and find ways to keep it clean. Make cleaning a part of your everyday
work.
Seiketsu = Standardizing
Create rules by which the first three S’s are maintained.
Shitsuke = Sustaining
Keep 5S activities from unraveling.
Lean Controls
For items that are useful, there is also a method for determining how and where they should be
stored to help you achieve a clean and orderly workplace.
Lean Controls
[Graphic: items are classed A, B or C by frequency of use versus the distance at which they are stored.]
After you have determined the usefulness of an item, set three classes for determining where to store an
item based on the frequency of use and the distance to travel to get the item. “A” is for things which are
to be kept close at hand because the frequency of use is high. “B” is if the item is used infrequently but approximately on a weekly basis. Do not put it on your work surface; rather, keep it within easy walking distance, i.e. on a bookshelf or in a nearby cabinet, usually in the same room you are in. For “C” items it is acceptable to store them in a somewhat remote place, meaning a few minutes' walk away.
By rigorously applying the sort action and the prescribed method, you will find that the remainder of the
5S items will be quite easy to accomplish. It is very difficult to order a large number of items in a given
space and the amount of cleaning increases with the number of items. Your workplace should only
contain those items needed on a daily to weekly basis to perform your job.
Lean Controls
Lean Controls
Exercise:
– Can you come up with any opportunities for “VISUAL” aids in your project?
– What visual aids exist to manage your process?
Lean Controls
Standardized work does not happen without the visual factory, which can be further described with:
Availability of required tools (5S). Operators cannot be expected to maintain standard work if required to locate needed tools.
The steps in developing CTQ’s are identifying the customer, capturing the Voice of the Customer and
finally validating the CTQ’s.
Lean Controls
What is Kaizen?
A Kaizen event is very similar to a Six Sigma project. A Six Sigma project is actually a Kaizen. By involving your project team or others in an area to assist with implementing the Lean Controls or concepts, you will increase the buy-in of the team, which will affect your project's sustainability.
Measurable Process. Without standardized work, we really wouldn't have a consistent process to measure. Cycle times would vary, assembly methods would vary, batches of materials would be mixed, etc…
Analysis Tools. There are improvement projects in each organization which cannot be solved by an operator. This is why we teach the analysis tools in the breakthrough strategy of Six Sigma.
Operator Support. The organization needs to understand that its future lies in the success of the value-adding employees. Our roles as Belts are to convince operators that we are here for them -- they will then be there for us.
A Kaizen event can be small or large in scope. Kaizens are improvements with the purpose of constantly improving a process. Some Kaizens are very small changes, like a new jig or placement of a product; others are more involved projects. Kaizens are Six Sigma projects with business impact.
Lean Controls
What is Kanban?
This is a building block. A Kanban needs to be supported by the previous steps we have reviewed. If Kanbans are abused they will actually backfire and affect the process in a negative manner.
Lean Controls
• Material handlers must be trained in the organization of the transportation system.
As we have indicated, if you do NOT have 5S, visual factory, standardized work and ongoing Kaizens, Kanbans cannot succeed.
It is not possible to implement a viable Kanban system without a strong support structure made up of the prerequisites. One of the most difficult concepts for people to integrate is the simplicity of the Lean tools… and to keep the discipline. Benchmarks have organizations using up to seven years to implement a successful Kanban System all the way through the supplier and customer supply chain.
Lean Controls
1. The TEAM should 5S the project area and begin integrating visual factory indicators.
– Indications of the need for 5S are:
– Outliers in your project metric
– Loss of initial gains from project findings
4. Project Scope dictates how far up the Lean tools ladder you need to implement measures to sustain any project success from your DMAIC efforts.
The 5 Lean concepts are an excellent method for Belts to sustain their project success. If you have outliers, declining benefits or dropping process capability, you need to consider the concepts presented in this module.
Class Exercise
Lean Controls
Notes
Control Phase
Defect Controls
Defect Controls
Overview
Welcome to Control
Advanced Experiments
Advanced Capability
Lean Controls
Defect Controls
– Realistic Tolerance and Six Sigma Design
– Process Automation or Interruption
– Poka-Yoke
Statistical Process Control (SPC)
Six Sigma Control Plans
Wrap Up & Action Items
In an effort to put in place Defect Controls we will examine Tolerances, Process Automation and
Poka-Yoke.
We will examine the meaning of each of these and show you how to apply them.
With Defect Prevention we want to ensure that the improvements created during the project stay in place.
Defect Controls
The best approach to Defect Prevention is to design Six Sigma right into the process.
[Graphic: the distribution of X, through the relationship Y = F(x), produces the distribution of Y relative to the specification on Y.]
When designing the part or process, specifications on X are set such that the target capability on Y is achieved.
Both the target and tolerance of the X must be addressed in the spec limits.
6σ Product/Process Design
[Graphic: the same relationship Y = F(x), now shown with upper and lower prediction intervals around the regression line, connecting the specification on Y to the distribution of X.]
Defect Controls
[Graphic: a regression of output versus input with 95% prediction interval bands.]
What is the tolerance range for the input?
If you want 6σ performance, you will remember to tighten the output's specification to select the tolerance range of the input.
Usually we use the prediction band provided by MINITAB™. This is controllable by manipulation of the confidence intervals: 90%, 95%, 99%, etc. Play with adjusting the prediction bands to see the effect it has.
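How such a prediction band arises can be sketched by hand. This is a minimal illustration, not MINITAB output: the x/y data are made up, and the t value is the standard table value for 6 degrees of freedom.

```python
import math

# Fit a simple linear regression and compute a 95% prediction interval
# for a new input value x0 (the band MINITAB labels "95% PI").
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [18.9, 17.2, 15.8, 14.1, 12.2, 10.4, 9.1, 7.3]   # illustrative data

n = len(x)
xbar, ybar = sum(x) / n, sum(y) / n
sxx = sum((xi - xbar) ** 2 for xi in x)
b1 = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / sxx
b0 = ybar - b1 * xbar

sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
s = math.sqrt(sse / (n - 2))        # residual standard error

t = 2.447                           # t(0.975, df = n - 2 = 6) from tables
x0 = 4.5
half_width = t * s * math.sqrt(1 + 1 / n + (x0 - xbar) ** 2 / sxx)
yhat = b0 + b1 * x0
print(f"predicted Y at x0: {yhat:.2f}, 95% PI: +/-{half_width:.2f}")
```

The band is narrowest at the mean of the inputs and widens toward the extremes, which is why the tolerance you can claim for the input depends on where along the regression you read it.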
Regression Plot: Y = 2.32891 - 0.282622X, R-Sq = 96.1%
[Graphic: two regression plots of Output versus Input with prediction bands. Note: the high output spec connects with the top line in both cases.]
Using the top output spec determines the high or low tolerance for the input, depending on the slope of the regression.
Defect Controls
Regression Plot: Y = -4.7E-01 + 0.811312X, R-Sq = 90.4%
[Graphic: a regression plot with wide 95% prediction interval bands.]
Poor correlation does not allow for tighter tolerancing.
5 – 6 σ Full Automation
Full Automation: systems that monitor the process and automatically adjust critical X's to correct settings.
Automation can be an option as well, since it removes the human element and its inherent variation. Use caution, though: many times people jump into automation prematurely. If you automate a poor process, what will that do for you?
Defect Controls
4 – 5 σ Process Interruption
Defect Controls
Example:
• A Black Belt is working on launching a new electric drive unit on a transfer system.
– One common failure mode of the system is a bearing failure on the main motor shaft.
– It was determined that a high press fit at bearing installation was causing these failures.
– The root cause of the problem turned out to be undersized bearings from the supplier.
• Until the supplier could be brought into control or replaced, the team implemented a press load monitor at the bearing press with an indicator.
– If the monitor detects a press load higher than the set point, it shuts down the press and will not allow the unit to be removed from the press until an interlock key is turned and the ram reset in the manual mode.
– Only the line lead person and the supervisor have keys to the interlock.
– The non-conforming part is automatically marked with red dye.
Process Interruption
3 – 5 σ Mistake Proofing
Mistake Proofing is great because it is usually inexpensive and very effective. Consider the many everyday examples of Mistake Proofing. You cannot fit the diesel gas hose into an unleaded vehicle gas tank. Pretty straightforward, right?
Mistake Proofing is best defined as:
– Using wisdom, ingenuity or serendipity to create devices allowing a 100% defect free step 100% of the time.
Poka-Yoke is the Japanese term for mistake proofing: to avoid (“yokeru”) inadvertent errors (“poka”).
[Graphic: eight numbered everyday examples. See if you can find the Poka-Yokes!]
Defect Controls
This clearly highlights the difference between the two approaches. What are the benefits of the Source Inspection method?
Traditional Inspection: a worker or machine error either goes unaddressed (“don't do anything”) or results in a defective unit sorted out at another step.
Source Inspection: “KEEP ERRORS FROM TURNING INTO DEFECTS” – the operation is shut down (stopped) when the error occurs.
Defect Controls
The very best approaches make creating a defect impossible, recall the gas hose example, you
can not put diesel fuel into an unleaded gas tank unless you really try hard or have a hammer.
Contact Method
– Physical or energy contact with the product
• Limit switches
• Photo-electric beams
Fixed Value Method
– Number of parts to be attached/assembled etc. is constant
Motion-Step Method
– Number of steps done in the operation
• Limit switches
Examples: (1) guide pins of different sizes, (2) error detection and alarms, (3) limit switches, (4) counters.
Defect Controls
To see a much more in-depth review of improving product or service quality by preventing defects you MUST review the book shown here. A comprehensive 240 Poka-Yoke examples are shown and can be applied to many industries. The Poka-Yokes are meant to address errors from processing, assembly, mounting, insertion, measurement, dimensioning, labeling, inspection, painting, printing, misalignment and many other causes.
Defect Controls
Involve everyone in defect prevention:
– Establish process capability through SPC
– Establish and adhere to standard procedures
– Make daily improvements
– Invent Mistake-proofing devices
Class Exercise
You have 30 minutes!
Defect Controls
Notes
Control Phase
Statistical Process Control
We will now continue in the Control Phase with “Statistical Process Control or SPC”.
Overview
Welcome to Control
Advanced Experiments
Advanced Capability
Lean Controls
Defect Controls
Statistical Process Control (SPC)
– Elements and Purpose
– Methodology
– Special Cause Tests
– Examples
Six Sigma Control Plans
Wrap Up & Action Items
Statistical techniques can be used to monitor and manage process performance. Process
performance, as we have learned, is determined by the behavior of the inputs acting upon it in the
form of Y=f(X). As a result it must be well understood that we can only monitor the performance of a
process output. Many people have applied Statistical Process Control (SPC) to only the process
outputs. Because they were using SPC, their expectations were high regarding a new potential level
of performance and control over their processes. However, because they only applied SPC to the
outputs, they were soon disappointed. When you apply SPC techniques to outputs, it is
appropriately called Statistical Process Monitoring or SPM.
You of course know that you can only control an output by controlling the inputs that exert an
influence on that output. This is not to say that applying SPC techniques to an output is bad, there
are valid reasons for doing this. Six Sigma has helped us all to better understand where to apply
such control techniques.
In addition to controlling inputs and monitoring outputs, control charts are used to determine the
Baseline performance of a process, evaluate measurement systems, compare multiple processes,
compare processes before and after a change, etc. Control Charts can be used in many situations
that relate to process characterization, analysis and performance.
To better understand the role of SPC techniques in Six Sigma, we will first investigate some of the
factors that influence processes, then review how simple probability makes SPC work and finally
look at various approaches to monitoring and controlling a process.
Using rational subgroups is a common way to assure that this does not happen. A rational subgroup is a
sample of a process characteristic in which all the items in the sample were produced under very similar
conditions and in a relatively short time period. Rational subgroups are usually small in size, typically
consisting of 3 to 5 units to make up the sample. It is important that rational subgroups consist of units
that were produced as closely as possible to each other, especially if you want to detect patterns, shifts
and drifts. If a machine is drilling 30 holes a minute and you wanted to collect a sample of hole sizes, a
good rational subgroup would consist of 4 consecutively drilled holes. The selection of rational subgroups
enables you to accurately distinguish Special Cause variation from Common Cause variation.
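The drilled-hole example can be sketched in a few lines; the measurements below are illustrative, not from the text. Consecutive holes are grouped into rational subgroups of 4, and each subgroup's mean and range capture Common Cause variation within a short time window.

```python
# A stream of hole diameters from a machine drilling 30 holes a minute
# (made-up values). Consecutive units form the rational subgroups.
holes = [5.02, 4.98, 5.01, 5.00, 5.05, 4.97, 5.03, 4.99,
         5.01, 5.04, 4.96, 5.00]

# Group 4 consecutively drilled holes into each rational subgroup.
subgroups = [holes[i:i + 4] for i in range(0, len(holes), 4)]

# Subgroup means track shifts and drifts; subgroup ranges track
# short-term (Common Cause) variation.
means = [sum(g) / len(g) for g in subgroups]
ranges = [max(g) - min(g) for g in subgroups]
```

Plotting the means and ranges over time is exactly what an Xbar-R chart does; the within-subgroup ranges set the yardstick against which between-subgroup shifts are judged.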
Make sure that your samples are not biased in any way, meaning that they are randomly selected. For
example, do not plot only the first shift’s data if you are running multiple shifts. Don’t look at only one
vendor’s material if you want to know how the overall process is really running. Finally, don’t concentrate
on a specific time to collect your samples; like just before the lunch break.
If your process consists of multiple machines, operators or other process activities that produce streams
of the same output characteristic you want to control, it would be best to use separate Control Charts for
each of the output streams.
If the process is stable and in control, the sample observations will be randomly distributed around the
average. Observations will not show any trends or shifts and will not have any significant outliers from the
random distribution around the average. This type of behavior is to be expected from a normally operating
process and that is why it is called Common Cause variation. Unless you are intentionally trying to
optimize the performance of a process to reduce variation or change the average, as in a typical Six
Sigma project, you should not make any adjustments or alterations to the process if it is demonstrating
only Common Cause variation. That can be a big time saver since it prevents “wild goose chases.”
• An I-MR Chart combines a Control Chart of the average moving range with the Individuals Chart.
• You can use Individuals Charts to track the process level and to detect the presence of Special Causes when the sample size is 1.
• Seeing both charts together allows you to track both the process level and process variation at the same time, providing greater sensitivity that can help detect the presence of Special Causes.
[Figure: Individuals Chart (observations 1-30 plotted with the Xbar center line, UCL and LCL) above an MR Chart (moving ranges plotted with the Rbar center line, UCL and LCL)]
Individuals (I) and Moving Range (MR) Charts are used when each measurement represents one
batch. The subgroup size is equal to one when I-MR Charts are used. These charts are very
simple to prepare and use. The graphic shows the Individuals Chart where the individual
measurement values are plotted with the Center Line being the average of the individual
measurements. The Moving Range Chart shows the range between two subsequent
measurements.
There are certain situations when opportunities to collect data are limited or when grouping the
data into subgroups simply doesn't make practical sense. Perhaps the most obvious of these
cases is when each individual measurement is already a rational subgroup. This might happen
when each measurement represents one batch, when the measurements are widely spaced in
time or when only one measurement is available in evaluating the process. Such situations include
destructive testing, inventory turns, monthly revenue figures and chemical tests of a characteristic
in a large container of material.
All of these situations indicate a subgroup size of one. Because this chart is dealing with individual
measurements, it is not as sensitive as the X-Bar Chart in detecting process changes.
If each of your observations consists of a subgroup of data, rather than just individual
measurements, an Xbar-R Chart provides greater sensitivity. Failure to form rational
subgroups correctly will make your Xbar-R Charts dangerously wrong.
[Figure: Xbar Chart (subgroups 1-30 plotted with the Xbarbar center line, UCL and LCL) above an R Chart (subgroup ranges plotted with the Rbar center line, UCL and LCL)]
These charts are most effective when they are used together. Each chart individually shows only a
portion of the information concerning the process characteristic. The upper chart shows how the
process average (central tendency) changes. The lower chart shows how the variation of the process
has changed.
It is important to control both the process average and the variation separately because different
corrective or improvement actions are usually required to effect a change in each of these two
parameters.
The R Chart must be in control in order to interpret the averages chart because the Control Limits are
calculated considering both process variation and center. When the R Chart is not in control, the
control limits on the averages chart will be inaccurate and may falsely indicate an out of control
condition. In this case, the lack of control will be due to unstable variation rather than actual changes
in the averages.
XBar and RBar Charts are often more sensitive than I-MR, but are frequently done incorrectly. The
most common error is failure to perform rational sub-grouping correctly.
A rational subgroup is simply a group of items made under conditions that are as nearly identical as
possible. Five consecutive items, made on the same machine, with the same setup, the same raw
materials and the same operator, are a rational subgroup. Five items made at the same time on
different machines are not a rational subgroup. Failure to form rational subgroups correctly will make
your XBar-R Charts dangerously wrong.
[Figure: U Chart (samples 1-30) plotting DPU with the Ubar center line and varying UCL and LCL]
The U Chart plots defects per unit data collected from subgroups of equal or unequal sizes. The "U"
in U Charts stands for defects per Unit. U Charts plot the proportion of defects that are occurring.
The U Chart and the C Chart are very similar. They both look at defects, but the U Chart does not
need a constant sample size like the C Chart does. The Control Limits on the U Chart vary with the
sample size and therefore they are not uniform, similar to the P Chart which we will describe next.
Counting defects on forms is a common use for the U Chart. For example, defects on insurance
claim forms are a problem for hospitals. Every claim form has to be checked and corrected before
going to the insurance company. When completing a claim form, a particular hospital must fill in 13
fields to indicate the patient’s name, social security number, DRG codes and other pertinent data. A
blank or incorrect field is a defect.
A hospital measured their invoicing performance by calculating the number of defects per unit for
each day’s processing of claims forms. The graph demonstrates their performance on a U Chart.
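Under the standard U Chart formulas (center line u-bar, with limits u-bar +/- 3*sqrt(u-bar/n) where n varies by subgroup), the hospital's daily limits could be sketched as follows. This is a minimal illustration; the function name and the sample data are our own, not from the text.

```python
# Sketch of U Chart calculations for defects-per-unit data with
# varying subgroup sizes. u-bar is total defects over total units;
# each subgroup gets its own limits because the sizes differ.
from math import sqrt

def u_chart_limits(defects, sizes):
    """Return u_bar and a per-subgroup list of (LCL, UCL) pairs."""
    u_bar = sum(defects) / sum(sizes)
    limits = []
    for n in sizes:
        half_width = 3 * sqrt(u_bar / n)
        # LCL is floored at zero; a defect rate cannot be negative
        limits.append((max(0.0, u_bar - half_width), u_bar + half_width))
    return u_bar, limits

# Illustrative data: defective fields found per day, and claim forms checked per day
defects = [12, 8, 15, 9]
sizes = [100, 80, 120, 90]
u_bar, limits = u_chart_limits(defects, sizes)
```

Note how each day's Control Limits widen as that day's sample size shrinks, which is why U Chart limits look uneven when subgroup sizes vary.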
[Figure: P Chart (samples 1-30) plotting Proportion Defective (P) with the Pbar center line and varying UCL and LCL]
The P Chart plots the proportion of nonconforming units collected from subgroups of equal or
unequal size (percent defective). The proportion of defective units observed is obtained by dividing
the number of defective units observed in the sample by the number of units sampled. The P Chart's
name comes from plotting the Proportion of defectives. When using samples of different sizes, the
upper and lower Control Limits will not remain the same - they will look uneven as exhibited in the
graphic. These varying Control Chart limits are effectively managed by Control Charting software.
A common application of a P Chart is when the data is in the form of a percentage and the sample
size for the percentage has the chance to be different from one sample to the next. An example
would be the number of patients that arrive late each day for their dental appointments. Another
example is the number of forms processed daily that had to be reworked due to defects. In both of
these examples, the total quantity would vary from day to day.
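The standard P Chart calculation (center line p-bar, limits p-bar +/- 3*sqrt(p-bar*(1 - p-bar)/n) with n varying per subgroup) can be sketched as below. The function name and data are illustrative only.

```python
# Sketch of P Chart calculations for proportion-defective data with
# varying sample sizes, e.g. late dental patients out of each day's appointments.
from math import sqrt

def p_chart_limits(defectives, sizes):
    """Return p_bar and a per-subgroup list of (LCL, UCL) pairs."""
    p_bar = sum(defectives) / sum(sizes)
    limits = []
    for n in sizes:
        half_width = 3 * sqrt(p_bar * (1 - p_bar) / n)
        # proportions are naturally bounded by 0 and 1
        limits.append((max(0.0, p_bar - half_width), min(1.0, p_bar + half_width)))
    return p_bar, limits

# Illustrative data: late arrivals per day, and appointments per day
p_bar, limits = p_chart_limits([4, 7, 3, 6], [40, 55, 35, 50])
```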
SPC on X's or Y's with fully trained operators and staff who respect the rules. Once a
chart signals a problem everyone understands the rules of SPC and agrees to shut down
for special cause identification. (Cpk > certain level).
SPC on X's or Y's with fully trained operators. The operators have been trained and
understand the rules of SPC, but management will not empower them to stop for
investigation.
S.O.P. is implemented to attempt to detect the defects. This action is not sustainable
short-term or long-term.
The second most effective control is called a type 2 corrective action. This a control applied to the
process which will detect when an error condition has occurred and will stop the process or shut
down the equipment so that the defect will not move forward. This is the “detection” application of
the Poka-Yoke method.
The third most effective form of control is to use SPC on the X’s with appropriate monitoring on the
Ys. To be effective, employees must be fully trained, they must respect the rules and management
must empower the employees to take action. Once a chart signals a problem, everyone understands
the rules of SPC and agrees to take emergency action for special cause identification and
elimination.
The fourth most effective corrective action is the implementation of a short-term containment which
is likely to detect the defect caused by the error condition. Containments are typically audits or 100%
inspection.
Finally, you can prepare and implement an S.O.P. (standard operating procedure) to attempt to
manage the process activities and to detect process defects. This action is not sustainable, either
short-term or long-term.
Do not do SPC for the sake of just saying that you do SPC. It will quickly deteriorate to a waste of
time and a very valuable process tool will be rejected from future use by anyone who was
associated with the improper use of SPC.
Using the correct level of control for an improvement to a process will increase the acceptance of
changes/solutions you may wish to make and it will sustain your improvement for the long-term.
SPC is used to detect special cause variation telling us the process is "out of control" but does NOT tell us why.
SPC has its uses because it is known that every process has variation, categorized as Special Cause and
Common Cause variation. Special Cause variation is unnatural variability because of assignable causes
or pattern changes. SPC is a powerful tool to monitor and improve the variation of a process. This
powerful tool is often an aspect used in visual factories. If a supervisor, operator or staff member is able to
quickly monitor how the process is operating by looking at the key inputs or outputs of the process, this
would exemplify a visual factory.
SPC is used to detect Special Causes in order to have those operating the process find and remove the
Special Cause. When a Special Cause has been detected, the process is considered to be “out of
control”.
SPC gives an ongoing look at the Process Capability. It is not a capability measurement but it is a visual
indication of the continued Process Capability of your process.
[Figure: Individuals chart (observations 1-28) with Special Cause variation detected beyond the Control Limits; the Process Center (usually the mean) is X=29.06, with Control Limits UCL=55.24 and LCL=2.87]
Control Charts were first developed by Dr. Shewhart in the early 20th century in the U.S. Control
Charts are a graphical and visual plot of a process charted over time, like a Time Series
Chart. From a visual management aspect, a Time Plot is more powerful than knowledge of the last
measurement. These charts are meant to indicate change in a process. All SPC charts have a
Central Line and Control Limits to aid in detecting Special Cause variation.
Notice, again, we never discussed showing or considering specifications. We are advising you to
never have specification limits on a Control Chart because of the confusion often generated.
Remember we want to control and maintain the newly improved process based on
the recently improved past. These Control Charts and their limits are the Voice of the Process, not
the Voice of the Customer, which is the specification limits.
Control Charts indicate when a process is "out of control" or exhibiting special cause
variation but NOT why!
SPC charts allow workers and supervision to maintain improved process performance
from Six Sigma projects.
Control limits describe the process variability and are unrelated to customer
specifications. (Voice of the Process instead of Voice of the Customer)
– An undesirable situation is having control limits wider than customer
specification limits. This will exist for poorly performing processes with a Cp
less than 1.0
[Figure: Control Chart schematic - Mean, Upper and Lower Control Limits plotted over a Process Sequence/Time Scale; points within the limits reflect Common Cause variation (process is "In Control"), points beyond them reflect Special Cause variation (process is "Out of Control")]
SPC monitors the consistency of processes producing products and services. A primary SPC tool is
the Control Chart - a graphical representation for specific quantitative measurements of a process
input or output. In the Control Chart, these quantitative measurements are compared to decision
rules calculated based on probabilities from the actual measurement of process performance.
The comparison between the decision rules and the performance data detects any unusual variation
in the process that could indicate a problem with the process. Several different descriptive statistics
can be used in Control Charts. In addition, there are several different types of Control Charts that can
test for different causes, such as how quickly major vs. minor shifts in process averages are detected.
Control Charts are Time Series Charts of all the data points with one addition. The Standard
Deviation for the data is calculated for the data and two additional lines are added to the chart. These
lines are placed +/- 3 Standard Deviations away from the Mean and are called the Upper Control
Limit (UCL) and the Lower Control Limit (LCL). Now the chart has three zones: (1) the zone between
the UCL and the LCL, which is called the zone of Common Cause variation, (2) the zone above the
UCL, which is a zone of Special Cause variation and (3) another zone of Special Cause variation below
the LCL.
Control Charts graphically highlight data points that do not fit the normal level of expected variation.
This is mathematically defined as being more than +/- 3 Standard Deviations from the Mean. It's all
based on probabilities. We will now demonstrate how this is determined.
Certified Lean Six Sigma Black Belt Book Copyright OpenSourceSixSigma.com
[Figure: normal curve with outliers beyond +/- 3 Standard Deviations; +/- 1, 2 and 3 Standard Deviations cover 68%, 95% and 99.7% of the distribution respectively]
Control Charts provide you with two basic functions; one is to provide time based information on the
performance of the process which makes it possible to track events affecting the process and the
second is to alert you when Special Cause variation occurs. Control Charts graphically highlight data
points that do not fit the normal level of variation expected. It is standard that the Common Cause
variation level is defined as +/- 3 Standard Deviations from the Mean. These limits are also known as
the UCL and LCL respectively.
Recall the "area under the curve" discussion in the lesson on Basic Statistics, remembering that +/- one
Standard Deviation represented 68% of the distribution, +/- 2 was 95% and +/- 3 was 99.7%. You also
learned from a probability perspective that you would expect the output of a process to have a
99.7% chance of being between +/- 3 Standard Deviations. You also learned that the sum of all probability
must equal 100%. There is only a 0.3% chance (100% - 99.7%) that a data point will be beyond +/- 3
Standard Deviations. In fact, since we are talking about two zones, one above the +3 Standard
Deviations and one below it, we have to split 0.3% in two, meaning that there is only a 0.15% chance of
being in one of the zones.
There is only a .0015 (.15%) probability that a data point will fall in one of these zones, above the UCL or
below the LCL. That is a very small probability as compared to the .997 (99.7%) probability that the data
point will be between the UCL and the LCL. What this means is there must have been something special
happen to cause a data point to be that far from the Mean, like a change in vendor, a mistake, etc. This
is why the term Special Cause or assignable cause variation applies. The probability that a data point
was this far from the rest of the population is so low that something special or assignable happened.
Outliers are just that; they have a low probability of occurring, meaning we have lost control of our
process. This simple, quantitative approach using probability is the essence of all Control Charts.
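These tail probabilities can be checked with the standard normal CDF using only the standard library. This is a minimal sketch; the helper name phi is our own. Note the exact two-tail value is about 0.27%, which the text rounds to 0.3%.

```python
# Probability of a point falling beyond the +/- 3 sigma Control Limits,
# computed from the standard normal distribution.
from math import erf, sqrt

def phi(z):
    """Standard normal cumulative distribution function via the error function."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

p_outside = 2 * (1 - phi(3))   # both tails combined, about 0.0027 (0.27%)
p_one_tail = 1 - phi(3)        # one zone only, about 0.00135 (0.135%)
```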
Size of Subgroups
[Figure: Lots 1-5 each sampled as short-term studies within one long-term study]
Let’s consider if you were tracking delivery times for quotes on new business with an SPC chart. If
you decided to not include averaging across product categories, you might find product categories
are assignable causes but you might not find them as Special Causes since you've included them
in the subgroups as part of your rationalization.
You really want to have subgroups with only Common Cause variation so if other sources of
variation are detected, the sources will be easily found instead of buried within your definition of
subgroups.
Frequency of Sampling
Sampling Frequency is a balance between cost of sampling and testing versus cost of not detecting
shifts in mean or variation.
Process knowledge is an input to frequency of samples after the subgroup size has been decided.
- If a process shifts but cannot be detected because of too infrequent sampling, the
customer suffers
- If choice is given of large subgroup samples infrequently or smaller subgroups
more frequently, most choose to get information more frequently.
- In some processes with automated sampling and testing, frequent sampling is
easy.
If undecided as to sample frequency, sample more frequently to confirm detection of process shifts
and reduce frequency if process variation is still detectable.
A rule of thumb also states "sample a process at least 10X more frequently than the frequency of 'out of
control' conditions".
Sometimes it can be a struggle to decide how often to sample your process when monitoring results. Unless the
measurement is automated, inexpensive, recorded with computers and able to be charted with
SPC software without operator involvement, the frequency of sampling is an issue.
Let's reemphasize some points. First, you do NOT want to under sample and lose the ability to
find Special Cause variation easily. Second, do not be afraid to sample more frequently and then
reduce the frequency if it is clear Special Causes are found frequently.
Sampling too little will not allow for sufficient detection of shifts in the process because of special causes.
[Figure: I Chart of Sample_3 output sampled every half hour (UCL=7.385, X=6.129, LCL=4.815), compared with Individuals Charts of the same process sampled less frequently (X=5.85), illustrating how infrequent sampling can miss process shifts]
There are two categories of Control Charts for Continuous Data: charts for controlling the process
average and charts for controlling the process variation. Generally, the two categories are combined.
The principal types of Control Charts used in Six Sigma are: charts for Individual Values and Moving
Ranges (I-MR), charts for Averages and Ranges (XBar-R), charts for Averages and Standard
Deviations (XBar-S) and Exponentially Weighted Moving Average charts (EWMA).
Although it is preferable to monitor and control products, services and supporting processes with
Continuous Data, there will be times when Continuous Data is not available or there is a need to
measure and control processes with higher level metrics, such as defects per unit. There are many
examples where process measurements are in the form of Attribute Data. Fortunately, there are
control tools that can be used to monitor these characteristics and to control the critical process
inputs and outputs that are measured with Attribute Data.
Attribute Data, also called Discrete Data, reflects only one of two conditions: conforming or non-
conforming, pass or fail, go or no go. Four principal types of Control Charts are used to monitor and
control characteristics measured in Attribute Data: the p (proportion nonconforming), np (number
nonconforming), c (number of non-conformities) and u (non-conformities per unit) charts. These
charts are an aid to decision making. With Control Limits, they help us filter the probable noise by
adequately reflecting the Voice of the Process.
A defective is defined as an entire unit that fails to meet acceptance criteria, regardless of the
number of defects in the unit. A defect is defined as the failure to meet any one of the many
acceptance criteria. Any unit with at least one defect may be considered to be a defective.
Sometimes more than one defect is allowed, up to some maximum number, before the product is
considered to be defective.
Type of Chart – When do you need it?
• Pre-Control – Set-up is critical, or cost of setup scrap is high. Use for outputs.
• Exponentially Weighted Moving Average – Small shift needs to be detected, often because of autocorrelation of the output results. Used only for individuals or averages of Outputs. Infrequently used because of calculation complexity.
• Cumulative Sum – Same reasons as EWMA (Exponentially Weighted Moving Average) except the past data is as important as present data. Less common.
• C – When you want to track the number of defects per subgroup of units produced; sample size is constant.
The P Chart is the most common type of Attribute Control Chart.
Control Charts indicate special causes being either assignable causes or patterns.
The following rules are applicable for both variable and Attribute Data to detect
special causes.
These four rules are the only applicable tests for Range (R), Moving Range (MR), or
Standard Deviation (S) charts:
– One point more than 3 Standard Deviations from the center line.
– 6 points in a row all either increasing or all decreasing.
– 14 points in a row alternating up and down.
– 9 points in a row on the same side of the center line.
These remaining four rules are only for variable data to detect special causes:
– 2 out of 3 points greater than 2 Standard Deviations from the center line on the
same side.
– 4 out of 5 points greater than 1 Standard Deviation from the center line on the
same side.
– 15 points in a row all within one Standard Deviation of either side of the center
line.
– 8 points in a row all greater than one Standard Deviation of either side of the
center line.
Remember Control Charts are used to monitor process performance and to detect Special Causes
due to assignable causes or patterns. The standardized rules of your organization may differ slightly
in some of the numbers. For example, some organizations use 7 or 8 points in a row on the
same side of the Center Line. We will soon show you how to find what your MINITABTM version has
for defaults for the Special Cause tests.
There are typically 8 available tests for detecting Special Cause variation. Only 4 of the 8 Special
Cause tests can be used on Range, Moving Range or Standard Deviation charts, which are used to
monitor "within" variation.
If you are unsure of what is meant by these specific rule definitions, do not worry. The next few pages
will specifically explain how to interpret these rules.
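To show the detection logic, two of these rules can be sketched in code. This is an illustrative sketch only (function names and data are our own); the run length defaults to the 9-in-a-row value listed above, which your organization may set to 7 or 8.

```python
# Sketch of two Special Cause tests against a Control Chart's
# center line and estimated standard deviation.

def rule_1(points, center, sigma):
    """Indices of points more than 3 Standard Deviations from the center line."""
    return [i for i, x in enumerate(points) if abs(x - center) > 3 * sigma]

def rule_same_side(points, center, run=9):
    """Indices that complete a run of `run` consecutive points on one side of center."""
    hits, streak, side = [], 0, 0
    for i, x in enumerate(points):
        s = 1 if x > center else (-1 if x < center else 0)
        if s != 0 and s == side:
            streak += 1          # run continues on the same side
        elif s != 0:
            streak = 1           # run restarts on the other side
        else:
            streak = 0           # a point exactly on the center line breaks the run
        side = s
        if streak >= run:
            hits.append(i)
    return hits
```

In practice SPC software such as MINITABTM applies these tests automatically and flags the violating point with the test number.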
• If implementing SPC manually without software initially, the most visually obvious violations are
more easily detected. SPC on manually filled charts is commonplace for initial use of defect
prevention techniques.
• These 3 rules are visually the most easily detected by personnel.
– One point more than 3 Standard Deviations from the center line.
– 6 points in a row all either increasing or all decreasing.
– 15 points in a row all within one Standard Deviation of either side of the center line.
• Dr. Shewhart, who worked with the Western Electric Co., is credited with the following 4 rules,
referred to as the Western Electric Rules.
– One point more than 3 Standard Deviations from the center line.
– 8 points in a row on the same side of the center line.
– 2 out of 3 points greater than 2 Standard Deviations from the center line on the same side.
– 4 out of 5 points greater than 1 Standard Deviation from the center line on the same side.
• You might notice the Western Electric rules vary slightly. The important thing is to be consistent in
your organization and decide what rules you will use to detect special causes.
• VERY few organizations use all 8 rules for detecting special causes.
If a Belt is using MINITABTM, you must be aware of the default settings
for the rules. You can alter your program defaults with:
Tools>Options>Control Charts and Quality Tools>Define Tests
This would be changed to 8 if you prefer the Western Electric Rules.
When a Belt is using MINITABTM, the default tests can be set when
running SPC on the variable or Attribute Data.
Tools>Options>Control Charts and Quality Tools>Tests to Perform
A Belt can always change which tests are selected for any individual
SPC chart.
As promised, we will now closely review the definition of the Special Cause tests. The first test is one
point more than 3 sigmas from the center line (Test 1: One point beyond zone A). This is the MOST
common special cause test used in SPC charts.
If you want to see the MINITABTM output on the left, execute the MINITABTM command "Stat,
Control Charts, Variable Charts for Individuals, Individuals" and then select the "I chart options and
Tests tab". Remember, your numbers may vary in the slide and those are set in the defaults as
you were shown recently in this module. From now on, we will assume your rules are the same as
shown in this module. If not, just adjust the conclusions.
This rule obviously needs the time order when plotting on the SPC charts to be valid. Typically,
these charts plot increasing time from left to right with the most recent point on the right hand side of
the chart. Do not make the mistake of seeing six points in a line as indicating an out of control condition.
Note on the example shown on the right, a straight line shows seven points but it takes that many in
order to have six consecutive points increasing. This rule would be violated no matter what zone the
points occur in.
Have you noticed that MINITABTM will automatically place a number by the point that violates the
Special Cause rule and that number tells you which of the Special Cause tests has been violated.
In this example shown on the right, the Special Cause rule was violated two times.
Looking at the five consecutive points more than 1 sigma from the Center Line and on
the same side, do NOT make the wrong assumption that the rule would not be violated if one of the
four points was actually more than 2 sigma from the Center Line.
The seventh Special Cause test looks for 15 points in a row all within one sigma of the Center Line
(Test 7: Fifteen points in a row in zone C, both sides of center line). This test is indicating a dramatic
improvement of the variation in the process. You might think this is a good thing and it certainly is.
However, a process showing this behavior should be investigated to find the source of the reduced
variation so the improvement can be sustained in the future.
Do not be confused if some of the points are more than 2 sigma away from the Center Line. If
you reread the rule, it just states the points must be more than one sigma from the Center Line.
This is a reference for you in case you really want to get into the nitty-gritty. The formulas shown here
are the basis for Control Charts.
Xbar = (sum of the xi) / k        MRbar = (sum of the Ri) / (k - 1)
UCLx = Xbar + E2*MRbar        LCLx = Xbar - E2*MRbar
UCLmr = D4*MRbar        LCLmr = D3*MRbar
σ (st. dev. estimate) = MRbar / d2
Where:
Xbar: Average of the individuals, becomes the centerline on the Individuals chart
xi: Individual data points
k: Number of individual data points (giving k - 1 moving ranges)
Ri: Moving range between individuals, generally calculated using the difference between
each successive pair of readings
MRbar: The average moving range, the centerline on the range chart
UCLx: Upper control limit on individuals chart
LCLx: Lower control limit on individuals chart
UCLmr: Upper control limit on moving range chart
LCLmr: Lower control limit on moving range chart (does not apply for sample sizes below 7)
E2, D3, D4, d2: Constants that vary according to the sample size used in obtaining the moving range
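These I-MR formulas can be computed directly, as in this minimal sketch. The function name and data are our own; the constants are the standard published values for a moving range of two points (E2 = 2.66, D4 = 3.267, D3 = 0, d2 = 1.128).

```python
# Sketch of I-MR Control Chart limit calculations using a moving
# range of 2 consecutive points and the standard n=2 constants.

def imr_limits(xs):
    """Return I-MR chart limits and a sigma estimate for individual data."""
    mrs = [abs(b - a) for a, b in zip(xs, xs[1:])]   # k - 1 moving ranges
    x_bar = sum(xs) / len(xs)
    mr_bar = sum(mrs) / len(mrs)
    return {
        "UCLx": x_bar + 2.66 * mr_bar,    # E2 = 2.66
        "LCLx": x_bar - 2.66 * mr_bar,
        "UCLmr": 3.267 * mr_bar,          # D4 = 3.267
        "LCLmr": 0.0,                     # D3 = 0 for n = 2
        "sigma_hat": mr_bar / 1.128,      # d2 = 1.128
    }
```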
Xdoublebar = (sum of the subgroup averages Xbar-i) / k        Rbar = (sum of the Ri) / k
UCLx = Xdoublebar + A2*Rbar        LCLx = Xdoublebar - A2*Rbar
UCLr = D4*Rbar        LCLr = D3*Rbar
σ (st. dev. estimate) = Rbar / d2
Where:
Xdoublebar: Average of the subgroup averages, it becomes the centerline of the control chart
Xbar-i: Average of each subgroup
k: Number of subgroups
Ri: Range of each subgroup (Maximum observation - Minimum observation)
Rbar: The average range of the subgroups, the centerline on the range chart
UCLx: Upper control limit on averages chart
LCLx: Lower control limit on averages chart
UCLr: Upper control limit on range chart
LCLr: Lower control limit on range chart
A2, D3, D4, d2: Constants that vary according to the subgroup sample size
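The Xbar-R formulas above can likewise be sketched in code. The function name and data are illustrative; the constants hard-coded below are the standard published values for a subgroup size of n = 5 (A2 = 0.577, D4 = 2.114, D3 = 0, d2 = 2.326).

```python
# Sketch of Xbar-R Control Chart limit calculations for rational
# subgroups of size 5, using the standard n=5 constants.

def xbar_r_limits(subgroups):
    """Return Xbar-R chart limits and a sigma estimate from a list of subgroups."""
    A2, D3, D4, d2 = 0.577, 0.0, 2.114, 2.326   # constants for n = 5
    xbars = [sum(g) / len(g) for g in subgroups]
    ranges = [max(g) - min(g) for g in subgroups]
    xbarbar = sum(xbars) / len(xbars)           # centerline of the averages chart
    rbar = sum(ranges) / len(ranges)            # centerline of the range chart
    return {
        "UCLx": xbarbar + A2 * rbar, "LCLx": xbarbar - A2 * rbar,
        "UCLr": D4 * rbar, "LCLr": D3 * rbar,
        "sigma_hat": rbar / d2,
    }
```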
Yet another reference just in case anyone wants to do this stuff manually… have fun!
Xdoublebar = (sum of the subgroup averages Xbar-i) / k        Sbar = (sum of the si) / k
UCLx = Xdoublebar + A3*Sbar        LCLx = Xdoublebar - A3*Sbar
UCLs = B4*Sbar        LCLs = B3*Sbar
σ (st. dev. estimate) = Sbar / c4
Where:
Xdoublebar: Average of the subgroup averages, it becomes the centerline of the control chart
Xbar-i: Average of each subgroup
k: Number of subgroups
si: Standard deviation of each subgroup
Sbar: The average standard deviation of the subgroups, the centerline on the S chart
UCLx: Upper control limit on averages chart
LCLx: Lower control limit on averages chart
UCLs: Upper control limit on S chart
LCLs: Lower control limit on S chart
A3, B3, B4, c4: Constants that vary according to the subgroup sample size
We are now moving to the formula summaries for the attribute SPC Charts. These formulas are fairly
basic. The upper and lower Control Limits are equidistant from the Mean % defective unless you
reach a natural limit of 100% or 0%. Remember the P Chart is for tracking the proportion or % defective.
Calculate the parameters of the P Control Charts with the following:
Pbar = total number of defective units / total number of units sampled
UCLp = Pbar + 3*sqrt(Pbar*(1 - Pbar)/n)        LCLp = Pbar - 3*sqrt(Pbar*(1 - Pbar)/n)
The nP Chart's formulas resemble the P Chart's. This chart tracks the number of defective items in a
subgroup.
The U Chart is also basic in construction and is used to monitor the number of defects per unit.
The C Control Charts are a nice way of monitoring the number of defects in sampled subgroups:
UCLc = Cbar + 3*sqrt(Cbar)        LCLc = Cbar - 3*sqrt(Cbar)
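A C Chart calculation of this kind can be sketched in a few lines; the function name and defect counts are our own. The LCL is floored at zero since a defect count cannot be negative.

```python
# Sketch of C Chart limits for defect counts from constant-size subgroups.
from math import sqrt

def c_chart_limits(counts):
    """Return (Cbar, LCL, UCL) for a list of defect counts per subgroup."""
    c_bar = sum(counts) / len(counts)
    half_width = 3 * sqrt(c_bar)
    return c_bar, max(0.0, c_bar - half_width), c_bar + half_width

# Illustrative data: defects found in six equally sized subgroups
c_bar, lcl, ucl = c_chart_limits([7, 4, 9, 6, 5, 8])
```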
This EWMA can be considered a smoothing monitoring system with Control Limits. It is rarely used
without computers or automated calculations. The items plotted are NOT the actual measurements
but the weighted measurements. The exponentially weighted moving average is useful for considering
past and historical data and is most commonly used for individual measurements, although it has been
used for averages of subgroups.
The CUSUM is an even more difficult technique to handle with manual calculations. We aren't even
showing the math behind this rarely used chart. Following the Control Chart selection route shown
earlier, we remember the CUSUM is used when historical information is as important as present data.
Pre-Control Charts

Pre-Control Charts use limits relative to the specification limits. This is the first and ONLY chart on which you will see specification limits plotted for statistical process control. This is the most basic type of chart and an unsophisticated use of process control.

(Pre-Control zone diagram: the Red Zones lie outside the specification limits and signal that the process is out of control and should be stopped.)
Pre-Control Charts are often used for startups with high scrap cost or low production volumes between setups. Because they work like a stoplight, Pre-Control Charts are the easiest type of SPC for operators or staff to use. Remember, Pre-Control Charts are to be used ONLY for outputs of a process. Another approach to using Pre-Control Charts is to use process capability to set the limits where yellow and red meet.
Qualifying a Process

• To qualify a process, five consecutive parts must fall within the green zone
• The process should be re-qualified after tool changes, adjustments, new operators, material changes, etc.
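The five-in-green qualification rule above can be sketched as a simple running check (the zone labels and function name are illustrative, not from the text):

```python
def qualified(zones, run_length=5):
    """Return True once `run_length` consecutive parts fall in the green zone."""
    streak = 0
    for zone in zones:
        # Any part outside the green zone resets the consecutive count.
        streak = streak + 1 if zone == "green" else 0
        if streak >= run_length:
            return True
    return False
```

A yellow or red part anywhere in the run resets the count, so the process only qualifies on five greens in a row.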
SPC is an exciting tool, but we must not get enamored with it. The power of SPC is not to find the Center Line and Control Limits; its effectiveness at reducing long-term variation comes from responding immediately to out of control or Special Cause indications.

• The power of SPC isn't to find out what the Center Line and Control Limits are.
• The power is to react to the Out of Control (OOC) indications with your Out of Control Action Plans (OCAP) for the process involved. These actions are your corrective actions to correct the output or input to achieve proper conditions.

(Individual SPC chart for Response Time: X̄ = 18.38, UCL = 39.76, LCL = −3.01; one violation is flagged, with the annotation "additional person on phone bank".)

• SPC requires immediate response to a special cause indication.
• SPC also requires no "sub-optimizing" by those operating the process.
  – Variability will increase if operators adjust on every point that is not at the center line. ONLY respond when an Out of Control or special cause indication is detected.
  – Training is required to interpret the charts and respond to them.
SPC can actually be harmful if those operating the process respond to process variation by sub-optimizing. A basic rule of SPC: if the process is not out of control as indicated by the rules, do not make any adjustments. Studies have shown that an operator who responds to every off-center measurement will actually produce worse variation than a process not altered at all. Remember, being off the Center Line is NOT a sign of out of control, because Common Cause variation always exists. Training is required to use and interpret the charts, not to mention training for you as a Belt to properly create an SPC chart.
Attribute SPC Example

Practical Problem: A project has been launched to get rework reduced to less than 25% of paychecks. Rework includes contacting a manager about overtime hours to be paid. The project made some progress, but the team decides to implement SPC to sustain the gains and track % defective. Please analyze the file "paycheck2.mtw" and determine the Control Limits and Center Line.

Steps 3 and 5 of the methodology are the primary focus for this example:
– Select the appropriate control chart and special cause tests to employ
– Calculate the Center Line and Control Limits

Looking at the data set, we see 20 weeks of data. The sample size is constant at 250. The number of defectives in each sample is in column C3.

Paycheck2.mtw
We will confirm what rules for special causes are included in our Control Chart analysis. Remember to click on the Options and Tests tab to clarify the rules for detecting special causes.

…Chart Options>Tests

We will confirm what rules for special causes are included in our Control Chart analysis. The top 3 were selected.
(P Chart of Empl_w_Errors: P̄ = 0.2038, UCL = 0.2802, LCL = 0.1274, plotted over 20 samples.)
Now we must see if the next few weeks show special cause in the results. The sample size remained at 250 and the defective checks were 61, 64, and 77.

Remember, we calculated the Control Limits from the first 20 weeks. We must now enter 3 new weeks and NOT have MINITAB™ calculate new Control Limits, which it will do automatically if we do not follow this technique. We are executing Steps 6-8:
– Step 6: Plot process X or Y on the newly created control chart
– Step 7: Check for Out-Of-Control (OOC) conditions after each point
– Step 8: Interpret findings, investigate special cause variation, and make improvements following the Out of Control Action Plan (OCAP)
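As a rough cross-check of the Minitab limits and the three new weeks (a sketch of the underlying arithmetic, not Minitab's computation; variable names are illustrative):

```python
import math

# Centerline and constant subgroup size from the first 20 weeks
p_bar, n = 0.2038, 250
half = 3 * math.sqrt(p_bar * (1 - p_bar) / n)
ucl, lcl = p_bar + half, p_bar - half   # approx. 0.2802 and 0.1274, matching the chart

# Weeks 21-23: defective checks out of 250
new_defectives = [61, 64, 77]
flags = [d / n > ucl or d / n < lcl for d in new_defectives]
# Only the third week (77/250 = 0.308) exceeds the UCL, the special cause on the chart.
```

This reproduces the behavior on the updated chart: weeks 21 and 22 stay inside the limits, and week 23 triggers the out-of-control signal.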
…Chart Options>Parameters

(Updated P Chart of Empl_w_Errors with weeks 21-23 added: P̄ = 0.2038, UCL = 0.2802, LCL = 0.1274; the last sample is flagged as a special cause.)

Because of the special cause, the process must refer to the OCAP, or Out of Control Action Plan, which states what root causes need to be investigated and what actions are taken to get the process back in control.
After the corrective actions were taken, wait until the next sample is taken to see if the process has changed and no longer shows special cause.
– If still out of control, refer to the OCAP and take further action to improve the process. DO NOT make any more changes if the process shows back in control after the next reading.
  • Even if the next reading seems higher than the center line! Don't cause more variability.

If process changes are documented after this project was closed, the Control Limits should be recalculated as in step 9 of the SPC methodology.
Practical Problem: A job shop drills holes for its largest customer as a final step to deliver a highly engineered fastener. This shop uses five drill presses and gathers data every hour, with one sample from each press representing a subgroup. The data is gathered in columns C3-C7.

Steps 3 and 5 of the methodology are the primary focus for this example:
– Select the appropriate Control Chart and special cause tests to employ
– Calculate the Center Line and Control Limits

Holediameter.mtw

Let's walk through another example of using SPC within MINITAB™, in this case with Continuous Data. Open the MINITAB™ worksheet called "hole diameter", select the appropriate type of Control Chart, and calculate the Center Line and Control Limits.
We will confirm what rules for special causes are included in our Control Chart analysis. Remember to click on the Options and Tests tab to clarify the rules for detecting special causes.

…Xbar-R Chart Options>Tests

We will confirm what rules for special causes are included in our Control Chart analysis. The top 2 of 3 were selected.

Also confirm the Rbar method is used for estimating Standard Deviation.

Stat>Control Charts>Variable Charts for Subgroups>Xbar-R>Xbar-R Chart Options>Estimate
No special causes were detected in the Xbar Chart. The average hole diameter was 26.33. The UCL was 33.07 and the LCL was 19.59.

(Xbar-R Chart of Part1, ..., Part5, plotted over 46 subgroups: on the Xbar chart, X̄ = 26.33, UCL = 33.07, LCL = 19.59; on the R chart, R̄ = 11.69, UCL = 24.72, LCL = 0, with one flagged point on the R chart.)
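Using the standard table constants for subgroups of size 5 (A2 = 0.577, D3 = 0, D4 = 2.114), the chart limits can be reproduced from X̄ and R̄ to within rounding of the Minitab values; this is a sketch of the arithmetic, not Minitab's internal computation:

```python
# Xbar-R limits for subgroups of size n = 5 (table constants, rounded to 3 decimals)
A2, D3, D4 = 0.577, 0.0, 2.114

xbar, rbar = 26.33, 11.69   # grand mean and average range from the chart

ucl_x = xbar + A2 * rbar    # approx. 33.07
lcl_x = xbar - A2 * rbar    # approx. 19.59
ucl_r = D4 * rbar           # approx. 24.71
lcl_r = D3 * rbar           # 0
```

Because the constants are rounded, the results agree with the Minitab output only to within about a hundredth, which is expected.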
Now we will use the Control Chart to monitor the next 2 hours and see if we are still in control.

Remember, we calculated the Control Limits from the initial data. We must now enter 2 more hours and NOT have MINITAB™ calculate new Control Limits, which it will do automatically if we do not follow this step. We are executing Steps 6-8:
– Step 6: Plot process X or Y on the newly created Control Chart
– Step 7: Check for Out-Of-Control (OOC) conditions after each point
– Step 8: Interpret findings, investigate special cause variation, and make improvements following the Out of Control Action Plan (OCAP)
(Updated Xbar-R chart with the 2 new subgroups added, using the parameters entered above: X̄ = 26.33, R̄ = 11.69; no new points fall outside the limits.)
Because there are no special causes, the process does not refer to the OCAP, or Out of Control Action Plan, and NO actions are taken.
If process changes are documented after this project was closed, the Control Limits should be recalculated as in step 9 of the SPC methodology.
• Step 9 of the methodology refers to recalculating SPC limits.
• Processes should see improvement in variation after usage of SPC.
• Reduction in variation or a known process shift should result in Center Line and Control Limit recalculations.
  – Statistical confidence in the changes can be confirmed with Hypothesis Testing from the Analyze Phase.
• Consider a periodic time frame for checking Limits and Center Lines.
  – 3, 6, or 12 months are typical, dependent on resources and priorities.
  – A set frequency allows for process changes to be captured.
• Incentives to recalculate limits include avoiding false special cause detection with poorly monitored processes.
• These recommendations are true for both Variable and Attribute data.
The extra lines can be helpful if users are using MINITAB™ for the SPC.
Notes
Control Phase
Six Sigma Control Plans
Now we are going to continue in the Control Phase with “Six Sigma Control Plans”.
Overview

The last physical result of the Control Phase is the Control Plan. This module will discuss a technique for selecting among the various solutions you might want from all of the defect reduction techniques found earlier in this phase. We will also discuss elements of a Control Plan to aid you and your organization in sustaining your project's results. We will examine the meaning of each of these and show you how to apply them.

(Control Phase roadmap: Welcome to Control; Advanced Experiments; Advanced Capability; Lean Controls; Defect Controls; Statistical Process Control (SPC); Solution Selection; Six Sigma Control Plans; Control Plan Elements; Wrap Up & Action Items.)
The Control Phase allows the Belt and the team to tackle other processes in the future.
– The elements of the Control Phase help document how to maintain the process.

We have discussed all of the tools to improve and sustain your project success. However, you might have many options, or too many options, to implement for final monitoring or controls. This module will aid you in defect reduction selection.

Another objective of this module is to understand the elements of a good Control Plan needed to sustain your gains.
Selecting Solutions

The tool for selecting defect prevention methods is unnecessary for just a few changes to the process.
– Many projects with smaller scopes have few, but vital, control methods put into the process.

Selecting solutions comes down to a business decision. The impact, cost, and timeliness of the improvement are all important. These improvement possibilities must be balanced against the business needs. A cost-benefit analysis is always a good tool to assist in determining the priorities.
Recall our discussion of the progression of a Six Sigma project: Practical Problem – Statistical Problem – Statistical Solution – Practical Solution. Consider the Practical Solutions from a business decision point of view.

Impact Considerations
Cost Considerations
Time Considerations

The clock's ticking...
Implementing this familiar tool to prioritize proposed improvements is based on the three selection criteria of time, cost, and impact.
– All the process outputs are rated in terms of their relative importance to the process
  • The outputs of interest will be the same as those in your X-Y Matrix.
  • The relative rankings of importance of the outputs are the same numbers from the updated X-Y Matrix.
– Each potential improvement is rated against the three criteria of time, cost, and impact using a standardized rating scale
– The highest overall rated improvements are the best choices for implementation
This should resemble the X-Y Matrix. This tool is of no use if you have only one or two improvement efforts to consider. The outputs listed above in most cases resemble those of your original X-Y Matrix, but you might have another business output added.
The significance rating is the relative ranking of outputs. If one output is rated a 10 and it is twice the importance of a second output, the rating for the second output would be a 5. The improvements, usually impacting the X's, are listed, and the impact of each item on the left is rated against each output. The overall impact rating for one improvement is the sum of the individual impact ratings multiplied by their respective significance ratings of the outputs impacted. Items on the left having impacts on multiple outputs will have a higher overall impact rating. The cost and timing ratings are then multiplied by the overall impact rating.

The improvements with the highest overall ratings are the first to get consideration. Impact ratings range from zero to seven, where an impact of zero means no impact. The cost and timing ratings are also rated zero to seven, with zero being prohibitive in the cost or timing category.
Primary and Secondary Metrics of your Project.
– List each of the Y's across the horizontal axis
– Rate the importance of the process Y's on a scale of 1 to 10
  • 1 is not very important, 10 is critical
  • The Significance rankings must match your updated X-Y Matrix rankings
The recommended cost ratings from zero to seven are shown here. In many companies, expenditures that are not capitalized are usually desired because they are smaller and are merely expensed. Your business may have different strategies or need for cash, so consider your business' situation.

Cost to Implement Ratings
7 – Improvement costs are minimal, both upfront and ongoing.
6 – Improvement costs are low and can be expensed with no capital authorization; recurring expenses are low.
5 – Improvement costs are low and can be expensed with no capital authorization; recurring expenses are higher.
4 – Medium capital priority because of relative ranking of return on investment.
3 – Low capital priority because of relative ranking of return on investment.
2 – High capital and ongoing expenses make a low priority for capital investment.
1 – High capital and/or expenses without acceptable return on investment.
0 – Significant capital and ongoing expenses without alignment with business priorities.
These time ratings are ranked from zero to seven. You might wonder why we suggest that something that would take a year or more gets a zero rating, meaning the improvement should not be considered. Many businesses have product cycle times of less than a year, so improvements taking that long are ill-advised.
Example of Completed Solution Selection Matrix

The matrix below rates six potential improvements against four process outputs (the output labels, only partially legible here, include "Coffee is hot and rich tasting" and "healthy choices" available) with Significance Ratings of 10, 9, 8, and 9.

  #  Potential Improvement                Impact Ratings   Overall Impact   Cost   Time   Overall Rating
  1  Hotel staff monitors room            2, 2, 6, 0       86               7      7      4214
  2  Mgmt visits/leaves ph #              2, 0, 4, 0       52               7      7      2548
  3  Replace old coffee makers/coffee     0, 7, 0, 0       63               3      6      1134
  4  Menus provided with nutrition info   0, 0, 0, 4       36               5      5      900
  5  Comp. gen. "quiet time" scheduled    6, 0, 0, 0       60               3      3      540
  6  Dietician approves menus             0, 0, 0, 7       63               5      2      630

Improvements with the higher overall rating should be given first priority. Keep in mind that long time frame capital investments, etc., should have parallel efforts to keep delays from further occurring.
This is just an example of a completed solution selection matrix. Remember that a cost or time rating of zero would eliminate the improvement from consideration by your project. Your ratings of the solutions should involve your whole team to capture their knowledge and understanding of the final priorities.

Again, the higher overall ratings are the improvements to be considered first. Do NOT forget about the potential to run improvements in parallel. Running projects of complexity might need the experience of a trained project manager. Often projects need to be managed with Gantt charts or timelines showing critical milestones.
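The matrix arithmetic can be sketched as follows, checked against row 1 of the example (impact ratings 2, 2, 6, 0 against significance ratings 10, 9, 8, 9, with cost and time ratings of 7 each); the function name is illustrative:

```python
def overall_rating(impacts, significances, cost, time):
    """Overall impact = sum of (impact x significance) across outputs;
    overall rating = overall impact x cost rating x time rating."""
    overall_impact = sum(i * s for i, s in zip(impacts, significances))
    return overall_impact, overall_impact * cost * time

impact, rating = overall_rating([2, 2, 6, 0], [10, 9, 8, 9], cost=7, time=7)
# impact == 86 and rating == 4214, matching row 1 of the example matrix
```

Note how an improvement touching several outputs (row 1) outranks a higher single-output impact (rows 3 and 6) once the cost and time multipliers are applied.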
Once you've decided on defect reduction solutions, you need to plan those solutions. A plan means more than the proverbial back-of-the-envelope solution and should include timelines, critical milestones, project review dates, and specific actions noted for success in your solution implementation. Many people use Excel or MS Project, but many options exist to plan your project closing with these future sustaining plans.

Implementation Plans should emphasize the need to:
– Organize the tasks and resources
– Establish realistic time frames and deadlines
– Identify actions necessary to ensure success

Components of an Implementation Plan include:
– Work breakdown structure
– Influence strategy for priorities and resourcing
– Risk management plan
– Audit results for completion and risks

All solutions must be part of the Control Plan document.

We have a plan, don't we? The team working on the project!!!!
The 5 elements of a Control Plan include the documentation, monitoring, response, and training plans, and aligning systems and structures.

(Control Plan diagram: the Documentation Plan, Monitoring Plan, Response Plan, Training Plan, and Aligning Systems & Structures together hold the process owners accountable to maintain the new level of process performance.)
Training Plan

• Typically some of the training is conducted by the project team
  – Qualified trainers
• Typically owned by a training department or process owner
• Those who are responsible for conducting the on-going training must be identified

(Training Plan template columns: Training Module; Who Will Create Modules; Schedule for Module Completion; Who Will Be Trained; Schedule for Ongoing Training; Trainer(s); Integration into New Employee Training; Final Location of Employee Training Manuals.)
Documentation Plan

Documentation is necessary to ensure that what has been learned from the project is shared and institutionalized:
– Used to aid implementation of solutions
– Used for on-going training

(Documentation Plan template columns include the items and documents involved, whether an update/modification is necessary, and the immediate and review responsibilities.)
Monitoring Plan

Tests:
– When to Sample
  • After training
  • Regular intervals
  • Random intervals (often in an auditing sense)
– How to Sample
– How to Measure
(Monitoring Plan in FMEA format, with columns: Process Function (Step); Potential Failure Modes (process defects); Potential Failure Effects (Y's); SEV; Potential Causes of Failure (X's); OCC; Current Process Controls; DET; RPN; Recommended Actions; Responsible Person & Target Date; Actions Taken; and revised SEV, OCC, DET, RPN.)

Monitoring Plan
– Multi-variable tables
Response Plan

• Detailed documentation of the response supports on-going continuous improvement.

(Response Plan template columns: Process; Signal; Date; Current Situation; Detailed Situation; Investigation of Cause; Code of Cause.)

Aligning Systems and Structures

• Reinforce commitment to desired behaviors:
  – Job descriptions
  – Incentive compensation
  – Incentive programs, contests, etc.

Now that's a Control Plan!
Now for the last few questions to ask if you have been progressing on a real-world project while taking this training. First, has your project achieved success in the primary metric without compromising your secondary metrics? Second, have you been faithfully updating your metric charts and keeping your Process Owner and project Champion updated on your team's activities? If not, then start NOW.
Remember a basic change management idea you learned in the Define Phase. If you get
involvement of team members who work in the process and keep the project Champion and
Process Owner updated as to results, then you have the greatest chance of success.
You have now completed Control Phase – Six Sigma Control Plans.
Notes
Control Phase
Wrap Up and Action Items
Gooooaaallllll!!
Organizational Change
• Accept responsibility
• Monitoring
• Responding
• Managing
• Embracing change & continuous learning
• Sharing best practices
• Potential for horizontal replication or expansion of results
DMAIC Roadmap

(Roadmap swim lanes: the Champion/Process Owner estimates COPQ and establishes the team; the Measure lane assesses stability, capability, and measurement systems; in the Control Phase, once the improvement is selected, controls are implemented and the Belt goes on to the next project.)
Control Questions
Step One: Process Enhancement And Control
Results
• How do the results of the improvement(s) match the requirements of the business
case and improvement goals?
• What are the vital few X’s?
• How will you control or redesign these X’s?
• Is there a process control plan in place?
• Has the control plan been handed off to the process owner?
Step Two: Capability Analysis for X and Y
Process Capability
• How are you monitoring the Y’s?
Step Three: Standardization And Continuous Improvement
• How are you going to ensure that this problem does not return?
• Is the learning transferable across the business?
• What is the action plan for spreading the best practice?
• Is there a project documentation file?
• How is this referenced in process procedures and product drawings?
• What is the mechanism to ensure this is not reinvented in the future?
Step Four: Document what you have learned
• Is there an updated FMEA?
• Is the control plan fully documented and implemented ?
• What are the financial implications?
• Are there any spin-off projects?
• What lessons have you learned?
General Questions
• Are there any issues/barriers preventing the completion of the project?
• Do the Champion, the Belt and Finance all agree that this project is complete?
WHAT   WHO   WHEN   WHY   WHY NOT   HOW
Test validation plan for a specific time
Calculate benefits for breakthrough
Implement change across project team
Process map of improved process
Finalize Key Input Variables (KPIV) to meet goal
Prioritize risks of output failure
Control plan for output
Control plan for inputs
Chart a plan to accomplish the desired state of the culture
Mistake proofing plan for inputs or outputs
Implementation plan for effective procedures
Knowledge transfer between Belt, PO, and team members
Knowledge sharing between businesses and divisions
Lean project control plan
Establish continuous or attribute metrics for Cpk
Identify actual versus apparent Cpk
Finalize problem solving strategy
Complete RPN assessment with revised frequency and controls
Show improvement in RPN through action items
Repeat same process for secondary metrics
Summary
At this point, you should:
It’s a Wrap
Congratulations you
have completed
Certified Lean Six Sigma
Black Belt
Training!!!
Control Phase
Quiz
Now we will see what you have retained from the Control Phase of the course. Please answer
these questions to the best of your ability without referencing the text. The answers are in the
Appendix. Please check your answers against the answers provided and review the sections in
the Control Phase where your retention of the knowledge is less than you desire.
2. If the Belt has found a good, statistically significant model from the last Full Factorial Design, what is the main reason a steepest ascent design would be considered in the project?
A. 4 factors were found to be statistically significant.
B. The desired process output was not yet found within the original design space.
C. The project target was achieved but the project wants to further improve the process.
D. The DOE indicated curvature because Center Points were included and the local, desired maximum was within the original design space.
3. Advanced Capability Analysis for defects per unit is not possible within MINITABTM.
True False
5. The Lean toolbox including items such as 5S, Visual Factory management and Kanbans
can best be described to ________ a process in the Control Phase.
A. remove labor for
B. overly lengthen the Six Sigma project for
C. confuse
D. stabilize
6. How does the idea of MUDA from Lean Principles best fit with the Six Sigma
methodology?
A. MUDA means waste which is indicating defects are occurring in the process.
B. Lean is Six Sigma that originated in SE Asia.
C. MUDA is an abbreviation for Six Sigma tools.
D. MUDA is the technique of finding the best practices.
8. If excess inventory is one reason for Special Causes in the Six Sigma project, which item in Lean Principles can best help improve the Process Capability and sustainability of the project?
A. Kanban
B. SPC
C. 5S
D. Value Stream Mapping
E. Operator support
9. Kanbans work best with pull systems for determining which products or services are produced.
True False
10. __________ (fill in the blank) are signals telling a process to process a product or
service.
A. Kaizen
B. Kanban
C. Andon
D. Poka-Yoke
E. Gemba
11. Since Kanbans are used to control how much inventory exists, it is a quick fix to improve
the inventory.
True False
12. Which are examples of Defect Prevention to consider in your execution of the Control
Phase of your project? (check all that apply)
A. Poka-Yoke or Mistake Proofing
B. Monte Carlo Simulation
C. FMEA
D. Robust product design
E. Negotiate new specification limits from customers
13. Which items listed below will cause tolerance specification limits to tighten for an input statistically affecting the output of interest? (check all that apply)
A. A gauge with a worsening precision.
B. The measuring instrument for the output has improving precision.
C. Other unknown significant Noise factors are increasingly varying.
D. The input has a new automated controller to minimize variation of the input from the desired setting.
14. Every process has causes of variation commonly known as: (check all that apply)
A. Common
B. Insignificant
C. Special
D. Uneducated
15. SPC is an excellent tool for telling us why a process is exhibiting Special Cause
variation.
True False
Glossary
Affinity Diagram - A technique for organizing individual pieces of information into groups or broader categories.
ANOVA - Analysis of Variance – A statistical test for identifying significant differences between process or
system treatments or conditions. It is done by comparing the variances around the means of the conditions
being compared.
Attribute Data - Data which takes on one of a set of discrete values, such as pass or fail, yes or no.
Average - Also called the mean, it is the arithmetic average of all of the sample values. It is calculated by adding all of the sample values together and dividing by the number of elements (n) in the sample.
Bar Chart - A graphical method which depicts how data fall into different categories.
Black Belt - An individual who receives approximately four weeks training in DMAIC, analytical problem solving,
and change management methods. A Black Belt is a full time six sigma team leader solving problems under the
direction of a Champion.
Breakthrough Improvement - A rate of improvement at or near 70% over baseline performance of the as-is process characteristic.
Capability - A comparison of the required operation width of a process or system to its actual performance width. Expressed as a percentage (yield), a defect rate (dpm, dpmo), an index (Cp, Cpk, Pp, Ppk), or as a sigma score (Z).
Cause and Effect Diagram - Fishbone Diagram - A pictorial diagram in the shape of a fishbone showing all
possible variables that could affect a given process output measure.
Central Tendency - A measure of the point about which a group of values is clustered; two measures of central
tendency are the mean, and the median.
Champion - A Champion recognizes, defines, assigns and supports the successful completion of six sigma projects; they are accountable for the results of the project and the business roadmap to achieve six sigma within their span of control.
Common Causes of Variation - Those sources of variability in a process which are truly random, i.e., inherent
in the process itself.
Complexity - The level of difficulty to build, solve or understand something based on the number of inputs, interactions and uncertainty involved.
Control Limits - Upper and lower bounds in a control chart that are determined by the process itself. They can
be used to detect special or common causes of variation. They are usually set at ±3 standard deviations from
the central tendency.
Cost of Poor Quality (COPQ) - The costs associated with any activity that is not doing the right thing right the first time. It is the financial quantification of any waste that is not integral to the product or service which your company provides.
Glossary
Cp - A capability measure defined as the ratio of the specification width to short-term process performance width.

Cpk - An adjusted short-term capability index that reduces the capability score in proportion to the offset of the process center from the specification target.
Critical to Quality (CTQ) - Any characteristic that is critical to the perceived quality of the product, process or
system. See Significant Y.
Critical X - An input to a process or system that exerts a significant influence on any one or all of the key outputs of a process.
Customer - Anyone who uses or consumes a product or service, whether internal or external to the providing
organization or provider.
Cycle Time - The total amount of elapsed time expended from the time a task, product or service is started
until it is completed.
Deployment (Six Sigma) - The planning, launch, training and implementation management of a six sigma
initiative within a company.
Design of Experiments (DOE) - Generally, it is the discipline of using an efficient, structured, and proven approach to interrogating a process or system for the purpose of maximizing the gain in process or system knowledge.
Design for Six Sigma (DFSS) - The use of six sigma thinking, tools and methods applied to the design of
products and services to improve the initial release performance, ongoing reliability, and life-cycle cost.
DMAIC - The acronym for core phases of the six sigma methodology used to solve process and business
problems through data and analytical methods. See define, measure, analyze, improve and control.
DPMO - Defects per million opportunities – The total number of defects observed divided by the total number
of opportunities, expressed in parts per million. Sometimes called Defects per Million (DPM).
DPU - Defects per unit - The total number of defects detected in some number of units divided by the total
number of those units.
Entitlement - The best demonstrated performance for an existing configuration of a process or system. It is an empirical demonstration of what level of improvement can potentially be reached.
Failure Mode and Effects Analysis (FMEA) - A procedure used to identify, assess, and mitigate risks
associated with potential product, system, or process failure modes.
Finance Representative - An individual who provides an independent evaluation of a six sigma project in
terms of hard and/or soft savings. They are a project support resource to both Champions and Project
Leaders.
Glossary
Flowchart - A graphic model of the flow of activities, material, and/or information that occurs during a process.
Gage R&R - Quantitative assessment of how much variation (repeatability and reproducibility) is in a measurement
system compared to the total variation of the process or system.
Green Belt - An individual who receives approximately two weeks of training in DMAIC, analytical problem solving,
and change management methods. A Green Belt is a part time six sigma position that applies six sigma to their
local area, doing smaller-scoped projects and providing support to Black Belt projects.
Hidden Factory or Operation - Corrective and non-value-added work required to produce a unit of output that is
generally not recognized as an unnecessary generator of waste in the form of resources, materials and cost.
Histogram - A bar chart that depicts the frequencies (by the height of the plotted bars) of numerical or
measurement categories.
Implementation Team - A cross-functional executive team representing various areas of the company. Its charter
is to drive the implementation of six sigma by defining and documenting practices, methods and operating
policies.
Input - A resource consumed, utilized, or added to a process or system. Synonymous with X, characteristic, and
input variable.
Input-Process-Output (IPO) Diagram - A visual representation of a process or system where inputs are
represented by input arrows to a box (representing the process or system) and outputs are shown using arrows
emanating out of the box.
Ishikawa Diagram - See cause and effect diagram and fishbone diagram.
Least Squares - A method of curve-fitting that defines the best fit as the one that minimizes the sum of the squared
deviations of the data points from the fitted curve.
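A minimal Python sketch of the least-squares idea for the simple straight-line case (the helper name fit_line is hypothetical):

```python
# Fit y = slope * x + intercept by minimizing the sum of
# squared vertical deviations (ordinary least squares).
def fit_line(xs, ys):
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Points lying exactly on y = 2x recover slope 2 and intercept 0.
print(fit_line([1, 2, 3, 4], [2, 4, 6, 8]))  # (2.0, 0.0)
```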
Long-term Variation - The observed variation of an input or output characteristic which has had the opportunity to
experience the majority of the variation effects that influence it.
Lower Control Limit (LCL) - For control charts: the limit above which the subgroup statistics must remain for the
process to be in control. Typically, 3 standard deviations below the central tendency.
Lower Specification Limit (LSL) - The lowest value of a characteristic which is acceptable.
Master Black Belt - An individual who has received training beyond a Black Belt and serves as the technical,
go-to expert on technical and project issues in six sigma. Master Black Belts teach and mentor other six sigma
Belts, support their projects and support Champions.
Measurement - The act of obtaining knowledge about an event or characteristic through measured quantification
or assignment to categories.
Measurement Accuracy - For a repeated measurement, it is a comparison of the average of the measurements
to some known standard.
Measurement Precision - For a repeated measurement, it is the amount of variation that exists in the measured
values.
Measurement Systems Analysis (MSA) - An assessment of the accuracy and precision of a method of obtaining
measurements. See also Gage R&R.
Median - The middle value of a data set when the values are arranged in either ascending or descending order.
Metric - A measure that is considered to be a key indicator of performance. It should be linked to goals or
objectives and carefully monitored.
Nominal Group Technique - A structured method that a team can use to generate and rank a list of ideas or
items.
Non-Value Added (NVA) - Any activity performed in producing a product or delivering a service that does not add
value, where value is defined as changing the form, fit or function of the product or service and is something for
which the customer is willing to pay.
Normal Distribution - The distribution characterized by the smooth, bell-shaped curve. Synonymous with
Gaussian Distribution.
Objective Statement - A succinct statement of the goals, timing and expectations of a six sigma improvement
project.
Opportunities - The number of characteristics, parameters or features of a product or service that can be classified
as acceptable or unacceptable.
Out of Control - A process is said to be out of control if it exhibits variations larger than its control limits or shows a
non-random pattern of variation.
Output - A resource or item or characteristic that is the product of a process or system. See also Y, CTQ.
Pareto Chart - A bar chart for attribute (or categorical) data in which categories are presented in descending
order of frequency.
Pareto Principle - The general principle originally proposed by Vilfredo Pareto (1848-1923) that the majority of
influence on an outcome is exerted by a minority of input factors.
Problem Statement - A succinct statement of a business situation which is used to bound and describe the
problem the six sigma project is attempting to solve.
Process - A set of activities and material and/or information flow which transforms a set of inputs into outputs for
the purpose of producing a product, providing a service or performing a task.
Process Characterization - The act of thoroughly understanding a process, including the specific relationship(s)
between its outputs and the inputs, and its performance and capability.
Process Certification - Establishing documented evidence that a process will consistently produce required
outcome or meet required specifications.
Process Member - An individual who performs activities within a process to deliver a process output, a product
or a service to a customer.
Process Owner - Process Owners have responsibility for process performance and resources. They provide
support, resources and functional expertise to six sigma projects. They are accountable for implementing
developed six sigma solutions into their process.
Quality Function Deployment (QFD) - A systematic process used to integrate customer requirements into
every aspect of the design and delivery of products and services.
Range - A measure of the variability in a data set. It is the difference between the largest and smallest values
in a data set.
Regression Analysis - A statistical technique for determining the mathematical relation between a measured
quantity and the variables it depends on. Includes Simple and Multiple Linear Regression.
Repeatability (of a Measurement) - The extent to which repeated measurements of a particular object with a
particular instrument produce the same value. See also Gage R&R.
Reproducibility (of a Measurement) - The extent to which repeated measurements of a particular object by
different individuals produce the same value. See also Gage R&R.
Risk Priority Number (RPN) - In Failure Mode Effects Analysis -- the aggregate score of a failure mode
including its severity, frequency of occurrence, and ability to be detected.
Rolled Throughput Yield (RTY) - The probability of a unit going through all process steps or system
characteristics with zero defects.
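Since RTY is the product of each step's first-pass yield, it can be computed directly; a small Python sketch (function name is illustrative):

```python
import math

def rolled_throughput_yield(step_yields):
    """Probability that a unit passes every step defect-free:
    the product of the individual step yields."""
    return math.prod(step_yields)

# Three steps at 95%, 98% and 90% first-pass yield:
print(rolled_throughput_yield([0.95, 0.98, 0.90]))  # roughly 0.838
```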
R.U.M.B.A. - An acronym used to describe a method to determine the validity of customer requirements. It
stands for Reasonable, Understandable, Measurable, Believable, and Achievable.
Run Chart - A basic graphical tool that charts a characteristic’s performance over time.
Scatter Plot - A chart in which one variable is plotted against another to determine the relationship, if any,
between the two.
Screening Experiment - A type of experiment to identify the subset of significant factors from among a large
group of potential factors.
Short Term Variation - The amount of variation observed in a characteristic which has not had the opportunity
to experience all the sources of variation from the inputs acting on it.
Sigma Score (Z) - A commonly used measure of process capability that represents the number of short-term
standard deviations between the center of a process and the closest specification limit. Sometimes referred to
as sigma level, or simply Sigma.
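The Z calculation under the definition above can be sketched in Python, assuming the process mean, the closest specification limit, and the short-term standard deviation are known (names are illustrative):

```python
def sigma_score(mean, closest_spec_limit, short_term_sd):
    """Z: number of short-term standard deviations between the
    process center and the closest specification limit."""
    return abs(closest_spec_limit - mean) / short_term_sd

# Process centered at 100, USL at 106, short-term sd of 2:
print(sigma_score(100, 106, 2))  # 3.0
```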
Significant Y - An output of a process that exerts a significant influence on the success of the process or the
customer.
Six Sigma Leader - An individual who leads the implementation of Six Sigma, coordinating all of the necessary
activities, assures optimal results are obtained and keeps everyone informed of progress made.
Six Sigma Project - A well-defined effort that states a business problem in quantifiable terms and with known
improvement expectations.
Six Sigma (System) - A proven set of analytical tools, project management techniques, reporting methods and
management techniques combined to form a powerful problem solving and business improvement methodology.
Special Cause Variation - Those non-random causes of variation that can be detected by the use of control charts
and good process documentation.
Stability (of a Process) - A process is said to be stable if it shows no recognizable pattern of change and no
special causes of variation are present.
Standard Deviation - One of the most common measures of variability in a data set or in a population. It is the
square root of the variance.
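The square-root relationship between standard deviation and variance can be checked with Python's standard library (population versions shown; the sample data is made up):

```python
import statistics

data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
variance = statistics.pvariance(data)  # population variance
std_dev = statistics.pstdev(data)      # population standard deviation
print(variance, std_dev)  # 4.0 2.0
```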
Statistical Problem - A problem that is addressed with facts and data analysis methods.
Statistical Process Control (SPC) - The use of basic graphical and statistical methods for measuring, analyzing,
and controlling the variation of a process for the purpose of continuously improving the process. A process is said to
be in a state of statistical control when it exhibits only random variation.
Statistical Solution - A data driven solution with known confidence/risk levels, as opposed to a qualitative, “I think”
solution.
Supplier - An individual or entity responsible for providing an input to a process in the form of resources or
information.
TSSW - Thinking the six sigma way – A mental model for improvement which perceives outcomes through a cause
and effect relationship combined with six sigma concepts to solve everyday and business problems.
Two-Level Design - An experiment where all factors are set at one of two levels, denoted as low and high
(-1 and +1).
Upper Control Limit (UCL) for Control Charts - The upper limit below which a process statistic must remain to be
in control. Typically this value is 3 standard deviations above the central tendency.
Upper Specification Limit (USL) - The highest value of a characteristic which is acceptable.
Variability - A generic term that refers to the property of a characteristic, process or system to take on different
values when it is repeated.
Variable Data - Data which is continuous, which can be meaningfully subdivided, i.e. can have decimal
subdivisions.
Variance - A specifically defined mathematical measure of variability in a data set or population. It is the square of
the standard deviation.
VOB - Voice of the business – Represents the needs of the business and the key stakeholders of the
business. It is usually items such as profitability, revenue, growth, market share, etc.
VOC - Voice of the customer – Represents the expressed and non-expressed needs, wants and desires of the
recipient of a process output, a product or a service. It is usually expressed as specifications, requirements or
expectations.
VOP - Voice of the process – Represents the performance and capability of a process to achieve both
business and customer needs. It is usually expressed in some form of an efficiency and/or effectiveness
metric.
Waste - Waste represents material, effort and time that does not add value in the eyes of key stakeholders
(Customers, Employees, Investors).
X - An input characteristic to a process or system. In six sigma it is usually used in the expression of Y=f(X),
where the output (Y) is a function of the inputs (X).
Y - An output characteristic of a process. In six sigma it is usually used in the expression of Y=f(X), where the
output (Y) is a function of the inputs (X).
Yellow Belt - An individual who receives approximately one week of training in problem solving and process
optimization methods. Yellow Belts participate in Process Management activities, participate on Green and
Black Belt projects and apply concepts to their work area and their job.
Appendix
Quiz Answers
The Quiz questions at the end of each phase are intended to be a sampling of the topics covered
and to provide a guide for assessing your level of knowledge retention. OpenSourceSixSigma.com
provides a Certified Lean Six Sigma Black Belt Assessment that is comprehensive in its coverage
of the topics addressed in this course. It contains 100 questions and exercises fully covering the
subject matter for Lean Six Sigma Black Belts. We suggest you consider this CLSSBB
Assessment package should you choose to pursue certification in Lean Six Sigma.
1. C. How tightly all the various outcomes are clustered around the average
2. Standard Deviation
3. A. Features
B. Delivery
D. Integrity
E. Expense
4. True
5. E. Awareness
7. False
8. Change Agent
10. Brainstorming
12. Secondary
14. True
17. False
19. True
20. False
1. Reproducibility
2. Linearity
4. True
5. A. Nominal Scale Data
6. C. Mode
7. True
9. True
10. True
12. False
14. True
15. A. Precision
C. Accuracy
17. True
19. False
1. False
2. Multi-Vari
3. D. Error in measurement
4. True
5. A. A Hypothesis Test is an a priori theory relating to differences between variables
B. A statistical test or Hypothesis Test is performed to prove or disprove the theory
C. A Hypothesis Test converts the Practical Problem into a Statistical Problem.
6. A. Skewness
B. Mixed Distributions
C. Kurtosis
E. Granularity
7. False
9. True
10. D. Having the tails of the distribution equal each other
11. True
12. B. Compare more than two sample proportions with each other
13. True
14. C. 30
15. B. Median
16. False
18. True
19. True
2. A. Simple Linear
B. Quadratic
C. Cubic
D. Multiple Linear
E. Logarithmic
4. A. Independent of the transform, the upper specification will be a larger number than the
lower specification when transformed.
D. The process data is transformed but not the specification limits.
8. E. 64
10. B. The root cause for the defective product characteristic needs to be found.
C. The variation needs to be affected by the input factors.
D. The response time to calls needs to be reduced.
11. B. The process may show little change if curvature exists and the local maximum of the
process output is between the large differences of factor levels chosen.
13. False
14. C. If the experiment is going to start in a week, contact the Process Owners to work out
the needs before the experiment.
D. Use a log book and note any unusual observations during the experiment.
15. False
16. B. Implement solutions
18. A. 13
19. B. A design with IV resolution will not have Main Effects confounded with 2-way
interactions.
C. A design with V resolution will have 2-way interactions confounded with 3-way
interactions.
E. A design with V resolution has no Main Effects confounded with other Main Effects
F. A design with III resolution has no Main Effects confounded with other Main Effects
1. B. It attempts to find the optimum region outside the original design space.
2. B. The desired process output was not yet found within the original design space.
3. False
5. D. stabilize
6. A. MUDA means waste, which indicates that defects are occurring in the process.
7. True
8. A. Kanban
9. True
10. B. Kanban
11. False
13. A. A gauge with a worsening precision.
C. Other unknown significant Noise factors are increasingly varying.
14. A. Common
C. Special
15. False