
Looking into Data

Tool: Time Plots
Purpose: Identify special causes, shifts, and other patterns
Data type: Continuous data in sequence

Tool: Control Charts
Purpose: Data-based process analysis to detect special causes, shifts, and other patterns, and to monitor the process
Data type: Continuous data in sequence

Tool: Frequency plots (histograms)
Purpose: Determine the shape, center, and range of numeric data and the type of distribution
Data type: Continuous data

Tool: Pareto Charts
Purpose: Determine the relative importance or impact of different problems
Data type: Discrete (multiple categories)

Understand the relationship between quality and variation.
Be able to differentiate between common and special cause variation.
Be able to create and interpret time plots, control charts, histograms, and Pareto charts.
Understand the difference between control limits (process capability) and specification limits (customer requirements).
Be able to use Minitab to display data.

Understanding Variation

Data on Shipments per Day

The Operations manager at a company was told that the month before, his plant had shipped 79 orders/day early in the month and 135 orders/day near the end of the month. His questions: Was the 79 more typical? Or 135? Was there a clear trend upwards? How does looking at the data this way help?

His staff charted the orders/day for April.

[Figure: Time Plot of Shipped Orders per Day, April 1-30. Vertical axis: number of orders.]





What Is Time-Ordered Data?

Data that is collected regularly:
Hourly
Daily
Weekly
Monthly

Data collected over time from a process:
Measurements on the first 30 lots completed one week
Measurements on every 5th lot
Weekly yields from the past two years

The first step in understanding variation should always be to plot such data in time order.

Data used for analysis in a DMAIC project can be either existing (historical) data or new data you collect.

Why Is Time Order Important?

Process conditions can change over time; data from one point in time is not always comparable to data from another point in time.
Biases can enter the data.
If you ignore time-related patterns, your conclusions may be false.

[Figure: two time plots. In the first, the process has a sudden shift. In the second, 4 points look very different from the others.]

Focus on the Variation

When analyzing time-ordered data, you need to look at the variation: how the data values change from point to point.
Certain patterns in the variation can provide clues about the source of process problems.

What Is Variation?
No two anythings are exactly alike.
How a process is done will vary from day to day. Measurements or counts collected on process output will vary over time.


Process data shows how the process varies over time





Quantifying the amount of variation in a process is a critical step towards improvement. Understanding what causes that variation helps us decide what kinds of actions are most likely to lead to lasting improvement.

Variation vs. Specifications/Targets

The amount of variation in a process tells us what that process is actually capable of achieving.
Specifications tell us what we want a process to be able to achieve.


Types of Variation
Special Cause: something different happening at a certain time or place
Temporary or local; specific
May come and go sporadically
Evidence of the lack of statistical control is a signal that a special cause is likely to have occurred
Reminder: a process with special cause variation is called unstable

Common Cause: always present to some degree in the process
Common to all occasions and places
Degree of presence varies
Each cause contributes a small effect to the variation in results
Variation due to common causes will almost always give results that are in statistical control
Reminder: a process with only common cause variation is called stable


[Figure: A Time Plot, annotated "GOAL = reduce variation".]


Reacting to Variation
The appropriate managerial actions are quite different for common causes than for special causes.




Special cause strategy

Common cause strategy


Special Cause Strategy

The goal is to eliminate the specific special causes, to make an unstable process stable.
Get timely data so that special causes are signaled quickly.
Take immediate action to remedy any damage.
Immediately search for a cause. Find out what was different on that occasion. Isolate the deepest cause you can affect.
Develop a longer-term remedy that will prevent that special cause from recurring. Or, if results are good, retain that lesson.
Use early warning indicators throughout your operation. Take data at the early process stages so you can tell as soon as possible when something has changed.

Special Cause Strategy, Cont.

You may not need to complete the DMAIC process to address a special cause.
See what changed at the point in time when the special cause appeared. What was different then?
If the cause is not clear, you can no longer treat it as a special cause.
If the cause is clear, confirm it with additional data, if possible. Then develop longer-term action to prevent the special cause (if the impact was bad) or preserve it (if the impact was good).


Common Cause Strategy

The process shown here is stable. But does it need to be improved?

Customer Needs

A process with only common causes is said to be statistically stable and in statistical control.
Merely being in statistical control does not mean the results of a system are acceptable.
Leaving the process alone is not improvement.
A different approach is needed to improve a stable system.

Improving a Stable Process

When improving a stable system, you don't single out one or two data points. You need to look at all the data: not just the high points or low points, not just the points you don't like, not just the latest point. All the data are relevant.
Improving a stable process is more complex than identifying a special cause. More time and resources are generally needed in the discovery process.
Common causes of variation can hardly ever be reduced by attempts to explain the difference between individual points if the process is in statistical control.
When dealing with special causes, you focus on a few data points. For common cause variation, you need to look at all the data points to fully understand the pattern. You should look at the entire system.
Processes in statistical control usually require fundamental changes for improvement.
Using the DMAIC method can help you make fundamental changes in a process.
This is a TRUE SIX-SIGMA PROJECT.


Investigating Common Cause Variation


Sort data into groups or categories based on different factors. Look for patterns in the way the data points cluster or do not cluster.

[Figure: Example of downgrading data stratified by day of week. Vertical axis: downgraded quantity.]

Mondays are consistently worse than other days; find out why.


Many figures we see are aggregated. For example, if we look at total monthly production figures, each data value is really a combined figure representing all products, lines, shifts, weeks, etc. If we take apart (disaggregate) these figures, we can often see patterns that are masked in the roll-up.
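As a rough illustration of stratifying aggregated data, the sketch below groups records by day of week and compares the group averages with the overall average. The records, the `stratify` helper, and the variable names are all invented for this example; they are not the course's downgrading data.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical (day of week, downgraded quantity) records, invented for illustration.
records = [
    ("Mon", 14), ("Tue", 8), ("Wed", 9), ("Thu", 7), ("Fri", 8),
    ("Mon", 15), ("Tue", 9), ("Wed", 7), ("Thu", 8), ("Fri", 9),
    ("Mon", 13), ("Tue", 7), ("Wed", 8), ("Thu", 9), ("Fri", 7),
]


def stratify(rows):
    """Group values by category and return the average for each category."""
    groups = defaultdict(list)
    for category, value in rows:
        groups[category].append(value)
    return {category: mean(values) for category, values in groups.items()}


by_day = stratify(records)
overall = mean(value for _, value in records)
worst_day = max(by_day, key=by_day.get)

print("Overall average:", overall)
for day, average in by_day.items():
    print(day, average)
print("Highest average:", worst_day)
```

The overall average hides the pattern; only the per-day averages show that one day stands apart. The same grouping could be done by product, line, or shift.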


How to Disaggregate
Disaggregate by process phase or step
Whole process (do the job): time to complete the entire process
Phase 1, Phase 2, Phase 3: measure time separately for each phase or step

Disaggregate by process output
Total results: sum results across all products or services
Results for Product 1, Product 2, Product 3, Product 4, Product 5: separate results by product or service type

Example: Disaggregation

[Figure: four time plots by month, 9/91 to 7/93. Panels: Total Tons Downgraded; Merge A Tons Downgraded; Merge B Tons Downgraded; Merge C Tons Downgraded.]

Common cause variation stems from the interaction of a large number of factors in a process.
Identifying which of those many factors are contributing the most to the variation can be tricky and time-consuming.
Often, people have theories about which factors are most important.
Experimentation can help us confirm those theories.

Experimental Approach
You can be formal or informal in your experimental design. Even if informal, use PDCA:
Plan the experiment
  Identify factors (potential causes) you want to study
  Develop operational definitions of the factors and the responses you will measure
  Select/develop your experimental design
Do the experiment
  Collect data
Check the results
  Analyze and interpret data
Act on what you learn

Matching Action to the Type of Variation

Discussion: What would it mean in practice to treat a special cause like common cause variation? What would it mean to treat common cause variation like special causes?

Decide whether each of the following examples describes a special cause or a common cause. Then decide what the appropriate response should be. Remember to treat special causes differently than common causes. Be prepared to discuss your answers with the class. Discuss also what happens if you treat a common cause like a special cause, and vice versa.

Time: 10 min.

Example 1: One quality inspector is found to be making errors in filling out the inspection report. Is this a special cause or a common cause? What is an appropriate response to this situation? What might happen if you took the wrong course of action?

Example 2: All inspectors are found to make occasional errors in filling out inspection reports. Is this a special cause or a common cause? What is an appropriate response to this situation? What might happen if you took the wrong course of action?

Tools for Understanding Variation

Time plots (run charts)
Control charts
Frequency plots
Pareto charts


Time Plots (Run Charts)


[Figure: time plot of production in tons, June 1-30.]


Why Use a Run Chart?

Use a run chart:
To study observed data for trends or patterns over a specified period of time.
To focus attention on truly vital changes in the process.
To track useful information for predicting trends.

When to Use a Run Chart

Use a run chart:
To understand variation in the process.
To compare a performance measure before and after implementation of a solution, to assess the solution's impact.
To detect trends, shifts, and cycles in the process.

Time Plot / Run Chart Features

[Figure: time plot of production in tons, June 1-30, annotated with the features below.]
Vertical axis shows the numerical value or count.
Data points are plotted in time order.
Points are connected by a line to aid in visual interpretation.
Horizontal axis reflects the passage of time.


How to Construct a Run Chart

1. Decide on the measure you want to analyze.
2. Gather data (minimum 20 data points).
3. Create a graph with a vertical line and a horizontal line.
4. On the vertical line (y-axis), draw the scale related to the variable you are measuring.
5. On the horizontal line (x-axis), draw the time or sequence scale.
6. Calculate the median and draw a horizontal line at the median value.
7. Plot the data in time order or sequence.
8. Use Minitab quality tools.
9. Identify runs (ignore points on the median).
10. Check the table for run charts.

Looking at Both Time and Distribution

[Figure: production in tons, June 1-30. Vertical axis: production in tons.]


Counting Runs on a Run Chart

Here is a chart that has the runs circled. A run is a series of points on the same side of the median. A run can be any length, from 1 point to many points. Too few or too many runs are important signals of special causes: they indicate something in the process has changed. Because you often count runs on a time plot, time plots are also called run charts. In this example, there are 5 data points on the median, which are ignored because they neither add to nor interrupt any runs. That leaves 20 data points that are counted for the run test, and 11 runs in the example shown here.

[Figure: Time Plot from Exercise 2(F), with runs circled. 20 data points not on median; 11 runs.]

Note: Points on the median are ignored. They do not add to or interrupt a run.
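The counting rule can be sketched in a few lines of Python. This is a minimal illustration rather than the Minitab procedure: `count_runs` is a hypothetical helper, and the sample values are invented (they are not the Exercise 2(F) data).

```python
from statistics import median


def count_runs(values):
    """Return (number of points not on the median, number of runs)."""
    center = median(values)
    # Points exactly on the median are dropped: they neither add to
    # nor interrupt a run.
    sides = [value > center for value in values if value != center]
    runs = 0
    previous = None
    for side in sides:
        if side != previous:  # the series crossed the median: a new run starts
            runs += 1
            previous = side
    return len(sides), runs


# Invented sample data; the median is 14 and two points fall on it.
data = [12, 15, 11, 14, 18, 19, 17, 13, 10, 9, 14, 16]
points, runs = count_runs(data)
print(points, "points not on the median,", runs, "runs")
```

For this sample, the two points equal to 14 are ignored, leaving 10 points that form 6 runs.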


Runs Above and Below the Median

Points not   Lower limit   Upper limit      Points not   Lower limit   Upper limit
on median    for runs      for runs         on median    for runs      for runs
10           3             8                34           12            23
11           3             9                35           13            23
12           3             10               36           13            24
13           4             10               37           13            25
14           4             11               38           14            25
15           4             12               39           14            26
16           5             12               40           15            26
17           5             13               41           16            26
18           6             13               42           16            27
19           6             14               43           17            27
20           6             14               44           17            28
21           7             15               45           17            29
22           7             16               46           17            30
23           8             16               47           18            30
24           8             17               48           18            31
25           9             17               49           19            31
26           9             18               50           19            32
27           9             19               60           24            37
28           10            19               70           28            43
29           10            20               80           33            48
30           11            20               90           37            54
31           11            21               100          42            59
32           11            22               110          46            65
33           11            22               120          51            70
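A table lookup like this is easy to automate. The sketch below copies five rows from the table into a dictionary and checks a run count against them. The `runs_signal` helper is hypothetical, and it assumes that a count equal to either limit is still acceptable, so only counts strictly below the lower limit or strictly above the upper limit are flagged.

```python
# Excerpt of the table: points not on median -> (lower limit, upper limit).
RUN_LIMITS = {
    10: (3, 8),
    15: (4, 12),
    20: (6, 14),
    25: (9, 17),
    30: (11, 20),
}


def runs_signal(points_not_on_median, runs):
    """Compare a run count with the table limits for that number of points."""
    lower, upper = RUN_LIMITS[points_not_on_median]
    if runs < lower:
        return "too few runs"
    if runs > upper:
        return "too many runs"
    return "no signal"


# The example from the run-counting slide: 20 points not on the median, 11 runs.
print(runs_signal(20, 11))
```

With 20 points not on the median, 11 runs falls between the limits of 6 and 14, so the run test gives no signal for that example.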


Signals of Special Causes on Time Plots

Special causes may be present if there are:
Too many or too few runs.
6 or more points in a row continuously increasing or decreasing (trend).
9 or more points in a row on the same side of the median (shift).
14 or more points in a row alternating up and down.
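The last three signals can be checked mechanically. The sketch below is one possible reading of the rules, with assumptions stated plainly: lengths are counted in points (so 6 points in a trend means 5 consecutive moves in the same direction), two equal neighbouring values break a trend or an alternating pattern, and points on the median are skipped for the shift test. All function names are hypothetical.

```python
from statistics import median


def _step(previous, current):
    """+1 if the series went up, -1 if it went down, 0 if unchanged."""
    return (current > previous) - (current < previous)


def longest_trend(values):
    """Most points in a row that keep increasing or keep decreasing."""
    best = current = 1
    direction = 0
    for previous, value in zip(values, values[1:]):
        step = _step(previous, value)
        if step != 0 and step == direction:
            current += 1
        else:
            current = 2 if step != 0 else 1
        direction = step
        best = max(best, current)
    return best


def longest_same_side(values):
    """Most points in a row on one side of the median (median points skipped)."""
    center = median(values)
    best = current = side = 0
    for value in values:
        this_side = (value > center) - (value < center)
        if this_side == 0:
            continue
        current = current + 1 if this_side == side else 1
        side = this_side
        best = max(best, current)
    return best


def longest_alternating(values):
    """Most points in a row that alternate up and down (sawtooth)."""
    best = current = 1
    direction = 0
    for previous, value in zip(values, values[1:]):
        step = _step(previous, value)
        if step != 0 and direction != 0 and step == -direction:
            current += 1
        else:
            current = 2 if step != 0 else 1
        direction = step
        best = max(best, current)
    return best


def special_cause_signals(values):
    """List which of the three pattern signals appear in the data."""
    signals = []
    if longest_trend(values) >= 6:
        signals.append("trend")
    if longest_same_side(values) >= 9:
        signals.append("shift")
    if longest_alternating(values) >= 14:
        signals.append("alternating")
    return signals


print(special_cause_signals([5, 3, 4, 5, 6, 7, 8, 9, 2]))
```

The sample contains 7 points in a row that keep increasing (3 through 9), so only the trend signal is reported. The run-count test is separate and still needs the lookup table.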


Examples of Signals

Too Few Runs


Too Many Runs




Trends: 6 or more points in a row increasing or decreasing

Upward Trend

Downward Trend


More Examples of Signals


Process Shift: 9 or more points in a row above or below the centerline



Bias or Sampling Problems: 14 or more points in a row alternating up and down (sawtooth)


Control Charts for Individual Values


Control Charts:
Are time-ordered plots of results (just like time plots).
Use statistically determined control limits that are drawn on the plot.
Use the mean, not the median, to calculate the centerline.

[Figure: control chart with monthly data points.]




Why Use a Control Chart

Statistical control limits establish process capability.

Statistical control limits are another way to separate common-cause and special-cause variation. Points outside statistical limits signal a special cause.
Can be used for almost any type of data collected over time. Provides a common language for discussing process performance.

When to Use a Control Chart

Use a Control Chart: To track performance over time. To evaluate progress after process changes/improvements. To focus attention on detecting and monitoring process variation over time.


Control Chart Features

Basic features are the same as a time plot.
Control limits (calculated from data) are added to the plot.
The centerline is usually the average instead of the median.

Statistical control limits are not based on what we would like the process to do. They are based on what the process is capable of doing. They are computed from the data using statistical formulas.

[Figure: control chart with monthly data points, annotated with the features above.]

How to Construct Control Charts

1. Select the process to be charted.
2. Determine sampling method and plan.
3. Initiate the data collection.
4. Calculate the appropriate statistics.
5. Plot the data values on the first chart (mean, median or individuals).
6. Interpret the control chart and determine if the process is in control.

Individuals Control Chart and Individuals Data

The kind of control chart shown on the previous pages is called an individuals chart because the data points are individuals data: actual measurements on a single output.

Examples of individuals data:
Sales
Costs
Efficiencies
Maintenance time
Pollution levels
Cycle times
Losses in money
Chemical analyses
Production time lost
Production amounts
Pressures
Waste
Speeds
Conductivity


What Are Control Limits?

[Figure: time plot of a stable process, shown with its process distribution.]

A control limit defines the bounds of common-cause variation in the process.
A control limit is a tool we use to help us take the right actions.
If all points are between the limits, we assume only common-cause variation is present (unless one of the other signals of a special cause is present).
If a point falls outside the limits, you treat it as a special cause. Otherwise, you do not investigate individual data points, but instead study the common-cause variation in all data points.

How to Calculate Control Limits

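There are two formulas commonly used for calculating control limits. Both use the same centerline, the average of the individual values:

Centerline = $\bar{X}$

Control Limits: Method 1, using the median moving range $\tilde{R}$
$UCL = \bar{X} + 3.14\,\tilde{R}$
$LCL = \bar{X} - 3.14\,\tilde{R}$

Control Limits: Method 2, using the average moving range $\bar{R}$
$UCL = \bar{X} + 2.66\,\bar{R}$
$LCL = \bar{X} - 2.66\,\bar{R}$

UCL = Upper Control Limit. LCL = Lower Control Limit.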
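To make the two methods concrete, here is a small Python sketch that computes the moving ranges (the absolute differences between adjacent points) and then applies either multiplier. The function name, the `method` argument, and the sample values are assumptions made for illustration; they are not the data behind the charts in this course.

```python
from statistics import mean, median


def individuals_limits(values, method="average"):
    """Return (LCL, centerline, UCL) for an individuals chart."""
    # Moving range: absolute difference between each pair of adjacent points.
    moving_ranges = [abs(b - a) for a, b in zip(values, values[1:])]
    center = mean(values)
    if method == "average":
        spread = 2.66 * mean(moving_ranges)    # Method 2: average moving range
    elif method == "median":
        spread = 3.14 * median(moving_ranges)  # Method 1: median moving range
    else:
        raise ValueError("method must be 'average' or 'median'")
    return center - spread, center, center + spread


# Invented sample data, far fewer points than you would use in practice.
sample = [30, 28, 31, 29, 32, 30, 27, 31]
for name in ("median", "average"):
    lcl, center, ucl = individuals_limits(sample, method=name)
    print(f"{name}: LCL={lcl:.2f}  centerline={center:.2f}  UCL={ucl:.2f}")
```

The two methods generally give different limits for the same data, because a single large jump inflates the average moving range more than the median moving range.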


Control Charts and Tests for Special Causes

On a control chart, any data point outside the control limits is a signal of a special cause. But can you use the previous tests for special causes on a control chart, too? The answer is: it depends. Two of the previous tests (counting runs, and 9 points in a row on the same side) are determined relative to the median of the data. But on a control chart, the centerline is the average, not the median. Solution?
You can use the average with caution if you think the data have a roughly Normal distribution (this will be covered later in this course). "With caution" means to check your interpretation in other ways before taking action.

Plotting the Data in Individual Chart

[Figure: Individuals Chart of 25 data points. UCL = 36.1, centerline (average) = 29.8, LCL = 23.5.]

Many people like to plot the Moving Range chart at the same time they plot an individuals chart. The moving ranges are the differences between adjacent points. As Wheeler & Chambers point out in their book Advanced Topics in Statistical Control, "It's not that [the mR chart] improves the ability of the X-chart to detect signals, but that it serves as a reminder of the correct way of computing the limits for the X-Chart."


Enter data in two columns, one for the date/time and the other for the variable. Ensure you have at least 20 data points; otherwise consult your MBB.

Go to Stat, select Control Chart, select Individuals, and select the variable.

Go to Test, tick First Four Tests, and click OK.

Set the Lower Specification Limit by clicking S-limits where negative values are not feasible. Set the x-axis points by selecting Stamp. Go to Frame > select Tick > set the x-axis by 1:n/k (where n is the total number of observations and k is selected by you). Click OK, OK.

Determining Limits for individual Charts

[Figure: two individuals charts of the same 24 data points. Control limits based on 2.66 R (average moving range): UCL = 39.1. Control limits based on 3.14 R (median moving range): UCL = 36.4.]


Specification Limits vs. Control Limits

Specification limits
Are set by the customer, management, or engineering requirements.
Describe what you want a process to achieve.

Control limits
Are calculated from the data.
Describe what the process is capable of achieving.

[Figure: time plot showing the UCL and LCL together with the upper and lower specification limits.]


Special Causes, Common Causes, and Process Capability

It is extremely unlikely that an unstable process (with special causes) will ever be capable. In a highly capable process, the control limits are much narrower than the specification limits.


X-Bar, Range Control Charts


When to use an X-Bar, R Chart

When data are collected in rational subgroups, it makes sense to use an X-Bar, R chart.

Rational subgroups

In the rational subgroup, we hope to have represented all the common causes of variation and none of the special causes of variation. X-Bar, R charts allow us to detect smaller shifts than individuals charts. Also they allow us to clearly separate changes in process average from changes in process variability.

Key Properties of Subgrouped Data

Subgrouped data have variation within a subgroup, which produces:
Average of the subgroup, noted as X-bar
Range within the subgroup, noted as R

Subgrouped data have variation between subgroups, which produces:
Average of the subgroup averages, noted as X-double-bar

Subgroup Data

Subgroup   Obs 1   Obs 2   Obs 3   Obs 4   Sum     X-bar   R
1          12.80   13.80   11.80   12.80   51.20   12.80   2.00
2          13.50   11.40   13.20   12.70   50.80   12.70   2.10
3          12.40   11.40   11.45   11.75   47.00   11.75   1.00
4          12.60   10.60   10.40   11.20   44.80   11.20   2.20
5          11.00    9.60   11.80   10.80   43.20   10.80   2.20
6           9.40   11.10   11.60   10.70   42.80   10.70   2.20
7          10.80   12.80   10.90   11.50   46.00   11.50   2.00
...        (subgroups 8 to 21 not shown)
22         10.40    9.40   10.20   10.00   40.00   10.00   1.00

X-double-bar (average of the subgroup averages) = 11.20


X-Bar, R Chart

[Figure: X-bar chart (UCL = 13.1, X-double-bar = 11.2, LCL = 9.3) above an R chart (UCL = 5.9, R-bar = 2.6), both for subgroups 1 to 22.]

An X-bar, R chart uses the variation within subgroups to establish limits for the averages of the subgroups.
When there is more variation between subgroups than within subgroups, a special cause will be signaled.
The chart will NOT detect special causes within a subgroup.
Selection of the subgroups is of primary importance.
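The sketch below shows how within-subgroup variation turns into limits for the subgroup averages. It uses the eight subgroups that are visible in the Subgroup Data table and the summary values from the X-Bar, R chart (X-double-bar = 11.2, R-bar = 2.6). The constants A2 = 0.729, D3 = 0, and D4 = 2.282 are the standard published chart factors for subgroups of size 4; they are supplied here as an assumption because the factor table itself did not survive in this material. The function names are hypothetical.

```python
from statistics import mean

# Standard control chart factors for subgroups of size 4 (assumed, see lead-in).
A2, D3, D4 = 0.729, 0.0, 2.282

# The eight subgroups visible in the table (subgroups 1 to 7 and 22).
subgroups = [
    [12.80, 13.80, 11.80, 12.80],
    [13.50, 11.40, 13.20, 12.70],
    [12.40, 11.40, 11.45, 11.75],
    [12.60, 10.60, 10.40, 11.20],
    [11.00, 9.60, 11.80, 10.80],
    [9.40, 11.10, 11.60, 10.70],
    [10.80, 12.80, 10.90, 11.50],
    [10.40, 9.40, 10.20, 10.00],
]


def subgroup_stats(groups):
    """Return the average (X-bar) and range (R) of each subgroup."""
    averages = [mean(group) for group in groups]
    ranges = [max(group) - min(group) for group in groups]
    return averages, ranges


def xbar_r_limits(grand_mean, mean_range):
    """Return (LCL, UCL) pairs for the X-bar chart and for the R chart."""
    return {
        "xbar": (grand_mean - A2 * mean_range, grand_mean + A2 * mean_range),
        "range": (D3 * mean_range, D4 * mean_range),
    }


averages, ranges = subgroup_stats(subgroups)
limits = xbar_r_limits(grand_mean=11.2, mean_range=2.6)
print("X-bar chart limits:", [round(x, 1) for x in limits["xbar"]])
print("R chart limits:", [round(x, 1) for x in limits["range"]])
```

With the chart's summary values, the computed limits round to 9.3 and 13.1 for the X-bar chart and to an upper limit of 5.9 for the R chart, matching the figure. Limits computed from only the eight visible subgroups would differ, since the chart uses all 22.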

Xbar, R Charts Features


Factors for X-Bar, R Charts


Interpreting an X-Bar, R Chart

Use the Signals of Special Causes on both charts.
Look at the R chart first. If the range chart is unstable (has special causes), the limits on the X-Bar chart will be of little value. If the range chart is unstable, it is unsafe to draw conclusions about variation in the process average.
Look for positive or negative correlations between the data points on the X-Bar and the R chart (both move in the same direction or in opposite directions for every point). This happens when the data have a skewed distribution, and some conclusions may be affected.


When to Use X-Bar, R Charts

Though used in both administrative and manufacturing applications, they are the tools of first choice in many manufacturing applications. Advantages over other charts:
Subgroups allow for a precise estimate of local variability.
Changes in process variability can be distinguished from changes in process average.
Small shifts in process average can be detected.

The advantages of an X-bar, R chart disappear if systemic special causes occur, that is, a special cause that appears in each subgroup. For example, suppose you're counting errors in orders received by phone and you have four operators taking orders. It would be natural to want to construct subgroups of 4, taking one order form from each operator. But if one operator is consistently better or worse than the others, you would be mixing special cause and common cause variation in the data. The chart will be useless: it obscures differences between operators AND makes it hard to detect changes in the process or variability.


Open a Minitab worksheet. (A) When subgroups are in one column, enter the data in one column. (B) When subgroups are in rows, enter the data in a series of columns.

When (A), enter the data column in "Single column" and enter the subgroup size in "Subgroup size".

When (B), enter the series of columns in "Subgroups across rows of".

Click OK.

Frequency Plots


Case Study: Speeding Up Improvements

A company had instituted a Corrective Action system to ensure that improvement suggestions were followed up on. Their goal was to process and close all Corrective Action requests within 50 days. Several years after the system was started, data on all the Corrective Actions was collected. Someone crunched the numbers and was able to tell management it took an average of 94 days to resolve the issues identified through this system. Management's reaction was clear: "We need to cut our cycle time in half!"

Case Study: Speeding Up Improvements, Cont.

One employee decided to create a frequency plot of the actual data, which he then showed to management.
[Figure: frequency plot, Time Needed to Resolve Corrective Action Requests. Average = 94 days.]

How do your reactions differ from just knowing the average was 94 days to seeing the actual distribution?


Frequency Plots (Histograms)

A frequency plot shows the shape or distribution of the data by showing how often different values occur.

[Figure: frequency plot of fill weight for a bag, 1 July to 7 July, with the target weight marked. Horizontal axis: fill weight. Vertical axis: number of occurrences.]




Why Use Frequency Plots

A Frequency Plot:

Summarizes data from a process and graphically presents the frequency distribution in bar form.
Helps to answer the question: Is the process capable of meeting customer requirements?


When to Use Frequency Plots

Use a Frequency Plot:

To display large amounts of data that are difficult to interpret in tabular form. To show the relative frequency of occurrence of the various data values.
To reveal the centering, spread, and variation of the data. To illustrate quickly the underlying distribution of the data.

Frequency Plot Features

[Figure: frequency plot of fill weight for a bag, 1 July to 7 July, annotated with the features below.]
Height of each column indicates how often that data value occurred.
Target and/or specifications are labeled (here, the target weight).
Overall shape shows how the data is distributed.
Axes are clearly labeled (fill weight; number of occurrences).


Frequency Plot Uses

A Frequency Plot creates a picture of the variation in a process.
It can reveal patterns that provide clues to certain types of problems.
It can verify whether data are distributed normally.

[Figure: frequency plot, Overfill in 100 Boxes. Horizontal axis: ounces over listed weight.]


Types of Frequency Plots

[Figure: examples of frequency plot types, including a Dot Plot and a Stem-and-Leaf Plot.]

Stem-and-Leaf Plot
9 | 149
8 | 25396
7 | 974532826
6 | 68274435082
5 | 42083443
4 | 90812156
3 | 349
2 | 1


How to Construct a Frequency Plot

1. Decide on the process measure.
2. Gather data (at least 50 data points).
3. Prepare a frequency table of the data.
   Count the number of data points.
   Calculate the range.
   Determine the number of class intervals.
   Determine the class width.
   Construct the frequency table.
4. Draw a frequency plot (histogram) of the table.
5. Interpret the graph.
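Step 3 can be sketched as follows. The steps above do not say how to choose the number of class intervals, so this sketch assumes the common square-root rule (about the square root of the number of data points) unless a number is passed in. The `frequency_table` helper and the sample values are invented for illustration, and the sample is far smaller than the 50 points recommended.

```python
import math


def frequency_table(values, classes=None):
    """Return a list of (class start, class end, count) rows."""
    count = len(values)                    # count the number of data points
    low, high = min(values), max(values)
    data_range = high - low                # calculate the range
    if data_range == 0:
        return [(low, high, count)]        # all values identical: one class
    if classes is None:
        # Assumption: square-root rule for the number of class intervals.
        classes = max(1, round(math.sqrt(count)))
    width = data_range / classes           # determine the class width
    counts = [0] * classes
    for value in values:
        # The largest value belongs to the last class, not one past it.
        index = min(int((value - low) / width), classes - 1)
        counts[index] += 1
    return [
        (low + i * width, low + (i + 1) * width, counts[i])
        for i in range(classes)
    ]


sample = [1, 2, 2, 3, 3, 3, 4, 4, 5, 9]
for start, end, frequency in frequency_table(sample, classes=4):
    print(f"{start:4.1f} to {end:4.1f}: {'#' * frequency}")
```

Printing one `#` per occurrence gives a quick text version of the histogram in step 4, which is enough to see the center, the range, and the rough shape.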

What to Look for on a Frequency Plot

1. Center of the data


2. Range of the data (distance between the largest value and the smallest value)


What to Look for on a Frequency Plot, Cont.


3. Shape of the distribution (provides information about process capabilities)
4. Comparison with Target and Specification





Common Shapes of Frequency Plots

Bell shaped. Symmetric.

Two humps. Bimodal.

Long tail. Not symmetric.


Interpreting Distribution
If a frequency plot shows a bell-shaped, symmetric distribution
Conclude: No special causes indicated by the distribution; data may come from a stable process (Caution: special causes may appear on a time plot or control chart).
Action: Make fundamental changes to improve a stable process (common-cause strategy).
Note: We'll learn more about bell-shaped or Normal curves in the next chapter.


Interpreting Distribution
If a frequency plot shows a two-humped, bimodal distribution
Conclude: What we thought was one process operates like two processes (two sets of operating conditions with two sets of output).
Action: Use stratification or other analysis techniques to seek out causes for two humps; be wary of reacting to a time plot or control chart for data with this distribution.

Interpreting Distribution
If a frequency plot shows a long-tailed distribution (is not symmetric)
Conclude: Data may come from a process that is not easily explained with simple mathematical assumptions (like normality). A long-tailed pattern is very common when measuring time or counting problems.
Action: You'll need to use most data analysis techniques with caution when data has a long-tailed distribution. Some will lead to false conclusions. For example, the control limit calculations are based on the assumption that the data have a bell-shaped curve. Calculating control limits for data with a long-tailed distribution will likely make you overreact to common cause variation and miss some special causes. Other tests that rely on normality include hypothesis tests, ANOVA, and regression. To deal with data with this kind of distribution, you may need to transform it.

Additional Frequency Plot Patterns

Basically flat
If a frequency plot shows a basically flat distribution
Conclude: Process may be drifting over time or process may be a mix of many operating conditions.
Action: Use time plots to track over time; look for possible stratifying factors; standardize the process.

One or more outliers
If a frequency plot shows one or more outliers
Conclude: Outlier data points are likely the result of clerical error or something unusual happening in the process.
Action: Confirm outliers are not clerical error; treat like a special cause.


Frequency Plot Irregularities

[Figure: four frequency plots, one for each irregularity below.]
Five or fewer distinct values
Large pile-up around a minimum or maximum value
One value is extremely common
Saw-tooth pattern


Interpreting Distribution

If a frequency plot shows five or fewer distinct values

Conclude: Measuring device not sensitive enough or the measurement scale is not fine enough. Action: Fine tune measurements by recording additional decimal points.


Interpreting Distribution

If a frequency plot shows a large pile up of data points

Conclude: A sharp cut-off point occurs if the measurement instrument is incapable of reading across the complete range of data, or when people ignore data that goes beyond a certain limit. Action: Improve measurement devices. Eliminate fear of reprisals for recording unacceptable data.

Interpreting Distribution
If a frequency plot has one value that is extremely common
Conclude: When one value appears far more commonly than any other value, the measuring instrument may be damaged or hard to read, or the person recording the data may have a subconscious bias. Action: Check measurement instruments. Check data collection procedures.


Interpreting Distribution
If a frequency plot shows a saw-tooth pattern
Conclude: When data appear in alternating heights, the recorder may have a subconscious bias for even (or odd) numbers, the measuring instrument may be easier to read at odd or even numbers, or the data values may be rounded incorrectly. Action: Check measuring instrument and procedures.

Distributions, Capability, and Targets

[Figure: three distributions shown against the target and specification limits.]
On target and capable*
Not capable (probably because off target)
Not capable even if on target





Interpreting Distribution
"Capable" doesn't just mean what the process produces today or tomorrow, but into the future. A process that is barely within the specifications isn't capable, because it's likely something will happen to produce data points outside the specifications. A process needs to be well within the specifications to be considered capable. The distribution of the top chart, for instance, shows that the process is centered around the target and all the current data are well within specifications. It is both capable and on target. The lower left chart shows a process that is off target, but the output looks like it could be within the specs if the center of the distribution could be moved closer to the target. The third chart shows a process that is not capable: the spread of variation is too wide to reliably produce output within the specification limits. These concepts will be explored in much more depth in the next chapters, when we examine the concept of Process Sigma.


Checking Both Time Order and Distribution



In practice, if your data has a natural time order, you should always do a time plot (or control chart) as well as a frequency plot. Each gives you different information.

In this case, there are no special causes that appear in the time plot (according to the Tests for Special Causes already taught), but the frequency plot clearly has a bimodal pattern and you'd want to investigate why.
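To make the frequency view concrete, here is a minimal sketch that bins a series of readings and prints a text frequency plot. The readings, the bin width, and the function names `frequency_table` and `text_plot` are all hypothetical, chosen only so that two clusters show up. In the course itself these plots are produced in Minitab.

```python
from collections import Counter

def frequency_table(values, bin_width):
    """Count how many values fall into each bin of the given width."""
    counts = Counter((v // bin_width) * bin_width for v in values)
    return dict(sorted(counts.items()))

def text_plot(table):
    """Render a frequency table as rows of asterisks, one row per bin."""
    return [f"{start:>4} | {'*' * n}" for start, n in table.items()]

# Hypothetical readings listed in the order they were collected.
readings = [12, 31, 29, 14, 11, 33, 13, 30, 32, 15, 12, 28]

table = frequency_table(readings, bin_width=5)
for row in text_plot(table):
    print(row)
```

The bins starting at 10 and 30 hold most of the readings and the bin starting at 20 is empty, which is the kind of two-peaked shape that would prompt a look for two mixed sources.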


WORK SHEET # 7: Interpreting Distribution and Time Order

Objective: Gain an understanding of the different types of information provided by frequency plots and time plots, and how looking at the data from different perspectives can lead to different conclusions.
Instructions: Divide into pairs or small groups. Read the case study below and discuss your interpretation of the data shown in the back-to-back frequency plots. Then look at the time plot on the next page and discuss the questions shown there. Be prepared to discuss your answers with the class. Time: 10 min.

Interpreting Distribution and Time Order

This company was having trouble scheduling the services delivered to its customers because of delays in receiving materials from its suppliers. The team went into the computer records and recovered data from the past 40 weeks comparing promised delivery dates to actual delivery dates for the two main suppliers. Based on the frequency plots, which supplier would you recommend this company choose? Note: A negative number indicates the delivery was early.

[Figure: back-to-back frequency plots for Supplier A (40 deliveries) and Supplier B (40 deliveries); axis: Days from Target]


Interpreting Distribution and Time Order, cont.

Now look at the time plot of the same data shown previously on the frequency plots.
[Figure: Time Plot of Suppliers A and B Late Deliveries (40 weekly deliveries each); one line per supplier; vertical axis: days from target]

What is your interpretation now that you've seen both the time plot and frequency plot? Which supplier would you recommend using?

In the frequency plot, Supplier B looks far superior to Supplier A, having a much narrower distribution generally much nearer the target. In the time plot, however, it looks like Supplier A is making rapid strides in improving its ability to deliver on time. You would probably want to collect more data to make sure that Supplier A can sustain its current level of performance.



Make sure the data window is active.
Choose GRAPH > HISTOGRAM (or STAT > BASIC STATISTICS > DISPLAY DESCRIPTIVE STATISTICS).
In X (or Variables), enter the column containing the data, then click OK (or choose GRAPHS > HISTOGRAM OF DATA > OK).
You can choose the type of histogram and the number of classes by clicking OPTIONS in the GRAPH > HISTOGRAM menu.
You can also use GRAPH > DOTPLOT to display the data. If the data come from more than one source, you can display each source in its own dot plot.


Make sure the data window is active.
Choose GRAPH > DOTPLOT.
Enter the column containing the data and click OK.
If your data come from several sources, put the source variable (either text or numeric) in the next column. Click By Variable in the Dot Plot menu, select the source column, and click OK.
You get the stratified dot plot on the same graph.


Dotplot for PRODUCTION















Pareto Diagrams


Dividing Data into Categories

So far, we've looked at data that has a time order, a numeric order, or both. The tools we used were time plots, control charts, and frequency plots. But many times you end up with data that can best be analyzed by dividing it into categories. A Pareto chart is one of the best tools for looking at categorical data.

Why Use a Pareto Chart

Use a Pareto Chart when you want to:
Understand the pattern of occurrence for a problem.
Judge the relative impact of various parts of a problem (quantifying the problem).
Track down the biggest contributor(s) to a problem.
Decide where to focus efforts.

Often, it is very useful to add cost information to the chart. A Pareto chart helps you decide where your improvement efforts will have the biggest payoff. Ultimately, you are trying to uncover clues that will help you pinpoint causes. You can use a Pareto chart only when the problem under study can be broken down into categories and the number of occurrences can be counted for each category.

When to Use a Pareto Chart

You can use a Pareto Chart when:
The problem under study can be broken down into categories.
You want to identify the vital few categories on which to focus your improvement effort.

Example: Profit margins are critical in the grocery business. Anything you can do to eliminate waste has a direct influence on the bottom line. One large grocery store wanted to reduce the amount of money wasted through spoiled food, which obviously could not be sold to the public. The supervisors of various departments were all clamoring that their problem was worst and should be addressed first. Where should the store focus its attention? They combed through records for the past three months and created a Pareto chart of spoilage by department.

Pareto Charts Definition

A Pareto chart is a graphical tool that helps you break a big problem down into its parts and identify which parts are the most important.

[Figure: Pareto chart, Departmental Store Spoilage by Department, October to December 2004; vertical axis: Amount of Spoilage (Rs); categories shown: Dairy, Produce, Bakery, Other, Meat]



How to Construct a Pareto Chart

1. Decide which problem you want to know more about.

2. Gather the necessary data.

3. Compare the relative frequency (or cost) of each problem category.

4. List the problem categories (sorted by frequency, in descending order) on the horizontal line and frequencies on the vertical line.

5. Draw the cumulative percentage line showing the portion of the total that each problem category represents (optional).

6. Interpret the results.
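Steps 3 to 5 can be sketched in a few lines of code: sort the categories by count in descending order, then accumulate each category's share of the total. The defect categories and counts below are invented for illustration, and `pareto_table` is a hypothetical helper name, not a Minitab function.

```python
def pareto_table(counts):
    """Return (category, count, cumulative_percent) rows, largest first."""
    total = sum(counts.values())
    ordered = sorted(counts.items(), key=lambda item: item[1], reverse=True)
    rows, running = [], 0
    for category, count in ordered:
        running += count
        rows.append((category, count, round(100 * running / total, 1)))
    return rows

# Hypothetical defect counts, invented for illustration only.
defects = {"Scratches": 12, "Dents": 45, "Misalignment": 8,
           "Stains": 25, "Labeling": 10}

for category, count, cum_pct in pareto_table(defects):
    print(f"{category:<14}{count:>4}{cum_pct:>8}%")
```

The first column gives the bar order for the horizontal axis, the second the bar heights, and the third the points of the optional cumulative percentage line.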

Examples of Pareto Charts

[Figure: two example Pareto charts. Lost Contract Bids, January to December: Contract Size ($000s) by Reason (Lack of Capacity, Lack of Capability, Personality, Other). Computer Downtime, August 1-31: Hours by cause (Sched Mntnc, Hardware Failure, Software Bugs, Power Outages, Upgrades, Unexplained), with a Percent of total axis.]


What to Look For: Relative Heights of the Bars

If you see this

Interpretation & Action

Pareto Principle applies: one or a few categories account for most of the problem. Focus improvement effort on top one or two bars.

Pareto Principle does not hold: bars are all about equal height. It is not worth investigating the tallest bar. Look for other ways to categorize the data, or look for a different kind of data on this problem.


What to Look For: Height of the Y-Axis

If you see this

Interpretation & Action

Y-axis is only as tall as the tallest bar. Height of bars is seen relative to the tallest bar, not in relation to the total number of problems.






When drawn correctly, it doesn't appear as if Bar A is really that much taller than other bars. Treat as if the Pareto Principle does not hold (that is, don't focus solely on Bar A).


What to Look for: Size of the Other Category

If you see this
Other bar is small

Interpretation & Action

Most data are accounted for by actual categories. The relative heights of the other bars should accurately reflect the current state. Continue with analysis on the tallest bars.

Other bar is very tall

Perhaps items clustered under Other should be redistributed to existing categories or a new category created. Re-examine Other items.


What to Look for: The Data Used

If you see this
Defects in Packaging, June 4-10

Interpretation & Action

Ask if the time period indicated is large enough to fairly represent the process output. Is that one week typical of the process, or would it be better to use data from an entire month?

Most important problems Tally of opinions 10/19

Ask whether the data used to create the chart are valid. For example, it's usually risky to take action based on surveys or votes. If you suspect the data may be biased or irrelevant, come up with alternative data you can collect.

Summary: What to Look for on a Pareto Chart

Relative heights of the bars (including height of the Y-axis): make sure the Pareto Principle applies.
Size of the Other category: make sure you can't make another category from some of the Other data.
Type of data used to create the chart: is the chart based on valid data?
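The first two checks can be expressed as shares of the total rather than judged by eye, which also avoids the Y-axis trap of comparing bars only with the tallest bar. The sketch below is one possible way to do that. The helper name `pareto_checks`, the choice of the top two bars, and both data sets are assumptions for illustration, and no cutoff value is implied.

```python
def pareto_checks(counts, top_n=2, other_label="Other"):
    """Report the share of the total held by the top bars and by 'Other'."""
    total = sum(counts.values())
    named = {k: v for k, v in counts.items() if k != other_label}
    top = sorted(named.values(), reverse=True)[:top_n]
    return {
        "top_share_pct": round(100 * sum(top) / total, 1),
        "other_share_pct": round(100 * counts.get(other_label, 0) / total, 1),
    }

# Two hypothetical data sets: one concentrated, one nearly flat.
concentrated = {"A": 60, "B": 20, "C": 8, "D": 7, "Other": 5}
flat = {"A": 22, "B": 21, "C": 20, "D": 19, "Other": 18}

print(pareto_checks(concentrated))
print(pareto_checks(flat))
```

In the first data set the top two bars hold most of the total and Other is small. In the second the top two bars hold less than half and Other is nearly as tall as any named category, so re-categorizing the data would be the next step.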

Reacting to a Pareto Chart: When the Pareto Principle Holds

If the Pareto Principle holds, and a few categories are responsible for most of the problems:

1. Begin work on the largest bar(s).
2. When you've narrowed down the problem, continue to Step 3: Analyzing Causes.

[Figure: Pareto chart, Spoilage by Department (vertical axis: Amount of Spoilage), with the Produce bar broken down in a second Pareto chart, Spoilage by Type of Produce (vertical axis: Amount; categories include Other vegetables and Other fruit)]







When the Pareto Principle Does NOT Hold

When all the bars are roughly the same height and/or many categories are needed to account for most of the problem, you need to find another way to look at the data.
[Figure: four Pareto charts. Number of Injuries by Department (departments include G: Transportation, A: Maintenance, E: Finishing, C: Pressing, B: Forming, D: Baking). Break down another way: # of Injuries by Body Part (Finger, Ankle, Back, Arm). Adjust for impact: Lost time from injuries by Department (Lost Time, days). Normalize the data: # of Injuries per 100,000 hours worked, by Department (Inj/100k Hrs).]





Open a MINITAB worksheet. Put your data (number of defects) in one column and the category names in the other column.

Choose Chart Defects Table. In "Labels in", enter the category-name column, and in "Frequencies in", enter the number-of-defects column. Enter the required title.

Case Example
A company was very much interested in reducing the number of injuries. Their first attempt at data analysis is shown on the left: this Pareto shows the number of injuries categorized by department. The bars are all very similar in height, which means the Pareto Principle does not hold. They used three strategies to look at the data differently in hopes of finding clues that would help them eliminate injuries. Breaking down or categorizing the problem another way. To create the top right chart, the company categorized the data by injured body part instead of department. The Pareto Principle does hold for this chart, so the company would be justified in investigating the causes of Finger injuries first.

Case Example
Adjusting the counts for impact (time, dollar cost, etc.). The company realized that not all injuries are created equal. Six finger injuries may not have as much impact as one back injury. So the middle chart shows how much each department is affected by injuries, as reflected in lost time. The Baking department suffers most from injuries, and much more than any other department. Based on this chart, the company could justify focusing its efforts on reducing injuries in the Baking department.


Case Example
Normalize the data.
The departments in this company are not all the same size; the Finishing department, for instance, is very small compared to Maintenance. To truly compare the rate of injuries across departments, the company converted the counts of injuries to the number of injuries that occur per 100,000 hours worked.
As you can see, though the Finishing department is small, it has a relatively high rate of injuries.
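The conversion is a simple rate: injuries divided by hours worked, multiplied by 100,000. The sketch below shows the arithmetic. The counts and hours are invented for illustration and are not the figures behind the charts in the case example, and `injuries_per_100k_hours` is a hypothetical helper name.

```python
def injuries_per_100k_hours(injuries, hours_worked):
    """Convert raw injury counts to a rate per 100,000 hours worked."""
    return {
        dept: round(injuries[dept] / hours_worked[dept] * 100_000, 1)
        for dept in injuries
    }

# Hypothetical counts and hours, invented for illustration only.
injuries = {"Maintenance": 40, "Finishing": 30, "Baking": 50}
hours_worked = {"Maintenance": 400_000, "Finishing": 60_000, "Baking": 250_000}

rates = injuries_per_100k_hours(injuries, hours_worked)
for dept, rate in sorted(rates.items(), key=lambda item: item[1], reverse=True):
    print(f"{dept:<12}{rate:>6}")
```

In this made-up data the smallest department has the fewest injuries but the highest rate, which is the kind of reversal that normalizing is meant to expose.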

Based on these four Pareto charts, what would you recommend to this company?


WORK SHEET # 8: Review Exercise

Open ProcEx Hypo Mod- file on milliohms. Graph the data appropriately and draw conclusions about the process. The specification for milliohms is 150-280.


A Look Ahead
The previous modules provided you with a basic understanding of data collection and analysis. The next module discusses how to calculate the capability of your process. To understand process capability, you will be using the concepts of variation, specifications, yield, and distributions.
