
Probability and Inference
Vic Baluyot

1. INTRODUCTION

1.1 Statistics Defined

Statistics is both a science and an art.

Plural Sense: a set or mass of numerical data.

e.g. the number of defectives in a given lot
     the time it takes to produce 1,000,000 IC chips

Singular Sense: the science of collecting, organizing, analyzing, and
interpreting data (information).

e.g. Using the data collected from an experiment, an engineer can institute
     measures in an IC design to make the process insensitive to defects
     and breakdowns.

1.2 The Many Uses of Statistics

 helps to unravel existing relationships within a process

e.g. One of the numerous successful applications of statistics was seen in
the photolithography process used to form contact windows on silicon
wafers. In this process, photoresist is applied to a wafer and dried by
baking. The wafer is exposed to ultraviolet light and the photoresist is
removed from the exposed areas, which are the future windows.
These areas are etched in a high-vacuum chamber, and the remaining
photoresist is removed.

Using what is known as the Taguchi approach, nine important process
parameters were studied in eight weeks and only 18 experiments. The
analysis of the results yielded improved settings of the parameters, and the
process was adjusted accordingly. The variance of pre-etch line width
was reduced by a factor of four. The defect density due to unopened
or unprinted windows per chip was reduced by a factor of three.
Finally, the time spent by wafers in window photolithography was
reduced by a factor of two.

 aids in decision making

e.g. In the manufacturing line, raw materials being used for the production
of a product are subjected to various tests, e.g., to see if they conform to
the required specifications. At times, it is impossible for every item to
be inspected. Hence a sampling procedure is used and, on the basis of
the test results for the selected samples, a decision is made as to
whether the raw materials can be used for production.


 predicts future outcome

e.g. The future of business establishments is basically dictated by sales,
particularly in the semiconductor industries. Prediction of future sales
will help an establishment to optimally allocate its resources to attain
production levels that will meet the sales requirement. Accuracy in
prediction, typically made using statistical techniques, will help
minimize opportunity loss.

 estimates unknown process parameters

e.g. Customer satisfaction can be attained by providing customers with
quality products/services. In the industrial setting, good quality is sometimes
gauged by the number of defective items shipped to the customer.
Typically, customer requirements include an estimate of the
benchmark value for the number of defective items shipped to them.
This estimate is computed using sound statistical techniques and
methodology.

1.3 Uses in Total Quality Management

Commitment to Continuous Quality Improvement

 continuous application of on- and off-line quality control methods such as


design of experiments and reliability testing

 use of data to analyze and solve problems

e.g. using the Deming cycle ("plan-do-check-act")
     plan: design and conduct of experiments
     check: statistical process control procedures

 using statistics as a tool

1.4 How Statistics Can Be Abused

- using and interpreting the results of a poorly planned experiment to adjust
  the process

- insistence on a low ppm through the manipulation of sampling procedures
  and estimation formulas

- reporting artificially low defective levels to please the boss

- estimating cost savings using linear and deterministic methods

- implying that the system is robust by just looking at SPC charts

1.5 Descriptive vs. Inferential Statistics

Descriptive Statistics

- describing and summarizing sets of numerical data

- includes the construction of graphs, charts, and tables, and the calculation
  of various descriptive measures such as averages and percentiles

Examples

1. An operator's performance in the line can be based on criteria such as
   output, understanding of the process, troubleshooting capabilities, and
   cooperativeness. The operator is given a score on each of these items. The
   scores are then weighted and a figure is calculated to measure overall
   performance.

2. To describe the productivity of a line area, one can take the ratio of chips
   produced to the number of operators involved in the production.
   Productivity is said to be highest in the area where this ratio is highest.

Inferential Statistics

- will allow an inference to be made about the whole population based on a
  sample from the population

- used when one wishes to determine the characteristics of a larger group
  by collecting data on a smaller group

Examples

1. Prior to the shipment of a product, inspection procedures are carried out.
   Through sampling methodologies, the average number of defectives is
   estimated. If the estimate is within the customer's specification, the
   product is shipped; otherwise, corrective actions are taken.

2. On-line quality control methods require that the product characteristic
   variability be in a state of statistical control. Deviation from this state
   means that there is an assignable cause responsible for the excess
   variability. Estimation of variability is usually done by taking sample
   batches out of the line, checking them, and computing a variance
   measure. This sample variance is used to give an overall assessment of
   the product characteristic variability.

1.6 Levels of Measurement

- Statistical treatment or analysis of data depends on the scale used to
  measure the variables of interest.

1. Nominal (or Classificatory) Scale of Measurement

- weakest level; used simply to classify an object, person, or characteristic

e.g. state of a processed IC chip - defective or non-defective
     bar codes
     telephone numbers
     marital status

2. Ordinal Level (or Ranking Scale) of Measurement

- numbers or categories follow some ordering

e.g. job assessment - poor, satisfactory, good, very good, excellent
     employee rank
     educational attainment
     satisfaction level

3. Interval Level of Measurement

- scale with a defined distance between two numbers

- scale has a common and constant unit of measurement with no "true zero"
  point

e.g. psychological evaluation
     IQ test scores
     temperature
     calendar dates

4. Ratio Level of Measurement

- contains all the properties of the interval level and, in addition, has a
  "true zero" point

e.g. cardinal numbers
     number of defectives within a given lot
     output capacity of an assembly area
     wire pull strength


Analysis can only proceed when data have been collected and verified.

Some Terminologies

population - the totality of units under consideration and from which
     measurements will be obtained

variable - a characteristic or attribute of interest which can assume different
     values for each unit in the population

observation - any numerical recording of information, the collection of which
     is known as the data

measurement - the assignment of numbers to observations in such a way
     that the numbers are amenable to analysis by manipulation or
     operation according to certain rules

Example

In a typical customer audit schedule, the client visits the physical facilities of
the supplier with the intention of checking whether the supplier conforms to
the mutually agreed manufacturing procedure. The client might investigate
whether the operators apply and understand statistical process control (SPC)
procedures. In this case, the population of interest is the plant operators,
with knowledge of SPC procedures as the variable of interest. A typical
observation might involve the client asking a particular operator a battery of
SPC questions. Depending on the answers, the client might or might not be
satisfied. Hence, the measurement process defines a nominal scale for the
variable of interest.

In the observation process, data can be collected from each unit (census or
100% inspection) or just a subset of the population (sample).


2. SOME SAMPLING SCHEMES

2.1 Census vs. Sample

Census

- method of gathering observations on every unit of the population

- not always possible to get timely, accurate, and economical data

- costly, if the number of units in the population is too large

Sampling

- method of collecting data from a subset of the population

- representative if it reflects the characteristics of the population under
  study

Advantages of Sampling

1. reduced cost
2. greater speed
3. greater scope
4. greater accuracy

The accuracy of sampling as against 100% inspection is a contentious issue.
The argument is that, since there are fewer units in a sample, recording and
analysis will be more focused and organized than under 100% inspection.
Example

Sampling in the semiconductor setting is commonly known as acceptance
sampling. Acceptance sampling is used as a basic tool for assessing product
quality. It helps in making inferences about a process based on a sample of
items from the process.

Traditionally, acceptance sampling has been applied at either final inspection
(by the producer) or incoming inspection (by the customer). Products are
grouped together into lots before they are shipped to the customer.
Normally, lot size is related to the physical size of the product.

Acceptance sampling proceeds by taking a random sample from a particular
lot and either measuring a particular quality characteristic or counting the
number of sample items that do not meet specifications. A lot is rejected if a
sufficiently large number of the sampled items do not meet the
specifications.

More Terminologies

parameter - a numerical characteristic of the population; a number derived
     from knowledge of the entire population

statistic - a numerical characteristic of the sample

variation - the inevitable differences among units of the population

A population is typically described by a parameter which is unknown. To
assess or estimate this unknown quantity, the observation process is set into
motion. If a sample is observed, a statistic is generated and used as an
estimate of the parameter. Since different samples will yield different
statistics, a measure of variation is obtained for each statistic produced to
convey its accuracy.

Example

In order to estimate the average outgoing quality (AOQ) level of IC chips
being turned out by a manufacturing company, an auditor samples 200 chips
from a given batch of lots and determines the proportion of chips in the
sample which do not conform to specifications. Here, the AOQ level is the
parameter, and the proportion of non-conforming chips in the sample is the
estimate. This proportion will vary if a different set of 200 chips within the
batch is observed or if a different batch of lots is observed. For precision
purposes, a measure of variation is needed for the obtained statistic to
evaluate how far it is from the true AOQ level.

2.2 Probability vs. Non-Probability Sampling

- Monitoring and testing typically involve sampling.

Monitoring - checks if the process is in a state of statistical control

Testing - ensures that the products that will be shipped satisfy the client's
     requirements

Probability sampling ensures that each unit produced in the line will have a
chance of being included in the sample. No such chance is guaranteed when
non-probability sampling is resorted to.

Examples of Non-Probability Sampling


1. The first 100 units produced by every process every day are obtained to
   construct control checks.

2. Raw material inspection wherein only the topmost and bottommost layers
   are inspected for conformance.

3. Quota sampling, where the outputs of key operators are observed, the
   reasoning being that they represent the typical operators within the plant.

Quota sampling seems sensible, but it really does not work well due to
unintentional bias.

Bias affects the analysis by inflating or deflating statistical estimates of
parameters. This will make the estimates systematically miss the target.

Examples

1. Selection Bias - the kind of bias committed when non-probability sampling
   is resorted to

2. Non-response Bias - the situation wherein the measurement on a selected
   unit was not recorded or cannot be obtained

3. Measurement Bias - happens when the instruments used are not
   calibrated

Usually, the following holds:

     estimate (statistic) = parameter + bias + chance error

To minimize bias, an appropriate sampling procedure should be used.

2.3 Some Sampling Plans

1. Simple Random Sampling

- the process of selecting a sample, giving each sampling unit an equal chance
  of being included in the sample

Two Variations

Simple Random Sampling with Replacement

- a chosen unit is always replaced before the next selection is made, so that
  an element may be chosen more than once

Simple Random Sampling without Replacement

- a chosen unit is not replaced before the next selection is made, so that an
  element may only be chosen once

To ensure constancy of chance, a listing of the sampling units is required.
This list is called a frame. Thus, the method may not be feasible in on-line
settings.


Steps:

1. Make a list of the sampling units and number them from 1 to N, N denoting
the population size.
2. Select n (distinct) random numbers (n denotes the sample size) ranging
from 1 to N using the table of random numbers or by lottery. The sample
consists of the units corresponding to the selected numbers.

A table of random numbers is constructed by guaranteeing that each digit (0
to 9) will appear in any position within the table with probability one in ten.
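
In practice, a computer's pseudo-random number generator can stand in for
the table of random numbers. The short Python sketch below draws both
kinds of simple random sample from a hypothetical frame; the frame size N
and sample size n are illustrative values, not taken from the text.

    import random

    N = 500      # hypothetical population (frame) size
    n = 20       # hypothetical sample size

    frame = list(range(1, N + 1))     # frame: units numbered 1 to N

    # SRS without replacement: each unit can appear at most once
    srs_wor = random.sample(frame, n)

    # SRS with replacement: a unit may be chosen more than once
    srs_wr = [random.choice(frame) for _ in range(n)]

    print(sorted(srs_wor))
    print(sorted(srs_wr))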

Advantage

Implementation is simple and easy.

Disadvantages

1. The units chosen might be widely spread in physical location, hence
   entailing a certain amount of cost.
2. A listing is basically required for implementation.
3. The chosen sample may not be typical of the population if the population
   is heterogeneous with respect to the variable under study.

Case Studies

Wire Bond Sampling Scheme

Sub-lots consisting of an average of six to eight magazines with 39 strips per
magazine are considered for inspection. The sampling scheme proceeds by
taking two strips per magazine and then 20 or 56 units per strip, depending
on whether it is 18/20 lds or 8 lds. The units are inspected for visual defects.
A defect seen will mean 100% inspection of the affected magazine.

Sampling Flow:

Lots
  -> No sampling indicated.
Sub-lots
  -> No sampling indicated.
Magazines
  -> Get two strips per magazine.
Strips (Primary Units)
  -> Get 20 or 56 units per strip.
Chips (Secondary Units)

Target Population: Chips produced
Sampled Population: Depends on how the lots and sub-lots were selected
Type of Sampling Procedure: Derivative of multistage sampling
Variable of Interest: State of the chip (visual defect, not including other
     types as indicated by machines)
Parameter: Proportion of defectives (usually expressed in ppm)

The goodness of the scheme is anchored on the following considerations:

1. How the lots, sub-lots, magazines and strips were selected
2. Number of chips sampled
3. Non-sampling error

Suggestions:
1. The manner in which the units are to be selected should be governed by
   past experience on how defects usually occur when system trouble erupts.
2. The representativeness of the units is a function of how coverage of the
   strip can be done.
3. The number of chips to be sampled should be governed by type I and type
   II error considerations, but definitely the more units sampled, the better.
4. Safeguards should be put in place so that non-sampling errors, e.g.
   operator-related errors, will not occur.

Weakness: Why is only the affected magazine inspected 100%?

Second Optical Inspection (for Wafers)

Each wafer is to be inspected for defects using a Z-pattern. If a reject is found
within the Z-pattern, a second inspection will be carried out using a Z-pattern
embedded in the first pattern. If a reject is again found, the whole wafer is
subjected to a 100% inspection.

Target Population: Dies in the wafer
Sampled Population: Dies falling within the Z- and embedded Z-patterns
Type of Sampling Procedure: Non-probability sampling
Variable of Interest: State of the dies
Parameter: Proportion of defectives

Technical Notes:

1. To make the sampling procedure a probability one, the orientation of the
   pattern must be randomized.
2. The usage of the Z-pattern should be based on some engineering principle
   dictating the pattern of defectives when they do occur.

Remark: You can always be skeptical of any sampling plan as far as its
motivation is concerned. The real test will always be experience. You can only
ensure that a plan is a good one by conducting trials on alternative sampling
strategies, through statistics calculated from cohort panel studies or through
simulation procedures assuming a constant defective rate.


2. Systematic Sampling

- method of selecting a sample by taking every kth unit from an ordered
  universe of units, the first one being selected at random

- k is called the sampling interval and is calculated as N/n. Its inverse is
  called the sampling fraction.

Steps:

1. Number the units of the population from 1 to N.
2. Determine k, the sampling interval.
3. Using a table of random numbers, choose a number r between 1 and N.
   The unit corresponding to r is the first unit in the sample.
4. Consider the list of units of the population as a circular list, i.e., the last
   unit in the list is followed by the first. The other units chosen are r+k,
   r+2k, r+3k, ... until n units are selected.

On-line implementation of this procedure requires an estimate of the total
number of units that can be produced by the process in a particular time
interval. In this case, the day's average production, suitably scaled, can be
used to provide a value for N.
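
The following minimal Python sketch illustrates the circular systematic
selection just described; the population size N and sample size n are
hypothetical.

    import random

    N, n = 200, 8                # hypothetical population and sample sizes
    k = N // n                   # sampling interval

    r = random.randint(1, N)     # random start between 1 and N
    # circular list: wrap around with modular arithmetic (units numbered 1 to N)
    sample = [((r - 1 + i * k) % N) + 1 for i in range(n)]

    print("interval k =", k, "start r =", r)
    print("systematic sample:", sample)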

Advantages

1. Drawing of the sample is administratively easy.
2. It is possible to select a sample without a sampling frame.

Disadvantages

1. If the units possess periodic regularities, then a systematic sample may
   consist of only similar types.
2. If the population is not in random order, one cannot validly estimate
   chance variation from a single systematic sample.

3. Stratified Sampling

- used when the universe of units is made up of heterogeneous units

- the population should be divided, or stratified, into more or less
  homogeneous sub-populations or strata before sampling is done

- consists of selecting a simple random sample or systematic sample from
  each of the sub-populations into which the population has been divided


Steps:

1. Stratify the population into L strata in such a way that each will consist of
   more or less homogeneous units (in this case, stratum i will consist of N_i
   units, i = 1, 2, ..., L).
2. After the population has been stratified, samples should be selected from
   each stratum. The stratum samples taken together constitute the
   stratified sample.

The variable used as a basis for the stratification is called the stratifying
variable.

Allocation Rule: to maintain proportionality, n_i should be calculated as

     n_i = n * (N_i / N),   i = 1, 2, ..., L
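
A small Python sketch of proportional allocation, using hypothetical stratum
sizes; it simply applies the rule above and rounds to whole units.

    # Hypothetical stratum sizes (N_i) and total sample size n
    stratum_sizes = {"shift A": 1200, "shift B": 800, "shift C": 400}
    n = 60

    N = sum(stratum_sizes.values())

    # Proportional allocation: n_i = n * (N_i / N), rounded to the nearest unit
    allocation = {name: round(n * Ni / N) for name, Ni in stratum_sizes.items()}

    print(allocation)    # {'shift A': 30, 'shift B': 20, 'shift C': 10}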

Advantages

1. Stratification may bring about a gain in the precision of the estimates of
   the parameters of the population.
2. It allows for more comprehensive data analysis since information is
   provided for each stratum.

Disadvantages

1. A listing of the population from stratum to stratum is needed.
2. The stratification of the population may mean the need for additional prior
   information about the population and its sub-populations.
3. It is administratively inconvenient.

4. Cluster Sampling

- method wherein units are grouped together to form sub-populations that
  are more or less similar in characteristics to the parent population

- the groupings are called clusters, which serve as sampling units for a
  random sampling or systematic procedure

Steps:

1. Form clusters out of the parent population and assign labels 1 to M.
2. Using a table of random numbers, select m numbers. The numbers
   selected correspond to the sampled clusters.
3. Units within the selected clusters constitute the sample.

Advantages

1. A frame is not needed; only a population list of clusters is required, thus
   listing cost is reduced.
2. Imputed costs due to physical location will be reduced.

Disadvantages

1. The costs and problems of statistical analysis are greater.
2. Estimation procedures are difficult.

5. Multi-Stage Sampling

- a procedure wherein the selection of units is done in stages, principally to
  lessen the imputed costs brought about by the physical location of the
  units to be sampled

- the population is divided into a number of first-stage or primary units, from
  which a sample is drawn; from each selected primary unit, a sample of
  second-stage or secondary units is then drawn

- the universe of units can be divided further into a hierarchy of sampling
  units corresponding to the different sampling stages

Steps:

1. Number the first-stage units consecutively from 1 to N in the frame.
2. Using a table of random numbers, choose the n first-stage units.
3. Number the second-stage units consecutively from 1 to M in the frame for
   each of the n selected first-stage units.
4. Using a table of random numbers, obtain n sets of m random numbers
   each (m less than or equal to M).
5. In each of the n first-stage units, select the m second-stage units
   corresponding to the selected numbers.
6. Continue the same procedure until the desired nth-stage units are
   obtained.

Advantages:

1. Listing cost is reduced.
2. Imputed cost due to physical location is reduced.

Disadvantages:

1. Estimation procedures are difficult, especially when the first-stage units
   are not of the same size.
2. The sampling procedure entails much planning before selection is done.

Exercise

Suppose that your present company has tasked you to design a system that
would reduce the risk of your customer receiving a bad shipment. Using the
concepts that you learned in probability sampling, formulate an easy, simple,
and acceptable sampling plan that will help you carry out your task.


3. PRESENTATION AND ORGANIZATION OF DATA

3.1 Tabular Presentation of Data

Presentation comes next after data collection.

Some Guidelines:

- presentation should capture the very essence of the characteristic being
  studied and should create the necessary impact

- summarization results in some degree of loss of information, in the sense
  that once figures are summarized, recovering the absolute numbers in the
  absence of the original data is next to impossible

3.2 The Frequency Distribution Table

 way of summarizing the mass of data collected

Steps:

1. Determine an adequate number of classes to group the data.

   Suggestion: Sturges' formula:

        k (no. of classes) = 1 + 3.322 log10(n)

   where n = no. of observations.

2. Compute the range.

        R (range) = highest value - lowest value

3. Divide R by k to estimate the approximate class size c, i.e.,

        c = R/k

   (c should be rounded off to the nearest significant digit.)

4. List the lower class limit of the bottom interval. Add the class size to it to
   obtain the lower class limit of the next class interval. (The lower and upper
   class limits define a particular class interval.)

5. List all the class limits and class boundaries by adding the class size to the
   class limits and class boundaries of the previous interval. (The class
   boundaries are called true class limits since they close the gaps existing
   between successive upper and lower limits. They are formed by extending
   the class limits halfway toward each other, i.e., halfway between the upper
   limit of one class and the lower limit of the next.)


6. Determine the class marks of each interval by averaging the class limits or
the class boundaries.

7. Tally the frequencies for each class.

8. Sum the frequency column and check against the total number of
observations.

To highlight the importance of a particular class in terms of magnitude, a
relative frequency column is appended to the frequency distribution table.
The relative frequency for each class is computed by dividing the frequency
entry by the total number of observations.

Example

Consider the following data set of bond pull test results using a certain
machine.

     9.6   11.1  12.3  11.2   9.2
    10.6   10.0  11.7  11.7   8.5
     9.9   11.9  12.1  11.6  10.8
    11.1   12.5  11.5  12.6  11.4
     9.8   11.0  11.2   8.4  10.9

Computational steps:

1. k = 1 + 3.322 * log10(25), which is approximately 6.
2. c = R/k = (12.6 - 8.4)/6 = 0.7.

Table Proper:

Class Limits    Class Boundaries    Frequency
 8.4 -  9.0      8.35 -  9.05           2
 9.1 -  9.7      9.05 -  9.75           2
 9.8 - 10.4      9.75 - 10.45           3
10.5 - 11.1     10.45 - 11.15           6
11.2 - 11.8     11.15 - 11.85           7
11.9 - 12.5     11.85 - 12.55           4
12.6 - 13.2     12.55 - 13.25           1
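
The frequency distribution above can be reproduced with a short Python
sketch; it follows the computational steps (Sturges' formula, class size 0.7,
starting at the minimum value 8.4) and assumes, as the table does, that
class limits carry one decimal place.

    import math

    data = [9.6, 11.1, 12.3, 11.2, 9.2, 10.6, 10.0, 11.7, 11.7, 8.5,
            9.9, 11.9, 12.1, 11.6, 10.8, 11.1, 12.5, 11.5, 12.6, 11.4,
            9.8, 11.0, 11.2, 8.4, 10.9]

    n = len(data)
    k = round(1 + 3.322 * math.log10(n))        # Sturges' formula -> 6
    c = round((max(data) - min(data)) / k, 1)   # approximate class size -> 0.7

    start, i = min(data), 0
    while True:
        lo = round(start + i * c, 1)            # lower class limit
        if lo > max(data):
            break
        hi = round(lo + c - 0.1, 1)             # upper class limit (one-decimal data)
        freq = sum(lo - 0.05 <= x <= hi + 0.05 for x in data)   # count within class boundaries
        print(f"{lo:4.1f} - {hi:4.1f}   {freq}")
        i += 1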


Statistical tables such as the FDT are given an appropriate table title and
number, a formal boxhead, and footnotes. The table title should be as self-
sufficient as possible for descriptive purposes, while the footnotes should give
details about the data content of the FDT.

3.3 Graphical Presentation of Data

A graph is a device for showing numerical values or relationships in pictorial
form.

Types:

1. Line Diagrams or Curves

- basically used for showing trends over time or across a characteristic

2. Bar Charts

- used for comparing categories with each other or numerical values over a
  period of time

- magnitude is represented by the height of the bar

3. Pie Charts

- used for showing the component parts of a whole

4. Pictographs

- numerical figures are compared through the use of symbols or pictures

3.4 Graphical Presentation of the FDT

1. Histogram

- displays the classes on the horizontal axis and the frequencies of the
  classes on the vertical axis

- the frequency of each class is represented by a vertical bar whose height is
  equal to the frequency of the class

Uses

1. The histogram shows how the data scatter and the location around which
   most of the data observations cluster (given by the class with the tallest
   bar).

2. It depicts the general shape of the data and hence gives the data a
   characterization.

If the relative frequency is used instead of the frequency, then the histogram
is called a relative frequency histogram.

2. Frequency Polygon

- constructed by plotting the frequencies against the class marks and
  connecting the plotted points by means of straight lines

- also called a line diagram for the FDT


3. Stem-and-Leaf Display

- both a graphical and a numerical display of data which at the same time
  shows the range and concentration of the data

Steps:

1. Convert each data point x to a new value y using the formula

        y = LCB + (x - LCB)/c

   where LCB = lower class boundary of the class containing x
         c   = class size

2. Sort the converted data from lowest to highest.

3. Split each data point at its decimal point. Digits to the right are called the
   leaves, while digits to the left are called stems.

4. Produce the leaf display of the converted values. Leaves with the same
   stem should be displayed in the same row from lowest to highest, keeping
   a space between the stem and the leaves.

5. Append to the left side of the display the leaf count of each stem.

Examples

1. Consider the following data set of bond pull test results using a certain
   machine.

     9.6   11.1  12.3  11.2   9.2
    10.6   10.0  11.7  11.7   8.5
     9.9   11.9  12.1  11.6  10.8
    11.1   12.5  11.5  12.6  11.4
     9.8   11.0  11.2   8.4  10.9

Computational steps:

1. k = 1 + 3.322 * log10(25), which is approximately 6.
2. c = R/k = (12.6 - 8.4)/6 = 0.7.


Table Proper:

Class Limits    Class Boundaries    Frequency
 8.4 -  9.0      8.35 -  9.05           2
 9.1 -  9.7      9.05 -  9.75           2
 9.8 - 10.4      9.75 - 10.45           3
10.5 - 11.1     10.45 - 11.15           6
11.2 - 11.8     11.15 - 11.85           7
11.9 - 12.5     11.85 - 12.55           4
12.6 - 13.2     12.55 - 13.25           1

Converted Data:

     9.8   11.4  12.5  11.2   9.3
    10.7   10.1  11.9  11.9   8.6
    10.0   11.9  12.2  11.8  11.0
    11.4   12.8  11.7  12.6  11.5
     9.8   11.2  11.2   8.4  11.1

Sorted Data:

     8.4    8.6   9.3   9.8   9.8
    10.0   10.1  10.7  11.0  11.1
    11.2   11.2  11.2  11.4  11.4
    11.5   11.7  11.8  11.9  11.9
    11.9   12.2  12.5  12.6  12.8
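
As a rough illustration, the short Python sketch below turns the sorted
converted values into a stem-and-leaf display (stems are the integer parts,
leaves the tenths digits), with the leaf count on the left as described in step 5.

    from collections import defaultdict

    converted = [8.4, 8.6, 9.3, 9.8, 9.8, 10.0, 10.1, 10.7, 11.0, 11.1,
                 11.2, 11.2, 11.2, 11.4, 11.4, 11.5, 11.7, 11.8, 11.9, 11.9,
                 11.9, 12.2, 12.5, 12.6, 12.8]

    stems = defaultdict(list)
    for v in sorted(converted):
        stem, leaf = divmod(round(v * 10), 10)   # e.g. 11.4 -> stem 11, leaf 4
        stems[stem].append(leaf)

    for stem in sorted(stems):
        leaves = stems[stem]
        print(f"{len(leaves):2d}  {stem:2d} | {' '.join(str(l) for l in leaves)}")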

2. Consider the following data set representing the average wear-out rate of
   blade life in a dicing saw station (mil*1,000,000/cutline).

     596   670    68   536   430
     588   682   345   467   536
     583    74   326    47   406
     593    71   465   568   335
     602    69   388   356   459


4. MEASURES OF CENTRAL TENDENCY

Measures of central tendency, or simply 'averages', are convenient tools for
depicting the location around which data tend to cluster.

They provide a snapshot of the data set without one necessarily having to
examine the actual figures. As such, they are called "representative"
observations.

They provide a "common denominator" for comparing two groups of data.

4.1 Some Measures

1. Mean

 obtained by adding the observations together and dividing the sum by the
number of observations

When to Use:

1. If the data observations form a symmetric distribution.
2. If it is thought that the data come from a normal population (bell-shaped
   curve).

When Not to Use:

1. If the data contain extreme observations (extremity is defined in terms of
   magnitude).
2. If the scale of measurement is of the nominal or ordinal type.

Example

Consider the following data set representing the average wear-out rate of
blade life in a dicing saw station (mil*1,000,000/cutline).

     596   670    68   536   430
     588   682   345   467   536
     583    74   326    47   406
     593    71   465   568   335
     602    69   388   356   459


[Plot of the blade wear-out data omitted.]

     X̄ = 10260 / 25 = 410.4

2. Median

 point that cuts the distribution of observations into two equal parts

Remark: The median is usually calculated depending on the number of
observations. If the number is odd, the median is the middle observation (in
magnitude). If the number is even, then the median is the average of the
two middle observations.

When to Use:

1. If the shape of the distribution deviates mildly from a symmetric


distribution.
2. If the situation calls for a positional measure rather than a ‘representative’
figure.

When Not to Use:

1. If the data exhibit clustering around several locations.


2. If the scale of measurement is of the nominal or ordinal type.

Example (Continuation)

     X̃ = X((25+1)/2) = X(13) = 459


3. Mode

- the value in the distribution which occurs with the greatest frequency

- may be non-unique, in which case the data exhibit clustering around several
  locations

When to Use:

1. If the scale of measurement is of the nominal or ordinal type.
2. If the data exhibit clustering around several locations.

When Not to Use:

If the shape of the distribution is relatively flat.

Example (Continuation)

The value 536 is the only observation that occurs more than once, so the
mode is 536.
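
A minimal Python check of these three measures for the blade wear-out
data, using the standard library statistics module.

    import statistics

    blade = [596, 670, 68, 536, 430, 588, 682, 345, 467, 536,
             583, 74, 326, 47, 406, 593, 71, 465, 568, 335,
             602, 69, 388, 356, 459]

    print("mean   =", statistics.mean(blade))     # 410.4
    print("median =", statistics.median(blade))   # 459
    print("mode   =", statistics.mode(blade))     # 536 (occurs twice)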

4.2 Computational Forms for Grouped Data

1. Mean

Steps:

i) Create an additional column in the FDT by multiplying the frequency entries
   by the corresponding entries of the class mark column.

ii) Sum the entries of the new column formed in (i) and then divide the sum
    by the total number of observations.

2. Median

Steps:

i) Find n/2, or 50% of the data observations falling below the median.

ii) Create a column for cumulative frequencies where the entry for the ith
class is obtained by summing its frequency together with the frequencies
of the classes below it.

iii) Locate the interval that contains the median (median class), i.e., the point
     below which n/2 observations fall.

iv) Once the median class is determined, compute the median as

        Median = L_Md + c * [(n/2 - F_(Md-1)) / f_Md]

     where
     L_Md     = the lower class boundary of the median class
     c        = the class size of the median class
     F_(Md-1) = the cumulative frequency of the class immediately before the
                median class
     f_Md     = the frequency of the median class

3. Mode

Steps:

i) Locate the class which has the highest frequency (modal class).

ii) Compute the mode as

        Mode = L_Mo + c * [(f_Mo - f_1) / (2*f_Mo - f_1 - f_2)]

    where
    L_Mo = the lower boundary of the modal class
    f_Mo = frequency of the modal class
    c    = class size
    f_1  = frequency of the class preceding the modal class
    f_2  = frequency of the class following the modal class

(Examples)
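
As a worked sketch (not part of the original notes), the grouped-data
formulas can be applied to the bond pull FDT built earlier; the class
boundaries and frequencies below are those of that table.

    # Bond pull FDT: lower class boundaries and frequencies
    boundaries = [8.35, 9.05, 9.75, 10.45, 11.15, 11.85, 12.55]
    freqs      = [2,    2,    3,    6,     7,     4,     1]
    c, n = 0.7, sum(freqs)

    # Grouped mean: sum of (frequency * class mark) divided by n
    marks = [lb + c / 2 for lb in boundaries]
    mean = sum(f * m for f, m in zip(freqs, marks)) / n

    # Grouped median: locate the class containing the (n/2)th observation
    cum = 0
    for i, f in enumerate(freqs):
        if cum + f >= n / 2:
            median = boundaries[i] + c * (n / 2 - cum) / f
            break
        cum += f

    # Grouped mode: the modal class is the one with the highest frequency
    i = freqs.index(max(freqs))
    f1 = freqs[i - 1] if i > 0 else 0
    f2 = freqs[i + 1] if i < len(freqs) - 1 else 0
    mode = boundaries[i] + c * (freqs[i] - f1) / (2 * freqs[i] - f1 - f2)

    print(round(mean, 2), round(median, 2), round(mode, 2))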

4.3 Measures of Location

- used to find the location of a specific piece of data in relation to the entire
  set

- values below which a specific fraction or percentage of the observations in
  a given set must fall

Types

1. Percentiles

- values which divide a set of observations into 100 equal parts

2. Deciles

- values that divide a set of observations into 10 equal parts

3. Quartiles

- values that divide a set of observations into 4 equal parts

The median is also known as the 50th percentile, the 5th decile, and the
second quartile.

For ungrouped data, measures of position are determined by inspection, as in
the median case.

4.4 Determination of Positional Measures for Grouped Data

1. Percentile

Steps:

i) Find n(k/100), i.e., k% of the data observations fall below the kth
   percentile.

ii) Using the column of cumulative frequencies, locate the class which
    contains the kth percentile (the Pk-th class).

iii) The kth percentile is computed as

        Pk = L_Pk + c * [(nk/100 - F_(Pk-1)) / f_Pk]

     where
     L_Pk     = lower class boundary of the Pk-th class
     F_(Pk-1) = cumulative frequency of the class preceding the Pk-th class
     c        = class size
     f_Pk     = frequency of the Pk-th class

The deciles and quartiles can be computed directly using the percentile
formula.

2. Decile

The kth decile is computed as

     Dk = P(10k),   k = 1, 2, ..., 9.

3. Quartile

The kth quartile is computed as

     Qk = P(25k),   k = 1, 2, 3.

(Examples)


5. MEASURES OF VARIABILITY

These refer to quantities that describe the scatter of observations about an
average.

If the measure of dispersion is large, then the average is unrepresentative of
the rest of the observations; otherwise, most of the observations are near the
average.

In process control, measures of dispersion are used to construct the upper
and lower control limits of X̄-charts and R-charts. If observations fall within
the limits, the process is said to be stable or in control.

Some Measures:

1. Range

 the difference between the largest and the smallest observations of a data
set

2. Standard Deviation

- the square root of the average of the squared deviations from the mean

- measures the typical distance of the observations from the mean

Squaring the standard deviation yields the variance.

3. Coefficient of Variation (CV)

- a measure of relative variation (unitless) which compares the magnitude of
  the standard deviation to the size of the mean

- commonly used as a measure of variation when comparing different data
  sets

The CV is related to Taguchi's signal-to-noise ratio. By looking at its size, the
optimal combination of factor levels in a parameter design can be obtained.

Concretely, the CV is computed as

     CV = (s / X̄) * 100%

(Examples)
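
A quick Python sketch of these measures for the blade wear-out data; the
sample standard deviation is used here.

    import statistics

    blade = [596, 670, 68, 536, 430, 588, 682, 345, 467, 536,
             583, 74, 326, 47, 406, 593, 71, 465, 568, 335,
             602, 69, 388, 356, 459]

    rng = max(blade) - min(blade)           # range
    s = statistics.stdev(blade)             # sample standard deviation
    var = statistics.variance(blade)        # sample variance (s squared)
    cv = s / statistics.mean(blade) * 100   # coefficient of variation, in percent

    print(f"range = {rng}, s = {s:.2f}, variance = {var:.2f}, CV = {cv:.1f}%")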


Some Characterizations:

Range

- fails to communicate any information about the clustering, or lack of
  clustering, of the values located between the two extremes

- sensitive to sampling variation (tends to be smaller in smaller samples)
  and to extreme observations

Standard Deviation

- less influenced by sampling variation

- the most often used measure of dispersion


6. MEASURES OF SKEWNESS AND KURTOSIS

6.1 Skewness

- measures the degree and direction of asymmetry, or departure from
  symmetry, of a distribution

If the distribution tapers more to the right than to the left, the distribution is
said to be positively skewed or skewed to the right; otherwise, it is said to be
skewed to the left or negatively skewed.

Steps:

1. Ungrouped Data

i) Get the deviations of the observations from the mean.
ii) Cube each deviation.
iii) The skewness is obtained by averaging the cubes and then dividing the
     average by the cube of the standard deviation.

2. Grouped Data

i) Create a column for the deviations from the mean by subtracting the
   mean from the class mark of each interval.
ii) Augment another column by cubing the entries of the column generated
    in (i).
iii) Multiply each entry of the column in (ii) by the corresponding class
     frequency, sum, and divide by the total number of observations.
iv) Divide the quantity in (iii) by the cube of the standard deviation which was
    obtained using the grouped-data formula.

Skewness is useful in judging whether a given set of observations follows a
symmetric or 'normal' distribution. If the skewness value is approximately
zero, then the distribution is said to be symmetric. If the distribution is
skewed, then extreme values are present, and the mean and range will
not be good measures of average and dispersion, respectively.

Tabular Rule:

Skewness    Description of Distribution
  < 0       skewed to the left
  = 0       symmetric
  > 0       skewed to the right

(Examples)

6.2 Kurtosis

- a measure of the degree of peakedness of a distribution

A very peaked distribution is called leptokurtic; a distribution with peakedness
comparable to the normal curve is called mesokurtic; a flat distribution is
called platykurtic.

Steps

1. Ungrouped data

i) Get the deviations of the observations from the mean.
ii) Raise each deviation to the fourth power.
iii) The kurtosis is obtained by averaging the fourth powers and then
     dividing the average by the square of the variance.

2. Grouped data

i) Create a column for the deviations from the mean by subtracting the mean
   from the class mark of each interval.
ii) Augment a column by raising the entries of the column generated in (i) to
    the fourth power.
iii) Multiply each entry of the column in (ii) by the corresponding class
     frequency, sum, and divide by the total number of observations.
iv) Divide the quantity in (iii) by the square of the variance which was
    obtained using the grouped-data formula.

Tabular Rule:

Kurtosis    Description of Distribution
  < 3       platykurtic
  = 3       mesokurtic
  > 3       leptokurtic

If the distribution is leptokurtic, then the majority of the observations are
near the mean; hence the mean is a good representative value.

If the distribution is platykurtic, then the distribution is highly variable and
the mean fails to be a representative value.

If a distribution is mesokurtic and has a skewness value near zero, then the
distribution follows a bell curve. In this case, the mean, median and mode
coincide.
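
A minimal ungrouped-data sketch of the two measures just described,
applied to the blade wear-out data; the population variance (the average of
the squared deviations) is used, matching the definitions above.

    blade = [596, 670, 68, 536, 430, 588, 682, 345, 467, 536,
             583, 74, 326, 47, 406, 593, 71, 465, 568, 335,
             602, 69, 388, 356, 459]

    n = len(blade)
    mean = sum(blade) / n
    dev = [x - mean for x in blade]

    var = sum(d ** 2 for d in dev) / n      # population variance
    sd = var ** 0.5

    skewness = (sum(d ** 3 for d in dev) / n) / sd ** 3
    kurtosis = (sum(d ** 4 for d in dev) / n) / var ** 2

    print(f"skewness = {skewness:.3f}, kurtosis = {kurtosis:.3f}")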


Exercise

Consider the following wedge size measurements of a 3.7 x 3.7 mils pad used
in wire bonding.

Pad No.  Bond Width  Bond Length     Pad No.  Bond Width  Bond Length
   1        2.1         2.8             21       2.0         2.6
   2        2.0         2.7             22       2.2         2.8
   3        1.9         2.8             23       2.2         2.8
   4        2.1         2.8             24       2.2         2.7
   5        2.1         2.7             25       2.1         2.8
   6        2.2         2.8             26       2.0         2.7
   7        2.1         2.7             27       2.3         2.8
   8        2.0         2.8             28       2.2         2.8
   9        2.2         2.8             29       2.1         2.8
  10        2.2         2.6             30       2.1         2.8
  11        2.2         2.8             31       2.1         2.6
  12        2.3         2.7             32       2.0         2.8
  13        2.0         2.8             33       2.0         2.6
  14        2.2         2.7             34       2.0         2.7
  15        2.0         2.8             35       2.0         2.6
  16        2.2         2.8             36       2.1         2.6
  17        2.0         2.8             37       2.0         2.6
  18        2.1         2.7             38       2.0         2.5
  19        2.1         2.7             39       2.0         2.6
  20        2.2         2.8             40       2.0         2.7

1. Compute the corresponding summary statistics (measures of central
   tendency, variability, skewness and kurtosis) using the ungrouped
   method.

2. Construct a frequency distribution for the given data.

3. Using the grouped method, compute the measures of central tendency
   and location.


7. PROBABILITY AND PROBABILITY DISTRIBUTIONS

7.1 Probability Defined

- a numerical quantity used to express the chance that a particular event will
  occur

- assigns a degree of confidence to the reliability of a certain statistic

Properties:

1. A sure event is assigned a probability of one and an impossible event a
   probability of zero.
2. An occurrence will be assigned a probability value between zero and one.
3. A collection of mutually exclusive occurrences will be assigned a probability
   equal to the sum of the probabilities assigned to each occurrence.

A phenomenon or an inquiry is usually modeled by a statistician as an
"experiment" whose outcomes are uncertain. Probabilities are used to gauge
the likelihood of a given outcome of the experiment. For example, in
destructive testing, the lifespan T of the unit being tested is unknown and
can go anywhere from zero to a very large value. Probabilities can be used
to assess the chance that the unit wears out by a given time T.

Some Definitions

1. An event is an outcome of a statistical experiment.
2. An event which cannot be decomposed is a simple event; otherwise, it is
   called a compound event.
3. The sample space is a listing of all possible outcomes of the experiment.
4. If two events do not share a common outcome, they are said to be
   mutually exclusive.
5. Two events are independent if the occurrence of either of them does not
   affect the probability of occurrence of the other.

Notations:

1. Capital letters are usually used to denote events. 'P' is used for
   probability.
2. A ∩ B, or "A and B" - both events A and B will occur.
3. A ∪ B, or "A or B" - event A or event B (or both) will occur.
4. A^c - the complement of A will occur (A does not occur).

7.2 Methods of Assigning Probabilities

1. Frequency Approach

     P(E) = n(E)/N

where n(E) = the number of simple events in E
      N    = the total number of simple events in the sample space

2. Subjective Approach

- the assignment of probabilities is left to the investigator

Some Rules:

1. P(A^c) = 1 - P(A).

2. If A and B are mutually exclusive events, then
        P(A or B) = P(A) + P(B).

   If A and B are arbitrary, then
        P(A or B) = P(A) + P(B) - P(A and B).

3. If A and B are independent, we have
        P(A and B) = P(A)P(B).

4. The probability of A conditioned on B is given by
        P(A|B) = P(A and B)/P(B),
   where the probability of B is nonzero.

5. The multiplicative law is then given by
        P(A and B) = P(A|B)P(B).

Examples

1. Suppose that a unit passes through three inspection gates, say A, B, and
   C. There is a chance of 0.7 that a unit will be declared defective at gates A
   and B, while the corresponding figure for gate C is 0.85. What is the
   probability that the unit will pass through all three gates?

Solution

Let A = the unit will pass through gate A.
    B = the unit will pass through gate B.
    C = the unit will pass through gate C.

Assuming independence of the gates,

     P(A and B and C) = P(A)P(B)P(C)
                      = 0.3 x 0.3 x 0.15
                      = 0.0135


2. In (1), what is the probability that the unit will pass through at least one of
   the gates?

Solution

     P(A or B or C) = P(A or B) + P(C) - P((A or B) and C)

     P(A or B) = P(A) + P(B) - P(A and B)
               = P(A) + P(B) - P(A)P(B)
               = 0.3 + 0.3 - (0.3 x 0.3)
               = 0.51

     P((A or B) and C) = P(A or B)P(C)
                       = 0.51 x 0.15
                       = 0.0765

Thus,

     P(A or B or C) = 0.51 + 0.15 - 0.0765
                    = 0.5835
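
The two results above can be checked with a few lines of Python, using only
the independence and addition rules.

    pA, pB, pC = 0.3, 0.3, 0.15            # probabilities of passing each gate

    p_all = pA * pB * pC                   # independence: pass all three gates
    p_AorB = pA + pB - pA * pB             # addition rule for A or B
    p_any = p_AorB + pC - p_AorB * pC      # addition rule for (A or B) or C

    print(round(p_all, 4))    # 0.0135
    print(round(p_any, 4))    # 0.5835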

3. An inspection procedure calls for the rejection of a lot if inspection yields
   two successive defective materials in the lot. Currently, lots of size 10 are
   being examined with an average quality level of 5% defective. What is the
   probability of rejection if sample sizes of 2 and 5 are used?

Solution

Let A = the event that the lot is rejected.

a) Sample of size 2

     P(A) = (5 x 5)/(100 x 100)
          = 1/400

b) Sample of size 5

     P(A) = (4 + 6 + 4) x (1/400) x (399/400)^3

7.3 Random Variables

- a rule which assigns real numbers to the outcomes of a statistical
  experiment

- used to simplify the calculation of probabilities and to expedite
  mathematical analysis

You can think of the measurements on your population units as outcomes of
a statistical experiment. Since the measurements vary from unit to unit, the
characteristic being measured can be said to be a random variable.

Two Types

1. Discrete

- random variables that can take on only a finite or countable number of
  values

2. Continuous

- random variables that can take on an uncountably infinite number of values

Examples

Discrete                    Continuous
audit points cost           width of door gap
defective wire bonds        chemical reaction time
paint chips per unit        wire pull strength

In quality control parlance, a discrete random variable is referred to as an
attribute, while a continuous random variable is called a variable.

The set of possible values of a random variable generates a distribution
(which can be obtained by constructing the cumulative frequency ogive of its
values). This distribution is both an identification and characterization device.
The majority of calculations involving random variables can be answered if
the distribution is known.

A distribution can come in all shapes and sizes: symmetric, skewed, bell-
shaped, flat, and peaked. In applications, the distributional shape of the
random variable of interest is unknown. A random sample is obtained to
provide an intelligent guess of its shape. Usually, this is done by computing
its summary statistics or plotting the histogram.

Due to the unknown structure of a distribution, it is usually the case that a
known probability distribution is assumed to govern the outcomes of an
experiment. These distributions are typically functions of the population
parameters. Estimates for these parameters are produced by calculating
sample-based statistics. As a consequence, probabilities of events can then
be computed.


7.4 Probability Models

7.4.1 Discrete

7.4.1.1 Binomial Distribution

- used for experiments consisting of n trials where each trial can result in
  either a "success" or a "failure"

Characterization:
- trials are independent of each other
- the probability of a success remains constant from trial to trial
- interest is in the number of successes in n trials

Form:

     f(x) = C(n, x) p^x (1 - p)^(n - x),   x = 0, 1, ..., n

where C(n, x) = n!/[x!(n - x)!] is the number of ways of choosing x out of n
      n = number of trials
      p = probability of success
      x = number of successes out of n

Notes:

1. P(X = x) = f(x), i.e., the probability of the random variable X taking the
   value x is given by f(x).
2. n! = n x (n - 1) x ... x 2 x 1.

Example

Suppose that a process is known to produce conforming items about 90% of
the time and that random sampling is used to select five items to test. What
is the probability that all items tested are conforming? That at least two are
conforming?

Solution

Let X = number of conforming items out of the 5 tested.

Given: n = 5 and p = 0.9.

     P(X = 5) = C(5, 5) (0.9)^5 (0.1)^0
              = 0.59049

     P(X >= 2) = 1 - P(X = 0) - P(X = 1)
               = 1 - C(5, 0) (0.9)^0 (0.1)^5 - C(5, 1) (0.9)^1 (0.1)^4

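Assuming the scipy library is available, the binomial probabilities above can
be verified with scipy.stats.binom.

    from scipy.stats import binom

    n, p = 5, 0.9

    p_all_conforming = binom.pmf(5, n, p)                          # P(X = 5)
    p_at_least_two = 1 - binom.pmf(0, n, p) - binom.pmf(1, n, p)   # P(X >= 2)

    print(round(p_all_conforming, 5))   # 0.59049
    print(round(p_at_least_two, 5))     # approximately 0.99954
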
7.4.1.2 Poisson Distribution

- used for assessing the occurrence of events that can happen within a given
  time/space/volume/area

Characterization:
- the probability of an occurrence is proportional to the length of
  time/space/volume/area
- the probability of at least two occurrences in a very small length of
  time/space/volume/area is negligible

Form:

     f(x) = e^(-λ) λ^x / x!,   x = 0, 1, 2, ...

where λ = the intensity parameter, denoting the average number of
          occurrences within the given time/space/volume/area
      x = the number of occurrences in the given time/space/volume/area

Example

Flaws in a certain fabric occur at a rate of about two flaws per square yard.
i) In a given one-square-yard section of the material, what is the probability
   of finding three or more flaws?
ii) What is the probability of finding three or more flaws in a ten-square-yard
    section of the material?

Solution

Let X = number of flaws in a one-square-yard section.
    Y = number of flaws in a ten-square-yard section.

Given: λ = two per square yard.

i)  P(X >= 3) = 1 - P(X <= 2)
              = 1 - {P(X = 0) + P(X = 1) + P(X = 2)}
              = 1 - {e^(-2) 2^0/0! + e^(-2) 2^1/1! + e^(-2) 2^2/2!}

ii) P(Y >= 3) = 1 - P(Y <= 2)
              = 1 - {P(Y = 0) + P(Y = 1) + P(Y = 2)}
              = 1 - {e^(-20) 20^0/0! + e^(-20) 20^1/1! + e^(-20) 20^2/2!}
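
Assuming scipy is available, both tail probabilities follow from the Poisson
cumulative distribution function.

    from scipy.stats import poisson

    # P(3 or more flaws) = 1 - P(2 or fewer flaws)
    p_one_sq_yd = 1 - poisson.cdf(2, mu=2)    # mean of 2 flaws in one square yard
    p_ten_sq_yd = 1 - poisson.cdf(2, mu=20)   # mean of 20 flaws in ten square yards

    print(round(p_one_sq_yd, 4))    # approximately 0.3233
    print(round(p_ten_sq_yd, 6))    # essentially 1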

7.4.1.3 Hypergeometric Distribution

- like the binomial distribution, it is used for experiments consisting of n
  trials where each trial can result in either a "success" or a "failure"

Characterization:
- the n trials correspond to drawing units, without replacement, from a finite
  population containing a fixed total number of "successful" units
- the probability of a success therefore changes from trial to trial

Form:

     f(x) = [C(m, x) C(N - m, n - x)] / C(N, n),   x = 0, 1, ..., min{n, m}

where N = the population (lot) size
      m = the total number of successes in the population
      n = the number of trials (sample size)
      x = the number of successes in the n trials

Sampling that leads to the Binomial distribution is sampling with replacement,
while sampling that leads to the Hypergeometric distribution is sampling
without replacement.

Example

Suppose that samples of size 5 are drawn from a lot of size N = 100.
Furthermore, suppose that the lot contains 95 conforming units. Calculate
the probability that the sample will not contain any non-conforming units.

Solution

Let X = number of non-conforming units in a sample of size 5.

Given: N = 100, m = 5 (non-conforming units), n = 5.

     P(X = 0) = [C(5, 0) C(95, 5)] / C(100, 5)
              = 0.76958

     P(X >= 1) = 1 - P(X = 0)
               = 1 - 0.76958
               = 0.23042
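
Assuming scipy is available, scipy.stats.hypergeom gives the same result;
its arguments are the lot size, the number of non-conforming units in the
lot, and the sample size.

    from scipy.stats import hypergeom

    N, m, n = 100, 5, 5       # lot size, non-conforming units in the lot, sample size

    p0 = hypergeom.pmf(0, N, m, n)    # P(no non-conforming unit in the sample)

    print(round(p0, 5))       # approximately 0.76958
    print(round(1 - p0, 5))   # P(at least one non-conforming unit)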

7.4.1.4 Negative Binomial Distribution

- used when the interest is in the number of failures observed before the rth
  success; the generating experiment is the Binomial experiment

Form:

     f(x) = C(r + x - 1, x) p^r (1 - p)^x,   x = 0, 1, 2, ...

where r = the required number of successes
      p = probability of success (constant from trial to trial)
      x = number of failures before the rth success is observed

Note: If r = 1, the distribution is known as the geometric distribution.

Example

In a given sampling plan, a lot is not considered for shipping unless it passes
through three inspection gates, each of which uses the same sampling
procedure. If the lot is rejected at a gate, the defectives found in the sample
are reworked immediately and resubmitted with the rest of the lot for another
round of sampling at the same gate. Suppose the sampling plan calls for the
examination of ten units and the lot is rejected if at least three defectives are
found. Given that the current yield is 95%, compute the probability that a
given lot will have to be re-examined four times before being shipped.

Solution

Let X = number of times the lot is rejected before it passes through the
        three gates.
    p = the probability that a lot will pass a given gate in a single inspection.
Assume that the size of the lot is large relative to the sample size.

     p = C(10, 0)(0.05)^0(0.95)^10 + C(10, 1)(0.05)^1(0.95)^9
         + C(10, 2)(0.05)^2(0.95)^8

     P(X = 4) = C(3 + 4 - 1, 4) p^3 (1 - p)^4

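Assuming scipy is available, the same probability can be computed by first
obtaining p from the binomial model of a single gate and then using
scipy.stats.nbinom, which is parameterized by the number of required
successes and the success probability, with the variable counting failures.

    from scipy.stats import binom, nbinom

    # Probability that a lot passes one gate: at most 2 defectives among 10 units
    p = binom.cdf(2, 10, 0.05)

    # P(4 rejections before the 3rd (final) acceptance)
    p_reexamined_four_times = nbinom.pmf(4, 3, p)

    print(round(p, 5))
    print(round(p_reexamined_four_times, 8))
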
7.4.2 Continuous

7.4.2.1 Normal Distribution

- the most well-known and useful of the distributions because of its
  idealization of a 'normal' process

- heavily used in SPC since many processes approximate it

This 'phenomenon', most of the time, is explained as the result of the so-
called central limit effect.

Typical examples: alignment, track width.

Form:

     f(x) = [1 / (σ sqrt(2π))] exp{ -(x - μ)^2 / (2σ^2) }

where μ = the mean of the process
      σ = the standard deviation of the process
      x = the measurement value

Calculation of probabilities for continuous processes involves integration.
Most of the time, the integrals are cumbersome to calculate. Hence, for
applied work, tables have been prepared to ease the calculations.

Standard Normal Table

- used to obtain normal probabilities by standardizing the original
  measurements

Standardization:

     Z = (X - μ) / σ

Examples

1. Let X follow a normal distribution with mean 25 and variance 9. What is
   the probability that X will exceed 31? What is the probability that X will be
   between 21 and 40 (inclusive)?

Solution

Given: μ = 25, σ^2 = 9.

     P(X > 31) = P(Z > (31 - 25)/3)
               = P(Z > 2)
               = 1 - P(Z <= 2)

     P(21 <= X <= 40) = P(X <= 40) - P(X < 21)
                      = P(Z <= 5) - P(Z < -4/3)

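Assuming scipy is available, the standard normal table lookups can be
replaced by calls to scipy.stats.norm.

    from scipy.stats import norm

    mu, sigma = 25, 3

    p_exceed_31 = norm.sf(31, loc=mu, scale=sigma)                  # P(X > 31)
    p_between = norm.cdf(40, mu, sigma) - norm.cdf(21, mu, sigma)   # P(21 <= X <= 40)

    print(round(p_exceed_31, 4))   # approximately 0.0228
    print(round(p_between, 4))     # approximately 0.9088
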
2. In 1988, Motorola Corporation was one of the first recipients of the
   Malcolm Baldrige National Quality Award. This recognition was an off-
   shoot of its '6-sigma' program. Under this policy, conformance to product
   standards is adhered to if the measurement is within 6 sigma limits of the
   mean. If the process follows a normal distribution with mean 8 and
   variance 0.16, what is the probability that a unit will be within 6 sigma
   limits of the mean?

Solution

Let X = the measurement of the unit.

Given: μ = 8, σ^2 = 0.16.

     P(|X - μ| <= 6σ) = P(μ - 6σ <= X <= μ + 6σ)
                      = P(-6 <= Z <= 6)
                      = P(Z <= 6) - P(Z < -6)

Hence, the probability of non-conformance is given by

     1 - [P(Z <= 6) - P(Z < -6)] = P(Z > 6) + P(Z < -6)

Central Limit Theorem

- a result which states that the sampling distribution of the sample mean can
  be characterized by the normal distribution

- makes it possible to state probabilistic statements regarding the behavior
  of the sample mean

Result: If the measurements follow some distribution with finite mean μ and
standard deviation σ, then the sample mean of n observations has an
approximate normal distribution with mean μ and standard deviation
σ/sqrt(n).

Example

A sample of 50 items from a large shipment yielded a mean of 25 units and a
variance of 9 square units. If the true process mean is 23 and the standard
deviation is 2.5, what is the probability that a similar sample of the same
size will exceed the registered sample mean?

Solution

Let X̄ = the sample mean.

Given: μ = 23, x̄ = 25, σ = 2.5, s = 3.

     P(X̄ > 25) = P(Z > (25 - 23)/(2.5/sqrt(50)))

Some Approximations

1. Normal to the Binomial

- used when the original measurements follow the Binomial distribution
  where n is large and p is near 0.5

Approximation: X approximately follows a normal distribution with mean np
and variance np(1 - p).

Example

Consider a binomial random variable X with parameters n = 35 and p = 0.3.
Suppose one wants to find the probability that X exceeds 9.

Solution

Let X = the measurement.

Given: n = 35, p = 0.3.

     P(X > 9) ≈ P(Z > [9.5 - 35(0.3)] / sqrt(35(0.3)(0.7)))

Since a continuous random variable is being used to calculate the probability
of a discrete random variable, a continuity correction is usually applied. If the
event is of the greater-than type, 0.5 is added to the threshold number;
otherwise, 0.5 is subtracted from it.
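
Assuming scipy is available, the quality of the approximation can be checked
against the exact binomial probability.

    from scipy.stats import binom, norm

    n, p = 35, 0.3
    mean, sd = n * p, (n * p * (1 - p)) ** 0.5

    exact = binom.sf(9, n, p)               # exact P(X > 9) = P(X >= 10)
    approx = norm.sf((9.5 - mean) / sd)     # normal approximation with continuity correction

    print(round(exact, 4), round(approx, 4))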

2. Poisson to the Binomial

- used when the original measurements follow the Binomial distribution
  where n is large and p is small (close to zero)

Approximation: X approximately follows a Poisson distribution with mean np.

Example

Consider a Binomial random variable X with parameters n = 100 and
p = 0.01. Suppose one wants to find the probability that X will be at most 5.

Solution

Let X = the measurement.
Given: n = 100, p = 0.01.

     P(X <= 5) ≈ P(X* <= 5),

where X* = a Poisson random variable with mean occurrence np = 1. Thus,

     P(X <= 5) ≈ Σ (from x = 0 to 5) e^(-1) (1)^x / x!


7.4.2.2 Exponential Distribution

- used mostly to model lifetimes and waiting times

- heavily used in reliability theory to measure how long products will last
  before they become unfit for use

- the distribution is skewed to the right with a thin tail

Typical Examples:

1. Lifetime of a particular type of electronic component
2. Arrival time of a single call to a switchboard

Form:

     f(x) = λ e^(-λx),   x > 0

where λ = the arrival/failure rate
      x = the waiting time / lifetime of the unit

Examples

1. To assess the failure rate of an electronic component, many components
   (of the same type) are tested by operating them continuously and
   recording the time when each fails. This procedure is called life testing.
   Suppose that for a particular component, 7 out of 200 units failed in 1,000
   hours of operation. What is the probability that a given unit of this
   component will still be in operation after 1,000 hours?

Solution

Let X = the lifetime of a given unit, in units of 1,000 hours.

Given: λ = 7/200 failures per 1,000 hours.

     P(X > 1) = 1 - P(X <= 1)
              = 1 - ∫[0 to 1] (7/200) e^(-(7/200)t) dt
              = e^(-7/200)
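
Assuming scipy is available, the survival probability can be computed
directly; scipy.stats.expon uses a scale parameter equal to 1/λ.

    from scipy.stats import expon

    rate = 7 / 200                             # failures per 1,000 hours
    p_survive = expon.sf(1, scale=1 / rate)    # P(lifetime > 1, i.e. more than 1,000 hours)

    print(round(p_survive, 4))                 # exp(-0.035), approximately 0.9656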

2. Suppose that calls arrive at a particular switchboard at a rate of 200 calls
   per minute. Assuming that the inter-arrival times are exponential, what
   is the probability that no calls will arrive in the next two minutes?

Solution

Let X = the waiting time until the next call, in units of two minutes.

Given: λ = 400 calls per two-minute interval.

     P(X > 1) = 1 - P(X <= 1)
              = 1 - ∫[0 to 1] 400 e^(-400t) dt
              = e^(-400)

7.4.2.3 Gamma-type Distribution

 used mainly to model the inter-arrival time or waiting time/failure time for
a batch of r units
 used to calculate the probability for the total waiting time/failure time for
units whose individual waiting times/failure times are exponentially
distributed

Form: f(x) = (λ^r / Γ(r)) x^(r-1) e^(-λx), x > 0

where λ = the arrival/failure rate
      Γ(r) = (r − 1)!, for r integer
      x = the waiting time/failure time for r units.

Example

Suppose that the lifetimes of a particular electronic unit are exponentially
distributed with a rate of 2 failures in 6 hours. What is the probability that
one has to wait for more than four hours to observe six failures of the said
electronic unit?

Solution

Let X = the total lifetime of six electronic units.

Given: λ = 2/6 failures per hour, r = 6 units.

P(X > 4) = 1 − P(X ≤ 4)
         = 1 − ∫₀⁴ ((2/6)⁶ / Γ(6)) t⁵ e^(-(2/6)t) dt

7.4.2.4 Chi-Square Distribution

 arises when n independent standard normal variates are squared each


and the squares summed
 the sum of the squares is distributed as chi-square with n degrees of
freedom


 main reference distribution when doing goodness of fit tests (testing


whether distributional assumption fits the observed data) and testing
variances

As the calculation of chi-square probabilities is difficult, a table of chi-square


probabilities has been prepared and can be found in most elementary
statistics books. The rows in the table represent degrees of freedom, while
the columns represent the corresponding tail probabilities. The entries are
the upper percentage points corresponding to a given tail probability and
degrees of freedom.

Notation: χ²(n) = chi-square random variable with n degrees of freedom

Examples

i) P( χ²(15) > 11.72 ) = 0.7

ii) P( 10.85 < χ²(20) ≤ 28.41 ) = P( χ²(20) ≤ 28.41 ) − P( χ²(20) ≤ 10.85 )
                                = 0.9 − 0.05
                                = 0.85

iii) P( χ²(5) ≤ 3.00 ) = 0.3

7.4.2.5 Student’s t - Distribution

 arises when a standard normal variate is divided by the root of the ratio of
a chi-square variate and its degree of freedom, the two variates being
independent
 main reference distribution when testing the significance of the
coefficients in linear models and testing means under small sample sizes

Notation : t(n) = t random variable with n degrees of freedom

As in the case of chi-square variates, t tables abound to facilitate the


computation of probabilities. In published tables, the rows represent the
degrees of freedom, while the columns represent the corresponding upper
tail probabilities. The entries are the upper percentage points corresponding
to a given tail probability and degrees of freedom.

Examples

i) P( t(7) > 2.365 ) = 0.025

ii) P( 1.706 < t(26) ≤ 2.056 ) = P( t(26) ≤ 2.056 ) − P( t(26) ≤ 1.706 )
                               = 0.975 − 0.95
                               = 0.025


iii) P( t(400) ≤ 1.96 ) = 1 − P( t(400) > 1.96 )
                        ≈ 1 − 0.025
                        = 0.975

7.4.2.6 F - Distribution

 arises when the ratio of two independent chi-square variates divided by


their respective degrees of freedom is taken
 used as reference distribution in Analysis of Variance (ANOVA) problems,
and testing equality of variances

Tables involving 0.1 and 0.05 tail probabilities are published for F distribution.
The rows represent the denominator degrees of freedom, while the columns
represent the numerator degrees of freedom. The entries represent the
upper percentage point for a particular numerator and denominator degrees
of freedom.

Notation : F(v1,v2) = F random variable with v1 numerator degrees of


freedom and v2 denominator degrees of freedom

Note: F1-α(v1, v2) = 1/Fα(v2, v1).

Examples

i) P( F(4,24) > 2.19 ) = 0.1

ii) P( 1/2.33 < F(15,20) ≤ 2.2 ) = P( F(15,20) ≤ 2.2 ) − P( F(15,20) ≤ 1/2.33 )
                                 = 0.95 − P( F(20,15) ≥ 2.33 )
                                 = 0.95 − 0.05
                                 = 0.90

iii) P( F(10,10) ≤ 2.32 ) = 0.9
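The table-based chi-square, t and F examples above can be verified with the short sketch below; it assumes Python with scipy, which the original notes do not use.

from scipy.stats import chi2, t, f

print(round(chi2.sf(11.72, 15), 2))   # upper tail of chi-square(15) at 11.72, about 0.70
print(round(t.sf(2.365, 7), 3))       # upper tail of t(7) at 2.365, about 0.025
print(round(f.sf(2.19, 4, 24), 2))    # upper tail of F(4, 24) at 2.19, about 0.10
print(round(f.cdf(2.32, 10, 10), 2))  # P(F(10, 10) <= 2.32), about 0.90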

Exercises

1. A carton contains a dozen electric light bulbs including one that is


defective. In how many ways can two of the light bulbs be selected so
that
i) the defective bulb is not included;
ii) the defective bulb is included.


2. Ten identical personal computers are in the inventory of a dealer, and one
has a hidden defect. If three are to be shipped, and the computers are
selected in such a way that each has the same probability of being
shipped, find
i) the probability that a computer with a hidden defect will be shipped;
ii) the probability that all the computers that will be shipped are defect-
free.

3. The probabilities that 0, 1, 2, 3, 4, 5 or at least 6 private aircraft will land at
   a small airport on a certain day are 0.003, 0.009, 0.090, 0.158, 0.197,
   0.261 and 0.282, respectively. What are the probabilities that
   i) at least five private aircraft will land;
   ii) at least two private aircraft will land;
   iii) from 2 to 5 private aircraft will land?

4. The probability that a part will turn out to be pitted is 0.05 and the
   probability that it will crack is 0.20. If the occurrence of these two types of
   defects is independent of each other, find the probability that the part will
   be defective.

5. A large lot of parts is rejected by your customer and found to be 20%
   defective. What is the probability that the lot would have been accepted
   by the following sampling plan: sample size = 10, accept if there are no
   defectives, and reject if there are one or more defectives?


8. POINT ESTIMATION

The need to know the actual values of population parameters brings into focus
the problem of estimation.

Statistics generated out of the data collected are considered as estimates of


the population parameter.

The process of generating estimates can be likened to a guessing game. The


methods of statistics ensure that the guesses are scientific and can be
considered as “best” under the prevailing circumstances.

An estimator is a rule for assigning a value to the collected data. The value is
called an estimate.

Questions:

1. How do you come up with a best estimator?
2. Would you prefer a single estimate (point estimate) or a range of estimates
   (interval estimate)?
3. How will an estimator fare when the sample size is small or sufficiently
   large?

8.1 Some Criteria

1. Consistency

An estimator is said to be consistent when it eventually yields a value equal
to the actual parameter value as the sample size becomes very large.

If the parameter value is considered as the bull's eye of a dart game, then
consistency means the darts hitting the bull's eye in the long run.

2. Unbiasedness

An estimator is said to be unbiased if it generates estimates which equal the
parameter value on the average.

Following the dart game analogy, unbiasedness means the darts hitting the
bull's eye on the average.

3. Sufficiency

An estimator is sufficient if it successfully reduces the dimension of the data


without losing pertinent information.

A sufficient estimator contains the same amount of details as the original set
of data.


4. Minimum Variance

An estimator is a minimum variance estimator if it has the least variance


among all possible estimators of the parameter.

Minimum variance is sometimes equated to precision. Least variability


means that the dart hits are close to each other.

Due to external reasons and errors in the measurement process, an
estimator may exhibit some degree of bias. Bias is generally defined as the
average distance of an estimator from its target value. It affects the overall
accuracy of the estimator through the relation

mean squared error = variance + bias²

From the relation above, maximum accuracy (minimum mean squared error)
can be obtained only if both the variance and the square of the bias are kept
small simultaneously.

Examples

1. The sample mean, X , is a consistent, unbiased, sufficient, and minimum


variance estimator for the population mean, while the mode and median
are not.

2. The sample variance, s 2, is a consistent, unbiased, sufficient, and


minimum variance estimator for the population variance.

8.2 Estimation of the Mean and Variance

For simple characterization of the population of interest, it is enough to


estimate the population mean and variance.

The mean gives a representative value while the variance gives a measure of
the scatter of observations around this average.

Forms:

1. Population Mean

   μ = E(X) = Σx x f(x)                          if X is discrete

   μ = E(X) = ∫ x f(x) dx over (−∞, ∞)           if X is continuous

2. Variance

   σ² = E(X − μ)² = Σx (x − μ)² f(x)                   if X is discrete

   σ² = E(X − μ)² = ∫ (x − μ)² f(x) dx over (−∞, ∞)    if X is continuous


where E = expectation operator


f = density/mass function of the random variable X.

The mean and variance change as the distributional assumptions on X


change.

Summary of Means and Variances for Various Distributions

Distribution              Mean                 Variance

Discrete
1. Binomial               np                   np(1 − p)
2. Poisson                λ                    λ
3. Hypergeometric         nm/N                 nm(N − m)(N − n) / [N²(N − 1)]
4. Negative Binomial      r(1 − p)/p           r(1 − p)/p²

Continuous
1. Normal                 μ                    σ²
2. Exponential            1/λ                  1/λ²
3. Gamma                  r/λ                  r/λ²
4. Chi-square             n                    2n
5. t-distribution         0                    n/(n − 2)
6. F-distribution         v2/(v2 − 2)          2v2²(v1 + v2 − 2) / [v1(v2 − 2)²(v2 − 4)]

Examples

1. The gap 1 wafer thickness is measured for 12 batches. The table


below gives the measurements.

Shift 1 2 3 4 5 6
Thickness 216 212 209 216 207 210

Shift 7 8 9 10 11 12
Thickness 215 204 195 210 201 198

The average thickness is given by


X̄ = (Σi=1..12 Xi) / 12
   = (216 + 212 + ... + 198) / 12
   = 207.75.

The sample variance is given by the short-cut formula

s² = [12 ΣXi² − (ΣXi)²] / [12(11)]
   = 48.75.

Assuming that wafer thickness is normally distributed, μ is estimated as
207.75 and σ² as 48.75.

2. Wafer batches of size twenty were inspected for scratches. Wafers in each
   batch were also classified as defective or nondefective. A total of 10
   batches were inspected, one for each shift. The data summary is given in
   the table below.

Shift 1 2 3 4 5 6
No. of Scratches 27 23 30 28 29 31
No. of Defectives 6 3 4 5 3 4

Shift 7 8 9 10
No. of Scratches 37 29 36 27
No. of Defectives 4 5 4 3

The average number and variance of the scratches are given by

X̄ = (Σi=1..10 Xi) / 10
   = (27 + 23 + ... + 27) / 10
   = 29.7

and

s² = [10 ΣXi² − (ΣXi)²] / [10(9)]
   = 17.57.


On the other hand, the average and variance of the number of defectives in a
wafer batch of size twenty are given by

X̄ = (Σi=1..10 Xi) / 10
   = (6 + 3 + ... + 3) / 10
   = 4.1

and

s² = [10 ΣXi² − (ΣXi)²] / [10(9)]
   = 0.99

If the number of scratches is assumed to be distributed as Poisson, then λ is
estimated as 29.7. On the other hand, if the number of defectives is assumed
to be binomially distributed, then p is estimated as 4.1/20 = 0.205.
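The sample means and variances above are quick to reproduce with Python's standard library alone; the sketch below is an illustration and is not part of the original notes.

from statistics import mean, variance

thickness = [216, 212, 209, 216, 207, 210, 215, 204, 195, 210, 201, 198]
scratches = [27, 23, 30, 28, 29, 31, 37, 29, 36, 27]
defectives = [6, 3, 4, 5, 3, 4, 4, 5, 4, 3]

print(mean(thickness), variance(thickness))    # 207.75 and 48.75
print(mean(scratches), variance(scratches))    # 29.7 and about 17.57
print(mean(defectives), variance(defectives))  # 4.1 and about 0.99
print(mean(defectives) / 20)                   # estimate of p for batches of 20, 0.205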

8.3 The Standard Deviation

Although the variance gives a measure on how far observations are from the
average, the real distance is given by the standard deviation.

One can make use of the sample standard deviation to estimate the
population standard deviation. However, unlike the sample variance it is
biased and inefficient.

8.4 The Sample Range

The range does not contain the same amount of information as the variance
or the standard deviation in assessing variability. What we know, though, is
that if the range of values is small, variability would also tend to be small.

In variable control chart construction, the range is used as an input for
constructing the upper and lower control limits in lieu of the standard
deviation.

The population range is usually estimated by the sample range, which is
biased and inefficient, especially when the support of the distribution is the
entire real line.


9. INTERVAL ESTIMATION

Unlike point estimation, interval estimation provides a range of values which


can be used as educated guesses to the true parameter value.

A range of estimates as opposed to a single estimate provides greater


confidence of hitting the actual parameter value. Hence, interval estimates
are also judged according to their corresponding confidence levels (usually
expressed in terms of probabilities).

9.1 Some Criteria

1. Narrow Width

An interval estimator with a narrow width will be more informative since it


will connote greater precision.

2. Accuracy

An interval estimator is considered accurate if its probability of capturing the


correct parameter value is higher than its probability of capturing an
incorrect parameter value.

3. Unbiased

An interval estimator is considered unbiased if its probability of capturing the
correct parameter value is at least as high as the desired confidence level.

An interval estimator is defined by a lower confidence bound (LCB) and an


upper confidence bound (UCB). Values in between these are regarded as
possible estimators for the parameters of interest.

LCB = point estimator - k*standard error


UCB = point estimator + k*standard error

where k is a percentage point of the distribution followed by the point
estimator, corresponding to some confidence level.

Note: The standard error is the standard deviation of the point estimator.

9.2 Normal Population

Suppose the basic measurement follows a normal distribution, where the
population mean is the parameter of interest. The table below gives a
summary of the (1 − α)100% confidence interval estimators for the mean.


1. LCB = X̄ − zα/2·σ/√n,  UCB = X̄ + zα/2·σ/√n
   (if the population standard deviation σ is known)

2. LCB = X̄ − tα/2(n − 1)·s/√n,  UCB = X̄ + tα/2(n − 1)·s/√n
   (if the population standard deviation σ is unknown)

3. LCB = X̄ − zα/2·s/√n,  UCB = X̄ + zα/2·s/√n
   (if the sample size n is large, > 120, even if σ is unknown)

Here, P(Z > zα) = α and P(t > tα(n − 1)) = α.

Example

Assume that track widths are distributed normally. Based on a sample of
undetermined size, an average width of 28.2 units and a standard deviation of
3.8 units were obtained. Construct a 95% confidence interval for the
population mean if n = 125. How about if n = 60?

Solution

Given: α/2 = 0.025, X̄ = 28.2 and s = 3.8.

i) n = 125

Since n is large (> 120), formula 3 applies even though the population
standard deviation σ is unknown, so Z0.025 = 1.96 is used.

95% CI = (28.2 − 1.96 × 3.8/√125, 28.2 + 1.96 × 3.8/√125)
       ≈ (27.53, 28.87)

Interpretation: If a sample of size 125 is repeatedly obtained from the
process and the corresponding confidence intervals are constructed, then
95% of the CIs constructed will contain the true value of the population mean.

ii) n = 60

Since σ is unknown and n is not large, formula 2 applies with
t0.025(59) ≈ 2.001.

95% CI = (28.2 − 2.001 × 3.8/√60, 28.2 + 2.001 × 3.8/√60)
       ≈ (27.22, 29.18)
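A minimal sketch of these two interval calculations, assuming Python with scipy, is given below; the critical values are obtained from the normal and t quantile functions rather than from tables.

from math import sqrt
from scipy.stats import norm, t

xbar, s = 28.2, 3.8

# n = 125: large sample, z-based interval (formula 3)
n = 125
z = norm.ppf(0.975)
print(xbar - z * s / sqrt(n), xbar + z * s / sqrt(n))

# n = 60: sigma unknown, t-based interval with n - 1 degrees of freedom (formula 2)
n = 60
tcrit = t.ppf(0.975, n - 1)
print(xbar - tcrit * s / sqrt(n), xbar + tcrit * s / sqrt(n))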

9.3 Two Independent Normal Populations


Population groups are usually differentiated through comparison of their
means and variances. Suppose measurements are taken from two independent
normal populations; denote by X̄i and si the sample mean and standard
deviation, respectively, of the measurements from population i, i = 1, 2. The
table below gives a summary of the (1 − α)100% interval estimators for the
difference between the means μ1 − μ2.

1. LCB = X̄1 − X̄2 − zα/2·√(σ1²/n1 + σ2²/n2)
   UCB = X̄1 − X̄2 + zα/2·√(σ1²/n1 + σ2²/n2)
   (if σ1 and σ2 are known)

2. LCB = X̄1 − X̄2 − tα/2(n1 + n2 − 2)·Sp·√(1/n1 + 1/n2)
   UCB = X̄1 − X̄2 + tα/2(n1 + n2 − 2)·Sp·√(1/n1 + 1/n2)
   (if σ1 and σ2 are unknown but assumed to be equal), where

   Sp² = [(n1 − 1)s1² + (n2 − 1)s2²] / (n1 + n2 − 2)

3. LCB = X̄1 − X̄2 − t'α/2·√(s1²/n1 + s2²/n2)
   UCB = X̄1 − X̄2 + t'α/2·√(s1²/n1 + s2²/n2)
   (if σ1 and σ2 are unknown but assumed to be unequal), where

   t'α/2 = [ (s1²/n1)·tα/2(n1 − 1) + (s2²/n2)·tα/2(n2 − 1) ] / (s1²/n1 + s2²/n2)

4. LCB = X̄1 − X̄2 − zα/2·√(s1²/n1 + s2²/n2)
   UCB = X̄1 − X̄2 + zα/2·√(s1²/n1 + s2²/n2)
   (if σ1 and σ2 are unknown but n1 and n2 are sufficiently large, n1, n2 ≥ 120)

Example

Two sample lots were taken out of two areas producing the same make of ICs
in a manufacturing plant. The samples were subjected to accelerated testing
to determine whether the two areas produce ICs with the same life span.
Results showed that Area 1, with a sample of 21 chips, yielded a mean life
span of 427 units with a standard deviation of 14 units, while a sample of 30
chips from Area 2 yielded a mean life span of 400 units with a standard
deviation of 9 units. Construct 95% CIs for the difference of the mean life
spans if the population standard deviations are assumed to be equal and
assumed to be not equal.

Solution


Given: n1 = 21, X̄1 = 427, s1 = 14
       n2 = 30, X̄2 = 400, s2 = 9.

i) Population Standard Deviations Are Assumed Equal

t0.025(49) = 2.01

Sp² = [(21 − 1) × 14² + (30 − 1) × 9²] / (21 + 30 − 2) ≈ 127.94, so Sp ≈ 11.31

95% CI = (427 − 400 − 2.01 × Sp × √(1/21 + 1/30),
          427 − 400 + 2.01 × Sp × √(1/21 + 1/30))
       ≈ (20.5, 33.5)

ii) Population Standard Deviations Are Assumed Unequal

t0.025(20) = 2.086, t0.025(29) = 2.045

t'0.025 = [ (14²/21) × 2.086 + (9²/30) × 2.045 ] / (14²/21 + 9²/30) ≈ 2.077

95% CI = (427 − 400 − t'0.025 × √(14²/21 + 9²/30),
          427 − 400 + t'0.025 × √(14²/21 + 9²/30))
       ≈ (19.8, 34.2)

9.4 Estimating the Variance of Normal Populations

The problem of variance estimation arises when the process variability is of
more interest than the process level. This is very common in manufacturing
industries, where process capability is always of interest: the lower the
deviations from the product's specs, the better. The table below gives a
summary of the (1 − α)100% confidence level interval estimates for the
variance.

1. LCB = (n − 1)s² / χ²α/2(n − 1)
   UCB = (n − 1)s² / χ²1-α/2(n − 1)
   (if the process variance σ² is of interest)

2. LCB = (s1²/s2²) / Fα/2(n1 − 1, n2 − 1)
   UCB = (s1²/s2²) / F1-α/2(n1 − 1, n2 − 1)
   (if the ratio of two variances, σ1²/σ2², is of interest)


Example

Process engineers in a particular plant were concerned whether operators at
the QC gate and the 100% inspection gate have the same level of efficiency
in identifying defective materials. The two areas of inspection report on the
average the same level of defectives; however, the reported defectives vary
wildly, maybe because of fatigue. Data culled showed that on a 21-day basis,
the QC gate has a variance of 1,600; while on a 16-day basis, 100% inspection
has a variance of 1,225. Compute a 95% CI for the ratio of the two variances.

Solution

Given: s1² = 1600, s2² = 1225
       n1 = 21, n2 = 16
       F0.025(20, 15) = 2.76, F0.975(20, 15) = 1/F0.025(15, 20) = 1/2.57.

95% CI = ( (1600/1225) / 2.76, (1600/1225) × 2.57 )
       = (0.473, 3.36).

The result shows that it is possible that the two groups of operators have the
same efficiency, since the interval contains 1, the point at which σ1² = σ2².
However, the interval has more values greater than 1; hence, it is more likely
that σ1² > σ2². Thus, the QC gate reports more variable defectives and so is
more efficient in the customer's eyes, but less efficient in the management's
eyes.

9.5 Estimating Proportions

If the interest is focused on an attribute which can result in either a success or
a failure, then estimation of the proportion of successes will be of interest. For
example, the proportion of product defectives turned out by a plant will
always be of interest to the manufacturers. The table below lists a summary
of the (1 − α)100% confidence interval formulas to be used in estimating
proportions and the difference between two proportions.

1. LCB = p̂ − zα/2·√(p̂(1 − p̂)/n)
   UCB = p̂ + zα/2·√(p̂(1 − p̂)/n)
   (if n is sufficiently large, n ≥ 30), where

   p̂ = number of successes / n

2. LCB = p̂1 − p̂2 − zα/2·√(p̂1(1 − p̂1)/n1 + p̂2(1 − p̂2)/n2)
   UCB = p̂1 − p̂2 + zα/2·√(p̂1(1 − p̂1)/n1 + p̂2(1 − p̂2)/n2)
   (if ni is sufficiently large, ni ≥ 30, i = 1, 2), where

   p̂i = number of successes in group i / ni

Example

A client has narrowed down his choices to two plants, A and B, for
subcontracting a portion of his production load. The client made random
inspections of the prospective subcontractors and found 50 defectives out of
1,000 units in plant A and 100 defectives out of 1,250 units in plant B. If he
were to use these data, which plant would he choose?

Solution

Given: n1 = 1000, p̂1 = 50/1000 = 0.05
       n2 = 1250, p̂2 = 100/1250 = 0.08, Z0.025 = 1.96.

95% CI = (0.05 − 0.08 − 1.96·√(0.05(0.95)/1000 + 0.08(0.92)/1250),
          0.05 − 0.08 + 1.96·√(0.05(0.95)/1000 + 0.08(0.92)/1250))
       ≈ (−0.050, −0.010)

Since the entire interval lies below zero, plant A has the lower defective
proportion, so the client would choose plant A.


10. HYPOTHESIS TESTING

Beliefs and suppositions usually arise when trying to characterize a
population. This often happens when a decision maker makes an inference
regarding a parameter of interest. Structurally speaking, the inference
comes in the form of hypotheses. Data are then collected and, using them,
the formulated inference is either rejected or accepted.

Example

In process control, a process engineer may have suspicion that the process is
turning out units that do not conform to specs. This hypothesis may be
restated in terms of the mean or variance of some measurement on the units.
A sample is then collected and based on some calculated statistic, the
process may be declared as either within control or out of control.

10.1 Some Terminologies

1. Null Hypothesis (Ho)

 conjecture or belief being tested, usually stated in terms of conditions


presumed to be true in the population of interest

2. Alternative Hypothesis (Ha)

 complement of the null hypothesis, usually considered as a fallback


whenever the null hypothesis is rejected
 to be considered true, data must exhibit extreme evidence supporting it

3. Test Statistic

 value computed from the data whose principal use is to measure the
difference between the data and what is expected in the null hypothesis

Form: Statistic = (Observed − Expected) / SE

where SE = standard error

Note: The value of the test statistic varies as the sample varies and is hence
a random variable. Thus, it has a distribution which can be used as a
reference for predicting its values.

4. Rejection Region

 range of values which if achieved by the test statistic will instruct the
decision maker to reject the null hypothesis in favor of the alternative
hypothesis


5. Acceptance Region

 range of values which if achieved by the test statistic will instruct the
decision maker not to reject the null hypothesis

6. Level of Significance of a Test

 expressed in terms of probability


 gives the chance of getting evidence against the null hypothesis

Rejection of the null hypothesis implies that sufficient evidence has been
found to warrant its rejection. Non-rejection of the null hypothesis, on the
other hand, implies that not enough evidence has been found.

10.2 Truth Table

The table below summarizes the result of a testing situation.

                                      Possible Condition of the Null Hypothesis

                                      Ho True             Ho False
Possible Action   Do not reject Ho    Correct Action      Type II Error
                  Reject Ho           Type I Error        Correct Action

In any testing scenario, the two errors indicated above can occur. The goal is
that these errors be minimized in any testing situation.

A good test is one that minimizes the probability of rejecting a true hypothesis
and maximizes the probability of rejecting a false hypothesis.

Since both errors cannot be minimized simultaneously, the usual approach is
to set the level of significance (the Type I error probability) to a small value
and then maximize the probability of rejecting a false hypothesis.

Flow Diagram

State Hypotheses

Gather Data

Select Test Statistic

State Decision Rule
        ↓
Calculate Test Statistic
        ↓
Do Not Reject Ho  ←  Make Statistical Decision  →  Reject Ho
        ↓                                              ↓
Conclude Ho May Be True                      Conclude Ha Is True

10.3 Testing the Mean of a Normal Population

If measurements are believed to be normal and a hypothesis about the true


population mean is to be made, then the null hypothesis will have the
following form:

Ho: μ = μ0

The table below summarizes the corresponding decision making elements for
testing Ho.

1. If the population standard deviation σ is known, use Z = (X̄ − μ0)/(σ/√n):
   reject Ho if Z > zα        when Ha: μ > μ0
   reject Ho if Z < −zα       when Ha: μ < μ0
   reject Ho if |Z| > zα/2    when Ha: μ ≠ μ0

2. If the population standard deviation σ is unknown, use t = (X̄ − μ0)/(s/√n):
   reject Ho if t > tα(n − 1)       when Ha: μ > μ0
   reject Ho if t < −tα(n − 1)      when Ha: μ < μ0
   reject Ho if |t| > tα/2(n − 1)   when Ha: μ ≠ μ0

3. If σ is unknown and n is large (n ≥ 120), use Z = (X̄ − μ0)/(s/√n):
   reject Ho if Z > zα        when Ha: μ > μ0
   reject Ho if Z < −zα       when Ha: μ < μ0
   reject Ho if |Z| > zα/2    when Ha: μ ≠ μ0

Example

In the track width example, suppose that it is desired to test at level 0.05
whether the sample was obtained from a population with an average width of
26.5 units, where n = 25.


Solution

Given: α/2 = 0.025, s = 3.8
       X̄ = 28.2, n = 25, μ0 = 26.5
       t0.025(24) = 2.064

Since the population standard deviation is unknown, we calculate the test
statistic

t = (28.2 − 26.5) / (3.8/√25) ≈ 2.24
If the alternative taken is μ ≠ 26.5, then since t = 2.24 > 2.064 we say that
we have sufficient evidence to assert that the mean track width is not 26.5.

If the test level is to be based on α = 0.01, which is stricter, then
t0.005(24) = 2.7969. Hence, since t = 2.24 < 2.7969, we do not have sufficient
evidence to reject the assertion that the mean track width is 26.5.

If we take the alternative μ > 26.5 at α = 0.05, then t0.05(24) = 1.7109.
Thus, since t = 2.24 > 1.7109, we reject the null hypothesis and assert that
the average track width is greater than 26.5.
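The one-sample t test above can be checked with the short sketch below, which assumes Python with scipy; the quantile calls reproduce the table values quoted in the discussion.

from math import sqrt
from scipy.stats import t

xbar, s, n, mu0 = 28.2, 3.8, 25, 26.5
tstat = (xbar - mu0) / (s / sqrt(n))
print(round(tstat, 2))                                        # about 2.24
print(t.ppf(0.975, n - 1), t.ppf(0.995, n - 1), t.ppf(0.95, n - 1))
# critical values of roughly 2.064, 2.797 and 1.711 used above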

10.4 Testing the Mean Difference of Two Independent Normal Populations

If the measurements are taken from two independent samples and a
conjecture on their mean difference is entertained, then the null hypothesis
will take the following form:

Ho: μ1 = μ2

where μi is the population mean of the ith population, i = 1, 2. The table
below summarizes the corresponding decision making elements for testing
Ho.

1. If σ1 and σ2 are known, use Z = (X̄1 − X̄2) / √(σ1²/n1 + σ2²/n2):
   reject Ho if Z > zα        when Ha: μ1 > μ2
   reject Ho if Z < −zα       when Ha: μ1 < μ2
   reject Ho if |Z| > zα/2    when Ha: μ1 ≠ μ2

2. If σ1 and σ2 are unknown but assumed to be equal, use
   t = (X̄1 − X̄2) / (Sp·√(1/n1 + 1/n2)),
   where Sp² = [(n1 − 1)s1² + (n2 − 1)s2²] / (n1 + n2 − 2):
   reject Ho if t > tα(n1 + n2 − 2)       when Ha: μ1 > μ2
   reject Ho if t < −tα(n1 + n2 − 2)      when Ha: μ1 < μ2
   reject Ho if |t| > tα/2(n1 + n2 − 2)   when Ha: μ1 ≠ μ2

3. If σ1 and σ2 are unknown but assumed to be unequal, use
   t = (X̄1 − X̄2) / √(s1²/n1 + s2²/n2),
   where tα' = [ (s1²/n1)·tα(n1 − 1) + (s2²/n2)·tα(n2 − 1) ] / (s1²/n1 + s2²/n2):
   reject Ho if t > tα'       when Ha: μ1 > μ2
   reject Ho if t < −tα'      when Ha: μ1 < μ2
   reject Ho if |t| > tα/2'   when Ha: μ1 ≠ μ2

4. If σ1 and σ2 are unknown but n1 and n2 are sufficiently large (n1, n2 ≥ 120),
   use Z = (X̄1 − X̄2) / √(s1²/n1 + s2²/n2):
   reject Ho if Z > zα        when Ha: μ1 > μ2
   reject Ho if Z < −zα       when Ha: μ1 < μ2
   reject Ho if |Z| > zα/2    when Ha: μ1 ≠ μ2

Example

In the accelerated testing example, suppose that the difference of the mean
life spans of the units coming from the two areas is to be tested. Suppose
that instead of n1 = 21 we have n1 = 210 and instead of n2 = 30 we have
n2 = 300. Use a level of significance equal to 0.05.

Solution

Given: n1 = 210, X̄1 = 427, s1 = 14
       n2 = 300, X̄2 = 400, s2 = 9
       Z0.025 = 1.96

To test: Ho: μ1 = μ2.


Since the population standard deviations are unknown but n1 and n2 are large,
we calculate the test statistic

Z = (427 − 400) / √(14²/210 + 9²/300)
  ≈ 24.61

If the alternative is taken as μ1 ≠ μ2, then since Z = 24.61 > 1.96 we say that
the average life spans differ from each other.

If the alternative is μ1 > μ2, then since Z = 24.61 > 1.645 = Z0.05 we say that
the average life span of units coming from the first area is greater than the
average life span of the units coming from the second area.
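The large-sample two-mean test above can be reproduced with the sketch below, assuming Python with scipy.

from math import sqrt
from scipy.stats import norm

n1, x1, s1 = 210, 427, 14
n2, x2, s2 = 300, 400, 9
z = (x1 - x2) / sqrt(s1**2 / n1 + s2**2 / n2)
print(round(z, 2))                       # about 24.61
print(norm.ppf(0.975), norm.ppf(0.95))   # two-sided and one-sided 5% critical values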

10.5 Testing the Variance of Normal Populations

For the evaluation of the population variance, one can do a test on the
variance of a normal population, where the null hypothesis is expressed as
Ho: σ² = σ0²   (*)
or a test on the equality of the variances of two independent populations, in
which case Ho is expressed as
Ho: σ1² = σ2²   (2*)

The table below summarizes the corresponding decision making elements for
testing Ho as given in (*) and (2*).

1. Ho: σ² = σ0² (if the process variance σ² is of interest).
   Test statistic: χ² = (n − 1)s² / σ0².
   reject Ho if χ² > χ²α(n − 1)                             when Ha: σ² > σ0²
   reject Ho if χ² < χ²1-α(n − 1)                           when Ha: σ² < σ0²
   reject Ho if χ² > χ²α/2(n − 1) or χ² < χ²1-α/2(n − 1)    when Ha: σ² ≠ σ0²

2. Ho: σ1² = σ2² (if the comparison of two variances is of interest).
   Test statistic: F = s1² / s2².
   reject Ho if F > Fα(n1 − 1, n2 − 1)                      when Ha: σ1² > σ2²
   reject Ho if F < F1-α(n1 − 1, n2 − 1)                    when Ha: σ1² < σ2²
   reject Ho if F > Fα/2(n1 − 1, n2 − 1)
             or F < F1-α/2(n1 − 1, n2 − 1)                  when Ha: σ1² ≠ σ2²

Example

In the gate inspection example, test the hypothesis at α = 0.01 that the QC
gate and the 100% inspection gate yield the same variance levels.

Solution

Given: n1 = 21, s1² = 1600
       n2 = 16, s2² = 1225
       F0.01(20, 15) = 3.37

To test: Ho: σ1² = σ2².

Taking the one-sided alternative Ha: σ1² > σ2², we calculate

F = 1600/1225 ≈ 1.31

Since F is not greater than 3.37, we say that there is not enough evidence to
show that the variance level of the QC gate is greater than that of the 100%
inspection gate. Hence, it can be said that the operators at both gates
perform at the same level of efficiency.
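A minimal sketch of this variance-ratio (F) test, assuming Python with scipy, follows.

from scipy.stats import f

s1_sq, n1 = 1600, 21
s2_sq, n2 = 1225, 16
F = s1_sq / s2_sq
print(round(F, 2), round(f.ppf(0.99, n1 - 1, n2 - 1), 2))   # about 1.31 versus a critical value of about 3.37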

10.6 Testing for Proportions

If the focus is on population proportions, one can do a test for a single
proportion, where the null hypothesis is expressed as
Ho: p = p0   (*)
or a test for the difference between two proportions from two independent
populations, in which case Ho is expressed as
Ho: p1 = p2   (2*)

The table below summarizes the corresponding decision making elements for
testing Ho as given in (*) and (2*).

1. Ho: p = p0 (if n is sufficiently large, n ≥ 30).
   Test statistic: Z = (p̂ − p0) / √(p̂(1 − p̂)/n), where p̂ = number of successes / n.
   reject Ho if Z > zα        when Ha: p > p0
   reject Ho if Z < −zα       when Ha: p < p0
   reject Ho if |Z| > zα/2    when Ha: p ≠ p0

2. Ho: p1 = p2 (if ni is sufficiently large, ni ≥ 30, i = 1, 2).
   Test statistic: Z = (p̂1 − p̂2) / √(p̂1(1 − p̂1)/n1 + p̂2(1 − p̂2)/n2),
   where p̂i = number of successes in group i / ni.
   reject Ho if Z > zα        when Ha: p1 > p2
   reject Ho if Z < −zα       when Ha: p1 < p2
   reject Ho if |Z| > zα/2    when Ha: p1 ≠ p2

Example

For the subcontracting example, test the hypothesis (at α = 0.01) that the
level of defectives of plant A is the same as that of plant B, against the
alternative that plant A has a lower level of defectives than B.

Solution

Given: n1 = 1000, p̂1 = 50/1000 = 0.05
       n2 = 1250, p̂2 = 100/1250 = 0.08, Z0.01 = 2.33.

To test: Ho: p1 = p2 against Ha: p1 < p2.

Based on the given data, the test statistic is

Z = (0.05 − 0.08) / √(0.05(1 − 0.05)/1000 + 0.08(1 − 0.08)/1250)
  ≈ −2.91

Since Z is less than −2.33, there is sufficient evidence to suggest that plant A
has a lower defective level than plant B.
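The two-proportion test above can be reproduced with the sketch below, which assumes Python with scipy and uses the unpooled standard error shown in the table of this section.

from math import sqrt
from scipy.stats import norm

p1, n1 = 0.05, 1000
p2, n2 = 0.08, 1250
z = (p1 - p2) / sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
print(round(z, 2), norm.ppf(0.01))   # about -2.91 versus a critical value of about -2.33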

Exercises

1. An equipment manufacturer offers warranty on a product for a period of two years after
installation. An investigation revealed the following information.

                                                               Mean       Std. Dev.
Time lag from date of production to date of sale (to dealer)   10 weeks   3 weeks
Time lag from date of sale to date of installation              14 weeks   3.5 weeks
Time lag from date of installation to date of warranty claim    30 weeks   10 weeks

Each of these time lags is normally distributed, and each is independent of the others. The
manufacturer produced 4,000 units of a particular model. 45 weeks later, a total of 23 warranty
claims had been processed.

i) What is the average time lag from time of production to date of processing claims?
ii) Out of the 23 warranty claims, what proportion of the likely total (eventual) number of
warranty claims has been processed?
iii) How many of these units are likely to eventually result in warranty claims?

2. A process is producing material which is 30% defective. Five pieces are selected at random
for inspection.
i) What is the probability of exactly two good pieces being found in the sample?
ii) If exactly two good pieces were found, construct a 95% CI for the proportion of defectives
being turned out by the process (assume that normality holds).
iii) Using the inspection result in ii), test the hypothesis that the process is turning out 30%
defectives.

3. A sampling plan calls for taking a random sample of 100 items from a lot. If 3 or less are
non-conforming, the lot is accepted. If 4 or more are non-conforming, the lot is rejected.
What is the chance of accepting a lot of 400 items of which 20 are non-conforming?

4. Past data suggest that the mean diameter of bushings turned out by a manufacturing process
is 2.257 in. and the standard deviation is 0.08 in.
i) Estimate the probability that a sample of 4 bushings will have a mean diameter equal to or
greater than 2.263 in.
ii) Suppose that a sample of 24 bushings yielded an average of 2.33 and a standard deviation of
0.07. Does the sample provide credence to the original assumptions regarding the process?
iii) Based on the sample result in ii), construct a 95% CI for both the mean and the variance.

5. The standard deviation of tests for determining the presence of a certain chemical in a
particular metal strip is known to be 0.06 percent. In a certain experiment, samples of the
same metal strip were put into two boxes. One box is retained in the company and the other
is sent to a state laboratory for test. At each place three determinations are made of the
percentage of the same chemical. The results are as follows:

Company Laboratory State Laboratory

4.42 % 4.39 %
4.43 4.48
4.58 4.31

Could you reasonably conclude from these results that the method of determining percentage of
the said chemical used by the state laboratory has a downward bias relative to that used by the
company?


DIAGNOSTIC EXAM

1. A normal (Gaussian) distribution curve is:


a. bell-shaped
b. dome-shaped
c. pear-shaped
d. positive skewed.

2. Calculate the sample standard deviation for the following set of


observations: 1.5, 1.2, 1.1, 1.0, 1.6.
a. 1.280
b. 0.259
c. 0.231
d. 0.518

3. Approximately what percentage of the area under the normal curve is


included within ± 3 standard deviations from the mean?
a. 50.0%
b. 68.0%
c. 90.0%
d. 95.0%
e. 99.7%

4. What is the median for the following set of readings: 1.0, 3.0, 3.5, 4.0, 4.5,
5.0, 5.5?
a. 4.00
b. 5.00
c. 4.50
d. 3.50
e. 4.25

5. A box contains two red balls and two black balls. Given that a black ball
has been drawn, what is the probability of drawing two consecutive red
balls in the next three draws?
a. 1/6
b. 2/3
c. 1/3
d. 1/4

6. For a normal process, the relationships among the median, mean and
mode are that:
a. They are all equal to the same value.
b. The mean and mode have the same value but the median is different.
c. Each has a value different from the other two.
d. The mean and the median are the same but the mode is different.


7. Suppose that the average occurrence of defectives in a lot is 4. What is the
   probability that another lot of the same size will contain no defectives?
   a. e^(-4)
   b. e^(-2)
   c. e^(-3)
   d. e^(-8)

8. What value of Z in the normal tables has 5% of the area in the tail beyond
it?
a. 1.96
b. 1.645
c. 2.576
d. 1.282

An electronics firm was experiencing high rejections in their multiple
connector manufacturing departments. A CI for proportions with zα/2 = 3 was
recommended for use in all the departments to monitor their process
defectives. After six weeks, the following record was accumulated:

Dept.   percent   WW1   WW2   WW3   WW4   WW5   WW6

104        9        8    11     6    13    12    10
105       16       13    19    20    12    15    17
106       15       18    19    16    11    13    16

9. 1,000 pieces were inspected each week in each department. Which


department(s) exhibited a point or points out of the CIs during this period
(Round off calculations.)?
a. Department 104
b. Department 105
c. Department 106
d. All of the departments
e. None of the departments

10. Estimate the variance of the population from which the following
sample data came: 22, 18, 17, 20, 21.
a. 4.3
b. 2.1
c. 1.9
d. 5.0

11. The hypergeometric distribution is:


a. a continuous distribution;
b. used to describe sampling from a finite population without replacement;
c. the limiting distribution of the sum of several independent discrete
random variables;
d. none of the above.


12. Confidence intervals, when viewed as control limits, are set at the three-
sigma level because:
a. This level makes it difficult for the output to get out of control.
b. This level establishes tight limits for the production process.
c. This level reduces the probability of looking for trouble in the production
process when none exists.
d. This level assures a very small type II error.

13. If a distribution is skewed to the left, the median will always be


a. less than the mean;
b. between the mean and the mode;
c. greater than the mode;
d. equal to the mean;
e. equal to the mode.

14. Let X be any random variable with mean μ and standard deviation σ.
Take a random sample of size n. As n increases and as a result of the
central limit theorem:
a. The distribution of the sum Sn = X1 + X2 + ... + Xn approaches a normal
distribution with mean μ and standard deviation σ/√n.
b. The distribution of the sum Sn = X1 + X2 + ... + Xn approaches a normal
distribution with mean μ and standard deviation σ.
c. The distribution of the sum Sn = X1 + X2 + ... + Xn approaches a normal
distribution with mean nμ and standard deviation σ/√n.
d. None of the above.

15. Determine the coefficient of variation for the last 500 pilot plant test
runs of high temperature film having a mean of 900 degrees Kelvin with a
standard deviation of 54 degrees Kelvin.
a. 6%
b. 16.7%
c. 0.06%
d. 31%
e. The reciprocal of the relative standard deviation.

16. A lot of 50 pieces contains 5 defectives. A sample of two is drawn


without replacement. The probability that both will be defective is
approximately
a. 0.4000
b. 0.0100
c. 0.0010
d. 0.0082


e. 0.0093

17. A process measurement has a mean of 758 and a standard deviation of


19.4. If the specification limits are 700 and 800, what percent of product
can be expected to be out of limits assuming a normal distribution?
a. 1.7%
b. 7.1%
c. 0.5%
d. 3.4%
e. 2.9%

18. Which table should be used to determine a confidence interval on the


mean when  is not known and the sample size is 10?
a. Z
b. t
c. F
d. χ²

19. The trainees were given the same lot of 50 pieces and asked to classify
them as defective or non-defective, with the following results:

Trainee Trainee Trainee Total


1 2 3

Defective 17 30 25 72
Non- 33 20 25 78
defective
Total 50 50 50 150

In determining whether or not there is a difference in the ability of the three


trainees to properly classify the parts:
a. The value of the Z is
b. Using a level of significance of 0.050, the upper percentage point for this
test
c. Since the computed Z value is , we reject the null hypothesis
d. All of the above.
e. None of the above.

20. If the 95% confidence limits for the mean  turned out to be (6.5, 8.5)
then
a. The probability is 0.95 that the sample mean falls between 6.5 and 8.5.
b. The probability is 0.95 that X falls between 6.5 and 8.5.
c. The probability is 0.95 that the interval (6.5, 8.5) contains .
d. 4σX̄ = 8.5 - 6.5.

21. Determine whether the following two types of rockets have
significantly different variances at the 5% level.

                Rocket 1        Rocket 2
                8 readings      9 readings
                1000 miles²     2000 miles²

a. Significant difference because F calc< Ftable


b. No significant difference because F calc < Ftable
c. Significant difference because F calc > Ftable
d. No significant difference because F calc > Ftable
22. A process calls for the mean value of a dimension to be 2.02". Which of
the following should be used as the null hypothesis to test whether or not
the process is achieving this mean?
a. The mean of the population is 2.02".
b. The mean of the sample is 2.02".
c. The mean of the population is not 2.02".
d. The mean of the sample is not 2.02".
e. All of the above are acceptable null hypotheses.

23. The difference between setting alpha equal to 0.05 and alpha equal to
0.01 in hypothesis testing is:
a. With alpha equal to 0.05 we are more willing to risk a type I error.
b. With alpha equal to 0.05 we are more willing to risk a type II error.
c. Alpha equal to 0.05 is a more 'conservative' test of the null hypothesis
(Ho).
d. With alpha equal to 0.05 we are less willing to risk a type I error.
e. None of the above.

24. The type II risk is the risk of:


a. selecting the wrong hypothesis;
b. accepting a hypothesis when it is false;
c. accepting a hypothesis when it is true;
d. rejecting a hypothesis when it is true.

25. If two-sigma limits are substituted for conventional three-sigma limits


on a control chart, one of the following occurs:
a. decrease in type I error;
b. increase in type II error;
c. increase in type I error;
d. increase in sample size.

26. A process is acceptable if its standard deviation is not greater than 1.0.
A sample of four items yields the values 52, 56, 53, 55. In order to
determine if the process will be accepted or rejected, the following
statistical test should be used.
a. t - test
b. chi-square test


c. Z - test
d. none of the above

27. If in a t - test, alpha is 0.01,


a. 1% of the time we will say that there is a real difference, when there really
is not a difference.
b. 1% of the time we will make a correct inference.
c. 1% of the time we will say that there is no real difference, but in reality
there is a difference.
d. 99% of the time we will make an incorrect inference.
e. 99% of the time the null hypothesis will be correct.

28. Given that random samples of process A produced 10 defective and 30


good units, while process B produced 25 defectives out of 60 units. Using
the Z test what is the probability that the observed Z value could result
under the hypothesis that both processes are operating at the same
quality level?
a. less than 5 percent
b. between 5 percent and 10 percent
c. greater than 10 percent
d. 50 percent

29. A null hypothesis requires several assumptions, a basic one of which is:
a. that the variables are dependent;
b. that the variables are independent;
c. that the sample size is adequate;
d. that the confidence interval is ± 2 times the standard deviation;
e. that the correlation coefficient is - 0.95.

30. One use for a student t - test is to determine whether or not


differences exists in:
a. variability;
b. quality costs;
c. correlation coefficients;
d. averages;
e. none of the above .
