
www.studyguide.pk
Mathematical Modelling
A model is a simplification of the real thing. It will be both quicker and cheaper to produce
than the real one and will help us to understand the real world object or situation.

Mathematical models require the use of probability.

A statistical experiment is a test, investigation or some process adopted for collecting data to
provide evidence for or against a hypothesis.

An event is a sub-set of possible outcomes of an experiment.

We can vary parameters if we wish.

A disadvantage is that a model does not replicate real-world situations in every detail.

Collecting Data
How data is collected is important: a method must be used that avoids bias.

One source of bias is using data from responses to questions as people may lie about personal
questions such as age and weight.

Another source of bias is using data that does not properly apply to the problem, e.g.
using published unemployment figures to investigate the number of people looking for work.
These figures do not include students, people past retirement age etc., but they may include
people who are not looking for work.

To check data is unbiased ask:


Where has the data come from?
Who is supplying the data and why?
How was the data collected?
Is it all the relevant data or a sample?
If a sample is used, how was the sample chosen?
Is the data relevant to the investigation?
Does the conclusion follow from the investigation?

Types of Data
Qualitative Data
These are non-numerical values such as attitudes, gender, colour, football shirt number

Quantitative Data
These data have valid numerical values such as shoe size, number of broken eggs, height,
time

● Discrete data come from variables which can only take particular values such as shoe
size.
● Continuous data come from variables which can take any value within a given range.

Summarising Data
The reason that a sample is taken is to make deductions about the population.

Graphical and numerical summaries are essential in order to help us analyse the data
collected.

The purpose of these summaries is to condense the data to reveal patterns and to enable
comparisons to be made. Summarising can lead to a loss of accuracy.

StudyGuide.PK A-Level Maths S1 Notes Page 1



Ungrouped Frequency Distribution


Data must be sorted before any sense can be made of it. This is often done using a frequency
distribution with a cumulative frequency column.

Stem and Leaf Diagrams


One way of ordering and presenting data is a stem and leaf diagram. The benefits are that it
retains all the original data and yet it is 'grouped' into classes.

We must arrange the leaves in numerical order and give a key.

A stem and leaf diagram gives a quick visual impression of the shape of a distribution. Both
integers and decimals can be represented, though the data is usually given to 2 significant
figures. It may be necessary to round data to meet this constraint.

If a large number of leaves are associated with one line then it is usual to use two lines. We
can also improve our diagrams by showing the number of leaves on each stem in brackets.

If direct comparison of two data sets is required, a back-to-back stem and leaf diagram can be
drawn.
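
As a rough illustration of the ordering step, the sketch below builds the rows of a stem and leaf diagram in Python. It assumes non-negative whole numbers, with the number of tens as the stem and the units digit as the leaf; the function name and the data are made up for this example.

```python
def stem_and_leaf(values):
    """Return {stem: sorted leaves} for non-negative whole numbers.

    Assumption for this sketch: the stem is the number of tens and
    the leaf is the units digit.
    """
    ordered = sorted(values)
    first_stem, last_stem = ordered[0] // 10, ordered[-1] // 10
    # Include every stem in the range, even if it has no leaves.
    rows = {stem: [] for stem in range(first_stem, last_stem + 1)}
    for value in ordered:
        rows[value // 10].append(value % 10)
    return rows


# Hypothetical data set.
marks = [23, 31, 25, 47, 38, 31, 20]
for stem, leaves in stem_and_leaf(marks).items():
    # The bracketed number is the count of leaves on that stem.
    print(stem, "|", " ".join(str(leaf) for leaf in leaves), f"({len(leaves)})")
print("Key: 2 | 3 means 23")
```

Because the values are sorted first, the leaves on every stem come out in numerical order.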

Grouped Frequency Distributions


We can summarise data into grouped frequency tables.

The information becomes more concise, but the original information has been lost.
It allows summaries and estimates to be made.

Both continuous data and discrete data can be grouped. The boundaries of the groups must be
matched, even if this results in a negative starting point.

Groups are usually referred to as classes.

Age is a special case: the boundaries are matched to completed years, i.e. the classes 21-24
and 25-28 actually cover 21-25 and 25-29.

Cumulative Frequency Curves and Polygons for Grouped Data


When data is grouped (discrete or continuous) we consider the cumulative frequencies to be
the total frequency up to the upper class boundary of each interval.

To draw a cumulative frequency curve, we plot the ucb of each interval against its cumulative
frequency (cf) and join with a smooth curve. For a cumulative frequency polygon, we join the
points with straight lines as opposed to a smooth curve.

Histograms
If the data available is for a continuous variable and it is summarised by a grouped frequency
distribution, then the data can be represented by means of a histogram.

There are no gaps between the bars of a histogram. Thus boundaries must be matched.
There is an important relationship between the area of a histogram bar and the frequency that
it is representing.

Area is directly proportional to frequency.

Total area is directly proportional to total frequency.



Frequency density = frequency / class width

Frequency = frequency density × class width

There are times when it is useful to draw a histogram based on relative frequencies rather
than frequencies. The relative frequencies are obtained by expressing the frequencies as a
proportion of the total frequency.
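
A minimal sketch of these two calculations in Python is given below. The function names and the data are hypothetical, and the class boundaries are assumed to be already matched (no gaps).

```python
def frequency_densities(boundaries, frequencies):
    """Frequency density = frequency / class width, for each class.

    boundaries: matched class boundaries, one more entry than there are classes.
    frequencies: frequency in each class.
    """
    widths = [upper - lower for lower, upper in zip(boundaries, boundaries[1:])]
    return [f / w for f, w in zip(frequencies, widths)]


def relative_frequencies(frequencies):
    """Each frequency expressed as a proportion of the total frequency."""
    total = sum(frequencies)
    return [f / total for f in frequencies]


# Hypothetical classes 0-10, 10-20, 20-40 with frequencies 5, 8, 6.
boundaries = [0, 10, 20, 40]
frequencies = [5, 8, 6]
print(frequency_densities(boundaries, frequencies))  # [0.5, 0.8, 0.3]
print(relative_frequencies(frequencies))
```

The last class is twice as wide as the others, so its bar is drawn lower than its frequency alone would suggest.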

Methods of Summarising Sample Data


Measures of Location (averages)
These are sometimes called measures of central tendency, which attempt to locate a typical
value about which a distribution clusters.

Measures of Dispersion
These are used to represent the spread or variation within the data since it is unlikely that all
the values in a data set will be the same.

All these measures are generally numerical quantities.

Measures of Location
The Mode
The mode is the value that occurs most often. It is not always unique (can be bi-modal) and
there may not be a mode. In the case of grouped frequencies, the mode is not always useful,
but there are ways to estimate the mode using a histogram. Usually, the modal class would be
sufficient.

It is easy to calculate and is not affected by any extreme values.


It is useful to shops to know what sizes to stock.

The Median
The middle value of an ordered set of data. If there are n observations arranged in order of
size, the median value is the (n + 1)/2 th observation.
To find the median, we use the cumulative frequency.

We can estimate the median of grouped data using linear interpolation:

Median Q2 = L + (((n + 1)/2 - fL) / f) × c

L = Lower class boundary of median class


n = total frequency
fL = cumulative frequency up to the median class
f = frequency in the median class
c = class width of median group
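
The interpolation can be sketched in Python as below. This is an illustration only: the function name, the `fraction` parameter and the data are assumptions made for the example, and the (n + 1) convention of these notes is used for the position.

```python
def grouped_quantile(boundaries, frequencies, fraction=0.5):
    """Estimate a quantile of grouped data by linear interpolation.

    boundaries: class boundaries (one more entry than there are classes).
    frequencies: frequency of each class.
    fraction: 0.5 for the median, 0.25 for Q1, 0.75 for Q3, and so on.
    """
    n = sum(frequencies)
    position = fraction * (n + 1)      # (n + 1)/2 for the median
    cumulative = 0                     # fL: cumulative frequency below the class
    for lower, upper, f in zip(boundaries, boundaries[1:], frequencies):
        if f > 0 and cumulative + f >= position:
            # L + ((position - fL) / f) * c
            return lower + (position - cumulative) * (upper - lower) / f
        cumulative += f
    raise ValueError("position lies beyond the last class")


# Hypothetical classes 0-10, 10-20, 20-30 with frequencies 3, 5, 3 (n = 11).
boundaries = [0, 10, 20, 30]
frequencies = [3, 5, 3]
print(grouped_quantile(boundaries, frequencies))        # 16.0
print(grouped_quantile(boundaries, frequencies, 0.25))  # 10.0
```

Here the median is the 6th value, which is the 3rd of the 5 values in the class 10-20, giving 10 + (3/5) × 10 = 16.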

Similar advantages and disadvantages to the mode.

Other Quantiles
The formula above can also be used for other quantiles: replace (n + 1)/2 with (n + 1)/4 for
quartiles, (n + 1)/10 for deciles and (n + 1)/100 for percentiles, and multiply by the number
of the quantile required, e.g. for the 43rd percentile use 43(n + 1)/100 in place of (n + 1)/2.

The Arithmetic Mean


The mean is the most widely used measure of location and is often used in conjunction with
the standard deviation (a measure of spread)

If x1, x2, x3, ... xn are a set of numbers then

x̄ = Σx / n

For a frequency distribution this formula is re-written as x̄ = Σfx / Σf where Σf = n

Always state the appropriate values in your answer, i.e. Σfx, Σf, n

When given two means and the frequency you must find the totals and add these together and
divide by the total frequency to get the new mean (weighted mean)
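
As a small illustration, the Python sketch below computes the mean of a frequency distribution and the combined mean of two groups. The function names and figures are invented for the example.

```python
def frequency_mean(values, frequencies):
    """Mean of a frequency distribution: Σfx / Σf."""
    sum_fx = sum(f * x for x, f in zip(values, frequencies))
    sum_f = sum(frequencies)
    return sum_fx / sum_f


def combined_mean(means, sizes):
    """Combine group means: rebuild each group total, add, divide by total frequency."""
    grand_total = sum(mean * n for mean, n in zip(means, sizes))
    return grand_total / sum(sizes)


# Hypothetical frequency table: x = 1, 2, 3 with f = 2, 3, 5.
print(frequency_mean([1, 2, 3], [2, 3, 5]))   # 2.3

# Hypothetical groups: 30 values with mean 10 and 10 values with mean 20.
print(combined_mean([10, 20], [30, 10]))      # 12.5
```

Note that the combined mean (12.5) is not the simple average of 10 and 20, because the larger group carries more weight.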

For grouped data we use the midpoint. Remember age is special:


If you have the groups
0-9
10-19

then you consider the first group as 0-10 therefore the midpoint would be 5.

Advantages and disadvantages


The mean is influenced by extreme values; it is sensitive to the presence of outliers.
It is not as easily calculated as the median
All the values are used directly when calculating the mean.
The mean has important mathematical properties.

Even if we have grouped frequency distributions of unequal intervals, this makes no difference
to the calculation of the mean. Remember that for grouped data, the mean is only an
estimate.

Calculating the Mean Using the Method of Coding


Use this method if asked to do so. The coding

y = (x - a) / b

alters the original x values.

a = the midpoint of the modal class


b = the class width (if class widths are not equal then use the smallest class width)

From this we can calculate the mean of y and decode to find the mean of x

x̄ = bȳ + a
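
A short Python sketch of coding and decoding follows. The function name, the midpoints and the choices a = 25, b = 10 are assumptions made purely for illustration.

```python
def mean_by_coding(midpoints, frequencies, a, b):
    """Mean of x found through the coding y = (x - a) / b."""
    coded = [(x - a) / b for x in midpoints]           # y values
    sum_f = sum(frequencies)
    mean_y = sum(f * y for y, f in zip(coded, frequencies)) / sum_f
    return b * mean_y + a                              # decode: mean of x


# Hypothetical grouped data: midpoints 5, 15, 25, 35 with f = 2, 6, 8, 4.
# a = 25 is the midpoint of the modal class, b = 10 is the class width.
midpoints = [5, 15, 25, 35]
frequencies = [2, 6, 8, 4]
coded_result = mean_by_coding(midpoints, frequencies, a=25, b=10)
direct_result = sum(f * x for x, f in zip(midpoints, frequencies)) / sum(frequencies)
print(coded_result, direct_result)
```

The coded y values are -2, -1, 0, 1, which are much easier to work with by hand, and both routes give a mean of 22.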

Weighted Mean
When we wish to place greater emphasis on some of the values we use a weighted mean

Measures of Dispersion
Range
● The simplest measure of spread
● Based entirely on extreme values

● Smallest value is subtracted from largest value.
● For grouped frequency distributions, an estimate of the range is the difference between
the lower class boundary of the first group and the upper class boundary of the last
group.
● Does not lend itself to mathematical use
● Used only with small data sets in conjunction with either the mode or the median

Interquartile Range
● range of the middle 50%
● IQR = Q3 – Q1
● Not affected by extreme values
● If the median is the measure of location used then the IQR is the appropriate measure
of dispersion
● Often used when data has extreme values or has open-ended classes or is not
symmetrical
● Used extensively in conjunction with box plots
● Can help us identify outliers and examine the skewness of a distribution

Semi-Interquartile Range
SIQR = IQR/2

Standard Deviation and Variance


Standard deviation is used in conjunction with the mean.
Uses all the data values

The population variance is denoted by σ²
The sample variance is denoted by s²

The standard deviation is the positive square root of the variance.


The population sd is denoted by σ
The sample sd is denoted by s

σ² = Σx²/n - x̄²

σ = √(Σx²/n - x̄²)

where

x̄ = Σx/n

For most distributions, the bulk (about 95%) of the distribution lies within 2 sd's of the mean
The units of sd are the same as the original data
The variance can never be negative (its positive square root is the sd)
For similar sets of data it is useful to compare the sd's
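
The sketch below shows the variance formula in Python. It is an illustration under stated assumptions: the function names and data are hypothetical, and the optional `frequencies` argument is added here so that the same function also handles a frequency table (Σfx²/Σf in place of Σx²/n).

```python
import math


def variance(values, frequencies=None):
    """Population variance: mean of the squares minus the square of the mean."""
    if frequencies is None:
        frequencies = [1] * len(values)   # ungrouped data: every value counted once
    n = sum(frequencies)
    mean = sum(f * x for x, f in zip(values, frequencies)) / n
    mean_of_squares = sum(f * x * x for x, f in zip(values, frequencies)) / n
    return mean_of_squares - mean ** 2


def standard_deviation(values, frequencies=None):
    """Positive square root of the variance."""
    return math.sqrt(variance(values, frequencies))


# Hypothetical data: Σx = 40, Σx² = 232, n = 8.
data = [2, 4, 4, 4, 5, 5, 7, 9]
print(variance(data), standard_deviation(data))   # 4.0 2.0

# The same data written as a frequency table.
print(variance([2, 4, 5, 7, 9], [1, 3, 2, 1, 1]))  # 4.0
```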

When there is a frequency distribution we use the formula:

σ = √(Σfx²/Σf - x̄²)

We can code and decode as before, but when decoding the standard deviation you only multiply
by b (sd of x = b × sd of y). You do not need to add a, as this does not alter the spread.

See purple notes for Combining sets of numbers



Skewness
Symmetrical Bell-Shaped Distribution
mean=median=mode
Normal Distribution

Positively Skewed Distribution


mean>median>mode
The mean is pulled in a positive direction

Negatively Skewed Distribution


mean<median<mode
The mean is pulled in a negative direction

Measures of Skewness
Pearson's Measure of Skewness
Pearson's Measure of Skewness = (mean – mode) / standard deviation

If this value is positive then we have positive skewness.


If this value is negative then we have negative skewness.
Generally skewness can take any value between -3 and 3

Using the approximation mean – mode ≈ 3(mean – median), this can be rewritten as:

3(mean – median) / standard deviation

Quartile Coefficient of Skewness

Normal Distribution
Q3 - Q2 = Q2 – Q1
Quartile skewness = 0

Positively Skewed Distribution


Q3 - Q2 > Q2 – Q1
Quartile skewness > 0

Negatively Skewed Distribution


Q3 - Q2 < Q2 – Q1
Quartile skewness < 0
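
Both checks can be sketched in Python as follows. The function names and the summary statistics are invented for illustration; the Pearson version shown is the 3(mean – median) form.

```python
def pearson_skewness(mean, median, sd):
    """Pearson's measure of skewness: 3(mean - median) / standard deviation."""
    return 3 * (mean - median) / sd


def quartile_skew(q1, q2, q3):
    """Compare Q3 - Q2 with Q2 - Q1 to describe the direction of skew."""
    upper_gap = q3 - q2
    lower_gap = q2 - q1
    if upper_gap > lower_gap:
        return "positive"
    if upper_gap < lower_gap:
        return "negative"
    return "symmetrical"


# Hypothetical summary statistics.
print(pearson_skewness(mean=12, median=10, sd=4))   # 1.5
print(quartile_skew(q1=10, q2=14, q3=25))           # positive
```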

Box Plots
● A box plot illustrates the dispersion or spread of a distribution, as well as the average (median)
● It uses the highest and lowest values of the data, and the three quartiles
● The box encloses the middle 50% (the IQR)
● The whiskers extend to the highest and lowest values (the range)

When commenting on box plots you must


● give all the summary statistics (median, IQR, range)
● comment on the skewness of the given distributions, justifying this with calculations
● make comparisons of the two or more distributions

Always draw box plots on graph paper and label your axes clearly. Use a suitable scale.

Symmetrical Bell-Shaped Distribution


The whiskers are of equal length and the median is in the middle of the box.

Positively Skewed Distribution


The right hand whisker is longer and the median is nearer to the lower quartile.

Negatively Skewed Distribution


The left hand whisker is longer and the median is nearer to the upper quartile.

Use of Box Plots to Identify Outliers


● Extreme values are known as outliers
● There may be good reason for these results but they are often due to errors
● They may need to be highlighted
● They are often considered as points lying more than 1.5 times the IQR above Q3 or below Q1

Procedure
● Find the values of the quartiles
● Evaluate Q1 – 1.5(Q3 – Q1) and Q3 + 1.5(Q3 – Q1) and note any values that fall outside this range
● Draw a box based on the quartile values.
● If there are any outliers, label them with crosses. The whisker is usually drawn to the next value towards the median
Only calculate these outliers if the question specifically asks you to do so
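
The procedure above can be sketched in Python as below. The function names, the quartiles and the data are hypothetical; the quartiles are assumed to have been found already.

```python
def outlier_limits(q1, q3):
    """Return (Q1 - 1.5 * IQR, Q3 + 1.5 * IQR)."""
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr


def find_outliers(values, q1, q3):
    """Values lying outside the limits are treated as outliers."""
    low, high = outlier_limits(q1, q3)
    return [v for v in values if v < low or v > high]


def whisker_ends(values, q1, q3):
    """Each whisker stops at the most extreme value that is not an outlier."""
    low, high = outlier_limits(q1, q3)
    inside = [v for v in values if low <= v <= high]
    return min(inside), max(inside)


# Hypothetical data with Q1 = 10 and Q3 = 20, so the limits are -5 and 35.
data = [2, 12, 15, 18, 40]
print(outlier_limits(10, 20))        # (-5.0, 35.0)
print(find_outliers(data, 10, 20))   # [40]
print(whisker_ends(data, 10, 20))    # (2, 18)
```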

Correlation
● Correlation concerns the relationship between two variables x and y
● Data on two variables is called bi-variate data
● Bi-variate data produce a bi-variate distribution
● There may be a relationship but you cannot necessarily expect to find a law/formula
relating them
● We initially look for basic associations

Scatter Diagrams
Bi-variate data is conveniently displayed through scatter diagrams
They help to assess correlation and regression.
We can use them to help show linear correlation
Even if we find a mathematical relationship, this does not imply that there is a relationship in
reality, or indeed that an increase in one variable causes an increase in the other.

Correlation measures the relationship and the strength of this relationship between the two
variables.

If both variables increase together we say that they are positively correlated.
If one variable increases as the other decreases we say that they are negatively correlated.
If no relationship can be seen we say there is no correlation.

When drawing scatter diagrams it doesn't matter which axis is used for which variable,
however it does when measuring regression.

If a horizontal line and a vertical line are drawn through the mean point (x̄, ȳ), you can see the
association between the two variables in a different way:

For a positive correlation most points lie in the first and third quadrants (top right and bottom
left respectively)
For a negative correlation most points lie in the second and fourth quadrants (top left and
bottom right respectively)
If there is no correlation the points are randomly distributed in all four quadrants.

Product Moment Correlation Coefficient, r


PMCC
The pmcc r is a numerical value that indicates the degree of scatter. It measures the
relationship between the two variables and its strength.
We must calculate this value and interpret its meaning.
The value of r lies between -1 and 1
It is a useful measure because it is independent of the units of the scale of the variables.

The calculation of r should only follow after a scatter diagram has been drawn. It
should only be calculated if the scatter diagram reveals some degree of linear correlation. If
the correlation is non-linear then the pmcc is not appropriate.
Outliers, or rogue results, should be identified as they may upset the general trend.

If r = 1 there is perfect positive linear correlation between the two variables.


If r = -1 there is perfect negative linear correlation between the two variables.
If r = 0 (or close to 0) there is no linear correlation; this does not, however, exclude the
existence of another type of relationship.

Calculation

r = Sxy / √(Sxx Syy)

where Sxy = Σxy – (Σx Σy)/n

where Sxx = Σx² – (Σx)²/n

where Syy = Σy² – (Σy)²/n

We must find n, Σx², Σx, Σy², Σy, Σxy
and then use the above formulae.
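
A minimal Python sketch of this calculation is shown below. The function name and the data are assumptions made for illustration; the code simply follows the Sxy, Sxx and Syy formulae.

```python
import math


def pmcc(xs, ys):
    """Product moment correlation coefficient r = Sxy / sqrt(Sxx * Syy)."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    sxy = sum_xy - sum_x * sum_y / n
    sxx = sum_x2 - sum_x ** 2 / n
    syy = sum_y2 - sum_y ** 2 / n
    return sxy / math.sqrt(sxx * syy)


# Hypothetical data: Σx = 15, Σy = 20, Σxy = 66, Σx² = 55, Σy² = 86.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
r = pmcc(xs, ys)
print(round(r, 4))   # 0.7746
```

For this data Sxy = 6, Sxx = 10 and Syy = 6, so r = 6/√60, a fairly strong positive linear correlation.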

Calculator must be in linear regression mode.

Using A Method of Coding for Correlation


The beauty of coding for the PMCC is that we do not need to decode at the end.

It makes the values of x and y smaller. You can subtract any number from the x values, since
this only moves the axis. You can divide the result by any positive number since this only
changes the scale. The correlation coefficient is unaffected by either of these operations.

You can rewrite the variables x and y as:

X = (x - a) / b

Y = (y - c) / d

where a, b, c and d are suitable numbers to be chosen.
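
The Python sketch below illustrates that coding leaves r unchanged. The data and the choices a = 1000, b = 10, c = 50, d = 1 are invented for the example, and b and d are assumed to be positive.

```python
import math


def pmcc(xs, ys):
    """r = Sxy / sqrt(Sxx * Syy), as in the calculation formulae."""
    n = len(xs)
    sxy = sum(x * y for x, y in zip(xs, ys)) - sum(xs) * sum(ys) / n
    sxx = sum(x * x for x in xs) - sum(xs) ** 2 / n
    syy = sum(y * y for y in ys) - sum(ys) ** 2 / n
    return sxy / math.sqrt(sxx * syy)


# Hypothetical raw data with awkwardly large values.
xs = [1010, 1020, 1030, 1040, 1050]
ys = [52, 54, 55, 54, 55]

# Coding: X = (x - 1000) / 10 and Y = (y - 50) / 1.
coded_xs = [(x - 1000) / 10 for x in xs]
coded_ys = [(y - 50) / 1 for y in ys]

r_raw = pmcc(xs, ys)
r_coded = pmcc(coded_xs, coded_ys)
print(round(r_raw, 4), round(r_coded, 4))   # 0.7746 0.7746
```

The coded values are 1 to 5 and 2, 4, 5, 4, 5, so the sums are far easier to find by hand, and no decoding is needed.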

Note:
Just because two variables have a linear correlation does not necessarily mean that they are
related. Thus, you should have some reason to believe that there might be a relationship
before calculating the PMCC, unless your aim is to prove that they are unrelated.
Data can be distorted by an outlier, so the information should be plotted on a scatter-graph
first.

Note:
Data following a quadratic curve that is symmetrical about its turning point would give a PMCC
close to 0: the variables are related, but the relationship is non-linear.

Often variables are linked only through a third variable, particularly when both change over
time.

Regression
Purpose: to find a law connecting two variables, so that we can make predictions about the
value of y for any given value of x.

Explanatory and Response Variables


The value of x is controlled. It is known as the explanatory or independent variable whilst y is
called the response or dependent variable.

The response variable will be subject to some level of error or natural variation.

To see if there is a relationship, we plot a scatter diagram.

The explanatory variable is always plotted horizontally and the response variable is always
plotted vertically.

By examining the scatter diagrams for data, we can see if a straight line would be a good or
appropriate model for the relationship between x and y.

The Straight Line Law


In statistics, instead of writing y = mx + c, we use y = a + bx
This can be rearranged to y - ȳ = b(x - x̄)

Having assumed the linear regression model, the results are used to find a regression line.
This line is known as the regression line of y on x, since y is the response variable for a given
value of x.

If you assume a linear regression line, each point with coordinates (xi, yi) will have a vertical
distance ri from the regression line. These are known as residuals.
If the residuals are very small, a line may be drawn by eye, however a much better solution is
to find the line of best fit using the method of least squares.
Legendre formulated this method. The resulting line is known as the least squares regression
line.

The Least Squares Regression Line


This line makes the sum of the squares of the residuals as small as possible, i.e. Σri² is minimised.

The line passes through the mean point (x̄, ȳ), so we write y - ȳ = b(x - x̄) and rearrange to
get y = a + bx, where a = ȳ - bx̄.
The gradient is given by the letter b and is called the regression coefficient of y on x. We will
need to calculate b using the formula:

b = Sxy / Sxx

x̄ = Σx / n
ȳ = Σy / n

To draw this line, we choose three points: the mean point and one point whose x value is at
the low end of the observed values and another point whose x value is at the high end of the
observed values.

We can use our regression line to obtain estimates of y given values of x under appropriate
conditions.
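
A rough Python sketch of finding the regression line of y on x is given below. The function names and data are hypothetical; the code follows b = Sxy / Sxx and uses the fact that the line passes through the mean point to find a.

```python
def regression_line(xs, ys):
    """Return (a, b) for the least squares regression line y = a + bx."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum(x * y for x, y in zip(xs, ys)) - sum(xs) * sum(ys) / n
    sxx = sum(x * x for x in xs) - sum(xs) ** 2 / n
    b = sxy / sxx                 # regression coefficient of y on x
    a = mean_y - b * mean_x       # the line passes through the mean point
    return a, b


def estimate(a, b, x):
    """Estimate y for a given x from y = a + bx."""
    return a + b * x


# Hypothetical data: Sxy = 6, Sxx = 10, mean point (3, 4).
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
a, b = regression_line(xs, ys)
print(round(a, 2), round(b, 2))       # 2.2 0.6
print(round(estimate(a, b, 3), 2))    # 4.0, the mean of y
```

The estimate at x = 3 returns the mean of y, confirming that the line goes through the mean point.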

Application and Interpretation


Making estimates of the response variable within the range of the observed values of the
data is known as interpolation.

You do not know what happens outside the range of values of the experimental data.
We are assuming a linear relationship within our observed values and for all we know the
relationship between the variables outside of the range of values may be non linear. Therefore
it is dangerous to make predictions or estimates for the response variable based on values
outside the range of observed values. The process is known as extrapolation.

You will also be asked to give interpretations of the values of a and b from your regression line
within the context of the question.

Regression is concerned with finding a linear law between the two variables in question, with
the value of the response variable depending on that of the explanatory variable. Correlation is
concerned with how strongly two variables are linearly associated (not a law).

Probability
Venn Diagrams and Probability Definitions

∩ = intersection (AND)
U = union (OR); OR in maths means either or both
A′ = NOT A

P(A) = 1 - P(A′)
P(A′) = 1 - P(A)

P(AUB) = P(A) + P(B) - P(A∩B)

P(A′UB) = P(A′) + P(B) - P(A′∩B)

P(AUB′) = P(A) + P(B′) - P(A∩B′)

P(A′UB′) = P(A′) + P(B′) - P(A′∩B′)

P(A′∩B′) = 1 - P(AUB)

P(A′∩B) = P(AUB) - P(A)

P(A∩B′) = P(AUB) - P(B)

Mutual Exclusivity
Two events A & B are said to be mutually exclusive (m.e) if they cannot occur at the same
time. In this case, in the Venn Diagram, A & B do not overlap

Thus P(A∩B) = 0
P(AUB) = P(A) + P(B) for these events

Exhaustion
If two events A & B are such that AUB makes up all the possible outcomes
P(AUB) = 1
We say that A & B are exhaustive
P(A) + P(B) - P(A∩B) = 1

Conditional Probability (Dependent Events)


If A & B are any two events where P(A) ≠ 0 and P(B) ≠ 0 then the probability of A given that B
has already occurred is written as P(A|B)

P(A|B) = P(A∩B) / P(B)

P(B|A) = P(A∩B) / P(A)

Conditional probability reduces the sample space

Note: If events A & B are m.e. then we know P(A∩B) = 0 so P(A|B) = P(B|A) = 0

Note: We can extend this basic conditional probability definition to things like
P(A′|B) = P(A′∩B) / P(B)

Note: P(A′|B) = 1 - P(A|B)
P(A′|B′) = 1 - P(A|B′)

* Selection without replacement gives conditional probabilities (dependent events)
* Selection with replacement gives independent events
* Use common sense where possible
* Resort to definitions when common sense fails

Independent Events
Two events are independent if the probability that one of them occurs is in no way influenced by
whether or not the other has occurred.

Thus P(A|B) = P(A)


P(B|A) = P(B)
In this case P(A∩B) = P(A) * P(B)
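
These definitions can be checked with a small Python sketch. Everything in it is an assumption made for illustration: the sample space is one throw of a fair die, and the events A (even score), B (score of 4 or less) and C (score of 4 or more) are invented.

```python
from fractions import Fraction

SAMPLE_SPACE = frozenset(range(1, 7))   # one throw of a fair die


def prob(event):
    """P(event) when all outcomes in the sample space are equally likely."""
    return Fraction(len(event & SAMPLE_SPACE), len(SAMPLE_SPACE))


def conditional(a, b):
    """P(A|B) = P(A∩B) / P(B); requires P(B) ≠ 0."""
    return prob(a & b) / prob(b)


def independent(a, b):
    """A and B are independent when P(A∩B) = P(A) * P(B)."""
    return prob(a & b) == prob(a) * prob(b)


A = {2, 4, 6}      # even score
B = {1, 2, 3, 4}   # score of 4 or less
C = {4, 5, 6}      # score of 4 or more

# Addition rule: P(AUB) = P(A) + P(B) - P(A∩B)
assert prob(A | B) == prob(A) + prob(B) - prob(A & B)

print(conditional(A, B), independent(A, B))   # 1/2 True
print(conditional(A, C), independent(A, C))   # 2/3 False
```

Knowing B has occurred leaves P(A) unchanged at 1/2, so A and B are independent; knowing C has occurred raises it to 2/3, so A and C are not.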

Discrete Random Variables

The following are examples of discrete random variables.


● the score when a die is thrown
● the value of a prize awarded
● the profit in a game of chance etc

The set of all possible values of a r.v. together with their probabilities is called a probability
distribution (probability disn)


Also, the function that describes how the probabilities are assigned is called the probability
function.

For an r.v, X the probability function is denoted by


P(X=x)

Remember Σ P(X=x) = 1

Random variables are denoted by capital letters and the particular values they take are
denoted by lower case letters.

Whatever the question is, always define what the random variable is.

The function that is responsible for allocating the probabilities P(X=x) is also known as the
probability density function (pdf)

Sometimes it can be expressed in a tabular form or in a formula.

The cumulative distribution function (cdf)


F(x) = P(X≤x)
F(last number) = 1

Expectation E(X)
E(X) = Σ x P(X=x)

E(X) is the expected value, the mean of the probability distribution.


We obtain this value of the expected mean by multiplying each score by its corresponding
probability and summing them.

This is a theoretical approach (the mean of a frequency distribution is an experimental
approach).

Note: Some probability distributions are symmetrical about a central value.


In this case the E(X) is the middle value.

A discrete random variable with pdf P(X=x) = k , for all given values of x, where k is a
constant is said to follow a Uniform Distribution

The Expectation of Any Function of X

The definition of expectation can be extended to any function of the r.v. X, such as X², 9X,
X - 4, 3X² - 5X

In general, if g(X) is a function of X, a discrete random variable, then

E[g(X)] = Σ g(x) P(X=x)

The following results hold when X is a discrete random variable and when both a and b are
constants

1. E(a) = a
2. E(aX) = aE(X)
3. E(aX + b) = aE(X) + b


The Variance of X

Var(X) = E(X²) - [E(X)]² where E(X) is the mean μ

Var(a) = 0
Var(aX) = a² Var(X)
Var(aX + b) = a² Var(X)
Var(aX ± bY) = a² Var(X) + b² Var(Y), provided X and Y are independent
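
The results for E and Var can be checked with the Python sketch below. It is illustrative only: the function names are hypothetical and the distribution used is the score on a fair die, stored as a {value: probability} dictionary with exact fractions.

```python
from fractions import Fraction


def expectation(distribution, g=lambda x: x):
    """E[g(X)] = Σ g(x) P(X = x) for a {value: probability} dictionary."""
    return sum(g(x) * p for x, p in distribution.items())


def variance(distribution):
    """Var(X) = E(X²) - [E(X)]²."""
    return expectation(distribution, lambda x: x * x) - expectation(distribution) ** 2


# Score on a fair die: each value has probability 1/6.
die = {x: Fraction(1, 6) for x in range(1, 7)}
assert sum(die.values()) == 1                     # Σ P(X = x) = 1

e_x = expectation(die)                            # 7/2
var_x = variance(die)                             # 35/12

# Distribution of 2X + 3, built directly from the die.
transformed = {2 * x + 3: p for x, p in die.items()}
assert expectation(transformed) == 2 * e_x + 3    # E(aX + b) = aE(X) + b
assert variance(transformed) == 2 ** 2 * var_x    # Var(aX + b) = a² Var(X)

print(e_x, var_x)   # 7/2 35/12
```

Adding 3 shifts every value by the same amount, so it changes the mean but not the variance.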

The Discrete Uniform Distribution


If the discrete random variable X is defined over the set of distinct values {x1, x2, x3 ... xn}
and each value is equally likely, then X has a discrete uniform distribution and

P(X = xr) = 1/n r = 1, 2, 3 ... n

X = the value of next outcome

If X is the discrete uniform variable and xr = r for r = 1, 2, 3 ... n (i.e. the x values start at 1
and progress up consecutively to n)

μ = E(X) = (n + 1)/2

σ² = Var(X) = (n + 1)(n - 1)/12
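
As a quick check of these two formulae, the Python sketch below computes the mean and variance directly from the definition for a few values of n. The function name is hypothetical and exact fractions are used to avoid rounding.

```python
from fractions import Fraction


def uniform_mean_and_variance(n):
    """Mean and variance of X taking values 1, 2, ..., n, each with probability 1/n."""
    p = Fraction(1, n)
    mean = sum(x * p for x in range(1, n + 1))
    var = sum(x * x * p for x in range(1, n + 1)) - mean ** 2
    return mean, var


for n in (2, 6, 10):
    mean, var = uniform_mean_and_variance(n)
    assert mean == Fraction(n + 1, 2)
    assert var == Fraction((n + 1) * (n - 1), 12)

print(uniform_mean_and_variance(6))   # (Fraction(7, 2), Fraction(35, 12))
```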

The Normal Distribution

Most important continuous distribution in statistics.

Seen in heights, masses, age etc.

The probability density function of the normal random variable is very complicated. The shape
of the curve depends on two parameters, mean and variance.

X ~ N(μ, σ²)

The distribution is bell shaped and symmetrical about the mean


Mean = median = mode
About 95% of the distribution lies within 2 sd's of the mean. About 99.7% lies within 3 sd's of the mean
It is a two parameter distribution. The probability of X relies only on μ and σ²

Area under curve = 1

We must standardise X to get the standard normal random variable (Z)

Z ~ N(0, 1)

Areas under the Curve


Use the tables to find values of Φ(a) for a in the interval 0 to 4
For values between -4 and 0 we use the symmetry of the normal distribution to find
appropriate probabilities.

P(Z < a) = Φ(a)

P(Z > a) = 1 - P(Z < a) = 1 - Φ(a)

P(Z < -a) = Φ(-a) = 1 - Φ(a) (by symmetry)

P(Z > -a) = Φ(a)

P(a < Z < b) = Φ(b) - Φ(a)

P(-a < Z < a) = P(|Z| < a) = 2Φ(a) - 1

P(|Z| > a) = 1 - P(|Z| < a) = 2 - 2Φ(a)

Use all four decimal places from table.

Special Probability Table

This contains z values for the normal variable Z~N(0,1) such that the r.v. exceeds z with
probability p:

P(Z > z) = p

You can use both tables in reverse to find the value of z, given a probability.

Transformation of any Normal Random Variable to a Standard Normal r.v.


If X~N(μ,σ²) then

Z = (X - μ) / σ where Z~N(0,1)

This is called standardising X to the standard normal r.v. Z
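
A minimal Python sketch of standardising is shown below. In place of printed tables it evaluates Φ with the error function from the standard `math` module; the function names and the example X ~ N(50, 16) are assumptions made for illustration.

```python
import math


def phi(z):
    """Standard normal cumulative probability Φ(z) = P(Z < z)."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))


def normal_probability_below(x, mu, sigma):
    """P(X < x) for X ~ N(mu, sigma²), found by standardising to Z."""
    z = (x - mu) / sigma
    return phi(z)


# Hypothetical example: X ~ N(50, 16), so sigma = 4.
p_below = normal_probability_below(58, 50, 4)                # z = 2
p_above = 1 - p_below                                        # P(X > 58)
p_between = p_below - normal_probability_below(46, 50, 4)    # P(46 < X < 58)

print(round(p_below, 4))     # 0.9772
print(round(p_above, 4))     # 0.0228
print(round(p_between, 4))   # 0.8186
```

Note that the second parameter of N(μ, σ²) is the variance, so the standard deviation 4 (not 16) is used when standardising.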

