
Physics 2049C Laboratory 1


Distribution Functions
Purpose
To develop techniques for the use of statistics in data analysis.
Apparatus
One sheet of polar graph paper, several sheets of linear graph paper, ten coins, carbon paper, and
a marble.
I. Preliminary Discussion
Typically an experiment on a physical system determines a parameter (call it x i ) for each
measurement. For example, if the experiment consists of studying the weight (or height) of
students in this laboratory section, then x i would be the weight (or height) of each student.
Alternatively the experiment could be quite different, such as studying the sum of the numbers
which come up when you throw two dice.
Whatever the experiment, experimenters are interested in the average value x̄ of the
measurements x_i obtained in the experiment and in the standard deviation σ, which is related to
the uncertainty in the measurement. To determine these quantities, it is convenient to introduce
the method of binning of data and plotting of histograms. To bin data, simply divide the range
of the parameter measurements (x_min, x_max) equally into N_bins bins and count the number of data
points n_j in the jth bin. Plotting n_j vs the bin index j gives the distribution function of the
measurement. Clearly the number of data points in each bin n_j will increase in proportion to the
total number of measurements N. If, instead, we plot f_j = n_j/N vs the bin value x_j, we get a
distribution function which is independent of N, called the normalized distribution function of
the measurement. Thus

$$\sum_j f_j = \sum_j \frac{n_j}{N} = \frac{N}{N} = 1$$
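As an illustration (not part of the lab apparatus), the binning procedure can be sketched in a few lines of Python; the helper name `bin_data` and the sample measurements are invented for this example:

```python
def bin_data(measurements, n_bins):
    """Bin measurements into n_bins equal-width bins; return the bin
    centers x_j and the normalized counts f_j = n_j / N."""
    x_min, x_max = min(measurements), max(measurements)
    width = (x_max - x_min) / n_bins
    counts = [0] * n_bins
    for x in measurements:
        # The top edge x_max falls in the last bin rather than past it.
        j = min(int((x - x_min) / width), n_bins - 1)
        counts[j] += 1
    N = len(measurements)
    centers = [x_min + (j + 0.5) * width for j in range(n_bins)]
    f = [n_j / N for n_j in counts]
    return centers, f

centers, f = bin_data([1.2, 1.9, 2.4, 3.1, 3.3, 4.8], 4)
# The normalized distribution sums to one, independent of N:
assert abs(sum(f) - 1.0) < 1e-12
```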
After determining the f_j for an experiment which consists of N independent measurements of the
parameter x, resulting in x_1, x_2, ..., x_N, it is straightforward to calculate the average value x̄ and
the standard deviation σ:

$$\bar{x} = \frac{\sum_j n_j x_j}{\sum_j n_j} = \sum_j f_j x_j$$

$$\sigma = \sqrt{\sum_j (x_j - \bar{x})^2 f_j}$$
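These two formulas can be sketched directly in Python; the helper name and the single-die example below are ours, not the handout's:

```python
import math

def mean_and_sigma(x_values, f_values):
    """Average x-bar and standard deviation sigma from bin values x_j
    and normalized counts f_j (illustrative helper)."""
    x_bar = sum(fj * xj for xj, fj in zip(x_values, f_values))
    variance = sum(fj * (xj - x_bar) ** 2 for xj, fj in zip(x_values, f_values))
    return x_bar, math.sqrt(variance)

# Example: a single fair die, x_j = 1..6 with f_j = 1/6 each
x_bar, sigma = mean_and_sigma(range(1, 7), [1 / 6] * 6)
# x_bar = 3.5 and sigma = sqrt(35/12) ≈ 1.71
```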

Note:

$$\sum_j (x_j - \bar{x})^2 f_j = \sum_j x_j^2 f_j - 2\bar{x}\sum_j x_j f_j + \bar{x}^2 \sum_j f_j = \sum_j x_j^2 f_j - \bar{x}^2$$

or

$$\sigma^2 = \overline{x^2} - \bar{x}^2$$

The standard deviation σ is a measure of the spread in the values of x_i from the average x̄. The
statistical uncertainty in n_j is Δn_j = √n_j.
A. An Example - Throwing Dice
Consider throwing two dice. What would you expect for the distribution of results for a large
number of throws? You might expect that double sixes would come up once for every 36 throws
on average, because there are 36 possible combinations for the dice (6 possibilities for each die or
6x6 = 36 for both dice) and only one combination gives double sixes. How many times would
you expect a total of seven to occur, on average? How many combinations give seven? A
statistical analysis of the two dice provides a useful example of the technique of developing a
distribution function which describes the system.
Suppose we throw two dice a large number of times N. We will get a distribution of totals x j
( j ranges from 1 to 11) between 2 and 12. What determines the distribution? We assume that
each face of the die is equally likely to appear on each throw. Thus the fraction of the time that
the value x_j appears on a particular throw is simply the ratio of the number of ways the two dice
can combine to produce x_j to the total number of combinations of the dice (36). To phrase this in
statistical terms, we call the value x_j, observed after a throw, the macroscopic state of the system
and each possible way such a total can be achieved a microscopic state of the system.
Let's determine the microscopic states for x_j = 3. Writing each result as an ordered pair (first die,
second die), there are 2 microscopic states, (1,2) and (2,1), which will result in the macroscopic
state x_j = 3. How about for x_j = 7? The possible microscopic states are (1,6), (2,5), (3,4), (4,3),
(5,2), and (6,1), for a total of six states. The x_j with the largest number of ways of being attained
(the most microscopic states) will be the most probable macroscopic state. The probability that
x_j = 3 will appear after a throw of the dice is 2/36 = 1/18, while the probability that x_j = 7 will
appear is 6/36 = 1/6.
The whole system is summarized in the following table and histogram of the normalized
distribution function (each f_j is the number of microscopic states divided by 36):

    x_j (total)               2   3   4   5   6   7   8   9  10  11  12
    microscopic states n_j    1   2   3   4   5   6   5   4   3   2   1
Notice that the sum of the microscopic states for this system is 36 and that the sum of the
probabilities for each macroscopic state to occur adds to one, or j f j = 1. You can see from the
table that the most probable macroscopic state is the one ( x j = 7 ) with the most microscopic
states.
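The counting argument above is easy to check by brute force; this short Python sketch (ours, not part of the handout) enumerates all 36 microscopic states of the two dice:

```python
from collections import Counter

# Tally the macroscopic state (the total) produced by each of the
# 36 microscopic states (ordered pairs of die faces).
micro = Counter(a + b for a in range(1, 7) for b in range(1, 7))

f = {total: count / 36 for total, count in micro.items()}

# The most probable macroscopic state is the one with the most microstates:
assert max(micro, key=micro.get) == 7
assert micro[3] == 2 and micro[7] == 6
assert abs(sum(f.values()) - 1.0) < 1e-12
```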
II. Experiment
A. Statistical Analysis of Coin Toss
Experimental Procedure

Toss ten coins N = 100 times and record the number of heads for each toss. The total
number of heads which appears on each toss is the macroscopic state xj. There are 11
possible macroscopic states 0 through 10. Calculate and record the number of
measurements nj which fall into each macroscopic state.

Calculate and record the distribution function fj = nj /N for each macroscopic state. Plot
the distribution function fj vs xj as a histogram.

Using the formulas of Part I, calculate x̄, σ, and the most probable macroscopic state
x_j. (Remember this is the one for which n_j is the greatest.)
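If you want to preview the procedure before doing it with real coins, it can be simulated; the seed and variable names below are an illustrative sketch, not part of the lab:

```python
import random
from collections import Counter

random.seed(0)  # fixed seed so the illustration is reproducible
N = 100

# Toss ten fair coins N times; record the number of heads per toss.
tosses = [sum(random.randint(0, 1) for _ in range(10)) for _ in range(N)]

n = Counter(tosses)                   # n_j for each macroscopic state x_j
f = {x: n[x] / N for x in range(11)}  # normalized distribution f_j
x_bar = sum(x * fx for x, fx in f.items())
sigma = sum((x - x_bar) ** 2 * fx for x, fx in f.items()) ** 0.5
most_probable = max(n, key=n.get)
```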
Comparison of Experimental Results with Statistical Expectations

Using the same sort of analysis as we did for the throwing of dice, we can calculate the statistical
expectation for the distribution function f_j^st based on the assumption that each coin is equally
likely to come up heads or tails. As before, the probability that the coins give x_j heads on a
particular toss is f_j^st and is equal to the number of possible microscopic states for a particular x_j

divided by the total number of microscopic states for all macroscopic states. Remember for the
microscopic state we must keep track of which coin is doing what. For example, for the state
with one heads, we could have (if you imagine all coins to be numbered) number one heads and
the rest tails, or we could have number 2 heads and the rest tails, etc. Thus there are 10 possible
states which give 1 heads. It is easy to count the number of microscopic states for the low
probability macroscopic states. However, for the higher probability states, the counting is difficult.
The binomial distribution gives us a handy formula for counting the number of microscopic states
corresponding to a particular xj (number of heads). It is C!/{(C-xj)! xj!} where C is the number of
coins.
What is the total number of microscopic states? Each coin can have two states, and there are 10
coins, thus there are 2^10 = 1024 total microscopic states.
Calculate and record

$$f_j^{st} = \frac{C!}{(C - x_j)!\, x_j!\, 2^{10}}$$

for every macroscopic state x_j.

Plot f_j^st on your experimentally determined distribution for each value of x_j and smoothly
connect the values of f_j^st.
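A quick Python check of this formula, using the standard-library `math.comb` for the binomial coefficient (the variable names are ours):

```python
from math import comb

C = 10  # number of coins

# Statistical expectation f_j^st = C! / ((C - x_j)! x_j! 2^C)
f_st = [comb(C, x) / 2**C for x in range(C + 1)]

# The expected distribution is normalized and peaks at x_j = 5:
assert abs(sum(f_st) - 1.0) < 1e-12
assert f_st.index(max(f_st)) == 5
# e.g. f_0^st = 1/1024 and f_5^st = 252/1024
```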


B. Study of a Continuous Distribution
In this part we want to study continuous distributions. In the previous sections each data point
was an integer and the width of a "bar" on the histogram was one integer wide. The width of a
"bar" on a histogram is called the bin width Δr and is the range of values along the abscissa which
are plotted as one ordinate. To normalize a histogram for comparison with a continuous
distribution which is normalized to unity, ∫ f(r) dr = 1, it is necessary to define the bin height as
n_j/(N Δr), where n_j is the number of events in the bin, N is the total number of events, and Δr is
the bin width. The area under the histogram is

$$A = \sum_j \frac{n_j}{N\,\Delta r}\,\Delta r = \frac{1}{N}\sum_j n_j = 1$$

as required for comparison with a continuous distribution normalized to unity.
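This normalization can be sketched numerically; the counts and bin width below are invented for illustration:

```python
# Bin heights are n_j / (N * dr), so the area under the histogram is 1.
counts = [3, 9, 17, 25, 21, 14, 8, 3]  # hypothetical n_j for 8 rings
dr = 0.5                               # bin width (one graph-paper division)
N = sum(counts)

heights = [n_j / (N * dr) for n_j in counts]
area = sum(h * dr for h in heights)
assert abs(area - 1.0) < 1e-12
```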


It is quite often the case with limited amounts of data that the statistical fluctuations between
adjacent data values hide the overall shape of the data. In the following experiment you are to
vary the bin width of your data to obtain the smoothest possible histogram to compare with a
theoretical curve.
Experimental Procedure

Tape a sheet of polar graph paper face up on the floor. Pitch a marble at the center of the
graph paper a couple of times while adjusting your distance from the paper so that 1) you
seldom miss the paper entirely, and 2) you are unable to hit consistently too close to the
center. Fig. 1 shows a practical arrangement you can try with another person to facilitate
data taking for both of you.

When you have found a good tossing distance, tape a sheet of carbon paper, carbon side
up, underneath the graph paper (lines down). Mark a target on the back side of the graph
paper. Do not move the paper to number your data points for this experiment, but keep
track of the total throws you make including any which miss the target altogether. After
making 50 data points, exchange positions with your lab partner to reduce systematic
biases in your data and make 50 more data points.
Experimental Analysis

After making N = 100 data points, count the number of hits n(r) in each ring between r
and r + Δr, where r is the distance from the center of the target and Δr is a division on your
graph paper small enough to give you 8 to 10 bins for histograms. Make a histogram of
n(r) versus the radius of the ring r. The effective r for each bin is in the center of the
interval. You will need to decide how to count hits which overlap two rings, but be
consistent and count all hits.

If you do not get a smooth histogram resembling Fig. 2 in the Appendix, plot a new
histogram, adjusting the bin width Δr to remove statistical fluctuations but to keep the
functional dependence on r.

Normalize your final histogram having the best bin width (that is, plot f(r) = n(r)/(N Δr)).

Calculate and record on your plot of f(r) the experimentally determined r̄ and σ using
the equations from Part I.

The Appendix gives a theoretical fit for your data as

$$f(r) = \frac{r}{R^2}\, e^{-r^2/2R^2}$$

where R is a constant which is determined from your experimental distribution. The Appendix
shows that R is related to your experimentally determined r̄ as R = √(2/π) r̄.

Using R = √(2/π) r̄, plot the theoretical f(r) on top of the normalized histogram of your
data.
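A sketch of this comparison step in Python; the measured mean r̄ = 2.1 and the bin centers are invented numbers for illustration:

```python
import math

r_bar = 2.1  # hypothetical experimental mean radius
R = math.sqrt(2 / math.pi) * r_bar

def f_theory(r):
    """Theoretical fit f(r) = (r / R^2) * exp(-r^2 / (2 R^2))."""
    return (r / R**2) * math.exp(-r**2 / (2 * R**2))

# Evaluate the curve at the bin centers used for the histogram:
bin_centers = [0.25 + 0.5 * j for j in range(8)]
curve = [f_theory(r) for r in bin_centers]
```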

Appendix - Radial Distributions


If a person were dropping marbles onto a target trying to hit the center, the distribution of hits
F(r) versus the distance r from the center of the target would be expected to follow a Gaussian
curve of the form
$$F(r) \propto e^{-r^2/k} \qquad (1)$$

if the variations of the hits were randomly distributed. In Eq.(1), k is a constant. F(r) is shown in
Fig. 1. However, the area between r and r + Δr becomes greater as r becomes greater. Hence
there is an increasing chance of hitting in an interval r to r + Δr as r becomes greater.
The area between r and r + Δr is

$$\Delta A = \pi (r + \Delta r)^2 - \pi r^2 \qquad (2)$$

and ΔA ≈ 2πr Δr if Δr is small compared to r. A ring is defined by the position of its center, its
inner radius, and its width. If Δr is a fixed interval, then the area of a ring centered on the target
(r = 0) with inner radius r and width Δr increases linearly as r, its inner radius, increases. The
distribution of hits in an interval Δr is defined as f(r). The distribution f(r) versus r should be
proportional to r F(r):
$$f(r) \propto r\, e^{-r^2/k} \qquad (3)$$

The expected graph for hits on the target is shown in Fig. 2 where r = R is the maximum value of
f(r).

To find the constants in Eq. (3), we normalize f(r),

$$\int_0^\infty f(r)\, dr = \alpha \int_0^\infty r\, e^{-r^2/k}\, dr = 1 \qquad (4)$$

with α being the constant of proportionality. It is also necessary to set the first derivative equal to
zero, since f(r) is a maximum at r = R:
$$\left.\frac{df(r)}{dr}\right|_{r=R} = 0 \qquad (5)$$
Solving these two equations, we can determine α and k and obtain

$$f(r) = \frac{r}{R^2}\, e^{-r^2/2R^2} \qquad (6)$$

To compare an experimental plot of data points with Eq. (6), it is necessary to determine the
constant R for the data. This may be done by comparing to the experimentally determined r̄.
The theoretical average r̄ is given as

$$\bar{r} = \int_0^\infty r\, f(r)\, dr = \frac{1}{R^2} \int_0^\infty r^2\, e^{-r^2/2R^2}\, dr = R\sqrt{\pi/2} \qquad (7)$$

If r̄ is known for any set of data, then R is found by R = √(2/π) r̄.
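Equations (4), (6), and (7) can be verified numerically; this crude midpoint-rule check (with R = 1 chosen arbitrarily) is ours, not part of the handout:

```python
import math

R = 1.0  # scale constant, chosen arbitrarily for the check

def f(r):
    """Eq. (6): f(r) = (r / R^2) * exp(-r^2 / (2 R^2))."""
    return (r / R**2) * math.exp(-r**2 / (2 * R**2))

# Midpoint rule over 0..10R (the tail beyond is negligible):
dr = 1e-4
rs = [(k + 0.5) * dr for k in range(int(10 * R / dr))]
norm = sum(f(r) for r in rs) * dr        # should be ~1, per Eq. (4)
mean = sum(r * f(r) for r in rs) * dr    # should be ~R*sqrt(pi/2), per Eq. (7)
```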
