
Lecture 4: Probability and Statistics

Introduction

Probability

Definitions and Axioms

Random Variables and PDFs

Important Distributions

Statistical Methods of Parameter Estimation

Mean and Sigma

Fitting

Error Propagation

Hypothesis Testing

$\chi^2$ Test

KS Test

Systematic Uncertainties

Visualizing Data

Introduction

Physics is based on experimental measurements

Must understand consistency, precision and accuracy of these measurements

Must also determine whether data is consistent with our theory and whether new physics could be hiding in our data

Statistics provides the tools to do this

Probability: Basic Definitions and Axioms

Define set S as the sample space with subsets A, B, ...

Probability P is a real-valued function defined by axioms:

1. For every subset A in S, $P(A) \ge 0$
2. For disjoint subsets (i.e. $A \cap B = \emptyset$), $P(A \cup B) = P(A) + P(B)$
3. $P(S) = 1$

Conditional Probability $P(A|B)$ (prob of A given B): $P(A|B) = P(A \cap B) / P(B)$
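
The axioms and the conditional probability can be checked on a small finite sample space. The sketch below assumes a fair six-sided die as the sample space and uses the standard definition P(A|B) = P(A and B) / P(B); the events `A`, `B`, `C` and the helper names are illustrative choices, not part of the lecture.

```python
from fractions import Fraction

# Sample space S for one roll of a fair six-sided die (an illustrative assumption).
S = frozenset(range(1, 7))

def prob(event):
    """P(event) for equally likely outcomes: favourable outcomes / all outcomes."""
    return Fraction(len(event & S), len(S))

def cond_prob(a, b):
    """P(A|B) = P(A and B) / P(B); only defined when P(B) > 0."""
    return prob(a & b) / prob(b)

A = frozenset({2, 4, 6})   # "roll is even"
B = frozenset({4, 5, 6})   # "roll is greater than 3"
C = frozenset({1})         # disjoint from A

print(prob(A) >= 0)                        # axiom 1
print(prob(A | C) == prob(A) + prob(C))    # axiom 2 for the disjoint sets A and C
print(prob(S) == 1)                        # axiom 3
print(cond_prob(A, B))                     # 2/3
```

Exact fractions are used so that the equalities in the axioms hold without floating-point rounding.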


Probability: Random Variables and PDFs

For continuous variables x define the probability density function $f(x;\theta)$ such that $f(x;\theta)\,dx$ is the probability that x lies between x and x+dx. Here $\theta$ represents one or more parameters. We won't bother to carry $\theta$ along here

Cumulative probability F(a):

$$ F(a) = \int_{-\infty}^{a} f(x)\,dx $$

For discrete variables replace integral with sum

Any function u(x) of the random variable x is itself a random variable, with expectation value

$$ E[u(x)] = \int_{-\infty}^{\infty} u(x)\,f(x)\,dx $$
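
PDF Moments: Mean, Variance

Mean:

$$ \mu = \int_{-\infty}^{\infty} x\,f(x)\,dx $$

Variance:

$$ \sigma^2 = \mathrm{Var}(x) = \int_{-\infty}^{\infty} x^2 f(x)\,dx - \mu^2 $$

Standard Deviation:

$$ \sigma = \sqrt{\mathrm{Var}(x)} $$

These basic definitions are used essentially everywhere. If we know the pdf, we know how to determine the mean and s.d.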
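
As a rough illustration of "if we know the pdf, we know the mean and s.d.", the sketch below integrates the moment definitions numerically. The exponential pdf, the constant `TAU`, and the midpoint-rule helper `integrate` are assumptions chosen for the example; for this pdf the exact answers are mean = tau and variance = tau squared.

```python
import math

def integrate(g, lo, hi, steps=100_000):
    """Midpoint-rule approximation of the integral of g from lo to hi."""
    h = (hi - lo) / steps
    return h * sum(g(lo + (i + 0.5) * h) for i in range(steps))

TAU = 2.0  # illustrative decay constant (an assumption, not from the lecture)

def pdf(x):
    """Exponential pdf f(x) = exp(-x/tau) / tau for x >= 0."""
    return math.exp(-x / TAU) / TAU

UPPER = 60.0 * TAU  # the tail beyond this point is negligible

norm = integrate(pdf, 0.0, UPPER)                                  # should be close to 1
mean = integrate(lambda x: x * pdf(x), 0.0, UPPER)                 # close to TAU
var = integrate(lambda x: x * x * pdf(x), 0.0, UPPER) - mean ** 2  # close to TAU**2
sd = math.sqrt(var)
print(norm, mean, var, sd)
```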

Binomial Distribution

Random process with two possible outcomes

P = prob of outcome #1, Q = 1-P = prob of outcome #2

In N trials, $\mu = NP$; $\sigma^2 = NPQ$

Example of use: Measurement of asymmetries

[Figure: Binomial PDF; Binomial Cumulative Prob]
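
The mean and variance quoted above can be checked directly from the binomial probabilities. This is a minimal sketch; the values of `N` and `P` are arbitrary illustrative choices.

```python
import math

def binomial_pmf(k, n, p):
    """Probability of exactly k outcomes of type #1 in n trials."""
    return math.comb(n, k) * p**k * (1.0 - p) ** (n - k)

N, P = 20, 0.3          # illustrative values (assumptions)
Q = 1.0 - P

pmf = [binomial_pmf(k, N, P) for k in range(N + 1)]
cdf = [sum(pmf[: k + 1]) for k in range(N + 1)]   # cumulative probability

mean = sum(k * pk for k, pk in enumerate(pmf))
var = sum((k - mean) ** 2 * pk for k, pk in enumerate(pmf))
print(mean, N * P)        # both close to 6.0
print(var, N * P * Q)     # both close to 4.2
```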

Poisson Distribution

Probability of finding exactly N events in an interval between x and x+dx if average rate in interval is $\lambda$

$$ P(N;\lambda) = \frac{\lambda^N e^{-\lambda}}{N!}, \qquad \mu = \sigma^2 = \lambda $$

For large $\lambda$, it approaches a Gaussian Distribution

Example of use: Estimate of error on number of events in bin of histogram
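
The sketch below compares Poisson probabilities with a Gaussian of the same mean and variance, and then applies the usual sqrt(N) error to histogram bins. The mean `lam` and the bin contents are made-up values for illustration.

```python
import math

def poisson_pmf(n, lam):
    """P(N = n) for a Poisson distribution with mean lam, computed in log space."""
    return math.exp(n * math.log(lam) - lam - math.lgamma(n + 1))

def gaussian_pdf(x, mu, sigma):
    """Gaussian density with mean mu and standard deviation sigma."""
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))

lam = 100.0  # illustrative mean (an assumption)
for n in (80, 100, 120):
    # The two columns agree best near the peak and differ more in the tails.
    print(n, poisson_pmf(n, lam), gaussian_pdf(n, lam, math.sqrt(lam)))

# Histogram bins: a bin holding n events is usually assigned an error of sqrt(n).
counts = [4, 25, 100, 400]            # made-up bin contents
errors = [math.sqrt(c) for c in counts]
print(errors)                         # [2.0, 5.0, 10.0, 20.0]
```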
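
Normal (Gaussian) Distribution

Normal distribution has mean $\mu$ and variance $\sigma^2$

A nice rule of thumb for estimating $\sigma$:

$$ \mathrm{FWHM} = 2\sqrt{2\ln 2}\,\sigma \approx 2.35\,\sigma $$

Example of use: Central limit theorem says that the sum of N independent random variables with finite variance approaches a Gaussian for large N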
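
A quick numerical check of both statements. The sketch assumes uniform random numbers as the underlying (non-Gaussian) distribution; `N`, `EXPERIMENTS` and the seed are arbitrary choices.

```python
import math
import random
import statistics

print(2.0 * math.sqrt(2.0 * math.log(2.0)))   # FWHM / sigma, about 2.355

# Central limit theorem: means of N uniform numbers look Gaussian for large N.
rng = random.Random(12345)      # fixed seed so the run is reproducible
N = 50                          # numbers averaged per experiment (assumption)
EXPERIMENTS = 20_000

means = [statistics.fmean(rng.random() for _ in range(N)) for _ in range(EXPERIMENTS)]
mu = statistics.fmean(means)
sigma = statistics.stdev(means)

# A uniform(0,1) variable has mean 1/2 and variance 1/12, so expect sigma = sqrt(1/(12 N)).
expected_sigma = math.sqrt(1.0 / (12.0 * N))
within_1_sigma = sum(abs(m - mu) < sigma for m in means) / EXPERIMENTS
print(mu, sigma, expected_sigma, within_1_sigma)   # the fraction should be near 0.68
```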

Statistical Estimators: Definitions

One aim of statistical analysis is to estimate the true value of one or more parameters from experimental data and to understand the uncertainty on that estimate.

Important properties of estimators are:

Consistency: As the amount of data becomes large, the estimate converges to the true value

Bias: Difference between expectation value of estimator and true value of parameter

Robustness: Estimator doesn't change much if true pdf differs from assumed pdf (e.g. tails in distribution)
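
Statistical Estimators: Mean and Sigma

Unbiased estimators of mean and sigma:

$$ \bar{x} = \frac{1}{N}\sum_{i=1}^{N} x_i, \qquad s^2 = \frac{1}{N-1}\sum_{i=1}^{N} (x_i - \bar{x})^2 $$

The eqs above assume that all measurements have the same $\sigma$.

If the $x_i$ have different $\sigma$'s, use the weighted mean

$$ \bar{x} = \frac{\sum_i w_i x_i}{\sum_i w_i} \quad \text{with } w_i = 1/\sigma_i^2 $$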
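
A minimal sketch of these estimators using only the Python standard library. The measurements are made up, `weighted_mean` is a hypothetical helper, and the weights are assumed to be the usual inverse-variance weights w_i = 1/sigma_i^2.

```python
import math
import statistics

x = [9.8, 10.4, 10.1, 9.7, 10.0]        # made-up measurements with a common sigma
mean = statistics.fmean(x)
s = statistics.stdev(x)                  # uses the 1/(N-1) normalisation
print(mean, s, s / math.sqrt(len(x)))    # estimate, spread, error on the mean

def weighted_mean(values, sigmas):
    """Weighted mean with w_i = 1/sigma_i^2 and its uncertainty 1/sqrt(sum of w_i)."""
    weights = [1.0 / sg**2 for sg in sigmas]
    total = sum(weights)
    centre = sum(w * v for w, v in zip(weights, values)) / total
    return centre, 1.0 / math.sqrt(total)

print(weighted_mean([10.0, 12.0], [1.0, 2.0]))   # pulled toward the more precise value
```

With these inputs the weighted mean is 10.4, closer to the measurement with the smaller uncertainty.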

Statistical Estimators: Fitting

Very often we don't want an estimate of the quantity we have measured, but rather a parameter of the pdf.

Technique here is to fit for best value of $\theta$ and its uncertainty

Most statistical analysis packages contain algorithms to do fitting

Several basic methods

Least square fitting

Likelihood

These all require minimization of a function. In complex cases, this minimization can only be done numerically

Standard package used in HEP is: Minuit

Minuit is included as part of ROOT
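
The Method of Least Squares

Goal is to minimize the scatter of data from fit function, taking uncertainties on data points into account.

This scatter is defined in terms of $\chi^2$.

We can write the $\chi^2$ in terms of our observables, here measurements $y_i$ with uncertainties $\sigma_i$ compared with a fit function $f(x_i;\theta)$:

$$ \chi^2 = \sum_i \frac{\left(y_i - f(x_i;\theta)\right)^2}{\sigma_i^2} $$

Minimize $\chi^2$ with respect to parameter $\theta$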
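
For a straight line the minimization can be done in closed form, which makes a compact illustration. The sketch below assumes the usual definition chi2 = sum of ((y_i - f(x_i)) / sigma_i)^2 with f(x) = a + b*x; the function `fit_line` and the data points are hypothetical, and this is not how Minuit works internally.

```python
import math

def fit_line(xs, ys, sigmas):
    """Minimise chi2 = sum(((y - (a + b*x)) / sigma)**2) analytically for a and b."""
    w = [1.0 / s**2 for s in sigmas]
    S = sum(w)
    Sx = sum(wi * x for wi, x in zip(w, xs))
    Sy = sum(wi * y for wi, y in zip(w, ys))
    Sxx = sum(wi * x * x for wi, x in zip(w, xs))
    Sxy = sum(wi * x * y for wi, x, y in zip(w, xs, ys))
    det = S * Sxx - Sx * Sx
    a = (Sxx * Sy - Sx * Sxy) / det          # intercept
    b = (S * Sxy - Sx * Sy) / det            # slope
    chi2 = sum(wi * (y - (a + b * x)) ** 2 for wi, x, y in zip(w, xs, ys))
    # Parameter uncertainties follow from the same sums.
    return a, b, math.sqrt(Sxx / det), math.sqrt(S / det), chi2

xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.1, 2.9, 5.2, 6.8, 9.1]               # made-up data scattered about y = 1 + 2x
sigmas = [0.2] * len(xs)
a, b, sigma_a, sigma_b, chi2 = fit_line(xs, ys, sigmas)
print(a, b, sigma_a, sigma_b, chi2, len(xs) - 2)   # last value: degrees of freedom
```

A chi2 close to the number of degrees of freedom suggests the scatter is compatible with the quoted uncertainties.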
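
Likelihood Distribution and Likelihood Fits

For N measurements of x, define the likelihood function as the joint pdf:

$$ L(\theta) = \prod_{i=1}^{N} f(x_i;\theta) $$

It's usually easier to work with ln(L) (since product becomes a sum) and then maximize:

$$ \ln L(\theta) = \sum_{i=1}^{N} \ln f(x_i;\theta) $$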
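
A small sketch of a likelihood fit, assuming an exponential pdf f(x; tau) = exp(-x/tau)/tau and simulated data. The grid scan stands in for a real minimizer; `TRUE_TAU`, the seed and the grid range are arbitrary choices for the example.

```python
import math
import random

rng = random.Random(2024)                 # fixed seed for reproducibility
TRUE_TAU = 1.5                            # illustrative "true" lifetime (assumption)
data = [rng.expovariate(1.0 / TRUE_TAU) for _ in range(500)]

def log_likelihood(tau, xs):
    """ln L(tau) = sum_i ln f(x_i; tau) with f(x; tau) = exp(-x/tau) / tau."""
    return sum(-x / tau - math.log(tau) for x in xs)

# Crude numerical maximisation: scan tau on a grid and keep the best value.
grid = [0.5 + 0.002 * i for i in range(1501)]      # tau from 0.5 to 3.5
tau_hat = max(grid, key=lambda tau: log_likelihood(tau, data))

# For this pdf the maximum is known analytically: it sits at the sample mean.
sample_mean = sum(data) / len(data)
print(tau_hat, sample_mean)
```

The scan result agrees with the sample mean to within the grid spacing, which is a useful sanity check before trusting a numerical fit on a pdf with no analytic solution.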

Correlated Variables

Often the variables we measure are not independent

Then, when doing the minimization, the correlations must be taken into account

This is the moral equivalent of taking a Jacobian

Define the covariance matrix V such that its inverse depends on the partial derivatives wrt parameters:

$$ (V^{-1})_{ij} = -\frac{\partial^2 \ln L}{\partial\theta_i\,\partial\theta_j} $$

Effect of Correlated Errors

Standard error ellipse for two parameters with a negative correlation

Slope $d\theta_i/d\theta_j$ related to correlation coefficient

Correlation matrix typically determined from data numerically during fitting process
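
Propagation of Errors

A good description is found on wikipedia:

http://en.wikipedia.org/wiki/Propagation_of_uncertainty

Basic expression for propagation is:

$$ \sigma_f^2 = \sum_i \sum_j \frac{\partial f}{\partial x_i}\,\frac{\partial f}{\partial x_j}\,V_{ij} $$

where $V_{ij}$ is the covariance matrix of the variables $x_i$.

In case where variables are uncorrelated, this reduces to the usual expression you have probably used in undergrad labs

$$ \sigma_f^2 = \sum_i \left(\frac{\partial f}{\partial x_i}\right)^2 \sigma_i^2 $$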
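
The propagation formula is easy to apply numerically. The sketch below assumes first-order (linear) propagation with derivatives taken by central differences; `propagate` and the observable `area` are hypothetical names, and the input values are made up.

```python
import math

def propagate(f, x, cov, eps=1e-6):
    """First-order propagation: sigma_f^2 = sum_ij (df/dx_i)(df/dx_j) cov[i][j]."""
    grad = []
    for i in range(len(x)):
        step = eps * max(1.0, abs(x[i]))
        up = list(x)
        dn = list(x)
        up[i] += step
        dn[i] -= step
        grad.append((f(up) - f(dn)) / (2.0 * step))   # central-difference derivative
    var = sum(grad[i] * grad[j] * cov[i][j]
              for i in range(len(x)) for j in range(len(x)))
    return math.sqrt(var)

def area(v):
    """Hypothetical observable: area of a rectangle with sides v[0] and v[1]."""
    return v[0] * v[1]

x = [2.0, 3.0]
uncorrelated = [[0.1**2, 0.0], [0.0, 0.2**2]]
rho = -0.5                                   # assumed correlation coefficient
c = rho * 0.1 * 0.2
correlated = [[0.1**2, c], [c, 0.2**2]]

print(propagate(area, x, uncorrelated))   # about 0.5 = sqrt((3*0.1)^2 + (2*0.2)^2)
print(propagate(area, x, correlated))     # smaller: the negative correlation partly cancels
```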

Introduction to Hypothesis Testing

Everything so far has been geared to finding best value of parameters and uncertainty under assumption that we know the pdf

Nothing in our procedure told us if data were consistent with the pdf

Need statistical tests of whether hypothesis is true:

Significance tests: How likely is it that a signal is just a fluctuation?

Goodness of Fit tests: Is the data consistent with coming from our proposed distribution?

Exclusion tests: How big a signal could be hiding in our data?

Significance Tests

Suppose we measure a value t for the data. How likely is it that we see a value that far or further from the prediction?

Or

Suppose we measure a distribution of data. How consistent is this distribution with the hypothesis?

We can use our friend $\chi^2$:

$$ \text{P-value} = \int_{\chi^2}^{\infty} f(z; n_d)\,dz $$

where $f(z; n_d)$ is the $\chi^2$ pdf for $n_d$ degrees of freedom.
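
The P-value integral can be evaluated numerically with the standard library alone. This sketch assumes the standard chi-square pdf and integrates its upper tail with Simpson's rule; the function names and the cut-off for the upper limit are choices made for the example, and an analysis package would normally supply this function.

```python
import math

def chi2_pdf(z, ndof):
    """Chi-square probability density for ndof degrees of freedom (z > 0)."""
    k = 0.5 * ndof
    log_pdf = (k - 1.0) * math.log(z) - 0.5 * z - k * math.log(2.0) - math.lgamma(k)
    return math.exp(log_pdf)

def chi2_p_value(chi2_obs, ndof, steps=20_000):
    """Integral of the chi-square pdf from chi2_obs upward, using Simpson's rule.

    Accuracy degrades for ndof = 1 with chi2_obs very close to 0, where the pdf is steep.
    """
    if chi2_obs <= 0.0:
        return 1.0
    # Stop far enough out that the neglected tail is negligible.
    upper = chi2_obs + 40.0 * math.sqrt(2.0 * ndof) + 100.0
    h = (upper - chi2_obs) / steps           # steps must be even for Simpson's rule
    total = chi2_pdf(chi2_obs, ndof) + chi2_pdf(upper, ndof)
    for i in range(1, steps):
        weight = 4.0 if i % 2 else 2.0
        total += weight * chi2_pdf(chi2_obs + i * h, ndof)
    return total * h / 3.0

print(chi2_p_value(2.0, 2))     # exact value for 2 degrees of freedom is exp(-1)
print(chi2_p_value(25.0, 10))   # a small P-value: poor agreement with the hypothesis
```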

Confidence Levels

Confidence Level: For single quantity, gives prob of events expected to be further away

For signal and background, gives improvement in prob due to signal

$$ CL_s = \frac{\text{p-value of signal plus background hypothesis}}{1 - \text{p-value of background}} $$

Kolmogorov-Smirnov (KS) Test

Depends on shape of distribution not normalization

Compares two distributions or data samples without needing any assumptions about distribution of data

Uses cumulative distributions and sees how deviations vary with x
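
The core of the test is the largest gap between two cumulative distributions. The sketch below computes that two-sample KS statistic from empirical cumulative distributions; it does not convert the statistic into a probability, and `ks_statistic` is a hypothetical helper name.

```python
import bisect

def ks_statistic(sample_a, sample_b):
    """Largest vertical distance between the two empirical cumulative distributions."""
    a, b = sorted(sample_a), sorted(sample_b)
    d_max = 0.0
    for x in a + b:                       # the distance can only change at data points
        cdf_a = bisect.bisect_right(a, x) / len(a)
        cdf_b = bisect.bisect_right(b, x) / len(b)
        d_max = max(d_max, abs(cdf_a - cdf_b))
    return d_max

print(ks_statistic([1, 2, 3, 4], [1, 2, 3, 4]))        # 0.0: identical samples
print(ks_statistic([1, 2, 3, 4], [10, 11, 12]))        # 1.0: no overlap at all
print(ks_statistic([0.1, 0.4, 0.7], [0.2, 0.5, 0.9]))  # about 0.33
```

Because each cumulative distribution is divided by its own sample size, the statistic is unaffected by the relative normalization of the two samples.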

Systematic Uncertainties

Systematic uncertainties are an art rather than a science.

Need to estimate how wrong your answer might be due to your setup or assumptions about your data or the physical process you are modeling

Examples:

Mis-modeling resolution

Unknown decay rates

Mis-calibration

Variations in conditions (temp, pressure, etc)

Usually assume Gaussian errors (don't know what else to do)

Typically quoted separately from statistical uncertainties


Visualizing Data
