UNIVERSITY
Math assignment
Group members
NAMES OLARA ALLAN ARIKOD RICHARD MUKIIZA JULIUS NDARAMA MICHEAL SIMON TWEHEYO DISHAN BULUMA MELINDA NAMIYA MARIAM SSONKO EMMANUEL BUYINZA ABBEY ARIKE PATRICK TUKAMUSHABA EMMY ANGURA GABRIEL SEMAHORO ALLAN NANKABIRWA ROSE MUTONGOLE SAMUEL NAMPEERA ROBINAH KIGONYA ALLAN OCAN GEOFREY MUTYOGOMA MAUYA DRATELE SIGFRIED BUDRA MUKIIBI SSEMAKULA PETER KINENE SERWANGA BRIAN OLUKA PATRICK
REGISTRATION NUMBER 10/U/683 10/U/657 10/U/671 10/X/3007/PSA 10/U/690 10/U/662 10/U/676 10/U/9979/PSA 10/U/663 10/U/9989/PSA 08/U/3053/PSA 10/U/9946/PSA 10/U/9945/PSA 10/U/678 10/U/9965/PSA 10/U/677 10/U/668 10/U/9971/PSA 10/U/9998/PS 10/U/1914 10/U/9964/PSA 10/U/687 10/U/1002/PS
STUDENT NUMBER 210001123 210001135 210001151 210004611 210001016 210000809 210000345 210005498 210000879 210018734 208006302 210018907 210006460 210000348 210009531 210001032 210000683 210017525 210006589 210001946 210006993 210001541 210006598
The posterior distribution combines information from both the subjective prior distribution and the objective sample distribution, and it expresses the degree of belief in the location of the parameter $\theta$ after the sample has been observed. If we denote by $f(x_1, x_2, \dots, x_n \mid \theta)$ the joint probability distribution of the sample, conditional on the parameter, in a situation where $\theta$ is a random variable with prior density $\pi(\theta)$, then the joint distribution of the sample and the parameter is

$$f(x_1, x_2, \dots, x_n, \theta) = f(x_1, x_2, \dots, x_n \mid \theta)\,\pi(\theta).$$
Note: the mean of the posterior distribution, denoted by $\theta^{*}$, is called the Bayes estimate of $\theta$. The density $\pi(\theta \mid x_1, x_2, \dots, x_n)$ is called the posterior density.
Consider the Bayes estimation of the probability $p$ of an event, where $p$ is a realization of a random variable with probability density function $h(p)$ whose range is $[0, 1]$. A prior estimate of $p$ can be obtained from the prior mean,

$$p_0 = \int_0^1 p\,h(p)\,dp. \qquad (1)$$
To improve on the estimate of $p$, we conduct an experiment of tossing a die $n$ times and observing the number of aces to be $k$. Applying Bayes' theorem, the posterior density is written as

$$h(p \mid k) = \frac{P(k \mid p)\,h(p)}{B}, \qquad (2)$$

where

$$B = \int_0^1 P(k \mid p)\,h(p)\,dp, \qquad (3)$$

and, since the number of aces in $n$ independent tosses is binomially distributed,

$$h(p \mid k) = \frac{\binom{n}{k} p^k (1-p)^{n-k}\,h(p)}{\int_0^1 \binom{n}{k} p^k (1-p)^{n-k}\,h(p)\,dp}. \qquad (4)$$
Assuming that $p$ is uniformly distributed in the range $[0, 1]$, so that $h(p) = 1$ for $0 \le p \le 1$, equation (4) can be simplified. The integral

$$\int_0^1 p^k (1-p)^{n-k}\,dp = \frac{k!\,(n-k)!}{(n+1)!}$$

can be established by mathematical induction (repeated integration by parts). Substituting this into (4), we can express the conditional (posterior) density as

$$h(p \mid k) = \frac{(n+1)!}{k!\,(n-k)!}\,p^k (1-p)^{n-k}, \qquad 0 \le p \le 1,$$

which is a beta density with parameters $k+1$ and $n-k+1$.
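As a quick illustration of this result, the following Python sketch (not part of the original assignment; the sample values $n = 12$, $k = 3$ are made up) computes the posterior of $p$ on a grid for a uniform prior and compares the numerical Bayes estimate with the analytic mean $(k+1)/(n+2)$ of the beta posterior.

```python
import numpy as np

# Sketch: posterior of p after observing k aces in n die tosses,
# assuming a uniform prior h(p) = 1 on [0, 1] (values below are illustrative).
n, k = 12, 3

p = np.linspace(0.0, 1.0, 10001)
likelihood = p**k * (1.0 - p)**(n - k)             # binomial kernel; constants cancel
posterior = likelihood / np.trapz(likelihood, p)   # normalise: equation (4) with h(p) = 1

bayes_estimate = np.trapz(p * posterior, p)        # posterior mean = Bayes estimate of p
analytic_mean = (k + 1) / (n + 2)                  # mean of the Beta(k+1, n-k+1) density

print(f"numerical Bayes estimate: {bayes_estimate:.4f}")
print(f"analytic (k+1)/(n+2):     {analytic_mean:.4f}")
```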
Bayesian methods of estimation concerning the mean $\mu$ of a normal population are based on the following theorem. If $\bar{x}$ is the mean of a random sample of size $n$ from a normal population with known variance $\sigma^2$, and the prior distribution of the population mean is a normal distribution with prior mean $\mu_0$ and prior variance $\sigma_0^2$, then the posterior distribution of the population mean is also a normal distribution, with mean $\mu^*$ and standard deviation $\sigma^*$, where

$$\mu^* = \frac{n\bar{x}\,\sigma_0^2 + \mu_0\,\sigma^2}{n\sigma_0^2 + \sigma^2}
\qquad\text{and}\qquad
\sigma^* = \sqrt{\frac{\sigma_0^2\,\sigma^2}{n\sigma_0^2 + \sigma^2}}.$$
The posterior mean $\mu^*$ is the Bayes estimate of the population mean $\mu$, and a $100(1-\alpha)\%$ Bayesian interval for $\mu$ can be constructed by computing the interval

$$\mu^* - z_{\alpha/2}\,\sigma^* < \mu < \mu^* + z_{\alpha/2}\,\sigma^*,$$
which is centred at the posterior mean and contains $100(1-\alpha)\%$ of the posterior probability.

Example
An electrical firm manufactures light bulbs whose length of life is approximately normally distributed with a standard deviation of 100 hours. Prior experience leads us to believe that $\mu$ is a value of a normal random variable with mean $\mu_0 = 800$ hours and standard deviation $\sigma_0 = 10$ hours. If a random sample of 25 bulbs has an average life of $\bar{x} = 780$ hours, find a 95% Bayesian interval for $\mu$.

Solution
The posterior distribution of the mean is also normal, with mean

$$\mu^* = \frac{(25)(780)(10)^2 + (800)(100)^2}{(25)(10)^2 + (100)^2} = 796$$

and standard deviation

$$\sigma^* = \sqrt{\frac{(10)^2(100)^2}{(25)(10)^2 + (100)^2}} = \sqrt{80} \approx 8.944.$$
The 95% Bayesian interval for $\mu$ is then given by

$$796 - (1.96)(8.944) < \mu < 796 + (1.96)(8.944),$$

that is, $778.5 < \mu < 813.5$.
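The numbers above can be reproduced with a short Python sketch (illustrative only; it simply applies the formulas for $\mu^*$ and $\sigma^*$ quoted in the theorem):

```python
import math

# Sketch (not part of the original assignment): reproduce the light-bulb example
# using the posterior-mean and posterior-variance formulas quoted above.
n, xbar = 25, 780.0          # sample size and sample mean (hours)
sigma = 100.0                # known population standard deviation
mu0, sigma0 = 800.0, 10.0    # prior mean and prior standard deviation
z = 1.96                     # z-value for a 95% interval

mu_star = (n * xbar * sigma0**2 + mu0 * sigma**2) / (n * sigma0**2 + sigma**2)
sigma_star = math.sqrt((sigma0**2 * sigma**2) / (n * sigma0**2 + sigma**2))

lower, upper = mu_star - z * sigma_star, mu_star + z * sigma_star
print(f"posterior mean mu* = {mu_star:.1f}, sd sigma* = {sigma_star:.3f}")
print(f"95% Bayesian interval: ({lower:.1f}, {upper:.1f})")   # about (778.5, 813.5)
```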
By ignoring the prior information about $\mu$ in the example above, one could instead construct the classical 95% confidence interval

$$780 - (1.96)\left(\frac{100}{\sqrt{25}}\right) < \mu < 780 + (1.96)\left(\frac{100}{\sqrt{25}}\right),$$

or $740.8 < \mu < 819.2$, which is observed to be wider than the corresponding Bayesian interval.

Advantages of Bayesian Estimation
i. The prior probability allows you to incorporate existing knowledge about a particular hypothesis. If, in a study on lemur foraging, a similar study was conducted 20 years ago, you can incorporate that earlier study into your scientific decision-making, on the view that science works by building on established knowledge.
ii. The name of the game is to compare the relative probability of competing hypotheses, and the Bayesian framework allows you to make that comparison.
Disadvantages of Bayesian Estimation
i. You can get very different posterior distributions by changing which parameters are given uninformative priors. In other words, there are some tricky mechanical issues.
ii. The frequentist framework is well suited to the Popperian view of science because it allows you to falsify hypotheses. Under Bayesian statistics there is no such thing as falsification, only relative degrees of belief.
iii. Frequentist statistics is "easy" and has accepted conventions of method and notation. The same can't be said of Bayesian statistics, which requires a good understanding of probability and likelihood.
VECTOR RANDOM VARIABLES

A random matrix (or random vector) is a matrix (vector) whose elements are random variables; its elements are jointly distributed. Two random matrices $X_1$ and $X_2$ are independent if the elements of $X_1$ (as a collection of random variables) are independent of the elements of $X_2$, but the elements within $X_1$ or $X_2$ do not have to be independent. Similarly, a collection of random matrices $X_1, \dots, X_k$ is independent if their respective collections of random elements are (mutually) independent. (Again, the elements within any of the random matrices need not be independent.) Likewise, an infinite collection of random matrices is independent if every finite sub-collection is independent.

Expectation (mean) of a random matrix

The expected value or mean of an $m \times n$ random matrix $X$ is the $m \times n$ matrix $E(X)$ whose elements are the expected values of the corresponding elements of $X$, assuming that they all exist. That is, if $X = [x_{ij}]$, then $E(X) = [E(x_{ij})]$.
Properties:
$E(X^{T}) = E(X)^{T}$.
If $X$ is square, $E(\operatorname{tr}(X)) = \operatorname{tr}(E(X))$.
If $a$ is a constant, $E(aX) = a\,E(X)$.
$E(\operatorname{vec}(X)) = \operatorname{vec}(E(X))$.
If $A$ and $B$ are constant matrices, $E(AXB) = A\,E(X)\,B$.
$E(X_1 + X_2) = E(X_1) + E(X_2)$.
If $X_1$ and $X_2$ are independent, $E(X_1 X_2) = E(X_1)\,E(X_2)$.
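As a rough numerical illustration of the property $E(AXB) = A\,E(X)\,B$ (this sketch is not part of the assignment; the matrices $A$ and $B$ and the distribution of $X$ are arbitrary choices), averaging simulated draws commutes with the linear map $X \mapsto AXB$:

```python
import numpy as np

# Sketch (illustrative): finite-sample analogue of E(AXB) = A E(X) B for a
# random 2x3 matrix X and constant matrices A (2x2) and B (3x3).
rng = np.random.default_rng(0)
A = np.array([[1.0, 2.0], [0.0, -1.0]])
B = rng.normal(size=(3, 3))

n_draws = 200_000
X = rng.normal(loc=1.0, scale=2.0, size=(n_draws, 2, 3))  # i.i.d. normal elements

lhs = (A @ X @ B).mean(axis=0)      # sample average of AXB over the draws
rhs = A @ X.mean(axis=0) @ B        # A (sample average of X) B

print(np.max(np.abs(lhs - rhs)))    # essentially 0, up to floating-point error
```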
Covariance

Covariance measures the relationship between two random variables. If three or more random variables are jointly distributed, one must consider the covariance for all possible pairs. For three jointly distributed random variables $x$, $y$ and $z$ there are specifically three covariances: $\sigma_{xy}$ for $x$ and $y$, $\sigma_{yz}$ for $y$ and $z$, and $\sigma_{xz}$ for $x$ and $z$. Thus, in dealing with $m$ jointly distributed random variables, it is convenient to collect them into a single vector.
A random vector is one whose components are jointly distributed random variables. Therefore, if $x_1, x_2, \dots, x_m$ are $m$ jointly distributed random variables, the vector $x = [x_1, x_2, \dots, x_m]^{T}$ is a random vector.
The variance-covariance matrix of the random vector $x$ is

$$\Sigma_{xx} = E\!\left[(x - \mu_x)(x - \mu_x)^{T}\right] =
\begin{bmatrix}
\sigma_1^2 & \sigma_{12} & \cdots & \sigma_{1m} \\
\sigma_{21} & \sigma_2^2 & \cdots & \sigma_{2m} \\
\vdots & \vdots & \ddots & \vdots \\
\sigma_{m1} & \sigma_{m2} & \cdots & \sigma_m^2
\end{bmatrix}.$$

Note: the variances of the individual random variables form the main diagonal of $\Sigma_{xx}$; the off-diagonal elements are the covariances, and $\Sigma_{xx}$ is symmetric since $\sigma_{ij} = \sigma_{ji}$.
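A small simulation sketch (illustrative values only, not from the assignment) shows how $\Sigma_{xx}$ can be estimated from repeated draws of a random vector, with the variances appearing on the main diagonal:

```python
import numpy as np

# Sketch (illustrative): estimate the variance-covariance matrix of a random
# vector x = [x1, x2, x3]^T from simulated draws; the diagonal holds the variances.
rng = np.random.default_rng(1)
true_cov = np.array([[4.0, 1.2, 0.0],
                     [1.2, 9.0, -2.0],
                     [0.0, -2.0, 1.0]])
samples = rng.multivariate_normal(mean=[0.0, 0.0, 0.0], cov=true_cov, size=100_000)

sigma_xx = np.cov(samples, rowvar=False)   # sample estimate of Sigma_xx
print(np.round(sigma_xx, 2))               # close to true_cov; diagonal ~ (4, 9, 1)
```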
If the random variables in $x$ are uncorrelated, all covariance (off-diagonal) elements of $\Sigma_{xx}$ are zero and the matrix is diagonal. The relationship between the weight matrix $W$ and the corresponding variance-covariance matrix, with subscripts added to indicate reference to the random vector $x$, is restated as

$$W_{xx} = \sigma_0^2\,\Sigma_{xx}^{-1},$$

where $\sigma_0^2$ is the reference variance. Caution: if $W_{xx}$ is non-diagonal, the simple weights calculated in
$$W_1 = \sigma_0^2/\sigma_1^2, \quad W_2 = \sigma_0^2/\sigma_2^2, \quad \dots, \quad W_m = \sigma_0^2/\sigma_m^2 \qquad (4\text{-}16)$$

are not to be used as diagonal elements of $W_{xx}$; only when $W_{xx}$ is diagonal are the weights calculated in (4-16) identical to its diagonal elements.

Example 1
Two observations are represented by the random vector $x = [x_1, x_2]^{T}$. The variances of $x_1$ and $x_2$ are $\sigma_1^2$ and $\sigma_2^2$ respectively, the covariance of $x_1$ and $x_2$ is $\sigma_{12}$, and the correlation coefficient is $\rho_{12}$.
(a) For a selected reference variance $\sigma_0^2$, derive the weight matrix of $x$ in terms of the given parameters.
(b) Show that the weights calculated in (4-16) are identical to the diagonal elements of the weight matrix only when $\sigma_{12} = 0$.

Solution
(a) The weight matrix of $x$ is

$$W_{xx} = \sigma_0^2\,\Sigma_{xx}^{-1}
= \sigma_0^2 \begin{bmatrix} \sigma_1^2 & \sigma_{12} \\ \sigma_{12} & \sigma_2^2 \end{bmatrix}^{-1}
= \frac{\sigma_0^2}{\sigma_1^2\sigma_2^2 - \sigma_{12}^2}
\begin{bmatrix} \sigma_2^2 & -\sigma_{12} \\ -\sigma_{12} & \sigma_1^2 \end{bmatrix}.$$

(b) Since $\sigma_{12} = \rho_{12}\sigma_1\sigma_2$, the diagonal elements of $W_{xx}$ are

$$\frac{\sigma_0^2\,\sigma_2^2}{\sigma_1^2\sigma_2^2 - \sigma_{12}^2} = \frac{\sigma_0^2}{\sigma_1^2(1-\rho_{12}^2)}
\qquad\text{and}\qquad
\frac{\sigma_0^2\,\sigma_1^2}{\sigma_1^2\sigma_2^2 - \sigma_{12}^2} = \frac{\sigma_0^2}{\sigma_2^2(1-\rho_{12}^2)},$$

and these reduce to $W_1 = \sigma_0^2/\sigma_1^2$ and $W_2 = \sigma_0^2/\sigma_2^2$ only when $\rho_{12} = 0$, i.e. only when $\sigma_{12} = 0$.
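The conclusion of Example 1 can be checked numerically with the sketch below (the values of $\sigma_0^2$, $\sigma_1$, $\sigma_2$ and $\rho_{12}$ are arbitrary illustrations):

```python
import numpy as np

# Sketch (illustrative values): with correlated observations, the simple weights
# sigma0^2 / sigma_i^2 differ from the diagonal elements of W_xx.
sigma0_sq = 1.0
s1, s2, rho = 2.0, 3.0, 0.5          # sigma_1, sigma_2 and correlation rho_12
s12 = rho * s1 * s2                  # covariance sigma_12

Sigma_xx = np.array([[s1**2, s12],
                     [s12,   s2**2]])
W_xx = sigma0_sq * np.linalg.inv(Sigma_xx)

simple_weights = np.array([sigma0_sq / s1**2, sigma0_sq / s2**2])
print("diag(W_xx):    ", np.diag(W_xx))   # 1/(s_i^2 (1 - rho^2)) on the diagonal
print("simple weights:", simple_weights)  # equal to diag(W_xx) only if rho = 0
```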
When $\sigma_{12} \neq 0$, the weights $W_1$ and $W_2$ cannot be used as diagonal elements of $W_{xx}$. Each element of $\Sigma_{xx}$ can be divided by $\sigma_0^2$ to yield a scaled version of $\Sigma_{xx}$ called $Q_{xx}$, the cofactor matrix of $x$:

$$Q_{xx} = \frac{1}{\sigma_0^2}\,\Sigma_{xx},
\qquad\text{so that}\qquad
\Sigma_{xx} = \sigma_0^2\,Q_{xx}.$$
$Q_{xx}$ is also called the relative covariance matrix.

The variance-covariance matrix (or covariance matrix) of an $m \times 1$ random vector $x$ is the $m \times m$ matrix $V(x)$ (also written $\operatorname{Var}(x)$ or $\operatorname{Cov}(x)$) defined by

$$V(x) = E\!\left[(x - E(x))(x - E(x))^{T}\right],$$

when the expectations all exist. Also, if $x = [x_1, x_2, \dots, x_m]^{T}$, the $(i, j)$ element of $V(x)$ is $\operatorname{Cov}(x_i, x_j)$ and the $(i, i)$ element is $\operatorname{Var}(x_i)$.
In particular, $V(x)$ is symmetric, and it is diagonal if the elements of $x$ are independent.

Properties (when all expectations exist):
If $a$ is a constant, $V(ax) = a^2\,V(x)$.
If $A$ is a constant matrix and $b$ a constant vector, $V(Ax + b) = A\,V(x)\,A^{T}$.
$V(x)$ is always non-negative definite.

The covariance between the random vectors $x_1$ and $x_2$ is defined to be the matrix

$$\operatorname{Cov}(x_1, x_2) = E\!\left[(x_1 - E(x_1))(x_2 - E(x_2))^{T}\right],$$

when all expectations exist. If $a$ and $b$ are constants, $\operatorname{Cov}(a x_1, b x_2) = ab\,\operatorname{Cov}(x_1, x_2)$. If $A$ and $B$ are constant matrices and $c$ and $d$ are constant vectors,

$$\operatorname{Cov}(Ax_1 + c,\, Bx_2 + d) = A\,\operatorname{Cov}(x_1, x_2)\,B^{T},$$
and $\operatorname{Cov}(x_1, x_2) = \operatorname{Cov}(x_2, x_1)^{T}$.
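The property $V(Ax + b) = A\,V(x)\,A^{T}$ can be illustrated with a short simulation (the matrix $A$, vector $b$ and covariance used below are arbitrary choices, not from the assignment):

```python
import numpy as np

# Sketch (illustrative): check V(Ax + b) = A V(x) A^T on simulated data.
rng = np.random.default_rng(2)
A = np.array([[1.0, -1.0, 0.5],
              [2.0,  0.0, 1.0]])
b = np.array([3.0, -4.0])

x = rng.multivariate_normal(mean=[0.0, 1.0, 2.0],
                            cov=[[2.0, 0.3, 0.0],
                                 [0.3, 1.0, 0.2],
                                 [0.0, 0.2, 0.5]],
                            size=200_000)
y = x @ A.T + b                            # each row is A x_i + b

V_x = np.cov(x, rowvar=False)              # sample covariance of x
V_y = np.cov(y, rowvar=False)              # sample covariance of y = Ax + b
print(np.round(V_y - A @ V_x @ A.T, 3))    # essentially the zero matrix
```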
Conditional expectation

The conditional expectation of a random matrix $X_1$ given another random matrix $X_2 = A$ ($A$ being a constant matrix), written $E(X_1 \mid X_2 = A)$, is the expectation of $X_1$ defined using the conditional distribution of its elements given $X_2 = A$. Likewise, the conditional expectation $E(X_1 \mid X_2)$ is the expectation of $X_1$ defined using the conditional distribution of its elements given $X_2$. The double expectation formula is

$$E\big(E(X_1 \mid X_2)\big) = E(X_1).$$

The conditional variance-covariance matrix $V(x_1 \mid X_2 = A)$ or $V(x_1 \mid X_2)$ for a random vector $x_1$ is defined by substituting the appropriate conditional expectations into the definition of the variance-covariance matrix. The conditional variance formula applies:

$$V(x_1) = E\big(V(x_1 \mid X_2)\big) + V\big(E(x_1 \mid X_2)\big).$$

For random vectors $x_1$ and $x_2$, the conditional covariance $\operatorname{Cov}(x_1, x_2 \mid x_3 = A)$ or $\operatorname{Cov}(x_1, x_2 \mid x_3)$ is defined by putting the appropriate conditional expectations into the definition of the covariance. An additional covariance formula is

$$\operatorname{Cov}(x_1, x_2) = E\big(\operatorname{Cov}(x_1, x_2 \mid x_3)\big) + \operatorname{Cov}\big(E(x_1 \mid x_3),\, E(x_2 \mid x_3)\big).$$
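Both formulas can be illustrated by simulation. The sketch below uses a small hypothetical model ($x_2$ Bernoulli, $x_1$ conditionally normal; none of these choices come from the assignment) in which the conditional mean and variance are known exactly:

```python
import numpy as np

# Sketch (hypothetical model): verify E(E(x1|x2)) = E(x1) and
# V(x1) = E(V(x1|x2)) + V(E(x1|x2)) by simulation.
rng = np.random.default_rng(3)
n = 500_000

x2 = rng.binomial(1, 0.3, size=n)                       # x2 ~ Bernoulli(0.3)
x1 = rng.normal(loc=2.0 * x2, scale=np.sqrt(1.0 + x2))  # x1 | x2 ~ N(2*x2, 1 + x2)

cond_mean = 2.0 * x2          # E(x1 | x2), known analytically for this model
cond_var = 1.0 + x2           # V(x1 | x2)

print(x1.mean(), cond_mean.mean())                      # both ~ 0.6
print(x1.var(), cond_var.mean() + cond_mean.var())      # both ~ 2.14
```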
REFERENCES

Walpole, R. E., Myers, R. H., Myers, S. L. Probability and Statistics for Engineers and Scientists, 6th edition, pages 275-280.

Mikhail, E. M. Analysis and Adjustment of Survey Measurements. School of Engineering, Purdue University, West Lafayette, Indiana.

Krishnan, V. Probability and Random Processes. John Wiley and Sons, 2006, pages 384-405.

Storkey, A. MLPR lectures: Distributions and models. School of Informatics, University of Edinburgh, 2009. http://www.inf.ed.ac.uk/teaching/courses/mlpr/lectures/distnsandmodelsprint4up.pdf