
Applied Multivariate Statistical Modelling – II, Assignment – 2

Due (February 15, 2019) 50 marks


Please do your own assignment. Any evidence of duplication will be met with the strictest
measures as per the IIT KGP code of conduct.

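1. Suppose a random variable $Y$ follows a Weibull distribution, where $\lambda$ is fixed at a
constant value and the MLE of $\theta$ is $\hat{\theta}$. The pdf is given by
$$f(y; \lambda, \theta) = \frac{\lambda y^{\lambda-1}}{\theta^{\lambda}} \exp\left(-\left(\frac{y}{\theta}\right)^{\lambda}\right)$$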
a. Write an expression for Wald Statistic and score statistic in terms of 𝑦, 𝜃. Show
all steps. (10)

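Ans. The log-likelihood function is
$$l = (\lambda - 1)\log y + \log\lambda - \lambda\log\theta - \left(\frac{y}{\theta}\right)^{\lambda}$$
$$U = \frac{dl}{d\theta} = -\frac{\lambda}{\theta} + \frac{\lambda y^{\lambda}}{\theta^{\lambda+1}}$$
For the MLE, $U(\hat{\theta}) = 0$, i.e. $\hat{\theta}^{\lambda} = y^{\lambda}$; solving, we get $\hat{\theta} = y$.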

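Representing the Weibull distribution as a member of the exponential family in the form
$f(y, \theta) = \exp\left(a(y)b(\theta) + c(\theta) + d(y)\right)$, we get
$$a(y) = y^{\lambda}, \quad b(\theta) = -\theta^{-\lambda}, \quad c(\theta) = \log\lambda - \lambda\log\theta, \quad d(y) = (\lambda - 1)\log y$$
$$b'(\theta) = \frac{\lambda}{\theta^{\lambda+1}}; \quad b''(\theta) = -\frac{(\lambda+1)\lambda}{\theta^{\lambda+2}};$$
$$c'(\theta) = -\frac{\lambda}{\theta}; \quad c''(\theta) = \frac{\lambda}{\theta^{2}}$$
$$\mathfrak{J} = Var(U) = [b'(\theta)]^{2}\, Var(a(y)) = [b'(\theta)]^{2}\, \frac{b''(\theta)c'(\theta) - c''(\theta)b'(\theta)}{[b'(\theta)]^{3}} = \frac{b''(\theta)c'(\theta) - c''(\theta)b'(\theta)}{b'(\theta)}$$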

Substituting the expressions for $b'(\theta)$, $c'(\theta)$, $b''(\theta)$, $c''(\theta)$, we get $\mathfrak{J} = \lambda^{2}/\theta^{2}$.
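The substitution step can be written out term by term. The block below is a sketch of that algebra, using only the derivatives listed above:

```latex
\begin{align*}
b''(\theta)\,c'(\theta) &= \left(-\frac{(\lambda+1)\lambda}{\theta^{\lambda+2}}\right)\left(-\frac{\lambda}{\theta}\right) = \frac{(\lambda+1)\lambda^{2}}{\theta^{\lambda+3}} \\
c''(\theta)\,b'(\theta) &= \frac{\lambda}{\theta^{2}} \cdot \frac{\lambda}{\theta^{\lambda+1}} = \frac{\lambda^{2}}{\theta^{\lambda+3}} \\
\mathfrak{J} &= \frac{b''(\theta)c'(\theta) - c''(\theta)b'(\theta)}{b'(\theta)} = \frac{\lambda^{3}/\theta^{\lambda+3}}{\lambda/\theta^{\lambda+1}} = \frac{\lambda^{2}}{\theta^{2}}
\end{align*}
```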

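Wald statistic: $(\theta - \hat{\theta})^{2}\, \mathfrak{J}(\hat{\theta}) = \dfrac{(\theta - \hat{\theta})^{2} \lambda^{2}}{\hat{\theta}^{2}}$
**Note that for the Wald statistic $\mathfrak{J}$ is calculated at the MLE $\hat{\theta}$.
Score statistic (SS) is $U^{T} \mathfrak{J}^{-1} U$. In this case
$$SS = \frac{\left[-\dfrac{\lambda}{\theta} + \dfrac{\lambda y^{\lambda}}{\theta^{\lambda+1}}\right]^{2}}{\lambda^{2}/\theta^{2}}$$
**Note that here $\mathfrak{J}$ is not evaluated at $\hat{\theta}$.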
b. For 𝑦= 10, and 𝜆 = 2, determine model adequacy for 𝑎) 𝜃 = 5 ; 𝑏) 𝜃 = 11 (5)
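At $\theta = 5$, $y = 10$, $\lambda = 2$, the Wald statistic is
$$WS = \frac{(10 - 5)^{2} \cdot 2^{2}}{10^{2}} = 1, \quad \text{as } \hat{\theta} = y = 10$$
and the score statistic is
$$SS = \frac{\left[-\dfrac{\lambda}{\theta} + \dfrac{\lambda y^{\lambda}}{\theta^{\lambda+1}}\right]^{2}}{\lambda^{2}/\theta^{2}} = \frac{\left(-\dfrac{2}{5} + 2 \cdot \dfrac{10^{2}}{5^{3}}\right)^{2}}{2^{2}/5^{2}} = 9$$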

At $\alpha = 0.05$, the critical value is $\chi^{2}(1) = 3.84$.


From the Wald test, since WS < 3.84, we cannot reject the null hypothesis, so the conclusion
would be that $\theta = 5$ is not worse than the MLE $\hat{\theta} = 10$.
From the score test, since SS > 3.84, we reject the null hypothesis and conclude that $\theta = 5$ is not
adequate.

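At $\theta = 11$, $WS = (11 - 10)^{2} \cdot \dfrac{2^{2}}{10^{2}} = 1 \cdot \dfrac{2^{2}}{10^{2}} = 0.04$

$SS = \dfrac{\left(-\dfrac{2}{11} + 2 \cdot \dfrac{10^{2}}{11^{3}}\right)^{2}}{2^{2}/11^{2}} \approx 0.03$
Hence at $\theta = 11$, both WS and SS < 3.84, so we cannot reject the null hypothesis that $\theta = 11$ is
an adequate value.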
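The arithmetic in part (b) can be reproduced with a short script. This is a minimal sketch, assuming the single-observation formulas derived in part (a); the function names `wald_stat` and `score_stat` are illustrative and not part of the assignment.

```python
# Wald and score statistics for one Weibull observation y with fixed shape lam.
# Assumes the results from part (a): theta_hat = y and information = lam^2 / theta^2.

CHI2_CRIT = 3.84  # chi-square(1) critical value at alpha = 0.05, as quoted in the solution


def wald_stat(theta0, y, lam):
    """(theta0 - theta_hat)^2 * J(theta_hat), with J evaluated at the MLE."""
    theta_hat = y
    info_at_mle = lam ** 2 / theta_hat ** 2
    return (theta0 - theta_hat) ** 2 * info_at_mle


def score_stat(theta0, y, lam):
    """U(theta0)^2 / J(theta0), with U and J evaluated at the hypothesised value."""
    u = -lam / theta0 + lam * y ** lam / theta0 ** (lam + 1)
    info = lam ** 2 / theta0 ** 2
    return u ** 2 / info


results = {}
for theta0 in (5, 11):
    ws = wald_stat(theta0, y=10, lam=2)
    ss = score_stat(theta0, y=10, lam=2)
    results[theta0] = (ws, ss)
    print(f"theta={theta0}: WS={ws:.4f} (reject: {ws > CHI2_CRIT}), "
          f"SS={ss:.4f} (reject: {ss > CHI2_CRIT})")
```

Both statistics are compared with the same critical value, so the script makes it easy to see that the two tests disagree at $\theta = 5$ and agree at $\theta = 11$.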

2. The Pareto distribution is given by $f(y, \theta) = \theta y^{-\theta-1}$


a. Show that the distribution belongs to the exponential family and find the a, b, c
and d functions. (5)
$l = \log\theta - (\theta + 1)\log y$
$a(y) = \log y$
$b(\theta) = -\theta$; $b'(\theta) = -1$; $b''(\theta) = 0$
$c(\theta) = \log\theta$; $c'(\theta) = 1/\theta$; $c''(\theta) = -1/\theta^{2}$
$d(y) = -\log y$

b. Find the score statistic U. Verify that $E(U) = 0$. (5)


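$$U = \frac{dl}{d\theta} = \frac{1}{\theta} - \log y$$
Equivalently, from $l = a(y)b(\theta) + c(\theta) + d(y)$,
$$U = \frac{dl}{d\theta} = a(y)b'(\theta) + c'(\theta)$$
$$E(U) = E\left[\log y \cdot (-1) + \frac{1}{\theta}\right] = -E(\log y) + \frac{1}{\theta}$$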
Note that you cannot directly use $E(a(y)) = -c'(\theta)/b'(\theta)$, since this relation is itself
derived from the result that $E(U) = 0$.
So we need to determine $E(\log y)$ from the density.
The Pareto distribution of the form $f(y, \theta) = \theta y^{-\theta-1}$ is defined over the support
$y \in [1, \infty)$, so

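$$E(\log y) = \int_{1}^{\infty} \log y \,\left(\theta y^{-\theta-1}\right) dy$$
Integrating by parts, taking $u = \log y$ and $v = \theta y^{-\theta-1}$,
$$\int uv\,dy = u\int v\,dy - \int u' \left(\int v\,dy\right) dy$$
This gives us
$$E(\log y) = \left[-y^{-\theta}\left\{\log y + \frac{1}{\theta}\right\}\right]_{1}^{\infty} = \frac{1}{\theta}$$
Hence $E(U) = -\dfrac{1}{\theta} + \dfrac{1}{\theta} = 0$ (proved)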
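The value of $E(\log y)$ can also be checked numerically. The following is a minimal sketch using only the standard library; the substitution $t = \log y$, the truncation point `upper_t`, and the function names are assumptions made for this illustration, not part of the original solution.

```python
import math


def pareto_pdf(y, theta):
    """Pareto density theta * y^(-theta - 1) on y >= 1."""
    return theta * y ** (-theta - 1)


def expected_log_y(theta, upper_t=60.0, n=100_000):
    """Midpoint-rule approximation of E(log y).

    Uses t = log y (so y = e^t, dy = e^t dt) and truncates the
    integral at t = upper_t, where the integrand is negligible
    for the theta values tried below.
    """
    h = upper_t / n
    total = 0.0
    for i in range(n):
        t = (i + 0.5) * h
        y = math.exp(t)
        total += t * pareto_pdf(y, theta) * y * h
    return total


for theta in (0.5, 2.0, 5.0):
    e_log_y = expected_log_y(theta)
    mean_score = -e_log_y + 1.0 / theta  # E(U) = -E(log y) + 1/theta
    print(f"theta={theta}: E(log y)={e_log_y:.6f}, 1/theta={1 / theta:.6f}, "
          f"E(U)={mean_score:.2e}")
```

For each value of $\theta$ tried, the numerical integral should agree with $1/\theta$ up to discretisation error, so the computed $E(U)$ is close to zero.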

c. Find the information 𝒥 = 𝑉𝑎𝑟(𝑈). (5)


Since $b''(\theta) = 0$,
$$\mathcal{J} = Var(U) = \frac{b''(\theta)c'(\theta) - c''(\theta)b'(\theta)}{b'(\theta)} = -c''(\theta) = \frac{1}{\theta^{2}}$$
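As a cross-check that does not rely on the exponential-family formula, the variance can be computed directly: since $U = 1/\theta - \log y$, we have $Var(U) = Var(\log y)$. The sketch below uses the substitution $t = \log y$ (a choice made here, not taken from the original solution) together with $E(\log y) = 1/\theta$:

```latex
\begin{align*}
E\left[(\log y)^{2}\right] &= \int_{1}^{\infty} (\log y)^{2}\, \theta y^{-\theta-1}\, dy
  = \int_{0}^{\infty} t^{2}\, \theta e^{-\theta t}\, dt = \frac{2}{\theta^{2}} \\
Var(U) &= Var(\log y) = \frac{2}{\theta^{2}} - \left(\frac{1}{\theta}\right)^{2} = \frac{1}{\theta^{2}}
\end{align*}
```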

3. Download the “smoking.txt” dataset. Analyze the data using the statistical models covered in class and
explain the findings. Use appropriate plots to aid your explanation. (20)
