
Assignment Week 7

More Bayesian inference!


Reading
Not everyone thinks that Bayesian inference is a good model of cognition. This
paper lists some potential issues. In the process it also gives a pretty good
account of the different ways that Bayes' rule has been applied to study the brain.
Bowers & Davis (2012). Bayesian Just-So Stories in Psychology and
Neuroscience. Psychological Bulletin, 138(3), 389–414.
Question 1: How biased is the slot machine now? What about now?
This is very similar to question 2 from last time (inferring the probability of a
click), but with a couple of changes. First, to make it closer to the reinforcement
learning examples we want the probability of reward (b) instead of the probability
of a click. Second, we want to do dynamic Bayesian inference, that is, we want
to show how the posterior evolves over time.
Imagine that you have a slot machine that pays out with probability b. So the
probability of winning is b and the probability of losing is 1-b.
Suppose you see the following sequence of wins and losses (the data d1:t):
d1:t = 1,1,1,0,1,0,1,1,1,0,1,1,1,0,0,1,0,1,1,1,0,0,1,1,1 (i.e. 17 wins and 8 losses)
In the following questions we'll compute the probability distribution over b
sequentially, i.e. we will update p(b | d1:t) after every play.
1. First, just like last time, let's set up the probability space. b must be between 0
and 1. In principle any value between 0 and 1 is possible, but we will focus on a
finite subset of possibilities. In particular, let's consider the possible values of b to
be between 0 and 1 in steps of 0.001. Let's start by assuming a uniform prior,
so p(b) is constant. Plot the prior.
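If you get stuck, here is a minimal MATLAB sketch of one way to set this up (the variable names b and prior are just suggestions):

    b = 0:0.001:1;                 % grid of candidate bias values
    prior = ones(size(b));         % uniform prior (every value equally likely)
    prior = prior / sum(prior);    % normalize so the probabilities sum to 1
    plot(b, prior)
    xlabel('bias b'); ylabel('p(b)')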
2. Compute the posterior after one trial using Bayes' rule:
p(b | d1) ∝ p(d1 | b) p(b)
Plot the resulting distribution.
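A sketch of a single update (assuming b and prior from part 1, and that d holds the win/loss sequence):

    d = [1 1 1 0 1 0 1 1 1 0 1 1 1 0 0 1 0 1 1 1 0 0 1 1 1];  % the data
    like = b.^d(1) .* (1-b).^(1-d(1));   % p(d1 | b): b for a win, 1-b for a loss
    post = like .* prior;                % unnormalized posterior
    post = post / sum(post);             % normalize
    plot(b, post)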

3. Write a loop to keep updating your posterior over b based on the data. Keep
track of the posterior in a matrix P, where P(:,t) is the distribution on trial t. Don't
forget to keep normalizing the posterior when you update!
Plot the result in two ways. First, plot the posterior for each time point as a line
plot on the same axes (plot(b, P) in MATLAB). Second, plot the posterior as a heat
plot using the imagesc function (imagesc(P)). What does this tell you about the
accuracy of the estimate of the bias over time?
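One possible way to structure the loop (a sketch, continuing from the code above):

    T = length(d);
    P = zeros(length(b), T);                      % P(:,t) will hold p(b | d_1:t)
    post = prior(:);                              % start from the prior (as a column)
    for t = 1:T
        like = b(:).^d(t) .* (1-b(:)).^(1-d(t));  % likelihood of trial t's outcome
        post = like .* post;
        post = post / sum(post);                  % renormalize on every trial
        P(:,t) = post;
    end
    figure; plot(b, P)                            % one curve per trial
    figure; imagesc(P)                            % rows = bias values, columns = trials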
4. Compute the mean of the posterior over time. Remember that the mean of a
probability distribution is

b̄(t) = ∫ db p(b | d1:t) b ≈ Σi b(i) P(i,t)

where P is the posterior matrix from part 3 and the sum over i runs over the grid of b values.

Plot the mean over time.
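A sketch of the grid approximation (using the matrix P from part 3):

    bhat = b(:)' * P;            % bhat(t) = sum_i b(i) * P(i,t)
    figure; plot(1:T, bhat)
    xlabel('trial'); ylabel('posterior mean of b')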


5. Does your answer for part 4 remind you of anything? Cast your mind back to
the brute-force averaging model for reinforcement learning in Assignment 4. Plot
how the brute-force average of the number of wins evolves over time. How
does it compare with your answer to part 4?
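One way to compute this for comparison (a sketch, assuming the brute-force average means the running average of the outcomes, i.e. the fraction of wins so far):

    runavg = cumsum(d) ./ (1:length(d));   % running fraction of wins
    hold on; plot(1:length(d), runavg)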
Question 2: The changing slot machine
Imagine you have a slot machine and you are trying to figure out its bias.
However, the bias of this slot machine can change over time. In particular, there's
a probability (called the hazard rate, h = 0.1) that on each trial the slot machine
will take on a new bias, chosen completely at random from 0 to 1, on the
next play. Suppose you know this hazard rate is 0.1. How will your posterior for
this time-varying bias evolve?
1. Let's start by generating the data (i.e. the outcomes we want to infer the bias
from). To do this we need to know the true bias. Generate 100 trials of b,
assuming that it starts out at a random value between 0 and 1 and on each trial it either
stays the same (with probability 1-h) or changes to another random value (with
probability h). Plot this changing bias over time.
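A possible sketch of this generative process (h and btrue are just suggested names):

    h = 0.1;                           % hazard rate
    N = 100;                           % number of trials
    btrue = zeros(1, N);
    btrue(1) = rand;                   % initial bias, uniform on [0,1]
    for t = 2:N
        if rand < h
            btrue(t) = rand;           % bias resets to a new random value
        else
            btrue(t) = btrue(t-1);     % bias stays the same
        end
    end
    figure; plot(1:N, btrue)
    xlabel('trial'); ylabel('true bias')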
2. Next generate a sequence of wins and losses from the slot machine.
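For example (a sketch, reusing btrue from above):

    d = double(rand(1, N) < btrue);    % d(t) = 1 (win) with probability btrue(t)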
3. Starting from a uniform prior, what's the posterior over b1 after the first
play?
4. Next, compute the prior for trial 2 based on the posterior for trial 1,
i.e. compute p(b2 | d1) based on p(b1 | d1). Plot the result.
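Hint: with probability 1-h the bias is unchanged, and with probability h it is redrawn uniformly, so one way to write the prediction is p(b2 | d1) = (1-h) p(b1 = b2 | d1) + h × (uniform over b). A sketch (assuming post is your posterior over b1 from part 3, stored as a column vector):

    nb = length(b);
    unif = ones(nb, 1) / nb;                  % uniform distribution on the grid
    priorNext = (1-h) * post + h * unif;      % p(b2 | d1)
    figure; plot(b, priorNext)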
5. Now compute the posterior on the second trial, i.e. p(b2 | d1:2). Plot the result.

6. Now loop over all trials to compute how the posterior changes over time. Plot
the result using imagesc. Plot the true bias on the same plot using the plot
function and hold on.
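A sketch of one way to write the full predict-then-update loop (variable names as in the earlier sketches):

    nb = length(b);
    P = zeros(nb, N);
    post = ones(nb, 1) / nb;                       % uniform prior before any data
    unif = ones(nb, 1) / nb;
    for t = 1:N
        post = (1-h) * post + h * unif;            % prediction: bias may have changed
        like = b(:).^d(t) .* (1-b(:)).^(1-d(t));   % likelihood of the new outcome
        post = like .* post;
        post = post / sum(post);                   % normalize
        P(:,t) = post;
    end
    figure; imagesc(P); hold on
    plot(1:N, btrue * (nb-1) + 1, 'w')             % true bias rescaled to row indices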
7. Compute and plot the mean of the bias over time. Does this remind you of
anything? Cast your mind back to Assignment 4 again and the finite learning-rate
models.
8. Suppose that you did not know the slot machine could change its bias, i.e.
suppose you tried to infer the bias using the model in Question 1. Compute what
would happen and plot the resulting posterior over time using imagesc.
Extra Credit (5%)
Suppose that instead of the slot machine changing its bias abruptly on some
trials, the slot machine changes its bias gradually on all trials.
In particular, let's assume that the transition probability p(bt | bt-1) is given by

p(bt | bt-1) ∝ bt^X (1 - bt)^Y

where X and Y are related to bt-1:
X = 40 bt-1
Y = 40 (1 - bt-1)
Simulate how bt evolves over time with this transition structure. Then simulate
how Bayes' rule can infer how the bias changes over time.
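A possible sketch (sampling bt on the same grid of b values; the transition matrix M is one way to plug this drift into the prediction step of the inference loop):

    % simulate the gradually drifting bias on the grid of b values
    N = 100; nb = length(b);
    btrue = zeros(1, N); btrue(1) = rand;
    for t = 2:N
        X = 40 * btrue(t-1);
        Y = 40 * (1 - btrue(t-1));
        w = b.^X .* (1-b).^Y;                        % p(bt | bt-1), unnormalized
        w = w / sum(w);
        cw = cumsum(w); cw(end) = 1;                 % guard against rounding error
        btrue(t) = b(find(rand <= cw, 1));           % sample bt from this distribution
    end

    % transition matrix: column j holds p(bt | bt-1 = b(j))
    M = zeros(nb, nb);
    for j = 1:nb
        X = 40 * b(j); Y = 40 * (1 - b(j));
        w = b(:).^X .* (1-b(:)).^Y;
        M(:,j) = w / sum(w);
    end
    % in the inference loop, the prediction step then becomes: post = M * post;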
