
CSCI 4100/6100 RPI

Machine and Computational Learning Fall 2010

ASSIGNMENT 2, due Thursday, September 30


Homeworks are due in class or in my mailbox by 2pm on the due date. LFD is the class textbook.

1. (100) Exercise 2.1 in LFD
2. (100) Exercise 2.2 in LFD
3. (100) Exercise 2.4 in LFD
4. (100) Exercise 2.5 in LFD
5. (100) Generalization Bound

Suppose mH(N) = N + 1, so dVC = 1. You have N = 100 training examples. Use the generalization
bound to give a bound on Eout with 90% confidence. Repeat for N = 10,000.
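As a sanity check, the error bar can be evaluated numerically. The sketch below assumes the VC bound in the form used in LFD, Eout ≤ Ein + sqrt((8/N) ln(4 mH(2N)/δ)), with the growth function given in the problem:

```python
import math

def vc_error_bar(N, delta, mH):
    # VC bound: with probability >= 1 - delta,
    # Eout <= Ein + sqrt((8/N) * ln(4 * mH(2N) / delta))
    return math.sqrt(8.0 / N * math.log(4.0 * mH(2 * N) / delta))

mH = lambda n: n + 1  # growth function given in the problem
for N in (100, 10_000):
    print(f"N = {N}: Eout <= Ein + {vc_error_bar(N, 0.1, mH):.4f}")
```

With δ = 0.1 this gives an error bar of roughly 0.85 at N = 100 and roughly 0.10 at N = 10,000.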

6. (100) Sample Complexity

For an H with dVC = 10, what sample size do you need (as prescribed by the generalization
bound) to have a 95% confidence that your generalization error is at most 0.05?
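The resulting inequality is implicit in N, so it is typically solved by iteration. A sketch, assuming the bound N ≥ (8/ε²) ln(4((2N)^dVC + 1)/δ), i.e. the VC bound with the polynomial bound mH(2N) ≤ (2N)^dVC + 1 (the function name is mine):

```python
import math

def sample_complexity(dvc, eps, delta, n=1000.0, iters=100):
    # Fixed-point iteration on
    #   N <- (8 / eps^2) * ln(4 * ((2N)^dvc + 1) / delta);
    # the iteration converges quickly from any reasonable start.
    for _ in range(iters):
        n = 8.0 / eps**2 * math.log(4.0 * ((2.0 * n)**dvc + 1.0) / delta)
    return math.ceil(n)

print(sample_complexity(dvc=10, eps=0.05, delta=0.05))
```

For dVC = 10, ε = 0.05, δ = 0.05 the iteration settles in the vicinity of 450,000 examples, a good illustration of how loose the VC bound is in practice.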

7. (200) Monotonically Increasing Functions in R²

The monotonically increasing hypothesis set is

H = {h | x1 ≥ x2 =⇒ h(x1 ) ≥ h(x2 )}.

(x1 ≥ x2 if and only if the inequality holds for every component.) Give an example of a
monotonic classifier in H, clearly showing the +1 and −1 regions.
Compute mH(N) and hence the VC dimension. [Hint: Consider a set of N points generated by
choosing one point and generating the next point by increasing the first component and decreasing
the second component until N points are obtained.]
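To see what the hint buys you, the sketch below brute-force checks, for a small N, that every dichotomy on such a set of points is realized by some monotone classifier. The construction h(x) = +1 iff x dominates (componentwise) some +1-labeled point is one convenient choice of monotone h, not the only one:

```python
from itertools import product

def dominates(a, b):
    # componentwise comparison: a >= b in the partial order of the problem
    return a[0] >= b[0] and a[1] >= b[1]

def shatters(points):
    # For each dichotomy, build the monotone classifier
    #   h(x) = +1  iff  x dominates some point labeled +1,
    # and check that it reproduces the dichotomy on `points`.
    for labels in product((-1, +1), repeat=len(points)):
        pos = [p for p, y in zip(points, labels) if y == +1]
        h = lambda x: +1 if any(dominates(x, p) for p in pos) else -1
        if any(h(p) != y for p, y in zip(points, labels)):
            return False
    return True

# Points from the hint: first component increases, second decreases,
# so no point dominates any other.
N = 6
pts = [(i, N - i) for i in range(N)]
print(shatters(pts))
```

Because the points are pairwise incomparable, no labeling is ruled out by monotonicity; that observation is the heart of computing mH(N). By contrast, two comparable points such as (0, 0) and (1, 1) cannot be labeled +1 below and −1 above.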

8. (100) Candidate Growth Functions

Which of the following are possible growth functions mH (N ) for some hypothesis set:

1 + N;   1 + N + N(N − 1)/2;   2^N;   2^⌊√N⌋;   2^⌊N/2⌋;   1 + N + N(N − 1)(N − 2)/6.
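A useful necessary condition comes from the Sauer bound: if a candidate has a break point k (i.e., mH(k) < 2^k), then mH(N) ≤ Σ_{i<k} C(N, i) must hold for all N. A sketch of this screen (the helper name and the test range are mine; passing the check does not by itself prove a candidate is a growth function):

```python
import math

def violates_sauer(m, n_max=60):
    # Find the first break point k with m(k) < 2^k, if any.
    k = next((n for n in range(1, n_max) if m(n) < 2**n), None)
    if k is None:
        return False  # looks like m(N) = 2^N on the tested range
    # With break point k, a true growth function must satisfy
    #   m(N) <= sum_{i=0}^{k-1} C(N, i)  for every N.
    return any(m(N) > sum(math.comb(N, i) for i in range(k))
               for N in range(1, n_max))

print(violates_sauer(lambda N: 1 + N))        # consistent with the Sauer bound
print(violates_sauer(lambda N: 2**(N // 2)))  # fails the Sauer bound
```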

9. (100) Bounding mH(N)

Prove that for N ≥ d,

    Σ_{i=0}^{d} C(N, i) ≤ (eN/d)^d.

[Hints: since N ≥ d, Σ_{i=0}^{d} C(N, i) ≤ Σ_{i=0}^{d} C(N, i)(N/d)^{d−i}; (1 + 1/x)^x ≤ e for x > 0.]
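Before writing the proof, it can be reassuring to spot-check the inequality numerically over small (N, d); a quick sketch:

```python
import math

# Spot-check: for N >= d,  sum_{i=0}^{d} C(N, i) <= (e*N/d)^d
for d in range(1, 8):
    for N in range(d, 50):
        lhs = sum(math.comb(N, i) for i in range(d + 1))
        rhs = (math.e * N / d) ** d
        assert lhs <= rhs, (N, d)
print("inequality holds on all checked (N, d)")
```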
