Syllabus for Dynamic Programming

Spring 2017

Course Instructor: Diego Klabjan

Course Time and Location: MW 11:00 am-12:15 pm; Tech M228

Best Way to Contact, in Order of Preference:

skype dklabjan (do not add me as a contact)
hangout dklabjan
sms 847 691 1148
email d-klabjan@northwestern.edu

Response time is directly correlated with the order of preference.

Textbook: Warren B. Powell; Approximate Dynamic Programming: Solving the Curses of
Dimensionality; John Wiley & Sons, 2007. ISBN 978-0-470-17155-4

Goal: This course will cover reinforcement learning, also known as dynamic programming, a modeling
principle that captures dynamic environments and the stochastic nature of events. The main goal is to
learn dynamic programming and how to apply it to a variety of problems. The course will cover both
theoretical and computational aspects.

Tentative list of topics:

1. Introduction to dynamic programming (Chapters 1 and 2)
2. Value and policy iterations (Chapter 3)
3. Stochastic gradient algorithm (Chapter 6)
4. Q-learning and temporal differences (Chapter 8)
5. Value function approximation and Monte-Carlo sampling (Chapter 4)
6. Linear and dynamic programming (time permitting, not in the textbook)

Grading: There will be mandatory individual homework assignments. In addition, each student will
choose one of two options: completing a quarter-long project or taking two take-home exams. The
components will be weighted as follows:

Homework assignments: 30%
Midterm and final exam: 35% each (if you choose the homework/exam route)
Project: 70% (if you choose the homework/project route)

Homework Assignments: There will be a homework assignment every other week. You will not be
allowed to use any literature except the textbook. You may discuss homework assignments, but the final
homework solution that you turn in may not be collaborative. Homework will be peer graded.

Exams: The two exams will be seven-day take-home exams. Only the textbook may be used. These
are individual exams, and thus absolutely no collaboration is allowed.

Project: There will be a major project of your choice. Teams of at most three (preferably two) students
are allowed. The project must be selected during the first two weeks. During the quarter, bi-weekly
status reports must be posted on Canvas under discussions. These reports must be approximately half a
page long and must state the progress made in the previous two weeks. At the end of the course,
each team is expected to give an in-class presentation and submit a final report not exceeding five pages.

Requirements: Basic mathematical knowledge is required. Exposure to fundamental probability is
expected. Only very basic mathematical programming knowledge is desired (basic linear programming
duality, linear modeling concepts). The course is also suitable for first-year students.
