Teaching Assistants
TBA
Labs: Tuesdays 12:00-1:00 pm
Office Location: TBA
Office hours: TBA
E-mail: TBA
Prerequisites
Background: "Data Structure and Software Principles" or consent of the instructor. A good knowledge of statistics and machine learning will help in understanding the course materials.
Programming: We will give one programming assignment. You will need to be familiar with at least one programming language, such as C++ or Java. We will not cover programming-specific issues in this course.
Textbook
Jiawei Han, Micheline Kamber and Jian Pei, Data Mining: Concepts and Techniques, 2nd ed., Morgan Kaufmann, 2006.
References
The following texts are recommended but not required. Numerous other books and online resources on data mining are also available.
E. Alpaydin, Introduction to Machine Learning, 2nd ed., MIT Press, 2011.
T. Hastie, R. Tibshirani, and J. Friedman, The Elements of Statistical Learning: Data Mining, Inference, and Prediction, 2nd ed., Springer-Verlag, 2009.
T. M. Mitchell, Machine Learning, McGraw Hill, 1997.
P.-N. Tan, M. Steinbach, and V. Kumar, Introduction to Data Mining, Addison Wesley, 2005.
I. H. Witten and E. Frank, Data Mining: Practical Machine Learning Tools and Techniques with Java Implementations, 2nd ed., Morgan Kaufmann, 2005.
The lecture slides contain most of the technical briefing and reference materials. Please study them both when preparing for class and when reviewing after class. Many research papers will also help you understand the course contents; please check the references of this course for further information.
Examinations
There will be three exams: two midterm exams, each 1.5 hours long, and a final exam, 3 hours long. We will not normally give make-ups for missed exams.
Evaluation
We plan to determine final grades for the course in the following way:
Assignments: 6% (2 homework assignments, 3% each)
Quizzes: 6% (3 quizzes, 2% each)
Lab work: 3% (attendance and lab work)
Two midterm exams: 30% (first exam 10%, second exam 20%)
Final exam: 40%
Project: 15%, by one of the following options:
Option 1: survey (10%) + assignment 3 (5%)
Option 2: software project or research project (15%)
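As a rough illustration of how the weights above combine, here is a minimal Python sketch. The component names and the `final_grade` helper are hypothetical and are not part of the course materials; the weights themselves are copied from the breakdown above. The project counts for 15 points under either option, so it appears here as a single component.

```python
# Hypothetical sketch: weights (in percentage points) copied from the
# evaluation breakdown; the component names are illustrative only.
WEIGHTS = {
    "assignments": 6,   # 2 homework assignments, 3% each
    "quizzes": 6,       # 3 quizzes, 2% each
    "lab_work": 3,      # attendance and lab work
    "midterm_1": 10,
    "midterm_2": 20,
    "final_exam": 40,
    "project": 15,      # Option 1 (10% + 5%) or Option 2 (15%)
}

# The components should account for the whole grade.
assert sum(WEIGHTS.values()) == 100


def final_grade(scores):
    """Return the weighted course grade in percentage points.

    `scores` maps each component name to the fraction earned,
    a number between 0.0 and 1.0.
    """
    missing = set(WEIGHTS) - set(scores)
    if missing:
        raise ValueError(f"missing scores for: {sorted(missing)}")
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)
```

For example, a student who earns half of the available marks in every component would finish with half of the 100 available points.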
Course project
You can choose one of the following options:
1. Survey (2-3 students): Write a comprehensive survey on a focused topic in data warehousing or data mining, for example, a survey of data warehouse architectures, clustering methods, or frequent itemset techniques. You will need to give a talk by the end of the year (no PowerPoint presentation is required). For this option you are required to do and submit assignment 3.
2. Data mining software function maker or a full data warehouse (4-5 students): Implement one high-performance, fully documented, open source data mining function maker or a full data warehouse application, as discussed in the textbook, in Java or C++ (or any programming language that you prefer). This should include a user interface and a visualization package. You will be required to write a report and give a presentation. Students who choose this option are exempt from assignment 3, and its mark is added to the project mark. [Note: copying online open source software is considered plagiarism!]
3. Research project (3-4 students): You can also propose and carry out a research project. In this project you compare two or three algorithms and study their time, accuracy, or space performance. You may draw a conclusion from your results about which algorithm is best and in which cases. You will be required to write a report and give a presentation. You are exempt from assignment 3, and its mark is added to the project mark.
Project Schedule
1. One-page proposal (week 4) (1%)
A one-page project proposal, with name, title, abstract, and reference list, should be handed in for comments and feedback. Please submit the proposal in class or at Dr. Ghada Badr's office (Room 12, basement) before 1:30 pm on Wednesday, Oct. 5th.