Assistant Professor of Information Systems, ISB (http://www.isb.edu/faculty-research/faculty/directory/saha-rajib-l)
Textbook
OR
You may create a free account to download data and try out the examples given in the book.
Publisher’s website: www.dataminingbook.com
Please bring your laptop to every class. You should have Spotfire and XLMiner installed on your machine before coming to the next class.
Course Outline & Evaluation Components
<< Course outline >>
Datasets
• From the textbook (www.dataminingbook.com)
• Scanner data from a retail chain
• Healthcare data from case studies
• Real data from competition platforms such as www.kaggle.com
• Project: Propose a project in an area you are interested in; publicly available datasets are acceptable. The exact format for proposing a project will be given in class during Week 2.
Today’s class
• A glimpse of applications in different areas
• How business objectives are achieved using data mining techniques
• An overview of several data mining techniques
• A glimpse of how these techniques are evaluated
What is OUTSIDE the scope of this class, although relevant to what we do:
• Scale: Big Data (e.g., analyzing unstructured data as in text mining, dealing with streaming data in time-critical applications, parallel storage and computing, etc.)
• Programming
• Exclusive focus on applications or a particular domain
– The plan is to do an appropriate mix of tools and applications, not applications alone
A word on Honor Code
• Individual Assignments (honor code 2N-b):
– Discussion at the broad level only is allowed, but the work should be completely your own
– Do not copy from or share with your classmates, or even participants from previous batches
• Group Assignments/Projects (honor code 3N-b):
– Absolutely NO discussion with other teams
• Project evaluation in class (honor code 4N):
– Absolutely NO discussion with others, no external reference
• Individual Take-home exercise (final) (honor code 4N):
– Absolutely NO discussion with others, no external reference
• Cite external references appropriately
• Detection software is used for identifying overlaps with others’ work as well as external resources
• !!! Punishment > Crime !!!
– Getting zero on the component is not a punishment
Classroom Etiquette
Attendance policy beyond ASA’s mandate
• The last class has project presentations.
• (5 points) You must be present to get points for presentations.
• (5 points) You are also required to critically evaluate other teams’ presentations. This evaluation carries points.
Before the next class…
• Go over the course outline
• Revise Linear Regression (before-class materials are loaded on LMS; you have already done regression in your stat class)
• Install XLMiner and Spotfire (MUST)
• Bring your laptop to the class