
Data Science using Python

A comprehensive, job-oriented training program crafted by experts

Disclaimer: This material is protected under copyright act AnalytixLabs ©, 2011-2018. Unauthorized use and/ or duplication of this material or any part of this material including data, in any form without explicit and written permission from AnalytixLabs is strictly prohibited. Any violation of this copyright will attract legal actions

About AnalytixLabs

AnalytixLabs is a capability building and training solutions firm led by McKinsey, IIM, ISB and IIT alumni with deep industry experience
and a flair for coaching. We are focused on helping our clients develop skills in basic and advanced analytics to enable them to emerge as
“Industry Ready” professionals and enhance career opportunities. AnalytixLabs has also been featured among the top institutes by prestigious
publications like Analytics India Magazine and Higher Education Review since 2013.

Bottom line

• Job-oriented training
Faculty
• Lucrative job prospects in high growth domain
• Seasoned analytics professionals
Content
• Support for relevant certifications and diplomas
• World class course structure
Approach
• Career counseling and planning
• Surpasses industry requirements
• Together we have 30+ years of experience with prestigious firms like McKinsey, KPMG, Deloitte and AOL

80-20 focus on practical & theory

Personal attention and Individual counselling

Industry best practices

Cater to Standard certifications

High quality course material and real life case studies

Regular sessions by industry experts

Value for money with high return on investment

Global Data science and Big Data skill gap

McKinsey Global Institute estimates a shortage of nearly 1.7 million big data talents by 2018. This includes a
shortage of 140,000 to 190,000 workers with deep technical and analytical expertise, and a shortage of 1.5
million managers and analysts equipped to work with and use big data outputs

Candidates trained by us are working in leading companies across industries…

Program Objective

The Data Science using Python program aims to provide its students with an international, wide-spectrum qualification for job-readiness and seamless absorption into Big Data job roles.

The program will expose the students and professionals to the roles of Big Data Analysts who have:

Ability to translate business problem into analytics problem

Understanding of storage, retrieval and mining of data

Outcome-oriented, global, industry-specific expertise in critical data analytics and data management skills

Hands-on practical skills on exploratory analysis, prescriptive and predictive analysis using Python

Application of analytics in various domains, like ecommerce, Retail, Telecom, BFSI etc.

Skills to leverage analytics to drive smart business decisions

Crafted by team of experts and maintains a balance between theoretical concepts and practical applications

Data Science using Python is a comprehensive program with the following modules, weekly assignments and case studies

Module 1
• Python Foundation – 21 hours + Practice exercises
• Basic data handling, data manipulation and visualization
Module 2
• Business Analytics – 27 hours + Practice exercises
• Data preparation for advanced analytics and predictive modeling
Module 3
• Machine Learning – 24 hours + Practice exercises
• Supervised & Unsupervised learning (ANN, SVM, KNN) and Text Mining

Data Science using Python-Python Foundation (1/4)

Total Duration: 21 hours live training + Practice

Introduction to Data Science with Python

What is analytics & Data Science?

Common Terms in Analytics

Analytics vs. Data warehousing, OLAP, MIS Reporting

Relevance in industry and need of the hour

Types of problems and business objectives in various industries

How leading companies are harnessing the power of analytics?

Critical success drivers

Overview of analytics tools & their popularity

Analytics Methodology & problem solving framework

List of steps in Analytics projects

Identify the most appropriate solution design for the given problem statement

Project plan for Analytics project & key milestones based on effort estimates

Build Resource plan for analytics project

Why Python for data science?

Python Essentials (Core)

Overview of Python- Starting with Python

Introduction to installation of Python

Introduction to Python Editors & IDEs (Canopy, PyCharm, Jupyter, Rodeo, IPython, etc.)

Understand Jupyter notebook & Customize Settings

Concept of Packages/Libraries - Important packages (NumPy, SciPy, scikit-learn, pandas, Matplotlib, etc.)

Installing & loading Packages & Name Spaces

Data Types & Data objects/structures (strings, Tuples, Lists, Dictionaries)

List and Dictionary Comprehensions

Variable & Value Labels – Date & Time Values

Basic Operations - Mathematical - string - date

Reading and writing data

Simple plotting

Control flow & conditional statements

Debugging & Code profiling

How to create class and modules and how to call them?
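The core topics above (comprehensions, date values, conditional statements) can be sketched in a few lines. This is only an illustration: the order values and dates are invented sample data, not course material.

```python
from datetime import date

# Hypothetical sample data, invented for this illustration.
order_values = [120.0, 75.5, 310.0, 42.0]
order_dates = ["2018-01-15", "2018-02-03", "2018-02-20", "2018-03-01"]

# List comprehension: keep only orders above a threshold.
large_orders = [v for v in order_values if v > 100]

# Dictionary comprehension: map each order date to its value.
value_by_date = {d: v for d, v in zip(order_dates, order_values)}

# Date operation: parse ISO strings and compute the gap in days.
parsed = [date.fromisoformat(d) for d in order_dates]
days_between_first_and_last = (parsed[-1] - parsed[0]).days

# Conditional expression inside a comprehension: label each order.
labels = ["high" if v > 100 else "low" for v in order_values]
```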

Scientific distributions used in python for Data Science

NumPy, SciPy, pandas, scikit-learn, statsmodels, NLTK, etc.

Accessing/Importing and Exporting Data using python modules

Importing Data from various sources (CSV, TXT, Excel, Access, etc.)

Database Input (Connecting to database)

Viewing Data objects - subsetting, methods

Exporting Data to various formats

Important Python modules: pandas, Beautiful Soup
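As a rough sketch of importing, subsetting and exporting data, the example below uses only the standard-library `csv` module rather than pandas, so it runs without extra installs. The column names and rows are hypothetical.

```python
import csv
import io

# Hypothetical CSV content; in practice this would come from a file.
csv_text = "customer_id,city,spend\n1,Gurgaon,250\n2,Bengaluru,400\n3,Gurgaon,150\n"

# Import: read each row as a dictionary keyed by the header line.
rows = list(csv.DictReader(io.StringIO(csv_text)))
for row in rows:
    row["spend"] = float(row["spend"])  # data type conversion

# Subset: keep only the rows for one city.
gurgaon_rows = [r for r in rows if r["city"] == "Gurgaon"]

# Export: write selected columns back out as CSV text.
out = io.StringIO()
writer = csv.DictWriter(out, fieldnames=["customer_id", "spend"], extrasaction="ignore")
writer.writeheader()
writer.writerows(gurgaon_rows)
exported = out.getvalue()
```

With pandas installed, `pandas.read_csv` and `DataFrame.to_csv` cover the same import and export steps more concisely.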

Data Manipulation – cleansing – Munging using Python modules

Cleansing Data with Python

Data manipulation steps (sorting, filtering, duplicates, merging, appending, subsetting, derived variables, sampling, data type conversions, renaming, formatting, etc.)

Data manipulation tools (operators, functions, packages, control structures, loops, arrays, etc.)

Python Built-in Functions (Text, numeric, date, utility functions)

Python User Defined Functions

Stripping out extraneous information

Normalizing data

Formatting data

Important Python modules for data manipulation (pandas, NumPy, re, math, string, datetime, etc.)
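A minimal sketch of three of the cleansing steps listed above (stripping extraneous information, removing duplicates, normalizing data), using only the standard library. The sample names and scores are invented, and min-max scaling is just one common way to normalize.

```python
import re

# Hypothetical raw records, invented for this illustration.
raw_names = ["  Alice ", "BOB", "alice", "Carol#1", "bob  "]

def clean_name(text):
    """Trim whitespace, lowercase, and drop non-letter characters."""
    return re.sub(r"[^a-z]", "", text.strip().lower())

# Clean, then remove duplicates while preserving first-seen order.
cleaned = [clean_name(n) for n in raw_names]
unique_names = list(dict.fromkeys(cleaned))

def min_max_normalize(values):
    """Rescale values to the [0, 1] range (min-max normalization)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]  # all values identical
    return [(v - lo) / (hi - lo) for v in values]

scores = [10, 20, 40]
normalized = min_max_normalize(scores)
```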

Data Analysis – Visualization using Python

Introduction to exploratory data analysis

Descriptive statistics, Frequency Tables and summarization

Univariate Analysis (Distribution of data & Graphical Analysis)

Bivariate Analysis (Cross Tabs, Distributions & Relationships, Graphical Analysis)

Creating graphs (bar, pie, line chart, histogram, boxplot, scatter, density, etc.)

Important Packages for Exploratory Analysis (NumPy Arrays, Matplotlib, seaborn, pandas, scipy.stats, etc.)
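Descriptive statistics, frequency tables and a simple cross tab can be illustrated without any third-party package. The ages and segment labels below are made-up sample values.

```python
import statistics
from collections import Counter

# Hypothetical sample data, invented for this illustration.
ages = [23, 35, 35, 41, 29, 35, 52]
segments = ["A", "B", "A", "C", "B", "A", "A"]

# Univariate summary of a numeric variable.
summary = {
    "mean": statistics.mean(ages),
    "median": statistics.median(ages),
    "mode": statistics.mode(ages),
    "stdev": statistics.stdev(ages),  # sample standard deviation
}

# Frequency table of a categorical variable.
frequency_table = Counter(segments)

# Bivariate cross tab: counts of (segment, age >= 35) pairs.
cross_tab = Counter((s, a >= 35) for s, a in zip(segments, ages))
```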

Data Science using Python-Business Analytics (2/4)

Total Duration: 27 hours live training + Practice

Introduction to Statistics

Basic Statistics - Measures of Central Tendencies and Variance

Building blocks - Probability Distributions - Normal distribution - Central Limit Theorem

Inferential Statistics -Sampling - Concept of Hypothesis Testing

Statistical Methods - Z/t-tests (one sample, independent, paired), ANOVA, Correlations and Chi-square

Important modules for statistical methods: NumPy, SciPy, pandas
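To make the one-sample t-test concrete, the sketch below computes the test statistic t = (mean − μ0) / (s / √n) by hand. The sample values and the hypothesized mean are invented; turning the statistic into a p-value needs the t distribution, which `scipy.stats.ttest_1samp` handles if SciPy is installed.

```python
import math
import statistics

def one_sample_t(sample, mu0):
    """Return the one-sample t statistic and its degrees of freedom."""
    n = len(sample)
    mean = statistics.mean(sample)
    sd = statistics.stdev(sample)          # sample standard deviation
    standard_error = sd / math.sqrt(n)
    return (mean - mu0) / standard_error, n - 1

# Hypothetical measurements tested against a hypothesized mean of 5.0.
sample = [5.1, 4.9, 5.3, 5.0, 5.2]
t_stat, dof = one_sample_t(sample, mu0=5.0)
```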

Introduction to Predictive Modeling

Concept of model in analytics and how it is used?

Common terminology used in analytics & modeling process

Popular modeling algorithms

Types of Business problems - Mapping of Techniques

Different Phases of Predictive Modeling

Data Exploration for modeling

Need for structured exploratory data

EDA framework for exploring the data and identifying any problems with the data (Data Audit Report)

Identify missing data

Identify outliers

Visualize the data trends and patterns

Data Preparation

Need of Data preparation

Consolidation/Aggregation - Outlier treatment - Flat Liners - Missing values - Dummy creation - Variable Reduction

Variable Reduction Techniques - Factor Analysis & PCA

Segmentation: Solving segmentation problems

Introduction to Segmentation

Types of Segmentation (Subjective Vs Objective, Heuristic Vs. Statistical)

Heuristic Segmentation Techniques (Value Based, RFM Segmentation and Life Stage Segmentation)

Behavioral Segmentation Techniques (K-Means Cluster Analysis)

Cluster evaluation and profiling - Identify cluster characteristics

Interpretation of results - Implementation on new data
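A bare-bones illustration of K-Means (Lloyd's algorithm) on one-dimensional data, written in plain Python so each step is visible. The spend values and starting centroids are invented; in practice a library such as scikit-learn would normally be used.

```python
def k_means_1d(values, centroids, max_iter=100):
    """Lloyd's algorithm on 1-D data, starting from the given centroids."""
    centroids = list(centroids)
    labels = []
    for _ in range(max_iter):
        # Assignment step: index of the nearest centroid for each value.
        labels = [
            min(range(len(centroids)), key=lambda j: abs(v - centroids[j]))
            for v in values
        ]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for j, old in enumerate(centroids):
            members = [v for v, lab in zip(values, labels) if lab == j]
            new_centroids.append(sum(members) / len(members) if members else old)
        if new_centroids == centroids:
            break  # converged: centroids no longer move
        centroids = new_centroids
    return centroids, labels

# Hypothetical monthly spend per customer and hand-picked initial centroids.
monthly_spend = [10, 12, 11, 50, 52, 49, 90, 95]
final_centroids, cluster_labels = k_means_1d(monthly_spend, [10, 50, 90])
```

The final centroids serve as a simple cluster profile, and a new customer can be scored by assigning it to the nearest centroid.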

Linear Regression: Solving regression problems

Introduction - Applications

Assumptions of Linear Regression

Building Linear Regression Model

Understanding standard metrics (Variable significance, R-square/Adjusted R-square, Global hypothesis, etc.)

Assess the overall effectiveness of the model

Validation of Models (Re running Vs. Scoring)

Standard Business Outputs (Decile Analysis, Error distribution (histogram), Model equation, drivers etc.)

Interpretation of Results - Business Validation - Implementation on new data
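As a small worked example of building and scoring a linear regression, the sketch below fits a one-variable model by ordinary least squares and computes R-square. The data points are invented, and the closed-form formulas cover only the single-predictor case.

```python
def fit_simple_linear_regression(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def r_squared(x, y, slope, intercept):
    """Share of the variance in y explained by the fitted line."""
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1 - ss_res / ss_tot

# Hypothetical data: x could be ad spend, y could be sales.
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
slope, intercept = fit_simple_linear_regression(x, y)
fit_quality = r_squared(x, y, slope, intercept)

# Scoring new data with the model equation.
prediction_for_6 = intercept + slope * 6
```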

Logistic Regression: Solving classification problems

Introduction - Applications

Linear Regression Vs. Logistic Regression Vs. Generalized Linear Models

Building Logistic Regression Model (Binary Logistic Model)

Understanding standard model metrics (Concordance, Variable significance, Hosmer-Lemeshow Test, Gini, KS, Misclassification, ROC Curve, etc.)

Validation of Logistic Regression Models (Re running Vs. Scoring)

Standard Business Outputs (Decile Analysis, ROC Curve, Probability Cut-offs, Lift charts, Model equation, Drivers or variable importance, etc)

Interpretation of Results - Business Validation - Implementation on new data

Time Series Forecasting: Solving forecasting problems

Introduction - Applications

Time Series Components (Trend, Seasonality, Cyclicity and Level) and Decomposition

Classification of Techniques (Pattern based - Pattern less)

Basic Techniques - Averages, Smoothing, etc.

Advanced Techniques - AR Models, ARIMA, etc

Understanding Forecasting Accuracy - MAPE, MAD, MSE, etc
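The accuracy measures named above, plus a simple moving average as an example of a basic technique, can be sketched as follows. The series is invented, MAD is taken here as the mean absolute deviation of the forecast errors, and the MAPE formula assumes no actual value is zero.

```python
def moving_average(series, window):
    """Simple moving average over a fixed-size window."""
    return [
        sum(series[i:i + window]) / window
        for i in range(len(series) - window + 1)
    ]

def forecast_errors(actual, forecast):
    """Return MAD, MSE and MAPE (in percent) for paired values."""
    errors = [a - f for a, f in zip(actual, forecast)]
    n = len(errors)
    mad = sum(abs(e) for e in errors) / n
    mse = sum(e ** 2 for e in errors) / n
    mape = 100 * sum(abs(e) / abs(a) for e, a in zip(errors, actual)) / n
    return {"MAD": mad, "MSE": mse, "MAPE": mape}

# Hypothetical actuals and forecasts.
actual = [100, 200, 400]
forecast = [110, 190, 400]
accuracy = forecast_errors(actual, forecast)
smoothed = moving_average([1, 2, 3, 4, 5], window=3)
```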

Data Science using Python-Machine Learning (3/4)

Total Duration: 24 hours live training + Practice

Machine Learning -Predictive Modeling – Basics

Introduction to Machine Learning & Predictive Modeling

Types of Business problems - Mapping of Techniques - Regression vs. classification vs. segmentation vs. Forecasting

Major Classes of Learning Algorithms -Supervised vs Unsupervised Learning

Different Phases of Predictive Modeling (Data Pre-processing, Sampling, Model Building, Validation)

Overfitting (Bias-Variance Trade off) & Performance Metrics

Feature engineering & dimension reduction

Concept of optimization & cost function

Overview of gradient descent algorithm

Overview of Cross validation (Bootstrapping, K-Fold validation, etc.)

Model performance metrics (R-square, Adjusted R-square, RMSE, MAPE, AUC, ROC curve, recall, precision, sensitivity, specificity, confusion matrix)
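For the classification metrics listed above, a short sketch shows how they derive from the four cells of a binary confusion matrix. The labels are invented; recall and sensitivity are the same quantity.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Confusion-matrix cells and the metrics derived from them."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    return {
        "tp": tp, "fp": fp, "fn": fn, "tn": tn,
        "precision": tp / (tp + fp),
        "recall": tp / (tp + fn),          # also called sensitivity
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / len(pairs),
    }

# Hypothetical true labels and model predictions.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
metrics = classification_metrics(y_true, y_pred)
```

This sketch assumes at least one predicted positive and one actual example of each class; otherwise the ratios would divide by zero.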

Unsupervised Learning: Segmentation

What is segmentation & Role of ML in Segmentation?

Concept of Distance and related math background

K-Means Clustering

Expectation Maximization

Hierarchical Clustering

Spectral Clustering and DBSCAN

Principal Component Analysis (PCA)

Supervised Learning: Decision Trees

Decision Trees - Introduction - Applications

Types of Decision Tree Algorithms

Construction of Decision Trees through Simplified Examples; Choosing the "Best" attribute at each Non-Leaf node; Entropy; Information Gain, Gini Index, Chi Square, Regression Trees

Generalizing Decision Trees; Information Content and Gain Ratio; Dealing with Numerical Variables; other Measures of Randomness

Pruning a Decision Tree; Cost as a consideration; Unwrapping Trees as Rules

Decision Trees - Validation

Overfitting - Best Practices to avoid
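Entropy, the Gini index and information gain, which are used to choose the "best" attribute at a node, can be computed directly. The labels and the two candidate features below are invented for illustration.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum(
        (c / total) * math.log2(c / total) for c in Counter(labels).values()
    )

def gini_index(labels):
    """Gini impurity: 1 minus the sum of squared class proportions."""
    total = len(labels)
    return 1 - sum((c / total) ** 2 for c in Counter(labels).values())

def information_gain(labels, feature_values):
    """Entropy reduction from splitting the labels by a feature."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    weighted = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - weighted

# Hypothetical target and two candidate split attributes.
target = ["yes", "yes", "no", "no"]
perfect_feature = ["a", "a", "b", "b"]      # separates the classes fully
useless_feature = ["a", "b", "a", "b"]      # leaves each branch mixed
gain_perfect = information_gain(target, perfect_feature)
gain_useless = information_gain(target, useless_feature)
```

A tree builder would pick the attribute with the highest gain at each non-leaf node, here `perfect_feature`.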

Supervised Learning: Ensemble Learning

Concept of Ensembling

Manual Ensembling Vs. Automated Ensembling

Methods of Ensembling (Stacking, Mixture of Experts)

Bagging (Logic, Practical Applications)

Random forest (Logic, Practical Applications)

Boosting (Logic, Practical Applications)

Ada Boost

Gradient Boosting Machines (GBM)

XGBoost

Supervised Learning: Artificial Neural Networks (ANN)

Motivation for Neural Networks and Its Applications

Perceptron and Single Layer Neural Network, and Hand Calculations

Learning in a Multi-Layered Neural Net: Back Propagation and Conjugate Gradient Techniques

Neural Networks for Regression

Neural Networks for Classification

Interpretation of outputs and fine-tuning the models with hyperparameters

Validating ANN models
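The perceptron hand calculations can be mirrored in code. This sketch trains a single perceptron on the logical AND function with the classic perceptron learning rule. The learning rate, epoch count and zero initialization are arbitrary choices for illustration, and it covers only the single-layer case, not back propagation.

```python
def train_perceptron(samples, targets, learning_rate=1, epochs=20):
    """Perceptron learning rule with a step activation function."""
    weights = [0] * len(samples[0])
    bias = 0
    for _ in range(epochs):
        for x, target in zip(samples, targets):
            activation = sum(w * xi for w, xi in zip(weights, x)) + bias
            prediction = 1 if activation > 0 else 0
            error = target - prediction
            # Update only changes anything when the prediction is wrong.
            weights = [w + learning_rate * error * xi for w, xi in zip(weights, x)]
            bias += learning_rate * error
    return weights, bias

def predict(weights, bias, x):
    """Step activation: output 1 when the weighted sum is positive."""
    return 1 if sum(w * xi for w, xi in zip(weights, x)) + bias > 0 else 0

# Logical AND is linearly separable, so the rule converges.
inputs = [(0, 0), (0, 1), (1, 0), (1, 1)]
and_targets = [0, 0, 0, 1]
weights, bias = train_perceptron(inputs, and_targets)
predictions = [predict(weights, bias, x) for x in inputs]
```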

Supervised Learning: Support Vector Machines

Motivation for Support Vector Machine & Applications

Support Vector Regression

Support vector classifier (Linear & Non-Linear)

Mathematical Intuition (Kernel Methods Revisited, Quadratic Optimization and Soft Constraints)

Interpretation of outputs and fine-tuning the models with hyperparameters

Validating SVM models

Data Science using Python-Machine Learning (4/4)

Supervised Learning: KNN

What is KNN & Applications?

KNN for missing value treatment

KNN for solving regression problems

KNN for solving classification problems

Validating KNN model

Model fine-tuning with hyperparameters

Supervised Learning: Naïve Bayes

Concept of Conditional Probability

Bayes Theorem and Its Applications

Naïve Bayes for classification

Applications of Naïve Bayes in Classifications

Text Mining & Analytics

Taming big text; Unstructured vs. Semi-structured Data; Fundamentals of information retrieval; Properties of words; Creating Term-Document (TxD) Matrices; Similarity measures; Low-level processes (Sentence Splitting; Tokenization; Part-of-Speech Tagging; Stemming; Chunking)

Finding patterns in text: text mining, text as a graph

Natural Language processing (NLP)

Text Analytics – Sentiment Analysis using R

Text Analytics – Word cloud analysis using R

Text Analytics - Segmentation using K-Means/Hierarchical Clustering

Text Analytics - Classification (Spam/Not spam)

Applications of Social Media Analytics

Metrics (Measures, Actions) in social media analytics

Examples & Actionable Insights using Social Media Analytics
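To illustrate tokenization, a term-document matrix and a similarity measure from the text mining topics, here is a small standard-library sketch. The three documents are invented, and the regex tokenizer is a deliberately crude stand-in for what a library such as NLTK provides.

```python
import math
import re
from collections import Counter

# Hypothetical mini corpus, invented for this illustration.
docs = [
    "Free offer, click now",
    "Meeting agenda for Monday",
    "Click now for a free prize",
]

def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())

# Term counts per document, and the sorted vocabulary across documents.
counts = [Counter(tokenize(d)) for d in docs]
vocab = sorted({term for c in counts for term in c})

# Term-document matrix: one row per term, one column per document.
term_document = [[c[term] for c in counts] for term in vocab]

def cosine_similarity(a, b):
    """Cosine of the angle between two count vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

# Document vectors are the columns of the term-document matrix.
doc_vectors = list(zip(*term_document))
sim_0_1 = cosine_similarity(doc_vectors[0], doc_vectors[1])
sim_0_2 = cosine_similarity(doc_vectors[0], doc_vectors[2])
```

The two promotional messages share three terms and so score much higher than the unrelated pair, which is the intuition behind similarity-based spam detection.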

• Important Python modules for Machine Learning (scikit-learn, statsmodels, SciPy, NLTK, etc.)

• Fine-tuning the models using hyperparameters, grid search, pipelines, etc.

Project - Consolidate Learnings:

Applying different algorithms to solve business problems and benchmarking the results

Course completion and career assistance

Course completion & Certification criteria
• You will be awarded an AnalytixLabs certificate only after the submission and evaluation of the mandatory course project work. These will be provided as a part of the training.
• There is no pass/fail for these assignments and projects. Our objective is to ensure that trainees get strong hands-on experience so that they are well prepared for job interviews and for performance at their jobs.
• In case the assignments and projects are not up to the mark, trainees are welcome to take help and support for improvement.
• While a weekly schedule is shared with trainees for regular assignments, candidates get 3 months after course completion to submit their final assignment and projects.

What is included in career assistance?
• After successful course completion, candidates can seek assistance from AnalytixLabs for profile building. A team of seasoned professionals will help you based on your overall education background and work experience. This will be followed by interview preparation along with mock interviews (if required).
• Job referrals are based on the requirements we get from various organizations, HR consultants and the large pool of AnalytixLabs’ ex-students working in various companies.
• No one can truthfully provide a job guarantee, particularly for good quality job profiles in Analytics. However, most of our students do get multiple interview calls and good career options based on the skills they learn during training. For this there will be continuous support from our side for as long as required.

Time and investment

Full interactive online training: 72 hours live training + Practice (~120 hours), INR 30,000 + 18% GST / $1200 (foreign nationals) including taxes

Data Science using Python (self-paced): ~72 hours + Practice, INR 25,000 + 18% GST / $900 (foreign nationals)

Timing: 6 hours per weekend live training (Saturday & Sunday 3 hours each) + Practice

Training mode: Fully interactive live online class (In addition to the above, you will also get access to the recordings for future reference and self study)

Components: Learning Management System access for courseware like class recordings, study material, Industry-relevant project work

Certification: Participants will be awarded a certificate on successful completion of the stipulated requirements including an evaluation

We provide trainings both in ‘fully interactive live online’ and classroom* mode

Fully interactive live online class with personal attention

Saves commuting time and resources in today’s chaotic world

Ensures best use of time and resources

Access to quality training and 24x7 practice sessions available at the comfort of your place

Delivered lectures are recorded and can be replayed by individuals as per their needs

Studies prove that online education beats the conventional classroom

One of the strongest global trends in education, both in developing and developed countries

*Classroom mode is available only at the Gurgaon and Bangalore centers

Contact Us

Visit us on: http://www.analytixlabs.in/

For course registration, please visit: http://www.analytixlabs.co.in/course-registration/

For more information, please contact us: http://www.analytixlabs.co.in/contact-us/

Or email: info@analytixlabs.co.in

Call us, we would love to speak with you: (+91) 9555219007

Join us on:

Twitter - http://twitter.com/#!/AnalytixLabs

Facebook - http://www.facebook.com/analytixlabs

LinkedIn - http://www.linkedin.com/in/analytixlabs

Blog - http://www.analytixlabs.co.in/category/blog/

Visit Us

Gurgaon Address:

GF 382, Sector 29, Adjoining IFFCO Chowk Metro Station (Gate 2), Next to Vasan Eye Care Hospital, Gurgaon, Haryana 122001, India

Bengaluru Address:

Bldg 41, First floor, 14th Main Road, Near BDA complex, Sector 7, HSR Layout Bengaluru - 560102 Landmark: Max store