
Digital and Web Analyst.

1. Digital Analytics Foundation
2. Google Analytics
3. Data Analysis
4. A/B testing
5. Google Tag Manager
6. Data Visualization
7. Email, Social and Campaign Analysis

https://www.udemy.com/career-in-digital-web-analytics/
https://www.digitalvidya.com/blog/

SQL Fundamentals for Marketing, Digital and Web Analytics
https://www.udemy.com/sql-for-marketing-analysis/?couponCode=GAASQLDMA

Introduction
Installation and Overview of SQL Environment (MySQL)
• Installing MySQL
• MySQL Interface
Database Fundamentals
• How databases are structured
• Understanding a schema in MySQL
• Missing Sakila Database and how to solve it
Selecting the Data
• Write your first statement to select the data
• Narrowing down your result set with WHERE clause
• Sorting your result set
• Using Numeric and Date Values to narrow down your result set
Other Key SQL Statements
• Adding new records to the database
• Updating existing records
• Deleting records
Some advanced Select Statements
• Aggregating the data to do sum and count
• Joining multiple tables to get desired result set
Marketing Use Cases
• Who are the top or bottom customers?
• How many products do we have in each category?
Using data from the database in other programs
• Exporting data to CSV or Excel file
• Connecting with Data Visualization Tool (Tableau)
Database Manipulation
• Creating a table
• Few more table manipulation commands
• Importing data from a CSV file
• Importing data from Excel to MySQL
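
The statements this outline covers can be tried without installing anything, using Python's built-in sqlite3 module. This is only a sketch under assumptions: the course itself uses MySQL and the Sakila sample database, whereas the `customer` and `payment` tables and all rows below are invented for illustration, and SQLite's dialect differs from MySQL in places.

```python
import csv
import io
import sqlite3

# In-memory SQLite database; a stand-in for the MySQL server used in the course.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Creating tables (hypothetical schema, not the real Sakila layout).
cur.execute("CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, name TEXT)")
cur.execute("CREATE TABLE payment (customer_id INTEGER, amount REAL, paid_on TEXT)")

# Adding new records.
cur.executemany("INSERT INTO customer VALUES (?, ?)",
                [(1, "Asha"), (2, "Ben"), (3, "Chen")])
cur.executemany("INSERT INTO payment VALUES (?, ?, ?)",
                [(1, 20.0, "2018-01-05"), (1, 35.0, "2018-02-10"),
                 (2, 15.0, "2018-01-20"), (3, 50.0, "2018-03-01"),
                 (3, 40.0, "2018-03-15")])

# Updating and deleting existing records.
cur.execute("UPDATE customer SET name = 'Benjamin' WHERE customer_id = 2")
cur.execute("DELETE FROM payment WHERE amount < 16")

# Selecting, narrowing with WHERE (numeric and ISO-formatted date values), sorting.
recent = cur.execute(
    "SELECT customer_id, amount FROM payment "
    "WHERE amount >= 30 AND paid_on >= '2018-02-01' "
    "ORDER BY amount DESC").fetchall()

# Joining and aggregating: who are the top customers by total spend?
top_customers = cur.execute(
    "SELECT c.name, COUNT(*) AS payments, SUM(p.amount) AS total "
    "FROM customer AS c JOIN payment AS p ON p.customer_id = c.customer_id "
    "GROUP BY c.customer_id ORDER BY total DESC").fetchall()

# Exporting a result set as CSV text (write it to a file for Excel or Tableau).
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(["name", "payments", "total"])
writer.writerows(top_customers)
csv_text = buffer.getvalue()
```

Reversing the sort order (`ORDER BY total ASC`) gives the bottom customers instead; the "products in each category" question follows the same GROUP BY and COUNT pattern.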

Technical and Computer Skills

Scripting language (MATLAB, Python)
Querying language (SQL, Hive, Pig)
Spreadsheet (Excel)
Statistical language (SAS, R, SPSS)
Programming (JavaScript, XML)
Big data tools (Spark, Hive HQL)

Career Resources in Data Analytics

• Advanced Degree – More Data Science programs are popping up to serve the current demand, but
there are also many Mathematics, Statistics, and Computer Science programs.
• MOOCs – Coursera, Udacity, and Codecademy are good places to start.
• Certifications – KDnuggets has compiled an extensive list.
• Bootcamps – For more information about how this approach compares to degree programs or
MOOCs, check out this guest blog from the data scientists at Datascope Analytics.
• Kaggle – Kaggle hosts data science competitions where you can practice, hone your skills with messy,
real-world data, and tackle actual business problems. Employers take Kaggle rankings seriously, as
they can be seen as relevant, hands-on project work.
• LinkedIn Groups – Join relevant groups to interact with other members of the data science community.
• KDnuggets – KDnuggets is a good resource for staying at the forefront of industry trends in data
science.

Must Have Skills:
• Experience in the design, development and deployment of virtual agent technology such as
bots, auto-responders, virtual assistants, etc. on top of Tableau, Qlik or other BI platforms or
databases.
• Experience programming in Python, Scala, .NET or similar languages.
Demonstrable knowledge of frameworks like Flask and Django is a definite plus.
• Hands-on experience with Tableau, Power BI, Sisense, MicroStrategy and similar
visualization tools and platform APIs is required.
• Demonstrable use of cloud technologies using AWS services, having architected
solutions for private and public cloud environments, including Linux/Windows-based cloud
deployment.
• Hands-on experience in at least one relational database such as SQL Server, Oracle, MySQL,
etc.
• AWS certification and prior experience with NoSQL databases will be a plus.

Must Have Skills:

• 1 - 3 years of experience in analytics teams solving problems for top global companies
• Hands-on experience in building and deploying Statistical Models/Machine Learning
using the following techniques – Statistical Algorithms (preferred in sequence)
o Segmentation (Cluster Analysis)
o Exploratory Analytics (T-tests)
o Multivariate Regression (Logistic, OLS, GLM etc.)
o Decision Trees
• Experience with SQL preferred
o Traditional DBs – Oracle, TD, Sybase, MS SQL, Win SQL
• Excellent written and oral communication skills, with the ability to clearly communicate ideas
and results to non-technical business people

Good To Have Skills:

• Experience with SQL preferred
• Visual storytelling ability with BI & Visualization Tools (Tableau/Spotfire, Cognos,
Business Objects) is preferred.
• Experience using digital & statistical modeling software (one or more) – R, Revo R
(Preferred), SAS Basic & Enterprise Miner (Preferred), SQL (Preferred), SPSS
• Any of the Big Data querying languages (Hive, Pig, Hadoop, etc.)

• Advanced MS Excel, Access and VBA (Preferable).
• Web Analytics tools (Google Analytics, Omniture, Marketo)
• Production tools & languages: Photoshop, Dreamweaver, Content Management
Systems and HTML (Desirable)
• Maintaining content within an enterprise web experience management platform (e.g.
Microsoft SharePoint, Adobe CQ5)
• Database reporting packages (e.g. Report Builder, Crystal Report Writer, COGNOS
Reporting) (Desirable)
• -3 years Research & Reporting experience: analytics for website, social media and
digital advertising.

COETL - Business Analyst Training in Hyderabad
201, Sri Sai Ram's Swarna Latha Estates, Yousufguda Main Rd, Keerthi Apartments Ln, Ali Nagar,
Padala Ramareddy Colony, Yella Reddy Guda, Hyderabad, Telangana
Call: 088865 29997

COEPD
Third Floor, Sahithi Arcade, 301, SR Nagar Main Rd, Srinivasa Nagar, Sanjeeva Reddy Nagar,
Hyderabad, Telangana

Imarticus Learning - Data Analytics, Data Science Courses & Finance Training Institute
303, 3rd Floor, Block 1, White House, Begumpet, Kundanbagh Colony, Begumpet, Hyderabad, Telangana

Analytics Path
Plot No. 28, 4th Floor, Suraj Trade Center, Opp. Cyber Towers, Hitech City, Jaihind Enclave, Madhapur,
Hyderabad, Telangana

We enrolled in new courses on Coursera and edX, where we started learning, and I searched for a job myself.

If one does not have any industry experience in the Big Data/Data Science/Statistics/Analytics domain, then
he/she WILL NOT magically land a job in the relevant roles by earning a certificate or clearing the course
with high grades.

From my personal experience, self-learning or online learning websites such as Coursera, edX and
Udacity were far better than this institution, given what I paid to achieve the certification. If you
really want to land a job in the relevant roles, then consider Kaggle data science challenges.

http://sqlschool.com/Microsoft-Datascience-Training.html

The few courses which I can suggest are:

1. Udacity - Take the free course called Introduction to Computer Science (Udacity -
Free Online Classes & Nanodegrees).
2. Take courses from DataCamp (Learn R, Python & Data Science Online | DataCamp).
3. Learn data science with Python and R. Get started for free.
4. MIT 6.00.1x on edX

I am learning “Machine Learning” from online resources and I want to learn “TensorFlow”. I have
some basic knowledge of Python coding. How do I start? What online resources should I follow?

Follow the deep learning course by Andrew Ng on Coursera. It will teach you.

But before that, learn machine learning and linear algebra. For that you can follow
Andrew Ng’s course named Machine Learning, again on Coursera.

https://www.quora.com/What-are-the-best-learning-sites-for-Python
https://www.quora.com/What-are-some-good-free-resources-to-learn-Python
https://www.quora.com/What-are-the-best-sources-to-learn-Python-language
https://www.quora.com/How-should-I-start-learning-Python-1
https://www.quora.com/Which-is-the-best-place-to-learn-Python-having-previous-knowledge-in-programming
https://www.quora.com/Which-is-the-best-book-for-learning-the-Python-programming-language
https://www.quora.com/My-boss-gave-me-30-days-not-working-days-to-learn-Python-to-transfer-to-the-Data-Science-team-What-is-the-best-approach-to-learn-as-much-as-possible
https://www.digitalvidya.com/python-data-science-course/

http://www.isb.edu/cba/programme-overview/curriculum
Curriculum Details (*subject to minor changes)
Software tools in the Programme:
R, MySQL, Python, Stata, @Risk, Simio, Tableau, XLMiner, NodeXL MeXL, Hadoop (AWS)

PRE-TERM

1) Data Structures and Algorithms Using Python
2) Probability and Statistics using R
3) Relational Database Management Systems

TERM 1

1) Big Data Management
2) Business Fundamentals / Text Analytics
3) Data Collection / Data Visualization
4) Statistical Analysis 1

TERM 2

1) Forecasting Analytics 1 / Simulation
2) Forecasting Analytics 2 (online)
3) Machine Learning 1 - Unsupervised Learning
4) Optimization
5) Statistical Analysis 2

TERM 3

1) Financial Analytics (Online)
2) Machine Learning 2 - Supervised Learning
3) Marketing Analytics / Advanced topics in Machine Learning (Deep learning, IOT)
4) Pricing Analytics / Social Network Analysis
5) Retail Analytics (Customer Analytics / Supply Chain Analytics)

CAPSTONE PROJECT

1. NPTEL’s Data Science
2. Udacity Data Analytics
3. Coursera’s Data Science

Prerequisites might include an understanding of R or Python.

Below are a few of the courses that I consider the best in the field of Data Science and
Analytics.

DATA SCIENCE AND ANALYTICS

• Data Science Free Online Courses (edX) | Data Science Institute
• Data Science | edX
• Intro to Data Science | Udacity
• Full Stack Data Science Program - Jigsaw - Jigsaw Academy
• DataCamp: Learn R, Python & Data Science Online
• Data Science | Johns Hopkins University Engineering for Professionals
• Data Scientist Foundation | Udacity
• Data Analyst | Udacity
• Data Science - Udacity
• https://classroom.udacity.com/co...
• https://www.edx.org/course/subje...
• https://in.udacity.com/course/in...
• https://www.coursera.org/learn/d...

PGP courses in business analytics at Praxis Business School, which has tie-ups with PwC and ICICI, are
offered online and full time at the Bangalore and Kolkata campuses, and at the Hyderabad campus in
collaboration with IIDT and in association with the Hyderabad government. They have also recently
launched online courses in PG data science and business analytics of 8-12 months' duration. The fees
are much lower, you earn a degree from an AICTE-approved institute, placement support is the same as
for the full-time batch, and every month an industry expert takes a session via VLP.

Syllabus for PGDBA Written Test 2018

VERBAL ABILITY: Reading & Comprehension; Grammar/correction of sentences.

LOGICAL REASONING: Logical connectives, Statements and Conclusions, Matching and sequences.

DATA INTERPRETATION AND DATA VISUALIZATION: Data driven questions (pie charts, graphs, trends).

QUANTITATIVE APTITUDE: Sets, combinatorics, Algebra (solutions of quadratic equations, inequalities,
simultaneous linear equations, binomial theorem, series, AP, GP, HP, matrices), Euclidean geometry, Coordinate
geometry (lines, circles, conic sections), Trigonometry (triangles, trigonometric identities, heights and distances),
Calculus (functions, limits, continuity, derivative, maxima & minima, methods of integration, evaluation of areas
using integration).

The duration of the test is 3 hours. Calculator, charts, graph sheets, tables or gadgets are NOT allowed in the
examination hall.

Introduction to data mining (DM); DM methodology; Preparing data for mining; Data mining and Statistics;
Hazard Functions; Survival Analysis; Memory based reasoning; Market Basket Analysis, Link Analysis,
Decision Trees, Clustering; Privacy and Societal Issues, the Data mining environment, Putting DM to work.
REFERENCES:
Data Mining: Concepts and Techniques by Jiawei Han and Micheline Kamber, Morgan Kaufmann Publishers, 2001.
Data Mining: A Tutorial-Based Primer by Richard J. Roiger and Michael W. Geatz, Addison-Wesley 2003
Data Mining: Concepts, Models, Methods and Algorithms by Mehmed Kantardzic, Wiley, 2003
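
Market Basket Analysis, one of the topics in this syllabus, comes down to counting how often items appear together. The sketch below is a minimal illustration, not the course's own material: the baskets are invented, and `pair_support` and `confidence` are hypothetical helper names.

```python
from collections import Counter
from itertools import combinations

# Hypothetical shopping baskets, invented for illustration.
baskets = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]

def pair_support(baskets):
    """Return the fraction of baskets that contain each unordered item pair."""
    counts = Counter()
    for basket in baskets:
        for pair in combinations(sorted(basket), 2):
            counts[pair] += 1
    return {pair: n / len(baskets) for pair, n in counts.items()}

def confidence(baskets, antecedent, consequent):
    """Estimate P(consequent in basket | antecedent in basket)."""
    with_antecedent = [b for b in baskets if antecedent in b]
    if not with_antecedent:
        return 0.0
    return sum(consequent in b for b in with_antecedent) / len(with_antecedent)

support = pair_support(baskets)
bread_to_milk = confidence(baskets, "bread", "milk")
```

Here `support[("bread", "milk")]` is the share of all baskets holding both items, while `bread_to_milk` is the share of bread baskets that also hold milk.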

MG 221 – Applied Statistics – 2:1 – Chiranjit Mukhopadhyay


Quick introduction to probability distributions. One and Two Sample Problems for mean, variance and
proportions – Z-test, t-test, χ²-test, F-test, Sign Test, Wilcoxon Rank-Sum and Signed-Rank Test.
Introduction to Design of Experiments. Quantitative Response with Qualitative Factors – Analysis of
Variance, Interaction Effects, Multiple Comparisons; General Analysis of Quantitative Response – Multiple
Linear Regression Modeling and Prediction, Multiple and Partial Effects and Correlations, Residual
Analysis, Dummy Variable Techniques, Analysis of Covariance, Model Building; Analysis of Qualitative
Response – Contingency Tables, Logistic Regression.
REFERENCES:
Michael H. Kutner, Christopher J. Nachtsheim, John Neter & William Li, Applied Linear Statistical Models,
McGraw-Hill International Edition, Fifth Edition, 2005.
C.R. Rao: Linear Statistical Inference and its Applications. Wiley. Second Edition.
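
As a rough illustration of the two-sample problem for means, the sketch below computes a t statistic using only the standard library. It assumes Welch's unequal-variance form, which may differ from the pooled-variance version taught in the course; the measurements are invented, and `welch_t` is a hypothetical helper name.

```python
import math
import statistics

def welch_t(sample_a, sample_b):
    """Return (t statistic, approximate degrees of freedom) for Welch's t-test."""
    n_a, n_b = len(sample_a), len(sample_b)
    var_a = statistics.variance(sample_a)  # sample variance (n - 1 denominator)
    var_b = statistics.variance(sample_b)
    se_a, se_b = var_a / n_a, var_b / n_b
    t = (statistics.mean(sample_a) - statistics.mean(sample_b)) / math.sqrt(se_a + se_b)
    # Welch-Satterthwaite approximation to the degrees of freedom.
    df = (se_a + se_b) ** 2 / (se_a ** 2 / (n_a - 1) + se_b ** 2 / (n_b - 1))
    return t, df

# Hypothetical measurements for two groups.
group_a = [2.0, 4.0, 6.0, 8.0]
group_b = [1.0, 2.0, 3.0, 4.0]
t_stat, dof = welch_t(group_a, group_b)
```

The statistic is then compared against a t distribution with `dof` degrees of freedom to obtain a p-value; that lookup is left out here because the standard library has no t distribution.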

MG 226 – Time Series Analysis and Forecasting – 3:0 – Chiranjit Mukhopadhyay


Stationary Stochastic Processes, Auto-covariance Function, Random Walk, Moving Average (MA) Models,
Autoregressive (AR) Models, Integration (I) & Differencing, Autocorrelation Functions, Partial
Autocorrelation Functions, Unit Root Tests, ARIMA Modeling, Frequency Domain Analysis, Seasonality
Modeling, SARIMA Models, Forecasting Using ARIMA Models, Vector Auto Regressive Models,
Co-integration and Vector Error Correction Models, Generalized Autoregressive Conditional Heteroscedastic
Model.
REFERENCES:
Brockwell, Peter J & Davis, Richard A: Time series: Theory and methods. Springer series in Statistics.
Second Edition.
Chatfield, Chris: Analysis of Time Series: an Introduction. Chapman & Hall. Sixth Edition.
Lutkepohl, Helmut: Introduction to Multiple Time Series Analysis. Springer-Verlag.
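
Two of the building blocks named in this syllabus, differencing and the autocorrelation function, are small enough to sketch directly. This is an illustration under assumptions: the series is invented, the helper names are hypothetical, and the autocorrelation uses the common biased estimator (full-series sum of squares in the denominator).

```python
import statistics

def difference(series, lag=1):
    """Return the lag-differenced series: x[t] - x[t - lag]."""
    return [series[t] - series[t - lag] for t in range(lag, len(series))]

def autocorrelation(series, lag):
    """Sample autocorrelation at the given lag (biased estimator)."""
    mean = statistics.mean(series)
    denom = sum((x - mean) ** 2 for x in series)
    num = sum((series[t] - mean) * (series[t - lag] - mean)
              for t in range(lag, len(series)))
    return num / denom

# Hypothetical series with a linear trend, so it is not stationary in the mean.
trend = [1.0, 3.0, 5.0, 7.0, 9.0]
diffed = difference(trend)        # one difference removes the linear trend
acf1 = autocorrelation(trend, 1)  # lag-1 autocorrelation of the raw series
```

Differencing until the series looks stationary is the "I" step of ARIMA modeling; the autocorrelation values at successive lags are what an ACF plot displays.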

https://www.analytixlabs.co.in/data-science-using-python

Step 1: Learn mathematics and statistics

1. Complete the Udacity course on descriptive statistics. It will give you an
understanding of how to describe data and the statistics knowledge required to
interpret the data.

Intro to Descriptive Statistics | Udacity

2. Complete the Udacity course on inferential statistics. The course will give you an
understanding of how to draw conclusions from data. This is one of the best online
courses I have ever taken!

Inferential Statistics: Learn Statistical Analysis | Udacity

Now you have the basic statistical knowledge required to understand data and to draw
conclusions from it.

You may also consider the following courses if you do not have a mathematics
background.

Algebra I | Khan Academy

Intro to Statistics | Udacity

Step 2: Tool for analysis

Now that you have the required knowledge to understand the data, you need a tool to do
the analysis. The most popular tools are Python and R. If you already have
programming skills, I recommend Python; otherwise choose R.

1. Python

If you are new to programming, complete the Python course on Codecademy. The course
gives you an introduction to Python. Then move on to courses that are relevant from a
data analytics perspective. I did a course on DataCamp - Learn Python
for Data Science - Online Course. However, you may choose any online course, but you
need to be clear on the concepts of lists, dictionaries, data frames, NumPy, pandas, etc.

After this point, do you know when to use an array over a list? If not, you need to revise
the concepts again.
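
One way to check your answer to the array-versus-list question is the small experiment below. The array meant here is most likely a NumPy array; to stay dependency-free, this sketch substitutes the standard library's `array.array`, which shares the key property of holding a single numeric type. The values are invented.

```python
from array import array

# A list can hold values of any type and grows freely.
mixed = [1, "two", 3.0]
mixed.append({"four": 4})

# An array stores one machine type only ('d' = C double), more compactly.
readings = array("d", [1.5, 2.5, 3.5])

try:
    readings.append("not a number")  # rejected: wrong type for a 'd' array
    accepted = True
except TypeError:
    accepted = False

# Every element has the same fixed size, so memory use is predictable.
bytes_used = readings.itemsize * len(readings)
```

The rule of thumb this suggests: reach for a list when items are heterogeneous or the collection is small, and for a typed array (in practice usually NumPy) when you have many numbers of one type to store and compute on.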

2. R

R is relatively easy to understand. Do the data analysis course on Udacity. The course
teaches you the basics of R as well as the data analysis process.

Exploratory Data Analysis Using R | Udacity

Step 3: Data analysis

Data analysis is an art! You need to ask the right questions in order to find something
interesting in the data. Learn how to analyze data systematically through the following
courses:

1. Python: Intro to Data Analysis | Udacity. Not a course I would recommend for a beginner,
but it helps you understand the process.

2. R: The course mentioned in Step 2: Exploratory Data Analysis Using R | Udacity

If you have completed the courses mentioned, you should have a clear picture of the
data analysis process.

You may also complete the Intro to Data Science Online Course | Udacity (optional, but
highly recommended).

Step 4: Practice

You can download data sets from a lot of sources online. The UCI Machine Learning Repository
and Kaggle are popular places to download data sets. Find a data set that
interests you and practice. Once you are comfortable with the process and tools, practice
using different types of data sets.

UCI Machine Learning Repository

Your Home for Data Science (Kaggle)

You may not have covered all the techniques required for data analysis through
these courses, but they will teach you how to proceed when you are stuck, and
where and how to look when you are lost.

Not motivated yet? Have a look at Tyler Field’s analysis of the Bay Area Bike Share
data: BABS Data Challenge. The data sets he used for his analysis: Introducing
Bay Area Bike Share, your new regional transit system.

Want to build models? Check out the following links:

The Analytics Edge

In-depth introduction to machine learning in 15 hours of expert videos

Machine Learning | Udacity

Happy learning!

https://www.quora.com/Which-is-the-best-online-course-for-data-analysis-and-data-analytics

If you are looking for a career in data analytics, I would suggest that you take up a job at
one of the analytics companies - Mu Sigma, ZS Associates, Fractal, Tredence, etc. These
companies mostly don't need any prerequisites for entry-level analyst jobs and provide
great opportunities to learn the skills from scratch.

If that's not possible for you, I would suggest you take the following progression to learn
data analytics in each of the key areas:

Maths

1. Basic statistics and data summarizing parameters like mean, median, mode, central
tendencies, distributions, etc.
2. Data integrity, comparison and tendency tests like t-test, z-test, F-test
3. Regression - Linear, Logistic, GLM, Mixed in that order
4. Advanced techniques like predictive modeling and prescriptive methods
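
The first and third items of this list can be tried with nothing more than Python's standard library. The sketch below is illustrative only: the numbers are invented, `ols_fit` is a hypothetical helper name, and only the simplest case (one predictor, ordinary least squares) is shown.

```python
import statistics

# Hypothetical data, invented for illustration.
visits = [4, 8, 8, 5, 10]
spend = [1.0, 2.0, 3.0, 4.0, 5.0]
sales = [3.0, 5.0, 7.0, 9.0, 11.0]

# Item 1: summary parameters of a single variable.
summary = {
    "mean": statistics.mean(visits),
    "median": statistics.median(visits),
    "mode": statistics.mode(visits),
    "stdev": statistics.stdev(visits),  # sample standard deviation
}

def ols_fit(x, y):
    """Least-squares slope and intercept for y = intercept + slope * x."""
    mean_x, mean_y = statistics.mean(x), statistics.mean(y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Item 3: simple linear regression of sales on spend.
slope, intercept = ols_fit(spend, sales)
```

Logistic, GLM and mixed models build on the same idea but need an iterative fit, which is where R, SAS or a Python statistics library takes over.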

Technology

1. Microsoft Excel: This is the holy grail of analytics. Learn it inside out. From simple
formulae to the data analysis tools and dashboards, you should learn it all.
2. VBA: This is an extension of Excel and, though not used very extensively, can make
a lot of tasks in Excel easier.
3. SQL: This is the logical progression from Excel for handling larger data volumes,
standardizing processes and creating code modules for repeated use.
4. SAS/R: The next step will be one of these tools, as they can help you do more complex
processing like regression and modeling.
5. Tableau: This is almost the standard right now for data visualization and
dashboarding.
6. Advanced technologies like Shiny, Hadoop, Hive, etc.

Business

1. Working of different verticals like Technology, Pharma, Healthcare, Retail, Banking,
etc.
2. Applications of analytics in each of these verticals

Hope this helps.

Edit 1: Adding some free useful links that you can use to study the above mentioned
things:

Some useful and free resources:

1. Free Statistics Book
2. Your Home for Data Science - Kaggle
3. SQL Tutorial - W3Schools
4. SQLZOO - SQL Practice
5. SAS Customer Support Knowledge Base and Community - Best place to learn about a
lot of the features
6. Tableau Training & Tutorials - Free Tableau training videos
7. Learn Excel, Charting Online - Chandoo
8. Home - Analytics Vidhya - Stay up to date

If you have forgotten your 10th and 12th maths - statistics, linear algebra, probability - start from
there.

If you don't know SQL - pick this next.

If you don't have programming experience - pick R next (since Python is better
suited to people with a programming background).

Then move to Statistical modelling >> Machine Learning >> Forecasting >> NLP.

Pick Tableau for data visualization after this.

If you are left with more time, pick up Big Data after this.

1. Learn Statistics, from basics to advanced.
a. Distributions, Hypothesis Testing
b. Regression: Linear, Multinomial, Logistic... there are many types.
c. Probability, Permutations and Combinations
d. Classification theories.
2. Learn how to implement these statistics on a computer, be it in Excel, R or
Python. There are multiple ways in which these can be
implemented. The most popular and robust are R and Python.
3. Learn Visualization. You have to make your results clear to the audience. For
this you need tools like R Shiny, Tableau, Power BI, D3.js and many others;
search online for the rest of them.
4. Get to the advanced stuff: Supervised and Unsupervised Learning, Data Mining,
Deep Learning, Machine Learning. Understand the concepts first and then
implement them through programming.
a. Steps 1, 2 and 3 will follow for these too.

https://www.lunametrics.com/blog/2016/01/27/tips-for-getting-a-job-in-the-digital-analytics-industry/
https://www.digitalvidya.com/blog/google-analytics-jobs/

You can go with the scinatics website for SAS

Python by sentdex on YouTube

SQL by Khan Academy

Excel from anywhere on YouTube

Follow KDnuggets and Kaggle

• Certificate Programme on Business Analytics and Intelligence by IIM
Bangalore - One of the most sought-after institutes in Asia, it offers a 1-year business
analytics and intelligence program, delivered online and offline.
• Data Analytics Course by Digital Vidya - Asia’s leading training institute, it
offers a Data Analytics Course that imparts comprehensive hands-on training.
• Data Science Specialization Course by Jigsaw Academy - Over 6 months you gain
knowledge of analytic tools (Excel, R, and SAS), statistical concepts, statistical techniques,
and predictive analytics skills.
• Certified Big Data Analyst and Data Science by AnalytixLabs -
Headquartered in Delhi, they offer live online and offline courses like Big Data
Analytics, Data Science, and Data Visualization.
• Certificate Program in Business Analytics – NMIMS Hyderabad - They offer
a program on Business Analytics that combines real-time applications, introductory
and advanced statistical concepts, and analytical thinking.
• Certificate Course in Big Data and Analytics by Edureka - An online
training institute that offers both technology and business-based courses.
Classes are live, online and instructor-led, taught by professionals from
the business industry.
• Post Graduate Program (PGP) in Business Analytics and Big Data by Aegis
School of Business - The curriculum caters to the skill requirements
of organisations across the world, including retail, manufacturing, healthcare,
education, insurance, computer services, and more.
• Business Intelligence and Business Analytics Program by OrangeTree
Global - Courses are delivered in online and offline formats. Students learn the
application of business analytics tools.
• Big Data Analytics by Manipal Pro Learn - They offer Big Data Analytics using
Hadoop. The course is designed to help you pursue a career in
analytics.
• Post Graduate Program in Business Analytics by Praxis Business School -
One of the top institutes for management courses and training. The program is a
9-month classroom course.

Start a free Data Analytics course with Digital Vidya

If you are not a programmer and are looking for job-oriented courses, connect with me on
LinkedIn; I will be happy to guide you :)

Do these three courses offered by Udacity. They are very helpful.

• Intro to Descriptive Statistics | Udacity
• Intro to Inferential Statistics | Udacity
• Intro to Statistics | Udacity

These courses will help you build a foundation. Then do the
following courses to get a stronger understanding.

https://www.khanacademy.org/math...

• Introduction to Statistics: Descriptive Statistics - edX
• Probability and Statistics | Coursera

By now you should be fairly familiar with statistics and probability.

R and Python are two programming tools that are very popular right now, and you
need to become proficient in whichever of the two you take up. To learn Python and R
you can do the following courses.

PYTHON

• Intro to Python Programming Course | Udacity
• Introduction to Python for Data Science
• Learn data science with Python | Dataquest

R

• R Programming | edX
• Kaggle R Tutorial on Machine Learning (practice) - DataCamp
• Introduction to R for Data Science - edX
• Free Introduction to R Programming Online Course | DataCamp
• R Programming | Coursera
• R Programming A-Z™: R For Data Science With Real Exercises! - Udemy
• Online Learning – RStudio

Now let us look at the data analytics courses that will help you build your
career in this field.

• Data Analyst | Udacity
• Data Analysis | Coursera
• Intro to Data Analysis | Udacity
• Data Analytics Course: Data Analytics Certification Training
• Data Analysis: Online Courses, Training and Tutorials on LinkedIn ...
• Free Data Science and Analysis Training Courses | DataCamp

Now you have to practice and test the skills you learned above.
Practice using different data sets. You will find them here.

• UCI Machine Learning Repository
• Your Home for Data Science

Then, if you want to build models using your skills, check the following:

• Machine Learning | Udacity
• The Analytics Edge

By now you should have good insight into data analysis using these tools. You should
also learn Tableau, as it is great for visualization. Data analysis is all about visualization
after all.

All the best. I hope this helps.

From my experience, not in any particular order

1. Coursera - Data Science Specialization
2. edX - Most of the data science courses are worthwhile
3. Jigsaw Academy - Pricey but worth considering
4. Udemy/Udacity - Plenty of choices, and cheaper too
5. PG programs from IIITB/IIMB/Great Lakes etc. - In-depth coverage + longer
duration (6–12 months) + very expensive

Web Analytics
Introduction to Digital Media Analytics

Introduction to Google Analytics

Concept of Account, Property and View

Concept of Sessions and Users

Concept of Dimension, Metric and Segment

Reading a Google Analytics Report

Audience Analytics

Acquisition Analytics

Behaviour Analytics

Real-Time Analytics

Setting Up and Analysing Events

Intelligence Events

Setting Up and Analysing Experiments

Setting Up and Measuring Conversion Goals

Attribution Modelling

Segment Reporting

Designing Custom Reports

Introduction to Google AdWords

Search Marketing

Display Marketing

Google AdWords Analytics

Managing a Google Analytics Account
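
The distinction between Users and Sessions in this outline is easier to see with a tiny simulation. The sketch below is an assumption-laden illustration, not Google Analytics' actual implementation: it treats a session as a run of hits from one client with no gap longer than 30 minutes (the commonly cited default inactivity timeout) and ignores other session boundaries such as midnight or campaign changes. The hit log and the function name are hypothetical.

```python
from datetime import datetime, timedelta

SESSION_TIMEOUT = timedelta(minutes=30)  # assumed inactivity window

# Hypothetical hit log: (client_id, timestamp of a pageview or event).
hits = [
    ("u1", datetime(2018, 5, 1, 9, 0)),
    ("u1", datetime(2018, 5, 1, 9, 10)),
    ("u1", datetime(2018, 5, 1, 10, 0)),  # 50 minutes after the last hit: new session
    ("u2", datetime(2018, 5, 1, 9, 5)),
]

def count_users_and_sessions(hits, timeout=SESSION_TIMEOUT):
    """Count distinct clients and inactivity-delimited sessions."""
    last_seen = {}
    sessions = 0
    for client_id, ts in sorted(hits, key=lambda h: h[1]):
        previous = last_seen.get(client_id)
        if previous is None or ts - previous > timeout:
            sessions += 1
        last_seen[client_id] = ts
    return len(last_seen), sessions

users, sessions = count_users_and_sessions(hits)
```

In this log two clients generate three sessions, which is why session counts in a report are normally at least as large as user counts.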

https://www.361online.com/bigdata/PGP-ba-placementassured
Introduction to Data

Introduction to Probability

Distributions

Introduction to linear regression

Foundations for inference and estimation

Foundations for inference and hypothesis testing

Linear Regression and Multiple Regression

Inference and hypothesis testing on single population

Analysis of difference in two populations

Analysis of Variance

Chi-Square Analysis

Analysis of data using Non-parametric Statistics

Linear regression analysis

Multiple regression analysis

Advanced Multiple regression analysis

Logistic Regression

Forecasting

I have crafted a learning path for you right here to learn data science in 5 – 6 months at
one-fifth of the budget you would blow at these business schools.

Step 1: Learn the fundamentals of applied analytics in R and SAS at Equiskill - Analytics
Accelerator program

Step 2: Knowbigdata Big Data Analytics course for Hadoop and Spark

Step 3: Machine Learning course on Coursera (Coursera - Free Online Courses From Top
Universities). Brush up your maths! The maths (integration, etc.) is at a high level.

Once you have been through this learning path in 6 months, you will be ready to
participate in Kaggle competitions and make a name for yourself in Data Science.

Tools covered during the course: SAS, R, Python, RapidMiner, Hadoop, Excel & Tableau

Centre Name: C-DACs - Advanced Computing Training School
Address: No.1, Shiv Bagh, Satyam Theatre Road, Ameerpet, Hyderabad, Andhra Pradesh 500016
Telephone: 040-2373 7127
Contact Person: Mr. Sharanabasappa, Senior Technical Officer
Fax: 040-2374 3382
e-Mail: cdachyd[at]cdac[dot]in
Courses: PG-DAC, PG-DESD, PG-DSSD

PG Diploma in Big Data Analytics (PG-DBDA)


NSQF level: 8


Statistical Analysis with R - 100 Hours
Programming with Python - 50 Hours
Fundamentals of Linux Programming - 40 Hours
Java with Scala - 80 Hours
Cloud Computing & HPC Applications - 40 Hours
Data Collection & DBMS (Principles, Tools & Platforms) - 80 Hours
Big Data Technologies - 130 Hours
Data Visualization - Analysis and Reporting - 40 Hours
Advanced Analytics - 60 Hours
Practical Machine Learning - 60 Hours
Aptitude and General English - 50 Hours
Effective Communication - 50 Hours
Project - 120 Hours

Table 7b: Reference books for the various topics in C-CAT.

Section A
English: Any High School Grammar Book (e.g. Wren & Martin)
Quantitative Aptitude & Reasoning: Quantitative Aptitude Fully Solved (R. S. Aggrawal); Quantitative
Aptitude (M Tyara); Barron's New GRE 2016

Section B
Computer Fundamentals: Foundations of Computing (Pradeep Sinha & Priti Sinha)
Data Communication & Networking: Data Communication & Networking (Forouzan)
C Programming: C Programming Language (Kernighan & Ritchie); Let Us C (Yashavant Kanetkar)
Data Structures: Data Structures Through C in Depth (S. K. Srivastava)
Operating Systems: Operating System Principles (Silberschatz, Galvin, Gagne)
OOP Concepts: Test Your C++ Skills (Yashavant Kanetkar)

Section C
Computer Architecture: Computer Organization & Architecture (William Stallings)
Digital Electronics: Digital Design (Morris Mano); Digital Design: Principles & Practices (John Wakerly);
Modern Digital Electronics (R. P. Jain)
Microprocessors: Microprocessor Architecture, Programming & Applications with 8085 (Ramesh Gaonkar);
The Intel Microprocessor (Barry Brey)

Books to read:

To understand power of analytics:

These books provide a good overview of how analytics can impact our business decisions
and thought process, challenges faced in implementing data based solutions and also its
limitations (the last one).

Freakonomics by Steven D. Levitt

Moneyball by Michael Lewis

Scoring points by Clive Humby and Terry Hunt

When Genius Failed by Roger Lowenstein

Gearing up on the subject:

The Signal and the Noise by Nate Silver

Big Data – A revolution that will transform How we live, work and think

Web Analytics 2.0 by Avinash Kaushik

Video based trainings:

Learning the basics:

Linear Algebra and Statistics from Khan Academy – All the basics you need,
explained in an awesome way! You realize how fun learning can be when you see them for the
first time.

Intro to Descriptive Statistics on Udacity & Inferential Statistics on Udacity – for the
activity-filled classes and exercises they provide.

For learning tools

Base SAS and Statistics course from SAS Institute – If your choice of tool is SAS

SAS Analytics U tutorials from SAS Institute (again, if SAS is your choice)

Data Science Specialization from Johns Hopkins University on Coursera – If you want
to learn in a relaxed manner (3 – 4 hours every week over a period of 9 months)

edX Analytics Edge (R) – For those who can sustain a more intensive schedule (20 – 25
hours every week for 3 months)

Google Analytics certification by Google – If you want to build a career in Web Analytics

Chandoo.org for learning and refreshing Excel – It contains some nice tips and tricks.

QlikView / Tableau Tutorial – I think you should learn one of these visualization tools, so
that you can draw powerful visualizations quickly

Other Reference material:

SAS Analytics U – download center

SAS documentation & SUGI papers

CRAN project website for downloading R and packages

Videos from Google on R (available on YouTube)

R documentation

For staying up to date with the industry

Subscribe to the following blogs:

 Analytics Vidhya
 KDNuggets.com
 R-bloggers
 Occam’s Razor by Avinash Kaushik

For this kind of path, you should focus on gaining breadth over depth. Read a lot:
subscribe to various blogs such as Analytics Vidhya, KDNuggets, Smart Data Collective and
Big Data Made Simple. You can rely on news aggregators like Prismatic to provide you with
the latest news in the industry.

 Join a course on the basics of data science on Coursera / edX to get a first-hand flavor of
being an analyst. Here are some good courses:
o The Analytics Edge – Intensive 12 week course which should give a good
head start
o Data Science Specialization from Johns Hopkins University – Relatively more
relaxed, but longer duration. A collection of 9 courses.
 Make sure you do all the assignments. This is your chance to get the experience first
hand.
 Subscribe to some of these blogs / communities to regularly read about the subject:
o Analytics Vidhya (what else did you expect first!)
o Occam’s Razor (For people interested in Web analytics)
o Smartdata Collective
o KDNuggets
 Be a part of LinkedIn Groups related to analytics –
o Advanced Business Analytics, Data Mining and Predictive Modeling
o Big Data / Analytics / Strategy / Predictive and Business Analytics

Once you have followed these blogs / communities for a while (say at least a month after your
course at Coursera), you can look out for certification courses to begin your journey.

For example, companies like Mu-Sigma, Fractal, WNS, Citi etc. are open to hiring people
without prior work experience.

If it looks expensive, you can look at learning stats from Statistics Course on Udacity. You
can also look at the course we are running on Internshala. Regarding stats, you need to
start with probability, distributions, hypothesis testing and t-tests.

Clinical Analysis and Analytics typically refer to the same field. Knowledge of statistics,
hypothesis testing, A/B testing and DOE (design of experiments) is a must for clinical trials
and analysis. Most of the jobs in India in this domain would be for US clients (in KPOs).

Stats courses are a good place to start.


Jigsaw: http://analyticstraining.com/2012/learn-sas-for-free/

If you are interested in Web Analytics, check out the course with Market Motive. It is run by
some of the best professionals in the industry.

Stats? Coding?

If the answer to all that is yes, you should pursue a few courses from Coursera to get a
flavour of the industry. Machine Learning from Stanford University and the Data Science track
by Johns Hopkins University are some of the best places to start.

in depth work knowledge in Regression, ARIMA, Conjoint analysis etc and also knowledge
of statistical tools like SAS, R etc.

There are some free courses on Coursera / edX which would help you do so, so enroll in
them. Look for the Data Science Specialization from Johns Hopkins University.

Once you are done with the basics, I would suggest that you start participating in a few
data science competitions (e.g. on Kaggle.com) to put some of these concepts to a real-world
test. This way you can invest in training at a low cost and get up the curve fast.

take a course from Jigsaw, start learning additional skills on the side and try and
demonstrate it by practical applications – e.g. application on a few simple Kaggle
competitions

==

I have a few questions to resolve before I decide to enroll in the Foundation of Analytics
course from Jigsaw Academy, which costs Rs. 26,000. I have sent a similar query to
Jigsaw today.
I checked the syllabus of this course; I can see core subjects as these:

a) Predictive Modeling Techniques


b) Statistical Concepts and their Application in Business
c) Basic Analytic Techniques – Using Language of SAS

These are basic concepts which are available for free on platforms like
Coursera – https://www.coursera.org/course/datascitoolbox
Udacity – https://www.udacity.com/courses#!/data-science
SAS.com.

SAS has recently introduced a free University Edition, which I have installed and which seems
great for now. I explored more and found that SAS provides free e-learning modules,
and they are really good in terms of explaining the basic concepts.

Currently, I am taking a few courses from sas.com


– https://support.sas.com/edu/schedules.html?ctry=US&id=277

There are many other free e-learning tutorials from SAS which you can
practice on SAS Studio University Edition. Check out the link
– http://support.sas.com/training/tutorial/

What do you think of Coursera? They provide online courses and even certificates
if you complete the course. I checked it out for R
Programming: https://www.coursera.org/course/rprog
(They are affiliated with the Johns Hopkins Bloomberg School of Public Health.) Now here’s the
catch: they aren’t charging for it, it’s free!
What are your views on this?
The entire “Data Science” Specialization is $310 (9 courses) with some accessibility
privileges.

https://www.coursera.org/specialization/jhudatascience/1?utm_medium=courseDescripTop

Masters course in Singapore Management University for Business Analytics track.


NUS in Singapore.

1. Begin learning how to code in R. It will be difficult at first, and will take a few months
of spending many hours at it per day. You can take a course on Coursera or
DataCamp. Coursera can be expensive, but I think DataCamp is only $40 a month for
access to all the classes. Examples:
a. R Programming - Johns Hopkins University | Coursera
b. Free Introduction to R Programming Online Course
2. Take an online course in data science (DataCamp and/or Coursera).
3. Get a few books (ex: Introduction to Statistical Learning — James, Witten, Hastie,
Tibshirani). Ideally you’d want to learn this information in a few different forms, since
it will be difficult to really master it with just one course or one book.
4. Practice machine learning with datasets on Kaggle.
5. Join some Data Science/ Statistical Programming / Machine learning meetup
groups in your city. You can learn from free lectures and also make new friends to
learn from or learn with.

Jigsaw Academy Data Science Certification


Great Lakes’ Post Graduate Program in Business Analytics.

 Knowledge of Statistics – While it is not necessary to be a statistician in order to
become a business analyst, it is important to have an understanding of basic statistical
concepts that have wide applicability in business analytics. Concepts like measures of
central tendency, measures of dispersion, hypothesis testing, probability, distributions
etc. are essential to analytics, and one must have a good understanding of these topics
and be comfortable applying them to business situations.
 Knowledge of the modelling methodology – There is a sequence of events that
precedes and follows the actual predictive modelling. Starting with an exploration of
data to preparing the data for modelling to validating the model results – there is a
time and place for every step and it is important to understand this sequence.
 Analytic techniques – Analytic techniques include popular ones like regression,
ANOVA, decision trees, clustering etc. There are also domain specific techniques that
come in handy. For example, price promotion analysis for consumer goods, market
basket analysis for retail, churn analysis for telecom. Any training on analytics needs
to cover the most widely used techniques.
 Analytic tool training: A large number of software packages are available in the
market for analytics. Some are script based, some are GUI based. While it is not
possible to train on every available package, it is a good idea to be trained on some
of the most popular tools like Excel and the SAS language or the R software. You can read
more about the popular analytic tools here.
 Soft skills: Soft skills are important for any job. However there are a few skills that are
more specific to analytics. For example, being able to explain complex modeling results
to non-statistical people. Any analysis is only as good as how the results are presented.
Too often, analysts get too involved in the methodology and algorithms to be able to
present their results in a manner that is understood by lay-people.

https://www.jigsawacademy.com/pgpdm/
https://www.kdnuggets.com/2018/05/beginners-guide-data-science-pipeline.html

1. Descriptive Analysis:
Exploratory data analysis is an approach that analyses a given data set by summarizing
its characteristics with visual methods. It could also represent the entire data set with its
features or just a part of the data set sample.

So there are two ways of describing data:

 Measures of central tendency – Mean, median, mode.
 Measures of variability – Used to analyze data spread or variability in a data set
(for example range, variance and standard deviation).
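
As a small illustration of the two kinds of measures, here is a sketch using Python's standard `statistics` module. The `daily_visits` numbers are made up for the example and are not taken from any real site.

```python
import statistics

# Hypothetical sample: daily site visits over nine days (made-up numbers).
daily_visits = [120, 135, 120, 150, 160, 120, 145, 135, 400]

# Measures of central tendency
mean_visits = statistics.mean(daily_visits)
median_visits = statistics.median(daily_visits)
mode_visits = statistics.mode(daily_visits)

# Measures of variability
value_range = max(daily_visits) - min(daily_visits)
sample_std = statistics.stdev(daily_visits)  # sample standard deviation

print(f"mean={mean_visits:.1f} median={median_visits} mode={mode_visits}")
print(f"range={value_range} stdev={sample_std:.1f}")
```

In this made-up sample the single very large day pulls the mean above the median, which is one reason to look at more than one measure.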

2. Statistical Inference:

From the data described or visualized during the descriptive analysis, we try to
infer the characteristics of the wider population that the data set was drawn from.

For example, to predict the number of voters who will support a candidate, the data
would include a clearly defined population of interest, a clear parameter chosen by data
scientists, and an estimate of the share of the population having that parameter.
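
The voter example can be sketched in a few lines of Python. This is a minimal illustration, assuming a simple random sample and the normal-approximation (Wald) interval for a proportion; the poll numbers and the function name `proportion_confidence_interval` are hypothetical.

```python
import math

def proportion_confidence_interval(successes, n, z=1.96):
    """Normal-approximation (Wald) interval for a population proportion."""
    p_hat = successes / n                       # point estimate from the sample
    std_error = math.sqrt(p_hat * (1 - p_hat) / n)
    margin = z * std_error                      # z = 1.96 gives roughly 95% coverage
    return p_hat, p_hat - margin, p_hat + margin

# Hypothetical poll: 540 of 1,000 sampled voters support the candidate.
estimate, low, high = proportion_confidence_interval(540, 1000)
print(f"estimate={estimate:.3f}, 95% CI=({low:.3f}, {high:.3f})")
```

The population of interest is all voters, the parameter is the share supporting the candidate, and the sample proportion with its interval is the estimate.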

3. Bayesian Statistics:

Bayesian statistics is a mathematical procedure that applies probabilities to statistical
problems. It also provides tools to update existing beliefs in light of new
data. This approach, consequently, allows for better accounting of uncertainty, more
intuitive results, comprehensible meaning, and more explicit statements of assumptions.

To understand this thoroughly you’d have to study:

 Conditional Probability
 Bayes Theorem
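
To make "updating existing beliefs with new data" concrete, here is a minimal sketch of Bayes' theorem for a yes/no hypothesis. The scenario and all probabilities are made up for illustration, and `bayes_posterior` is a hypothetical helper name.

```python
def bayes_posterior(prior, likelihood, false_positive_rate):
    """P(H | E) via Bayes' theorem for a binary hypothesis H and evidence E."""
    # Total probability of the evidence:
    # P(E) = P(E | H) P(H) + P(E | not H) P(not H)
    evidence = likelihood * prior + false_positive_rate * (1 - prior)
    return likelihood * prior / evidence

# Hypothetical numbers: 2% of visitors are "high intent" buyers (prior belief);
# 70% of high-intent visitors view the pricing page, versus 10% of the rest.
posterior = bayes_posterior(prior=0.02, likelihood=0.70, false_positive_rate=0.10)
print(f"P(high intent | viewed pricing) = {posterior:.3f}")
```

With these assumed numbers, seeing the evidence raises the belief from 2% to 12.5%; the conditional probabilities are what drive the update.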

4. Experimental Design:

Experimental Design is the practice of designing and planning experiments so that they
reveal the cause-and-effect relationships between the variables in a study.

The best example of this is everyday cooking. Consider the dish being prepared to be
the end result.

A certain process is designed to attain the goal, and parameters such as salt, sugar and
other flavoring can be varied to end up with a different result every time, i.e.
understanding the various causes and effects.
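
One common way to plan such an experiment is a full factorial design, where every combination of factor levels is tried once. The sketch below assumes that approach for the cooking example; the factor names and levels are made up.

```python
import itertools
import random

# Hypothetical factors and levels for the cooking example (made-up values).
factors = {
    "salt_g": [2, 4],
    "sugar_g": [0, 5],
    "spice": ["mild", "hot"],
}

# Full factorial design: every combination of levels is one experimental run.
names = list(factors)
runs = [dict(zip(names, combo)) for combo in itertools.product(*factors.values())]

# Randomize the run order so that drift over time (e.g. a warming stove)
# is not confounded with any one factor.
rng = random.Random(42)  # fixed seed only to make the sketch reproducible
rng.shuffle(runs)

for i, run in enumerate(runs, start=1):
    print(i, run)
```

Three factors with two levels each give 2 x 2 x 2 = 8 runs; comparing results across runs that differ in only one factor is what lets you attribute a change in taste to that factor.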

https://www.jigsawacademy.com/online-analytics-training/
https://www.manipalprolearn.com/custom-search
Among MOOCs, Introduction to Data Analysis by Berkeley and Analytics Edge by MIT
on edx.org are the two that helped me get started. You can find a lot more
on coursera.com; Andrew Ng’s course is famous.

https://www.quora.com/Whats-the-best-way-to-learn-data-science-as-a-beginner
https://www.quora.com/How-do-I-learn-analytics-and-data-analysis-in-SQL-Is-there-a-book-or-course-for-it

target and timelines Transitioner Data Scientist


 Learn basic mathematics and statistics required for data science
 Develop a basic understanding of machine learning algorithms
 Work on projects and create a portfolio of projects
 Skills required to land your first data science internship / job.
 Time spent ~ 5 hours / day

Structure for your 2017 journey:

 Step 1: Getting started and testing the waters
 Step 2: Mathematics & Statistics
 Step 3: Introducing the tool – R / Python
 Step 4: Basic & Advanced machine learning tools
 Step 5: Building your profile
 Step 6: Applying for Jobs / Internships

3.2: Basics of Mathematics and Statistics


Time suggested: 8 weeks (February 2017 – March 2017)

Topics to be covered:

 Descriptive Statistics – 1 week
 Probability – 2 weeks
 Inferential Statistics – 2 weeks
 Linear Algebra – 1 week
 Structured Thinking – 2 weeks

Descriptive Statistics – 1 week

 Course (mandatory) – Descriptive Statistics from Udacity is a basic, must-do
course to get started.
 Books (optional) – Supplement your online course with the Online Stats Book. A good
book for anyone looking to learn basic statistics.

Probability – 2 weeks

 Course (mandatory) – Introduction to Probability – The Science of Uncertainty is an
excellent course on edX to learn concepts of probability like conditional probability
and probability distributions.

 Books (optional) – The textbook Introduction to Probability – Berkeley’s Stats 134
standard textbook will supplement the course above and can be used as good
reference material.

Inferential Statistics – 2 weeks

 Course (mandatory) – Intro to Inferential Statistics from Udacity – Once you have
gone through the descriptive statistics course, this course will take you through
statistical modeling techniques and advanced statistics.

 Books (optional) – Online Stats Book – This online book can be used for a quick
reference for inference tasks.

Linear Algebra – 1 week

 Course (mandatory)
o Linear Algebra – Khan Academy: This concise and excellent course on
Khan Academy will equip you with the skills necessary for Data Science and
Machine Learning.
 Books (optional)
o Linear Algebra / Levandosky – This is a book often cited by Stanford
graduates for Linear Algebra.
o The Manga guide to Linear Algebra – This is a fun filled Linear Algebra book
which keeps Machine Learning in context. You will never forget these Algebra
lessons for sure.

Structured Thinking – 2 weeks

 Articles (mandatory): These articles will guide you to structure your thinking
process to approach problems in a better way so as to improve your efficiency.
o How to train your mind for analytical thinking?
o Tools for improving structured thinking
o The art of structured thinking and analyzing

 Competitions (mandatory): No amount of theory can beat practice. This is
a strategic thinking problem which will test you on your thinking process. Also, keep
an eye on business case studies as they help in structuring your thoughts
tremendously.

3.3: Introducing the tool – R / Python

Time suggested: 8 weeks (April 2017 – May 2017)

Topics to be covered:

 Tools (R/Python) – 4 weeks


 Exploration and Visualization (R/Python) – 4 weeks
 Feature Selection/ Engineering

Tools

1. R

 Course – Interactive Intro to R Programming Language by DataCamp – An excellent
course by DataCamp to give you hands-on experience in R. The course includes
interactive examples. You will never feel bored while learning R.

 Books – R for Data Science – This is your one-stop solution for referencing basic
materials on R.

 Blogs/Articles
o This article will serve as a great reference that collates the entire model-building
process, starting from the installation of RStudio/R.
o R-bloggers – This is one of the most recommended blogs for R users. Every
R practitioner should keep this blog bookmarked. It has some of the most
effective and practical R tutorials.

2. Python

 Course (mandatory) – Intro to Python for Data Science – An interactive course
developed by DataCamp to facilitate Data Science learning using Python.

 Books (mandatory) – Python for Data Analysis – This book covers various aspects
of Data Science, from loading data to manipulating, processing, cleaning and
visualizing it. A must-keep reference guide for pandas users.

 Blogs/Articles (optional)
o A Complete Tutorial to Learn Data Science with Python from Scratch: This
article will serve as a quick guide to learning Data Science using Python.

Exploration and Visualization

1. R

 Course
o Exploratory Data Analysis – This is an awesome course by Johns Hopkins
University on Coursera. You will need no other course to perform
visualization and exploratory work in R.

 Blogs/Articles
o Comprehensive guide to Data Exploration in R – This is a one-stop
article that I suggest you go through carefully, following every step.
The steps mentioned in the article are the same steps you will
be using while solving any data problem or hackathon problem.
o Cheat sheet – Data Exploration in R – This cheat sheet contains all the steps
in data exploration with code. I suggest you print it out and paste it
on your wall for quick reference.

2. Python

 Course (optional)
o Intro to Data Analysis – This is an excellent course by Udacity on Data
Exploration using Numpy and Pandas.

 Blogs/Articles (mandatory)
o Comprehensive guide to Data Exploration using Python NumPy, Matplotlib
and Pandas – This is a sufficient and comprehensive article which uses the
most popular Python libraries for exploration and visualization purposes.
o 9 popular ways to perform Data Visualization in Python – This article presents
the most commonly used graphs and plots in Data Exploration along
with Python code. This is a must-bookmark article for people working in
Data Science using Python.

 Books (optional) – Python for Data Analysis – A one stop solution for your Data
Exploration and Visualization in Python.

Feature Selection/ Engineering


 Blog – A Comprehensive Guide to Data Exploration: This article will
explain underlying techniques of feature engineering and different methods for
feature creation

 Books (optional) – Mastering Feature Engineering: This book is a masterpiece for
learning feature engineering. Not only will you learn how to implement feature
engineering in a systematic way, you will also learn the different methods involved in
feature engineering.

3.4: Basic & Advanced machine learning tools


Time suggested: 12 weeks (June 2017 – August 2017)

Topics to be covered (June 2017 – July 2017):

 Basic Machine Learning Algorithms
o Linear Regression
o Logistic Regression
o Decision Trees
o KNN (K-Nearest Neighbours)
o K-Means
o Naïve Bayes
o Dimensionality Reduction
 Advanced algorithms (August 2017)
o Random Forests
o Dimensionality Reduction Techniques
o Support Vector Machines
o Gradient Boosting Machines
o XGBOOST

Linear Regression

 Course
o Machine Learning by Andrew Ng – There is no better resource to learn Linear
Regression than this course. It will give you a thorough understanding of
linear regression and there is a reason why Andrew Ng is considered the
rockstar of Machine Learning.

 Blogs/Articles
o This lesson out of PennState Stat 501 course outlines the main features of
Linear Regression ranging from a simple definition of a Linear Regression to
determining the goodness of fit of a regression line.
o This is an excellent article with practical examples to explain Linear
Regression with code.

 Books
o The Elements of Statistical Learning – This book is sometimes considered the
holy grail of Machine Learning and Data Science. It explains Machine
Learning concepts mathematically from a Statistics perspective.
o Machine Learning with R – This is a book I personally use to have a brief
understanding of Machine Learning algorithms along with their
implementation code.

 Practice
o Black Friday – Like I already said – No amount of theory can beat practice.
Here is a regression problem that you can try your hands on for a deeper
understanding.
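
As a companion to the resources above, here is a minimal sketch of simple linear regression by ordinary least squares, together with R-squared as a goodness-of-fit measure. It uses only plain Python; the spend and sales figures and the function names are made up for illustration, and a real project would more likely use a library.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def r_squared(xs, ys, slope, intercept):
    """Share of the variance in y explained by the fitted line."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_res / ss_tot

# Hypothetical data: ad spend vs. sales (both in thousands, made-up numbers)
spend = [1, 2, 3, 4, 5]
sales = [3, 5, 7, 9, 12]
slope, intercept = fit_line(spend, sales)
r2 = r_squared(spend, sales, slope, intercept)
print(f"slope={slope:.2f} intercept={intercept:.2f} R^2={r2:.3f}")
```

An R-squared close to 1 means the line explains most of the variation in the made-up sales figures; on real data it should be read alongside residual plots rather than on its own.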

Logistic Regression

 Course (mandatory)
o Machine Learning by Andrew Ng – Week 3 of this course will give you a
deeper understanding of one of the most widely used classification
algorithms.
o Machine Learning: Classification – Weeks 1 and 2 of this practice-oriented
Specialization course using Python will satisfy your thirst for knowledge about
Logistic Regression.

 Blogs/Articles (optional)
o Logistic Regression by Machine Learning Mastery – This is an excellent non-code-based
approach to Logistic Regression to deepen your knowledge. I
suggest you have a look at it.

 Books (optional)
o Introduction to Statistical Learning – This is an excellent book with quality
content on Logistic Regression’s underlying assumptions, statistical nature
and mathematical linkage.

 Practice (mandatory)
o Loan Prediction – This is an excellent competition to practice and test your
new Logistic Regression skills by predicting whether a person’s loan was
approved or not.

Decision Trees

 Course (mandatory)
o Machine Learning: Classification – Weeks 3 and 4 of this course cover the
working of decision trees, preventing overfitting and handling missing values.

 Blogs/Articles (mandatory)
o Technical Overview of decision trees – This is a quick overview of decision
trees and a must read for anyone new to decision trees.
o Complete tutorial on tree based modeling – This is a python based tutorial on
decision trees. For the sake of decision trees, read only sections 1-6 in this
article.

 Books (mandatory)
o Introduction to Statistical Learning – Section 8.1 and 8.3 explain the basics of
decision trees through theory and practical examples.
o Machine Learning with R – Chapter 5 of this book provides one of the best
explanations of Machine Learning algorithms available in the market. Here,
decision trees are explained in an extremely non-intimidating and easy style.

 Practice (mandatory)
o Loan Prediction – This is an excellent competition to practice and test your
new Decision Tree skills by predicting whether a person’s loan was
approved or not.

KNN (K- Nearest Neighbors)

 Course (mandatory)
o Machine Learning – Clustering and Retrieval: Week 2 of this course
progresses to k-nearest neighbors from 1-nearest neighbor and also
describes the best ways to approximate the nearest neighbors. It explains all
the concepts of KNN using python.

 Blogs/Articles (mandatory)
o Introduction to k-nearest neighbors: simplified – This basic article describes
when to use KNN, the ways in which k can be chosen and the way in which
KNN algorithm works.
o Learning KNN algorithm using R – This article is a comprehensive guide to
learning KNN with hands-on codes for future references.
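
The way KNN works can be sketched in plain Python: measure the distance from the query to every training point, keep the k closest, and take a majority vote. The toy data, labels and the function name `knn_predict` below are made up for illustration; `math.dist` requires Python 3.8 or later.

```python
import math
from collections import Counter

def knn_predict(train_points, train_labels, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    # Euclidean distance from the query to every training point
    distances = [
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    ]
    # Keep the k closest neighbours and count their labels
    nearest = sorted(distances, key=lambda pair: pair[0])[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Hypothetical toy data: (pages viewed, minutes on site) -> converted or not
points = [(1, 1), (2, 1), (1, 2), (8, 9), (9, 8), (9, 9)]
labels = ["no", "no", "no", "yes", "yes", "yes"]

print(knn_predict(points, labels, query=(2, 2)))  # query near the "no" cluster
print(knn_predict(points, labels, query=(8, 8)))  # query near the "yes" cluster
```

Choosing k is a trade-off: a small k follows local noise, a large k smooths over real structure. An odd k is a common choice for two-class problems because it avoids tied votes.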

K-Means

 Course
o Machine Learning Course – Unsupervised Learning with K-means algorithm:
Week 8 of this course discusses how the K-means algorithm is used
for handling unstructured data.

 Blog
o An Introduction to Clustering and different methods of clustering: In this
article, you will learn what k-means clustering is and the intricacies involved in
it. It will give you a step-by-step walkthrough of how the K-means algorithm works.
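
The step-by-step idea behind K-means can be sketched as follows: assign each point to its nearest centroid, move each centroid to the mean of its points, and repeat until nothing changes. This is a minimal illustration of the standard (Lloyd's) procedure on made-up 2-D points with hand-picked starting centroids; the function name `kmeans` is hypothetical.

```python
import math

def kmeans(points, centroids, iterations=10):
    """Plain k-means on 2-D points, starting from the given centroids."""
    for _ in range(iterations):
        # Step 1: assign every point to its nearest centroid
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)), key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Step 2: move each centroid to the mean of its assigned points
        new_centroids = []
        for old, members in zip(centroids, clusters):
            if members:
                xs, ys = zip(*members)
                new_centroids.append((sum(xs) / len(xs), sum(ys) / len(ys)))
            else:
                new_centroids.append(old)  # keep the centroid of an empty cluster
        if new_centroids == centroids:
            break  # centroids stopped moving, so assignments are stable
        centroids = new_centroids
    return centroids, clusters

# Hypothetical toy data with two visible groups
data = [(1, 1), (1, 2), (2, 1), (8, 8), (9, 8), (8, 9)]
final_centroids, final_clusters = kmeans(data, centroids=[(0, 0), (10, 10)])
print(final_centroids)
```

In practice the starting centroids are usually chosen at random and the algorithm is run several times, since the result can depend on the starting points.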

Naive Bayes

 Course
o Intro to Machine Learning: Take this course to see Naive Bayes in action. In
this course, Sebastian Thrun explains Naive Bayes in simple English.

 Blog / Article
o 6 Easy Steps to Learn Naive Bayes Algorithm (with code in Python) : This
article will take you through Naive Bayes algorithm in detail. In this guide, you
will learn how Naive Bayes algorithm works, applications and many more. It
will also give you hands-on knowledge of building a model using Naive
Bayes.
o Naive Bayes for Machine Learning : This is one of the most comprehensive
articles I have come across. Go through this article to have a complete
understanding of why naive bayes algorithm is important for machine
learning.

Dimensionality Reduction

 Course
o Machine Learning – Dimensionality Reduction: Week 8 of this course will
walk you through dimensionality reduction and how Principal Components
Analysis can be used for data compression of complex data.

 Blog / Article
o Beginners Guide To Learn Dimension Reduction Techniques: In this article,
you will learn why dimension reduction is important in machine learning and
the various techniques of dimension reduction.

Random Forests

 Videos (mandatory)
o How Random Forest algorithm works? – Watch this video to have a visual
perspective of how the Random Forest algorithm works.

 Books (optional)
o Introduction to Statistical Learning – Chapter 8 explains the basics of Random
Forests, including bagging and boosting, through theory and practical
examples.
o Applied predictive modeling – Chapter 8

 Blogs/Articles (mandatory)
o A tutorial on tree based modeling from scratch – This is an excellent article on
tree-based modeling using Python. I suggest you bookmark it right now.
o Random Forests – This blog explains the entire working, nuts and bolts of
Random Forest.

Gradient Boosting Machines

 Blogs/Articles (mandatory)
o Guide on Boosting methods
o Parameter tuning GBM
o Machine Learning Mastery- GBM

 Presentation (mandatory): Here is an excellent presentation on GBM. It covers the
prominent features of GBM and the advantages and disadvantages of using it to solve
real-world problems. It is a must-see for anybody trying to understand GBM.

XGBOOST

 Blogs/Articles (mandatory)
o Official Introduction XGBOOST – Read the documentation of the hackathon-winning
algorithm. It is an improvement over GBM and is right now the most
widely used algorithm for winning competitions.
o Using XGBOOST in R – An excellent article on deploying XGBOOST in R
using a practical problem at hand.
o XGBOOST for applied Machine Learning – An article by Machine Learning
Mastery to evaluate the performance of XGBOOST over other algorithms.

Support Vector Machines

 Course (mandatory)
o Machine Learning by Andrew Ng – Week 7 of this course is an interesting
place to start your SVM journey.

 Books (mandatory)
o Introduction to Statistical Learning – Chapter 9 of the book contains a detailed
discussion of SVMs and the ways to deploy them.

 Blogs/Articles (optional)
o Understanding support vector machines – This is an excellent article to
understand an algorithm practically using examples.
o SVM by Machine Learning Mastery – This article discusses the different types
of kernels employed in SVM and their uses.

3.5: Building your profile


Time suggested: 8 weeks (September 2017 – October 2017)

Topics to be covered:

1. GitHub Profile Building


2. Practice via competitions
3. Discussion Portals

GitHub Profile Building (mandatory)

It is very important for a Data Scientist to have a GitHub profile to host the code of all the
projects he/she has undertaken. Potential employers see not only what you have done, but also
how you have coded and how frequently / how long you have been practicing data science.

Also, code on GitHub opens up avenues for open source projects, which can greatly boost your
learning. If you don’t know how to use Git, you can learn from Git and GitHub on Udacity.
This is one of the best and easiest courses for learning to manage repositories through the
terminal.

Practice via competitions (mandatory)

Time and again, I have stressed the fact that practice beats theory. Moreover, coding in
hackathons brings you closer to developing data products in real life for solving real-world
problems. Below are the most popular platforms for participating in Data Science / Machine
Learning competitions.

1. Analytics Vidhya Datahack


2. Kaggle competitions
3. Crowd Analytix human layer

Discussion Forums (optional)

Discussions are a great way to learn in a peer-to-peer setup, from finding an answer to a
question you are stuck on to answering someone else’s questions. Below are some of the
discussion-rich platforms which you should keep tabs on to clear your doubts.

1. Analytics Vidhya Discussion Portal


2. Kaggle Discussion
3. StackExchange

3.6: Apply for Jobs & Internships


Time suggested: 8 weeks (November 2017 – December 2017)

Topics to be covered: Jobs / Internships

If you are here after diligently following the above steps, then you can be sure that you are
ready for a job / internship position at any Data Science / Analytics or Machine Learning
firm. But it can be quite difficult to identify the right jobs. So, to save you the
trouble, I have created a list of portals which list Data Science / Machine Learning jobs
and internships.

1. Analytics Vidhya Job Portal


2. Datajobs
3. Kaggle Job portal
4. Internshala

In order to prepare for these interviews, you should go through this Damn Good Hiring
Guide.

The Ultimate Path for transitioners

Simply put, if you are looking for a transition under a year, you will need to learn everything
we laid out for the beginner above. Additionally, you will need to carve out additional time to
showcase your skills. You will need to overcome the doubts of your potential employers
through your projects and work.

I am sure you are beginning to understand why transition is not an easy thing.

Structure for your 2017 journey:

The structure of the path is similar, but you will need to accelerate your learning in the first
half of the plan. Start by going through this article and go through a few success stories to
understand what a transition would entail. Once you are set for the journey, follow the plan
by sticking to these timelines.

 Step 1: Getting started and testing the waters (1 week in January ’17)
 Step 2: Mathematics & Statistics (Jan ’17 – March ’17)
 Step 3: Introducing the tool – R / Python (March ’17 – April ’17)
 Step 4: Basic & Advanced machine learning tools (May ’17 – July ’17)
 Step 5: Building your profile (Aug ’17 – Oct ’17)
 Step 6: Applying for Jobs (Nov ’17 – Dec ’17)

https://www.analyticsvidhya.com/blog/2017/01/the-most-comprehensive-data-science-learning-plan-for-2017/

https://trainings.analyticsvidhya.com/courses/course-v1%3AAnalyticsVidhya%2BPython-Final-Jan-Feb%2BPython-Session-1/
https://trainings.analyticsvidhya.com/courses/course-v1:AnalyticsVidhya+Python-Final-Jan-Feb+Python-Session-1/about

https://classroom.udacity.com/courses/ud198
https://edunxt.manipalprolearn.com/?q=MULNCourseBook/viewSectionMyCourseBook/42223/cidb/full/view/

https://www.jigsawacademy.com

https://www.datacamp.com/courses/intro-to-sql-for-data-science

https://analyticsindiamag.com/top-6-full-time-analytics-courses-india-ranking-2017/

https://analyticsindiamag.com/top-10-analyticsdata-science-training-institutes-india-ranking-2017/
Let’s chalk out the usual path for data science enthusiasts:

 Learn Python < R < SAS < SQL
 Learn descriptive statistics, hypothesis testing, probability
 Become well-versed in the various types of Machine Learning algorithms:
Supervised, Unsupervised
 Finally learn Data Visualization tools like Tableau
https://analyticsindiamag.com/become-data-scientist-2018/
https://analyticsindiamag.com/top-10-executive-analytics-courses-india-ranking-2017/
https://analyticsprofile.com/business-analytics/best-business-analytics-data-science-courses-in-india-2018/

One of the best resources to study ML that I’ve personally benefited from is Andrew Ng’s
course on Coursera.

Stanford’s CS231n is also really good, with emphasis on Convolutional Neural Networks.

fast.ai is also considered to be really good, although I haven’t personally used it.

These courses will hopefully teach you the theory behind building ML systems. Beyond
that, one of the best ways to learn is to try those algorithms out and practice. Learning
how to use TensorFlow or PyTorch will also prove to be helpful.

For Machine Learning (ML), I earned my course certificate from Coursera; it’s a
certification program from Stanford University by Professor Andrew Ng.

For AI, you must have knowledge about Machine Learning, Deep Learning, Natural
Language Processing, and Big Data.

Furthermore, a good knowledge of Java and Python is a must.


I am learning as follows:

1. Learned Linear Algebra and Calculus from Khan Academy
2. Learned Python or R from online courses on Udemy
3. Learned the basics / theory of Machine Learning on Coursera
4. Learned practically how to implement ML models in Python / R with Machine
Learning A-Z™: Hands-On Python & R In Data Science
5. Participated in Kaggle competitions

That’s it. It will definitely take some time but it will be worth it in the end.

Also, I have started a series called “The Machine Learning Story” to
explain ML from the layman’s, mathematician’s and coder’s points of view. To view
it, you may visit Jitesh Lalwani – Medium
