https://www.udemy.com/career-in-digital-web-analytics/
https://www.digitalvidya.com/blog/
Introduction
Installation and Overview of SQL Environment (MySQL)
Installing MySQL
MySQL Interface
Database Fundamentals
How databases are structured
Understanding a schema in MySQL
Missing Sakila Database and how to solve it
Selecting the Data
Write your first statement to select the data
Narrowing down your result set with WHERE clause
Sorting your result set
Using Numeric and Date Values to narrow down your result set
Other Key SQL Statements
Creating a table
Few more table manipulation commands
Importing data from a CSV file
Importing data from Excel to MySQL
Advanced Degree – More Data Science programs are popping up to serve the current demand, but
there are also many Mathematics, Statistics, and Computer Science programs.
MOOCs – Coursera, Udacity, and Codecademy are good places to start.
Certifications – KDnuggets has compiled an extensive list.
Bootcamps – For more information about how this approach compares to degree programs or
MOOCs, check out this guest blog from the data scientists at Datascope Analytics.
Kaggle – Kaggle hosts data science competitions where you can practice, hone your skills with messy,
real world data, and tackle actual business problems. Employers take Kaggle rankings seriously, as
they can be seen as relevant, hands-on project work.
LinkedIn Groups – Join relevant groups to interact with other members of the data science community.
KDnuggets – KDnuggets is a good resource for staying at the forefront of industry trends in data
science.
Must Have Skills:
• Experience in the design, development and deployment of virtual agent technology such as
bots, auto-responders and virtual assistants on top of Tableau, Qlik or other BI platforms or
databases.
• Programming experience in Python, Scala, .NET or similar languages.
Demonstrable knowledge of frameworks like Flask and Django is a definite plus.
• Hands-on experience with Tableau, Power BI, Sisense, MicroStrategy and similar
visualization tools and their platform APIs is required.
• Demonstrable use of cloud technologies on AWS, including having architected
solutions for private and public cloud environments and Linux/Windows-based cloud
deployments.
• Hands-on experience with at least one relational database such as SQL Server, Oracle or
MySQL.
• AWS certification and prior experience with NoSQL databases will be a plus.
COEPD
Third Floor, Sahithi Arcade, 301, SR Nagar Main Rd, Srinivasa Nagar, Sanjeeva Reddy Nagar,
Hyderabad, Telangana
Analytics Path
Plot No. 28, 4th Floor, Suraj Trade Center, Opp. Cyber Towers, Hitech City, Jaihind Enclave, Madhapur,
Hyderabad, Telangana
We enrolled in new courses on Coursera and edX, where we started learning, and I searched for a job myself.
If one does not have any industry experience in the Big Data/Data Science/Statistics/Analytics domain, then
he/she WILL NOT magically land a job in the relevant roles by earning a certificate or clearing the course
with high grades.
From my personal experience, self-learning on online learning websites such as Coursera, edX and
Udacity was far better than this institution for what I paid to achieve the certification. If you
really want to land a job in the relevant roles, consider Kaggle data science challenges.
http://sqlschool.com/Microsoft-Datascience-Training.html
1. Udacity - Take the free course Introduction to Computer Science at Udacity.
2. Take courses from DataCamp (Learn R, Python & Data Science Online).
3. Learn data science with Python and R. Get started for free.
4. MIT 6.00.1x on edX
I am learning “Machine Learning” from online
resources and I want to learn “TensorFlow”. I have
some basic knowledge in Python coding. How do I
start? What online resources should I follow?
Follow the deep learning course by Andrew Ng on Coursera. It will teach you.
But before that, learn machine learning and linear algebra. For that you can follow
Andrew Ng’s course named Machine Learning, also on Coursera.
https://www.quora.com/What-are-the-best-learning-sites-for-Python
https://www.quora.com/What-are-some-good-free-resources-to-learn-Python
https://www.quora.com/What-are-the-best-sources-to-learn-Python-language
https://www.quora.com/How-should-I-start-learning-Python-1
https://www.quora.com/Which-is-the-best-place-to-learn-Python-having-previous-knowledge-in-programming
https://www.quora.com/Which-is-the-best-book-for-learning-the-Python-programming-language
https://www.quora.com/My-boss-gave-me-30-days-not-working-days-to-learn-Python-to-transfer-to-the-Data-Science-team-What-is-the-best-approach-to-learn-as-much-as-possible
https://www.digitalvidya.com/python-data-science-course/
http://www.isb.edu/cba/programme-overview/curriculum
Curriculum Details* Subject to minor changes
Software tools in the Programme:
R, MySQL, Python, Stata, @Risk, Simio, Tableau, XLMiner, NodeXL, MeXL, Hadoop (AWS)
PRE-TERM
TERM 1
TERM 2
4) Optimization
5) Statistical Analysis 2
TERM 3
CAPSTONE PROJECT
Below are a few of the courses that I consider the best in the field of Data Science and
Analytics.
Consider the PGP courses in business analytics at Praxis Business School, which has a tie-up with PwC
and ICICI for its online and full-time courses at the Bangalore and Kolkata campuses and collaborates
with IIDT at the Hyderabad campus, in association with the Hyderabad government. They have also
recently launched online PG courses in data science and business analytics of 8-12 months' duration.
The fees are low, you earn a degree from an AICTE-approved institute, placement support is the same
as for the full-time batch, and every month one industry expert takes a session via VLP.
LOGICAL REASONING: Logical connectives, Statements and Conclusions, Matching and sequences.
DATA INTERPRETATION AND DATA VISUALIZATION: Data driven questions (pie charts, graphs, trends).
The duration of the test is 3 hours. Calculator, charts, graph sheets, tables or gadgets are NOT allowed in the
examination hall.
Introduction to data mining (DM); DM methodology; Preparing data for mining; Data mining and Statistics;
Hazard Functions; Survival Analysis; Memory based reasoning; Market Basket Analysis, Link Analysis,
Decision Trees, Clustering; Privacy and Societal Issues, the Data mining environment, Putting DM to work.
REFERENCES:
Data Mining: Concepts and Techniques by Jiawei Han and Micheline Kamber, Morgan Kaufmann Publishers, 2001
Data Mining: A Tutorial-Based Primer by Richard J. Roiger and Michael W. Geatz, Addison-Wesley 2003
Data Mining: Concepts, Models, Methods and Algorithms by Mehmed Kantardzic, Wiley, 2003
https://www.analytixlabs.co.in/data-science-using-python
2. Complete the Udacity course on inferential statistics. The course will give you an
understanding of how to draw conclusions from data. This is one of the best online
courses I have ever taken!
Now you have the basic statistical knowledge required to understand data and to draw
conclusions from it.
You may also consider the following courses if you are not from a mathematics
background.
Now that you have the required knowledge to understand the data, you need a tool to do
the analysis. The most popular tools are Python and R. If you already possess
programming skills, I recommend Python; otherwise choose R.
1. Python
After this point, do you know when to use an array over a list? If not, you need to revise
the concepts again.
2. R
R is relatively easy to understand. Do the data analysis course on Udacity. The course
teaches you the basics of R as well as the data analysis process.
Data analysis is an art! You need to ask the right questions in order to find something
interesting in the data. Learn how to analyze data systematically through the following
courses:
1. Python: Intro to Data Analysis | Udacity. Not a course I would recommend for a beginner,
but it helps you understand the process.
If you have completed the courses mentioned, you should have a clear picture about the
data analysis process.
You may also complete the Intro to Data Science Online Course | Udacity (optional, but
highly recommended).
Step 4: Practice
You can download data sets from a lot of sources online. The UCI Machine Learning Repository
and Kaggle are popular places to download data sets. Find a data set that
interests you and practice. Once you are comfortable with the process and tools, practice
using different types of data sets.
UCI Machine Learning Repository
You may not have gone through all the techniques required for data analysis through
these courses, but they will teach you how to proceed when you are stuck at some
point, and where and how to look when you are lost.
Happy learning!
https://www.quora.com/Which-is-the-best-online-course-for-data-analysis-and-data-analytics
If you are looking for a career in data analytics, I would suggest that you take up a job in
one of the analytics companies - Mu Sigma, ZS Associates, Fractal, Tredence, etc. These
companies mostly don't need any prerequisites for entry-level analyst jobs and provide
great opportunities to learn the skills from scratch.
If that's not possible for you, I would suggest you take the following progression to learn
data analytics in each of the key areas:
Maths
1. Basic statistics and data summarizing parameters like mean, median, mode, central
tendencies, distributions, etc.
2. Data integrity, comparison and tendency tests like t-test, z-test, f-test
3. Regression - Linear, Logistic, GLM, Mixed in that order
4. Advanced techniques like predictive modeling and prescriptive methods
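As a rough illustration of the first two items in this list, the sketch below computes the basic summary parameters and a one-sample t statistic using only Python's standard library. The sample values and the reference mean of 13 are made-up assumptions for the example, not data from any of the courses mentioned.

```python
import math
import statistics

# Hypothetical sample: daily order counts for a small shop (made-up numbers).
orders = [12, 15, 11, 15, 18, 14, 15, 13]

mean = statistics.mean(orders)      # arithmetic average
median = statistics.median(orders)  # middle value of the sorted data
mode = statistics.mode(orders)      # most frequent value
stdev = statistics.stdev(orders)    # sample standard deviation (divides by n - 1)

# One-sample t statistic: how far the sample mean is from an assumed
# reference mean, measured in units of the standard error.
reference_mean = 13
t_stat = (mean - reference_mean) / (stdev / math.sqrt(len(orders)))

print(f"mean={mean}, median={median}, mode={mode}, stdev={stdev:.2f}, t={t_stat:.2f}")
```

To turn the t statistic into a p-value you would compare it against a t distribution with n - 1 degrees of freedom, which is the point where a statistics package or a printed table becomes useful.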
Technology
1. Microsoft Excel: This is the holy grail of analytics. Learn it inside out. From simple
formulae to the data analysis tools and dashboards, you should learn it all
2. VBA: This is an extension of Excel and, though not used very extensively, can make
a lot of tasks in Excel easier
3. SQL: This is the logical progression from Excel for handling larger data volumes and
also standardizing processes and creating code modules for repeated use
4. SAS/R: The next step will be one of these tools as they can help you do more complex
processing like regression and modeling
5. Tableau: This is almost the standard right now for data visualization and
dashboarding
6. Advanced technologies like Shiny, Hadoop, Hive, etc.
Business
Edit 1: Adding some free useful links that you can use to study the above mentioned
things:
If you have forgotten your 10th and 12th grade maths (statistics, linear algebra, probability), start from
there.
If you don't have programming experience, pick R next (since Python would be more
suited for people with a programming background).
Then move to Statistical modelling >> Machine Learning >> Forecasting >> NLP.
If you are left with more time, pick up Big Data after this.
https://www.lunametrics.com/blog/2016/01/27/tips-for-getting-a-job-in-the-digital-analytics-industry/
https://www.digitalvidya.com/blog/google-analytics-jobs/
If you are not a programmer and are looking for job-oriented courses, connect with me on
LinkedIn; I will be happy to guide you :)
Do these three courses offered by Udacity. They are very helpful.
https://www.khanacademy.org/math...
R and Python are two programming tools which are pretty popular right now, and you
have to become proficient in whichever of the two you take up. To learn Python and R
you can do the following courses.
PYTHON
R Programming | edX
Kaggle R Tutorial on Machine Learning (practice) - DataCamp
Introduction to R for Data Science - edX
Free Introduction to R Programming Online Course | DataCamp
R Programming | Coursera
R Programming A-Z™: R For Data Science With Real Exercises! - Udemy
Online Learning – RStudio
Now let us check the data analytics courses that will help you build your
career in this field.
Web Analytics
Introduction to Digital Media Analytics
Audience Analytics
Acquisition Analytics
Behaviour Analytics
Real-Time Analytics
Intelligent Events
Attribution Modelling
Segment Reporting
Search Marketing
Display Marketing
https://www.361online.com/bigdata/PGP-ba-placementassured
Introduction to Data
Introduction to Probability
Distributions
Analysis of Variance
Chi-Square Analysis
Logistic Regression
Forecasting
I have crafted a learning path for you right here to learn data science in 5 – 6 months at
one-fifth of the budget you would blow at these business schools.
Step 2 Knowbigdata Big Data Analytics course for Hadoop and Spark
Once you have been through this Learning path in 6 months you will be ready to
participate in Kaggle competitions and make a name for yourself in Data Science.
Tools covered during the course: SAS, R, Python, RapidMiner, Hadoop, Excel & Tableau
Advanced Analytics: 60 Hours
Effective Communication: 50 Hours
Project: 120 Hours
Table 7b: Reference books for the various topics in C-CAT.
SECTION A
English: Any High School Grammar Book (e.g. Wren & Martin)
Quantitative Aptitude & Reasoning: Quantitative Aptitude Fully Solved (R. S. Aggrawal); Quantitative Aptitude (M Tyara); Barron's New GRE 2016
SECTION B
Computer Fundamentals: Foundations of Computing (Pradeep Sinha & Priti Sinha)
Data Communication & Networking: Data Communication & Networking (Forouzan)
C Programming: C Programming Language (Kernighan & Ritchie); Let Us C (Yashavant Kanetkar)
Data Structures: Data Structures Through C in Depth (S. K. Srivastava)
Operating Systems: Operating System Principles (Silberschatz, Galvin, Gagne)
OOP Concepts: Test Your C ++ Skills (Yashavant Kanetkar)
SECTION C
Computer Architecture: Computer Organization & Architecture (William Stallings)
Digital Electronics: Digital Design (Morris Mano); Digital Design: Principles & Practices (John Wakerly); Modern Digital Electronics (R. P. Jain)
Microprocessors: Microprocessor Architecture, Programming & Applications with 8085 (Ramesh Gaonkar); The Intel Microprocessor (Barry Brey)
Books to read:
These books provide a good overview of how analytics can impact our business decisions
and thought process, challenges faced in implementing data based solutions and also its
limitations (the last one).
Big Data – A revolution that will transform How we live, work and think
Linear Algebra and Statistics from Khan Academy – All the basics you would need,
explained in an awesome way! You realize how much fun learning can be when you see them for the
first time
Intro to Descriptive Statistics on Udacity & Inferential Statistics on Udacity – for the
activity filled classes and exercises they provide.
For learning tools
Base SAS and Statistics course from SAS Institute – if your choice of tool is SAS
SAS Analytics U tutorials from SAS Institute (again, if SAS is your choice)
Data Science Specialization from Johns Hopkins University on Coursera – if you want
to learn at a relaxed pace (3 – 4 hours every week over a period of 9 months)
edX Analytics Edge (R) – for those who can sustain a more intensive schedule (20 – 25
hours every week for 3 months)
Google Analytics certification by Google – if you want to build a career in Web Analytics
Chandoo.org for learning and refreshing Excel – it contains some nice tips and tricks.
Qlikview / Tableau Tutorial – I think you should learn one of these visualization tools, so
that you can draw powerful visualizations quickly
R documentation
Analytics Vidhya
KDNuggets.com
R-bloggers
Occam’s Razor by Avinash Kaushik
For this kind of path, you should focus on gaining breadth over depth. Read, read and read
a lot: subscribe to various blogs such as Analytics Vidhya, KDNuggets,
SmartData Collective and Big Data Made Simple. You can rely on news aggregators like Prismatic
to provide you with the latest news in the industry.
Join a course on the basics of data science on Coursera / edX to get a first-hand flavor of
being an analyst. Here are some good courses:
o The Analytics Edge – Intensive 12 week course which should give a good
head start
o Data Science Specialization from Johns Hopkins University – Relatively more
relaxed, but longer duration. A collection of 9 courses.
Make sure you do all the assignments. This is your chance to get the experience first
hand.
Subscribe to some of these blogs / communities to regularly read about the subject:
o Analytics Vidhya (what else did you expect first!)
o Occam’s Razor (For people interested in Web analytics)
o Smartdata Collective
o KDNuggets
Be a part of LinkedIn Groups related to analytics –
o Advanced Business Analytics, Data Mining and Predictive Modeling
o Big Data / Analytics / Strategy / Predictive and Business Analytics
Once you have followed these blogs / communities for a while (say at least a month after your
course at Coursera), you can look out for certification courses to begin your journey.
For example, companies like Mu-Sigma, Fractal, WNS, Citi etc. are open to hiring people
without prior work experience.
If it looks expensive, you can look at learning stats from Statistics Course on Udacity. You
can also look at the course we are running on Internshala. Regarding stats, you need to
start with probability, distributions, hypothesis testing and t-tests.
Clinical Analysis and Analytics typically refer to the same field. Knowledge of statistics,
hypothesis testing, A/B testing and DOE is a must for clinical trials and analysis. Most of the
jobs in India in this domain would be for US clients (in KPOs).
If you are interested in Web Analytics, check out the course with Market Motive. It is run by
some of the best professionals in the industry.
Stats? Coding?
If the answer to all that is yes, you should pursue a few courses from Coursera to get a
flavour of the industry. Machine Learning from Stanford University and the Data Science track by Johns
Hopkins University are some of the best places to start.
in depth work knowledge in Regression, ARIMA, Conjoint analysis etc and also knowledge
of statistical tools like SAS, R etc.
There are some free courses on Coursera / edX which would help you do so, so enroll in
them. Look for the Data Science Specialization from Johns Hopkins University.
Once you are done with the basics, I would suggest that you start participating in a few
data science competitions (e.g. on Kaggle.com) to put some of these concepts to
real-world testing. This way you can invest in training at a low cost and get up the curve fast.
Take a course from Jigsaw, start learning additional skills on the side, and try to
demonstrate them through practical applications, e.g. by entering a few simple Kaggle
competitions.
==
I have a few questions to resolve before I decide to enroll in the Foundation of Analytics
course from Jigsaw Academy, which costs Rs. 26,000. I have sent a similar query to
Jigsaw today.
I checked the syllabus of this course; I can see core subjects as these:
These are basic concepts which are available for free on platforms like
Coursera – https://www.coursera.org/course/datascitoolbox
Udacity – https://www.udacity.com/courses#!/data-science
SAS.com.
SAS has recently introduced a free University Edition, which I have installed and which seems
great for now. I explored more and found that SAS provides free e-learning modules,
and they are really good at explaining the basic concepts.
There are many other free e-learning tutorials from SAS which you can
practice on SAS Studio University Edition. Check out the link
– http://support.sas.com/training/tutorial/
What do you think of Coursera? They provide online courses and even
certifications if you complete the course. I checked it out for R
Programming: https://www.coursera.org/course/rprog
(They are affiliated with the Johns Hopkins Bloomberg School of Public Health.) Now here’s the
catch: they aren't charging for it, it's free!
What are your views on this?
The entire “Data Science” Specialization costs $310 (9 courses), with some accessibility
privileges.
https://www.coursera.org/specialization/jhudatascience/1?utm_medium=courseDescripTop
1. Begin learning how to code in R. It will be difficult at first, and will take a few months
of spending many hours at it per day. You can take a course on Coursera or
DataCamp. Coursera can be expensive, but I think DataCamp is only $40 a month for
access to all the classes. Examples:
a. R Programming - Johns Hopkins University | Coursera
b. Free Introduction to R Programming Online Course
2. Take an online course in data science (DataCamp and/or Coursera).
3. Get a few books (ex: Introduction to Statistical Learning — James, Witten, Hastie,
Tibshirani). Ideally you’d want to learn this information in a few different forms, since
it will be difficult to really master it with just one course or one book.
4. Practice machine learning with datasets on Kaggle.
5. Join some Data Science/ Statistical Programming / Machine learning meetup
groups in your city. You can learn from free lectures and also make new friends to
learn from or learn with.
https://www.jigsawacademy.com/pgpdm/
https://www.kdnuggets.com/2018/05/beginners-guide-data-science-pipeline.html
1. Descriptive Analysis:
Exploratory data analysis is an approach that analyses a given data set by summarizing
its characteristics with visual methods. It could also represent the entire data set with its
features or just a part of the data set sample.
2. Statistical Inference:
From the data described or visualized during the descriptive analysis, we try to
understand the characteristics of the data set.
For example, to predict the number of voters who will support a candidate, the data
would include a clearly defined population of interest, a clear parameter chosen by the data
scientists, and an estimate of the population that has that parameter.
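The voter example can be sketched in a few lines: take a sample, compute the share that supports the candidate, and attach an approximate 95% confidence interval. The poll numbers below are invented for illustration, and the interval uses the normal approximation, which is an assumption that only holds reasonably well for large samples.

```python
import math

# Hypothetical poll: 1,000 sampled voters, 540 say they support the candidate.
sample_size = 1000
supporters = 540

p_hat = supporters / sample_size                         # point estimate of support
std_err = math.sqrt(p_hat * (1 - p_hat) / sample_size)   # standard error of a proportion
z = 1.96                                                 # approximate 95% normal quantile

low = p_hat - z * std_err
high = p_hat + z * std_err
print(f"Estimated support: {p_hat:.1%} (approx. 95% CI {low:.1%} to {high:.1%})")
```

Here the population of interest is all voters, the parameter is the true share of supporters, and `p_hat` with its interval is the estimate drawn from the sample.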
3. Bayesian Statistics:
Conditional Probability
Bayes Theorem
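A minimal sketch of how conditional probability and Bayes' theorem fit together, assuming a made-up screening scenario: the function name and all three input rates are hypothetical and chosen only to show the calculation.

```python
def bayes_posterior(prior, likelihood, false_positive_rate):
    """Return P(H | E) given P(H), P(E | H) and P(E | not H)."""
    # Total probability of seeing the evidence at all.
    evidence = likelihood * prior + false_positive_rate * (1 - prior)
    return likelihood * prior / evidence

# Hypothetical numbers: 1% of items are defective, a check flags 90% of
# defective items, and it wrongly flags 5% of good items.
posterior = bayes_posterior(prior=0.01, likelihood=0.90, false_positive_rate=0.05)
print(round(posterior, 3))
```

With these inputs the probability that a flagged item is really defective comes out at about 15%, far below the 90% detection rate, because defective items are rare to begin with.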
4. Experimental Design:
Experimental design is the practice of designing and planning experiments to reveal the
cause-and-effect relationship between variables in a study.
The best example of this is everyday cooking. Consider the dish being prepared to be
the end result.
A certain process is designed to attain the goal, and parameters such as salt, sugar and
other flavoring can be varied to end up with a different result every time, i.e.
to understand the various causes and effects.
https://www.jigsawacademy.com/online-analytics-training/
https://www.manipalprolearn.com/custom-search
As for MOOCs, Introduction to Data Analysis by Berkeley and Analytics Edge by MIT
on edx.org are the two that helped me kick-start. You can find a lot more
on coursera.com; Andrew Ng’s course is famous.
https://www.quora.com/Whats-the-best-way-to-learn-data-science-as-a-beginner
https://www.quora.com/How-do-I-learn-analytics-and-data-analysis-in-SQL-Is-there-a-book-or-course-for-it
Topics to be covered:
Course (mandatory) – Intro to Inferential Statistics from Udacity – Once you have
gone through the descriptive statistics course, this course will take you through
statistical modeling techniques and advanced statistics.
Books (optional) – Online Stats Book – This online book can be used for a quick
reference for inference tasks.
Course (mandatory)
o Linear Algebra – Khan Academy: This concise and excellent course on
Khan Academy will equip you with the skills necessary for Data Science and
Machine Learning.
Books (optional)
o Linear Algebra/ Levandosky – This is an often cited book to Stanford
graduates for Linear Algebra.
o The Manga guide to Linear Algebra – This is a fun filled Linear Algebra book
which keeps Machine Learning in context. You will never forget these Algebra
lessons for sure.
Articles (mandatory): These articles will guide you to structure your thinking
process to approach problems in a better way so as to improve your efficiency.
o How to train your mind for analytical thinking?
o Tools for improving structured thinking
o The art of structured thinking and analyzing
Topics to be covered:
Tools
1. R
Books – R for Data Science – This is your one stop solution for referencing basic
materials on R.
Blogs/Articles
o This article will serve as a great starting point, covering the entire process of model
building starting from installation of RStudio/R.
o R-bloggers – This is one of the most recommended blogs for R users. Every
R practitioner should keep this blog bookmarked. It has some of the most
effective and practical R tutorials. Bookmark it now.
2. Python
Books (mandatory) – Python for Data Analysis – This book covers various aspects
of Data Science, from loading data to manipulating, processing, cleaning and
visualizing it. A must-keep reference guide for pandas users.
Blogs/Articles (optional)
o A Complete Tutorial to Learn Data Science with Python from Scratch: This
article will serve as a quick guide to learning Data Science using Python.
Exploration and Visualization
1. R
Course
o Exploratory Data Analysis – This is an awesome course by Johns Hopkins
University on Coursera. You will need no other course to perform
visualization and exploratory work in R.
Blogs/Articles
o Comprehensive guide to Data Exploration in R – This is a one-stop
article that I suggest you go through carefully, following every step.
This is because the steps mentioned in the article are the same steps you will
be using while solving any data problem or a hackathon problem.
o Cheat sheet – Data Exploration in R – This cheat sheet contains all the steps
in data exploration with code. I suggest you print it out and paste it
on your wall for quick reference.
2. Python
Course (optional)
o Intro to Data Analysis – This is an excellent course by Udacity on Data
Exploration using Numpy and Pandas.
Blogs/Articles (mandatory)
o Comprehensive guide to Data Exploration using Python NumPy, Matplotlib
and Pandas – This is a sufficient and comprehensive article which uses the
most popular Python libraries for exploration and visualization purposes.
o 9 popular ways to perform Data Visualization in Python – This article presents
the most commonly used graphs and plots in Data Exploration along
with Python code. This is a must-bookmark article for people working in
Data Science using Python.
Books (optional) – Python for Data Analysis – A one stop solution for your Data
Exploration and Visualization in Python.
Linear Regression
Course
o Machine Learning by Andrew Ng – There is no better resource to learn Linear
Regression than this course. It will give you a thorough understanding of
linear regression and there is a reason why Andrew Ng is considered the
rockstar of Machine Learning.
Blogs/Articles
o This lesson out of PennState Stat 501 course outlines the main features of
Linear Regression ranging from a simple definition of a Linear Regression to
determining the goodness of fit of a regression line.
o This is an excellent article with practical examples to explain Linear
Regression with code.
Books
o The Elements of Statistical Learning – This book is sometimes considered the
holy grail of Machine Learning and Data Science. It explains Machine
Learning concepts mathematically from a Statistics perspective.
o Machine Learning with R – This is a book I personally use to have a brief
understanding of Machine Learning algorithms along with their
implementation code.
Practice
o Black Friday – Like I already said – No amount of theory can beat practice.
Here is a regression problem that you can try your hands on for a deeper
understanding.
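To make the ideas in this section concrete (fitting a regression line and judging its goodness of fit), here is a minimal sketch of ordinary least squares for a single predictor in plain Python. It is not taken from any of the resources listed above; the function names and the five data points are hypothetical.

```python
def fit_line(xs, ys):
    """Least-squares slope and intercept for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def r_squared(xs, ys, slope, intercept):
    """Share of the variance in ys explained by the fitted line."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_res / ss_tot


# Made-up data that roughly follows y = 2x.
xs = [1, 2, 3, 4, 5]
ys = [2.1, 4.1, 6.2, 7.9, 10.1]

slope, intercept = fit_line(xs, ys)
r2 = r_squared(xs, ys, slope, intercept)
print(f"slope={slope:.2f}, intercept={intercept:.2f}, R^2={r2:.4f}")
```

An R² close to 1 means the line explains nearly all the variation in the data; on real problems such as the practice set above you would typically use a library implementation and check the fit on held-out data as well.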
Logistic Regression
Course (mandatory)
o Machine Learning by Andrew Ng – Week 3 of this course will give you a
deeper understanding of one of the most widely used classification
algorithms.
o Machine Learning: Classification – Week 1 and 2 of this practical oriented
Specialization course using Python will satiate your knowledge thirst about
Logistic Regression.
Blogs/Articles (optional)
o Logistic Regression by Machine Learning Mastery – This is an excellent
non-code-based approach to Logistic Regression to deepen your knowledge. I
suggest you have a look at it.
Books (optional)
o Introduction to Statistical Learning – This is an excellent book with quality
content on Logistic Regression’s underlying assumptions, statistical nature
and mathematical linkage.
Practice (mandatory)
o Loan Prediction – This is an excellent competition to practice and test your
new Logistic Regression skills to predict whether loan status for a person was
approved or not.
Decision Trees
Course (mandatory)
o Machine Learning: Classification – Weeks 3 and 4 of this course cover the
working of decision trees, preventing overfitting and handling missing values.
Blogs/Articles (mandatory)
o Technical Overview of decision trees – This is a quick overview of decision
trees and a must read for anyone new to decision trees.
o Complete tutorial on tree based modeling – This is a Python-based tutorial on
decision trees. For decision trees, read only sections 1-6 of this
article.
Books (mandatory)
o Introduction to Statistical Learning – Section 8.1 and 8.3 explain the basics of
decision trees through theory and practical examples.
o Machine Learning with R – Chapter 5 of this book provides you the best
explanation of Machine Learning Algorithms available in the market. Here, the
decision trees are explained in an extremely non-intimidating and easier style.
Practice (mandatory)
o Loan Prediction – This is an excellent competition to practice and test your
new Logistic Regression skills to predict whether loan status for a person was
approved or not.
K-Nearest Neighbors
Course (mandatory)
o Machine Learning – Clustering and Retrieval: Week 2 of this course
progresses to k-nearest neighbors from 1-nearest neighbor and also
describes the best ways to approximate the nearest neighbors. It explains all
the concepts of KNN using python.
Blogs/Articles (mandatory)
o Introduction to k-nearest neighbors: simplified – This basic article describes
when to use KNN, the ways in which k can be chosen and the way in which
KNN algorithm works.
o Learning KNN algorithm using R – This article is a comprehensive guide to
learning KNN with hands-on code for future reference.
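The way the KNN algorithm works, and the role of k, can be sketched in plain Python as follows. This is only an illustrative version under simple assumptions (Euclidean distance, majority vote, a tiny made-up training set); the function name `knn_predict` is hypothetical and not from the resources above.

```python
import math
from collections import Counter


def knn_predict(train_points, train_labels, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    # Pair every training point's distance to the query with its label,
    # then sort so the closest points come first.
    distances = sorted(
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    )
    nearest_labels = [label for _, label in distances[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]


# Made-up training data: two well-separated groups in 2D.
points = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (5.0, 5.0), (5.2, 4.9), (4.8, 5.1)]
labels = ["A", "A", "A", "B", "B", "B"]

print(knn_predict(points, labels, (1.1, 1.0)))
print(knn_predict(points, labels, (5.0, 4.8), k=3))
```

A small k makes the prediction sensitive to individual noisy points, while a large k smooths the decision at the cost of blurring class boundaries, which is why the articles above spend time on how to choose it.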
K-Means
Course
o Machine Learning Course – Unsupervised Learning with K-means algorithm:
Week 8 of this course discusses how the K-means algorithm is used
for handling unstructured data.
Blog
o An Introduction to Clustering and different methods of clustering: In this
article, you will learn what k-means clustering is and the intricacies involved in
it. It gives you a step-by-step account of how the K-means algorithm works.
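The step-by-step working of K-means can be sketched as a loop that alternates between assigning points to the nearest centroid and moving each centroid to the mean of its points. The version below is a simplified illustration, not code from the linked course or article: it seeds the centroids with the first k points (an assumption made to keep the example deterministic; real libraries use smarter seeding) and runs a fixed number of iterations.

```python
import math


def kmeans(points, k, iterations=10):
    """Return (centroids, clusters) after a fixed number of K-means iterations."""
    centroids = list(points[:k])  # naive seeding: first k points (assumption)
    clusters = []
    for _ in range(iterations):
        # Assignment step: put each point in the cluster of its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its assigned points.
        for i, members in enumerate(clusters):
            if members:
                centroids[i] = tuple(sum(c) / len(members) for c in zip(*members))
    # `clusters` holds the assignment made in the final iteration.
    return centroids, clusters


# Made-up 2D data with two obvious groups.
data = [(1.0, 1.0), (5.0, 5.0), (1.2, 0.8), (5.2, 4.9), (0.9, 1.1), (4.8, 5.1)]
centroids, clusters = kmeans(data, k=2)
print(centroids)
print(clusters)
```

On data this clean the loop settles after the first pass; on real data the result depends on the starting centroids, so it is common to run the algorithm several times with different seeds.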
Naive Bayes
Course
o Intro to Machine Learning: Take this course to see Naive Bayes in action. In
this course, Sebastian Thrun has explained Naive Bayes in Simple English.
Blog / Article
o 6 Easy Steps to Learn Naive Bayes Algorithm (with code in Python) : This
article will take you through Naive Bayes algorithm in detail. In this guide, you
will learn how Naive Bayes algorithm works, applications and many more. It
will also give you hands-on knowledge of building a model using Naive
Bayes.
o Naive Bayes for Machine Learning : This is one of the most comprehensive
articles I have come across. Go through this article to have a complete
understanding of why naive bayes algorithm is important for machine
learning.
Dimensionality Reduction
Course
o Machine Learning – Dimensionality Reduction: Week 8 of this course will
walk you through dimensionality reduction and how Principal Components
Analysis can be used for data compression of complex data.
Blog / Article
o Beginners Guide To Learn Dimension Reduction Techniques: In this article,
you will learn why dimension reduction is important in machine learning and
the various techniques of dimension reduction.
Random Forests
Videos (mandatory)
o How Random Forest algorithm works? – Watch this video to have a visual
perspective of how the Random Forest algorithm works.
Books (optional)
o Introduction to Statistical Learning – Chapter 8 explains the basics of Random
Forests, including bagging and boosting, through theory and practical
examples.
o Applied Predictive Modeling – Chapter 8
Blogs/Articles (mandatory)
o A tutorial on tree based modeling from scratch – This is an excellent article on
tree-based modeling using Python. I suggest you bookmark it right now.
o Random Forests – This blog explains the entire working, nuts and bolts of
Random Forest.
Blogs/Articles (mandatory)
o Guide on Boosting methods
o Parameter tuning GBM
o Machine Learning Mastery- GBM
XGBOOST
Support Vector Machines
Course (mandatory)
o Machine Learning by Andrew Ng – Week 7 of this course is an interesting
place to start your SVM journey.
Books (mandatory)
o Introduction to Statistical Learning – Chapter 9 of the book contains a detailed
discussion of SVMs and the ways to deploy them.
Blogs/Articles (optional)
o Understanding support vector machines – This is an excellent article for
understanding the algorithm in practice through examples.
o SVM by Machine Learning Mastery – This article discusses the different types
of kernels employed in SVM and their uses.
Topics to be covered:
It is very important for a Data Scientist to have a GitHub profile that hosts the code for the
projects they have undertaken. Potential employers can see not only what you have done, but
also how you have coded and how frequently / how long you have been practicing data science.
Also, code on GitHub opens up avenues for open source projects, which can greatly boost your
learning. If you don’t know how to use Git, you can learn from Git and GitHub on Udacity.
This is one of the best and easiest courses for learning to manage repositories through the terminal.
Time and again, I have stressed that practice beats theory. Moreover, coding in
hackathons brings you closer to developing data products for solving real-world
problems. Below are the most popular platforms for participating in Data Science / Machine Learning
competitions.
Discussions are a great way to learn in a peer-to-peer setup, from finding an answer to a
question you are stuck on to answering someone else’s questions. Below are some of the
discussion-rich platforms which you should keep tabs on to clear your doubts.
If you are here after diligently following the above steps, then you can be sure that you are
ready for a Job / Internship position at any Data Science / Analytics or Machine Learning
firm. But it can be quite difficult to identify the right jobs. So, to save you the
trouble, I have created a list of portals that list Data Science / Machine Learning jobs
and internships.
In order to prepare for these interviews, you should go through this Damn Good Hiring
Guide.
The Ultimate Path for transitioners
Simply put, if you are looking to make a transition in under a year, you will need to learn everything
we laid out for the beginner above. Additionally, you will need to carve out extra time to
showcase your skills. You will need to overcome the doubts of your potential employers
through your projects and work.
I am sure you are beginning to understand why a transition is not easy.
The structure of the path is similar, but you will need to accelerate your learning in the first
half of the plan. Start by going through this article and go through a few success stories to
understand what a transition would entail. Once you are set for the journey, follow the plan
by sticking to these timelines.
Step 1: Getting started and testing the waters (1 week in January ’17)
Step 2: Mathematics & Statistics (Jan ’17 – March ’17)
Step 3: Introducing the tool – R / Python (March ’17 – April ’17)
Step 4: Basic & Advanced machine learning tools (May ’17 – July ’17)
Step 5: Building your profile (Aug ’17 – Oct ’17)
Step 6: Applying for Jobs (Nov ’17 – Dec ’17)
https://www.analyticsvidhya.com/blog/2017/01/the-most-comprehensive-data-science-learning-plan-for-2017/
https://trainings.analyticsvidhya.com/courses/course-v1%3AAnalyticsVidhya%2BPython-Final-Jan-Feb%2BPython-Session-1/
https://trainings.analyticsvidhya.com/courses/course-v1:AnalyticsVidhya+Python-Final-Jan-Feb+Python-Session-1/about
https://classroom.udacity.com/courses/ud198
https://edunxt.manipalprolearn.com/?q=MULNCourseBook/viewSectionMyCourseBook/42223/cidb/full/view/
https://www.jigsawacademy.com
https://www.datacamp.com/courses/intro-to-sql-for-data-science
https://analyticsindiamag.com/top-6-full-time-analytics-courses-india-ranking-2017/
https://analyticsindiamag.com/top-10-analyticsdata-science-training-institutes-india-ranking-2017/
Let’s chalk out the usual path for data science enthusiasts:
https://analyticsindiamag.com/become-data-scientist-2018/
https://analyticsindiamag.com/top-10-executive-analytics-courses-india-ranking-2017/
https://analyticsprofile.com/business-analytics/best-business-analytics-data-science-courses-in-india-2018/
One of the best resources to study ML that I’ve personally benefited from is Andrew Ng’s
course on Coursera.
Stanford’s CS231n is also really good, with an emphasis on Convolutional Neural Networks.
fast.ai is also considered to be really good, although I haven’t personally used it.
These courses will hopefully teach you the theory behind building ML systems. Beyond
that, one of the best ways to learn is to try those algorithms out and practice. Learning
how to use TensorFlow or PyTorch will also prove helpful.
For Machine Learning (ML), I earned my course certificate from Coursera; it is a
certification program from Stanford University taught by Professor Andrew Ng.
For AI, you must have knowledge about Machine Learning, Deep Learning, Natural
Language Processing, and Big Data.