
THE MATHEMATICS OF MACHINE LEARNING
WALE AKINFADERIN FEBRUARY 15, 2017


In the last few months, several people have contacted me about their enthusiasm for venturing into the world of data science and using
Machine Learning (ML) techniques to probe statistical regularities and build impeccable data-driven products. However, I've observed that
some lack the mathematical intuition and framework needed to get useful results. This is the main reason I decided to write this
blog post. Recently, there has been an upsurge in the availability of easy-to-use machine and deep learning packages such as
scikit-learn, Weka and TensorFlow. Machine Learning theory is a field that intersects statistics, probability, computer science and
algorithms. It arises from learning iteratively from data and finding hidden insights that can be used to build intelligent applications. Despite
the immense possibilities of Machine and Deep Learning, a thorough mathematical understanding of many of these techniques is
necessary to grasp the inner workings of the algorithms and to get good results.

WHY WORRY ABOUT THE MATH?

There are many reasons why the mathematics of Machine Learning is important, and I'll highlight some of them below:

1. Selecting the right algorithm, which includes considering accuracy, training time, model complexity, number of parameters
and number of features.

2. Choosing parameter settings and validation strategies.

3. Identifying underfitting and overfitting by understanding the Bias-Variance tradeoff.

4. Estimating the right confidence interval and uncertainty.
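
As a small illustration of point 4, here is a minimal sketch of a 95% confidence interval for a model's mean accuracy across validation folds. The fold scores are made-up numbers, and the normal approximation is an assumption chosen for simplicity. With only a handful of folds, a Student's t interval would be somewhat wider.

```python
import math
import statistics

def mean_confidence_interval(scores, confidence=0.95):
    """Return (mean, low, high) using a normal approximation."""
    mean = statistics.mean(scores)
    # Standard error of the mean: sample standard deviation / sqrt(n).
    std_error = statistics.stdev(scores) / math.sqrt(len(scores))
    # Two-sided critical value (about 1.96 for 95% confidence).
    z = statistics.NormalDist().inv_cdf(0.5 + confidence / 2)
    return mean, mean - z * std_error, mean + z * std_error

# Hypothetical accuracies from 5 validation folds (illustrative only).
fold_scores = [0.81, 0.79, 0.84, 0.80, 0.82]
mean, low, high = mean_confidence_interval(fold_scores)
print(f"accuracy = {mean:.3f}, 95% CI = [{low:.3f}, {high:.3f}]")
```

Reporting the interval rather than the mean alone makes it easier to tell whether two models really differ or whether the gap is within the noise.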

WHAT LEVEL OF MATHS?

The main question when trying to understand an interdisciplinary field such as Machine Learning is how much maths is necessary and
at what level. The answer to this question is multidimensional and depends on the level and
interest of the individual. Research in mathematical formulations and theoretical advancement of Machine Learning is ongoing and some
researchers are working on more advanced techniques. I'll state what I believe to be the minimum level of mathematics needed to be a
Machine Learning Scientist/Engineer and the importance of each mathematical concept.

1. Linear Algebra: A colleague, Skyler Speakman, recently said that "Linear Algebra is the mathematics of the 21st century" and I totally
agree with the statement. In ML, Linear Algebra comes up everywhere. Topics such as Principal Component Analysis (PCA), Singular Value
Decomposition (SVD), Eigendecomposition of a matrix, LU Decomposition, QR Decomposition/Factorization, Symmetric Matrices,
Orthogonalization & Orthonormalization, Matrix Operations, Projections, Eigenvalues & Eigenvectors, Vector Spaces and Norms are
needed for understanding the optimization methods used for machine learning. The amazing thing about Linear Algebra is that there are
so many online resources. I have always said that the traditional classroom is dying because of the vast amount of resources available on
the internet. My favorite Linear Algebra course is the one offered by MIT OpenCourseWare (Prof. Gilbert Strang).
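
To make the link between eigenvectors and PCA concrete, here is a minimal sketch in plain Python. It finds the first principal component of a tiny 2-D dataset by running power iteration on the sample covariance matrix. The data points and the helper names (`covariance_2d`, `top_eigenpair`) are invented for illustration. In practice you would more likely call an SVD routine from a numerical library.

```python
import math

def covariance_2d(points):
    """Sample covariance matrix of a list of (x, y) pairs."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points) / (n - 1)
    syy = sum((y - mean_y) ** 2 for _, y in points) / (n - 1)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points) / (n - 1)
    return [[sxx, sxy], [sxy, syy]]

def top_eigenpair(matrix, iterations=100):
    """Power iteration: dominant eigenvalue and unit eigenvector of a 2x2 matrix."""
    v = [1.0, 0.0]  # arbitrary start, assumed not orthogonal to the answer
    for _ in range(iterations):
        w = [matrix[0][0] * v[0] + matrix[0][1] * v[1],
             matrix[1][0] * v[0] + matrix[1][1] * v[1]]
        norm = math.hypot(w[0], w[1])
        v = [w[0] / norm, w[1] / norm]
    # Rayleigh quotient v^T M v gives the eigenvalue for a unit vector v.
    mv = [matrix[0][0] * v[0] + matrix[0][1] * v[1],
          matrix[1][0] * v[0] + matrix[1][1] * v[1]]
    return v[0] * mv[0] + v[1] * mv[1], v

# Made-up points lying roughly along the line y = 2x.
data = [(1.0, 2.1), (2.0, 3.9), (3.0, 6.2), (4.0, 7.8)]
cov = covariance_2d(data)
eigenvalue, direction = top_eigenpair(cov)
explained = eigenvalue / (cov[0][0] + cov[1][1])  # share of total variance
print(direction, explained)
```

The eigenvector is the direction of greatest variance, and the ratio of its eigenvalue to the trace of the covariance matrix is the fraction of variance that the single component captures.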

2. Probability Theory and Statistics: Machine Learning and Statistics aren't very different fields. Actually, someone recently defined
Machine Learning as "doing statistics on a Mac". Some of the fundamental Statistical and Probability Theory topics needed for ML are
Combinatorics, Probability Rules & Axioms, Bayes' Theorem, Random Variables, Variance and Expectation, Conditional and Joint
Distributions, Standard Distributions (Bernoulli, Binomial, Multinomial, Uniform and Gaussian), Moment Generating Functions, Maximum
Likelihood Estimation (MLE), Prior and Posterior, Maximum a Posteriori Estimation (MAP) and Sampling Methods.
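
A short sketch may help show how MLE, priors, posteriors and MAP fit together. It estimates the success probability of a Bernoulli variable (say, a coin) from observed counts. The Beta prior and the function names are assumptions made for this example. The MAP formula is the mode of the Beta posterior and is only valid when both prior parameters are at least 1 and the posterior parameters exceed 1.

```python
def bernoulli_mle(successes, trials):
    """Maximum likelihood estimate: the observed success frequency."""
    return successes / trials

def bernoulli_map(successes, trials, alpha, beta):
    """MAP estimate under a Beta(alpha, beta) prior.

    The posterior is Beta(alpha + successes, beta + failures);
    its mode is (alpha + successes - 1) / (alpha + beta + trials - 2).
    """
    return (successes + alpha - 1) / (trials + alpha + beta - 2)

# Illustrative data: 7 heads in 10 tosses.
mle_estimate = bernoulli_mle(7, 10)
# A Beta(2, 2) prior gently favours a fair coin and pulls the estimate toward 0.5.
map_estimate = bernoulli_map(7, 10, alpha=2, beta=2)
# A flat Beta(1, 1) prior carries no preference, so MAP coincides with MLE.
flat_prior_estimate = bernoulli_map(7, 10, alpha=1, beta=1)
print(mle_estimate, map_estimate, flat_prior_estimate)
```

With little data the prior matters a great deal. As the number of trials grows, the MAP and MLE estimates converge.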

3. Multivariate Calculus: Some of the necessary topics include Differential and Integral Calculus, Partial Derivatives, Vector-Valued
Functions, Directional Derivatives and the Gradient, the Hessian, Jacobian, Laplacian and Lagrangian.
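
To give a feel for what the gradient and Hessian are, the sketch below approximates both with central finite differences for a simple quadratic function. The function `f` and the step sizes are arbitrary choices for illustration. Real ML libraries normally obtain derivatives by automatic differentiation rather than this way.

```python
def f(x, y):
    # Example function with known derivatives:
    # gradient = (2x + 3y, 3x + 4y), Hessian = [[2, 3], [3, 4]].
    return x ** 2 + 3 * x * y + 2 * y ** 2

def gradient(func, point, h=1e-5):
    """Vector of partial derivatives, by central differences."""
    grads = []
    for i in range(len(point)):
        forward, backward = list(point), list(point)
        forward[i] += h
        backward[i] -= h
        grads.append((func(*forward) - func(*backward)) / (2 * h))
    return grads

def hessian(func, point, h=1e-4):
    """Matrix of second partial derivatives, by central differences."""
    n = len(point)
    result = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            def shifted(di, dj):
                p = list(point)
                p[i] += di
                p[j] += dj
                return func(*p)
            result[i][j] = (shifted(h, h) - shifted(h, -h)
                            - shifted(-h, h) + shifted(-h, -h)) / (4 * h * h)
    return result

grad = gradient(f, [1.0, 2.0])
hess = hessian(f, [1.0, 2.0])
print(grad, hess)
```

The gradient points in the direction of steepest increase, which is why optimizers step against it. The Hessian describes curvature and is symmetric for smooth functions like this one.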

4. Algorithms and Complex Optimizations: This is important for understanding the computational efficiency and scalability of our
Machine Learning algorithms and for exploiting sparsity in our datasets. Knowledge of data structures (Binary Trees, Hashing, Heap, Stack
etc.), Dynamic Programming, Randomized & Sublinear Algorithms, Graphs, Gradient/Stochastic Descent and Primal-Dual methods is
needed.
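
The difference between gradient descent and its stochastic variant can be sketched on a toy least-squares problem. Both functions below fit a line y = w*x + b. The first uses the gradient of the mean squared error over all points on each step, while the second updates after every single point. The data, learning rates and iteration counts are assumptions picked so that this small example converges.

```python
import random

def batch_gradient_descent(xs, ys, learning_rate=0.05, steps=2000):
    """Each step uses the gradient of the mean squared error over all points."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        errors = [w * x + b - y for x, y in zip(xs, ys)]
        grad_w = 2 * sum(e * x for e, x in zip(errors, xs)) / n
        grad_b = 2 * sum(errors) / n
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b

def stochastic_gradient_descent(xs, ys, learning_rate=0.02, epochs=2000, seed=0):
    """Each step uses the gradient of the squared error at one point."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    w, b = 0.0, 0.0
    indices = list(range(len(xs)))
    for _ in range(epochs):
        rng.shuffle(indices)  # visit the points in a new random order
        for i in indices:
            error = w * xs[i] + b - ys[i]
            w -= learning_rate * 2 * error * xs[i]
            b -= learning_rate * 2 * error
    return w, b

# Noise-free toy data generated from y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]
w_batch, b_batch = batch_gradient_descent(xs, ys)
w_sgd, b_sgd = stochastic_gradient_descent(xs, ys)
print(w_batch, b_batch, w_sgd, b_sgd)
```

Each stochastic update costs the same regardless of dataset size, which is the scalability argument for it on large datasets. The trade-off is a noisier path, which on real, noisy data usually calls for a decaying learning rate.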

5. Others: This comprises other Math topics not covered in the four major areas described above. They include Real and Complex
Analysis (Sets and Sequences, Topology, Metric Spaces, Single-Valued and Continuous Functions, Limits), Information Theory (Entropy,
Information Gain), Function Spaces and Manifolds.
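
Entropy and information gain are easy to compute directly, and doing so shows why decision trees use them to choose splits. The sketch below uses Shannon entropy in bits. The label lists are invented for illustration.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((count / total) * math.log2(count / total)
                for count in Counter(labels).values())

def information_gain(parent, children):
    """Entropy of the parent minus the size-weighted entropy of its children."""
    total = len(parent)
    weighted = sum(len(child) / total * entropy(child) for child in children)
    return entropy(parent) - weighted

# Hypothetical labels for four emails.
parent = ["spam", "spam", "ham", "ham"]
# A split that separates the classes perfectly removes all uncertainty.
perfect_gain = information_gain(parent, [["spam", "spam"], ["ham", "ham"]])
# A split that leaves each half as mixed as the parent gains nothing.
useless_gain = information_gain(parent, [["spam", "ham"], ["spam", "ham"]])
print(entropy(parent), perfect_gain, useless_gain)
```

A tree-building algorithm would evaluate the gain of each candidate split and keep the one with the highest value.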

Some MOOCs and materials for studying the Mathematics topics needed for Machine Learning are:

Khan Academy's Linear Algebra, Probability & Statistics, Multivariable Calculus and Optimization.
Coding the Matrix: Linear Algebra through Computer Science Applications by Philip Klein, Brown University.
Linear Algebra: Foundations to Frontiers by Robert van de Geijn, University of Texas, on edX.
Applications of Linear Algebra, Part 1 and Part 2. A newer course by Tim Chartier, Davidson College.
Joseph Blitzstein's Harvard Stat 110 lectures.
Larry Wasserman's book All of Statistics: A Concise Course in Statistical Inference.
Boyd and Vandenberghe's course on Convex Optimization from Stanford.
Udacity's Introduction to Statistics.
Coursera/Stanford's Machine Learning course by Andrew Ng.

Finally, the main aim of this blog post is to give well-intentioned advice about the importance of Mathematics in Machine Learning, along with
the necessary topics and useful resources for mastering them. However, some Machine Learning enthusiasts are novices in Maths
and will probably find this post disheartening (seriously, this is not my aim). If you are a beginner, you don't need a lot of Mathematics to start
doing Machine Learning. The fundamental prerequisite is data analysis as described in this blog post and you can learn the maths on the
go as you master more techniques and algorithms.

This post originally appeared on Wale's LinkedIn profile.


TAGS: Calculus Linear Algebra Machine Learning Mathematics statistics


THE AUTHOR

Wale Akinfaderin

Wale is a Graduate Research Assistant at the National High Magnetic Field Laboratory and a PhD candidate in Physics at Florida
State University.

