
PYTHON PROGRAMMING

3 BOOKS IN 1: ULTIMATE BEGINNER'S, INTERMEDIATE & ADVANCED GUIDE TO LEARN PYTHON STEP BY STEP
RYAN TURNER
CONTENTS

Python Programming: The Ultimate Beginner’s Guide to Learn Python Step by Step
Introduction
1. What is Python Machine Learning?
2. How to Start Learning Python
3. Review of Data Samples and Visualization of Data
4. How to Create a Dataset with Visualization
5. Making Predictions with Algorithms
6. Examples of Coding
7. Decision Tree
8. Neural Networks
9. Bringing it All Together
Conclusion
Python Programming: The Ultimate Intermediate Guide to Learn Python Step by Step
Introduction
1. What Is Machine Learning
2. Supervised Machine Learning
3. Unsupervised Machine Learning
4. The Basics of Working with Python
5. Setting up Your Python Environment
6. Data Preprocessing with Machine Learning
7. Working with Linear Regression in Machine Learning
8. Using a Decision Tree for Regression
9. Random Forest for Regression
10. Working with a Support Vector Regression
11. What is Naive Bayes and How Does It Work with Machine Learning
12. K-Nearest Neighbors Algorithm for Classification
Conclusion
Python Programming: The Ultimate Expert Guide to Learn Python Step-by-Step
Introduction
1. Working with Inheritances in Python
2. Arguments in Python
3. Namespace and Python
4. Working with Iterators in Python and What These Mean
5. Exception Handling and How to Create a Unique Code with Them
6. The Python Generators
7. What are Itertools in the Python Language
8. What are Closures in Python and Why are they so Important
9. Working with Regular Expressions
10. What are the Conditional Statements and When Will I Need to Use Them?
11. Do I Need to Learn Assert Handling in This Language
12. How to Work with Loops in Your Python Code
13. When to Use User-Defined Functions in Your Code
14. Working with Memoization in Python
Conclusion
Copyright 2018 by James C. Anderson - All rights reserved.
The following eBook is reproduced below with the goal of providing information that is as accurate and reliable as possible. Regardless, purchasing this eBook can be seen as consent to the fact that both the publisher and the author of this book are in no way experts on the topics discussed within, and that any recommendations or suggestions made herein are for entertainment purposes only. Professionals should be consulted as needed prior to undertaking any of the actions endorsed herein.
This declaration is deemed fair and valid by both the American Bar Association and the Committee of Publishers Association and is legally binding throughout the United States.
Furthermore, the transmission, duplication, or reproduction of any of the following work, including specific information, will be considered an illegal act, irrespective of whether it is done electronically or in print. This extends to creating a secondary or tertiary copy of the work, or a recorded copy, and is only allowed with express written consent from the Publisher. All additional rights reserved.
The information in the following pages is broadly considered to be a truthful and accurate account of facts, and as such any inattention to, use of, or misuse of the information in question by the reader will render any resulting actions solely under their purview. There are no scenarios in which the publisher or the original author of this work can in any fashion be deemed liable for any hardship or damages that may befall the reader after undertaking actions described herein.
Additionally, the information in the following pages is intended only for informational purposes and should thus be thought of as universal. As befits its nature, it is presented without assurance regarding its prolonged validity or interim quality. Trademarks that are mentioned are used without written consent and can in no way be considered an endorsement from the trademark holder.
PYTHON PROGRAMMING: THE ULTIMATE
BEGINNER’S GUIDE TO LEARN PYTHON
STEP BY STEP
INTRODUCTION

Congratulations on downloading Python Beginners Guide: Machine Learning for Newbies, and thank you for doing so.
In this Python Beginner's Guide, you're about to learn...

The most vital basics of Python programming. Rapidly pick up the language and begin applying the ideas to any code that you write.
The useful features of Python for beginners, including ideas you can apply in real-world situations and even in other programs.
The different mechanics of Python programming: control flow, variables, lists/dictionaries, and classes, and why learning these core principles is essential to success with Python.
Object-oriented programming, its impact on present-day scripting languages, and why it matters.

This guide has been written specifically for newbies and beginners. You will be taken through each step of your very first program, and we will explain each portion of the script as you test and analyze the data.
Machine learning is defined as a subset of something called artificial intelligence (AI). The ultimate goal of machine learning is to first comprehend the structure of the presented data and fit that data into models that can then be understood and used by anyone.
Despite the fact that machine learning is a branch of the computer science field, it truly is different from normal data processing methods.
In conventional computing, programs are groups of individually programmed instructions that computers use to determine outcomes and solve problems. Machine learning algorithms instead allow computers to train on the data that is input and use statistical analysis in order to deliver values that fall within a certain probability. What this means is that computers can build models from sample data, which enables them to automate routine decision-making steps based on the specific data that was input.
Any technology user today has benefited from machine learning. Facial recognition technology enables social media platforms to let users tag and share photographs of friends. Optical character recognition (OCR) technology converts pictures of text into editable text. Recommendation engines, powered by machine learning, suggest which movies or TV programs to watch next based on user preferences. Self-driving cars that depend on machine learning to navigate may soon be accessible to consumers.
Machine learning is a ceaselessly growing field. Because of this, there are a few things to remember as you work with machine learning methodologies or analyze the impact of machine learning processes.
In this book, we'll look at the common machine learning strategies of supervised and unsupervised learning, and basic algorithmic approaches including the k-nearest neighbors algorithm, decision tree learning, and other deeply impactful techniques. We will also investigate which software is most used in machine learning, giving you some of its positive and negative qualities. Moreover, we'll talk about some important biases that can be propagated by machine learning algorithms and consider what can be done to keep biases from affecting the models you build.
There are plenty of books on this subject on the market. Thanks for choosing this one! Every effort was made to ensure it's as full of useful information as possible. Please enjoy!
1

WHAT IS PYTHON MACHINE LEARNING?

WHAT IS PYTHON?

Python is an awesome choice for machine learning for a few reasons. Most importantly, it's a simple language on the surface. Even if you're not acquainted with Python, getting up to speed is quick if you have ever used any other language with C-like syntax.
Second, Python has an incredible community, which results in great documentation and friendly, extensive answers on StackOverflow (essential!).
Third, coming out of that huge community, there are a lot of valuable libraries for Python (both "batteries included" and third party), which take care of essentially any problem that you can have (including machine learning).
Wait, I thought this language was slow?
Unfortunately, it is a very valid question that deserves an answer. Indeed, Python is not at all the fastest language on the planet.
However, here's the caveat: libraries can and do offload the costly computations to substantially more performant (yet much harder to use) languages; C and C++ are prime examples. There's NumPy, which is a library for numerical calculation. It is written in C, and it's fast. For all intents and purposes, every library out there that does serious number crunching uses it; every one of the libraries described next uses it in some shape. If you read NumPy, think fast.
In this way, you can make your Python scripts run essentially as fast as if you had handwritten them in a lower-level language. So there's truly nothing to worry about with regards to speed.
If you want to know which Python libraries you should check out, try some of these.
“Scikit-learn”
Do you need something that completely addresses everything from testing and training models to engineering techniques?
Then scikit-learn is your best solution. This incredible bit of free software gives you every tool important to machine learning and data mining. It's the de facto standard library for machine learning in Python, suggested for the vast majority of the 'classic' ML algorithms.
This library does both classification and regression, supporting essentially every algorithm out there (support vector machines, random forests, Naive Bayes, you name it). It allows simple swapping of algorithms, which makes experimentation a lot simpler. These 'older' algorithms are shockingly flexible and work extremely well in a considerable number of problems and case studies.
In any case, that is not all! Scikit-learn additionally does clustering, dimensionality reduction, and so on. It's likewise exceedingly quick since it runs on NumPy and SciPy.
Look at a few examples to see everything this library is prepared to do, work through the tutorials on the website, and figure out if this is a good fit.
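To make the "simple swapping of algorithms" concrete, here is a minimal sketch using scikit-learn's bundled copy of the Iris data (which we will meet again later). It trains two very different classifiers through the exact same fit/score interface:

from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC

# Load the data and hold back 20% of it for testing.
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=1)

# Every scikit-learn model exposes the same fit/score interface,
# so swapping one algorithm for another is a one-line change.
for model in (KNeighborsClassifier(), SVC()):
    model.fit(X_train, y_train)
    print(type(model).__name__, model.score(X_test, y_test))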
“NLTK”
While not essentially a machine learning library, NLTK is an unquestionable requirement when working with natural language. It is bundled with a heap of datasets and other linguistic resources, which is invaluable for preparing certain models. Aside from the corpora, it is great for capabilities such as classification, tokenization, stemming, tagging, and parsing; and that's just the beginning.
The handiness of having the majority of this stuff perfectly bundled can't be exaggerated. In case you are keen on natural language processing, look at a few of their website's tutorials!
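As a small hedged taste of what NLTK offers, here is basic tokenization; the tokenizer model must be downloaded once before it can be used (depending on your NLTK version, the model may be named 'punkt' or 'punkt_tab'):

import nltk
nltk.download('punkt')          # one-time download of the tokenizer model
from nltk.tokenize import word_tokenize

text = "NLTK splits text into tokens. It handles punctuation, too."
print(word_tokenize(text))      # ['NLTK', 'splits', 'text', 'into', 'tokens', '.', ...]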
“Theano”
Used widely in research and within academia, Theano is the granddad of all deep learning frameworks. Since it is written in Python, it is tightly integrated with NumPy. Theano enables you to build neural networks, which are essentially mathematical expressions over multi-dimensional arrays. Theano handles this so that you don't need to stress over the real implementation of the math involved.
It supports offloading computations to a considerably speedier GPU, which is a feature that everybody supports today; yet back when Theano introduced it, this wasn't the case. The library is extremely mature now and boasts an extensive variety of operations, which is extraordinary when comparing it with other similar libraries.
The greatest grievance out there about Theano is that the API may be cumbersome for some, making the library difficult to use for beginning learners. In any case, there are tools that relieve the pain and make working with Theano pretty straightforward; for example, try using Keras, Blocks, or even Lasagne.
“TensorFlow”
The geniuses over at Google made TensorFlow for internal use in machine learning applications and publicly released it in late 2015. They needed something that could supplant their more established, non-open-source machine learning framework, DistBelief, which wasn't sufficiently adaptable and was too firmly ingrained into their infrastructure to be shared with different researchers around the globe.
Thus, TensorFlow was made. Many view this library as a much-needed improvement over Theano, asserting greater adaptability and a more intuitive API. Another great benefit is that it can be deployed into production environments, supporting tremendous numbers of GPUs for training and learning purposes. While it doesn't support as wide a scope of functionality as Theano, it has better computational graph visualizations.
TensorFlow is exceptionally famous these days. In fact, if you follow every single library on this list, you can see that there has been a huge influx in the number of new users and bloggers around this library and its functionality. This is definitely a good thing for beginners.
“Keras”
Keras is a phenomenal library that gives a top-level API for neural networks and is best run alongside or on top of Theano or TensorFlow. It makes harnessing the full power of these intricate bits of software substantially simpler than utilizing them all by themselves. The greatest benefit of this library is its exceptional ease of understanding, putting the end users' needs and experiences as its number one priority. This cuts down on a number of errors.
It is also modular, which means that individual pieces like neural layers and cost functions can be grouped together with little to no limitations. This additionally makes it simple to add new models to the library and interface them with the current ones.
A few people have called Keras so great that using it is similar to cheating on an exam. In case you're beginning with deep learning, take the illustrations and examples and discover what you can do with them. Try exploring.
Furthermore, if you want to start learning, it is recommended that you begin with their tutorials and see where you can go from that point.
Two comparable choices are Lasagne and Blocks; however, they only run on Theano. If you attempted Keras and had difficulty, perhaps experiment with one of these alternative options to check whether they work out for you.
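Whichever you pick, here is a minimal sketch of the kind of code Keras enables; it stacks a tiny classifier for four-feature inputs (such as the Iris measurements) in a few lines. The import paths assume standalone Keras and may differ slightly with the TensorFlow-bundled version:

from keras.models import Sequential
from keras.layers import Dense

# Layers are stacked one after another; Keras wires them together.
model = Sequential()
model.add(Dense(10, activation='relu', input_shape=(4,)))   # hidden layer
model.add(Dense(3, activation='softmax'))                   # one output per class

# Compiling picks the optimizer and the loss; training would be model.fit(X, y).
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
model.summary()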
“PyTorch”
If you are looking for a popular deep learning library, then look no further than Torch, which is written in the language called Lua. Facebook open-sourced a Python counterpart of Torch and named it PyTorch, which allows you to easily use the exact same underlying libraries that Torch uses, but from Python instead of the original language, Lua.
PyTorch is significantly easier for debugging because of one major difference between Theano and TensorFlow on the one hand and PyTorch on the other: the older frameworks use symbolic computation while the newer one does not. Symbolic computation is simply a way of saying that coding an operation, for example 'a + b', will not be computed when that line is read. Before it is executed, it must be translated into what is called CUDA or C. This makes debugging much harder in Theano/TensorFlow, since an error is more difficult to pinpoint to a specific line of code. It's basically harder to trace back to the source; debugging is not one of those libraries' strongest features.
PyTorch, which evaluates each line as it runs, is extremely beginner-friendly; as your learning increases, try some of their more advanced tutorials and examples.
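A hedged illustration of that eager style: in PyTorch, 'a + b' is computed the moment the line runs, so an error surfaces on the exact line that caused it:

import torch

a = torch.tensor([1.0, 2.0, 3.0])
b = torch.tensor([4.0, 5.0, 6.0])

# No graph-compilation step: the sum exists as soon as this line executes.
c = a + b
print(c)        # tensor([5., 7., 9.])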

HISTORY OF PYTHON
Python was invented in the later years of the 1980s. Guido van Rossum, the founder, started implementing the language in December 1989. He is Python's only known creator, and his integral role in the growth and development of the language has earned him the nickname "Benevolent Dictator for Life". It was created to be the successor to the language known as ABC.
Van Rossum described one of the reasons he created Python back in 1996:
"...In December 1989, I was looking for a "hobby" programming project that would keep me occupied during the week around Christmas. My office ... would be closed, but I had a home computer and not much else on my hands. I decided to write an interpreter for the new scripting language I had been thinking about lately: a descendant of ABC that would appeal to Unix/C hackers. I chose Python as a working title for the project, being in a slightly irreverent mood (and a big fan of Monty Python's Flying Circus)."
The next version released was Python 2.0, in October of the year 2000, and it had significant upgrades and new highlights, including a cycle-detecting garbage collector and support for Unicode. Most fortunately, this particular version also moved the language to a more straightforward, community-backed development process.
Python 3.0 initially started its existence as Py3K. Funny, right? This version was rolled out in December of 2008 after a rigorous testing period. This particular version of Python is, most unfortunately, not backward compatible with earlier versions. Yet a significant number of its real highlights have been backported to versions 2.6 and 2.7 of Python, and releases of Python 3 ship with the 2to3 utility, which helps automate the translation of Python 2 scripts.
Python 2.7's expiry date was originally supposed to be back in 2015, but it was put off until the year 2020, in large part out of concern that a major body of existing code could not easily be rolled FORWARD into the new version, Python 3. In 2017, Google declared that there would be work done on Python 2.7 to enhance its performance under concurrently running tasks.

BASIC FEATURES OF PYTHON


Python is a distinctive and extremely robust programming language that is object-oriented, much like Ruby, Perl, and Java.
Some of Python's remarkable highlights:
Python uses an elegant structure, making the programs you write easier to read and analyze.
It is a simple-to-use language that makes it easy to get your program working. This makes Python perfect for prototype development and other ad-hoc programming assignments, without trading off maintainability.
It comes with a huge standard library that supports tons of common programming tasks, for example connecting to web servers, processing and handling files, and searching through text with regular expressions.
Python's easy-to-use interactive interpreter makes it simple to test shorter pieces of code. It also comes with IDLE, a bundled "development environment".
Python is effortlessly extended by adding new modules implemented in a compiled language such as C or C++.
Python can also be embedded into another application to give it an easily programmable interface.
Python will run anyplace, including OS X, Windows, Linux, and even Unix, with unofficial builds for the Android and iOS environments.
Python can easily be copied, modified, and redistributed. While it is copyrighted, it's available under an open-source license.
Ultimately, Python is free software.
Common Programming Language Features of Python
A huge array of common data types: floating-point numbers, complex numbers, unlimited-length integers, ASCII strings and Unicode, as well as dictionaries and lists.
Python is built on an object-oriented framework, with classes and multiple inheritance.
Python code can be bundled together into modules and packages.
Python is known for being a much cleaner language for error handling, thanks to the catching and raising of exceptions it allows.
Data is strongly and dynamically typed. Mixing incompatible data types, for example adding a string and a number together, raises an exception right away, so errors are caught significantly sooner rather than later.
Python has advanced coding highlights such as list comprehensions and iterators.
Python's automatic memory management frees you from having to manually allocate and free memory in your code.
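A brief sketch of two of those features in action; both behave this way in any modern Python:

# Strong, dynamic typing: mixing incompatible types fails immediately.
try:
    "age: " + 30
except TypeError as err:
    print(err)          # e.g. can only concatenate str (not "int") to str

# A list comprehension builds a list in a single readable expression.
squares = [n * n for n in range(5)]
print(squares)          # [0, 1, 4, 9, 16]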
2

HOW TO START LEARNING PYTHON

Okay, you've given me plenty of options for machine learning libraries in Python. Which would be a good one to pick? How do I compare these things? Where do I begin?
My advice for beginners is to try things out and not get distracted by small details. If you've never done anything related to machine learning, experiment with scikit-learn. You'll get an idea of how the cycle of labeling, training, and testing works and how a model is created.
As an alternative, you can begin experimenting with machine learning using Keras, which is generally agreed to be the most straightforward framework, and see where that takes you. After you have more experience, you will begin to recognize what you need from a framework: more speed, a different API, or perhaps something else, and you'll have the ability to settle on a more educated choice.
What's more, there is a perpetual supply of articles out there comparing Theano, Torch, and TensorFlow. There's no genuine method to tell which one is the best. It's essential to consider that every one of them has wide support and is improving continually, making comparisons harder to make. A months-old benchmark may be obsolete, and a year-old claim that framework X doesn't support operation Y may no longer be valid.
If you're interested in doing machine learning particularly connected to NLP, take a look at MonkeyLearn! The platform gives an interesting UX that makes it effortless to build, train, and enhance NLP models. You can either utilize pre-trained models for common use cases (like sentiment analysis, topic detection, or keyword extraction) or train custom algorithms using your specific data. Likewise, you don't need to stress about the underlying infrastructure or deploying your models; the platform's scalable cloud does this for you. You can begin for free and integrate immediately with their excellent API.
Python is a mainstream and easy-to-use language. Python is an entire language and platform that can be used both for research and development and for building production systems.
There are likewise a considerable number of modules and libraries to learn, giving a wide array of approaches to each task. It can be overpowering.
The most ideal approach to begin utilizing Python for machine learning is to work through a small project from end to end:
It will compel you to install and start the Python interpreter.
It will give you a 10,000-foot perspective of how to advance through a little project.
It will give you the confidence to go on to your own particular little projects.
Beginners will want to start with a small project that goes from start to finish. Books and even many available Python courses can be a bit perplexing. Most of them will give you loads of recipes and snippets; however, you rarely get the chance to see how all of the pieces of the puzzle fit together.
When it comes time to apply what you have learned to other particular datasets, you will be looking at a real opportunity for growth.
Some of the common steps of a machine learning project in Python are:

Define the problem
Gather the data
Evaluate the algorithms
Improve the results
Define and present the results
The absolute easiest way to face learning an unfamiliar platform or language is to never give up and keep working through a machine learning case study until you get to the end. Just be sure to note all of the key points and techniques being taught along the way. In particular, pay special attention to the data loading, the data summarization, how to evaluate different algorithms, and how to come up with certain predictions.
Once you are able to do that, you have a template that is available to you for each of your future datasets.
Try the classic Iris dataset; it is the easiest for those who are just starting out with machine learning. This is a great project for several reasons:
Attributes are numeric, which means you have to figure out how to load and manage the data.
Its size and simplicity enable users to rehearse with simpler kinds of Python formulas and code.
It is a multi-class classification problem (nominal), which may require a more specialized approach.
It has just four attributes and 150 rows, which means it is extremely tiny and effectively fits into memory.
All of the numeric attributes are in the same units and on a similar scale, and no special scaling or transformation is essential in order to begin.
We should begin with a simple program: the "hello world" of machine learning in Python.
During this segment, we will power through an easy and fun coding sample from start to finish.
Here are some of the common steps we will cover during this section:

Installing Python and its libraries
Loading the dataset
Summarizing the dataset
Visualizing the dataset
Evaluating a few algorithms
Making a few predictions

Take as much time as needed. Work through each step.
Try to enter these commands on your own, or copy and paste them for the sake of saving time.
Step 1: Download and Install
Get the Python and SciPy platform onto your computer by downloading it, assuming you do not have it on your machine already.
I would prefer not to cover this in great detail, since there are numerous articles on Google explaining how to do it.
Step 2: The Tutorial
This tutorial works with Python 2.7 or 3.5. Please do not attempt it with any other version, or your results may be skewed.
Here is a list of libraries you need to have installed in order to complete this particular case study/exercise:

matplotlib
pandas
scipy
sklearn
numpy

Keep in mind there are numerous approaches to installing Python libraries. My best guidance for newbies is to pick one technique at this beginning stage and maintain consistency.
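For example, if you already have Python and pip available, a single command installs everything this tutorial needs (the package names here are the standard PyPI distributions; note that sklearn is installed under the name scikit-learn):

pip install numpy scipy matplotlib pandas scikit-learn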
I have done some research on this topic and found that the SciPy page gives phenomenal guidelines for installing and getting introduced to the libraries on numerous distinct platforms, for example Linux, OS X, and of course Windows. This guide is extremely popular, thorough, and widely used around the world.
In case you are using a Windows environment, or you are not certain, I highly suggest that you install a clean and free version of Anaconda3. It is a free, open-source management system that will run on Windows, Linux, and even OS X. It will help you with packaging and distributing your code.
Disclaimer: The example we are going to look at below is based on the assumption that you are using at least version 0.18 of scikit-learn.
Step 3 - Build Your Environment
It is a smart idea to ensure that the Python environment was properly downloaded, set up properly, and is functioning correctly for this sample coding project.
The script below is going to help you test out your working environment as well as report the version of each of the different libraries that are needed in this example.
Open a brand new command line and start your interpreter by typing the following one word:

python

I definitely recommend working exclusively in the interpreter, or composing your programs ahead of time and then running them all on the command line, instead of using large editors and environments. Always keep things incredibly simple and concentrate on the code itself and not on the tools.
You will need to type the following script into the interpreter, line by line.
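A minimal version of such a script (a sketch along standard lines; exact formatting may differ) simply imports each library and prints its version:

# Check the versions of the libraries this tutorial depends on.
import sys
print('Python: {}'.format(sys.version))
import scipy
print('scipy: {}'.format(scipy.__version__))
import numpy
print('numpy: {}'.format(numpy.__version__))
import matplotlib
print('matplotlib: {}'.format(matplotlib.__version__))
import pandas
print('pandas: {}'.format(pandas.__version__))
import sklearn
print('sklearn: {}'.format(sklearn.__version__))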
Did you get output listing your versions? You want your versions to be at least as recent as those required above.
The APIs do not change very fast, so don't be alarmed if you happen to be running a slightly older version; everything should still come out correct for you. However, if you do get any type of error, please don't try to go on. You don't want to be learning to fix your environment when just starting out. If for some reason you can't execute the script without errors, the rest of the steps won't work for you. There are plenty of forums for beginners on the web. Try searching your exact error message, in quotes, to see what the common fix is. If the fix is easy enough for you to handle, then by all means apply it and come back to finish the rest of the steps.
3

REVIEW OF DATA SAMPLES AND VISUALIZATION OF DATA

We will be using Fisher's Iris dataset for this example. This particular dataset is extremely well known around the programming community as the "hello world" of machine learning in Python. It is used in practically all beginner instruction manuals and training guides.
The dataset has 150 different observations of iris blossoms. There are four measurements of each iris, in centimeters. The fifth column is the species of the iris flower; it is established that all of the flowers observed belong to one of three species.
Wikipedia has a great article on the Fisher Iris flower dataset that you might want to consider reading as part of your learning.
So let's get on with the example, shall we? In the very first stage of this process, we have to load the Iris data into Python from CSV format.
Step 4: Load in Parameters
As mentioned, we have to load all of the modules, functions, and objects that are going to be needed to complete this example. It's best to load them all up front so you have them available, to save time when trying to recall them later.
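A hedged sketch of the imports the following steps lean on (pandas for loading, matplotlib for the plots, and pandas' scatter_matrix helper used in the next chapter):

# Load everything this example needs up front.
import matplotlib.pyplot as plt
from pandas import read_csv
from pandas.plotting import scatter_matrix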
Watch and pay particularly close attention for any errors during this step. As mentioned, if you encounter an error at this stage, please do not continue. You have to have your environment set up properly for this step. Re-read the section above if you need a refresher on how to do that; I provided you with a great resource for beginners setting that up.
Step 5: Load in all of the Datasets
Simply load in all of the data we need from the "UCI Machine Learning Repository", which hosts the Iris dataset.
We are utilizing pandas in this example; pandas is critical to any Python-based data work. It's also extremely helpful for examining the data further with detailed stats and visual data representation.
Tip: We are specifically naming each column as we load in the data, so we can easily make sense of it when we want to analyze it later.
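A minimal load along those lines (the URL below is the repository path commonly used for this dataset; swap in a local file name if you prefer):

# Load the Iris dataset straight from the UCI repository, naming each column.
from pandas import read_csv

url = "https://archive.ics.uci.edu/ml/machine-learning-databases/iris/iris.data"
names = ['sepal-length', 'sepal-width', 'petal-length', 'petal-width', 'class']
dataset = read_csv(url, names=names)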
The dataset should load in with no problems; you will encounter issues with loading this data only extremely rarely. It is pretty basic.
In case issues do arise, you can download the Iris file directly into your working directory and load it by utilizing the same method, simply changing the URL to the local file name.
Step 6: The Fun Part
Now, the fun part. This is the part where we investigate and examine the data. In this particular step of the example, you are going to learn how to look at the data in various ways:

The dimensions of the data.
A look at the information itself.
A statistical summary of all attributes.
A breakdown of the data by the class variable.

Try not to stress. These are common commands that you can keep in your own library and use on virtually every future project you do.
Step 7: The Shape Property
The shape property allows us to see the data as rows and columns, similar to an Excel spreadsheet.
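For instance (assuming the dataset variable loaded in Step 5):

# shape: prints (rows, columns) -- for Iris, (150, 5)
print(dataset.shape)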

Look at the information itself.
You always want to look at the information for yourself.

# head
print(dataset.head(20))

Your first 20 rows of data should appear in the output. If something looks wrong, see where you went astray in the previous steps.
Step 8: Analytical Summary
Let's look at each attribute and its summary. This means that we look at the count, the low, mid, and high values, and the individual percentiles.

# descriptions
print(dataset.describe())

Looking at this set of data, do you notice a certain pattern in the numbers? They all range from 0 to 8 centimeters. Can you see that?
Step 9: Class Distribution
We will now investigate the number of instances (rows) that belong to each class. We can view this as an absolute count.
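A pandas one-liner produces the count per class:

# class distribution: number of rows that belong to each species
print(dataset.groupby('class').size())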
You should be able to see that every single species of the flower has the exact same number of occurrences; in this case, it's 50. Can you see that in the output?
4

HOW TO CREATE A DATASET WITH VISUALIZATION

Step 10: Data Visualization

Data visualization is one of the most significant techniques in machine learning and applied statistics. Statistics does indeed focus on quantitative descriptions and evaluations of data; data visualization provides an important set of tools for gaining a qualitative interpretation. This can be helpful when discovering and getting to know a dataset and can help with recognizing patterns, corrupt data, outliers, and a lot more. With a little domain knowledge, data visualizations can be used to express and signify basic and key relationships in charts and plots that are more coherent to you and your stakeholders than measures of association or significance.
Data visualization and exploratory data analysis are whole fields in their own right. Let's look at the basic charts and plots you can use to better understand your data.
The five key plots that you need to know well for basic data visualization are:

Bar Graph
Line Chart or Graph
Histogram Plot
Scatter Plot
Box and Whisker Plot

With knowledge of these plots, you can quickly get a qualitative understanding of most data that you come across.
Data Visualization Libraries
Some of these libraries can be used regardless of the field of application, yet a large number of them are strongly centered on accomplishing a particular task. Eleven interdisciplinary Python data visualization libraries, from most famous to least, are:

Matplotlib
Seaborn
ggplot
Bokeh
Plotly
Pygal
Altair
Geoplotlib
Gleam
Missingno
Leather

You should by now have a pretty good idea of the basics of the dataset. Now we build on that basic foundation and add visualization using plots. In this chapter, we are going to dive into examining two different types of plots:

Univariate plots, to better understand each individual attribute.
Multivariate plots, to better comprehend the connections and relationships between the various attributes.
Step 11 - The Plots (Univariate and Multivariate)

The Univariate Plot
We begin by taking a look at some univariate plots, that is, plots of each possible individual variable.
We will use a "boxplot", which displays the five-number summary of a set of data. This gives us a much clearer idea of the distribution of the data's attributes.
We can likewise make a histogram of each variable in order to get a clear picture of how the input is distributed. It would appear that maybe two of the input variables have a "Gaussian distribution", meaning the values spread around a central mean. This is useful to you as a beginner, because it allows you to use specific algorithms and tests that assume this distribution.
The boxplot is another method to review the distribution of every single attribute. Boxplots summarize the distribution of each attribute, drawing a line for the median (also known as the middle value) as well as a box around the twenty-fifth and 75th percentiles.
Histograms
By looking at histograms, you can get a strong sense of the distribution of each and every one of the different attributes.
Histograms are utilized to group information into collection bins, then supply you with a count of the number of observations in every single bin. From the structure of the bins, you can rapidly get a sense of whether an attribute is skewed, "Gaussian" (normally distributed), or even has an exponential distribution. Histograms can additionally assist you in finding potential outliers.
The Density Plot
One other method of obtaining a rapid understanding of the distribution of each property is "density plots". These plots resemble an "abstracted histogram" with a smooth contour drawn through the peak of every single bin.
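A hedged sketch of all three univariate plots, reusing the dataset and the plt import from Step 4 (the density plot additionally requires scipy to be installed):

# Box-and-whisker plot for each numeric attribute.
dataset.plot(kind='box', subplots=True, layout=(2, 2), sharex=False, sharey=False)
plt.show()

# Histogram of each attribute.
dataset.hist()
plt.show()

# Density plot of each attribute.
dataset.plot(kind='density', subplots=True, layout=(2, 2), sharex=False)
plt.show()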
The Multivariate Plot
In this particular section, there are examples of plots showing the interactions between numerous parameters.
The Scatterplot Matrix
Scatter plots are utilized for locating structured relationships between parameters, such as whether you could summarize the connection between 2 variables using a line. Attributes that have structured relationships may be correlated and are effective nominees for elimination from your dataset as well.
A scatterplot illustrates the connection between 2 factors as dots in 2 dimensions. Drawing each of these scatterplots collectively is recognized as a scatterplot matrix: a person can easily generate a scatterplot for every pair of properties in their data, which is practical for detecting the pairwise relationships from a variety of viewpoints. Because there is very little point in drawing a scatterplot of a single factor against itself, the diagonal of the matrix is typically given over to showing the distribution of each variable instead.
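pandas ships a helper that draws the whole matrix in one call (a sketch, reusing the dataset and plt from earlier steps):

# Scatterplot matrix: one scatterplot per pair of attributes.
from pandas.plotting import scatter_matrix

scatter_matrix(dataset)
plt.show()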
Correlation Matrix Plot
The correlation between each pair of attributes can be calculated, giving a correlation matrix. To get a good grasp of which variables hold a large correlation with one another, plotting the correlation matrix is a good idea.
Correlation provides an indicator of how associated the changes are between 2 variables. If the 2 variables shift in conflicting directions together (one going downward, one going upward), that means that they are negatively correlated. If 2 variables shift in a matching direction, then that means that they are positively correlated.
If there are a lot of exceedingly correlated input variables in your data, then this is very important to acknowledge, for the reason that several machine learning algorithms, such as linear and logistic regression, can suffer poor performance as a result.
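A hedged way to draw it with matplotlib (the class column is dropped first, since correlation is only defined for the numeric attributes):

# Correlation matrix plot for the four numeric attributes.
correlations = dataset.drop(columns=['class']).corr()
plt.matshow(correlations, vmin=-1, vmax=1)
plt.colorbar()
plt.show()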
Line Chart
A line graph or line chart exhibits information as a series of data points, called 'markers', joined by straight line segments. It's a primary type of chart used in various fields. It is like a scatter plot, except that the data points are ordered (particularly by their x-axis value) and fused with straight line segments. Line charts are frequently used to visualize an inclination in the data over periods of time, with the line drawn chronologically; in these types of cases, they are known as run charts.
Bar Graph
What is a bar graph for?
A bar graph or chart showcases categorical data through rectangular bars with lengths or heights that correspond to the values they signify. These bars are drawn either horizontally or vertically.
In a nutshell, if you want to visualize and showcase data from distinct groups set against each other in a decent manner, then bar graphs are of good use.
I think it's important for a beginner that we take a look at the differences between the provided variables.
5

MAKING PREDICTIONS WITH ALGORITHMS

How are you feeling at this stage? Have you encountered any errors? Are you feeling as if you have gained some new knowledge?
Hopefully, things are going smoothly and you are grasping the concepts well at this point. Let's take a look at how to make predictions with algorithms in Python and what it means.

WHAT IS PREDICTIVE ANALYTICS?


"Predictive analytics" is regularly talked about in the context of engineering data, for instance data originating from instruments, specific sensors, and connected systems in the real world. Business data at an organization, for example, may incorporate transaction data, sales results, customer complaints, and marketing data. Progressively, organizations settle on data-driven choices in light of this important aggregation of data.
With a significant growth in competition, organizations look for a competitive advantage in bringing items and services to open markets. Data-focused models typically enable organizations to take care of long-standing issues in creative and unique ways.
Manufacturers, for instance, often find it difficult to innovate in hardware alone. Product designers can add predictive abilities to existing solutions for increased value to the customer. Utilizing predictive analytics for hardware upkeep, or predictive maintenance, can anticipate equipment failures, forecast energy needs accurately, and decrease operating expenses. For instance, sensors that measure certain wave patterns and vibrations in car parts can, in turn, flag the need for upkeep before the car flops during use by an actual consumer.
Additionally, organizations utilize predictive analytics to make more exact forecasts, for example estimating the demand for power on the electrical grids. These figures enable companies to do asset planning, like bringing other power plants online, in order to be more efficient and effective.
To extract increased value from all of the information, organizations apply algorithms to vast data sets, utilizing new and upcoming technology tools, for example Hadoop. The data sources may comprise transactional databases, hardware log files, pictures, audio/video, sensor data, or a number of other kinds of information. True innovation is often the result of using and combining data from a variety of different sources.
With this information, these technologies are of critical importance in the discovery of trends and patterns. Machine learning methods are utilized to discover commonalities and patterns in the data and to estimate what the outcomes are going to be.
What does predictive analytics do? What does it mean?
Predictive analytics helps groups in different job roles, ranging from finance to healthcare workers in the pharmacy industries to automobiles. This particular analytical process is how we utilize the data that we have analyzed in order to make viable guesses, which are largely based on the analyzed information.
Whew! Don't panic.
The great thing is that this process rests on a predictive model, which allows for a systematic approach to delivering outcomes based on a certain set of common criteria.
To define what "predictive analytics" means: this process involves applying a statistical approach based on Python machine learning strategies and models, which creates realistic and measurable estimations and predictions about future outcomes. Regularly, Python machine learning techniques are used in real-world problem-solving. For example, they are commonly used to estimate the value of something in the near future, such as: "How long can my word processor run before it needs to be replaced or requires routine maintenance?"
Constructed on a set of criteria, predictive analytics can also be used to anticipate certain customer behaviors. A great number of banks and financial institutions use this to determine the creditworthiness of their customers, how likely they are to default on their mortgage or car loan, or the probability of excessive overdrafts each month. It's pretty amazing.
Predictive analytics is primarily used to help companies and organizations make future predictions and meet certain goals. Think about the most common goals of any business: stay in business, make money, and reduce excess waste through the analyzing of data, methods that decrease expenses, and the ability to offer employee bonuses if goals are met. To do something of this scale does require an extensive amount of various data types, fed into pre-built models that will ultimately generate concise, measurable, and most importantly achievable outcomes, to maintain a positive bottom line and support growth.
In order to make this click, let's look back at what we said "predictive analytics" is and what it's for, as it relates to some real-world examples. These are not all-inclusive by any means, and more can be found using a simple Google search.
Real World Examples of Predictive Analytics:
The Car Industry–Breakthrough technology in cars gathers specific details and information regarding how fast the car is going, how far the car has traveled, its emission levels, and the behavior of its driver, and feeds them into extremely sophisticated predictive analysis models. This allows analysts to release extremely beneficial data for car manufacturers, insurance companies, and racing circles.
Aviation–Determining the viability and health of an aircraft is an application developed by aviation engineers; it has helped improve aircraft performance and reduce the costs to maintain and repair them. This particular application is used to test performance in every critical function of the plane, from the take-off, to the control systems, all the way to the efficiency of the fuel and maximum take-off conditions.
The Production of Energy–Electricity companies use predictive analytics in order to determine the cost of and demand for electrical supply. There are a ton of extremely sophisticated models that forecast capacity, usage patterns (future and past), changes in weather, and many other factors.
Accounting and Financial Services–The Development of credit risk models
is a prime example of predictive analytics in the real world. Nowadays,
banks, credit unions, and many other financial institutions use these models
and applications in order to determine a customer or potential client’s credit
risk.
Equipment and Machine Manufacturing–Testing for and predicting future machine weaknesses and failures. This particular application is used to help improve the efficiency of assembly lines and the production of large equipment and machines, while at the same time optimizing operations and workforce.
Modern Medicine–This is last on the list, but certainly not least. Predictive
analysis has been used in modern medicine to detect infections and common
diseases and even pre-existing conditions. It’s also a great way to bridge the
communication gaps between those in the medical profession.
Pretty cool, huh? Can you find more ways that predictive analysis is used in
real-world situations to improve our life, our economy, and our businesses?

WORKFLOW IN PREDICTIVE ANALYTICS:

You may or may not be familiar with predictive models at this stage of your learning, but you can think of a real-world example: what meteorologists use in day-to-day weather forecasting.
A basic industry utilization of predictive models relates to any circuit that consumes power, allowing a prediction to be made about the demand for power; as it relates here, energy. For this model, network administrators and brokers require precise forecasts of each circuit's load to make important choices for integrating them into the electrical grid framework. Huge amounts of information are easily accessible, and utilizing predictive analytics allows grid administrators to transform this data into noteworthy bits of knowledge that can be used to make important decisions and predictions.
Typically, a simple workflow for a predictive analytics model will follow these basic steps outlined here:

Import data from varied sources, for example web archives, databases, and spreadsheets. The data sources here include energy load data in a CSV file and national climate data showing temperatures and varying dew points.
Clean the data by removing anomalies and joining data sources. Identify data spikes, pinpoint missing data, and remove bizarre values from the data.
Make a single table including energy load, temperature, and dew point.
Build a precise data model in light of the accumulated data. Predicting any type of energy demand is a perplexing procedure with numerous factors, so you may utilize neural networks to assemble and prepare a predictive model.
Train the model on your data set, practicing with diverse strategies. When the training is finished, you can try the model against new data to see how well it performs.
Integrate the model into a load-forecasting system in a production type of environment. When you locate a model that precisely forecasts the outcomes, you can move it into your production framework, making the analysis accessible to software or gadgets, including web applications, servers, or smartphones.

A compressed code sketch of these steps follows below.
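In scikit-learn terms, the loop might look like this; the file name and column names here are hypothetical stand-ins for the merged energy-load table described above:

from pandas import read_csv
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPRegressor
from sklearn.metrics import mean_absolute_error

# Hypothetical merged table: energy load plus weather predictors.
data = read_csv('energy_load.csv')
X = data[['temperature', 'dew_point']]
y = data['load']

# Hold out data to see how the trained model performs on new inputs.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)
model = MLPRegressor(max_iter=1000).fit(X_train, y_train)
print(mean_absolute_error(y_test, model.predict(X_test)))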

Your aggregated data tells a tale that is certainly complex. To withdraw the insights, you are going to need an extremely accurate predictive design. This might not be the best step for a beginner; nonetheless, it is here for reference, to give the entire picture of Python's capabilities.
Predictive analytics is built on major mathematical models that predict an event or result. These designs forecast the desired outcome at some future time based on modifications made to the data inputs. Using a repetitive procedure, you create the models by choosing a training data set, and you then proceed to test and further validate the models. After you examine a model to ascertain its reliability and accuracy in predicting forecasts, play around with different methods until you find one that is comfortable for you. The important thing is that you choose one that you can understand, learn, and apply without much effort.
To give you an idea of a good example of such a method, you can try a time-series regression model for predicting low and even high levels of flight traffic or fuel use. Such a model predicts based on a linear relationship between time and load, and it continues to be extremely beneficial in real-world estimation models and forecasts.

WHAT IS THE DIFFERENCE BETWEEN PREDICTIVE ANALYTICS & PRESCRIPTIVE ANALYTICS?
Businesses that have been able to successfully implement predictive analytics have a competitive advantage regarding the problems, situations, and good things of the future. Predictive analytics is a process that creates an estimation of what will happen next, literally. It also gives you simple guidance about how to make high-level decisions in a way that maximizes information you wouldn't otherwise have access to.
Prescriptive analytics is a branch of data analytics that makes use of predictive models to recommend the most ideal outcomes. The prescriptive approach depends on optimization and rules-based procedures for decision-making at the most basic of levels. Anticipating any issues or strains on the framework is absolutely essential in the decision-making process: what is to be done is based on the prediction.
6

EXAMPLES OF CODING

LOOPS

Loops are generally utilized whenever a program needs to repeat a process more than once. This particular process is referred to as 'iteration', and Python gives you two loops: one is the 'for' loop and the other is called the 'while' loop. The first sketch below is a representation of the 'for' loop, and the second is the easier of the two, the 'while' loop.
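A minimal sketch of each (the counting examples here are illustrative):

# A 'for' loop runs once for every item produced by a sequence.
for i in range(3):
    print('for-loop pass', i)

# A 'while' loop keeps running as long as its condition stays true.
count = 0
while count < 3:
    print('while-loop pass', count)
    count += 1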

WORKING WITH NUMBERS

As an exercise, develop a routine that writes zeros directly into memory. The start address is given at address 0x80 and the number of words to write is given at address 0x84. We assume the start address is word-aligned and the number of words to write is greater than zero.
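The exercise reads like a low-level, assembly-style drill, but the same idea can be sketched in Python with a list standing in for memory; the start index and word count below are illustrative values rather than ones actually read from addresses 0x80 and 0x84:

# Pretend memory: 64 one-word slots, initially non-zero.
memory = [0xFF] * 64
start = 8       # word index derived from the start address
count = 5       # number of words to zero; assumed greater than zero

for i in range(start, start + count):
    memory[i] = 0

print(memory[start - 1 : start + count + 1])  # zeros framed by untouched words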

WORKING WITH STRINGS

Strings in Python tend to be a contiguous sequence of characters delimited by single or double quotation marks. Python doesn't possess any kind of distinct data type for a single character; a character is simply portrayed as a string of length one.
Creating strings
A string is essentially a sequence of characters, and a character is simply a symbol; for example, the English language has 26 basic characters.
Computer systems do not deal with characters directly. They deal with actual numbers (binary). Even though you see characters on your display screen, internally they are stored and analyzed as series and combinations of zeros (0) and ones (1).
This conversion of a character to a number is called encoding, and the reverse process is called decoding. ASCII and Unicode tend to be the most favored among users, and especially beginners; as it relates to Python, strings are, under the hood, sequences of Unicode characters. Unicode was originally created to include the characters of all languages considered and bring consistency to encoding. You can take in additional about Unicode separately.
Strings as a Python Feature:
Strings can be made by enclosing characters inside single quotation marks or double quotes; which you use is up to you. Triple quotes, for the most part, are typically used in Python to represent multi-line strings and docstrings.
When you run the program, if everything is executed properly, your output will show the strings you created.
There are numerous operations that can be performed with strings, which makes the string a standout among the most utilized data types in Python.
Concatenation of Two or More Strings - Joining at least two strings into a solitary one is called "concatenation".
The "+" operator allows you to compose 2 string literals together and join them.
The "*" operator can be utilized to repeat the string a set number of times.
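Both operators in a short sketch:

greeting = 'Hello'        # single quotes
name = "Python"           # double quotes work the same way
note = """Triple quotes
span multiple lines."""

print(greeting + ' ' + name)   # concatenation with +: Hello Python
print(greeting * 3)            # repetition with *: HelloHelloHello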

TYPE CONVERSION
Python has many data types. You have already seen and worked with some of them. You have floats and integers to interact with numerical values; boolean (bool) is used to interact with true/false values; and strings to work with alphanumeric characters. You can make use of tuples, lists, dictionaries, and sets, which are data structures where you can store a large collection of values. To learn more about them, be sure to check out the DataCamp Data Types for Data Science course.
Implicit and Explicit Data Type Conversion
Data conversion in Python can happen in two ways: either you tell the interpreter to convert a data type to some other type explicitly, or the interpreter understands this by itself and does it for you. In the former case, you're performing an explicit data type conversion, whereas in the latter, you're doing an implicit data type conversion.
Type conversion is the conversion of one piece of information into another type of information. Implicit type conversion is performed automatically by the Python interpreter. You don't have to do anything different, and Python is known for not losing any of the data in this kind of conversion.
Typecasting is the explicit example of type conversion. This particular process transforms the data types of objects using a set of criteria that you or the end user defines. Unfortunately, loss of data is common in this type of conversion, because a value is forced into a specific type rather than simply widened.
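A short sketch of both kinds of conversion, including the data loss that typecasting can cause:

# Implicit: Python widens the int to a float automatically.
total = 3 + 0.5
print(total, type(total))   # 3.5 <class 'float'>

# Explicit (typecasting): you request the conversion yourself.
count = int("42")           # str -> int
print(count + 1)            # 43

# Explicit conversion can lose information:
print(int(3.9))             # 3 -- the fractional part is discarded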
7

DECISION TREE

WHAT IS A DECISION TREE?

The simple definition of a decision tree is that it's like a flow chart or a diagram that manifests the different results from a set or series of decisions. It can also be utilized for many other purposes, such as a decision-making device, for research evaluation, or perhaps for creating a strategy. A primary benefit of utilizing a decision tree is that it is very painless and straightforward to follow and comprehend.
For banks to figure out if they should offer a person a loan or not, they will often work through a list of questions to see if the person would be safe to give the loan to. These types of questions could simply start like, "What kind of income do you have?" If the answer is between $30,000 and $70,000, they will continue on to the following question: "How long have you worked at your current job?" If the answer is one to five years, they will continue on to the next question: "Do you make regular credit card payments?" If the answer is yes, then they will offer the loan, and if not, the loan is declined. This is the most basic decision tree.
A decision tree is pretty much just a non-parametric machine learning modeling technique that is used for classification and regression problems. In order to find solutions, a decision tree makes hierarchical, sequential decisions about the outcome variable based on the data.
And this means what?
Hierarchical refers to the model that is defined as a series of questions that
will lead to a label or value once it has been applied to an observation. After
it is set up, this model will work like a protocol using a bunch of “if this
happens, then this will happen” conditions that will give a certain result from
the data that was added.
A non-parametric method means that there won't be an underlying assumption concerning the distribution of the data or errors. This basically means that your model will be created by using observed data.
Decision trees that use a discrete value set for the target variable are called classification trees. With these types of trees, the nodes, or leaves, are representations of class labels, and the branches show the feature conjunctions that lead to the class. Decision trees whose target variables take continuous values, typically numbers, are referred to as regression trees. Together, these two types of decision trees are called CART.
Each one of these models is a case of a Directed Acyclic Graph. These graphs have nodes that show a decision point about the target variable, with edges and predictors between each node. If we continue with the loan scenario, "$30,000 to $70,000" would represent an edge, and "Years at present job" would be a node.
The main goal of the decision tree is to make the best choice at the end of each node, so it needs an algorithm that can do that. This is known as Hunt's algorithm, which works recursively and greedily. Greedy means it makes the best choice at each step, and recursive means that it splits big questions into smaller ones. The decision to split a node is made according to a purity metric. A node is considered 100% impure if it is split 50/50, and 100% pure if all of its data belongs to one class.
To make sure that the model is optimized, you need to reach maximum purity and stay away from impurity. You do this by using the Gini impurity, which measures how often a randomly chosen element would end up labeled wrong if it were labeled randomly according to the node's distribution. It is computed by summing, over each label i, the probability p_i of picking an element with that label multiplied by the probability of miscategorizing it, which works out to Gini = 1 - sum(p_i^2). The goal is to reach 0, where the node is pure.
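Here is a minimal sketch of that calculation in Python (the probabilities are made up for illustration):

def gini_impurity(class_probs):
    # Gini = 1 - sum(p_i^2): the chance a random element is mislabeled
    # if labeled randomly according to the node's label distribution
    return 1 - sum(p ** 2 for p in class_probs)

print(gini_impurity([0.5, 0.5]))   # 0.5 - a 50/50 split, maximally impure for two classes
print(gini_impurity([1.0, 0.0]))   # 0.0 - a pure node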
The other metric it uses is information gain. This is for deciding which feature to split on at each step of the tree. It can be figured out using this equation:
"Information Gain = Entropy(parent) – Weighted Sum of Entropy(children)"
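As a rough sketch of how that equation might be computed (the class counts in this toy split are made up for illustration):

import math

def entropy(labels):
    # H = -sum(p * log2(p)) over the classes present in the node
    total = len(labels)
    return -sum((labels.count(c) / total) * math.log2(labels.count(c) / total)
                for c in set(labels))

parent = ["yes"] * 5 + ["no"] * 5
left = ["yes"] * 4 + ["no"]        # one candidate split of the parent
right = ["yes"] + ["no"] * 4

weighted = (len(left) / len(parent)) * entropy(left) \
         + (len(right) / len(parent)) * entropy(right)
print(entropy(parent) - weighted)  # the information gain of this split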
This is a pretty good model, but it presents a problem, because it only stops once all of the information has been distributed into a single attribute or class. At the cost of bias, the model's variance becomes huge, which leads to overfitting. This can be fought by setting a max depth for your tree, or alternatively by specifying the minimum number of points that are needed to make a split.
Decision trees come with advantages and disadvantages. On the downside, they are greedy algorithms that are locally optimized, and a return to the globally optimal tree isn't guaranteed. On the positive side, they are super simple to understand because they have a visual representation that doesn't require all that much data.
Start with a Goal and a Mindset, Then Make a Plan
Every tremendous decision tree begins with a goal: figuring out exactly what you want to achieve with your tree. Whatever the purpose may be, you'll have to start off with a plan to get the maximum positive and expected results.
No matter the purpose, it's important to identify it right off the bat; from there, you'll be better able to break down a plan of action. Building and connecting your nodes is much easier and more efficient if you put an outline together.

UTILIZE THE TECHNIQUE THAT FITS FOR YOUR NEEDS


Once your goals are thoroughly planned, you can explore the Zingtree Gallery to gather inspiration from existing decision trees, and even copy their structure. There are a few ways to create one, to suit everybody:

Wizard: Fill out simple forms that give the questions, answers, and messaging for your tree, with prompts for more in-depth info.
Overview: Start your decision tree entirely from scratch (for the more experienced or adventurous!).
Designer: Draw your decision tree, using your imagination to create each node easily and simply, along with their connections and the expected navigational flow.

STRUCTURE OF A DECISION TREE


Leaf nodes, branches, and a root node are the three main parts of a decision tree. The root node is the beginning of the tree, and both it and the internal nodes contain the criteria or questions to be answered. Every node characteristically has two or more nodes extending from it. Branches are the arrows connecting nodes, showing the movement from question to answer. Say the question in the first node needs a "yes" or "no" response; there will be one leaf node for the "no" response and another node for the "yes".

USES OF A DECISION TREE


A decision tree can be utilized in two manners: a descriptive manner or a predictive manner. In both situations, they are mostly utilized to picture all probable results and decision points that transpire chronologically, and they are generated in an identical way. In the financial world, decision trees are mostly utilized for tasks such as portfolio management, spending, and loan approval. When it comes to defining a new market for an existing product or inspecting the validity and reliability of a new product, a decision tree is useful. A decision tree will typically provide you with a more accurate read than a logistic regression model.

BAGGING OR BOOTSTRAP AGGREGATING


Bagging involves making several models of one algorithm, like a decision tree, each trained on a different bootstrap sample. Since bootstrapping involves sampling with replacement, some of your data won't be used in every tree.
The decision trees are created with different samples, which helps to solve the problem of overfitting to the sample. Decision trees created in this way help to lower the total error, since the variance continues to fall with every tree that is added, without increasing the bias.
A random forest is a bag of decision trees that uses subspace sampling: only a random selection of the tree's features is considered at the split of each node, which decorrelates the trees in your forest.
Random forests also have their own built-in validation tool. Since only a percentage of the data gets used for every model, performance error can be estimated using the roughly 37% of samples that were left out of each model (the out-of-bag samples).
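As a minimal sketch of that built-in validation, using scikit-learn's RandomForestClassifier on the bundled iris dataset (the parameter values are illustrative, not from the text):

from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier

X, y = load_iris(return_X_y=True)

# oob_score=True evaluates each tree on the samples it never saw during training
forest = RandomForestClassifier(n_estimators=100, oob_score=True, random_state=0)
forest.fit(X, y)
print(forest.oob_score_)   # out-of-bag accuracy estimate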
This was only a basic rundown of some statistical properties that are helpful
in data science. While some data science teams will only run algorithms in R
and Python libraries, it’s still important to have an understanding of these
small areas of data science. They will make for easier abstraction and
manipulation.

WHAT IS A DECISION TREE IN PYTHON?


This is a supervised, non-parametric learning method utilized mostly for regression and classification. The decision tree's purpose is to learn simple decision rules deduced from the characteristics of the provided data, in order to produce a model that foresees the value of a target variable. Decision trees belong to the information-based learning algorithms, which use various measures of information gain for learning. We can utilize decision trees for problems where we have categorical input features and a continuous or categorical target feature. The main idea of decision trees is to find the descriptive features which contain the most "information" regarding the target feature, and then split the dataset along the values of these features such that the target feature values of the resulting sub-datasets are as pure as possible. The feature which splits the target feature most purely is said to be the most informative one.
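As a rough, minimal sketch of this in practice, here is scikit-learn's DecisionTreeClassifier fitted on the bundled iris dataset (the dataset and parameter choices are illustrative, not from the text):

from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# criterion="entropy" splits by information gain; max_depth guards against overfitting
tree = DecisionTreeClassifier(criterion="entropy", max_depth=3)
tree.fit(X_train, y_train)
print(tree.score(X_test, y_test))   # accuracy on unseen data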

HOW CAN WE CREATE A TREE MODEL OURSELVES?


To answer that question, we should recapitulate what we try to achieve by using a decision tree model. Given a dataset, we want to train a model which learns the relationship between the descriptive features and a target feature, so that we can present the model with a new set of query instances and have it compute the target feature values for those query instances. Let us further recapitulate the basic shape of a decision tree. We know that at the very end of the tree we have leaf nodes which contain (in this case) target feature values. To make this more concrete, consider a practical example: the Zoo Animal Classification dataset, which includes properties of animals as descriptive features and the animal species as the target feature. For example, the classification of animals as being reptiles or mammals is based on whether they have legs, are toothed, and how they breed.
Decision Trees Advantages:

Easy to explain and understand. Trees can also be visualized.

The cost of using the tree for prediction is logarithmic in the number of data points used to train the tree.
Requires very little data preparation. Other techniques often require data normalization, blank values to be removed, and dummy variables to be generated; note, however, that missing or invalid values aren't supported by this module.
Decision trees can handle both numerical and categorical types of data. Other methods can typically only analyze datasets that have one type of variable.
By utilizing statistical tests, it is possible to validate a model, which makes the model accountable and dependable.
Trees use a white box model: when a given situation is observable in a model, the interpretation of that situation can be stated in boolean logic. In a black box model, by contrast, results may be more difficult to understand.
Performance is good even if the assumptions are somewhat violated by the real model from which the data were produced.
Decision trees are capable of handling multi-output problems as well.

Decision Trees Disadvantages:


Minor variations in the data can result in an entirely different tree being produced, so decision trees can be unstable. By utilizing decision trees within an ensemble, this problem is diminished.
Decision-tree learners can produce overly complicated trees that don't generalize the data well. This is known as overfitting.
Learning an optimal decision tree is an NP-complete problem under several definitions of optimality, even for simple concepts. For this reason, practical decision-tree learning algorithms are based on heuristics, for example greedy algorithms in which locally optimal decisions are made at each node. These algorithms are not able to guarantee the return of the globally optimal decision tree. To reduce this problem, we can train multiple trees with an ensemble learner, in which the samples and features are randomly sampled with replacement.
Complexity of Decision Trees
Decision tree learners create biased trees if some classes dominate. For this reason, it is suggested to balance the dataset prior to fitting the decision tree.
There are a few concepts that are very hard and complex to learn, because decision trees don't express them easily, which makes them even harder to understand, e.g. XOR, multiplexer, or parity problems.
On the whole, the run-time cost to construct a balanced binary tree is O(n_samples × n_features × log(n_samples)), and the query time is O(log(n_samples)). Even though the tree-building algorithm tries to produce balanced trees, they won't be balanced all the time. Assuming the subtrees remain roughly balanced, the cost at each node comes from searching through the features to find the one that provides the biggest entropy reduction. This incurs a cost at each node, which leads to a total cost over the whole tree obtained by adding up the cost at every node.

ID3, C4.5, C5.0, AND CART–TREE ALGORITHMS:


Ross Quinlan created Iterative Dichotomiser 3 (ID3) in 1986. The algorithm creates a multiway tree, finding for each node, in a greedy fashion, the categorical feature that will yield the largest information gain for categorical targets. Trees are grown to their maximum size, and then a pruning step is usually applied to improve the ability of the tree to generalize to unseen data.
C4.5 came after ID3 and removed the restriction that features must be categorical, by dynamically defining a discrete attribute that partitions the continuous attribute values into a discrete set of intervals. C4.5 converts the trained trees (i.e., the output of the ID3 algorithm) into sets of if-then rules. The accuracy of each rule is then evaluated to determine the order in which they should be applied. Pruning here means eliminating a rule's precondition if the accuracy of the rule improves without it.
C5.0 is Quinlan's newest version, released under a proprietary license. It creates smaller rule sets and needs less memory than C4.5 while being more accurate.
CART is very similar to C4.5, but it differs in that it supports numerical target variables (regression) and does not compute rule sets. It builds binary trees by utilizing the feature and threshold that yield the largest information gain at each node.

MORE ISSUES AND VARIATION


Something that wasn't mentioned above is how to grow a tree when the descriptive features are continuously scaled rather than categorically scaled. This doesn't alter the approach above that greatly, but there is one noticeable change: we can use a continuously scaled feature numerous times throughout the growing of the tree, and we have to split on the mean or mode of a feature with respect to the values of the target feature instead of on the single (categorical) feature values. The single values can no longer be used, since there is now an unbounded number of distinct attainable values.
The second important variation is when we no longer have a categorically scaled but a continuously scaled target feature. In this case, we call it a regression tree model instead of a classification tree model. Here we use, for example, the variance of a feature with respect to the target feature as the splitting criterion rather than information gain, and we choose the feature with the least weighted variance as the splitting feature.
An alternative way to maximize the accuracy of a tree model is to use an ensemble approach: we create several tree models from the original dataset, predict the target values for the test dataset using each of the developed models, and then return the target value that was predicted by most of the models. Bagging and boosting are important applications that are ideal for making decision tree ensemble models. A random forest model, a bagging-based ensemble of decision trees, is one of the most useful machine learning algorithms known. Ensemble models can also be created by utilizing different splitting criteria for the single models, such as the information gain ratio.
We've seen a great deal of different approaches and variations to decision tree models by now. However, there's no general rule as to which one should be used or which approach is best. Most often, it comes down to discovering the most fitting model for a particular problem. Don't be afraid to try different models with different parameters. Nonetheless, ensemble models such as the random forest algorithm are established as dominant models.

REGRESSION TREES
The previous chapter on decision trees introduced the basic concepts underlying decision tree models and showed how you can build tree models with Python from scratch. We also covered the advantages and disadvantages of decision tree models, as well as important extensions and variations. One drawback of classification decision trees is that they need a target feature that is categorically scaled, for example Days = {Sunday, Monday, Tuesday, Wednesday, Thursday, Friday, Saturday}.
Here a problem arises: what if we want our tree to predict, for instance, the price of a house given some descriptive feature attributes like the number of rooms and the location? In this situation, the values of the target feature are continuous and no longer categorically scaled; conceptually, a house can have an unlimited number of prices.
That's where regression trees come into action. Regression trees work on the same principle as decision trees, with the big difference that the target feature values can now take on an infinite number of continuously scaled values. Hence, the task is now to predict the value of a continuously scaled target feature Y given the values of a set of categorically (or continuously) scaled descriptive features X.
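As a minimal sketch of this with scikit-learn's DecisionTreeRegressor (the housing-style data here is entirely made up for illustration):

import numpy as np
from sklearn.tree import DecisionTreeRegressor

# Hypothetical data: [number of rooms, location code] -> house price
X = np.array([[2, 0], [3, 0], [3, 1], [4, 1], [5, 1]])
y = np.array([150000, 200000, 240000, 300000, 360000])

reg = DecisionTreeRegressor(max_depth=2)
reg.fit(X, y)
print(reg.predict([[4, 0]]))   # a continuous price prediction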

WHAT ARE RANDOM FORESTS?


Tree models are expected to be low-bias, high-variance models; in consequence, they are prone to overfit the training data. This is easy to see if we recapitulate what a tree model does when we do not prune it or introduce early stopping criteria like a minimum number of instances per leaf node. It attempts to split the data along the features until the instances are pure with respect to the value of the target feature, there is no data left, or there are no features left to split the dataset on. If one of the above holds true, we grow a leaf node. The consequence is that the tree model is grown to its maximal depth, and thereby tries to fit the training data as precisely as possible, which can easily lead to overfitting. Another drawback of classical tree models like ID3 or CART is that they are relatively unstable. This instability can lead to a situation where a small change in the composition of the dataset leads to a completely different tree model.
First, think of a case where a categorically scaled feature "A" is used as the root node feature. This feature is then removed from the dataset and is no longer present in the sub-trees. Now imagine a situation in which we replace a single row in the dataset, and this change leads to a new feature "B" having the biggest information gain or variance reduction, respectively. What does this actually mean? Well, feature "B" now takes priority over feature "A" as the root node feature, which leads to an entirely different tree, just because we have altered one single instance in the dataset. This situation can occur at all interior nodes of the tree, not just the root node.
8

NEURAL NETWORKS

Neural networks, which are sometimes referred to as Artificial Neural Networks, are a machine learning simulation of the human brain's functionality. You should understand that neural networks don't provide a solution for every problem that comes up, but instead provide the best results, together with several other techniques, for various machine learning tasks. The most common uses of neural networks are classification and clustering; they can also be used for regression, but there are better methods for that.
A neuron is the building unit of a neural network, and it works like a human neuron. A typical neural network uses a sigmoid function. This is used because its derivative can be conveniently written in terms of f(x) itself, which works great for minimizing error.
Even though it has found new fame, the idea of neural networks isn't actually new. In 1958 the psychologist Frank Rosenblatt tried to create "a machine which senses, recognizes, remembers, and responds like the human mind," and he named his creation the Perceptron. He didn't come up with this out of thin air; his work was inspired by the work of Warren McCulloch and Walter Pitts from the 1940s.
Let’s look at what a perceptron is. Dendrites are extensions that come off the
nerve cell. These are what get the signals and they then send them onto the
cell body, which processes the stimulus and then will make a decision to
either trigger a signal or not. When a cell chooses to trigger a signal, the cell
body extension, known as an axon, will trigger a chemical transmission at its
end to a different cell. There is no need to feel like you have to memorize any
of this. We aren’t actually studying neuroscience, so you only need a vague
impression of how this works.
A perceptron looks similar to an actual neuron because it was inspired by the way actual neurons work. Keep in mind, it was only inspired by a neuron and in no way acts exactly like a real one. The way a perceptron processes data is as follows:

1. There are small circles on the left side of the perceptron; these are the "neurons," with subscripts 1, 2, … , m, and they carry the data input.
2. Each input is multiplied by a weight, labeled with a subscript 1, 2, … , m, and travels along a long arrow, the synapse, to the big circle in the middle. So you will have w1 * x1, w2 * x2, w3 * x3 and so on.
3. After all of the inputs have been multiplied by their weights, you sum them all up and add a pre-determined bias.
4. The result is then pushed to the right, where the step function is applied. All this tells you is that if the number you get from step three is greater than or equal to zero, you receive a one as your output; otherwise, if the result is lower than zero, the output is zero.
5. You get an output of either zero or one.

If you were to move the bias to the right side of the activation function, as in "sum(wx) ≥ -b", then -b would be known as the threshold value. With this, if the sum is higher than or equal to your threshold, the activation fires a one; otherwise, it outputs a zero. Pick whichever representation helps you understand the process, because the two are interchangeable.
Now you have a pretty good understanding of how a perceptron works. All it's made up of is some mechanical multiplications, followed by a summation, and then an activation that gives you an output.
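Here is a minimal sketch of that mechanism in Python (the function name and the example numbers are made up for illustration):

def perceptron(inputs, weights, bias):
    # weighted sum of the inputs plus the bias, then the step activation
    total = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1 if total >= 0 else 0

print(perceptron([1, 0], [0.6, 0.4], -0.5))   # 1, because 0.6 - 0.5 >= 0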
Just to make sure that you fully understand this, let's have a look at a super simple example that is likely not realistic at all. Let's assume that you have
found extreme motivation after you have read this book and you have to
decide if you are going to study deep learning or not. You have three major
factors that will help you to make your decision:

1. Will you be able to make more money once you master deep
learning: 0 – No, 1 – Yes.
2. Is the needed programming and mathematics simple: 0 – No, 1 –
Yes.
3. You are able to use deep learning immediately and not have to get an
expensive GPU: 0 – No, 1 – Yes.

Our input variables will be x1, x2, and x3 for all of the factors and we’ll give
them each a binary value since they are all simple yes or no questions. Let’s
assume that you really love deep learning and you are now ready to work
through your lifelong fear of programming and math. You also have some
money put away to invest in the expensive Nvidia GPU that will train the
deep learning model. You can assume that both of these have the same
importance, because they can both be compromised on. But, you really want
to be able to make extra money once you have spent all of the energy and
time into learning about deep learning. Since you have a higher expectation
of ROI, if you can’t make more moolah, you aren’t going to waste your time
on learning deep learning.
Now that you have a decent understanding of the decision preferences, we
can assume that you have a 100 percent probability of making extra cash
once you have learned deep learning because there’s plenty of demand for
not much supply. That means x1 = 1. Let’s assume that programming and
math are extremely hard. That means x2 = 0. Finally, let’s assume that you
are going to need a powerful GPU such as a Titan X. That means x3 = 0.
Okay, now that you have the inputs, you can initialize the weights. We're going to try w1 = 8, w2 = 3, w3 = 3. The higher the value of the weight, the bigger the influence the corresponding input has. Since the money you will make is more important to your decision to learn deep learning, w1 is greater than w2 and w1 is greater than w3.
Let's say that the value of the threshold is five, which equals a bias of negative five. We add everything together and add in the bias term. Since the threshold value is five, you will decide to learn deep learning only if you are going to make more money. Even if the math turns out to be easy and you aren't going to have to buy a GPU, you won't study deep learning unless you are able to make extra money later on.
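Plugging this chapter's numbers into a perceptron sketch like the one above confirms the decision:

weights = [8, 3, 3]    # money matters most
bias = -5              # a threshold of five
x = [1, 0, 0]          # more money: yes; easy math: no; no GPU needed: no

total = sum(w * xi for w, xi in zip(weights, x)) + bias
print(total, 1 if total >= 0 else 0)   # 3 1 -> study deep learning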
Now you have a decent understanding of bias and threshold. With a threshold
as high as five, that means the main factor has to be satisfied in order for you
to receive an output of one. Otherwise, you will receive a zero.
The fun part comes next: varying the threshold, bias, and weights will
provide you with different possible decision-making models. With our
example, if you lower your threshold from five to three, then you will get
different scenarios where the output would be one.
Despite how well loved these perceptrons were, their popularity faded quietly due to their limitations. Later on, people realized that multi-layer perceptrons were able to learn the logic of an XOR gate, but this requires the use of backpropagation so that the network can learn from its own mistakes. All deep learning neural networks are data-driven. If the output of a model is different from the desired output, you have to have a way to backpropagate the error information throughout the network, in order to let the weights know they need to adjust themselves by certain amounts. This way, gradually, the real output of the model gets closer to the desired output with each round of training.
As it turned out, for more complicated tasks involving outputs that can't be expressed as a linear combination of the inputs, meaning the outputs aren't linearly separable, the step function won't work, because it doesn't support backpropagation. Backpropagation requires the activation function to have a meaningful derivative.
Here's just a bit of calculus: the step function is an activation whose derivative comes out to 0 for every input except the point 0 itself. At the point 0, the derivative is undefined, because the function is discontinuous there. Even though this may be an easy and simple activation function, it's not able to handle the more complicated tasks.
Sigmoid function: f(x) = 1 / (1 + e^(-x))
Perceptrons aren't stable candidates for a neural network relationship. Look at it like this: this person has major mood swings. One day (for z < 0), they are all "down" and "quiet" and give no response. The next day (for z ≥ 0), they are suddenly "lively" and "talkative" and talk nonstop. A huge change, isn't it? Their mood has no transition, and you're not sure whether it is going to go up or down. That is a step function.
Just a small change in one of the weights in the input of the network may cause a neuron to flip from zero to one, which could end up affecting the behavior of the hidden layer and then the outcome. It's important to have a learning algorithm that improves the network by changing the weights gradually, without any sudden jumps. If step functions can't be used to change the weight values gradually, then you shouldn't use them.
We are now going to say goodbye to the perceptron with its step function. The new partner to use in your neural network is the sigmoid neuron, built with the sigmoid function written above. The activation function is the only thing that changes; everything else you have learned up to this point about the neural network works the same for the new neuron type.
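Here is a minimal sketch of the sigmoid and the convenient derivative mentioned earlier:

import math

def sigmoid(z):
    return 1 / (1 + math.exp(-z))

def sigmoid_derivative(z):
    s = sigmoid(z)
    return s * (1 - s)   # expressible in terms of f(z) itself

print(sigmoid(0))              # 0.5 - a smooth transition, no sudden jump
print(sigmoid_derivative(0))   # 0.25 - a usable gradient for backpropagation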
With this equation, the weight adjustment can be generalized, and you would see that it only requires information from the neighboring neuron levels. This is why it is a robust mechanism for learning, known as the backpropagation algorithm.
9

BRINGING IT ALL TOGETHER

PRACTICAL IMPLEMENTATIONS

Python is a powerful, easy-to-use scripting language suitable for use in the enterprise, even though it isn't right for absolutely every use. Python has been, and is, broadly deployed in the following fields:

Graphic Design and Image Processing Applications:

Python was used in making 2D imaging software such as GIMP and Inkscape. Furthermore, 3D animation packages also use Python in various parts, including Blender, 3ds Max, Lightwave, Cinema 4D, Maya, and Houdini.

Computational and Scientific Applications:

The greater speed, productivity, and accessibility of tools such as Numeric Python (NumPy) and Scientific Python (SciPy) have made Python a primary part of applications involved in the computation and processing of scientific data. 3D modeling software such as FreeCAD, and finite element method software such as Abaqus, are coded in Python.
Games:

Python with Pygame is a good language and framework for rapid game prototyping, or for beginners figuring out how to make simple games. To sum up, Python isn't always the best language for programming games; however, it is an essential instrument in a game programmer's toolbox.
Python has various modules, libraries, and platforms that support game development. For instance, PyGame provides functionality and a library for game development. There have been numerous games built using Python, including Civilization IV, Disney's Toontown Online, Vega Strike, and so on.
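As a rough illustration of how little code a minimal Pygame program needs, here is a sketch that opens a window and waits for the close button (everything here is illustrative, not tied to any particular game):

import pygame

pygame.init()
screen = pygame.display.set_mode((320, 240))
pygame.display.set_caption("Minimal Pygame window")

running = True
while running:
    for event in pygame.event.get():
        if event.type == pygame.QUIT:   # the close button ends the loop
            running = False
    screen.fill((0, 0, 0))
    pygame.display.flip()

pygame.quit()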

Web Applications and Web Frameworks:

Python has been utilized to make an assortment of web frameworks, including Django, Bottle, Flask, and so forth. These frameworks provide standard libraries and modules which simplify tasks related to content management, interaction with databases, and interfacing with various web protocols, for example HTTP, SMTP, XML-RPC, FTP, and POP. Plone, a content management system; ERP5, an open-source ERP used in aerospace, apparel, and banking; Odoo, a consolidated suite of business applications; and Google App Engine are a few of the well-known web applications built on Python.
These frameworks are dependable for serious applications. For example, Plone, an outstanding open-source content management system, to which the creator is a contributor, runs on Zope and has been deployed in organizations such as Novell and Oxfam. The high-traffic Reddit.com runs Pylons. The Revver.com video sharing site utilizes Django. Zope was an early open-source application server that demonstrated Python's viability in the enterprise (although many Python engineers nowadays feel it is a bit "unPythonic").
Deploying a Python web application is normally straightforward, although it is not quite as simple as deploying a PHP application in Apache. Database connectivity is very well taken care of by object/relational mappers, for example SQLAlchemy. Nonetheless, most Python web frameworks still can't quite match enterprise-grade application servers for Java or .Net regarding support for high-availability clustering, failover, and server administration.
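As a rough illustration of how lightweight these frameworks can be, here is a minimal Flask sketch (the route and message are made up for illustration):

from flask import Flask

app = Flask(__name__)

@app.route("/")
def home():
    return "Hello from Python on the web!"

if __name__ == "__main__":
    app.run()   # serves on http://127.0.0.1:5000 by default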
Predicting Earthquakes:
A Harvard scientist figured out how to use deep learning to teach a computer system to perform viscoelastic computations, the computations used to predict earthquakes. Until they figured this out, these types of computations were extremely computer-intensive, but the deep learning application improved calculation times by 50,000%. When we are talking about earthquake calculation, timing plays a large and important role, and this improvement may just be able to save lives.

Neural Networks for Brain Cancer Detection:

A French research team found that finding invasive brain cancer cells during surgery was hard, mainly because of the lighting in the OR. They discovered that using neural networks along with Raman spectroscopy during surgery allowed them to detect the cancer cells more easily and reduced the residual cancer. Actually, this is only one of many findings over the last few months that have matched advanced classification and recognition techniques with several kinds of cancers and screening tools.

Python in the Enterprise:

Traditional enterprise platforms are by necessity large and complex. They rely on elaborate tools to manage code, builds, and deployments. For many purposes, this is needless excess. Any software engineer ought to be able to reach for her favorite language when inspiration hits her, and Python's immediacy makes it appropriate for basic automation tasks and speedy prototyping. Engineers usually also feel that Python gives them the headroom to move past a prototype without discarding their past work. Without a doubt, Python can be utilized for substantial and complex programming systems. YouTube, for example, runs primarily on Python, and it is an oft-favored language at organizations including Google, NASA, and Industrial Light and Magic. Specialized Python libraries and frameworks exist for scientific programming, data manipulation, web services, XML exchange, and numerous other things.

Language Development:

The module design of Python's architecture has shaped the development of many languages. The Boo language uses a syntax, object model, and indentation similar to Python's. Moreover, the syntax of languages such as Apple's Swift, Cobra, CoffeeScript, and OCaml shares common features with Python.

Prototyping:

Python is easy and simple to learn, and it also has the advantage of being open source and free, with the help and support of an enormous community. This makes it the preferred choice for prototype development. Further, the nimbleness, scalability, extensibility, and ease of refactoring code associated with Python allow faster development from an initial prototype.

Automatic Game Playing:

This task involves a model learning how to play a computer-based game using only the pixels that are on the screen. This is a pretty hard task in the realm of deep reinforcement models, and it was a breakthrough for DeepMind, which became part of Google. Google DeepMind's AlphaGo has expanded and built upon this.
Activision Blizzard, Nintendo, Sony, Zynga, and EA Sports have been the leaders in the gaming world and have brought it to the next level through data science. Games are now being created using machine learning algorithms which are able to upgrade and improve play as the player moves through the game. When you are playing a motion game, the computer analyzes your previous moves to change the way the game performs.

Python on the Desktop:


We can write desktop apps using Python with frameworks such as PyGTK or WxPython. Nevertheless, almost all desktop apps are still created and written in compiled languages such as C++, C, or C#. The frameworks for those languages tend to have more sophisticated development tools, and the programs are mostly easier to distribute because they do not need the user to install Python. Python has very good graphical development tools, such as the Eclipse PyDev extensions and Wing IDE, though Python developers mostly work "Unix style" with independently operating text editors and terminals. On platforms like Java or .Net, an environment such as Microsoft's Visual Studio offers very deep integration with the programming language. Depending on the kind of developers you engage with, that could be either a curse or a blessing.
CONCLUSION

Thank you for making it through to the end of Python Beginners Guide: Machine Learning for Newbies. Let's hope it was informative and able to provide you with all of the tools you need to achieve your goals, whatever they may be.
The next step is to recall some key points that I want you to take away from this book.
First of all, Python is completely object-oriented and incorporates a couple of functional programming constructs.
The principal implementation of Python is written in C and runs on virtually any modern platform. There are alternative implementations that run inside a Java Virtual Machine (JPype, Jython), on the .Net platform (IronPython), and even one written in Python itself, called PyPy.
In general, you ought to consider Python when you (or your software engineers):

Need a language that is valuable across a range of programming tasks, from shell automation to web applications.
Like Python's philosophy and syntax.
Find the language fun and productive.
Need a general-purpose, proven, and reliable scripting language that comes with a rich standard library.
Python may not be a useful tool if you:

Build basic desktop applications, particularly for Windows. Platforms like .Net typically offer more sophisticated tools and simpler distribution of the final software.
Depend on teams of less-experienced developers. These developers may benefit from the wider availability of training for languages like Java, and they are less likely to make mistakes with a compile-time, type-checked language.
Are building embedded or massively parallel systems for which a scripting language would be the wrong choice (because of concerns about execution speed).
Have specialized needs better served by other languages that you already know. For instance, if you need to do a great deal of text processing and you have a basement full of Perl developers, there's no compelling reason to switch.

These were the key points that are essential for anyone who wants to start learning Python. All of the points mentioned above are the basics of Python and will build a strong base for further learning. I wish you all good luck on your journey of learning Python.
I want to thank all of you who got through my book. If you found it helpful in any way, it would mean the world to me, and a review on Amazon is always appreciated!
PYTHON PROGRAMMING: THE ULTIMATE
INTERMEDIATE GUIDE TO LEARN PYTHON
STEP BY STEP
INTRODUCTION

Thank you for downloading the book Python Programming: The Ultimate
Intermediate Guide to Learn Python Step by Step. I hope you find it
educational about all the topics you need concerning the exciting field of
machine learning.

This guidebook is going to explore how the Python coding language can
work together with machine learning in order to create a program that can
learn all on its own. With traditional coding methods, this just isn’t possible.
You will be stuck with a program that can only do one activity and one that
isn’t able to deal with any learning along the way. But with machine learning,
your program can look through data, look through past performance, and
more and then learn how it should behave.

This guidebook is going to take some time to explore machine learning and
how you can get started with it. We will explore a bit about machine learning
and its basics before moving on to some of the basics that you need to know
about the Python coding language before getting started, and how to set up
the environment. When that information is organized, we will then move on to some of the different things that you can do with machine learning, using some of the different scripts from Python, such as linear regression, decision trees, random forests, and more.

When you are ready to learn how to use python with machine learning, make
sure to check out this guidebook to help you get started!
1

WHAT IS MACHINE LEARNING

Before we can learn more about how to use Python to help with machine learning, it is important to learn more about what the field of machine learning is able to do. Machine learning is a subset of
artificial intelligence that will deal with technology being able to learn from
the input it is given. With a traditional computer program, the program can
only do what was put into the code. It never takes the input from the user or
makes any decisions on its own to learn and grow. It simply repeats what the
programmer put into the code. But with machine learning, the program is
able to learn based on trial and error, by looking at patterns in the data that it
sees, and even from the input that the user adds.
The idea behind machine learning is to help the program learn how to read
data and then make decisions on its own. There are a lot of times when a
program needs to be able to behave without the programmer being there to
tell it what to do. For example, with a speech recognition program, the
program might have trouble with understanding some speech patterns in the
beginning, but over time, it will learn how to understand the person who uses
it the most and it will make fewer mistakes.
The first definition of machine learning was coined in 1959 by Arthur
Samuel. He defined machine learning in this manner, “Field of study that
enables computers to learn without being explicitly programmed.”
This is one of the neat things about machine learning. The machine is able to
figure out patterns out of a large amount of data, even if the programmer
didn’t specifically tell it how to behave. This can be helpful in uses such as
speech recognition, search engines, and for companies who need to search
through large amounts of data to find patterns and make decisions about how
to act in the future.
How Can a Machine Learn?
So, the next question is how machines can learn. Before we delve into all the details that come with machine learning, we should take a look at the way that humans learn. This is going to give us good insight into how machine learning works.
For instance, we as humans know that we shouldn’t touch a heating plate
using our bare hands. But how do we know not to do this? There are two
possibilities with this one. Either we were burned in the past by touching one
of these plates, or a hot stove, or we were taught by others not to touch these
hot plates. In either case, there is some experience in the past that made us
not touch the heating plates when we see they are on. In other words, we have
some form of past information that we can use to make our future decisions.
Machine learning is going to work in a similar manner. In the beginning, the
computer program basically has no knowledge. These programs are just like
humans when they are first born, having zero knowledge and not knowing
how they are supposed to act. To make a machine learn, information needs to
be passed over to these machines. Then going from this information, the
machines are able to identify patterns with various techniques. Over time, the
machines are going to learn how to identify patterns from the data they have
in order to make decisions, and then they can move on to making some
decisions with data they haven’t even seen.
Training data can be fed into the machine learning algorithms that are nothing
but complex mathematical algorithms. The algorithms are then going to result
in models for machine learning. These models for machine learning are neat
because they have the capability of making predictions on new data, even
data that is unseen.
When Is Machine Learning Used?
While you may not have heard about machine learning in the past, there are
actually quite a few times when it can be used in our modern world. Machine
learning is always adding to its teaching set and learning more than we could
ever imagine. Some of the ways that machine learning can be used in our
modern world include:

Data Security

Malware is a huge problem. Kaspersky Lab said that it found 325,000 new malware files every day in 2014. But a company known as Deep Instinct says that each piece of new malware tends to have code very similar to previous versions; only up to 10 percent of the code changes, which is not much of a change. They have begun to use a learning model that is able to identify malware before it can attack. Machine learning algorithms have also been used to look for patterns in how data inside a cloud is accessed, and to report anomalies that could predict security breaches.

Personal Security

If you have ever gone to a big event, it’s likely that you had to spend some
time in a long security line. Machine learning is proving that it is a big asset
when it comes to eliminating false alarms and being able to catch things that
humans may miss. This could be used at concerts, stadiums, and airports.

Financial Trading

There are many people who are interested in learning how to predict what the
stock market is going to do on any given day so they can make more money.
Machine learning algorithms are able to do this with some accuracy. This can
help you to estimate how the market is going to react so you can make
smarter predictions and keep more money in your pocket.

Healthcare

Machine learning is able to process more information, as well as spot more


patterns, compared to humans. In one study that used computer-assisted
diagnosis when reviewing early mammography scans of women who then
developed breast cancer later on, the computer spotted 52 percent of the
cancers as much as a year ahead of when the women were officially
diagnosed. In addition, machine learning can be used in order to understand
some of the disease risk factors when looking at a large population. For
example, Medecision was able to come up with an algorithm that could
identify eight variables that were able to predict hospitalizations that could
have been avoided in diabetes patients.

Marketing Personalization

Even the world of marketing can get in on machine learning. The more that
you are able to understand about your customer, the more the company is
able to sell. With the help of machine learning, a program can follow where
the customers have been on your site, as well as online with other websites,
in order to come up with an idea of how to market to that person. This opens
up a lot of opportunities in terms of how you can market and reach your
customers.

Fraud Detection

Machine learning has gotten better at being able to spot potential fraud causes
in different fields. For example, PayPal is using machine learning in order to
catch those participating in money laundering. The company has many tools
that they use to compare the millions of transactions that go through and they
are able to precisely distinguish between fraudulent and legitimate
transactions that occur.

Recommendations

Many sites, including those like Netflix and Amazon, use a recommendation feature. This helps them look through your activity and then compare it with that of other users, in the hopes of figuring out what you are most likely to binge-watch or purchase next. As they gather more information from users, the recommendations keep getting better, so it won't take long before they match up with your needs, even if you are a new user.
Online Search

One of the most popular methods of machine learning is with search engines.
Each time that you put in a search, the program is going to watch how you
are responding to the results it sends. If you click the top result and you stay
on that page for a bit, it will assume that the program got the right results and
the search was successful. If you happen to click on the second page, or you
go and type in a new search and you don’t look at any results, the program is
going to learn from that mistake to do better next time.

Alexa and Other Voice Recognition Software

When you first purchase one of these products, you may run into a lot of
issues. It may have trouble recognizing some of your speech patterns and
there may be some mistakes. Over time, the program is going to learn from
its mistakes and there will be a lot more accuracy in the results you get from
the voice recognition program.
As you can see, there are many different uses of machine learning, and this is
just the beginning. As the technology continues to change, it won’t take long
for machine learning to change and adapt with it as well. There are just so
many great things that you can do with the help of machine learning, and this
is definitely a field that is going to grow into the future.
The Importance of Machine Learning
The ultimate goal of artificial intelligence is to make a machine work in a
similar manner to humans. However, the initial work that has been done with
this shows that we are not able to make a machine as intelligent as humans.
This is because humans are constantly learning in an environment that is
evolving all of the time, while this isn’t really possible for a machine.
Therefore, the best way to make a machine intelligent is to make them learn
on their own. What this means is that machine learning is basically the
discipline of science that works to teach machines how to learn on their own
from data.
The idea that comes with machine learning is that instead of going through
and coding all of that logic and data into the program, the data is going to be
fed into the machine, and then the machine is able to learn from this data,
simply by finding the patterns that are there. What is interesting here is that
these machine learning techniques can be faster than humans at finding these
patterns.
The techniques that are used in machine learning have been around for some
time. But because, until recently, there has been a lack of hardware that is of
high performance enough, these techniques were not used to help solve
problems in the real world. But now, we have a lot of the complex hardware
that is needed for machine learning, and because of the huge amount of data
readily available, these techniques are coming back and have been used
successfully to help develop machines that are intelligent.
The Different Types of Machine Learning
To keep things simple, the techniques that are used with machine learning are
going to be categorized into two different types. These include unsupervised
learning and supervised learning. Let’s divide these up a bit and see what
each one is about.
Supervised Learning
In supervised learning, both the input data, as well as the corresponding
category that the input data belongs to is going to be put into the learning
algorithm. The learning algorithm is going to learn what the relationship is
between the output and the input, and then it can make predictions about the
output of any data samples that are unseen.
For example, this kind of learning algorithm can be fed images of apples that have been labeled as fruit and images of potatoes that are labeled as vegetables. After the machine has gone through training with this data, the algorithm will be able to identify new images of potatoes as vegetables and those of apples as fruit, even without the labels.
There are a few steps that are often seen when it comes to a supervised
learning algorithm. These include:

You feed the algorithm the input records x along with the output labels y.
For each input record, the algorithm predicts an output ŷ.
The error in prediction is calculated by subtracting the predicted output ŷ from the actual output y.
The algorithm corrects itself by reducing that error.
The previous steps continue for multiple iterations until the likelihood of an error is almost gone. (A minimal sketch of this loop follows below.)
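Here is that predict-and-correct loop as a minimal sketch, fitting a one-weight model to made-up data (the learning rate and numbers are purely illustrative):

# Learn y = 2x from labeled examples by correcting the weight with the error
data = [(1, 2), (2, 4), (3, 6)]    # made-up (x, y) training pairs
w = 0.0
for _ in range(50):                # multiple iterations
    for x, y in data:
        y_hat = w * x              # predict an output
        error = y - y_hat          # error in prediction
        w += 0.1 * error * x       # correct the model using the error
print(round(w, 3))                 # approaches 2.0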

Supervised learning can be used for a wide number of things, but it will
usually help you to solve two different types of problems including
regression and classification.

Classification

This refers to the process of predicting a discrete output for a given input: for example, predicting whether an email is ham or spam, whether a tumor is malignant or benign, or whether a student is going to pass or fail an exam.

Regression

In these kinds of problems, the machine learning model is given the task of predicting a continuous value. So, for any given input, it might predict the price of a house or the marks that a student will get on an exam.
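As a minimal sketch of the two kinds of problems side by side, using scikit-learn (the hours-studied data is entirely made up for illustration):

from sklearn.linear_model import LinearRegression, LogisticRegression

hours = [[1], [2], [3], [8], [9], [10]]   # made-up hours studied

# Classification: a discrete output (fail = 0, pass = 1)
passed = [0, 0, 0, 1, 1, 1]
clf = LogisticRegression().fit(hours, passed)
print(clf.predict([[7]]))   # a discrete label

# Regression: a continuous output (marks on the exam)
marks = [20, 30, 40, 80, 88, 95]
reg = LinearRegression().fit(hours, marks)
print(reg.predict([[7]]))   # a continuous value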
Unsupervised Learning
With unsupervised learning, the algorithms are fed input data with no labels. The algorithm then learns to identify patterns in the data and can cluster similar records together. Since most of the data a program receives will not have labels on it, unsupervised learning is the option you are more likely to use.
For example, a company may want to use an unsupervised learning algorithm
in order to figure out the shopping trends of their customers. This shopping
trend could be put into an algorithm for unsupervised learning. This
algorithm will take a look at the information and then see if it is able to figure
out the future behavior of the customer. The company may use this to find
out that someone who often purchases baby products will also purchase milk
and then they will move the milk closer to the baby products to get more
sales.
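As a minimal sketch of that kind of clustering, here is scikit-learn's KMeans on hypothetical, unlabeled shopping data (all numbers are made up for illustration):

import numpy as np
from sklearn.cluster import KMeans

# Hypothetical unlabeled data: [monthly visits, average basket size]
customers = np.array([[2, 10], [3, 12], [2, 11], [20, 3], [22, 4], [19, 5]])

kmeans = KMeans(n_clusters=2, n_init=10, random_state=0)
labels = kmeans.fit_predict(customers)   # no labels supplied; the groups emerge from the data
print(labels)                            # e.g. [0 0 0 1 1 1]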
This chapter spent some time introducing the ideas of machine learning. We
saw a bit more what machine learning is all about, as well as learned that
there are some different types of machine learning.
2

SUPERVISED MACHINE LEARNING

Now that we have spent a little bit of time looking at machine learning and what it is about, it is time to divide this up a little bit.
Remember we brought up the terms supervised and unsupervised
machine learning. These are both important aspects that come with machine
learning, but they work in different ways and can help you solve different
problems.
The first type of machine learning we are going to look into is known as
supervised machine learning. Supervised machine learning is the type that
will occur any time that you choose an algorithm that helps your program
learn the right response to the inputs it gets from the user. So, as a user starts
to work with that program and provides data or input to that program, the
program is able to take that information and learn from it.
There are a few different ways that a program using supervised machine
learning is able to do that. The first is to look at targeted responses that you provide to the system; alternatively, the computer can look at examples that you provide to it. This training could include strings or values of labels to ensure that the program is learning how you want it to behave when it starts to function all on its own.
This is a simple process to work with. You basically show the computer
examples of what you want so that it can learn. An example of this in the real
world is if a teacher wants to teach a new topic to their students. One way
that they may do this is to show the class some examples of the situation they
are teaching. The students would spend their time memorizing these
examples because these are going to provide the students with some general
rules about the topic. The teacher will not be able to go through every
example of this but can give some to guide the students.
Later on, when the students see these examples or are tested on them, they
will have the right guidance on how to respond. If they see that there is
something that doesn’t seem similar to the examples the teacher showed
them, then they know how to respond as well.
The machine that has supervised machine learning on it is going to react the
same way. You will show it a variety of examples and tell it how to react.
Then, when the computer sees an example that is the same or similar to what
you showed it, it knows how to react as well. If it sees something that doesn’t
match up to that example, it can also react.
There are several different types of learning algorithms that fit under the
umbrella of supervised machine learning. Some of the most common types of
supervised machine learning that you can use include:

Random forests
KNN
Decision trees
Regression algorithms

We are going to take a look at how to use some of these later on in this
guidebook so you can get a good idea of how to use supervised machine
learning techniques as well.
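To make this idea more concrete, here is a minimal sketch of supervised learning in Python using the Scikit Learn library that we will rely on later in this guidebook. The tiny set of examples and labels here is invented purely for illustration:
from sklearn.tree import DecisionTreeClassifier

# Labeled examples: the "examples from the teacher"
examples = [[1, 1], [1, 2], [8, 8], [9, 8]]
labels = ["small", "small", "large", "large"]

# Train the model on the labeled examples
model = DecisionTreeClassifier()
model.fit(examples, labels)

# The model can now respond to an input it has never seen before
print(model.predict([[2, 1]]))  # prints ['small']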
3

UNSUPERVISED MACHINE LEARNING

While there are a lot of times when you will rely on supervised
machine learning, there are also times when you will need to use
the other type, the one called unsupervised machine
learning. With supervised machine learning, you simply show some
examples to the computer and teach it how to respond the way that you
want. There are a ton of programs that rely on this kind of machine
learning. But there are also times when it would be overwhelming and very
tedious to show hundreds of thousands of examples to the computer. Plus,
there are many programs where that approach will not work well at all.
This is where you will want to work with unsupervised machine learning.
This is the next type of machine learning that we are going to look at.
Unsupervised machine learning is the type that you will use in order to let the
computer learn on its own based on the information that the user gives to it. It
may make mistakes on occasion, but the algorithm will be set up to learn
from these mistakes. What this means is that the algorithm for machine
learning that is unsupervised will be in charge of figuring out and analyzing
the data patterns based on the input that you give it.
Now, there will also be a few different types of algorithms that can work well
with unsupervised machine learning. Whichever algorithm you choose to go
with, it is able to take that data and restructure it so that all the data will fall
into classes. This makes it much easier for you to look over that information
later. Unsupervised machine learning is often the one that you will use
because it can set up the computer to do most of the work without requiring a
human being there and writing out all the instructions for the computer.
A good example of this is if your company wants to read through a ton of
data in order to make predictions about that information. It can also be used
in most search engines to give accurate results.
There are a lot of different techniques that you can use when it comes to
machine learning. Some of the most common methods include:

Markov algorithm
Clustering algorithm
Neural networks
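As a small illustration of the clustering idea, here is a minimal sketch using the KMeans class from Scikit Learn. The data points are invented for the example, and notice that no labels are provided; the algorithm groups the records on its own:
from sklearn.cluster import KMeans

# Unlabeled data: two obvious groups, but we never say so
data = [[1, 2], [1, 1], [2, 2], [9, 9], [10, 9], [9, 10]]

# Ask the algorithm to find two clusters on its own
kmeans = KMeans(n_clusters=2, random_state=0)
kmeans.fit(data)

# Each record is assigned to a cluster without any labels from us
print(kmeans.labels_)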

A Word About Reinforcement Machine Learning


Another type of machine learning that we haven’t mentioned much yet with
machine learning is reinforcement machine learning. This one will let you
take on a little bit more than what you can do with the unsupervised and the
supervised learning techniques that we talked about before. There are many
times when you may assume that unsupervised machine learning and
reinforcement machine learning are the same things. But there are some
differences. For example, the input that is given over to these algorithms
needs to have a few mechanisms put in for feedback. You can set these
mechanisms up to be either positive or negative based on the type of
algorithm that you want to run.
Whenever you want to work with reinforcement machine learning, you are
using a method that relies more on trial and error than the other two. This
method is similar to working with a small child. When the child does
something that you do not approve of, you will tell them that they did it
wrong or you will ask them to stop. If they do an action that you approve of,
you can do another action to tell them that you approve, such as praising
them or giving them positive reinforcement.
Through a lot of trial and error to see how you respond to them, the child is
going to learn what you see as acceptable behavior or not. With the right type
of reinforcement each time, the child will strive to do what you want from
them each time. This is similar to how the reinforcement machine learning
works. The program will learn, based on trial and error, how you want it to
behave in each situation.
To keep it simple, this is what reinforcement machine learning is going to be
like. It works on the idea of trial and error and it requires that the application
uses an algorithm that helps it to make decisions. It is a good one to go with
any time that you are working with an algorithm that should make these
decisions without any mistakes and with a good outcome. Of course, it is
going to take some time for your program to learn what it should do. But you
can add this into the specific code that you are writing so that your computer
program learns how you want it to behave.
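To give a flavor of this trial-and-error idea, here is a very small sketch in plain Python. This is not a full reinforcement learning algorithm, only an illustration of how repeated feedback can shape which action a program prefers; the actions and rewards are made up for the example:
import random

# Invented actions and the feedback each one earns
rewards = {"left": -1, "right": 1}
scores = {"left": 0.0, "right": 0.0}

# Trial and error: try actions and remember the feedback received
for _ in range(100):
    action = random.choice(["left", "right"])
    scores[action] += rewards[action]

# After enough trials, the program prefers the rewarded action
print(max(scores, key=scores.get))  # prints 'right'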
4

THE BASICS OF WORKING WITH PYTHON

Before we get started with more of our machine learning, we need to
understand some of the basic parts that come with the Python coding
language. This is a great coding language to work with because it is
simple and easy to use. Yet it still has a lot of power behind it. With all the great
features that come with Python, it is the perfect choice when working on
machine learning. Let’s take a look at some of the basics that come with the
Python coding language.
Python Keywords
First, we need to take a look at the important keywords in the Python
language. Like what you will find in other coding languages, there is a
list of keywords in Python that are meant to tell your compiler what to do.
These keywords are reserved, and you should only use them for their intended
purposes if you want to avoid issues with your code. They
are basically the commands that tell your compiler how to behave, and
they remain reserved so that you can execute the code without a lot of issues
in the process.
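If you are curious, you can ask Python itself for the full list of reserved keywords in your installed version by using the keyword module from the standard library:
import keyword

# Print every reserved word that cannot be used as a name
print(keyword.kwlist)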
Naming Identifiers in Your Code
When you are working with Python, it is important to follow the right rules
when you are naming the different parts of your code. Several of these parts
are known as identifiers. These can go by various names like variables,
classes, entities, and functions. There are a few rules that you must follow
when naming these identifiers, and the rules are going to be the same no
matter which identifier you are working with. Some of the rules to follow
when naming identifiers include:

You can use both uppercase and lowercase letters in the name of
your identifier. You can also work with underscore symbols and
numbers as well. Any combination of the above characters work as
well, just make sure that inside the name, there aren’t any spaces.
So, do not write out something like ‘My first program’. You would
write it as ‘Myfirstprogram’.

Your identifier can’t start out with a number. You can use the
number anywhere else that you want in the name, but it can’t be the
first character of your program. If you do put a number as the first
character, you are going to get an error signal when you have the
compiler try to do this one. You can write something like ‘one
program’, but you can’t name the identifier something like
‘1program’.

The name of the identifier can’t have one of the Python keywords in
it. If you add in the keyword to the name, you are going to cause
confusion in the compiler and you won’t get the program to work.

The rule for naming your identifier doesn’t have to be difficult. As long as
you follow these simple rules, you can give your identifier any name that you
would like. If you do happen to forget one of the rules for naming an
identifier, the compiler is going to notice and you will end up with a syntax
error. You simply need to go back through and fix it and this error will go
away.
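Here are a few quick examples of these rules in practice; the names themselves are made up for illustration:
my_first_program = 1   # valid: letters, underscores, and a number
Program2 = 2           # valid: the number is not the first character
# 1program = 3         # invalid: starts with a number
# my program = 4       # invalid: contains a space
# class = 5            # invalid: 'class' is a reserved keyword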
Python Comments
You can also work with comments in Python. These comments can be useful
for explaining some of your code, especially if you are trying to explain it to
another programmer or person who is looking through your code. Any time
that a part of your code needs a little bit of clarification, you can add in a
comment. The compiler is not going to read the comment; it simply
skips right over it. Once you indicate to the compiler
that the comment is done, it will start reading the code again. Someone who
executes the code will have no idea how many comments you have or even
where you put them in the program.
It is pretty easy to write out the comments that you want to add into your
code. You simply need to use the (#) sign in front of the comment that you
want to write. So, you could write out something like #this is my Python
code. When the comment is done, you just hit return and start writing the rest
of the code on the next line. As long as you have that sign right in front of the
comment, then your compiler is able to just skip over it and will not read out
what you put in the comment.
You are able to choose how many comments you would like to do inside of
your code. Sometimes the code will just need a few of them while other times
the code may need a lot of comments. Keep the comments down to the ones
that you really need and do not waste time or space writing out more
comments which aren’t necessary.
Python Statements
You can also work with statements inside your Python code. A statement is
simply a line of code, often made up of some of the other parts that we talk
about in this chapter, that tells the computer to do something. You can then send
these statements over to your chosen compiler so that the code can be
executed. You can write out any statement that you would like, but it must be
written in a manner that the compiler will understand. Statements can be
as short or as long as you would like. Some statements are only
one character long, while others will span many lines.
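For example, here is a short statement on a single line, followed by one statement spread across several lines. Python treats everything inside an open pair of parentheses as part of the same statement, no matter how many lines it takes:
x = 5  # a short, single-line statement

# one statement spread across several lines
total = (1 + 2 +
         3 + 4 +
         5)
print(total)  # prints 15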
Python Variables
Variables are important because they will save up spots on your computer’s
memory in order to hold onto parts of your code. When you go through the
process of creating a new variable, you are making sure that you are
reserving some space on your computer for this. In some cases, such as when
you are working with data types, your interpreter will do the work of
deciding where this information should be stored because that speeds up the
process.
When it comes to working with variables, your job will be to make sure that
the right variables are lining up with the right values. This will ensure that the
right parts show up in your code at the right time. The good news is that you
are able to give the variable the value that you would like, but do check that it
actually works inside of your code. When you are ready to assign a new value
to a variable, you can use the equal sign to make this happen. Let’s look at a
good example of how this would work:
#!/usr/bin/python

counter = 100    # An integer assignment
miles = 1000.0   # A floating point
name = "John"    # A string

print(counter)
print(miles)
print(name)
With the above example, each variable prints out the value that you assigned
to it. So, you will see 100 for the counter, 1000.0 for the miles, and John as
the result of the name.
Python Functions
Functions can be another part of your language that you need to learn about.
This is basically a part of the code that can be reused and it can help to finish
off one of your actions in the code. Basically, these functions are often really
effective at writing out your code without having a lot of wasted space in the
code. There are a lot of functions that you can use in Python, and this can be
a great benefit to the programmer.
When you are working on your code, you will need to make sure that you are
defining the function. Once you have defined the function and it is
finalized, it is time to execute it in the code. You have the
choice to call it up from the Python prompt or you can call it up from another
function. Below is an example of how you are able to do this:
#!/usr/bin/python

# Function definition is here
def printme(str):
    "This prints a passed string into this function"
    print(str)
    return

# Now you can call the printme function
printme("I'm first call to user defined function!")
printme("Again second call to the same function")
When this is placed into your compiler, you will be able to see the two
statements that we wrote out inside of the code come up like a message. This
is just one example of how you would call up a function and you can always
change up the statements that are inside the code and figure out how you
want them to execute later.
These are just a few of the basics that come with Python code. We will
take a closer look at these as we move through this guidebook, but they
will help you get the basics of the Python language and show you how you
can use it to your advantage in machine learning.
Python as an Object Oriented Language
As you look through some of the different code examples in this guidebook,
you will notice that Python is very much an Object
Oriented Programming, or OOP, language. This is a fairly new form of
programming, but you will find that it works much better for writing your
programs and can make coding, especially with machine learning, easier for a
beginner.
The OOP languages are basically designed to be easier and to cut out some of
the issues that used to occur when programming. Some of the older coding
languages are not designed this way, and that made it harder for a beginner to
get started with.
OOP was designed in order to get rid of these issues because it will use the
programming features that are best in a way that is more structured.
Basically, this structure is going to help you as a programmer get the work
done faster. If you tried out some of the older coding languages and got
frustrated, you will find that Python got rid of some of these issues and
coding is much easier to accomplish now than before.
Unlike Python, some of the earliest programming languages were released
with a development method that is known as the procedural approach. This
got the job done and did help the programmer to get their work done and it
made some great codes for many years, however, there were a lot of flaws.
These flaws were things that some programmers were able to get past but
they were often troublesome enough that beginners would get frustrated and
walk away from doing the work at all.
Since the procedural approach was really hard to use in programming, the
idea of OOP was developed. When you are using OOP, you will find that the
data in your code is going to be treated as an important development of the
code. In the procedural development, the data was allowed to just flow
around in the system. OOP will do a much better job keeping the data in
place so that the function that operates it will stay right next to it. This keeps
all your data safe and makes sure that the programmer, or outside sources,
aren’t accidentally making modifications.
For the beginner who doesn’t understand all of the technical things that come
with programming, OOP is basically just going to provide you all of the
power that you want when creating a new code. It makes things so much
easier to use. There are a lot of features that come with OOP which helps to
make it better to use including:

OOP is able to pull some of the emphasis away from the procedure
of your code and it places more of this emphasis on the data in your
code.
All of the programs that you write will now be divided up into
objects.
The data is structured in order to be characterized by the objects that
are inside of them. The data will be the classes that hold the objects,
and all the objects that are inside of the class need to match up in a
way that makes sense.
The functions that will be important for operating the data of your
objects are going to come together when you are working with the
structure of the data.
The data in your code is hidden so that external functions cannot
access it directly. The lack of this protection sometimes caused issues
in older programming languages because the information could easily
get mixed up.
The objects that you use have the ability to communicate with each
other thanks to the fact that they have functions that are in common.
The new data, as well as the functions that you have, can be added
into the code any time that you want.
Whenever you start to design one of your own programs, you have
to remember that you have to follow the bottom-up approach.
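To see what this looks like in practice, here is a minimal sketch of a small Python class. The data (the balance) lives inside the object, and the function that operates on that data sits right next to it, just as described above; the names are invented for the example:
class BankAccount:
    def __init__(self, owner, balance):
        # The data is kept inside the object
        self.owner = owner
        self.balance = balance

    def deposit(self, amount):
        # The function that operates on the data stays next to it
        self.balance += amount

account = BankAccount("Alice", 100)
account.deposit(50)
print(account.balance)  # prints 150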

As you can see, there is a lot to enjoy when using the Python coding
language. It is designed with a beginner coder in mind, which is why it is one
of the best options for you to choose and why you find it being used with
some other complex topics such as machine learning and even hacking.
5

SETTING UP YOUR PYTHON ENVIRONMENT

The Python coding language is one of the best options that you can
choose to use when it comes to programming, especially when you
want to work with machine learning. Python is a very simple coding
language to learn, and it is often the first one that beginners will take a look at
because of its simplicity. But don’t be fooled. Just because it is simple to
learn Python doesn’t mean it won’t have the power and strength that you are
looking for in a programming language. This chapter is going to take some
time to explore how you can set up your Python environment to help with
machine learning.
The first thing that you will want to do is download the proper IDE (Python
itself can be installed on its own from www.python.org, but the IDE we use
here bundles it for you). An IDE is a great environment to use with Python
and will give you all the help that you need. It includes a Python installation
along with your editors and your debugging tools. The IDE that we are going
to use in this section is Anaconda. This IDE is easy to install, light, and it
has a lot of great development tools that you will enjoy. It also has its own
command line utility so you can install third-party software as needed. And
with this IDE, you won’t have to go through and install the Python environment
separately.
In order to download the Anaconda IDE, there are a few steps to complete.
We are going to look at the steps that you can use when installing this for a
Windows computer, but the steps for installing on a Linux or Mac computer
are pretty much the same:
1. To start, go to https://www.anaconda.com/download/
2. From here, you will be sent to the homepage. You will need to select
Python 3.6 because it is the newest version. Click on “Download”
to get the executable file. This takes a few minutes to download,
depending on how fast your internet is.
3. Once the executable file is downloaded, you can go over to its
download folder and run the executable. When you run this file, you
should see the installation wizard come up. Click on the “Next”
button.
4. Then the License Agreement dialogue box is going to appear. Take a
minute to read this before clicking the “I Agree” button.
5. From your “Select Installation Type” box, check the “Just Me” radio
button and then “Next”.
6. You will want to choose which installation directory you want to use
before moving on. You should make sure that you have about 3 GB
of free space in the installation directory.
7. Now you will be at the “Advanced Installation Options” dialogue
box. You will want to select the “Register Anaconda as my default
Python 3.6” and then click on Install.
8. And then your program will go through a few more steps and the
IDE will be installed on your program.

As you can see, setting up an environment in Python is simple and
doesn’t have to be complicated at all. You can choose to work with
other IDEs too; if you already have one set up on your
computer, it is fine to keep working with that one.
6

DATA PREPROCESSING WITH MACHINE LEARNING

Your data must be in a specific format before you are able to apply
any type of machine learning algorithm to it. Converting data
over to the right format for machine learning is
known as data preprocessing. Depending on the type of dataset
you are working with, there are a few preprocessing steps that you will need
to use in order to get the data into the right form. The steps you will need
to use include:

Getting the data set.
Importing your libraries.
Importing the dataset.
Handling any of the missing values.
Handling the categorical data.
Dividing the data into different training sets and test sets.
Scaling the data.

Let’s take a closer look at each of these and how they can work together to
prepare your data the proper way.
Getting the Data Set
To start, all of the data sets that we are going to use in this guidebook can be
found at the following link to make things easier:
https://drive.google.com/file/d/1TB0tMuLvuL0Ad1dzHRyxBX9cseOhUs_4/
view
When you get to this link, you can download the “rar” file and then copy the
“Datasets” folder over to your D drive. All of the algorithms that we will use in
this book will access the datasets from the “D:/Datasets” folder. The dataset that
you need for this chapter, to help with learning more about preprocessing, is
known as patients.csv.
If you take a look at this dataset, you will notice that it has information about
the Gender, BMI, and Age of twelve patients. This dataset also has an extra
column that shows you whether the patients have diabetes or not. The BMI
and Age columns are numeric here because they are going to contain
numerical values in them, and the Diabetic and the Gender columns will be
categorical in nature instead.
Another distinction that you can make here is between the independent and
the dependent variables. As a rule of thumb, the variable whose value is
being predicted is the dependent variable, and the variables that are used to
make the prediction are the independent variables. In this example, the
Diabetic column is going to be the dependent variable because it
depends on the other three columns, and those columns are going to
be the independent columns.
Import Libraries
Now that you have some of the data that you need to get started on this, you
need to work on importing the library. Python automatically comes with
some prebuilt libraries that are able to perform a variety of tasks. For the
purpose of this book, we are going to use the Scikit Learn Library. To keep it
simple, we are going to install three libraries that are the most essential for
helping us do machine learning. These include NumPy, matplotlib.pyplot,
and pandas.
NumPy is an important library because it is able to go through and do a lot of
the more advanced mathematical functions that you need. Since machine
learning is going to use a lot of mathematics in it, it is worth your time to
install this Python library.
You can also install the matplotlib.pyplot library. This library is a great one
to use in order to plot out some good charts. Often, when you are looking at
the data that you want to understand through machine learning, you will need
to have this library.
And finally, you should download the pandas library. This library is an easy
one to use and it can allow you to easily import as well as view your datasets
while we do machine learning.
In order to import all of these libraries, you need to either create a new
notebook for Python in Jupyter or you can open up a new file inside Spyder.
The codes that we are going to use with this guidebook are done in Spyder so
you are aware. To import these libraries, use the following code:
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
Import the Dataset
Once you have those three important libraries downloaded, it is time to move
on to the next step of importing the dataset that you want to use into the
application that we just created. This will also give you a good idea of why
we are using the pandas library here.
Our dataset is going to be presented in the Comma Separated Values, or CSV,
format. The pandas library contains the read_csv function, which
takes the path to the CSV formatted dataset as a parameter and loads the
dataset into a pandas data frame, which is basically an object that stores the
dataset in the form of rows and columns.
To make this happen, you can use the following script:

patient_data = pd.read_csv("D:/Datasets/patients.csv")

The script above loads the patients.csv data set from the Datasets folder
where you placed it. If you are using the Jupyter notebook,
it is easy to take a quick look at the data. You would just use the following
script to see what the data looks like:
patient_data.head()
But, if you are working with the Spyder program, you would go over to your
Variable explorer and then double click on a patient_data variable from the
list of variables that show up. Once you click on the right variable, you will
be able to see all the details of this dataset.
At this point, you should be able to see the pandas data frame that looks
similar to a matrix with zero-based index. Once you have the dataset loaded,
the next step is to divide the dataset into a matrix of features and vector of
dependent variables. The feature set will consist of all your independent
variables. For instance, the feature matrix for the patients.csv dataset is going
to contain the information about the Gender, BMI, and Age of the patient. In
addition, the size of your feature matrix is equal to the number of
records by the number of independent variables. In this case, our matrix is
going to be 12 by 3 because we have twelve records and three independent
variables.
Let’s first go through and create our feature matrix. You can give it any
name that you would like, but traditionally it is denoted by a
capital X. To help us read the code a bit better, we are going to name it
“features” and then use the following script:
features = patient_data.iloc[:, 0:3].values
With the script that we used above, the iloc function of your data frame is
used to help select all the rows as well as the first three columns from the
patient_data of the data frame. This iloc function is going to take on two
parameters. The first is the range of rows to select, and then the second part is
going to be the range of columns you want the program to select.
If you would like to create your label vector from here, you would use
the following script, which selects all the rows of the fourth column (index 3):
labels = patient_data.iloc[:, 3].values
How to Handle Any Missing Values
If you take a look at your patient_data object, you are going to notice that the
record at index 4 is missing a value in the BMI column. The easiest
approach to handling missing values is to remove the
record that is missing a value. However, this record could contain some
crucial information, so you won’t want to remove it at this time.
Another approach that you can use to help deal with this missing value is to
put something in there to replace that missing value. Often the best choice
here is to replace the missing value with the median or the mean of all the
other values in that same column. To be able to handle the missing values
that come in, we are going to use the Imputer class from the
sklearn.preprocessing library. The script that you need to make this work
includes:
from sklearn.preprocessing import Imputer
imputer = Imputer(missing_values="NaN", strategy="mean", axis=0)
imputer = imputer.fit(features[:, 1:2])
features[:, 1:2] = imputer.transform(features[:, 1:2])
With the script that we wrote above, the first line is in charge of importing the
Imputer class from the right library. We then went on to create an object of
the Imputer class. This object takes three parameters: missing_values,
strategy, and axis. With the missing_values parameter, we are
specifying the value that we want to be replaced; in our data set,
the missing value is shown as "NaN". The strategy parameter
specifies the strategy that we want to use in order to fill in this
missing value; you can choose from mean, median, and most_frequent.
And then the axis parameter denotes the axis along which we want to impute;
axis=0 means the mean is computed down each column.
How to Handle the Categorical Data
Right now we know that machine learning algorithms are all going to be
based on mathematical concepts, and to work with mathematics, we need to
work with numbers. Because of this, it is going to be more convenient to take
all of your categorical values and move them over to numeric values. When
we look at our example, we have two categorical values, the Diabetic, and the
Gender option.
The good news is that with the sklearn.preprocessing library, there is the
LabelEncoder class, which takes your categorical column
and gives you the right numerical output in its place. The
script that you can use for this one includes:
from sklearn.preprocessing import LabelEncoder
labelencoder_features = LabelEncoder()
features[:, 2] = labelencoder_features.fit_transform(features[:, 2])
Just like with the Imputer class, the LabelEncoder class has a
fit_transform method, which is just a combination of the fit
and transform methods. The class takes the categorical
column that you have as the input and then returns the corresponding
numeric values.
In addition, you can always take the labels vector and then convert it to a set
of numeric values as follows:
labels = labelencoder_features.fit_transform(labels)
Dividing the Data into Training Sets and Tests Sets
Earlier in this guidebook, we discussed that machine learning models are
going to be trained on a subset of data and then tested on a different subset.
Splitting the data into training and test sets is done to make sure that the
algorithm you use for machine learning isn’t going to overfit. When we talk
about overfitting, we are referring to a model that performs well on the
training data but gives poor results on the test data.
A good model for machine learning is one that is able to give good results in
both the test data and the training data. That way, we are able to say that the
model we picked has correctly learned all the underlying assumptions from
our set of data and that we are able to accurately use it to make decisions on
any new set of data that you use. The script that is below is going to divide up
your data into 75 percent train size and the rest will be the test size.
from sklearn.model_selection import train_test_split
train_features, test_features, train_labels, test_labels = train_test_split
(features, labels, test_size = 0.25, random_state = 0)
When you execute the script above, you are going to see that the train_features
variable contains a matrix of 9 records (75 percent of 12), while
train_labels contains the labels that correspond to those records. The
test_features variable holds the remaining three records, and test_labels
holds the corresponding labels.
How to Scale the Data
The final preprocessing step we need to work on, before we are able to put
this data into our algorithm for machine learning, is feature scaling. We
need to scale the features because some data sets will have a big
difference between the values of their features. For instance, if we add in
the number of red blood cells of
patients from patients.csv, the column is going to have values that are in the
hundreds of thousands, but the age column would be much smaller. Many
machine learning models use the Euclidean distance to find the distance
between the data points in your data, and a feature with much larger values
would dominate that distance calculation. Scaling puts all of the features on
a comparable footing.
The good news is that the sklearn.preprocessing library contains
the class known as StandardScaler that you can use in order to standardize
the features. Like the other preprocessing classes, it also
has the fit_transform method that we talked about before; it takes the data set
that you provide as the input and outputs a scaled data set. The
following script will make this happen for you.
from sklearn.preprocessing import StandardScaler
feature_scaler = StandardScaler()
train_features = feature_scaler.fit_transform(train_features)
test_features = feature_scaler.transform(test_features)
One thing to note is that there isn’t really a need for you to scale labels for
any of your classification problems. For regression problems, we will take a
look at how to scale labels in the regression chapters.
7

WORKING WITH LINEAR REGRESSION IN MACHINE LEARNING

In this chapter, we are going to start working with supervised learning algorithms, in particular, with linear regression. We will take a look at how to do a linear regression with the help of one variable, and then take this further and look at how to do a linear regression with more than one.
Let’s take a look at how to get started with linear regression.
The Theory of Linear Regression
To keep it simple, linear regression is going to be an approach that helps you
identify the relationship between at least two variables. Mathematically, this
linear regression is going to help you find the linear function that maps
independent variables to dependent variables. For those that are plotted on a
2-D graph, you will end up with a straight line.
Let’s look at an example of this. Consider that you want to find a relationship
between the price of a car and the year it was made. If we were to plot the
year on our x-axis and the price on the y-axis, linear regression
would find a straight line, the line that fits the data points the best.
The straight line that we will use for this is represented with the formula:
b = ax1 + c
Here, b is our dependent variable, a is the slope of our line, x1 is our
independent variable, and c is the y-intercept. Looking at the equation,
x1 and b are fixed because they come from the data. This means that our
algorithm for linear regression has to find the slope and the intercept that
give us the best line based on the data. The line may not hit every point,
but it will be as close to as many of them as possible.
Of course, we can take this concept and expand it out to even more variables.
An example of the equation that you would use for a linear regression
function would look like the following:
b = a1x1 + a2x2 + a3x3 + … + anxn + c
You can make this formula go on for as long as you would like. Here, n
stands for the total number of independent variables that you work with. This
equation represents a hyperplane in n dimensions. It is important
to note that a two-dimensional regression model is going to look just like a
straight line. With a three dimensional one, it is represented in the form of a
plane and if you have more than three dimensions, it is a hyperplane.
Linear Regression Using One Variable
First, we are going to work with linear regression with just one variable. This
helps us to keep it simple and learn the basics before we move on. We are
going to have one dependent and one independent variable.
For this one, we are going to use the “car_price.csv” dataset to help us figure
out the price of a car, which is going to be our dependent variable, based on
the year the car was manufactured, which is our independent variable. You
can find it in the Datasets folder that we used before. To help predict the price
of the cars, we are going to work with a linear regression algorithm from the
Python Scikit Learn Library. Let’s take a look at how to do this.
Importing the Right Libraries
First, we need to make sure that we have the right libraries to get this going.
The codes that you need to get the libraries for this section include:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline
You can implement this script into the Jupyter notebook. The final line needs
to be there if you are using the Jupyter notebook, but if you are using Spyder,
you can remove the last line because it will go through and do this part
without your help.
Importing the Dataset
Once the libraries have been imported using the codes that you had before,
the next step is going to be importing the data sets that you want to use for
this training algorithm. We are going to work with the “car_price.csv”
dataset. You can execute the following script to help you get the data set in
the right place:
car_data = pd.read_csv('D:/Datasets/car_price.csv')
Analyzing the Data
Before you use the data for training, it is always best practice to
analyze the data for any scaling issues or missing values. First, we need
to take a look at the data. The head function returns the first five
rows of the dataset. You can use the following script to
make this work:
car_data.head()
In addition, the describe function can be used in order to return all of
the statistical details of the dataset.
car_data.describe()
Finally, let’s take a look to see if the linear regression algorithm is actually
going to be suitable for this kind of task. We are going to take the data points
and plot them on the graph. This will help us to see if there is a relationship
between the year and the price. To see if this will work out, use the following
script:
plt.scatter(car_data['Year'], car_data['Price'])
plt.title("Year vs Price")
plt.xlabel("Year")
plt.ylabel("Price")
plt.show()
With the script that we used above, we are working with the scatter plot that
is available on the matplotlib library to help us have the year over on our x-
axis and the price on the y-axis. From the output figure, we are able to see
that an increase in the year number results in an increase in the price of the
car. This is a linear relationship between the price and the year. This means
that we are able to use a linear regression algorithm to help us solve this
problem.
Data Preprocessing
Remember that we spent some time in the past chapter studying data
preprocessing in order to understand that we need to divide up our data into a
feature and a label set so that we can have training and a test set. Now it is
time to go through and do these two tasks. To divide the data into labels and
features, you will execute the following script:
features = car_data.iloc[:, 0:1].values
labels = car_data.iloc[:, 1].values
Since we only have two columns here, the 0th column contains the
feature set and the first column contains the label. We will
then divide up the data so that 20 percent goes to the test set
and 80 percent goes to training. Use the following script to help you get this
done:
from sklearn.model_selection import train_test_split
train_features, test_features, train_labels, test_labels = train_test_split
(features, labels, test_size = 0.2, random_state = 0)
If we then go through and look at this dataset, we are going to see that there
is not a very big difference between the values of the prices and the years.
Both are going to be in the thousands. This means that it really isn’t
necessary to scale the data and you can easily use the data, in its current form,
for training the algorithm.
Training Your Algorithm and Getting It to Make Predictions
The LinearRegression class from the sklearn.linear_model module takes
the training features and labels as input and trains the model. The script that
you can use for this includes:
from sklearn.linear_model import LinearRegression
lin_reg = LinearRegression()
lin_reg.fit(train_features, train_labels)
Using the same example of the car prices and the years from before, we are
going to look and see what the coefficient is for our only independent
variable. We need to use the following script to do that:
print(lin_reg.coef_)
The result for this process is going to be 204.815. This shows that for each
unit change in the year, the car price is going to increase by 204.815 (at least
in this example).
Once this model is trained, the final step is to predict the output for new
instances. The predict method of this class can be used to make this happen.
The method takes the test features as the input and predicts the
corresponding output. The following script will predict the labels for the
test features:
predictions = lin_reg.predict(test_features)
This is going to give us a prediction of what would happen in the future.
Looking at the example of linear regression with the car prices and the years
they are manufactured, we could use this information to come up with a
guess about how much the car will cost in the future.
Let’s say that you want to figure out how much the car will cost in 2025.
Maybe you want to be able to estimate how much you will need to pay for a
car as you save up for it and you plan to purchase the car during that time.
You would be able to add the coefficient there and make a prediction on how
much the vehicle is going to cost in the future. This is not always completely
accurate, because there are things that can come into play that will change it
up a bit, but it can definitely be a good way to predict prices and how they are
going to behave if nothing else changes.
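As a rough sketch of that idea, and assuming the lin_reg model that we trained above, you could ask the model for the predicted price in 2025 like this; the exact number you get back will depend on the data the model was trained on:
# Predict the price for a single new year value
future_price = lin_reg.predict([[2025]])
print(future_price)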
This is an example of working with a linear regression algorithm that has just
one variable. It is possible to expand this out to a few variables, or even
many at the same time, which can be very helpful to
businesses that are trying to get through a large amount of information in
order to make important decisions.
8

USING A DECISION TREE FOR REGRESSION

Now that we have looked a little bit at regression, it is time to work
with a decision tree. This is a very important part of machine
learning. Each feature in the data set you work with is going to be
treated as a node in this tree. At each node, a decision is made about which
path to take through the tree. Each situation is going to be different, and
you will have to weigh the value of each part before picking which path to
take. The process continues until a leaf node is reached
and you have your final decision.
This may seem complicated at first, but you might be surprised to know that
we have been working with decision trees our whole lives. For example,
maybe there is a bank that is trying to decide whether they want to give out a
loan to a particular customer or not. The bank has a lot of data on the
customer including their salary, their gender, and their age. The bank has to
take this information and then decide whether they want to give the customer
the loan or not from that information.
Each bank is going to be a bit different, but they will often define some
criteria that need to be met. These criteria are a set of rules that will define
whether they will award out that loan or not. Some of the criteria that the
bank may have in place include:

If the age of your applicant is over 25 and then also younger than 60,
then you can go to the next step. Otherwise, the bank will reject the
loan application.
If the first step is satisfied, then check to see if the person is salaried
or not. If the person does have a salary, then you would go to step
three. If the person doesn’t make any money (is jobless), then you
will reject the application.
If the person has a salary and is male, then you will go to step four.
If the applicant is female, go to step five.
If the salary of the male is over $35,000 per year, then you can
award the loan. Otherwise, the loan is going to be rejected.
If the salary of the female is over $45,000 a year, you can award the
loan. Otherwise, reject that application.
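To make the flow of these rules easier to follow, here is a minimal sketch of them written as plain Python conditionals; the applicant values at the bottom are invented for the example:
def review_loan(age, has_salary, gender, salary):
    # Step 1: the applicant must be over 25 and younger than 60
    if not (25 < age < 60):
        return "rejected"
    # Step 2: the applicant must be salaried
    if not has_salary:
        return "rejected"
    # Steps 3 to 5: different salary thresholds by gender
    if gender == "male":
        return "approved" if salary > 35000 else "rejected"
    return "approved" if salary > 45000 else "rejected"

print(review_loan(30, True, "male", 40000))  # prints 'approved'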

The rules that we have above are pretty simplistic, and we chose them at
random. In the real world, the decision tree is going to be way more complex
and there will be a lot more rules and a lot more branches on your decision
tree. There are even times when a statistical technique, such as entropy, will
be used to create these nodes. Entropy is going to refer to the impurity of
classifications in the labeled data.
To keep it simple, in a decision tree the feature that results in the minimum
amount of entropy is going to be set as the root node. This gives the
bank a starting point and can help ensure that the right people are given
the loan.
The Benefits of a Decision Tree
Decision trees can be very good to use when sorting through a lot of data.
They are simple and really easy to understand compared to some of the other
methods you may use with machine learning. Some of the benefits of using a
decision tree algorithm include:

Decision trees can work for many different types of problems. They
work well for classification and regression tasks, which means that
you can use them to predict both continuous and discrete values.
Decision trees can be used in order to help you classify both non-
linear and linear data.
In comparison to some of the other machine learning algorithms you
may use, these decision trees are pretty fast to train.
Using the Python Scikit Library to Help Implement Decision Trees
Now it is time to use our learning libraries to put one of these decision trees
into action so we can see how it works a bit better. In this section, we are
going to try to predict the consumption of petrol (in millions of gallons) in the
48 states of the US based on several different features. These features
include the ratio of individuals holding a driver’s license, per capita income,
petrol tax (in cents), and paved highways (in miles). Let’s go through the different
parts and see what we need to do to make our own decision tree.
Importing the Right Libraries
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline
Importing the Data Set
When you are ready to import the right data set to use here, you need to use
the following command:
petrol_data = pd.read_csv('D:/Datasets/petrol_data.csv')
Data Analysis
To take a look at the data and see what is there, you will simply use the
following code:
petrol_data.head()
This is going to give you a chart with all of the numbers. These numbers are
going to correspond to the different categories that we had before.
Data Preprocessing
If you want to work with data preprocessing here, you can use the following
script to help divide out the data into feature and label set.
features = petrol_data.iloc[:, 0:4].values
labels = petrol_data.iloc[:,4].values
Then you can take this information and divide it up so that eighty percent
goes to training and the other twenty percent goes to a test set. Use the
following script to make this happen.
from sklearn.model_selection import train_test_split
train_features, test_features, train_labels, test_labels = train_test_split
(features, labels, test_size = 0.2, random_state = 0)
Data Scaling
If we look through the data set, you will notice that the data is not scaled at
all. For instance, the feature of population_driver_license has values that are
between 0 and 1 while the paved_highways and average_income have values
that are in the thousands. Because of this, before we take the information to
the algorithm, we must go through and scale the features. To do this, you can
go through and use the following script:
from sklearn.preprocessing import StandardScaler
feature_scaler = StandardScaler()
train_features = feature_scaler.fit_transform(train_features)
test_features = feature_scaler.transform(test_features)
Training the Algorithm
Now that the features are scaled, it is time to train the algorithm that we
are going to use. Since we are predicting a continuous value here, you
will need to work with the DecisionTreeRegressor class from the sklearn.tree
library. The following script will make sure that the right labels and features
are passed on to the decision tree:
from sklearn.tree import DecisionTreeRegressor
dt_reg = DecisionTreeRegressor()
dt_reg.fit(train_features, train_labels)
Make Predictions
Finally, it is time to make predictions using our predict method. This will
help us to get through the decision tree and see what predictions it makes
based on the data that we have available. You would need to use the
following script to see the predictions:
predictions = dt_reg.predict(test_features)
9

RANDOM FOREST FOR REGRESSION

In the previous chapter of this guidebook, we spent some time studying
decision trees. A single decision tree is sometimes going to be biased
based on the data that you put inside. A better approach
is to work with more than one tree in order to make
predictions, and then find the average of all the predictions to come
up with your final prediction. This popular approach is known as
ensemble learning.
With ensemble learning, more than one algorithm of the same or different types
will be joined together in order to create a powerful machine learning model.
Random forests are a type of ensemble learning model and can be used
in supervised machine learning.
Random forest algorithms unite several decision tree algorithms
to create a forest, which is why this one is known as a random forest
algorithm. Similar to the decision tree algorithm, random forests can be used
either to predict a continuous value (regression) or to predict discrete
values (classification). Let’s take a look at these random forests and how they
work.
Working on the Random Forest Algorithm
The random forest algorithm works by performing several steps,
including:

Choosing K random data points out of your data set.
Creating a decision tree (for regression or classification) based on
those K data points.
Picking how many trees the forest will contain and repeating the
first two steps for each tree.
If the problem is a regression, each of the trees predicts a
continuous value; the final output is calculated by taking the
mean of the values predicted by all of your trees. But if it is
a classification problem, each tree predicts a discrete value and
the final output is the class chosen by the majority of the trees.
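As a minimal sketch, and assuming the petrol dataset with the train/test split from the previous chapter, a random forest for regression can be trained through Scikit Learn like this; the choice of 20 trees is arbitrary and just for illustration:
from sklearn.ensemble import RandomForestRegressor

# Build a forest of 20 trees (an arbitrary choice for this sketch)
rf_reg = RandomForestRegressor(n_estimators=20, random_state=0)
rf_reg.fit(train_features, train_labels)

# Each tree predicts a value and the forest averages them
predictions = rf_reg.predict(test_features)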

The Positives of Using Random Forest Algorithms


At this point, you may be wondering why you would want to use one of these
algorithms for random forests. Some of the benefits of this type of algorithm
include:

This algorithm is often one of the most stable of the machine
learning algorithms and it can scale well. Since there are many trees
in the forest, the removal or introduction of data in your dataset can
have an impact on a few trees, but it won’t impact all the trees. This
helps to keep the algorithm as stable as possible.
Random forest algorithms are going to work out well regardless of
whether you are using categorical or numerical features.
With these random forests, you do not need to do scaling as we did
before. This is because the forest algorithm isn’t going to rely on the
distance between the data points.

The Negatives of Using Random Forest Algorithms


While there are many times in machine learning when you will rely on a
random forest algorithm, there are also times when it may not be the best
choice to go with. Some of the negatives that can come with these random
forests with machine learning include:

It is more complex to work with. There really aren’t limits on how
many trees you can have in your random forest, which means that you
could potentially have thousands of trees. When all of these trees are
involved in your prediction, it is going to be hard to figure out how
your final prediction was made.
The complexity of the random forest algorithm comes
with a cost in time. These forests are not something that you want
to work with if you are short on time; training and prediction can
take a while, and the amount of time it takes will depend on how
many trees are in the forest.
10
WORKING WITH A SUPPORT VECTOR
REGRESSION

The next topic we are going to explore is Support Vector
Regression, or SVR. This is a type of Support Vector Machine
algorithm and it is used to help with non-linear and linear
regressions. This type of machine learning algorithm was introduced in the
1960s and it is one of the most famous that you can use when it comes to
machine learning. Before neural networks were as common as they are
today, SVM was among the most accurate algorithms for machine learning.
This chapter is going to take some time to look through the intuition that
comes with an SVM algorithm and then will look at how this algorithm
works. First, we need to look at the theory that comes with this machine
learning algorithm.
The Theory Behind SVM
For a typical linear classifier in a two-dimensional feature
space, the job is to find a straight line that separates the data points of
the two classes as cleanly as possible. This sounds good, except that in the
real world, there can sometimes be multiple decision boundaries that could
be used to classify the data points. And when new data points are added in,
the decision boundary that you chose for classification will decide whether
these data points are successfully classified.
The job of the SVM algorithm is to help you find a decision boundary that is
able to classify data in a way that any misclassifications that show up are
going to be minimized. The way that this algorithm is able to do this is by
maximizing the distances between the closest data points from all the classes
that are in the set of data.
This algorithm is going to be able to find these boundaries with the help of its
support vectors, which is how it gets its name. Support vectors are the ones
that will pass through the closest data points of the two classes you want to
work with. The job here is to maximize the distance that occurs between the
two vectors.
A line that is parallel to both of these support vectors can then be drawn in
the middle of these two vectors. This is the decision boundary and it is often
considered the most optimal of the decision boundaries. It is going to be a
great option to help you to see how your information is split up. Sometimes
this matters, and other times it will direct you to make decisions in a different
way than before.
This method can be used at any time that you have a lot of data that doesn’t
seem to have one vector line that is going to pass through it very well.
Instead, you would use the vectors to help you get close to both sets and it
can help you to separate the data into two, and sometimes more, categories.
You can then look at each of the categories and figure out which one has the
information that you need, how the groups are similar, and even how these
two groups are different. This information is often used for important
business decisions.
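As a minimal sketch, and assuming scaled feature and label sets prepared the way we did in the earlier chapters, a support vector regression can be run through Scikit Learn like this; the choice of a linear kernel is just for illustration:
from sklearn.svm import SVR

# A support vector regressor with a linear kernel
svr_reg = SVR(kernel="linear")
svr_reg.fit(train_features, train_labels)

predictions = svr_reg.predict(test_features)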
11
WHAT IS NAIVE BAYES AND HOW DOES IT
WORK WITH MACHINE LEARNING

Another topic that often comes up with machine learning is Naive
Bayes. We are now going to move on from the regression
algorithms that we talked about before and turn to
some of the classification algorithms, and the Naive Bayes algorithm is the
best one to start with.
Naive Bayes is another supervised machine learning algorithm, and it has its
basis in Bayes’ Theorem, which relates the probability of a class given the
data to the probability of the data given the class: P(A|B) = P(B|A)P(A) / P(B).
This algorithm is based on the idea of feature independence, which states
that the features found in a data set don’t have a relationship with each
other. For example, a fruit may be considered a banana if it is five inches
long or more, yellow in color, and at least 1 centimeter in diameter. But
Naive Bayes will not have any concern about how these features depend on
each other. The fruit is going to be declared a banana through the
independent contribution of these features. This is why it is known as a
Naive algorithm.
The Naive Bayes algorithm may be one of the simplest of your machine
learning algorithms, but it still has a lot of power behind it, which is why we
are going to study it a little bit here.
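As a minimal sketch, and assuming numeric training features and class labels prepared as in the earlier chapters, the Gaussian version of Naive Bayes in Scikit Learn looks like this:
from sklearn.naive_bayes import GaussianNB

# Gaussian Naive Bayes assumes a normal distribution for numeric features
nb_clf = GaussianNB()
nb_clf.fit(train_features, train_labels)

predictions = nb_clf.predict(test_features)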
The Advantages of the Naive Bayes Algorithm
There are a lot of great benefits from working with the Naive Bayes
algorithm. Some of the benefits that you will enjoy when working with this
machine learning algorithm include:
This is a simple algorithm to work with and it is fast to train. It
doesn’t have complex math behind it, and there isn’t any
backpropagation or error correction to deal with either.

These algorithms are able to perform better than a lot of the other
algorithms that you can choose when working on categorical data. If
you are using this for numeric features, this algorithm is going to
assume a normal distribution.

The Disadvantages of the Naive Bayes Algorithm


While many companies like to work with the Naive Bayes algorithm because
it is simple to use and easy to understand, there are a few disadvantages to
using this algorithm and there are times when it is better to work with
something different. Some of the disadvantages you may run into when
working with this algorithm includes:

With real-world data, you will find that features are often dependent on one another. The independence assumption that comes with the Naive Bayes algorithm can make it a poor predictor for any set of data with interdependent features.

If a categorical feature takes a value in the test set that was never seen in the training set, this algorithm will automatically assign a zero probability to that instance. This means you will need to go through and cross-validate any results that you get when using this algorithm.

Applications and When to Use the Naive Bayes Algorithm


There are many different ways that you are going to be able to use the Naive
Bayes algorithm. Sure, there are times when it may not give you the results
that you want, but it can still give you some good results as well. Some of the
ways that you can use the Naive Bayes algorithm include:

This algorithm is a good one to use for multi-class problems, and it is often used for problems of text classification. This could include spam filtering in emails and sentiment analysis.

This algorithm is often used in combination with collaborative filtering algorithms for building up machine learning based recommender systems.

This algorithm is often faster than the more advanced algorithms, which can make it easier to use in real-time applications.
12
K-NEAREST NEIGHBORS ALGORITHM FOR
CLASSIFICATION

The next algorithm we are going to work with is known as the K-nearest neighbors or KNN algorithm. You will find that this method works well on real-world data that isn't really following a trend, which is something that can often happen.
The idea behind the KNN algorithm is simple. It finds the distance between your new data point and each of the existing data points in the set. It then ranks all of those data points in ascending order based on their distance from the new point, and it takes the top K nearest data points. From here, it assigns the new data point to the class of the majority of those K data points.
The KNN algorithm is often considered a lazy algorithm, rather than an eager
algorithm. This means that it is not going to use any training data to help it
generalize. There isn’t a training phase for this kind of algorithm, and if there
is, it is pretty small. This also means that the training phase can be pretty fast.
A lack of generalization also means that KNN is able to keep all of its
training data. This means that all, or at least most of your training data, is
going to be needed during your testing phase.
This algorithm is important because it is based on feature similarity. How closely an out-of-sample point resembles the points in our training set is what determines how we classify that given point of data.
Often KNN is used to help with classification. An object is classified through a majority vote of all its neighbors, and the end goal is to assign the point to the class that is most common among its nearest neighbors. There are times when the KNN algorithm is used to help with problems of regression, but it is more common to use it for classification.
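
As a minimal sketch of this in code, assuming the scikit-learn library is installed (the data points are invented for illustration):

from sklearn.neighbors import KNeighborsClassifier

X = [[1, 1], [1, 2], [2, 2], [8, 8], [8, 9], [9, 9]]
y = [0, 0, 0, 1, 1, 1]

# With K = 3, a new point is assigned to the majority class
# of its three nearest neighbors.
knn = KNeighborsClassifier(n_neighbors=3)
knn.fit(X, y)
print(knn.predict([[2, 1]]))  # [0] -- closest to the first cluster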
How Can I Use the KNN Algorithm?
There are many different ways that you can use this KNN algorithm. Some of
the most common ones include:

Credit Ratings

Many times the KNN algorithm is used to help with credit ratings. First, you collect a person's financial characteristics, and then you compare that information with people who have similar financial features in the database. By the very nature of credit ratings, people with similar financial details tend to end up with the same kind of credit rating. This means that you can take a database that is already in existence and use it to help predict the credit rating of a new customer, without having to perform all the calculations again.

Many times when a bank is considering giving out a loan to someone, they may use a KNN algorithm. They may want to ask if they should give an individual a loan. Is it likely that an individual is going to default on their loan? Does the person look like they have the characteristics of others who have defaulted on their loans, or are they closer to those who haven't defaulted on their loans? The KNN algorithm can help you compare the information you have against the information that is in the database and then answer these questions.

The KNN algorithm can even be used in political science. You can
take a potential voter and class them as either “will not vote” or “will
vote” based on their characteristics and how they stack up against
others who have or haven’t voted in the past. It is even possible to
look at the information you have about the person and about others
in your database to make a prediction on which party they will vote
for.

The Benefits of Using the KNN Algorithm


There are many benefits that come with using the KNN algorithm. Some of
these include:

There is no assumption about the data. This means that it is going to work well for any nonlinear data.

The algorithm is pretty simple to use and it is easy to explain to others.

It offers good accuracy. There are some other supervised learning models that can do better, but for how simple this algorithm is, the accuracy is pretty good.

It can be used with many different types of problems. It is useful for both regression and classification problems.

The Negatives of Using the KNN Algorithm


Now that we have taken a look at some of the benefits that come with the
KNN algorithm, it is also important to notice that there are a few reasons why
you may not want to use this algorithm in your machine learning problem.
Some of the negatives that come with the KNN algorithm include:

It can be computationally expensive. This is because the KNN algorithm stores all of your training data.

It needs a high amount of memory in order to work. This can be hard for your system to handle.

It stores pretty much all of your training data, which can really take up space on your computer.
The prediction stage of this process is going to be slow compared to
some of the other options.
This algorithm can be sensitive to any irrelevant features and the
scale of the data, which can be hard when you add in new
information to the mix.
CONCLUSION

Thank you for downloading Python Programming: The Ultimate Intermediate Guide to Learn Python Step by Step. I hope it was able to
provide you with the information you needed to get started with machine
learning and how to use it for your own needs.
The goal of this guidebook was to provide you with an introduction to
machine learning and how you can implement some of your basic algorithms
with the help of the Python coding language. We explored the basics of both
machine learning and the Python language before moving into some of the
different things that you can do with machine learning.
If you found this book helpful, please consider leaving a review!
PYTHON PROGRAMMING: THE ULTIMATE
EXPERT GUIDE TO LEARN PYTHON STEP-
BY-STEP
INTRODUCTION

Congratulations on downloading Python Programming and thank you for doing so.
The following chapters will discuss what you need to know to take your
Python programming skills to the next level. If you have been working with
Python for some time and you have all the basics down, then this is the right
book for you. We are going to take some of those skills and push them a bit
further to make your programming skills a little bit stronger.
Inside this guidebook, we are going to talk about a lot of different topics that
will help you advance your coding skills. We will talk about how to create
arguments, what inheritances are, the importance of iterators and generators,
exception handling, and how to create loops. And that is just the start. When
you are done with this guidebook, you will be able to take some of the codes
that you have done in the past and make them stronger and more powerful.
Working with the Python coding language can be a great experience. Python
is one of the best coding languages to learn and even beginners can catch on
quickly. This guidebook will help you to take some of your beginner
knowledge and put it to good use with some of the best coding ever. Check
out this guidebook when you are ready to go from intermediate to advanced
level in your Python programming skills.
There are plenty of books on this subject on the market, so thanks again for
choosing this one. Every effort was made to ensure it is full of as much useful
information as possible. Please enjoy!
1

WORKING WITH INHERITANCES IN PYTHON

The first topic that we are going to discuss in the Python language is inheritances. Inheritances can be nice because they help you write a lot of complex code without having to type out line after line. This makes the code look nicer, cleans it up, and still gets the results that you want from it.
To keep things simple, remember that an inheritance is when you take your original code, known as the parent code, and copy it down to come up with some "child" code that is based on it. These child codes are adjustable, so you can make changes to them without having to worry about how they are going to affect the original parent code. You can choose to make just one inheritance, or you can keep moving down and make many of these child codes.
While an inheritance may sound complex, it is a pretty simple thing to learn. You can add or take away as much as you want to get this code to work the way that you want. A good example of what an inheritance looks like inside your code is the following:

#Example of inheritance

#base class
class Student(object):
    def __init__(self, name, rollno):
        self.name = name
        self.rollno = rollno

#GraduateStudent class inherits or is derived from the Student class
class GraduateStudent(Student):
    def __init__(self, name, rollno, graduate):
        Student.__init__(self, name, rollno)
        self.graduate = graduate

    def DisplayGraduateStudent(self):
        print("Student Name:", self.name)
        print("Student Rollno:", self.rollno)
        print("Study Group:", self.graduate)

#PostGraduate class inherits from the Student class
class PostGraduate(Student):
    def __init__(self, name, rollno, postgrad):
        Student.__init__(self, name, rollno)
        self.postgrad = postgrad

    def DisplayPostGraduateStudent(self):
        print("Student Name:", self.name)
        print("Student Rollno:", self.rollno)
        print("Study Group:", self.postgrad)

#instantiate from the GraduateStudent and PostGraduate classes
objGradStudent = GraduateStudent("Mainu", 1, "MS-Mathematics")
objPostGradStudent = PostGraduate("Shainu", 2, "MS-CS")
objGradStudent.DisplayGraduateStudent()
objPostGradStudent.DisplayPostGraduateStudent()

When you type this into your interpreter, you are going to get this result:

Student Name: Mainu
Student Rollno: 1
Study Group: MS-Mathematics
Student Name: Shainu
Student Rollno: 2
Study Group: MS-CS
Inheritances can give you some freedom when you are writing out your code. If you have a base or parent class that you want to use to make a derived class, you can easily do this without having to rewrite the code all over again. Add in the fact that you can keep the features you want, get rid of the features that are in the way or not useful for this leg of the process, and create your own new derived class, and you have a really useful process in Python.
You can technically go through and make as many derived classes as you
want. As long as you keep doing them in order with each other and you use
the example above to make your own, you can make as many of these
derived classes as you would like. This makes things easier, limits the
amount of code that you have to write out, and can really make things easier
on you when you create a new program. Each new derived class will be able
to take the features that it likes from its parent code and use them or drop
them to make the code continue on and be stronger than before.
What does it mean to override the base class?
Now that we have taken a look at a good example of an inheritance, the next
thing we need to do is learn the steps that are needed to override our base
class. There are times when you work with your derived class, and then you
need to take some steps to override things that are already found in the base
class. What this means is that you will take a look at the base class and
change up the behavior that is found inside. This makes it easier to bring in
the behavior that you want in your new derived class.
It may sound a bit complex to work with, but it is a nice way to help you
choose out which parental features you want to be present in the new derived
class that you are creating. This process will make it easier for you to get the
right features into your newly created class, while still making sure that the
original stuff that you want in your parent class will stay there.
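
A minimal sketch of overriding, reusing the student idea from earlier (the class and method names here are just for illustration):

class Student(object):
    def greet(self):
        print("Hello, I am a student.")

class GraduateStudent(Student):
    def greet(self):  # overrides the version in the base class
        print("Hello, I am a graduate student.")

GraduateStudent().greet()  # prints the derived class's message

The derived class defines a method with the same name as the one in the base class, and its version is the one that runs.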
Overloading
Another thing that you can work on with inheritances is a process known as
overloading. When you are working with this overloading, you are able to
take one of your identifiers and use it to define two or more methods. For the most part, there will only be two methods under the same name inside a class, but in some situations this number can be higher. The methods need to be in the same class, and they must take different parameters, so that each call can still be told apart. You will find that overloading works well when you want the same name to perform a task under different types of parameters.
Overloading is not a process that you will work with that often as a beginner,
but as you advance through your process a bit more, you may use it more
often. But you will still want to spend your time learning about overloading
in case you work with one code that will need it.
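
It is worth noting that Python will not pick between two methods of the same name based on their parameter types on its own; a common way to approximate overloading in Python is to give one method optional parameters. A minimal sketch, with invented names:

class Area(object):
    def calculate(self, length, width=None):
        if width is None:
            return length * length  # one parameter: treat it as a square
        return length * width       # two parameters: treat it as a rectangle

shape = Area()
print(shape.calculate(4))     # 16
print(shape.calculate(4, 5))  # 20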
One Final Thing to Know About Inheritances
As you write out some of your Python codes, you may find that you can work
on multiple inheritances at once. When you do a multiple inheritance, you are
going to find that each level has some similarities to each other, but you can
still make some small changes in each level. You will quickly notice that
these multiple inheritances are not going to be that different from the single
one we talked about before, you just keep going down the line to get the
results that you want with each level.
When you start to write code that needs multiple inheritances, you will take one class, also known as your base class, and you will give it at least two parent classes to get started. This is an important thing to learn as your code grows, because you can use it to build out the code for as long as you need.
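
A minimal sketch of the idea, with invented class names:

class A(object):
    def feature_a(self):
        print("Feature from class A")

class B(object):
    def feature_b(self):
        print("Feature from class B")

class C(A, B):  # C lists two parent classes and inherits from both
    pass

obj = C()
obj.feature_a()
obj.feature_b()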
Multiple inheritances can be as simple or as complicated as you would like to make them. When you work on them, you might create a brand new class, which we will call Class C, using the information from a previous class, Class B. Then you can go back and find that Class B was itself created from information in Class A.
Each of these layers is going to contain some features that you like from the
class ahead of it, and you can go as far into it as you would like. Depending
on the code that you decide to write, you could have ten or more of these
classes, each level having features from the previous one to keep it going.
While creating these inheritances, remember that you are not allowed to
move from a multiple inheritance over to a circular inheritance. You can add
in as many of your parent classes as needed to the code, but you can’t make it
go in a circle and connect things with this method.
As you start to write out some more codes inside the Python language, you
will find that working with different types of inheritances can be pretty
popular. There are many times when you are able to stick with the same
block of code in the program and then make some changes without having to
waste your time rewriting the code over and over again.
2

ARGUMENTS IN PYTHON

Now it is time to take a look at arguments and how they work in the Python language. You can define your functions in this language so that they take a variable number of arguments. You can use keywords, or you can work with default or arbitrary arguments to help you define these functions. We are going to spend some time looking into how to do this.
In other beginner books, you may have learned a bit about user-defined functions. In particular, you may have learned how to define functions and how to call them. If you don't do this the proper way, it is going to result in some errors. Let's look at an example of how this works with the code below.
def greet(name, msg):
    """This function greets the person with the provided message"""
    print("Hello", name + ', ' + msg)

greet("Monica", "Good morning!")
When you put this into your compiler, you will get the result "Hello Monica, Good morning!" The greet() function has two required parameters. Since you have called this function up with two arguments, it is going to run well and it will come up with no errors.
However, if you use a different number of arguments than what we have
above, the interpreter won’t know what to do and it will start to complain.
Below you will see some similar examples of this. The first one has a single
argument and the second one will have no arguments. You will also see the
error message that comes up with each one.
>>>greet("Monica") # only one argument
TypeError: greet() missing 1 required positional argument: 'msg'
>>>greet() # no arguments
TypeError: greet() missing 2 required positional arguments: 'name' and 'msg'
Variable Function Arguments
Now it is time to take this a bit further. Until this point, the functions that you
worked with contained a fixed number of arguments. But there are other
ways to define your functions so that they can accept a variable number of arguments. Let's take a look at the three most common ones that you can work with in your code.
Python default arguments
When you are working on your function arguments inside of Python, they are
able to contain default values. You would then use the assignment operator of
the equal sign in order to offer a default value into your argument. A good
example of this is found below.
def greet(name, msg="Good morning!"):
    """This function provides a greeting to the person with the message
    you provided. If the message is not provided, the default is going
    to be just "Good morning!"
    """
    print("Hello", name + ', ' + msg)

This makes the function a lot easier to call. If you would like to name the person to greet, or add in your own message as well, you can add the following calls to the code above.

greet("Kate")
greet("Bruce", "How do you do?")
The name parameter that we put here is not going to have a default value and
during a call, it is necessary to have it there. The “msg” parameter is going to
have a default because we put that in to start, and it is the “good morning!”
value from before. When you call it out, it is up to you whether you want to
put it back in or not. If you do add a value in at the later part, it is going to
overwrite the default value.
In a function, any number of arguments can have a default value, but once one argument has a default value, all the arguments to its right must have default values as well. This means that non-default arguments cannot follow default arguments.
Python keyword arguments
The next type of argument that you can work with is the keyword argument.
When you use some values to help you call a function, these values are going
to be assigned over to the argument based on their position. In the example
above, the “Bruce” value would provide you with the name argument and the
“How do you do” message when you used it.
Python will allow you to use the keyword argument to call up functions.
When you call up functions with this method, the position, or the order of
your arguments can sometimes be altered. The two examples below are both
valid and will give you the same results, but they are written out a bit
differently.
>>> # 2 keyword arguments
>>>greet(name = “Bruce”, msg = “How do you do?”)
>>> # 2 keyword arguments (out of order)
>>>greet(msg = “How do you do?”, name = “Bruce”)
You can see that these are the exact same things, and the compiler is going to
see them in this manner as well. They are just written out differently. You can
choose the method that works best for your needs.
Arbitrary arguments inside of Python
Now, there are times when you are working on a code and you aren’t going
to know the number of arguments that you need to pass over to your function.
There is still a way to handle this situation when you are writing codes in
Python. You just need to do your calls with arbitrary argument numbers.
In the function definition, you would use the asterisk symbol (*) before the
name of your parameter to help signify that you are doing this type of
argument. The example below will show you how this could work in your
coding.
def greet(*names):
    """This function greets all the names in our names tuple"""
    # names is a tuple holding all of the arguments
    for name in names:
        print("Hello", name)

greet("Monica", "Luke", "Steve", "John")

The output for this one would just list out all the names above with Hello in front. For example, it would say:
Hello Monica
Hello Luke
Hello Steve
Hello John

With this example, we have called the function with many different arguments, and these arguments are wrapped up in a tuple before they move into the function. Then, inside the function, we used a for loop to recover all the arguments without needing to write out the code a bunch of times.
As you can guess by this point, the functions found in Python have a lot of different features that are going to make the life of a Python programmer a lot easier. While some of these do have similar capabilities compared to other programming languages, many of them are unique to Python.
The extras for these can sometimes make the function easier to use. For
instance, it can help take out some of the noise and bring clarity to the
intention of your callers. With these, some of the subtle bugs that show up in
some codes, and which are sometimes hard to find, can be reduced.
3

NAMESPACE AND PYTHON

In real life, name conflicts can show up all the time. For example, think back to your days at school. How many times did you meet more than one person with the same name? Any time that a teacher asked for a particular student, most of the other students would ask which one they were talking about. The teacher would have to give out the last name or some other addition to the name to make sure they got the right one.
It would be easiest if everyone had a special name that no one else had. This
would help to get rid of any confusion and you wouldn’t have to go through
and provide additional information to get the right person. But since you
can’t convince parents to pick certain names for their children, it is kind of a
hard task to accomplish.
This can be a similar issue when you are dealing with a programming
language. When a programmer is writing a short program that doesn’t have a
lot of dependencies on outside information, it is easier to provide relevant and
unique names to all the variables. But, when there are thousands of lines in
the program and some outside modules are loaded up, it becomes harder to
do this. Namespaces can help make this process a bit easier and will ensure
that you don’t run into trouble with your code.
The Meaning of Namespaces
First, we need to take a look at what a namespace is and why it is so
important. A namespace is going to be a system of making sure that all the
names in a program are unique and that you are able to use them as the
programmer without any conflict. There are a few different types of
namespaces that you can use, as illustrated in the sketch after this list:

Local namespaces: These are namespaces of the local names inside a function. They are created when the function is called, and they only last until the function returns.

Global namespaces: These namespaces comprise the names from the different modules that you bring in to complete the project. They are created when the modules are incorporated into the project and will stick around until the script ends.

Built-in namespaces: These namespaces consist of the built-in exception names and functions.
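
Here is a minimal sketch of all three kinds of namespaces at work (the variable names are invented for illustration):

count = 10  # this name lives in the global namespace

def show():
    count = 5           # a different name in the local namespace
    print(count)        # the local name is found first: 5
    print(len("abc"))   # len lives in the built-in namespace

show()
print(count)  # the global name is untouched: 10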

What Happens When I Don’t Have a Namespace?


The namespace is very important when you are writing out some of your
code. They are there to help you find what you need later on as you work
through the code. They are also there to help the program learn what you
want to pull up as you work through your program and the code. If you don’t
give the function or the variable or another part of the code a name, how is the code going to know what to pull up when you try to call it later on?
You also need to be careful with the type of name that you give on each part
of your code. If your name is confusing and you can’t remember it, you
won’t be able to call it up later on. If you misspell the name, either when you
are first labeling it or later on when you are trying to call it back up, you are
going to get an error that comes up as well. Take caution when you create
these names and make sure that you give them names that you can remember
and ones that can easily identify what you are trying to pull up later.
There are a few rules that come with these names in Python, although you still have a lot of flexibility with the names that you use. You can use letters, numbers, and the underscore symbol, but the name has to start with a letter or an underscore rather than a number. You can name something twopieces or _twopieces if you would like, but you can't name it 2pieces. If you break this rule, either when naming or calling up the property, you are going to end up with an error message in your program.
And one final note here is that when you create your namespaces, make sure
that you don’t try to give two different variables or functions the same name.
This is going to cause a lot of problems in your code because the program
won’t know which one you want to save or call up. Give each part their own
namespace so that your code can run as smoothly as possible.
The Scope of Namespaces
These namespaces are important because they help to identify the whole list of names that are inside a program. This doesn't mean that a name can be used anywhere, though. A name also has a scope, which defines the sections of the program where that name can be used without any prefix. There are many different scopes in a program and they are all important for helping the program run smoothly.
First, there is the local scope. This is the innermost scope, which holds the local names available to the function you are in right now. Then there is the scope of the enclosing functions: a name search begins from the nearest enclosing scope and then moves outward. Then there is the module-level scope, which contains the global names of the current module. And finally, there is the outermost scope, which contains the entire list of built-in names. This scope is the last one searched in order to resolve a referenced name.
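
A minimal sketch of this search order, from local to enclosing to global to built-in (the names are invented for illustration):

message = "global"          # module-level (global) scope

def outer():
    message = "enclosing"   # enclosing function scope

    def inner():
        message = "local"   # innermost (local) scope
        print(message)      # the local name wins: "local"

    inner()

outer()
print(message)  # back at module level: "global"

If the local assignment were removed, Python would keep searching outward and find the enclosing name instead, and so on up to the built-in scope.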
The most important thing to remember about the namespaces is that you need
to pick out the right ones to work with all parts of your code. If you are
reusing them, it can lead to a lot of confusion in your code and makes it very
hard for the code to figure out what it should call up and what isn’t so
important. This is especially important if you are bringing in some modules
to help make the code easier to understand and use.
4

WORKING WITH ITERATORS IN PYTHON AND


WHAT THESE MEAN

Now it is time to work with some iterators in the Python language. Iterators are objects in your code that permit iteration over a collection. These collections do not have to be objects that you already have in your memory, and because of this, you do not need to make them finite.
Let's be a bit more precise with the definition that we are using. You can say that an iterable is an object that has an __iter__ method, which is needed to return an iterator object. An iterator, in turn, is an object containing the __iter__ method along with __next__ (or, in older versions of Python, simply next). The former returns an iterator object, and the latter returns the subsequent elements of the iteration.
When you work on your code, you want to avoid calling next and __iter__ directly. Python is going to call these up automatically for you if you use for loops or list comprehensions. In case you do need to go through and call them up manually, you can use the built-in functions iter() and next() in Python and then pass in the container or iterator as the parameter.
If you are into mathematics, you will find that you can do a lot with Python.
This programming language has done a lot to support the idea of mathematics
in the form of lists, sets, and tuples and the notations that are used for both
can be pretty similar.
If you have a mathematical mind, the biggest things that you may like to work with when you bring out Python are the iterators, as well as the generators (which we will talk about later). Both of these are going to make it easier for a programmer to write simple code that they can read, even when the code is a bit more complex. This chapter will take us a little bit further than we have already gone, to explore iterators and how powerful they can be in your own code.
What are Iterators
While we discussed this a bit above, it is time to go deeper into the world of
iterators. An iterator is basically any object that can iterate throughout a
collection. The collection doesn’t have to be only objects that are already
found in your memory, and it is not required that they are finite.
Iterables are defined as objects that have the method called __iter__, and this is the method that returns your iterator object. The iterator itself works with two methods: __iter__ and __next__. The first one returns the iterator object itself, and the second one returns the next element of your iteration. Because an iterator is its own iterator, its __iter__ method will always just return itself.
To help you keep your code working well and to make sure that things
continue to make sense when you are coding, most programmers are not
going to call either of these two methods directly. Instead, they often choose
to work with something called list comprehension. And Python can call this
up for you automatically. There may be times though when you need to do
manual work to call these up. Python will provide you with some special
functions to make this easier.
At this point, it is time to take a look at how this is done. Let's work with 'b' as our iterable. If this is true, you can use iter(b) rather than b.__iter__() to get the same thing. You can technically work with either one of these in your code since they mean the exact same thing. But the first one is cleaner and easier to read in the code compared to the second one.
On a bit of a side note, when we are working with the len() function, note that
the iterator doesn’t really have a well-defined length at this point. This is
going to be true of most of your iterators in Python. You won’t use this
function very often. But, if you do want to take a look at your chosen iterator
and then find out how many items are inside of it, then you would go through
and do this manually, or you can use the sum function.
Working with an Example of an Iterator
Some iterables will contain other objects, which will serve as their iterators,
and this means that they are not going to be iterators themselves. For
instance, the object 'list' is an iterable but not at all an iterator (instead of implementing next, it is going to implement __iter__). As you can see in the example that we are going to have below, the iterator for a 'list' object is going to use the 'listiterator' type. You may even notice how the 'list' object contains a properly defined length, while the listiterator object doesn't have that.
>>> a = [1, 2]
>>> type(a)
<type ‘list’>
>>>type(iter(a))
<type‘listiterator’>
>>>it = iter(a)
>>>next(it)
1
>>>next(it)
2
>>>next(it)
Traceback (most recent call last):
File “<stdin>”, line 1, in <module>
StopIteration
>>>len(a)
2
>>>len(it)
Traceback(most recent call last):
File “<stdin>”, line 1, in <module>
TypeError: object of type ‘listiterator’ has no len()
When an iterator is done, the interpreter expects it to raise an exception known as StopIteration. However, remember that iterators can technically keep going over an endless set. With those iterators, it is up to the user to make sure that they aren't using the program in a manner that creates a loop that keeps going on and on. It is often better to add the stopping conditions into the code yourself, to make sure that the user doesn't get lost in an endless loop that they can't get out of.
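
To pull all of this together, here is a minimal sketch of an iterator written by hand (the class name is invented for illustration). It implements __iter__ and __next__ and raises StopIteration when it is finished, so a for loop knows where to stop:

class Countdown:
    def __init__(self, start):
        self.current = start

    def __iter__(self):
        return self  # an iterator returns itself

    def __next__(self):
        if self.current <= 0:
            raise StopIteration  # tells the loop to end
        self.current -= 1
        return self.current + 1

for number in Countdown(3):
    print(number)  # prints 3, then 2, then 1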
5

EXCEPTION HANDLING AND HOW TO CREATE A


UNIQUE CODE WITH THEM

The next topic that we are going to talk about is how to handle and raise your own exceptions. As you work on your code, you will find that there are certain exceptions that the program brings up on its own, and then there are some that you will purposely add into the program to make it work the way that you want. The automatic ones will be found in the Python library. For example, if the user tries to divide by zero in your code, the Python language won't let this happen. But if there is a particular exception that you would like to raise, or something that you don't want your user to be able to do, you can add that in as well.
Now, the first type that you can work with is an exception that your compiler
is able to recognize all on its own. If the user does one of these things, then
the program won’t let it finish. This could be if you add in an improper
statement to the code, or you misspell a class so the compiler has trouble
finding it. Or, it could be when you try to divide by zero. These are a few of
the examples of exceptions that the compiler can raise automatically for you.
As a programmer, it is a good idea to know some of the different exceptions
that are already found inside the Python library. This helps you to know what
to add into the code, and when an exception may turn up for you. Some of the
exceptions, and their keywords, that will come up while you do coding
include:

finally: this is the keyword that you will use to perform cleanup actions, whether the exceptions occur or not.

assert: this triggers an exception if a condition inside the code is not met.

raise: the raise command triggers an exception manually inside the code.

try/except: this is when you want to try out a block of code and then recover from the exceptions that either you or the Python code raised.

Raising Exceptions in your Code


First, we are going to take a look at how you would handle these exceptions in your code. When the automatic ones show up, you want to be prepared and know what you can do to make them easier to understand. If you are working on a code and you notice that there is an issue, or you want to figure out why the program is doing something that seems wrong, you will be able to look and notice that the compiler is raising a new exception. This is because your program has had a chance to look through your code and is trying to figure out what you would like to do.
Many times, the issues that you see will be simple and you can easily fix
them. For example, if you are trying to bring up a file, and you gave it the
wrong name, either when you first named it or when you are calling it up,
your compiler is going to raise a new exception. The program looked through
your code and noticed that you were doing something it couldn’t help you
with.
A good way to start to see how these exceptions work is to take some time to
do our own example and then see what happens when the compiler raises one
of these exceptions. Here is a code that you can add into your compiler to see
what happens.
x = 10
y = 0
result = x/y  # trying to divide by zero
print(result)
The output that you are going to get when you try to get the interpreter to go
through this code would be:
>>>
Traceback (most recent call last):
File “D: \Python34\tt.py”, line 3, in <module>
result = x/y
ZeroDivisionError: division by zero
>>>
When you take a look at this example, your compiler is going to bring up an
error, simply because you or the user is trying to divide by zero. This is not
allowed with the Python code so it will raise up that error. Now, if you leave
it this way and you run the program exactly how it is, you are going to get a
messy error message showing up, something that your user probably won’t
be able to understand. It makes the code hard to understand, and no one will
know what to do next.
A better idea is to look at some of the different options that you can add into
your code to help prevent some of the mess from before. You want to make
sure that the user understands why this exception is being raised, rather than
leaving them confused in the process. A different way that you can write out
this code to make sure that everyone is on the same page includes:
x = 10
y = 0
result = 0
try:
    result = x/y
    print(result)
except ZeroDivisionError:
    print("You are trying to divide by zero.")
As you can see, the code that we just put into the compiler is going to be
pretty similar to the one that we wrote above. But we did go through and
change up the message to show something there when the user raises this
exception. When they do get this exception, they will see the message “You
are trying to divide by zero” come up on the screen. This isn’t a necessary
step, but it definitely makes your code easier to use!
Defining Some of your Own Exceptions in the Code
In the examples above, we simply took the time to handle exceptions that the
program already recognizes. This can provide you with a way to add
personalized messages, rather than just strings of words and code that no one
will understand. But, there is another level here. There are times when you
are creating a code and you want to be able to raise an exception all on your
own.
For example, you may be working on a code and decide that the users should
only be able to input certain numbers, and then the others are not allowed.
This may work when you are creating a game for others to play. Or you could
have an exception come up if you only want to let the user try to answer three
times. Once the user has gone through all the guesses that they are allowed,
the compiler will then raise one of these exceptions to tell the user that they
won’t be allowed to guess again.
These exceptions are unique just to your code, and if you don’t write them
into the code, the compiler will just keep going, without recognizing that it is
supposed to stop. You can add in any kind of exception that you want to this
message, using a similar idea that we went with before. The code that you
can do to make this happen includes:
class CustomException(Exception):
    def __init__(self, value):
        self.parameter = value

    def __str__(self):
        return repr(self.parameter)

try:
    raise CustomException("This is a CustomError!")
except CustomException as ex:
    print("Caught:", ex.parameter)
When you finish this particular code, you are done successfully adding in
your own exception. When someone does raise this exception, the message
“Caught: This is a CustomError!” will come up on the screen. You can
always change the message to show whatever you would like, but this was
there as a placeholder to show what we are doing. Take a moment here to add
this to the compiler and see what happens.
Exception handling is something that you will work with a lot more as you
start to write out some more advanced codes on Python. There are a lot of
times that you will work either with the exceptions that are recognized by the
program or ones that you want to bring up for the code that you are writing in
particular. Working with some of the codes that we bring up in this chapter
will help you to deal with these exceptions and will ensure that you are able
to make them look good to the user. Make sure to try a few of these codes in
your compiler to ensure that you get some practice with these exceptions and
that you are able to get a good idea of how these exceptions are supposed to
work.
6

THE PYTHON GENERATORS

Python generators are functions that help you create sequences of results. These generators are able to maintain their local state so that the function can resume right where it left off whenever it is called more than one time. You can think of your generator as a powerful iterator. The function's state is maintained with the keyword 'yield'. In Python, this is similar to using 'return', but there are some big differences that we will explore as we go through this guidebook.
How These Generators Work
The best way to see how a generator will work is to take a look at a simple
example.
# generator_example_1.py

def numberGenerator(n):
    number = 0
    while number < n:
        yield number
        number += 1

myGenerator = numberGenerator(3)

print(next(myGenerator))
print(next(myGenerator))
print(next(myGenerator))
The code that we have above defines a generator for you with the name numberGenerator that takes the value n as its argument, and uses a while loop with n as the limit value. In addition, it defines a variable with the name number and assigns the value zero to it.
When you call the instantiated generator myGenerator with the next() method, it runs the generator's code until the first yield statement. In this example, that returns 0. Even though a value has been returned, the function keeps the value of the variable number for the next time the function is called, and it grows in value by one. What this means is that it is able to start up again right where it left off at the next call of the function.
If you decided to call up this generator once more after the three calls we already have in the code, you would raise an exception. It will say StopIteration since the generator has finished up and exited its internal while loop.
This may seem a bit silly at first, but it is a useful functionality since you are
able to use these generators to help you create some iterables as you go along.
For example, if you went through and used the ‘list()’ to wrap
‘myGenerator’, you would then get an array of numbers such as [0, 1, 2] back
rather than your generator object. In some cases, this is going to be a bit
easier for you to work with.
What is the Difference Between Yield and Return
Now, there are times when you will use the 'return' keyword. You will use this to return a value from a given function, and when this happens, the function loses its local state. This means that when you try to call the function again, a second time, it has to start fresh from its first statement.
On the other hand, you can use the ‘yield’ to help keep the state between the
different functions. This method is going to ensure that you are able to get the
function to go back to where it was when you first called it up. So, the one
that you choose will depend on where you want the function to end up when
you are done. Do you want it to go back to its original place where you called
it up, or do you want it to go all the way back to the beginning?
Using Return in a Generator
A generator can use the 'return' statement, but only without a return value. The generator will then finish, just as in any other function return, when it reaches this statement. The return basically tells the program that you are done and you want it to go back to the rest of the code. Let's take a look at how you can change up the generator code simply by adding in an if-else clause so that it can discriminate against any numbers that are 20 or above. The code you would use for this includes:

# generator_example_2.py

def numberGenerator(n):
    if n < 20:
        number = 0
        while number < n:
            yield number
            number += 1
    else:
        return

print(list(numberGenerator(30)))
This particular example shows that the generator gives back an empty list. This is because we have set it up so that it won't yield any values when n is 20 or above. Since 30 is above 20, you will not get any results with this one. In this particular case, the return statement works much the same way as a break statement. But if you run this code with a value that is below 20, you will then see the numbers show up.
Some more about Generators
Remember that a generator is going to be a type of iterator, one that your
code has defined with the notation of a function that is easier for you to use.
When you use these generators, you are working with a type of function that gives you a yield expression. These won't give you a return value; instead, when they are ready, they simply hand you the results. The process of remembering the context that a generator needs is automated in this language. The context includes the values of the local variables, the location in the control flow, and more.
Now, there are some options when it comes to calling up the generator that you want to use. If you call it with the help of __next__, the yield will give you the next iteration value in line. You can also choose to work with __iter__, which is implemented automatically in your program, and it tells the program that the generator can be used anywhere an iterator is needed.
As a programmer, there are a few options that you can choose from when it
comes to working on these generators. Some of the options that you can use
include:

Generator expressions: These types of expressions give you, as the programmer, the ability to define a generator with a simple notation, much like the one used when creating a list comprehension in Python. They implement the __iter__ and __next__ methods for you and produce objects of the generator type, as the sketch after this list shows.

Recursive generators: It is possible for your chosen generator to be recursive, just like what you would find with some functions. The idea here is that you would swap each of the elements on your list with the one on top, allowing each of them to move to the first position, with the rest of the list following along.
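
Here is a minimal sketch of a generator expression, the first option from the list above. The parentheses are what make it a generator rather than a list, and the values are produced lazily, one at a time:

squares = (x * x for x in range(5))

print(next(squares))   # 0
print(next(squares))   # 1
print(list(squares))   # the remaining values: [4, 9, 16]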

When Should I Work with Generators?


A question that you may have as we are looking through this chapter is why
you should work with generators when you learn coding. As we have seen
from some of the examples that we worked on, these generators can be very
advanced tools when you are writing out your own codes. In programming,
there are times when these generators can actually improve efficiency. Some
of these scenarios include the following.

Any time that you have a lot of data that you want to process through. Generators are useful here because they can offer calculation on demand. This is a very common method used in stream processing.

You may also use stacked generators in the process of piping, in the same way that you would use Unix pipes. Put differently, you are able to use these generators to pipeline a series of operations, as the sketch below shows.
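
A minimal sketch of such a pipeline, with invented stage names: each generator consumes the previous one lazily, so no stage needs the whole data set in memory at once.

def read_numbers(data):
    for line in data:
        yield int(line)

def only_even(numbers):
    for n in numbers:
        if n % 2 == 0:
            yield n

raw = ["1", "2", "3", "4", "5", "6"]
pipeline = only_even(read_numbers(raw))
print(list(pipeline))  # [2, 4, 6]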
7

WHAT ARE ITERTOOLS IN THE PYTHON


LANGUAGE

These itertools are just a collection of tools that you can use to help you handle the iterators that occur in your code. Iterators, to put it simply, are just data types that can be used in one of your for loops. The list is one of the most common iterables used in Python, but there are many others that you may come to rely on as well.
Before you are able to get started with these, you must take the time to bring
in the itertools module. At the same time, it is often helpful to go through and
import your operator module. This is not always necessary, but you will find
that with many of your own personal codes, the operator module can be
handy to have around.
The itertools module is going to hold onto a bunch of functions. We are going
to spend some time talking about some of the different things that you can do
in your code with these itertools and why they are so good for creating strong
and powerful codes.
Itertools are really cool things that you can work with in the Python language. Even though it has a very technical-sounding name, and it is often something that is not emphasized much in beginner materials on Python, itertools is a built-in package that you can use in your programs to get more power out of them. As you work with this topic, the biggest barrier is that Python has many methods that perform very similar tasks. This chapter is going to show you some of the different methods that you can use with itertools in Python.
chain()
The first method we are going to take a look at is the chain(). This method is
going to help you provide a list of lists, iterables, and tuples and it will join
them together for you. Think about when you were younger and you would
use some tape to join pieces of paper together. This is the same process that
you will use to help you do the chain() method in Python.
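
A minimal sketch of chain() in action, with invented lists:

import itertools

letters = ["a", "b"]
numbers = [1, 2]
print(list(itertools.chain(letters, numbers)))  # ['a', 'b', 1, 2]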
count()
Let’s suppose that you want to conduct a sensitivity analysis of a very
important business simulation. This simulation is going to be about how the
cost of your tool is ten dollars, but over the next few months, you hope that
the demand for that tool is going to explode and you want to make sure that,
even if it costs you a bit more to keep up with demand, you will still make a
profit. You will then want to create a stream of theoretical costs to pass into the 'magic_business_simulation'. Using itertools.count(), you may write something that looks like the following.

((i * 0.25) + 10 for i in itertools.count())
>>> 10.0, 10.25, 10.5, 10.75, ...
This is not bad at all. It is sometimes hard to read through the chain, but it
shows that the syntax is not that hard. At this point, you may wonder how the
function that you just did is going to know when to stop. The answer is that it
won’t stop. Just like with some of the other methods that you may do in
Python, itertools will generate infinitely until you add in a break to help make
them stop.
Another thing to note is that the itertools are similar to iterables. You know
that having an infinite amount of iterables could be scary because you don’t
know when to stop. We are going to take a look in a moment at one of the
itertools that you can use to help you stop this process so you don’t get stuck
in an infinite loop.
ifilter()
This method (found in Python 2's itertools; Python 3 uses the built-in filter() instead) just needs an easy invocation of the following syntax to get it to work, assuming a list defined as numbers = [23, 42, 7, 8]:

print(list(itertools.ifilter(lambda x: x % 2, numbers)))
>>>[23, 7]
This is the itertool you will want to work with when it is time to filter out
some of the information that you have in your code. If you are trying to
eliminate some things, or you want to search for something inside the code,
you would use a code that is similar to what we have above.
compress()
This is a method that a lot of programmers like to use, and it is a perfect addition to your code. Given two lists (a and b), it returns the elements of 'a' for which the corresponding elements of 'b' are true.
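
A minimal sketch of compress(), with invented lists:

import itertools

a = ["red", "green", "blue", "yellow"]
b = [1, 0, 1, 0]
print(list(itertools.compress(a, b)))  # ['red', 'blue']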
imap()
This is the last method that we are going to look at in this chapter, and it is a simple addition for those who already know how to use some of the other programming options such as map() and filter(). When you pass it a function, it grabs the arguments systematically and then throws them at the function to return the results.
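
A minimal sketch in the spirit of imap(). Note that imap() lives in Python 2's itertools; in Python 3 the built-in map() is already lazy and behaves the same way, which is what is shown here:

results = map(pow, (2, 3, 10), (5, 2, 3))
print(list(results))  # [32, 9, 1000]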
These are the most important of the itertools that you can use when you are
creating some of your own codes. They are simple to use and you are going
to love how much they can add to your own code. Practice using some of
these itertools and find some codes that already have them in place to see just
what they are able to do for you.
8

WHAT ARE CLOSURES IN PYTHON AND WHY


ARE THEY SO IMPORTANT

Before we take a close look at what these closures are, we need to take a look at a few other parts of the Python language. Non-local variables and nested functions are also important, and they will help us see what the closure is all about. First, we will start with the nested function.
When you have one function that is used and defined inside of a different
function, that first one is going to be your nested function. These nested
functions are interesting because of how you create them and their ability to
access variables of your enclosing scope. In this language, these non-local
variables can be accessed only within their current scope, and no scopes
outside of this can find them. A good example of how this works is the
following:

# Python program to illustrate
# nested functions
def outerFunction(text):
    text = text

    def innerFunction():
        print(text)

    innerFunction()

if __name__ == '__main__':
    outerFunction('Hey!')
As you can see here, the innerFunction() part of the code is something that
your outerFunction can access, and you can use it as much as you want as
long as you are in that function. But if you leave this, or go to another part of
the code, you will not be able to access that innerFunction() part. In this case,
the innerFunction() is going to be the nested function, which will use text as
its non-local variable.
What are Closures?
Now that we have taken a look at a nested function and those non-local
variables, it is time to talk a bit more about closures. A closure is going to be
an object of a function that remembers values that are in your enclosing
scope. This happens even if those objects are not showing up in the memory.
Your closure is a record that stores a function together with its environment: a mapping that associates each free variable of the function (variables that are used locally, but defined in an enclosing scope) with the value or reference it was bound to when the closure was created.
A closure is going to be a bit different than your plain functions. These
closures will make it so that the function is able to access these variables that
you captured by going through the closure’s copies of the values or
references. This can happen, even when that particular function is called up
outside of its own scope. A good example of a code that you can write out
that works with this is the following:
# Python program to illustrate
# closures
def outerFunction(text):
    text = text

    def innerFunction():
        print(text)

    # Note we are returning the function WITHOUT parenthesis
    return innerFunction

if __name__ == '__main__':
    myFunction = outerFunction('Hey!')
    myFunction()
Take a moment to type this code into your compiler and see what happens.
What you should be able to observe from this code is that the closure is there
to help you call the function up, even when you are not in the right scope.
The function known as innerFunction has its scope present only in your
outerFunction. But when you use one of these closures, like we did before,
you are able to extend the scope so you can call it up anywhere that you
would like.
The code above did the same thing that we did originally, make it so that the
nested function is called up only inside its original function. But if you want
to be able to call up the function at any point, even when you are outside its
scope, you would need to use a code like this one.
# Python program to illustrate
# closures
import logging
logging.basicConfig(filename='example.log', level=logging.INFO)

def logger(func):
    def log_func(*args):
        logging.info(
            'Running "{}" with arguments {}'.format(func.__name__, args))
        print(func(*args))
    # Necessary for the closure to work (returning WITHOUT parentheses)
    return log_func

def add(x, y):
    return x + y

def sub(x, y):
    return x - y

add_logger = logger(add)
sub_logger = logger(sub)

add_logger(3, 3)
add_logger(4, 5)

sub_logger(10, 5)
sub_logger(20, 10)
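If you run this sketch, the four print() calls should show 6, 9, 5, and 10 on the screen, while the ‘Running ...’ messages go into the example.log file rather than the console, because of the basicConfig() call at the top.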
If you want to see the difference between these two codes, all you need to do
is bring up your compiler and try it out. You will see that the output is a bit
different, which helps you to see how the closure can help you out with your
code, so you can reach what you need regardless of whether you are in the
right scope or not.
Basically, you need to decide whether you want to be able to reach the nested
function only in its original function, or if you would like to be able to reach
it no matter where you are in the code. If you are going for the first option,
then you will need to just use a code that is similar to what we first talked
about. If you want to be able to reach the function outside of its scope, then
you need to make sure that you add in a closure to help.
When would I use Closures?
There are several different reasons why you might want to use a closure in your coding. First, these closures are often used as a type of callback function. They are also able to provide you with a form of data hiding. In your code, this is going to help you reduce the use of global variables, which can clean up the code and keep some of the bugs out of it.
In addition, these closures can help with functions. If you are trying to work
with a few functions at the same time, closures can be a good and efficient
way to deal with these. But, if you do plan to have a lot of different functions
in the code, then you will want to go with a class.
Whether you want to use closures or not will often depend on your end result. Some programmers find that they don’t need to access their nested functions outside of their current scope, so there is no reason for them to use these closures. Others may need the function to show up in other places as well, or to work outside its original scope, and a closure can make that easier to accomplish.
9
WORKING WITH REGULAR EXPRESSIONS

When you are working with Python, one thing that you will notice is that its library is really amazing. This library includes a module for what are called regular expressions, and it is the part that is responsible for handling your searching and extracting tasks behind the scenes.
These regular expressions are going to be used in your coding to help you
filter out different texts or text strings. It is possible to check and then see if a
string or a text that is already present inside your code is going to match up
with a regular expression as well. When it comes to working with regular
expressions, you will be able to stick with a similar syntax no matter which
language you choose. Learn how to do this with Python, and you will be well
on your way to using it with other coding languages as well, if you choose.
At this point, you are probably wondering what regular expressions are and
how you would be able to use them properly in your coding. A good place to
start with this process is to bring out the text editor and have your program
locate a word that was spelled in two different manners in your code. We are
going to help you do a few things with the use of regular expressions to help
clear out confusions that would come with this problem.
Of course, regular expressions open up a world of things that you are able to do with your code. This is why it is so important to learn how to use them the proper way. If you want to start using these in your code, the first step is to import the regular expression library, which in Python is the re module. You can do this when you first start up the program because you will probably use it quite a bit.
There are many regular expressions that you can choose to work with when writing out statements, and if you know what they can all do and how they work, it makes a big difference in what you can do with your code. Let’s take some time to look at the most common regular expressions, how they work, and how you are going to get them to perform correctly inside your own codes.
Basic Patterns
One thing that you are going to like about regular expressions is that you
won’t be stuck just using them for specific fixed characters. They can also
help you to watch out for some patterns if you need them to. Some of the
patterns that are common with regular expressions include:

1. a, X, 9, < — ordinary characters just match themselves exactly. The meta-characters that are not going to match themselves, because they have a special meaning, include: . ^ $ * + ? { [ ] \ | ( )
2. . (the period) — this is going to match any single character except the newline symbol '\n'
3. \w — this lowercase w is going to match a “word” character: a letter, a digit, or an underbar. Keep in mind that this is the mnemonic and that it is going to match a single word character rather than a whole word.
4. \b — this is the boundary between a non-word and a word.
5. \s — this is going to match a single whitespace character, including the form feed, tab, return, newline, and space. If you use \S, you are talking about any character that is not a whitespace.
6. ^ = start, $ = end — these are going to match the start or the end of your string.
7. \t, \n, \r — these are going to stand for tab, newline, and return.
8. \d — this is a decimal digit, any number from 0 to 9. Some of the older regex utilities do not support it, so be careful when using it.
9. \ — this is going to inhibit how special a character is. Use this when you are uncertain about whether a character has some special meaning, to ensure that it is treated just like another character.
These are just a few of the regular expressions that you can use when you work on your code. They are important to learn, so bring them out and place them into your compiler to get some experience with them. There are a lot of codes where you are going to need them, and in some cases, you may need to use more than one to help you get the results that your code needs. Here is a small sketch of a few of these patterns in action.
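The sample sentence and the patterns here are only placeholders that you can swap for your own:

import re

text = 'Order 66 was issued at 9 am.'

# \d matches a single digit, so two of them match '66'
print(re.search(r'\d\d', text).group())
# \w+ matches one or more word characters, here the word 'Order'
print(re.search(r'\w+', text).group())
# \s matches the single white space right after the 9
print(re.search(r'9\s', text).group())

Each search() call here returns a match object, and calling group() on it shows you the exact text that the pattern matched.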
How to Use your Regular Expressions to Help Do a Query in Python
In addition to looking for some of the basic patterns in a code like we did
above, you can also work with these regular expressions to help you do a
search on any input string in your code. There are three different methods
that you can use, depending on what you want to look up in the code. Each
time you use a program, you may find that you need to do a different type of
query in order to get the program to work the way that you want. Working
with regular expressions in Python can definitely help you to get this done.
Let’s take a look at how each one works.
Using your search method
First on the list is our search() method. This is a good one to use because it
allows you to match up your query to anywhere in the code. This function
doesn’t have a lot of restrictions like the other ones. If you want to search
through the whole string, rather than just at the beginning or the end of it,
then the search method is the right one for you.
The search() method is going to help you look for something that is inside your string, even if it happens to be all the way at the end of the string you are looking inside. An example of how this can work includes the following code.
import re
string = 'apple, orange, mango, orange'
match = re.search(r'orange', string)
print(match.group(0))
Take a moment to add this to your compiler and see what output you get.
This code is going to give you an output of “orange”. With this method, you
are only going to see the match one time. There could be ten oranges in the
code, the search function will just tell you if one is there, not how many of
that item are in the string. Even though there are technically two oranges in
the code above, the search() method will just return one of them to you. Once
it finds that first orange, it has done its job and will stop. Later we will discuss another method that you can use that can help you know exactly how many of an item are in the string.
Using your match method
While the search() method can do a lot for your code, there may be times
when you want to add a little bit more to it. The match() method is a bit
different because it is going to find matches to your query, but only when
they happen at the beginning of your string. It is responsible for looking for a
specific pattern inside the syntax that you are searching through.
Let’s take a look at the example that we did above. You can see that there is a pattern, where the object ‘orange’ appears between all your other words. But when you decide to use the re.match() method instead of the re.search() method like we did above, you are going to get no results.
Even though orange is in the code, it is still not the first part of the string. In
this case, ‘apple’ is. With this match method, it just looks at the first object in
your string. And since that first object is not orange in this case, it was not
able to find a match for you.
If your pattern doesn’t have the right objects in the right order when you start,
then you are not going to get the right answers here. You can change up the
pattern, but once you get the code running, that pattern is stuck and you
won’t be able to make changes. With this example, if you asked for apple,
you would get a result because apple is the first part of your string.
You can experiment with the match method simply by taking the code that
we have above and replacing the search part with the match part. This helps
you to get a new output and can make it easier to tell what kind of search you
are doing here.
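As a rough sketch of that experiment, using the same fruit string from before:

import re

string = 'apple, orange, mango, orange'

# None, because 'orange' is not at the start of the string
print(re.match(r'orange', string))
# 'apple', because the string starts with it
print(re.match(r'apple', string).group(0))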
Using the findall method
There are also times when you want to be able to find out how many of a
particular object is in your string. If you go with the other methods, you will
just find out whether the object is the first in the pattern or if that object is in
the string at all. But with the findall method, you will be able to find all of the
oranges that are in the string. Using the example above, the findall method
would provide you with the output of “orange, orange” since there are two of
them present there.
You can have as many of the same object in your string as you would like, or
you can pick out any other object as well. If you added ten more oranges in
there, the findall method would list out orange ten times. If you wanted to
find out how many apples are there, you could use the findall method and in
this example, apple will show up once.
To see how the findall method will work differently than the search method
we discussed before, take a moment and experiment. Bring up your compiler
and use the code that we had in the search method. Replace the search part
with findall and see what happens.
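A quick sketch of what that replacement might look like, again with the fruit string from before:

import re

string = 'apple, orange, mango, orange'

# ['orange', 'orange'] - findall() returns a list with every match
print(re.findall(r'orange', string))
# ['apple'] - apple only appears once
print(re.findall(r'apple', string))

Note that findall() hands you back a list, so you get every occurrence at once rather than a single match object.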
Then go through and mess around with the list a little bit. Take things away,
add more objects, and play around to see what will happen to your output
each time that you do this. This is a good way to practice your regular
expressions and you can even throw in the match method to learn better how
each of these work.
The Square Brackets in Your Code
As you work on writing some of your Python code, you will find that there
are instances when you will need to use the square brackets. These can help
you to indicate a specific set of characters and can set things apart from
others. One example of this would be when you are writing out a statement,
such as [abc], you will have a match for a, b, or c. This can make it easier for
you to get a match done rather than having to do searches individually for
each one.
Other codes, such as \s and \w, keep their special meaning inside your square brackets, so you will want to make sure you can put them inside as well. The one exception to this rule is the dot: inside square brackets, it is just a literal dot, and it no longer matches any character the way it does outside of them.
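Here is a small sketch of both points, with placeholder strings you can change as you like:

import re

# ['c', 'a', 'b'] - the brackets match any one of a, b, or c
print(re.findall(r'[abc]', 'crab'))
# one search covers both spellings, 'gray' and 'grey'
print(re.search(r'gr[ae]y', 'gray').group())
# ['user.name'] - inside brackets the dot is just a literal dot
print(re.findall(r'[\w.]+', 'user.name'))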
The examples that we did above are just a few of the examples of what you
are able to do when you work on regular expressions, and you will often use
them to make sure that the code works the way that you want. There is so
much that you can do with these regular expressions and they often make
things easier when there are some harder parts that come up with your code.
For practice here, take some time to add a few of these regular expressions
into your interpreter. Practice and see how all of the different ones work. You
can take the code that we used above and change out the regular expressions
and see what different answers you will get. Add in some more fruits, add in
more of the same type of fruit, and mess around until you get the hang of how
these regular expressions work.
10
WHAT ARE THE CONDITIONAL STATEMENTS
AND WHEN WILL I NEED TO USE THEM?

The next topic we are going to explore is the conditional statements, or the decision control statements. There are some times when you will need to get the code to do some things, or make some decisions, when you are not there. Any time that the user will be able to put in an answer on their own, rather than picking from two options, you will want to use these decision control statements to help you keep the program moving.
There are three different options that you can work with when it comes to the conditional statements: the if statement, the if else statement, and the elif statement. First, we are going to keep things simple and talk about the if statements. This one works on the idea that the condition you test is either true or false. If the user puts in information that the program evaluates as true, then the interpreter is going to run the indented block and show the information that you want. But if the user puts in information that evaluates as false, the program skips that block, and since there is nothing else for it to do in this example, it simply ends.
To help show this a bit better, let’s look at an example of how the if statement
will work.
age = int(input("Enter your age: "))
if age < 18:
    print("You are not eligible for voting, try next election!")
print("Program ends")
Let us explore what is going to happen with this code when you put it into
your program. If the user comes to the program and puts that they are
younger than 18, then there will be a message that shows up on the screen. In
this case, the message is going to say “You are not eligible for voting, try
next election!” Then the program, as it is, is going to end. But what will
happen to this code if the user puts in some age that is 18 or above?
With the if statement, nothing extra will happen if the user says that their age is 18 or above. The if statement has just one branch, and it only runs when the answer that the user provides matches up with the condition that you set in your code. The user has to put in an age under 18 with the if statement in this situation, or nothing but the final line will show up on the screen.
Now, this could cause some problems. You want the user to put in the answer
that works the best for their age. Some of the users who come to your
program will have an age that is above 18 and you do not want the program
to just end without anything there just because they are older. This can look
unprofessional and most people will have no idea why the code ended
suddenly. This is why the if statements are not used all that often. They are
too simple and they don’t help the program make many decisions based on
the information that they get from the user.
This is why the if else statements are often used instead. These take the idea
that we just talked about above, and expands it a little bit further. Let’s say
that you have the program above and you want to have a result come up
regardless of the age that the user inputs into the program. So, we can
separate people out based on their age, with a group for those under 18 and a
group for those 18 and above. The code that you can use to make this work
includes the following:
age = int(input("Enter your age: "))
if age < 18:
    print("You are not eligible for voting, try next election!")
else:
    print("Congratulations! You are eligible to vote. Check out your local polling station to find out more information!")
print("Program ends")
As you can see, this really helps to add some more options to your code and
will ensure that you get an answer no matter what results the user gives to
you. You can also change up the message to say anything that you want, but
the same idea will be used no matter the answer that the user gives.
You have the option to add in some more possibilities to this. You are not
limited to just two options like we have above. If this works for your
program, that is just fine to use. But if you need to use more than these two
options, you can expand out this as well. For example, take the option above
and expand it to have several different age groups. Maybe you want to have
different options to come for those who are under 18, those that are between
the ages of 18 and 30, and those who are over the age of 30. You can separate
it out in that way and when the program gets the answer from the user, it will
execute the part that you want.
Another example is if the program wants to have the user pick out their
favorite color. You can make a list of six colors that you have in the code and
the corresponding message that will come with it. You may pick out red,
blue, purple, green, yellow, and orange. Then, the user can also pick another
color, but you add a catch all to the end. This way, if the user chooses black
as their favorite color, a seventh and final message will come up.
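As a rough sketch of that color menu, shortened to three colors plus the catch all (the messages are only placeholders), and using the elif keyword that we will cover properly in the next section:

color = input('What is your favorite color? ')

if color == 'red':
    print('Bold choice!')
elif color == 'blue':
    print('Calm and cool.')
elif color == 'purple':
    print('A royal pick.')
else:
    # the catch all for black or anything else we did not list
    print('That one is not on our list, but it sounds like a good one!')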
Adding a catch all to the end of your code, or the “else” part of this, can be
important. You can’t always think about all the different examples that the
person may put in. You could put a hundred options into your code (which
would take a lot of time and be messy and not really necessary), and then the
user will name a color differently or pick the one color that you have
forgotten. If you don’t have that as an option, then the program won’t know
how to behave from there.
The else statement in here is important because it helps you to catch all of the
remaining answers that the user could give you. If you don’t have a statement
in the code to handle the answer that the user gives, then the else statement
will make sure to get you covered. Just make sure that you have that else
statement in place to get it done.
Working with the elif Statements
The if statements are good ones to work with when you first start to work in
the Python language and you want to know how these conditional statements
work. They are based on the idea of a true and false answer. If the answer the
user gives is deemed true based on your preset conditions, then the program
will continue on its path. But if the condition is deemed false based on your
conditions, then the program will end. This is pretty simple to work with, but
it is often too basic for the ideas that we want to work on here.
With the if else statements, we took this a bit further. We looked at the idea
that the if statement was too simple and we wanted to be able to handle any
answer that the user brought to us. The if else statement was able to handle
this. We looked at a simple example of that and how you could make it work,
and then discussed how you can expand it out to handle a lot of different
options from the user based on the input they give.
Now it is time to work on the elif statements. The elif statement is a great
thing that you can do with the Python program. These allow your user to pick
from a few choices that you present them, and then, depending on the answer
that they give, the program will execute and give the right results.
You will see these elif statements in many different places. One option is in
games. If you have ever played a game or been on a program that gave you a
menu style of choices to make, then you have seen the elif statements in
action. These statements are often used if you want to provide more options,
rather than one or two, to your user.
With the elif statement, you do have some freedom. You can choose to have
as many of these statements present in the code as you want, as long as you
write out the code in the proper manner and you make sure that you add in
the right function to go along with them. In addition, having too many of
these could mean that you have a complicated code that is a bit harder to
write out than others, but if it works well for your program, then it is just fine
to add in as many as you would like.
To better understand how these elif statements are going to work, here is a
good example of the syntax that comes with these statements.
if expression1:
    statement(s)
elif expression2:
    statement(s)
elif expression3:
    statement(s)
else:
    statement(s)
This is a pretty basic syntax of the elif statement and you can add in as many
of these statements as you would like. Just take that syntax and then place the
right information into each part and the answer that is listed next to it. Notice
that there is also an else statement at the end of this. Don’t forget to add this
to your code so that it can catch any answer that the user puts in that is not
listed in your elif statements.
To help you better understand how these elif statements work and how the
syntax above is going to work, let’s take a look at a little game that you can
create using these statements.
print("Let's enjoy a Pizza! Ok, let's go inside Pizzahut!")
print("Waiter, please select a Pizza of your choice from the menu")
pizzachoice = int(input("Please enter your choice of Pizza: "))
if pizzachoice == 1:
    print('I want to enjoy a pizza napoletana')
elif pizzachoice == 2:
    print('I want to enjoy a pizza rustica')
elif pizzachoice == 3:
    print('I want to enjoy a pizza capricciosa')
else:
    print("Sorry, I do not want any of the listed pizzas, please bring a Coca Cola for me.")
Now the user is going to be able to go through and make the choices that they
want and they will get the right option to meet with them. For example, if
they want to go with the pizza rustica, they will pick the number 2. If they
want to have just a drink rather than one of the other choices above, they can
do that too. While we did use pizza as an example in here, there are a lot of
other things that you can do with it, so pretty much if you want your user to
have some options, you would use the syntax that is above and then fill in the
options that work the best for you.
The conditional statements are great options to work with because they
provide you with a lot more power than you can get with some of the other
options in coding. It is a great way to make sure that you are able to make a
program that can make decisions, without you having to come up with every
possible scenario and without you having to be there to make the decisions as
well. Try out a few of these conditional statements in your compiler and
experiment a bit to see all the amazing things that you can do with these
conditional statements.
11
DO I NEED TO LEARN ASSERT HANDLING IN
THIS LANGUAGE

Now it is time to work with some assertions in Python. These are going to be similar to some of the other topics that we have explored in this guidebook, but there are some differences as well. You will often see the topic of assertions coming up when you talk about exception handling, so working with both of these at the same time can help.
In the Python language, an assertion is going to be a type of sanity check that you can choose to turn either on or off when you are done testing your program. The easiest way to remember an assertion is that it is like a raise-if statement, or a raise-if-not statement. An expression is going to be tested, and if you get a result that is false, an exception will be raised. Your assertion is carried out with the help of an assert statement. Many programmers will end up placing assertions at the start of their functions in order to check for valid input. They may also put one after calling up a function to check whether the output is valid as well.
What is an Assert Statement?
When you add in an assert statement to your code, the program in Python is
going to evaluate the expression that comes up next, which we hope is true.
However, if Python evaluates this and finds that the expression is false, it is
going to raise up the exception AssertionError. The syntax that you can use
for the assert statement includes:
assert Expression[, Arguments]
If you use this and the assertion fails, Python is going to use the ArgumentExpression as the argument for the AssertionError. These types of exceptions can be caught, and you can handle them similarly to what we did earlier with exception handling. You can use the try-except statement to handle it properly.
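A minimal sketch of this, assuming a simple unit conversion function as the example:

def km_to_miles(km):
    # raise AssertionError if the caller passes a negative distance
    assert km >= 0, 'Distance cannot be negative'
    return km * 0.621371

try:
    print(km_to_miles(10))    # works fine: 6.21371
    print(km_to_miles(-5))    # fails the assertion
except AssertionError as error:
    print('Caught an assertion error:', error)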
There will be times when the Python code will not be able to handle the
exception that is raised with that statement. If the program can’t handle it,
then it is going to terminate your program and will produce a traceback. But
for the most part, this assertion is going to help you handle any issues that
come up and it is mostly there to help you catch anything that can go wrong
in the program.
The assertion handling is very important to the code that you write. It helps to
make sure that all issues inside the code are caught and handled before you
send out the program. You will be able to tell whether a statement is true or
not, and whether or not the program will be able to handle it on its own, or if
you need to go back through and make some changes. Either way, it is a good
idea to add this in, especially if you are going to be dealing with any
exceptions in the code as well. It is a safe and effective way to test out your
code and catch any issues or mistakes ahead of time.
The Importance of Testing Your Code
The assert statement is a great way to test out parts of your code. It gets the program to go through and test out some of the things that are inside the code so you can catch any potential problems or bugs ahead of time. Getting used to the process of writing testing code, and then running it alongside your original code, is considered a good coding habit. When you use it wisely, this is a good method to help you precisely define the intent of your code. However, if you are about to get started with testing, there are some general rules that you need to follow, including:

The unit that you use for testing should keep its focus on a tiny bit of
the functionality. Then it should work to prove that part is correct.
Each of the units that you are testing needs to be fully independent. Each test must have the ability to run alone, as well as within the test suite, regardless of the order it is called in. The implication of this rule is that each test needs to be loaded up with a fresh dataset, and it needs to clean up a bit afterward. You can handle this with two methods known as setUp() and tearDown(); there is a small sketch of them after this list.
When creating a test, you need to work on creating one that is able to
run fast. If you find that even one of your tests takes more than a few
milliseconds to run, then the whole process is slowed down, or the
tests won’t be able to run as often as you need. There are some times
when the tests just can’t be as fast as you want because they need to
work with some complex data, and then you need to load up the
structure each time you run the test. You should keep some of these
heavier tests in their own area so that the other tests can run when
needed.
You need to fully understand all the tools that come with your
program and learn how to run either a test case or a single test. Then,
when you are developing your own function inside a module, you
can do these tests as often as possible. Set it up so that this happens
automatically any time that you save your code.
You should always go through and run a full test suite right before
you get started with a session of coding, and then consider running it
again when you are done. This helps you feel more confident that
nothing was broken when you worked on the code.
If you are working on a development session and then have to leave
right in the middle, you can write in a broken unit test about what
you are planning on working on next. Then, when you come back to
the work, you will still have a pointer there where you can get right
back on track.
When you are doing a code and trying to debug it, you should work
on a new test that is responsible for finding the bug. This isn’t
always something that is possible, but the bug catching tests are
going to be very valuable when you do your project.
When you are testing out a function, make sure that you use
descriptive and long names. The style guide for this point is often
going to be a bit different than what you would do when running a
code, for those you want names that are kind of short. The reason
that your testing functions need to be longer is because you want
them to display on the screen when the test fails. When you have
them as descriptive as possible, it is easier to tell what is going on in
the code.
When you are working on the code and you find that something goes
wrong or you need to go through and change something, and if the
code has gone through a good set of tests, then you are able to rely
on the testing suite to help fix your problems. The testing code needs
to be read as much, and sometimes even more than running a code.
This isn’t always a bad thing to rely on the testing code. But you do
need to be careful and make sure that the testing code you work with
is secure and that it will actually be able to catch any of the problems
that come up inside your actual code.
Another way that you are able to use this testing code is as a type of
introduction to some new developers. When you have someone who
will work on the code base, having them read through your testing
code is a good place for them to start. If they have to add in some
functionality, then they need to go through and add in a test to make
sure that this functionality is not something that is already there and
just needs to be adjusted to work properly.
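Here is a minimal sketch of setUp() and tearDown() in action, using the unittest module from the standard library; the test class and its list are only placeholders:

import unittest

class TestListQueue(unittest.TestCase):
    def setUp(self):
        # runs before every single test, so each one gets a fresh dataset
        self.items = [1, 2, 3]

    def tearDown(self):
        # runs after every single test, cleaning up again
        self.items = None

    def test_pop_returns_last_item(self):
        self.assertEqual(self.items.pop(), 3)

    def test_pop_shrinks_the_list(self):
        self.items.pop()
        self.assertEqual(len(self.items), 2)

if __name__ == '__main__':
    unittest.main()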

Testing your program is a great way to make sure that it is going to work the
proper way and that there isn’t going to be any major bugs or problems that
will show up in it when someone else tries to use the code. Use the assert
statement and the steps that are above, and you will be able to get your code
up and running the way that you want.
12
HOW TO WORK WITH LOOPS IN YOUR PYTHON
CODE

The next topic that we are going to discuss inside Python is the idea of a loop. These are going to be very important to your code and can work well with some of the conditional statements that you will use in your coding. Loops can clean up your program, can help you get a lot done in a few lines of code, and are a wonderful way to make the code really intense and powerful without having to learn a lot of new things.
These loops are going to be really helpful when you are writing a code where you need the program to repeat something, at least a few times, within the code, but you don’t want to make the code messy and write out those lines a bunch of times. It isn’t a big deal to go in and write the line two or three times, but if you want something to repeat one hundred times, or an infinite number of times until a specific result is reached, then the code can be tedious and messy if you write it all out. A loop will be able to handle this for you, and you can get it all done in just a few lines.
For example, let’s say that you are working on a code where you want all the numbers from one through ten listed out; you don’t want to write out that many lines of code to tell the compiler to do this. These loops can do the work for you, making the program repeat itself until a certain condition (one that you can set) is met.
While this may sound a bit complex, these loops are actually really easy to work with. These loops are there to work with your compiler, telling it that it should just repeat that same block of code over and over again. It will do this until a condition that you inserted is met. If you want your code to be able to count up from one to ten, you would just tell the compiler that you want it to stop once the output is higher than ten. We will take a look at a few examples of codes that work like this so you can get a better idea of how this works.
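As a first taste, here is a minimal sketch of a loop that counts from one to ten:

count = 1
while count <= 10:
    print(count)
    count = count + 1   # without this line, the loop would never stop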
Of course, when writing a loop condition, you need to be careful about
getting the condition set up. If you don’t set up your condition from the
beginning, then the program will just keep reading the code over and over
again, getting stuck in a continuous loop. You need to have a condition or a
break in your code to help it stop and move on to the next thing the program
should do.
With the traditional methods of coding that you may have used in the past,
you would have to write out every line of code. Even if there were some
similar parts of the code that were the same, or you were basically retyping
the same piece of code over and over again, that is how you had to do it as a
beginner because that is the only way that you knew how to do things.
With the help of these loops, you can get rid of that way of thinking. You can
combine a lot of lines of code into just a few and instead convince the
compiler to read through that same line as many times as you need. If you
need it to do it 100 times, then that is what the compiler will do. With one
line of code, thanks to these loops, you can get a ton of things done without
having to write out 100 lines, or more, of code.
With that said, there are a few types of loops that are available for you to try
out. The one that you pick will depend on what you want to happen inside the
program and how many times you want the code to go through a loop. Let’s
take a look at some of these loops, including the for loop, the while loop, and
the nested loop.
The While Loop
The first type of loop that you can work within your Python code is known as
the while loop. The while loop is the type that you will use if you want to
make sure that the code goes through a cycle a predetermined number of
times. You can set this number of times when you write the code to make
sure the loop goes for as long as you would like.
With the while loop, your goal is not to make the code go through its cycle an
indefinite amount of times, but you do want to make sure that it goes through
for a specific number of times. If you are counting from one to ten, you want
to make sure it goes through the loop ten times to be right. With this option,
the loop is going to go through at least one time and then check to see if the
conditions are met or not. So, it will put up the number one, then check its
conditions and put up the number two, and so on until it sees where it is.
To give us a little bit better understanding on how these loops work, let’s take
a look at some sample codes of the while loop and see what happens.
counter = 1
while counter <= 3:
    principal = int(input("Enter the principal amount: "))
    numberofyears = int(input("Enter the number of years: "))
    rateofinterest = float(input("Enter the rate of interest: "))
    simpleinterest = principal * numberofyears * rateofinterest / 100
    print("Simple interest = %.2f" % simpleinterest)
    # increase the counter by 1
    counter = counter + 1
print("You have calculated simple interest for the 3rd time!")
Before we move on, take this code and add it to your compiler and let it
execute this code. You will see that when this is done, the output is going to
come out in a way that the user can place any information that they want into
the program. Then the program will do its computations and figure out the
interest rates, as well as the final amounts, based on whatever numbers the
user placed into the system.
With this particular example, we set the loop up to go through three times.
This allows the user to put in results three times to the system before it moves
on. You can always change this around though and add in more of the loops
if it works the best for your program.
What is the for loop, and why is it different from the while loop?
The while loop that we discussed above is a great one to use. It is going to
work in many of the situations where you want to work with a loop and it
will often be enough. But there are times when you will need something a bit
different, and that is when the for loop can be useful. The for loop is actually
the one that is considered the more traditional method to use loops in coding,
but it is also one that you can use with many of your codes.
With the for loop, you will have it set up so that the user isn’t the one who
goes in and gives the program information that determines when the loop will
stop. Instead, the for loop is set up to go over the iteration in the order that
things show up inside your statement, and then this information is going to
show up on your screen. There isn’t any need for input from an outside force
or user, at least until it reaches the end. An example of the code that you can
use to work on a for loop includes:
# Measure some strings:
words = ['apple', 'mango', 'banana', 'orange']
for w in words:
    print(w, len(w))
When you work with the for loop example that is above, you are able to add
it to your compiler and see what happens when it gets executed. When you do
this, the four fruits that come out on your screen will show up in the exact
order that you have them written out. If you would like them to show up in a
different order, you can do that, but then, you need to go back to your code
and rewrite them in the right order, or your chosen order. Once you have
them written out in the syntax and they are ready to be executed in the code,
you can’t make any changes to them.
The Nested Loop to Finish It Out
And the final type of loop that you can choose to work with is known as the
nested loop. This one is going to work slightly different than the while loop
and the for loop, but there are times when it is helpful to use this kind of a
loop. When you do work with one of the nested loops, you are taking one
loop and then placing it so it is inside of another one. Then both of these
loops are going to keep on running until they are done.
This may seem like a silly thing to add into your code, but there are many
times when you write out a code that will need this. For example, you may be
working on a code where you need to write out your own multiplication
table. Maybe you want to be able to write it so it goes from 1 all the way to
10 and has the answers for each one in there.
This would be a huge amount of code if you wrote out each line to tell the
program how to behave. And you can certainly do that if you want to waste
some time practicing your code writing. But a better method to use to make
this work, a way that would get it done in relatively few lines of code and
save you time includes the following:
# write a multiplication table from 1 to 10
for x in range(1, 11):
    for y in range(1, 11):
        print('%d * %d = %d' % (x, y, x * y))
When you get the output of this program, it is going to look similar to this:
1*1 = 1
1*2 = 2
1*3 = 3
1*4 = 4
All the way up to 1*10 = 10
Then it would move on to do the table by twos, such as this:
2*1 = 2
2*2 = 4
And so on, until you end up with 10*10 = 100 as the final spot in the sequence.
Go ahead and put this into the compiler and see what happens. You will
simply have four lines of code, and end up with a whole multiplication table
that shows up on your program. Think of how many lines of code you would
have to write out to get this table the traditional way that you did before. This
table only took a few lines to accomplish, which shows how powerful and
great the nested loop can be.
The loops are great options to add into your code. There are a lot of reasons
when you would need to take a loop and add it inside your code. You will be
able to use it as a way to get a lot of coding done in just a few lines, and a
way to clean up the code so that you can still get the same thing done without
writing out too much. The compiler is set up to keep reading through the loop
until the condition that you set is no longer valid. This can open up a lot of
things that you are able to do with your code, while also keeping things clean
and manageable all at the same time.
13
WHEN TO USE USER-DEFINED FUNCTIONS IN
YOUR CODE

Functions are a common part of all programming languages that you will use. A function is a block of reusable code that you write in order to perform a specific task. But when you define a function in Python, you need to know that there are two types before you get started: user-defined and built-in. The built-in functions come with the libraries and packages in Python, while the user-defined functions are the ones that a developer writes to help with certain projects. In Python, all functions are treated like objects, which can make them a lot easier to work with compared to some of the other high-level coding languages.
In this chapter, we are going to spend some time focusing on just one type of
function, the user-defined functions. To help us understand this concept, we
will learn how to make these user-defined functions work for us by
implementing them in some of our code examples. Before we do that though,
we are going to look at a few important concepts that help us understand
what is going on better with user-defined functions.
Why Are These Types of Functions So Important in Python?
In general, a developer can either write out their own user-defined function,
or they are able to borrow it from a third-party library that isn’t associated
with Python directly. These functions are sometimes going to provide you
with an advantage depending on how and when they are being used in the
code. Some things to remember about these user-defined functions and how
they work include:
These functions are going to be made out of code blocks that are reusable. It is necessary to only write them out once, and then you can use them as many times as you need in the code. You can even take that user-defined function and use it in some of your other applications as well.
These functions can also be very useful. You can use them to help with anything that you want, from writing out specific business logic to working on common utilities. You can also modify them based on your own requirements to make the program work properly.
The code is often going to be friendly for developers, easy to maintain, and well-organized all at once. This means that you are able to support a modular design approach.
You are able to write out these types of functions independently, and the tasks of your project can be distributed for rapid application development if needed.
A user-defined function that is thoughtful and well-defined can help ease the process of developing an application.
Now that we know a little bit more about the basics of a user-defined
function, it is time to look at some of the different arguments that can come
with these functions before moving on to some of the codes that you can use
with this kind of function.
The Function Arguments You Can Use
When you are working with Python, these user-defined functions have the ability to take on four different argument types. These types, and their meanings, are pre-defined, and the developer is not able to change them. Instead, the developer has to follow these rules and can then add in some things of their own to make custom functions. Here are the four types of arguments that you can use, along with the rules that go with them:

Default arguments: This coding language has its own way to represent default values in the syntax for your function arguments. A default value indicates that the function argument is going to take that value if no value for the argument is passed during your function call. You will be able to tell that a default value is there by the equal sign (=).
Required arguments: These are the arguments that are mandatory in your function. You need to have these values passed, in the correct order and number, when the function is called, or the code isn’t going to work right.
Keyword arguments: Another option that you can work with is the keyword arguments. These are also relevant to the function calls in Python. These keywords are mentioned in your function call, along with the values that go with them. These keywords are mapped to the function arguments to make it easier to identify the right values, even if the order doesn’t stay the same during the call. This one helps to keep everything organized in the code.
Variable number of arguments: This is another option that you can use, and it can be useful when you don’t know the exact number of arguments that you need to pass to a function. You can also design it in a way where any number of arguments can be passed, as long as they meet the requirements that you set. A short sketch of all four argument types follows below.
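Here is a small sketch that touches all four of these argument types; the function names and values are only placeholders:

def greet(name, greeting='Hello'):    # name is required, greeting has a default
    print(greeting + ', ' + name + '!')

greet('Ada')                          # required argument only: Hello, Ada!
greet('Ada', greeting='Welcome')      # keyword argument: Welcome, Ada!

def total(*numbers):                  # variable number of arguments
    return sum(numbers)

print(total(1, 2, 3, 4))              # 10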

How to Write One of these User-Defined Functions
Now that we have some ideas about the argument types that you can work on
in Python, it’s time to learn the steps that you need in order to get this done.
There are four basic steps that are needed to make this happen. You can make it as simple or as difficult as you would like, but we are going to start with some of the basics. The basic steps to write out your own user-defined function include:

1. Declare your function. You will need to use the def keyword and then have the name of the function come right after it.
2. Write out the arguments. These need to be inside the two parentheses of the function. End this declaration with a colon to keep up with the proper writing protocol in this language.
3. Add in the statements that the program is supposed to execute at this time.
4. End the function. You can choose whether you would like to do it with a return statement or not.

An example of the syntax that you would use when you want to make one of
your own user-defined functions includes:
def userDefFunction(arg1, arg2, arg3, ...):
    program statement1
    program statement2
    program statement3
    ...
    return
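To make that syntax concrete, here is a minimal sketch of a real user-defined function built with those four steps; the simple interest formula is just a convenient example:

def simple_interest(principal, years, rate):
    # steps 1 and 2: the def keyword, the name, the arguments, and the colon
    interest = principal * years * rate / 100    # step 3: the statement to execute
    return interest                              # step 4: end with a return

print(simple_interest(1000, 2, 5))   # 100.0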
14
WORKING WITH MEMOIZATION IN PYTHON

The final topic that we are going to talk about in this guidebook is known as memoization. This is the method of caching a function call’s results. When you memoize a function, it is only evaluated after looking up the results that you obtained the first time you passed those parameters to your function. The log for this is often known as the memoization cache. In some situations, you are going to find that the lookup fails. This simply means that the function hasn’t been called with those parameters yet. Only at that time would running your function really be necessary.
Memoization doesn’t make much sense unless the function is deterministic, or you can simply accept the result as out of date. But if your function is expensive, a big speedup can happen when you use this process. Let’s back up a bit and see what this all means.
As a programmer, you know that recursion makes it easy for you to break up a big problem into pieces that are smaller and more manageable. Try comparing iterative solutions against the recursive solutions for a Fibonacci sum. Recursive solutions are often simpler to read and to write for a branching problem. You will notice that graph traversals, mathematical series, and tree traversals are often done with recursion. Even though it does offer you a ton of convenience, the computational time that comes with naive recursion can be very big.
Doing Manual Memoization
The first approach that we are going to use requires you to take advantage of a feature of Python that most people are not that excited about, the mutable default argument, in order to add state to a function. We can do that with the following code:
def fib_default_memoized(n, cache={}):
    if n in cache:
        ans = cache[n]
    elif n <= 2:
        ans = 1
        cache[n] = ans
    else:
        ans = fib_default_memoized(n - 2) + fib_default_memoized(n - 1)
        cache[n] = ans
    return ans

The basic logic that comes with this is pretty obvious. The cache is going to be the dictionary of results from your previous calls to fib_default_memoized(). The n parameter is the key, and the stored value is the nth Fibonacci number. If n is already in the cache, you are done. But if it is not, then you have to take the time to evaluate it the way the naive recursive version would, and keep it in the cache before returning the result.
The trick here is that cache is the function’s keyword parameter. Python is going to evaluate default parameter values only one time, which is when the function is defined. This means that if your parameter is mutable, it is only going to be initialized once. This is usually the basis of small bugs that happen in programs, but in this case, you are going to mutate your parameter on purpose in order to take advantage of it.
Manual Memoization: Objects
Some programmers who use Python argue that going through and mutating your formal parameters is a bad idea. For others, especially those who like to work with Java, the argument is that all functions that have state need to be turned into objects. An example of how this would look in your compiler includes the following:
class Fib():
    cache = {}

    def __call__(self, n):
        if n in self.cache:
            ans = self.cache[n]
        elif n <= 2:
            ans = 1
            self.cache[n] = ans
        else:
            ans = self(n - 2) + self(n - 1)
            self.cache[n] = ans
        return ans

If you are doing this one, the __call__ dunder method is going to be used to
make the Fib instances behave like a function. The Cache is shared by all the
Fib instances because that is its class attribute. When you are looking at
Fibonacci numbers, this is a desirable thing to do. However, if your object
made calls to a server well defined in the constructor, and the result was
going to depend on the server, this may not be a good thing. Instead, you
would need to move it over to an object attribute by taking it right to the
‘__init__’ part. Either way, you will get the speed up process from this.
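Using it is as simple as calling an instance like a function; a quick sketch:

fib = Fib()
print(fib(10))   # 55, computed once and then served from the shared cache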
Manual Memoization: Using ‘global’
Another thing that we can work on with this process is manual memoization with the help of the global keyword. You can evade the default parameters and their hacky mutations simply by declaring the cache as global. This is one thing that sometimes gets a bad reputation with programmers, but it is a good one to learn how to use. You would use the global declaration here with the same kind of logic that we had above; a rough sketch of it follows.
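A minimal sketch of that global version, following the same logic as the earlier function:

cache = {}

def fib_global_memoized(n):
    global cache    # the point of this version: the cache lives at module level
    if n in cache:
        ans = cache[n]
    elif n <= 2:
        ans = 1
        cache[n] = ans
    else:
        ans = fib_global_memoized(n - 2) + fib_global_memoized(n - 1)
        cache[n] = ans
    return ans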
Decorators
The last thing we are going to talk about here is the decorator. This is simply a higher-order function, which means that it takes a function as its argument and then returns another function to you. When it comes to these decorators, the returned function is usually the original function augmented with some extra behavior. An example of this would be to make a decorator that allows you to print text each time a function is called. The way that you can write this out is:
def output_decorator(f):
    def f_():
        f()
        print('Ran f...')
    return f_
You can take the decorated version to replace f. You just need to do f = output_decorator(f). Then, just by calling f(), you are going to get your decorated version. Python makes this even easier if you use the following syntax to help.
@output_decorator
def f():
    # ...define f...
Now, if you go through and try to do this, you will find that the result from output_decorator is not that motivating. But you can go beyond this and augment the operation of the function itself. For example, you could include a type of cache with the decorator and then intercept the calls to the function if needed; a sketch of that idea follows.
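Here is a minimal sketch of that idea, a memoize decorator that caches results for a single-argument function; the names are only placeholders:

import functools

def memoize(f):
    cache = {}
    @functools.wraps(f)    # keeps f's name and docstring intact for introspection
    def wrapper(n):
        if n not in cache:
            cache[n] = f(n)    # intercept the call: only run f on a cache miss
        return cache[n]
    return wrapper

@memoize
def fib(n):
    if n <= 2:
        return 1
    return fib(n - 2) + fib(n - 1)

print(fib(30))   # 832040, and fast, because repeated calls hit the cache

The functools.wraps() call in there is also a preview of the introspection issue discussed next: without it, the decorated function would report the name wrapper instead of fib.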
But if you try to write out your own decorator, there are times when you get confused in the particulars of the argument passing, and then get really stuck with Python’s introspection when you try to figure this out. Introspection is the capacity to determine the type of an object while the program runs. This is one of the strengths of the Python language, but if you are using a decorator, things can become messy.
If you are going to use one of the decorators, be careful with what you are
doing here. You want to make sure that you understand how to make them
work and that you actually need to use it in your code. Otherwise, you may
run into some issues with the code, and it may not interpret in the compiler
the right way.
CONCLUSION

Thank you for making it through to the end of Python Programming. We hope it was informative and able to provide you with all of the tools you need to achieve your goals, whatever they may be.
The next step is to take a look at some of the topics and the things that we
discuss in this guidebook, and put them to use. There are many different
things that you will be able to do with the Python language, and this
guidebook aimed to help you get started with a few of the more complex
parts that you may want to add into your code. When you are done, you will
be able to combine this information with your basics and make some really
powerful codes.
When you have spent some time working on the Python language and you are
ready to take your skills to the next level and develop some strong codes that
can do so much in just a few lines, make sure to read through this guidebook
to help you get started!
Finally, if you found this book useful in any way, a review on Amazon is
always appreciated!
