Research Essay

Fischer 1
Richard Fischer
Mr. Rudebusch
English Comp IV
19 December 2016
The Effect of Machine Learning on Predictive Analytics
Businesses need to make decisions that will make them better prepared for the future. To
make smart decisions, they need to have an idea of what will be coming in the future. Many
businesses take data from the past and use it to predict what will happen in the future. However,
these businesses are growing larger and larger, and they are collecting more and more data.
Today, these businesses are growing large enough that the amount of data they have is larger that
they can analyze using the same traditional method. Some experienced employees believe that
their method works best. They believe that the computer predictions are inaccurate and that
machine learning doesnt properly represent the problem. However, others believe that the use of
machine learning algorithms improves the predictions because it can produce more complex
calculations in much less time. Although experienced employees believe that human intuition
and past knowledge alone works best for predictive analytics, the use of machine learning
improves predictive analytics by increasing the accuracy of predictions and lowering
significantly the time taken for this process.
Businesses need as much information as they can get to plan for the future. One way that
they are able to get this information is through the field of predictive analytics. In predictive
analytics, data from the past is taken and analyzed for patterns and trends. These patterns and
trends are then used to make predictions about the future. Predictive analytics has many different
Fischer 2
uses in many different fields. It can be used by businesses to predict what products a customer
might buy so they can better market the product to the customers. They can also use it to predict
how many customers they might have on a given day, so they know how much inventory they
need and how many employees they need to schedule that day. Insurance companies can use it to
predict the risk of something happening to a person, so they know whether or not to cover that
person or not. The businesses can make decisions based on these predictions and be better
prepared for the future. Predictive analytics is very important to businesses, but it is becoming a
more expensive process.
As time goes on, businesses are continually getting bigger. As they grow, they have an
increased amount of customers. More customers give the business more data to use for predictive
analytics. Hirak Kashyap, a professor at the University of California, writes, With digitization
of all processes and availability of high throughput devices at lower costs, data volume is rising
everywhere, including in bioinformatics research (Kashyap 2). In bioinformatics, complex
biological data like genetic codes is collected and analyzed. The amount of this biological data is
rising. Today, the use of a computer in our everyday lives has become much more common.
Also, advances in technology have given us much more digital storage. These things combined
have led to a significant increase in the amount of data available for businesses. This increase in
data volume is making predictive analytics more expensive.
The traditional method requires a person to periodically work with the data. Usama
Fayyad, a professor at the University of Minnesota Duluth, discusses the methods and techniques
for data analysis. Fayyad describes a process in the traditional method of data analysis.
Fischer 3
For example, in the healthcare industry, it is common for specialists to
periodically analyze current trends and changes in health-care data, say, on a
quarterly basis. The specialists then provide a report detailing the analysis to the
sponsoring health-care organization; this report becomes the basis for future
decision making and planning for healthcare management. (1)
With predictive analytics, new data constantly needs to be collected. When new data is
received, the trends and patterns in the data may change. To ensure that the predictions continue
to be accurate, the changes in the trends and patterns need to be accounted for. Traditionally, this
needs to be done manually. There is a healthcare specialist that will look at the changes in the
data every quarter. They will then write a report about the changes that will be used in the future
for healthcare planning and decision making. With this method, a person needs to take time
every quarter to look at the data and write a report. If there is more data, it will take the person
longer to analyze it. When more time is needed, more employees are needed and the business
will spend more money paying them. By this method, an increased amount of data will result in
an increased cost of predictive analytics.
However, they are starting to collect enough data that it is becoming very slow and
expensive to analyze it. He says that traditionally, the method of data analysis relies on manual
analysis and interpretation (Fayyad 1). Fayyad calls this method slow, expensive, and highly
subjective (Fayyad 2). Because of the high expense he notes that as data volumes grow
dramatically, this type of manual data analysis is becoming completely impractical in many
domains (Fayyad 2). As I explained in the paragraph above, by increasing the amount of data,
the cost of predictive analytics will increase. The amount of data is starting to grow large enough
Fischer 4
that predictive analytics is becoming too expensive for some businesses. These businesses are
spending more money for predictive analytics than they would get from it, making predictive
analytics impractical.
However, while these businesses are growing, technology is also improving. Machine
learning technology is currently being applied to predictive analytics. In machine learning,
algorithms are made that allow a computer to take data and find patterns and trends in it. When
the computer is given new data, it is able to automatically adjust to continue to make accurate
results. Algorithms can be written to perform detailed calculations on a large amount of data
quickly and automatically. Because the process of machine learning is mostly automated,
machine learning significantly lowers the time needed for predictive analytics. The only work the
employee has to do is write the algorithm, and the computer does the rest. Because of that, the
employee can do more work in less time. In the article Machine Learning: What it is and Why it
Matters, the author quotes Thomas H. Davenport, a reporter for The Wall Street Journal. He
says, Humans can typically create one or two good models a week; machine learning can create
thousands of models a week (qtd in 11). Predictive analytics works by creating mathematical
models from past data. These models are then used to make the predictions. Machine learning
can create models much faster than humans can traditionally. That employee can do the same
amount of work in less time using machine learning. Then, the businesses will be able to spend
less money paying the employees for less work.
An experiment was done to look at the accuracy of machine learning algorithms in the
QSAR analysis. Robert Burbidge, a professor at the University College London, wrote a paper
about the experiment. The experiment was about machine learning algorithms and the
Fischer 5
Quantitative structure-activity relationship (QSAR) analysis. The QSAR analysis is used by
pharmaceutical companies in discovering new drugs. They are using data from the molecules of
compounds to predict the molecule's biological activity, which will help the pharmaceutical
companies discover new drugs. The experiment tested different types of machine learning
algorithms and compared them to a more manual way of analysis. Each method was tested on its
accuracy and how much time each method takes. A type of machine learning algorithms that I
will be focusing on is the Support Vector Machines (SVM). SVMs are a recent addition to the
QSAR analysis. An SVM is a type of learning algorithm that can take the data it is given, use
this data to form a mathematical model, and use this model to make future predictions about the
data. The table to the right shows each of the machine
learning algorithms and their accuracy and time taken using
them. In the table, the SVM machine learning algorithm is
classified as SVM-RBF and the manual method is
classified as NN (manual). The SVM algorithm resulted in an 87.33 percent accuracy, while
the manual method only resulted in an 86.97 percent accuracy. Machine learning can create more
accurate predictions, which will help the business be better prepared for the future. Also, the
experiment showed how machine learning increased the time taken for predictive analytics.
While the manual method took 2110 seconds, the SVM took much less time with only 77.4
seconds. The SVM made a prediction 27 times faster than the manual method. This time
improvement can save a business a lot of money in employee salaries. For example, instead of a
business paying 27 different employees for predictive analytics, they can pay 1 employee to use
machine learning.
Fischer 6
Support vector machines were also used in research about the function of proteins in
bacteria. Chin-Sheng Yu, a professor for the Department of Biological Science and Technology
at the National Chiao Tung University, discussed the use of SVMs in the research. They need a
way to predict the location of the proteins in the bacteria. Because they have a lot of data
available for the research, they need a method that can make predictions automatically. One way
that they can accomplish this is through support vector machines. Yu states that for the method
using SVMs, the overall prediction accuracy reaches 89%, which, to the best of our knowledge,
is the highest prediction rate ever reported (Yu 1). The use of SVMs provided them with
accurate predictions that were correct 89 percent of the time. This method gave them the most
accurate predictions they have ever had. With more accurate predictions they can have a better
understanding of what will be coming in the future, allowing them to make smarter business
choices.
However, machine learning has disadvantages. Sometimes, some of the data that is given
is very irrelevant to the prediction. When some machine learning algorithms are given irrelevant
data, the irrelevant data can make the predictions inaccurate. However, machine learning
algorithms can be adjusted to overcome this inaccuracy. The algorithm can be made to sort the
data based on relevance. This will keep the focus on the important parts of the data rather than
letting irrelevant data throw off their accuracy.
Also, machine learning requires a lot of computer and math skills. A typical employee
doesnt have these computer and math skills they need for machine learning. Because of this, all
of a businesses analytics are done by a certain group of people in an analytics department. This
department handles many different analytics projects in many different parts of the business.
Fischer 7
However, the analytics department does these analytics projects without a good understanding of
the problem. Without an understanding, they may create an algorithm that doesnt properly
represent the data.
Businesses use predictive analytics to make smarter business choices. However, as
businesses grow the process of predictive analytics is becoming too slow and expensive because
of the increase in the amount of data. Businesses can use machine learning to overcome this
issue. Machine learning makes machine learning a significantly faster and more accurate process.
Although machine learning may produce inaccurate predictions because of irrelevant data, the
algorithms are continually being modified to fix issues like this.

Fischer 8
Works Cited
Burbidge, Robert, et al. "Drug Design by Machine Learning: Support Vector Machines for
Pharmaceutical Data Analysis." Drug Design by Machine Learning: Support Vector
Machines for Pharmaceutical Data Analysis. Elsevier Science Ltd., 2001. Web. 04 Nov.
2016.
Fayyad, Usama, et al. "From Data Mining to Knowledge Discovery in Databases." Association
for
the Advancement of Artificial Intelligence, n.d. Web. 4 Nov. 2016.
Kashyap, Hirak, et al. Big Data Analytics in Bioinformatics: A Machine Learning Perspective.
N.p., 15 June 2015. Web. 04 Nov. 2016.
"Machine Learning: What It Is and Why It Matters." Analytics, Business Intelligence and Data
Management. Web. 03 Nov. 2016.
Yu, Chin-Sheng, et al. "Predicting Subcellular Localization of Proteins for Gram-negative
Bacteria
rotein Science : A
by Support Vector Machines Based on N-peptide Compositions." P
Publication of the Protein Society. U.S. National Library of Medicine. Web. 18 Dec. 2016.

Research Essay

Загружено:

Сведения о документе

Оригинальное название

Авторское право

Доступные форматы

Поделиться этим документом

Поделиться или встроить документ

Параметры публикации

Этот документ был вам полезен?

Это неприемлемый материал?

Авторское право:

Доступные форматы

Research Essay

Загружено:

Авторское право:

Доступные форматы

Fischer 1

The Effect of Machine Learning on Predictive Analytics

improves predictive analytics by increasing the accuracy of predictions and lowering

significantly the time taken for this process.

more expensive process.

everywhere, including in bioinformatics research (Kashyap 2). In bioinformatics, complex

data volume is making predictive analytics more expensive.

For example, in the healthcare industry, it is common for specialists to

periodically analyze current trends and changes in health-care data, say, on a

decision making and planning for healthcare management. (1)

an increased cost of predictive analytics.

learning technology is currently being applied to predictive analytics. In machine learning,

less money paying the employees for less work.

Quantitative structure-activity relationship (QSAR) analysis. The QSAR analysis is used by

data. The table to the right shows each of the machine

learning algorithms and their accuracy and time taken using

them. In the table, the SVM machine learning algorithm is

classified as SVM-RBF and the manual method is

letting irrelevant data throw off their accuracy.

represent the data.

Businesses use predictive analytics to make smarter business choices. However, as

algorithms are continually being modified to fix issues like this.

Pharmaceutical Data Analysis." Drug Design by Machine Learning: Support Vector

the Advancement of Artificial Intelligence, n.d. Web. 4 Nov. 2016.

N.p., 15 June 2015. Web. 04 Nov. 2016.

Management. Web. 03 Nov. 2016.

Yu, Chin-Sheng, et al. "Predicting Subcellular Localization of Proteins for Gram-negative

Вам также может понравиться