Fischer 1

Richard Fischer

Mr. Rudebusch

English Comp IV

19 December 2016

The Effect of Machine Learning on Predictive Analytics

Businesses need to make decisions that will make them better prepared for the future. To

make smart decisions, they need to have an idea of what will be coming in the future. Many

businesses take data from the past and use it to predict what will happen in the future. However,

these businesses are growing larger and larger, and they are collecting more and more data.

Today, these businesses are growing large enough that the amount of data they have is larger than

they can analyze using the same traditional method. Some experienced employees believe that

their method works best. They believe that the computer predictions are inaccurate and that

machine learning doesn't properly represent the problem. However, others believe that the use of

machine learning algorithms improves the predictions because it can produce more complex

calculations in much less time. Although experienced employees believe that human intuition

and past knowledge alone work best for predictive analytics, the use of machine learning

improves predictive analytics by increasing the accuracy of predictions and significantly

lowering the time taken for this process.

Businesses need as much information as they can get to plan for the future. One way that

they are able to get this information is through the field of predictive analytics. In predictive

analytics, data from the past is taken and analyzed for patterns and trends. These patterns and

trends are then used to make predictions about the future. Predictive analytics has many different

uses in many different fields. It can be used by businesses to predict what products a customer

might buy so they can better market the product to the customers. They can also use it to predict

how many customers they might have on a given day, so they know how much inventory they

need and how many employees they need to schedule that day. Insurance companies can use it to

predict the risk of something happening to a person, so they know whether or not to cover that

person. The businesses can make decisions based on these predictions and be better

prepared for the future. Predictive analytics is very important to businesses, but it is becoming a

more expensive process.

As time goes on, businesses are continually getting bigger. As they grow, they have an

increased number of customers. More customers give the business more data to use for predictive

analytics. Hirak Kashyap, a professor at the University of California, writes, "With digitization

of all processes and availability of high throughput devices at lower costs, data volume is rising

everywhere, including in bioinformatics research" (Kashyap 2). In bioinformatics, complex

biological data like genetic codes is collected and analyzed. The amount of this biological data is

rising. Today, the use of a computer in our everyday lives has become much more common.

Also, advances in technology have given us much more digital storage. These things combined

have led to a significant increase in the amount of data available for businesses. This increase in

data volume is making predictive analytics more expensive.

The traditional method requires a person to periodically work with the data. Usama

Fayyad, a professor at the University of Minnesota Duluth, discusses the methods and techniques

for data analysis. Fayyad describes a process in the traditional method of data analysis:

For example, in the healthcare industry, it is common for specialists to

periodically analyze current trends and changes in health-care data, say, on a

quarterly basis. The specialists then provide a report detailing the analysis to the

sponsoring health-care organization; this report becomes the basis for future

decision making and planning for healthcare management. (1)

With predictive analytics, new data constantly needs to be collected. When new data is

received, the trends and patterns in the data may change. To ensure that the predictions continue

to be accurate, the changes in the trends and patterns need to be accounted for. Traditionally, this

needs to be done manually. In Fayyad's example, a healthcare specialist looks at the changes in the

data every quarter and then writes a report about the changes that will be used in the future

for healthcare planning and decision making. With this method, a person needs to take time

every quarter to look at the data and write a report. If there is more data, it will take the person

longer to analyze it. When more time is needed, more employees are needed and the business

will spend more money paying them. By this method, an increased amount of data will result in

an increased cost of predictive analytics.

However, businesses are starting to collect enough data that analyzing it is becoming very slow and

expensive. Fayyad says that traditionally, the method of data analysis "relies on manual

analysis and interpretation" (Fayyad 1). Fayyad calls this method "slow, expensive, and highly

subjective" (Fayyad 2). Because of the high expense, he notes that "as data volumes grow

dramatically, this type of manual data analysis is becoming completely impractical in many

domains" (Fayyad 2). As explained above, increasing the amount of data increases

the cost of predictive analytics. The amount of data is starting to grow large enough

that predictive analytics is becoming too expensive for some businesses. These businesses are

spending more money for predictive analytics than they would get from it, making predictive

analytics impractical.

However, while these businesses are growing, technology is also improving. Machine

learning technology is currently being applied to predictive analytics. In machine learning,

algorithms are made that allow a computer to take data and find patterns and trends in it. When

the computer is given new data, it can automatically adjust so that it continues to produce accurate

results. Algorithms can be written to perform detailed calculations on a large amount of data

quickly and automatically. Because the process of machine learning is mostly automated,

machine learning significantly lowers the time needed for predictive analytics. The only work the

employee has to do is write the algorithm, and the computer does the rest. Because of that, the

employee can do more work in less time. The article "Machine Learning: What It Is and Why It

Matters" quotes Thomas H. Davenport, writing in The Wall Street Journal. He

says, "Humans can typically create one or two good models a week; machine learning can create

thousands of models a week" (qtd. in "Machine Learning" 11). Predictive analytics works by creating mathematical

models from past data. These models are then used to make the predictions. Machine learning

can create models much faster than humans can traditionally. An employee can therefore do the same

amount of work in less time using machine learning, and the business can spend

less money paying employees for that work.
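To make the idea of a "mathematical model from past data" concrete, here is a minimal sketch in Python. It assumes the simplest possible model, a straight line fitted by least squares, and it uses made-up daily customer counts; neither the numbers nor the function names come from the sources cited in this paper.

```python
# Hypothetical daily customer counts for the past eight days (invented numbers).
past_days = [1, 2, 3, 4, 5, 6, 7, 8]
past_customers = [102, 108, 115, 119, 127, 131, 138, 144]


def fit_line(xs, ys):
    """Fit y = slope * x + intercept to past data by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(model, x):
    """Use the fitted model to estimate y for a new x."""
    slope, intercept = model
    return slope * x + intercept


# Build the model from the past, then use it to look one day ahead.
model = fit_line(past_days, past_customers)
forecast_day_9 = predict(model, 9)
print(f"Expected customers on day 9: {forecast_day_9:.0f}")
```

The computer repeats the same two steps, fit and predict, every time new data arrives, which is why the process can be automated once the algorithm is written.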

An experiment was done to look at the accuracy of machine learning algorithms in

quantitative structure-activity relationship (QSAR) analysis. Robert Burbidge, a professor at

University College London, wrote a paper about the experiment. QSAR analysis is used by

pharmaceutical companies in discovering new drugs. They use data about the molecules of

compounds to predict each molecule's biological activity, which will help the pharmaceutical

companies discover new drugs. The experiment tested different types of machine learning

algorithms and compared them to a more manual way of analysis. Each method was measured on its

accuracy and on how much time it takes. One type of machine learning algorithm that I

will be focusing on is the support vector machine (SVM). SVMs are a recent addition to

QSAR analysis. An SVM is a type of learning algorithm that can take the data it is given, use

this data to form a mathematical model, and use this model to make future predictions about the

data. The table to the right shows each of the machine learning algorithms with its accuracy and

the time taken to use it. In the table, the SVM machine learning algorithm is labeled "SVM-RBF"

and the manual method is labeled "NN (manual)." The SVM algorithm achieved 87.33 percent accuracy, while

the manual method achieved only 86.97 percent accuracy. Machine learning can create more

accurate predictions, which will help the business be better prepared for the future. Also, the

experiment showed how machine learning decreased the time taken for predictive analytics.

While the manual method took 2110 seconds, the SVM took much less time with only 77.4

seconds. The SVM made its predictions about 27 times faster than the manual method. This time

improvement can save a business a lot of money in employee salaries. For example, instead of a

business paying 27 different employees for predictive analytics, it can pay one employee to use

machine learning.
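The comparison above can be checked with a few lines of arithmetic. The sketch below uses only the four figures reported in this paragraph (2110 seconds, 77.4 seconds, 86.97 percent, and 87.33 percent); the variable names are my own.

```python
# Figures reported in the paragraph above.
manual_seconds = 2110.0
svm_seconds = 77.4
manual_accuracy = 86.97  # percent
svm_accuracy = 87.33     # percent

# How many times faster the SVM was than the manual method.
speedup = manual_seconds / svm_seconds

# Difference in accuracy, measured in percentage points.
accuracy_gain = svm_accuracy - manual_accuracy

# Share of the manual method's running time that the SVM saved.
time_saved_fraction = 1 - svm_seconds / manual_seconds

print(f"Speedup: {speedup:.1f}x")
print(f"Accuracy gain: {accuracy_gain:.2f} percentage points")
print(f"Time saved: {time_saved_fraction:.1%}")
```

The speedup works out to roughly 27.3, which is where the figure of 27 comes from, while the accuracy gain is 0.36 percentage points. The time savings are therefore the larger of the two improvements in this experiment.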

Support vector machines were also used in research about the function of proteins in

bacteria. Chin-Sheng Yu, a professor for the Department of Biological Science and Technology

at the National Chiao Tung University, discussed the use of SVMs in the research. The researchers needed a

way to predict the location of the proteins in the bacteria. Because they had a lot of data

available for the research, they needed a method that could make predictions automatically. One way

to accomplish this is through support vector machines. Yu states that for the method

using SVMs, "the overall prediction accuracy reaches 89%, which, to the best of our knowledge,

is the highest prediction rate ever reported" (Yu 1). In other words, the SVMs' predictions

were correct 89 percent of the time, the highest accuracy the researchers knew of for this

task. With more accurate predictions they can have a better

understanding of what will be coming in the future, allowing them to make smarter business

choices.
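To show roughly how an SVM forms a model from data and then makes predictions, here is a minimal sketch in Python. It is a simplified linear SVM trained by subgradient descent on the hinge loss, not the kernel-based SVMs used in the two studies above, and the data points, function names, and parameter values are all invented for illustration.

```python
def train_linear_svm(points, labels, epochs=200, learning_rate=0.01, reg=0.01):
    """Train a linear SVM by subgradient descent on the regularized hinge loss.

    Each label must be +1 or -1. Returns (weights, bias).
    """
    weights = [0.0] * len(points[0])
    bias = 0.0
    for _ in range(epochs):
        for x, y in zip(points, labels):
            margin = y * (sum(w * xi for w, xi in zip(weights, x)) + bias)
            # Regularization shrinks the weights a little on every step.
            weights = [w - learning_rate * reg * w for w in weights]
            if margin < 1:
                # The point is misclassified or too close to the boundary,
                # so move the boundary away from it.
                weights = [w + learning_rate * y * xi for w, xi in zip(weights, x)]
                bias += learning_rate * y
    return weights, bias


def predict(model, x):
    """Classify a new point as +1 or -1 using the trained model."""
    weights, bias = model
    score = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1 if score >= 0 else -1


# Hypothetical training data: two measurements per sample, two classes.
points = [(2, 2), (3, 3), (2, 3), (3, 2), (-2, -2), (-3, -3), (-2, -3), (-3, -2)]
labels = [1, 1, 1, 1, -1, -1, -1, -1]

model = train_linear_svm(points, labels)
print(predict(model, (2.5, 2.5)))
print(predict(model, (-2.5, -2.5)))
```

Once the model is trained, classifying a new sample takes only a handful of multiplications, which is part of why this kind of method can handle large amounts of data automatically.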

However, machine learning has disadvantages. Sometimes, some of the data that is given

is irrelevant to the prediction. When some machine learning algorithms are given irrelevant

data, the irrelevant data can make the predictions inaccurate. However, machine learning

algorithms can be adjusted to overcome this inaccuracy. The algorithm can be made to sort the

data based on relevance. This will keep the focus on the important parts of the data rather than

letting irrelevant data throw off the accuracy of the predictions.
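One simple way to sort data by relevance is to score each column by how strongly it moves together with the quantity being predicted. The sketch below assumes that approach, using the absolute Pearson correlation as the score. It is only one possible technique, not the one used in the studies cited here, and the store data and column names are made up.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation between two equally long lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a column that never changes carries no information
    return cov / sqrt(var_x * var_y)


def rank_features(columns, target):
    """Return column names ordered from most to least relevant to the target."""
    scores = {name: abs(pearson(values, target)) for name, values in columns.items()}
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical store data (all numbers invented for illustration).
columns = {
    "ad_spend": [10, 20, 30, 40, 50, 60],
    "cashier_shoe_size": [9, 7, 10, 8, 9, 8],
    "store_id": [4, 4, 4, 4, 4, 4],
}
daily_sales = [105, 118, 133, 141, 158, 170]

ranking = rank_features(columns, daily_sales)
print(ranking)
```

A business could then keep only the top-ranked columns before building its model, so that data with little connection to the outcome has less chance to distort the predictions.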

Also, machine learning requires a lot of computer and math skills. A typical employee

doesn't have the computer and math skills needed for machine learning. Because of this, all

of a business's analytics are done by a certain group of people in an analytics department. This

department handles many different analytics projects in many different parts of the business.

However, the analytics department does these analytics projects without a good understanding of

the problem. Without an understanding, they may create an algorithm that doesn't properly

represent the data.

Businesses use predictive analytics to make smarter business choices. However, as

businesses grow, the process of predictive analytics is becoming too slow and expensive because

of the increase in the amount of data. Businesses can use machine learning to overcome this

issue. Machine learning makes predictive analytics a significantly faster and more accurate process.

Although machine learning may produce inaccurate predictions because of irrelevant data, the

algorithms are continually being modified to fix issues like this.



Works Cited

Burbidge, Robert, et al. "Drug Design by Machine Learning: Support Vector Machines for

Pharmaceutical Data Analysis." Drug Design by Machine Learning: Support Vector

Machines for Pharmaceutical Data Analysis. Elsevier Science Ltd., 2001. Web. 04 Nov.

2016.

Fayyad, Usama, et al. "From Data Mining to Knowledge Discovery in Databases." Association for

the Advancement of Artificial Intelligence, n.d. Web. 04 Nov. 2016.

Kashyap, Hirak, et al. Big Data Analytics in Bioinformatics: A Machine Learning Perspective.

N.p., 15 June 2015. Web. 04 Nov. 2016.

"Machine Learning: What It Is and Why It Matters." Analytics, Business Intelligence and Data

Management. Web. 03 Nov. 2016.

Yu, Chin-Sheng, et al. "Predicting Subcellular Localization of Proteins for Gram-negative Bacteria

by Support Vector Machines Based on N-peptide Compositions." Protein Science: A

Publication of the Protein Society. U.S. National Library of Medicine. Web. 18 Dec. 2016.