The Future of Data Mining: Part 2

The last article was the first in a series of three articles dealing with the future of
data mining. As with any perspective that looks at the future, the first article focused
on change and specifically on how data mining has changed the world of business. We
looked at examples of data mining's impact in both the commercial and non-commercial
sectors of our society. The last section of the article then dealt specifically
with data mining's impact within the marketing area. Continuing on this theme of
change, we'd now like to examine three factors that will most impact data mining in the
future: organizational change, people, and software.

Certainly other factors such as privacy already have a huge impact on data mining,
one that will continue well into the future. Huge interest in privacy has already resulted
in voluminous material being written on this subject. In fact, a three-part series on data
mining and privacy has already been written in this publication. The remainder of this
article, though, will deal with the three factors outlined above.

Organizational Change
Historically speaking, essentially no departments existed in the area of data mining. As
direct marketing evolved into a more common marketing discipline, marketing service
departments were created to help in the execution of direct mail programs. Specifically,
this involved a number of activities. The first activity was to generate names for a
given campaign by working with list broker specialists who would recommend specific
lists based on the desired campaign objective. The second activity involved data
hygiene and cleansing in order to ensure that the name and address were correct and
that there were no duplicate names on the promotable file. The last activity involved
the actual generation of the campaign list file and its accompanying test cells. Along
with targeting the best names for a given campaign, the test cells were used to derive
specific learnings that had been identified up front as one of the campaign objectives.
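
The hygiene and test-cell steps described above can be sketched in a few lines of Python. This is a minimal illustration only: the record layout, the crude name/address matching rule, and the cell names "control" and "offer_a" are all assumptions made for the example, not a description of how any particular marketing services department worked.

```python
import random

def normalize(record):
    """Build a crude match key from name and address (an illustrative rule only)."""
    name = " ".join(record["name"].lower().split())
    address = " ".join(record["address"].lower().replace(".", "").split())
    return (name, address)

def dedupe(records):
    """Keep the first record seen for each normalized name/address key."""
    seen = set()
    unique = []
    for record in records:
        key = normalize(record)
        if key not in seen:
            seen.add(key)
            unique.append(record)
    return unique

def assign_cells(records, cells, seed=0):
    """Shuffle the file, then deal records round-robin into the test cells."""
    rng = random.Random(seed)
    shuffled = records[:]
    rng.shuffle(shuffled)
    return [dict(r, cell=cells[i % len(cells)]) for i, r in enumerate(shuffled)]

# Hypothetical names; the first two are the same person keyed differently.
names = [
    {"name": "Jane Smith", "address": "12 Main St."},
    {"name": "jane  smith", "address": "12 main st"},
    {"name": "Raj Patel", "address": "4 Elm Ave"},
    {"name": "Ann Lee", "address": "9 Oak Rd"},
    {"name": "Tom Roy", "address": "77 King St"},
]
clean = dedupe(names)
campaign = assign_cells(clean, ["control", "offer_a"])
```

Random assignment is what allows the results of one cell to be compared fairly against another; real hygiene work would of course rely on far more robust address matching than this simple key.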

From an analytical perspective, minimal activities were conducted within the direct
marketing area. Yet, as direct marketing began to grow in prominence, the analytical
component continued to evolve and grow due to increased demand for better
information and insights. Ultimately, this required stronger
mathematical skills and a certain level of statistical knowledge. With the analytical
component being able to deliver and demonstrate significant tangible benefits directly
to the bottom line, both the volume as well as the complexity of work increased. This
growing demand ultimately resulted in the creation of either data mining or CRM
analytical departments. One emerging trend in some organizations is
the merging of all the analytical components under one domain. In many
organizations today, the analytical areas report separately to their functional areas.
Examples of this are the marketing analysis and credit risk analysis areas which
operate as separate silos. In certain organizations, both these areas have now been
merged into one area in order for the organization to obtain a more consistent and
holistic view of customer profitability. We should expect to observe more of this
consolidation of data analysis activities as more and more organizations move towards
customer-level profitability.

The internet, with its tremendous volume of data, will also reinforce this data
analytics consolidation trend. Of course, the key will be to somehow merge the online
world of data with the offline world of data and, more importantly, determine how
best to use this merged information for better decision-making.

This shift towards consolidating data analysis activities under one area
obviously increases the importance of the data mining role within the organization.
Given the significance of these changes, it is likely that they will be made at a very
senior level. In fact, some organizations in the U.S. have already created an executive
position in data mining, referred to as the chief data officer (CDO). This
position reports directly to the CEO and is at the same level as the chief marketing
officer and chief technology officer. This is not to say that these executive-level changes
will become the norm in Canada, but it is fair to say that the role of data mining will have
more senior-level prominence within most organizations.

Finding skilled people has always represented the largest challenge in building a data mining practice.
Since the data mining discipline is relatively new, it has been difficult to find
knowledgeable practitioners within this area. In the past, organizations typically relied
on finding individuals from other organizations with an extensive direct marketing
background. Formal educational training in this area was non-existent.

Yet this has changed as educational institutions such as Dalhousie now offer business
degrees in the area of marketing informatics. One key component within this
marketing informatics discipline is how data mining is deployed within the marketing
world. Community colleges also offer specific courses in data mining while business
associations such as the Canadian Marketing Association have seminars devoted solely
to data mining.

Despite the educational developments mentioned above, the educational aspect of data
mining from a more formal standpoint is still in its infancy. From a people standpoint,
though, it is at the educational level where significant changes will occur and
specifically at the university level. One good current example of change at the
academic level is the University of Montreal which has a department devoted solely to
data mining research. They have been the thought leaders in promoting data mining
from a purely academic viewpoint. A good example of their work has been their
research on different factor analysis approaches which may have practical business
relevance down the road.

Change within the universities themselves might come as computer
science departments evolve into specialized disciplines such as data mining or perhaps
into a more all-encompassing area called Knowledge Management. With or without
these changes, though, data mining will still become a core component in any
information management discipline. This increased level of training should better equip
young individuals for more junior positions within data mining. Organizations
can then provide the much-needed practical training to complement this academic
training, thereby creating a more knowledgeable employee.

The academic community, through its research, can also help identify how the business
community can improve its existing data mining practices. A good example of
this in practice is MITACS (Mathematics of
Information Technology and Complex Systems), which represents a Network of Centres
of Excellence (NCE) for the Mathematical Sciences. This association has begun initial
attempts to marry the data mining learning from both the academic and business
communities through seminars which bring together the leading experts from both
areas.

The software component of data mining represents the one area of data mining which
has seen the most change. This will continue to occur in the foreseeable future. Why
has the software world changed so dramatically? With cheap and easy access to data
through technology, there has been huge demand in trying to empower more people to
do data analysis. Historically, this type of capability was isolated to highly technical
people with strong programming skills. In particular, those with SAS programming
skills were treated as the data mining gurus of the company. These skillsets are still
highly valued within companies, but there has been a shift to empower other
individuals. The first scenario is to empower regular business analysts with the
capability of doing non-statistical data analysis. In the second scenario, organizations
would like to empower more mathematically-oriented individuals in conducting
statistical type analyses, yet their capabilities are limited because they do not have
strong programming skills. In both scenarios, new software can help to facilitate
the data mining exercise by eliminating the need for someone to actually write
programming code. Instead, graphical user interfaces (GUIs) provide the interface that
allows the analyst to actually conduct a data mining exercise. The analyst still needs to
have a very deep understanding of the data mining process, but he or she is able to
conduct this exercise without writing any programming code.

Another significant development in the data mining software world is the emergence of tools that
purport to conduct real-time analysis. In other words, analyses can be conducted
at any time with the most up-to-date, accurate information. This can have tremendous
relevance within the online world where information is constantly being exchanged.
Having tools that can both access and conduct analysis at any time is certainly very
significant when compared to the old world of having to wait for the next mainframe
database update or wait six to eight weeks before having enough data to analyze a
direct mail campaign.

Many of these new tools will also have high-powered mathematical capabilities to help target
specific prospects and/or customers. Although it is exciting as a data miner to have
access to these new high-powered tools, we must never forget that they are indeed just
TOOLS. Remember the old adage: "A fool with a tool is still a fool."

Although software vendors and high-powered mathematicians might be inclined to say
that their new tool is the next breakthrough in mathematics, it is important for data
miners to be grounded in how they evaluate these tools. Proper validation exercises
are critical to assessing these tools effectively.
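
One common form of validation exercise is to score a holdout sample with the tool, rank the sample by score, and compare the response rate of each ranked group against the overall response rate (often called lift). The sketch below assumes a hypothetical holdout file of (score, responded) pairs and a non-zero overall response rate; the function name and the data are invented for illustration and do not come from any particular product.

```python
def lift_by_group(scored, groups=2):
    """Return the lift of each score-ranked group versus the overall response rate.

    scored: list of (score, responded) pairs from a holdout sample,
            where responded is 1 or 0. Assumes at least one responder.
    """
    ranked = sorted(scored, key=lambda pair: pair[0], reverse=True)
    overall = sum(responded for _, responded in ranked) / len(ranked)
    size = len(ranked) // groups
    lifts = []
    for g in range(groups):
        start = g * size
        # The last group absorbs any leftover records.
        end = len(ranked) if g == groups - 1 else start + size
        chunk = ranked[start:end]
        rate = sum(responded for _, responded in chunk) / len(chunk)
        lifts.append(rate / overall)
    return lifts

# Hypothetical holdout results: model score and actual response.
holdout = [(0.9, 1), (0.8, 1), (0.7, 0), (0.6, 1),
           (0.4, 0), (0.3, 0), (0.2, 1), (0.1, 0)]
lifts = lift_by_group(holdout, groups=2)
```

A tool that truly targets well should show lift clearly above 1.0 in its top-ranked group on data it has never seen; if it does not, the vendor's claims deserve a second look.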

Perhaps the most exciting developments in data mining are coming from what is
referred to as text mining. The work being conducted in this area centres on
techniques that convert unstructured data such as text into
actionable structured data. This involves three steps: parsing and cleansing the data,
converting the cleansed and parsed data into actionable numeric
data, and then classifying this numeric data into categories using certain clustering
techniques. Think about the opportunity here. Data miners can now use free-form
data, written comments, and other text type data on the customer to uncover certain
customer patterns that might be relevant for a given customer behaviour. A good
example of this is being able to examine email data between customers and a retail
company. With the email data of customers, we might be able to classify them into
three categories such as complaints, requests for catalog information, or requests for
specific product information. This could then be used to determine if any of these three
categories have an impact on retention. The ability to identify and discover patterns in
this type of text data represents the new frontier of data mining.
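
The three steps of text mining (parse and cleanse, convert to numeric data, cluster) can be illustrated with a very small Python sketch. Everything here is an assumption made for the example: the six emails are invented, the stopword list is ad hoc, word counts stand in for the numeric conversion, and a bare-bones k-means with hand-picked starting rows stands in for whatever clustering technique a real tool would use.

```python
import re

# Ad hoc stopword list for this example only.
STOPWORDS = {"i", "to", "about", "my", "me", "your", "and", "the", "you",
             "could", "what", "does", "in", "is"}

def tokenize(text):
    """Parse and cleanse: lowercase, keep alphabetic words, drop stopwords."""
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS]

def vectorize(docs):
    """Convert each document into a vector of word counts over a shared vocabulary."""
    token_lists = [tokenize(d) for d in docs]
    vocab = sorted({w for tokens in token_lists for w in tokens})
    return [[tokens.count(w) for w in vocab] for tokens in token_lists]

def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(vectors, seed_rows, max_iter=20):
    """Simple k-means; seed_rows are the rows used as starting centroids."""
    centroids = [vectors[i][:] for i in seed_rows]
    labels = None
    for _ in range(max_iter):
        new_labels = [min(range(len(centroids)),
                          key=lambda c: sq_dist(v, centroids[c]))
                      for v in vectors]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        for c in range(len(centroids)):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels

# Hypothetical customer emails.
emails = [
    "I want to complain about my late order",
    "Please send me your catalog",
    "My order arrived late and damaged, I want to complain",
    "Could you send the new catalog please",
    "What sizes does the blue jacket come in",
    "Is the blue jacket available in large sizes",
]
labels = kmeans(vectorize(emails), seed_rows=[0, 1, 4])
```

In this toy case the three clusters line up with complaints, catalog requests, and product questions. The resulting label for each customer could then be carried into a retention analysis like any other structured variable.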

These three factors represent some key areas which will impact data mining in the
future, in addition to having an impact on data mining today. The next article will be
interview-based. A number of key data mining practitioners within the data mining
industry will be interviewed in order to get their insights and observations on future
trends within the industry. Through the interviews, we expect to see both some similar
opinions as well as divergent opinions from the interviewees. This mix of both similar
and divergent opinions will certainly enrich the reader on what leading-edge
practitioners are thinking regarding the future of data mining.