ERP Implementation Life Cycle:

The ERP life cycle highlights the different stages in the implementation of an ERP system. The
process of ERP implementation is referred to as the “ERP Implementation Life Cycle”. The
stages of the ERP implementation are given below:

1. Adoption decision,
2. Acquisition,
3. Implementation,
4. Use and maintenance,
5. Evolution and Retirement

The following paragraphs briefly describe each phase of the ERP life cycle.

[Figure: Block diagram of the phases of the ERP life cycle]

1. Adoption Decision: Once the company has decided to go for an ERP system, the search for
a package must start. As there are hundreds of packages, it is always better to do a thorough and
detailed evaluation of a small number of packages than a shallow analysis of dozens of them.
This stage is useful in eliminating those packages that are not suitable for the company's business
processes. It is considered an important phase of the ERP implementation, as the package
that one selects will decide the success or failure of the project. Implementation of an ERP
involves huge investments and it is not easy to switch between packages, so the right
thing is to ‘do it right the first time’. Once the packages to be evaluated are identified, the
company needs to develop selection criteria that permit the evaluation of all the candidate
packages on the same scale.
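One common way to put candidate packages "on the same scale" is a weighted scoring matrix. The sketch below is only an illustration: the criteria names, the weights and the 1-10 ratings are hypothetical and would be replaced by the company's own selection criteria.

```python
# Hypothetical selection criteria and weights (weights sum to 1.0).
# None of these names or numbers come from a real evaluation.
WEIGHTS = {
    "functional_fit": 0.4,
    "cost": 0.3,
    "vendor_support": 0.2,
    "ease_of_use": 0.1,
}


def weighted_score(ratings, weights=WEIGHTS):
    """Return the weighted sum of the 1-10 ratings given to one package."""
    missing = set(weights) - set(ratings)
    if missing:
        raise ValueError(f"missing ratings for: {sorted(missing)}")
    return sum(weights[name] * ratings[name] for name in weights)


def rank_packages(candidates, weights=WEIGHTS):
    """Return package names ordered from highest to lowest weighted score."""
    return sorted(
        candidates,
        key=lambda name: weighted_score(candidates[name], weights),
        reverse=True,
    )


# Made-up ratings for two made-up packages.
candidates = {
    "Package A": {"functional_fit": 8, "cost": 5, "vendor_support": 7, "ease_of_use": 6},
    "Package B": {"functional_fit": 6, "cost": 9, "vendor_support": 6, "ease_of_use": 8},
}
ranking = rank_packages(candidates)
```

Because every package is rated against the same criteria and weights, the resulting scores are directly comparable. Changing the weights is how the company expresses what matters most to it.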

2. Acquisition: This is the phase that designs the implementation process. It is in this phase
that the details of how to go about the implementation are decided. Time schedules, deadlines, etc.
for the project are arrived at. The plan is developed, roles are identified and responsibilities are
assigned. The plan also decides when to begin the project, how to carry it out and when it should be
completed. A committee made up of the team leaders of each implementation group usually does this planning.
This is considered the most crucial phase for the success of ERP implementation. This
is the process through which companies create a complete model of where they are now and
which direction they will take in the future. It has been estimated that even the best packages
will only meet 80% of the company’s requirements. The remaining 20% presents problematic
issues for the company’s reengineering. It is in this phase that human factors are taken into
consideration. Every implementation is going to involve a significant change in the number of
employees and their job responsibilities as processes become more automated and efficient, so it
is best to treat ERP as an investment as well as a cost-cutting measure.

Training is also an important part of the implementation, and it takes place alongside
the implementation process. This is the phase where the company trains its employees to
implement and, later, run the system. Thus, it is vital for the company to choose the right
employees: people with good functional knowledge who have the right attitude, are willing to
change and learn new things, and are not afraid of technology.

3. Implementation: This is the main functional area of ERP implementation. There is a bit of
mystique around the customization process, and for good reason: the Holy Grail of ERP
implementation is synchronizing existing company practices with the package. In order to do so,
business processes have to be understood and mapped in such a way that the arrived-at solutions
match up with the overall goals of the company. But companies cannot just shut down their
operations while the mapping takes place. Hence a prototype, a simulation of the
actual business processes of the company, is used. The prototype allows for thorough testing
of the “to be” model in a controlled environment. As the ERP consultants configure and test the
prototype, they attempt to solve any logistical problems inherent in the business process
reengineering (BPR) before the actual go-live implementation.
This is also the phase where one tries to break the system. One has reached a point where
the company is testing real-case scenarios. The system is configured, and now you must come
up with extreme cases like system overloads, multiple users logging on at the same time, users
entering invalid data, hackers trying to access restricted areas and so on. This testing is performed
to find the weak links so that they can be rectified before the system goes live.

4. Use and Maintenance: This is the phase where the ERP is made available to the entire
organization. On the technical side the work is almost complete: data conversion is done and
databases are up and running. On the functional side, the prototype is fully configured,
tested and ready to go operational. The system is officially proclaimed operational even though
the implementation team must have been testing it and running it successfully for some time. But
once the system is ‘live’, the old system is removed and the new system is used for doing
business.

This is the phase where the actual users of the system are given training on how to
use the system. This training starts well before the system goes live. The employees who are
going to use the new system are identified. Their current skills are noted and they are divided
into groups based on their current skill levels. Then each group is given training on the new system.
This training is very important, as the success of the ERP system is in the hands of the end users.
So, these training sessions should give the participants an overall view of the system and of how
each person’s actions affect the entire system.
5. Evolution and Retirement Phase: Once the implementation is over, the vendor and the
hired consultants will leave. To reap the fruits of the implementation, it is very important that the
system has wide acceptance. There should be enough employees who are trained to handle the
problems that crop up from time to time. The system must be updated as technology changes.
Post-implementation operation needs a different set of roles and skills than those required by
less integrated systems. At a minimum, everyone who uses the system needs to be
trained on how it works, how it relates to the business processes and how a transaction ripples
through the entire company whenever they press a key.

7 Types of Data to Sync Between your ERP and Ecommerce:

ERP and E-commerce are two completely different platforms with different utilities and
architectures. Integrating ERP and E-commerce improves business functions throughout
the organization and benefits Sales, Marketing, IT, Operational services
and Customer services.

Before integrating the two, a few challenges have to be identified:

• Whether there are any dependencies on products, customers and stock or quantity.
• The flow of information: whether the data flow is one-directional or
bi-directional, i.e. from E-commerce to ERP and from ERP to E-commerce.
• The current process of managing data and information, and which of the two systems will act
as the master of the data.

These are the challenges that need to be overcome. Identifying them
speeds up the data identification process. Broadly, the following data points
must be considered for E-commerce and ERP integration. The seven types of data that
should be synchronized between your ERP and E-commerce are:

1. Customers
2. Products
3. Inventory/Stock
4. Sales orders
5. Shipping/Delivery
6. Invoice and payments
7. Tier price and volume price discount

1. Customer Synchronization

The main requirement is that customer data be properly synchronized. Synchronizing
customer information between the ERP and E-commerce systems is the first step towards
a uniform customer experience.
For example, a customer who often shops on your e-commerce store calls your customer
service team to change their shipping address, and the change is recorded in the
ERP system. After integration, any change in customer information in either application
(ERP or E-commerce) is automatically reflected in the other. Customer
information sync helps you deliver benefits such as a personalized customer experience,
self-service customer portals and more.
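As a rough illustration of bi-directional customer sync, the sketch below resolves two copies of the same customer record by keeping the most recently updated one ("last write wins"). This is only one possible policy and is an assumption here; the record fields and the `updated_at` stamp are hypothetical, not taken from any particular ERP or storefront product.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CustomerRecord:
    customer_id: str
    shipping_address: str
    updated_at: int  # hypothetical change stamp, e.g. seconds since the epoch


def sync_customer(erp, shop):
    """Return the copy both systems should hold: the most recently updated one."""
    if erp.customer_id != shop.customer_id:
        raise ValueError("records belong to different customers")
    # On a tie the ERP copy wins (an arbitrary choice made for this sketch).
    return erp if erp.updated_at >= shop.updated_at else shop


# The address was changed by phone, so the ERP copy is the newer one.
erp_copy = CustomerRecord("C-100", "12 New Street", updated_at=200)
shop_copy = CustomerRecord("C-100", "5 Old Road", updated_at=150)
merged = sync_customer(erp_copy, shop_copy)
```

After the sync, `merged` would be written back to both systems, so the storefront shows the address the customer gave over the phone.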
2. Products/Items Synchronization

A customer visits your e-commerce store and places an order. But where does your
e-commerce store get the product information from? Businesses often maintain all the basic product
information in the ERP system. Without this information (product and inventory), the
e-commerce store cannot work effectively. For example:
o You are a retailer that manages product and inventory information
independently in your e-commerce and ERP systems.
o A customer visits your e-commerce store, explores it and really likes an
item.
o The customer visits the product details page, finds that the product is available in
the required size and is in stock, and places an order.
o After a few days, the customer receives an email saying that the order is canceled
because the product was out of stock and the inventory information was not synced
with the ERP system.
If this happens, there is a very high chance that the customer will never
come back and may even advise other customers not to buy from you.
With integration and synchronization, your E-commerce and ERP systems are always
in sync with each other and you can easily avoid scenarios like this.

3. Inventory Synchronization

Product and inventory sync minimizes back orders, improves the customer’s shopping
experience, makes it easier to sell on multiple channels, builds trust and confidence with
your customers, and simplifies product information and inventory management.
For example, inventory sync becomes more involved when
inventory is managed and maintained in multiple warehouse locations. Businesses then
need to maintain the warehouse and inventory information in the ERP system
and sync that information back to the e-commerce system.
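The multi-warehouse case can be sketched as a simple aggregation: sum the on-hand quantity of each item across warehouses in the ERP data and publish the total to the storefront. The row layout, the SKU and warehouse names, and the optional safety buffer are all assumptions made for this illustration.

```python
from collections import defaultdict


def storefront_stock(erp_rows, safety_stock=0):
    """Sum on-hand quantity per SKU across warehouses, minus a safety buffer.

    erp_rows is an iterable of (sku, warehouse, on_hand) tuples, a
    hypothetical export format used only for this sketch.
    """
    totals = defaultdict(int)
    for sku, _warehouse, on_hand in erp_rows:
        totals[sku] += on_hand
    # Never publish a negative quantity to the storefront.
    return {sku: max(qty - safety_stock, 0) for sku, qty in totals.items()}


erp_rows = [
    ("SKU-1", "WH-NORTH", 4),
    ("SKU-1", "WH-SOUTH", 3),
    ("SKU-2", "WH-NORTH", 1),
]
available = storefront_stock(erp_rows, safety_stock=2)
```

Holding back a small buffer is one way to reduce the risk of overselling between two sync runs. Whether to use a buffer, and how large it should be, is a business decision.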
4. Sales Orders Synchronization
Orders are probably the real reason why most businesses consider e-commerce and ERP
integration. There is a lot of synergy between e-commerce and ERP systems as far as
orders are concerned: orders are accepted in the e-commerce system, while the
actual fulfillment happens in the ERP system. For example:

o Suppose your e-commerce and ERP systems are not integrated, so you manually
transfer orders from your e-commerce system to your ERP system for fulfillment.
Because the transfer is manual, you can only sync a limited number
of orders between the two systems.
o To ensure a quick turnaround time, the only alternative you have is to add more
resources to sync data between the two systems. Your business is then
always limited by the number of orders you can sync between the systems, which
is not an ideal situation.
o Integration not only automates this process but also reduces the turnaround
time (because orders are synchronized immediately) and avoids the errors that are
common in manual data transfers.

Order sync helps make your business scalable, enables you to expand to new
markets, minimizes order aging, improves customer satisfaction, and reduces errors and costs.
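A minimal sketch of automated order transfer is shown below. It copies storefront orders into an in-memory stand-in for the ERP order table and skips orders that were already transferred, so running it repeatedly does not create duplicates. The dictionary layout and the `order_id` field are hypothetical; a real integration would call the APIs of the two systems instead.

```python
def sync_orders(storefront_orders, erp_orders):
    """Copy storefront orders the ERP has not seen yet; return the new order ids.

    storefront_orders: list of dicts, each with an "order_id" key (assumed layout).
    erp_orders: dict keyed by order id, standing in for the ERP order table.
    """
    added = []
    for order in storefront_orders:
        order_id = order["order_id"]
        if order_id not in erp_orders:  # skip orders already transferred
            erp_orders[order_id] = dict(order)
            added.append(order_id)
    return added


web_orders = [
    {"order_id": "SO-1", "sku": "SKU-1", "qty": 2},
    {"order_id": "SO-2", "sku": "SKU-2", "qty": 1},
]
erp_table = {}
first_run = sync_orders(web_orders, erp_table)
second_run = sync_orders(web_orders, erp_table)
```

The second run transfers nothing because both orders are already in the ERP table. This is what makes it safe to run the sync on a schedule or after every new order.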
5. Tier Price and Volume Price Discount

Businesses often offer special prices to particular customer tiers or for larger order
volumes. E-commerce and ERP systems handle this by defining tier and volume price
rules and associating a price with each rule. Say a customer places an order.
If the tier price and volume price information is not available in the ERP system,
the order total will be less than what the ERP system expects, because the order was
placed at the price defined by the applicable price rule. This leads to inconsistency
in the accounting books, which can in turn lead to compliance issues. E-commerce
and ERP integration bridges this gap by synchronizing tier prices and corresponding
updates between the two systems. Tier price sync helps you personalize the
customer experience by offering special prices, maintain accounting consistency
and reduce compliance issues.
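The mismatch described above can be made concrete with a small worked example. The tier table and prices below are made up for illustration. The sketch computes the order total the storefront charges under a volume tier and compares it with the total an ERP holding only the list price would expect.

```python
from decimal import Decimal

# Hypothetical tier table: (minimum quantity, unit price), sorted by
# ascending minimum quantity. The first entry is the list price.
TIERS = [(1, Decimal("10.00")), (10, Decimal("9.00")), (50, Decimal("8.00"))]


def unit_price(quantity, tiers=TIERS):
    """Return the unit price of the highest tier the quantity qualifies for."""
    if quantity < tiers[0][0]:
        raise ValueError("quantity is below the smallest tier")
    price = tiers[0][1]
    for min_qty, tier_price in tiers:
        if quantity >= min_qty:
            price = tier_price
    return price


def order_total(quantity, tiers=TIERS):
    """Return the order total at the applicable tier price."""
    return unit_price(quantity, tiers) * quantity


quantity = 12
storefront_total = order_total(quantity)   # charged at the 10+ tier
erp_list_total = TIERS[0][1] * quantity    # expected by an ERP without tier data
gap = erp_list_total - storefront_total    # the accounting mismatch
```

With these numbers the storefront charges 108.00 while the ERP expects 120.00, a gap of 12.00 on a single order. Syncing the tier table to the ERP removes the gap, because both systems then apply the same rule.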

6. Delivery Synchronization

Delivery is an important part of the order fulfillment process, and keeping customers
informed about the order and shipment status plays an important role in improving
customer experience. For example, a customer places an order in your e-commerce
store and gets an expected delivery date. If the customer does not
receive any updates on the order or shipment status, the customer will get restless.
The overall customer experience in this case will not be very good, even if you deliver
the best product in the best condition at the best possible price.
Shipping information sync helps build trust and confidence, improve customer
experience and satisfaction, increase productivity etc.

7. Invoice Synchronization

All transactions affect payments and accounts receivable, which are central to
the financial status of business operations.

For example, after an order is placed and delivered, the invoice information is manually
synced back to the e-commerce system. Not letting the e-commerce and ERP systems talk
to each other is not an option, because manual sync is an error-prone process. Imagine what
would happen if, during the manual data transfer from e-commerce to ERP, the
payment information were wrongly entered into the ERP system.

Moreover, because the payment was captured in the e-commerce system, the invoice has to
be synced back to the e-commerce system to capture the payment. Doing all this manually
is not efficient, and the process should be automated. Invoice/payment sync ensures consistency in
the accounting books, reduces the operating cycle, improves customer experience and
more.

E-commerce is a modern way of selling products at relatively low cost. It has changed the
way in which enterprises do business and communicate with their consumers. For almost every
type of retailer, having an online store has become a necessity. Online stores offer various
benefits for retailers as well as customers. However, maintaining an online storefront can
become a huge and arduous burden for retailers if the storefront is not integrated with their
Enterprise Resource Planning (ERP) system.
The Reason Behind Integrating E-commerce With ERP

Businesses usually overlook the need for integrating e-commerce with ERP because of the cost or the
business interruption caused by having to modify current systems. However, when you see the
effect of this integration on your operations, as well as on customer experience, the costs seem to be
worth it. When you integrate e-commerce sales channels with your ERP system, you can
function even more competently as a business. The key types of data, such as shipping/tracking,
order, customer, item, and inventory data, are all held in your independent systems. Some
types of integration pass this data between your systems automatically, thus eliminating
the need to manually re-enter data from one system into another.

Hand-keying sales orders into the ERP is a painful and time-consuming process. It also involves a lot
of hassle in updating product data in Excel and then uploading it to the web store. If you
depend on reducing inventory in your systems manually, or are not able to do it at all, you may
oversell. Because these processes are not automated, there are many chances to make
mistakes. Some of the common mistakes include:

• Wrong shipping addresses
• Incorrect inventory levels
• Missing, incomplete, or incorrect product information

In short, when you don’t integrate e-commerce with ERP, you put your consumers’ experience
and your business at stake.

Benefits of ERP Integration with an E-Commerce Storefront

Here are the seven major benefits of integrating your e-commerce storefront with an ERP system:

1. Reduces Data Redundancy, Human Involvement, and Error:
With an integrated e-commerce storefront, payment and shipping information, web
orders, and web customer details flow into the ERP system. In the same way,
item and inventory details are uploaded from the ERP to the e-commerce portal.
Either way, the integration eliminates the need to re-enter data,
which in turn reduces data redundancy, human involvement, and error.
2. Reduces Operation Costs:
Real-time data made available from the ERP system on the storefront enables
consumers to view the latest order status and available inventory, as well as track shipments with
tracking numbers. Automated data inputs help avoid errors, rework, and order
backfires. All these benefits help reduce operational costs.
3. Increases Internal Productivity:
Integrated systems play a vital role in streamlining
several business processes, which lessens the involvement of human resources in
those processes. Web sales orders are integrated into the ERP system in real time, so
a back-office ERP user can track an order instantly and start further
processing. Thus, integration shortens the order fulfillment cycle.
4. Reduces Inventory Cost:
All the information about web sales appears in your ERP system promptly. Based on
these web transactions, the ERP item inventory is also updated. With up-to-date
information about inventory and web sales, an ERP user can plan purchases
properly, thus reducing inventory costs.
5. Increases Customer Satisfaction:
The ease of getting up-to-date product information, order tracking details, and
inventory availability details from the ERP system increases the satisfaction level of the
consumer.
6. Generates Financial Reports in ERP:
E-commerce applications generate financial reports of sales. Integrating
e-commerce with ERP additionally helps the business produce the trial balance, balance sheet, cash
flow, and P/L statement, which gives the required clarity in financial information.
7. Better Control of your Business:
Integrating the e-commerce storefront with ERP business processes helps business
owners get better control of their business, thus giving them a competitive edge.
These are the main benefits of integrating an e-commerce storefront with your ERP. Now it is
time to understand these benefits and put them to work for the success of your business. As a
whole, the availability of precise, up-to-date order and consumer information makes it easier to
deliver good customer service.
ERP AND RELATED TECHNOLOGIES:

BUSINESS PROCESS REENGINEERING:

Dr. Michael Hammer defines business process reengineering as “the fundamental rethinking and
radical redesign of business processes to achieve dramatic improvements in critical,
contemporary measures of performance such as cost, quality, service and speed”.
MIS (Management Information System):
• A computer based system that optimizes the collection, collation, transfer and
presentation of information throughout an organization, through an integrated structure of
database and information flow.
• MIS is flexible and can be adapted to the changing needs of the organization.
DATA WAREHOUSING, DATA MINING & OLAP
History of Data warehousing
Data Warehouses are a distinct type of computer database that were first developed during the
late 1980s and early 1990s. They were developed to meet a growing demand for management
information and analysis that could not be met by operational systems. Operational systems were
unable to meet this need for a range of reasons:
a) The processing load of reporting reduced the response time of the operational systems
b) The database designs of operational systems were not optimized for information
analysis and reporting
c) Most organizations had more than one operational system, so company-wide reporting
could not be supported from a single system
d) Development of reports in operational systems often required writing specific computer
programs which was slow and expensive
As a result, separate computer databases began to be built that were specifically designed to
support management information and analysis purposes. These data warehouses were able to
bring in data from a range of different data sources, such as mainframe computers,
minicomputers, personal computers and office automation software such as
spreadsheets, and integrate this information in a single place. This capability, coupled with
user-friendly reporting tools and freedom from operational impacts, has led to a growth of this type of
computer system.
As technology improved (lower cost for more performance) and user requirements increased
(faster data load cycle times and more features), data warehouses have evolved through several
fundamental stages:
Off line Operational Databases
Data warehouses in this initial stage are developed by simply copying the database of an
operational system to an off-line server where the processing load of reporting does not impact
on the operational system's performance.
Off line Data Warehouse
Data warehouses in this stage of evolution are updated on a regular time cycle (usually daily,
weekly or monthly) from the operational systems and the data is stored in an integrated
reporting-oriented data structure.
Real Time Data Warehouse
Data warehouses at this stage are updated on a transaction or event basis, every time an
operational system performs a transaction (e.g. an order or a delivery or a booking etc.)
Integrated Data Warehouse
Data warehouses at this stage are used to generate activity or transactions that are passed back
into the operational systems for use in the daily activity of the organization.
The Data Warehouse Architecture
The data warehouse architecture consists of various interconnected elements which are:
1) Operational and external database layer: the source data for the data warehouse.
2) Informational access layer: the tools the end user uses to extract and analyze the data.
3) Data access layer: the interface between the operational and informational access layers.
4) Metadata Layer: The data directory or repository of metadata information.
The concept of "data warehousing" dates back at least to the mid-1980s, and possibly earlier. In
essence, it was intended to provide an architectural model for the flow of data from operational
systems to decision support environments. It attempted to address the various problems
associated with this flow, and the high costs associated with it. In the absence of such an
architecture, there usually existed an enormous amount of redundancy in the delivery of
management information. In larger corporations it was typical for multiple decision support
projects to operate independently, each serving different users but often requiring much of the
same data. The process of gathering, cleaning and integrating data from various sources, often
legacy systems, was typically replicated for each project. Moreover, legacy systems were
frequently being revisited as new requirements emerged, each requiring a subtly different view
of the legacy data.
Based on analogies with real-life warehouses, data warehouses were intended as large-scale
collection/storage/staging areas for corporate data. From here data could be distributed to "retail
stores" or "data marts" which were tailored for access by decision support users (or
"consumers"). While the data warehouse was designed to manage the bulk supply of data from
its suppliers (e.g. operational systems), and to handle the organization and storage of this data,
the "retail stores" or "data marts" could be focused on packaging and presenting selections of the
data to end-users, to meet specific management information needs.
Somewhere along the way this analogy and architectural vision was lost, as some vendors and
industry speakers redefined the data warehouse as simply a management reporting database. This
is a subtle but important deviation from the original vision of the data warehouse as the hub of a
management information architecture, where the decision support systems were actually the data
marts or "retail stores".
Advantages
There are many advantages to using a data warehouse. Some of them are:
i. Data warehouses enhance end-user access to a wide variety of data.
ii. Decision support system users can obtain specified trend reports, e.g. the item with
the most sales in a particular area within the last two years.
iii. Data warehouses can be a significant enabler of commercial business applications,
particularly customer relationship management (CRM) systems.
Limitations
a) Extracting, transforming and loading data consumes a lot of time and computational
resources.
b) Data warehousing project scope must be actively managed to deliver a release of
defined content and value.
c) Compatibility problems with systems already in place.
d) Security could develop into a serious issue, especially if the data warehouse is web
accessible.
e) Data Storage design controversy warrants careful consideration and perhaps
prototyping of the data warehouse solution for each project's environments.
DATA MINING
Overview
Generally, data mining (sometimes called data or knowledge discovery) is the process of
analyzing data from different perspectives and summarizing it into useful information:
information that can be used to increase revenue, cut costs, or both. Data mining software is one
of a number of analytical tools for analyzing data. It allows users to analyze data from many
different dimensions or angles, categorize it, and summarize the relationships identified.
Technically, data mining is the process of finding correlations or patterns among dozens of fields
in large relational databases.
Data mining parameters include:
a) Association - looking for patterns where one event is connected to another event
b) Sequence or path analysis - looking for patterns where one event leads to another later
event
c) Classification - looking for new patterns (may result in a change in the way the data is
organized but that's ok)
d) Clustering - finding and visually documenting groups of facts not previously known
e) Forecasting - discovering patterns in data that can lead to reasonable predictions about
the future (this area of data mining is known as predictive analytics)
Data mining techniques are used in many research areas, including mathematics, cybernetics,
and genetics. Web mining, a type of data mining used in customer relationship management
(CRM), takes advantage of the huge amount of information gathered by a Web site to look for
patterns in user behavior. A data miner is a program that collects such information, often without
the user's knowledge, as spyware.
Data mining is a class of database applications that look for hidden patterns in a group of data
that can be used to predict future behavior. For example, data mining software can help retail
companies find customers with common interests. The term is commonly misused to describe
software that presents data in new ways. True data mining software doesn't just change the
presentation, but actually discovers previously unknown relationships among the data.
Continuous Innovation
Although data mining is a relatively new term, the technology is not. Companies have used
powerful computers to sift through volumes of supermarket scanner data and analyze market
research reports for years. However, continuous innovations in computer processing power, disk
storage, and statistical software are dramatically increasing the accuracy of analysis while
driving down the cost.
Example
For example, one Midwest grocery chain used the data mining capacity of Oracle software to
analyze local buying patterns. They discovered that when men bought diapers on Thursdays and
Saturdays, they also tended to buy beer. Further analysis showed that these shoppers typically
did their weekly grocery shopping on Saturdays. On Thursdays, however, they only bought a few
items. The retailer concluded that they purchased the beer to have it available for the upcoming
weekend. The grocery chain could use this newly discovered information in various ways to
increase revenue. For example, they could move the beer display closer to the diaper display.
And, they could make sure beer and diapers were sold at full price on Thursdays.
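An association such as "diapers, therefore beer" is usually quantified with support, confidence and lift. The sketch below computes these three measures on a handful of invented baskets. The data is purely illustrative and is not the grocery chain's data, and the source does not say which measures the Oracle software used.

```python
def rule_metrics(transactions, antecedent, consequent):
    """Return (support, confidence, lift) for the rule antecedent -> consequent.

    support    = share of baskets containing both items
    confidence = share of antecedent baskets that also contain the consequent
    lift       = confidence divided by the overall share of consequent baskets
    """
    n = len(transactions)
    with_a = sum(1 for t in transactions if antecedent in t)
    with_c = sum(1 for t in transactions if consequent in t)
    with_both = sum(1 for t in transactions if antecedent in t and consequent in t)
    if n == 0 or with_a == 0 or with_c == 0:
        raise ValueError("both items must occur in at least one transaction")
    support = with_both / n
    confidence = with_both / with_a
    lift = confidence / (with_c / n)
    return support, confidence, lift


# Invented toy baskets, for illustration only.
baskets = [
    {"diapers", "beer"},
    {"diapers", "beer", "chips"},
    {"diapers", "milk"},
    {"beer"},
    {"milk", "bread"},
]
support, confidence, lift = rule_metrics(baskets, "diapers", "beer")
```

A lift above 1 means the two items appear together more often than would be expected if they were bought independently. That is the kind of signal that would prompt a retailer to look closer.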
Data, Information, and Knowledge
Data
Data are any facts, numbers, or text that can be processed by a computer. Today, organizations
are accumulating vast and growing amounts of data in different formats and different databases.
This includes:
• Operational or transactional data, such as sales, cost, inventory, payroll, and accounting
• Non-operational data, such as industry sales, forecast data, and macroeconomic data
• Meta data - data about the data itself, such as logical database design or data dictionary
definitions
Information
The patterns, associations, or relationships among all this data can provide information. For
example, analysis of retail point of sale transaction data can yield information on which products
are selling and when.
Knowledge
Information can be converted into knowledge about historical patterns and future trends. For
example, summary information on retail supermarket sales can be analyzed in light of
promotional efforts to provide knowledge of consumer buying behavior. Thus, a manufacturer or
retailer could determine which items are most susceptible to promotional efforts.
Data Warehouses
Dramatic advances in data capture, processing power, data transmission, and storage capabilities
are enabling organizations to integrate their various databases into data warehouses. Data
warehousing is defined as a process of centralized data management and retrieval. Data
warehousing, like data mining, is a relatively new term although the concept itself has been
around for years. Data warehousing represents an ideal vision of maintaining a central repository
of all organizational data. Centralization of data is needed to maximize user access and analysis.
Dramatic technological advances are making this vision a reality for many companies. And,
equally dramatic advances in data analysis software are allowing users to access this data freely.
The data analysis software is what supports data mining.
Application of Data mining
Data mining is primarily used today by companies with a strong consumer focus - retail,
financial, communication, and marketing organizations. It enables these companies to determine
relationships among "internal" factors such as price, product positioning, or staff skills, and
"external" factors such as economic indicators, competition, and customer demographics. And, it
enables them to determine the impact on sales, customer satisfaction, and corporate profits.
Finally, it enables them to "drill down" into summary information to view detailed transactional
data.
With data mining, a retailer could use point-of-sale records of customer purchases to send
targeted promotions based on an individual's purchase history. By mining demographic data
from comment or warranty cards, the retailer could develop products and promotions to appeal to
specific customer segments.
For example, Blockbuster Entertainment mines its video rental history database to recommend
rentals to individual customers. American Express can suggest products to its cardholders based
on analysis of their monthly expenditures.
WalMart is pioneering massive data mining to transform its supplier relationships. WalMart
captures point-of-sale transactions from over 2,900 stores in 6 countries and continuously
transmits this data to its massive 7.5 terabyte Teradata data warehouse. WalMart allows more
than 3,500 suppliers to access data on their products and perform data analyses. These suppliers
use this data to identify customer buying patterns at the store display level. They use this
information to manage local store inventory and identify new merchandising opportunities. In
1995, WalMart computers processed over 1 million complex data queries.
The National Basketball Association (NBA) is exploring a data mining application that can be
used in conjunction with image recordings of basketball games. The Advanced Scout software
analyzes the movements of players to help coaches orchestrate plays and strategies. For example,
an analysis of the play-by-play sheet of the game played between the New York Knicks and the
Cleveland Cavaliers on January 6, 1995 reveals that when Mark Price played the Guard position,
John Williams attempted four jump shots and made each one! Advanced Scout not only finds
this pattern, but explains that it is interesting because it differs considerably from the average
shooting percentage of 49.30% for the Cavaliers during that game.
By using the NBA universal clock, a coach can automatically bring up the video clips showing
each of the jump shots attempted by Williams with Price on the floor, without needing to comb
through hours of video footage. Those clips show a very successful pick-and-roll play in which
Price draws the Knicks' defense and then finds Williams for an open jump shot.
Process of data mining
While large-scale information technology has been evolving separate transaction and analytical
systems, data mining provides the link between the two. Data mining software analyzes
relationships and patterns in stored transaction data based on open-ended user queries. Several
types of analytical software are available: statistical, machine learning, and neural networks.
Generally, any of four types of relationships are sought:
Classes: Stored data is used to locate data in predetermined groups. For example, a restaurant
chain could mine customer purchase data to determine when customers visit and what they
typically order. This information could be used to increase traffic by having daily specials.
Clusters: Data items are grouped according to logical relationships or consumer preferences. For
example, data can be mined to identify market segments or consumer affinities.
Associations: Data can be mined to identify associations. The beer-diaper example is an example
of associative mining.
Sequential patterns: Data is mined to anticipate behavior patterns and trends. For example, an
outdoor equipment retailer could predict the likelihood of a backpack being purchased based on a
consumer's purchase of sleeping bags and hiking shoes.
Data mining consists of five major elements:
1. Extract, transform, and load transaction data onto the data warehouse system.
2. Store and manage the data in a multidimensional database system.
3. Provide data access to business analysts and information technology professionals.
4. Analyze the data by application software.
5. Present the data in a useful format, such as a graph or table.
Different levels of analysis are available:
a) Artificial neural networks: Non-linear predictive models that learn through training
and resemble biological neural networks in structure.
b) Genetic algorithms: Optimization techniques that use processes such as genetic
combination, mutation, and natural selection in a design based on the concepts of natural
evolution.
c) Decision trees: Tree-shaped structures that represent sets of decisions. These decisions
generate rules for the classification of a dataset. Specific decision tree methods include
Classification and Regression Trees (CART) and Chi Square Automatic Interaction
Detection (CHAID). CART and CHAID are decision tree techniques used for
classification of a dataset. They provide a set of rules that you can apply to a new
(unclassified) dataset to predict which records will have a given outcome. CART
segments a dataset by creating 2-way splits while CHAID segments using chi square tests
to create multi-way splits. CART typically requires less data preparation than CHAID.
d) Nearest neighbor method: A technique that classifies each record in a dataset based on
a combination of the classes of the k record(s) most similar to it in a historical dataset
(where k ≥ 1). Sometimes called the k-nearest neighbor technique.
e) Rule induction: The extraction of useful if-then rules from data based on statistical
significance.
f) Data visualization: The visual interpretation of complex relationships in
multidimensional data. Graphics tools are used to illustrate data relationships.
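The nearest neighbor method in item d) can be sketched in a few lines of Python. This is only a minimal illustration, not a production classifier: the function name knn_classify, the Euclidean distance measure, the majority vote, and the customer data are all assumptions made for the example.

```python
import math
from collections import Counter


def knn_classify(history, record, k=3):
    """Classify `record` by majority vote among its k nearest labelled records."""
    if k < 1:
        raise ValueError("k must be at least 1")
    # Sort the historical records by Euclidean distance to the new record
    # and keep the k closest ones.
    nearest = sorted(history, key=lambda item: math.dist(item[0], record))[:k]
    # Combine the classes of the k neighbours by simple majority vote.
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical historical dataset: (visits per month, average spend) -> class.
history = [
    ((1.0, 20.0), "occasional"),
    ((2.0, 25.0), "occasional"),
    ((1.5, 22.0), "occasional"),
    ((8.0, 90.0), "frequent"),
    ((9.0, 110.0), "frequent"),
    ((10.0, 95.0), "frequent"),
]

print(knn_classify(history, (7.5, 100.0), k=3))
```

In practice the attributes would usually be scaled first so that one attribute (here, spend) does not dominate the distance.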
Technological infrastructure required for Data Mining
Today, data mining applications are available on systems of all sizes for mainframe, client/server,
and PC platforms. System prices range from several thousand dollars for the smallest
applications up to $1 million a terabyte for the largest. Enterprise-wide applications generally
range in size from 10 gigabytes to over 11 terabytes. NCR has the capacity to deliver
applications exceeding 100 terabytes. There are two critical technological drivers:
Size of the database: the more data being processed and maintained, the more powerful the
system required.
Query complexity: the more complex the queries and the greater the number of queries being
processed, the more powerful the system required.
Relational database storage and management technology is adequate for many data mining
applications less than 50 gigabytes. However, this infrastructure needs to be significantly
enhanced to support larger applications. Some vendors have added extensive indexing
capabilities to improve query performance. Others use new hardware architectures such as
Massively Parallel Processors (MPP) to achieve order-of-magnitude improvements in query
time. For example, MPP systems from NCR link hundreds of high-speed Pentium processors to
achieve performance levels exceeding those of the largest supercomputers.
Applications of Data Mining
Data mining has been cited as the method by which the U.S. Army unit Able Danger supposedly
identified the leader of the September 11, 2001 attacks, Mohamed Atta, and three other 9/11
hijackers as possible members of an al Qaeda cell operating in the U.S. more than a year before
the attack.
Data Mining is most frequently used for Customer Relationship Management applications.
Common goals are to predict which people are most likely to: a) be acquired, b) be cross-sold
or up-sold, c) leave (churn), d) be retained, saved, or won back.
These applications can contribute significantly to the bottom line. Rather than contacting every
prospect or customer through a call center or by mail, only prospects that are predicted to
have a high likelihood of responding to an offer are contacted.
More sophisticated methods may be used to optimize across campaigns so that we can predict
which channel and which offer an individual is most likely to respond to - across all potential
offers.
Finally, in cases where many people will take an action without an offer, uplift modeling can be
used to determine which people will have the greatest increase in responding if given an offer.
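The idea behind uplift modeling can be illustrated with simple arithmetic: for each customer segment, compare the response rate of people who received the offer with the response rate of a comparable group who did not. The sketch below assumes hypothetical segment names and response data (1 = responded, 0 = did not); real uplift models are fitted per individual rather than read off small samples like this.

```python
def response_rate(responses):
    """Fraction of people in a group who responded (1) rather than not (0)."""
    return sum(responses) / len(responses)


def uplift(treated, control):
    """Increase in response rate attributable to the offer."""
    return response_rate(treated) - response_rate(control)


# Hypothetical outcomes per segment: "treated" received the offer, "control" did not.
segments = {
    "loyal": {"treated": [1, 1, 1, 0, 1], "control": [1, 1, 0, 1, 1]},
    "persuadable": {"treated": [1, 1, 0, 1, 0], "control": [0, 0, 1, 0, 0]},
}

uplifts = {name: uplift(g["treated"], g["control"]) for name, g in segments.items()}
best_segment = max(uplifts, key=uplifts.get)

for name, value in uplifts.items():
    print(f"{name}: uplift = {value:.2f}")
print("target first:", best_segment)
```

The "loyal" segment responds well but would have acted anyway, so its uplift is zero; the offer is better spent on the segment whose behavior it actually changes.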
Businesses employing data mining quickly see a return on investment, but they also recognize that
the number of predictive models can quickly become very large. Rather than one model to predict
which customers will churn, we could build a separate model for each region and customer type.
Then, instead of sending an offer to all people who are likely to churn, we may only want to send
offers to customers who are likely to take the offer. Finally, we may also want to determine
which customers are going to be profitable over a window of time and only send the offers to
those who are likely to be profitable. To maintain this quantity of models, businesses need to 1)
manage model versions and 2) move to "Automated Data Mining."
Another example of data mining, often called Market Basket Analysis, relates to its use in
retail sales. If a clothing store records the purchases of customers, a data mining system could
identify those customers who favour silk shirts over cotton ones. Although some relationships
may be difficult to explain, taking advantage of them is easier. The example deals with
association rules within transaction-based data. Not all data are transaction based, and logical or
inexact rules may also be present within a database. In a manufacturing application, an inexact
rule may state that 73% of products which have a specific defect or problem will develop a
secondary problem within the next 6 months.
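An association rule such as "customers who buy X also buy Y" is usually judged by two figures: support (how often the items occur together) and confidence (how often Y appears among the transactions that contain X). A percentage like the 73% in the manufacturing rule above plays the role of a confidence. The following is a minimal sketch with made-up transactions; the function names and item names are assumptions for illustration only.

```python
def support(transactions, items):
    """Fraction of transactions that contain every item in `items`."""
    items = set(items)
    return sum(items <= basket for basket in transactions) / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Of the transactions containing `antecedent`, the fraction that also contain `consequent`."""
    base = support(transactions, antecedent)
    if base == 0:
        return 0.0
    return support(transactions, set(antecedent) | set(consequent)) / base


# Hypothetical clothing-store baskets.
transactions = [
    {"silk shirt", "tie"},
    {"silk shirt", "tie", "belt"},
    {"cotton shirt", "belt"},
    {"silk shirt", "belt"},
    {"cotton shirt", "tie"},
]

rule_support = support(transactions, {"silk shirt", "tie"})
rule_confidence = confidence(transactions, {"silk shirt"}, {"tie"})
print(f"support = {rule_support:.2f}, confidence = {rule_confidence:.2f}")
```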
OLAP
OLAP stands for On-Line Analytical Processing. The first attempt to provide a definition to
OLAP was by Dr. Codd, who proposed 12 rules for OLAP. Later, it was discovered that this
particular white paper was sponsored by one of the OLAP tool vendors, thus causing it to lose
objectivity. The OLAP Report has proposed the FASMI test, Fast Analysis of Shared
Multidimensional Information. For a more detailed description of both Dr. Codd's rules and the
FASMI test, please visit The OLAP Report.
For people on the business side, the key feature out of the above list is "Multidimensional": in
other words, the ability to analyze metrics in different dimensions such as time, geography,
gender, product, etc. For example, suppose sales for the company are up. What region is most responsible
for this increase? Which store in this region is most responsible for the increase? What particular
product category or categories contributed the most to the increase? Answering these types of
questions in order means that you are performing an OLAP analysis.
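The sequence of questions above (region, then store, then product category) can be sketched as repeated aggregation with progressively narrower filters. The data, the dimension names, and the helper functions roll_up and top are hypothetical and exist only to illustrate the drill-down; an OLAP tool would run the equivalent steps against its own storage.

```python
from collections import defaultdict

# Hypothetical fact rows: (region, store, category, sales increase).
facts = [
    ("East", "E1", "Toys", 120),
    ("East", "E1", "Books", 30),
    ("East", "E2", "Toys", 40),
    ("West", "W1", "Toys", 25),
    ("West", "W1", "Books", 35),
]
DIMENSIONS = ("region", "store", "category")


def roll_up(rows, dimension, **filters):
    """Sum the measure per value of `dimension`, using only rows that match `filters`."""
    index = {name: i for i, name in enumerate(DIMENSIONS)}
    totals = defaultdict(int)
    for row in rows:
        if all(row[index[name]] == value for name, value in filters.items()):
            totals[row[index[dimension]]] += row[3]
    return dict(totals)


def top(totals):
    """Return the dimension value with the largest total."""
    return max(totals, key=totals.get)


# Drill down one dimension at a time, carrying the previous answer as a filter.
region = top(roll_up(facts, "region"))
store = top(roll_up(facts, "store", region=region))
category = top(roll_up(facts, "category", region=region, store=store))
print(region, store, category)
```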
Depending on the underlying technology used, OLAP can be broadly divided into two different
camps: Multidimensional OLAP (MOLAP) and Relational OLAP (ROLAP). Hybrid OLAP (HOLAP) refers
to technologies that combine MOLAP and ROLAP.
MOLAP
This is the more traditional way of OLAP analysis. In MOLAP, data is stored in a
multidimensional cube. The storage is not in the relational database, but in proprietary formats.
Advantages:
a) Excellent performance: MOLAP cubes are built for fast data retrieval, and are optimal
for slicing and dicing operations.
b) Can perform complex calculations: All calculations have been pre-generated when the
cube is created. Hence, complex calculations are not only doable, but they return quickly.
Disadvantages:
a) Limited in the amount of data it can handle: Because all calculations are performed
when the cube is built, it is not possible to include a large amount of data in the cube
itself. This is not to say that the data in the cube cannot be derived from a large amount of
data. Indeed, this is possible. But in this case, only summary-level information will be
included in the cube itself.
b) Requires additional investment: Cube technology is often proprietary and does not
already exist in the organization. Therefore, to adopt MOLAP technology, chances are
additional investments in human and capital resources are needed.
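The trade-off described for MOLAP, fast answers at the cost of pre-computing everything when the cube is built, can be illustrated with a toy cube held in a Python dictionary. This is only a sketch under simplifying assumptions: real MOLAP products use proprietary storage formats, and the names build_cube and lookup, as well as the sample rows, are invented for the example.

```python
from collections import defaultdict
from itertools import combinations


def build_cube(rows, dimensions, measure):
    """Pre-aggregate `measure` for every combination of dimension values."""
    cube = defaultdict(float)
    for row in rows:
        # Each row contributes to one cell per subset of the dimensions,
        # including the empty subset (the grand total).
        for size in range(len(dimensions) + 1):
            for dims in combinations(dimensions, size):
                key = tuple((d, row[d]) for d in dims)
                cube[key] += row[measure]
    return dict(cube)


def lookup(cube, dimensions, **slices):
    """Answer a slice query with a single dictionary lookup, no scan of the rows."""
    key = tuple((d, slices[d]) for d in dimensions if d in slices)
    return cube.get(key, 0.0)


DIMS = ("region", "product")
rows = [
    {"region": "East", "product": "Toys", "amount": 100.0},
    {"region": "East", "product": "Books", "amount": 50.0},
    {"region": "West", "product": "Toys", "amount": 70.0},
]

cube = build_cube(rows, DIMS, "amount")
print(lookup(cube, DIMS))                                  # grand total
print(lookup(cube, DIMS, region="East"))                   # one slice
print(lookup(cube, DIMS, region="East", product="Toys"))   # one cell
print(len(cube), "pre-computed cells for", len(rows), "rows")
```

Even in this tiny case the cube holds more cells than there are source rows, and the count grows quickly with every added dimension, which is why the cube itself is normally limited to summary-level data.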
ROLAP
This methodology relies on manipulating the data stored in the relational database to give the
appearance of traditional OLAP's slicing and dicing functionality. In essence, each action of
slicing and dicing is equivalent to adding a "WHERE" clause in the SQL statement.
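The statement that each slice corresponds to a "WHERE" clause can be demonstrated with Python's built-in sqlite3 module. The table name sales, its columns, the sample rows, and the helper total_sales are assumptions made for this sketch; a real ROLAP tool generates far more elaborate SQL against the organization's own schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, product TEXT, year INTEGER, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?, ?)",
    [
        ("East", "Toys", 1995, 100.0),
        ("East", "Books", 1995, 50.0),
        ("West", "Toys", 1995, 70.0),
        ("East", "Toys", 1996, 120.0),
    ],
)

ALLOWED_DIMENSIONS = {"region", "product", "year"}


def total_sales(conn, **slices):
    """Sum `amount`, adding one WHERE condition for each sliced dimension."""
    unknown = set(slices) - ALLOWED_DIMENSIONS
    if unknown:
        raise ValueError(f"unknown dimension(s): {sorted(unknown)}")
    sql = "SELECT SUM(amount) FROM sales"
    if slices:
        # Column names come from the allow-list above; values are bound as parameters.
        sql += " WHERE " + " AND ".join(f"{column} = ?" for column in slices)
    return conn.execute(sql, tuple(slices.values())).fetchone()[0]


print(total_sales(conn))                            # no slice: whole table
print(total_sales(conn, region="East"))             # slice on one dimension
print(total_sales(conn, region="East", year=1995))  # dice on two dimensions
```

Every call runs a fresh query against the table, which reflects both the flexibility of the approach and its dependence on the speed of the underlying database.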
Advantages:
a) Can handle large amounts of data: The data size limitation of ROLAP technology is the
limitation on data size of the underlying relational database. In other words, ROLAP
itself places no limitation on data amount.
b) Can leverage functionalities inherent in the relational database: Often, relational
databases already come with a host of functionalities. ROLAP technologies, since they sit
on top of the relational database, can therefore leverage these functionalities.
Disadvantages:
a) Performance can be slow: Because each ROLAP report is essentially a SQL query (or
multiple SQL queries) in the relational database, the query time can be long if the
underlying data size is large.
b) Limited by SQL functionalities: Because ROLAP technology mainly relies on
generating SQL statements to query the relational database, and SQL statements do not
fit all needs (for example, it is difficult to perform complex calculations using SQL),
ROLAP technologies are therefore traditionally limited by what SQL can do. ROLAP
vendors have mitigated this risk by building into the tool out-of-the-box complex
functions as well as the ability to allow users to define their own functions.
HOLAP
HOLAP technologies attempt to combine the advantages of MOLAP and ROLAP. For
summary-type information, HOLAP leverages cube technology for faster performance. When
detail information is needed, HOLAP can "drill through" from the cube into the underlying
relational data.
