A Methodology For Web Usage Mining and Its Application To Target Group Identification

Fuzzy Sets and Systems 148 (2004) 139–152
www.elsevier.com/locate/fss
A methodology for web usage mining and its application to target group identification
Sandro Araya (a), Mariano Silva (b), Richard Weber (c,*)
(a) BCI Bank, Santiago, Chile
(b) WebMining Ltda., Santiago, Chile
(c) Department of Industrial Engineering, University of Chile, Republica 701, Santiago, Chile
Abstract

Web usage mining is an important and fast-developing area of web mining in which a lot of research has already been done. Recently, companies have become aware of its potential, especially for applications in marketing. A structured methodology is, however, a crucial requirement for the successful practical application of web usage mining. This publication provides such a methodology, based on suggestions from the literature and on our own experience from various web mining projects. Its application in a Chilean bank shows how the combined use of data from a data warehouse and web data can contribute to improving marketing activities. The benefits from this project point to the huge potential of web usage mining, not only in financial services. © 2004 Elsevier B.V. All rights reserved.
Keywords: Fuzzy clustering; Pattern recognition; Soft computing; Web usage mining; Business intelligence
* Corresponding author. Fax: +56-2-689 7895. E-mail address: rweber@dii.uchile.cl (R. Weber).
0165-0114/$ - see front matter © 2004 Elsevier B.V. All rights reserved. doi:10.1016/j.fss.2004.03.011

1. Introduction

Customer Relationship Management is a central topic in today's financial services [14]. Due to increasing competition between suppliers of goods and services, a strong relationship between a provider and each of its customers is nowadays more important than ever before. On the other hand, we are faced with tendencies that hinder direct customer contact, such as worldwide globalization, mergers of enterprises, and an increasing number of services offering remote transactions, e.g. internet and phone banking. The challenge companies are faced with consists in finding ways to understand customer behavior, preferences and desires in order to provide each customer with excellent and personalized services at low cost. Advanced technologies such as data mining, web mining, soft computing, and data warehouses offer tools to reach this goal.

This publication presents a methodology for web usage mining and its application to marketing in a Chilean bank. The experience gained in this project points to the potential web usage mining offers for financial services. Special emphasis is placed on the conceptual work performed in this application rather than on developing new soft computing methods.

Section 2 of this paper provides a brief overview of the state of the art of web mining and in particular web usage mining. In Section 3 we propose a methodology for web usage mining with special emphasis on applications in marketing and related areas. Section 4 shows the application of our methodology in a particular case, the identification of target groups for marketing activities in a Chilean bank. Section 5 concludes this paper and points to extensions and future research.

2. Overview on web usage mining

Web mining can be defined roughly as data mining using data generated by the web and includes the following sub-areas: web content mining, web usage mining, and web structure mining [16].

In web content mining (WCM) we try to find useful information in the content of web pages [13], e.g. free text inside a web page, semi-structured data such as HTML code, pictures, and downloadable files.

Web structure mining (WSM) aims at generating a structural summary of the web site and its web pages. While web content mining mainly focuses on the structure within a document, web structure mining tries to discover the link structure of the hyperlinks at the inter-document level.

Web usage mining (WUM) is applied to the data generated by visits to a web site, especially those contained in web log files. Other sources can be browser logs, user profiles, user sessions, bookmarks, folders and scrolls [13]. Fig. 1 shows the main application areas of WUM.
Fig. 1. Application areas of web usage mining. [Diagram: Web Mining divides into Web Content Mining, Web Usage Mining, and Web Structure Mining; Web Usage Mining in turn covers Personalization, System Improvements, Modification of Web Site, Business Intelligence, and Characterization of use.]

Next, we outline each of these areas in order to correctly classify the application presented below. A detailed overview, which is beyond the scope of this paper, can be found e.g. in [16].

Personalization of web sites is a very challenging field of both current research and applications whose goals include individualized marketing for e-commerce or dynamic recommendations to a web visitor based on his/her profile and usage behavior.

Analyzing web data can also be used for system improvements, since it provides the key to understanding web traffic behavior. Advanced load balancing, data distribution or policies for web caching as well as higher security standards are potential benefits of such improvements.

Similar analyses could be used for the modification of web sites. Understanding visitors' behavior in a web site provides hints for adequate design and update decisions.

Business intelligence covers the application of intelligent techniques in order to improve certain businesses, mainly in marketing. If such a business is performed over the web, web usage mining offers powerful tools, as will be shown below.

Characterization of use is based e.g. on models that determine the pages a visitor might visit on a given site.

Recently, web usage mining and web content mining have been combined in order to provide e.g. a better understanding of the requirements a visitor has when entering a web site. Analyzing the log files in relation to the content of each page offers much more information than pure log file analysis. Based on techniques from information retrieval [3], a similarity measure between visitors of a web site has been suggested that combines the content with the usage of the respective site [18].

3. A methodology for web usage mining

Based on suggestions from the literature (e.g. [16]) and our experience from related projects, we developed a methodology that can be used for general applications of web usage mining. It is an adaptation of the well-known process of knowledge discovery in databases (KDD) [8,9] to the analysis of web data. Our methodology consists of the following steps, which form an iterative process:

1. identification of objectives and available data,
2. selection,
3. preprocessing,
4. transformation,
5. pattern discovery,
6. interpretation and evaluation of results,
7. business integration.
Compared to the KDD process we explicitly add the steps "identification of objectives and available data" and "business integration". As will be seen, both are crucial for successful web usage mining. Next, we describe those steps of the methodology that show major differences compared to the standard KDD process.

3.1. Identification of objectives and available data

Both objectives and available data play a fundamental role in the subsequent steps of the proposed methodology. Some possible objectives of web usage mining for marketing purposes are:

- acquire new customers,
- retain customers,
- predict churn of customers,
- cross selling,
- predict future visit patterns.

In practice, we rarely find one of these objectives clearly described. Often we are faced with some vague ideas a decision-maker has in mind and have to translate them into one or several of the objectives mentioned above. This is also the case in the application presented below.

In web usage mining we have the following three types of data, which we will study below in more detail:

- web data, i.e. data generated by visits to a web site;
- business data, i.e. data in traditional systems generated by the respective business;
- meta data, i.e. data describing the web site itself (content and structure).
3.2. Selection

Based on the previously determined data sources, the selection process identifies the subset of data we want to work with. Whereas the selection step is part of the well-known KDD process, we are faced with various particularities due to the fact that we work with web and meta data. This makes a detailed investigation of the respective data sources necessary.

3.2.1. Web data

We have the following three types of web data, i.e. data generated by visits to a web site: log files, Cookies, and query data.

Within the first of these types there are, amongst others, three different kinds of log files of particular importance [17]: Access Log, Error Log, and Referrer Log.

Access Logs can be stored in Common Log file format (CLF) or in Extended Log file format (ELF). Fig. 2 shows an example of the CLF. The syntax of the CLF is as follows [see www.w3.org]:

remotehost rfc931 authuser [date] request status bytes

Fig. 2. Example of Common Log format.

Fig. 3. Example of Extended Log format.

The following table explains the elements of a CLF:

remotehost   Remote hostname (or IP address if the DNS hostname is not available, or if DNSLookup is Off).
rfc931       The remote log name of the user.
authuser     The username with which the user has authenticated himself.
[date]       Date and time of the request.
request      The request line exactly as it came from the client.
status       The HTTP status code returned to the client.
bytes        The content-length of the document transferred.
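As an illustration of the field layout above, the following minimal Python sketch splits one CLF line into its seven elements. It is not part of the original study: the regular expression assumes that the server writes the request line in double quotes and the date as day/month/year:time plus a UTC offset, and the sample line (host, user, URL) is invented.

```python
import re
from datetime import datetime

# One CLF line: remotehost rfc931 authuser [date] "request" status bytes
# Assumption: the request is enclosed in double quotes and contains no quote itself.
CLF_PATTERN = re.compile(
    r'^(?P<remotehost>\S+) (?P<rfc931>\S+) (?P<authuser>\S+) '
    r'\[(?P<date>[^\]]+)\] "(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<bytes>\d+|-)$'
)

def parse_clf_line(line):
    """Return a dict with the seven CLF fields, or None if the line does not match."""
    match = CLF_PATTERN.match(line.strip())
    if match is None:
        return None
    entry = match.groupdict()
    # Assumed date layout, e.g. 10/Oct/2003:13:55:36 -0400
    entry["date"] = datetime.strptime(entry["date"], "%d/%b/%Y:%H:%M:%S %z")
    entry["status"] = int(entry["status"])
    # A dash means that no content was transferred.
    entry["bytes"] = 0 if entry["bytes"] == "-" else int(entry["bytes"])
    return entry

# Hypothetical log line, invented for illustration only.
sample = (
    '192.0.2.10 - jdoe [10/Oct/2003:13:55:36 -0400] '
    '"GET /accounts/balance.html HTTP/1.0" 200 2326'
)
entry = parse_clf_line(sample)
print(entry["remotehost"], entry["authuser"], entry["status"], entry["bytes"])
```

Lines that do not match the pattern return None, so implementation errors or truncated entries can be counted and inspected during preprocessing rather than silently dropped.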
The Extended Log file format additionally stores the browser used and the URL from which a page has been accessed (see Fig. 3).

The Error Log contains information on all kinds of possible errors, e.g. broken links and attempts at unauthorized access. The Referrer Log is an optional log file that contains information on the web pages from which a particular page has been accessed.

The second type of web data we are interested in are Cookies. These are text files stored on a user's system, used for keeping track of settings or data for the particular web site that created the Cookie. When the user's browser requests a web page, it sends the settings that apply to that page along with the request. Cookies can be temporary or permanent. A user's browser keeps track of temporary cookies as long as it is running, but deletes them when it is shut down. Temporary cookies are used to pass information between web pages during a single visit, e.g. for online shopping carts. A user's browser saves permanent cookies as files on the user's system to maintain settings or data between visits (see Fig. 4).

Fig. 4. Example of a Cookie file.

Query data are the third type of web data and are generated by queries to the web server. Each search on a web site could be stored in a file and provide useful information on visitors' preferences.

Depending on the previously determined objectives we can select the web data for further analysis. It should be mentioned, however, that selection interacts strongly with preprocessing and transformation, since the final selection cannot be performed before user sessions have been identified, as will be shown later.

3.2.2. Business data

Since we focus on web usage mining applications in marketing, the following kinds of business data are of particular importance:

- Demographic information (e.g. age, sex, education, civil status).
- Product information: Which products does a customer currently use? Which products did a customer use before?
- Information on transactions: How does a customer use his or her product, e.g. a credit card?
- Information on previous campaigns: Which information did a customer already receive via marketing campaigns? How did he or she react?

Most companies have built data warehouses or data marts where such information is stored. The selection of the appropriate business data also depends on the previously determined objectives and can be guided by the typical considerations of a data mining project, see e.g. [2]. Therefore we do not explain it here in more detail.

3.2.3. Meta data

Meta data describe the structure and the content of a web site. The structure is given e.g. by the home page and the links between pages. The content of a web site can be represented e.g. by a vector space model [3]. If a web site contains W different words and has P different pages, we can represent its content by a matrix M with W rows and P columns. The element $m_{i,j}$ of matrix M represents the weight of word i in page j ($i = 1, \ldots, W$; $j = 1, \ldots, P$). In order to estimate these weights, we use the tf×idf weighting [1], defined as

$m_{i,j} = f_{i,j} \log(P / n_i)$,

where $f_{i,j}$ is the frequency of word i in page j and $n_i$ is the number of pages containing word i.

The selection of meta data also depends partially on the steps preprocessing and transformation, since e.g. some preprocessing of the content of each page has to be performed, as will be seen later. In the first selection attempt we determine whether we work with meta data at all and, if so, which pages of the entire web site are relevant for our analysis.

3.3. Preprocessing and transformation

We show these two steps just for web data and meta data. Preprocessing and transformation of traditional business data have been studied in pure data mining projects, e.g. [7], and will not be considered here in detail.
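To make the tf×idf weighting of Section 3.2.3 concrete, the following short Python sketch builds the W × P matrix M for a handful of pages. It is only an illustration under stated assumptions: the frequency of a word is taken as its raw count in the page, the logarithm is the natural logarithm (the text does not fix the base), and the toy pages are invented.

```python
import math

def tfidf_matrix(pages):
    """Build the W x P matrix M with M[i][j] = f_ij * log(P / n_i).

    pages: list of P token lists, one per page (already filtered).
    Returns (vocabulary, matrix); matrix[i][j] is the weight of word i in page j.
    """
    vocabulary = sorted({word for page in pages for word in page})
    num_pages = len(pages)
    matrix = []
    for word in vocabulary:
        # n_i: number of pages that contain word i
        pages_with_word = sum(1 for page in pages if word in page)
        idf = math.log(num_pages / pages_with_word)
        # f_ij: raw count of word i in page j (assumption)
        matrix.append([page.count(word) * idf for page in pages])
    return vocabulary, matrix

# Toy pages, invented for illustration.
pages = [
    ["account", "balance", "account"],
    ["credit", "card", "account"],
    ["credit", "loan"],
]
vocabulary, M = tfidf_matrix(pages)
for word, row in zip(vocabulary, M):
    print(word, [round(weight, 3) for weight in row])
```

Note that a word occurring on every page receives weight log(1) = 0 in all columns, which is the intended effect of the weighting: such words do not help to distinguish pages.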
3.3.1. Web data

Whereas data in databases and data warehouses can show various kinds of errors due to manual data input, web data is generated automatically and therefore does not have these kinds of errors. There are, however, other reasons why web data could be erroneous, making preprocessing necessary, e.g. implementation errors or server down times. These sources of poor data quality have to be investigated and, if necessary, appropriate measures for data quality improvement have to be taken.

The most frequent transformation of web data is the sessionization of log files, a step that takes a log file (see above) as input and determines user sessions as output. Several sessionization heuristics have been proposed; see e.g. [5]. We describe the time-based heuristic, which groups the sequential entries of a log file into one user session if the pages visited from one IP address lie within a predefined time frame. Usually, 30 min are taken for such a time-based sessionization. If a web site uses Cookies (see above), the step of sessionization is trivial, since the site recognizes a visitor and his or her session can be determined without any difficulties. Fig. 5 illustrates how preprocessing and transformation of web data can be performed.

Fig. 5. Preprocessing and transformation of web data.

3.3.2. Meta data

In general, preprocessing is not necessary for meta data since it describes the static structure of a web site. This is quite different for dynamically changing web sites, a case we do not consider here explicitly. We assume that meta data does not contain errors.

Describing the content of a web site, however, makes certain transformation steps necessary. Since a web page contains tags and words that have no direct relation to its content, text filtering becomes important. For example, we want to eliminate HTML tags and stop words (e.g. pronouns, prepositions, conjunctions).
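The text filtering step just described could look roughly like the following Python sketch, which uses only the standard library. The stop-word list is a tiny illustrative sample, the tokenizer assumes plain ASCII English text (a Spanish-language site would need a different pattern and list), and the HTML snippet is invented.

```python
import re
from html.parser import HTMLParser

# Tiny illustrative stop-word list; a real project would use a fuller one.
STOP_WORDS = {"a", "an", "and", "the", "of", "to", "in", "on", "for", "or", "with", "you", "your"}

class TextExtractor(HTMLParser):
    """Collects the visible text of a page, skipping script and style blocks."""

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0:
            self.parts.append(data)

def filter_page(html):
    """Strip HTML tags, lowercase the text, tokenize it, and drop stop words."""
    extractor = TextExtractor()
    extractor.feed(html)
    extractor.close()
    text = " ".join(extractor.parts).lower()
    tokens = re.findall(r"[a-z]+", text)  # assumption: ASCII letters only
    return [token for token in tokens if token not in STOP_WORDS]

# Invented page content for illustration.
html = "<html><body><h1>Your accounts</h1><p>Check the balance of your credit card.</p></body></html>"
print(filter_page(html))  # ['accounts', 'check', 'balance', 'credit', 'card']
```

The resulting token lists are the kind of input on which word weights for the vector space model can then be computed.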
Furthermore, we want to apply word stemming in order to generate the adequate word stem; for example, the words buy, buyer, and buying have the same stem. One such algorithm is presented in [15].

After having performed preprocessing and transformation for all three data sources, another data selection step is necessary. For example, we might wish to analyze just a subset of all user sessions, e.g. those sessions with a number of page views in a reasonable range (see e.g. [18]) or sessions that contain transactions (as will be shown below).

3.4. Pattern discovery and interpretation and evaluation of results

Depending on the previously defined objectives, statistical analysis, web traffic analysis, and different data mining approaches can be applied. Table 1 gives an idea of which approach is best suited to a given objective.

Table 1
Objectives and data mining approaches

Objective                      Data mining approach
Acquire new customers          Customer segmentation
Retain customers               Classification
Predict churn of customers     Classification
Cross selling                  Association rules
Predict future visit patterns  Sequential patterns

Since real-world applications are often guided by vaguely described objectives, we generally have to apply a combination of data mining approaches. This is also the case in the application we present below. Interpretation and evaluation of results can be done as in the basic KDD process and will not be analyzed in more detail.

3.5. Business integration

The results from the data mining step have to be integrated into the respective business. Applications of web usage mining for marketing show some particularities, such as the possibility to perform online marketing or to place advertisements in a web site dynamically, among others.

4. Application of the proposed methodology

In this section we present the application of the proposed methodology to target group identification among bank customers. After a brief introduction we describe each of its steps with its respective result.

For convenience we first introduce the following terminology. A customer is a person who uses at least one product of the bank. Here we considered just natural persons; a similar application for legal persons would also be interesting and is left for future work. Among all customers we distinguish registered customers, who are registered for use of the bank's web site and have a user name and password. In this application we do not consider visitors who just visit the bank's web site without being a customer. Analyzing their web usage behavior would be another project.
The Chilean bank Banco Crédito e Inversiones (BCI) started its virtual bank (www.bci.cl) in 1996, and its registered customers currently perform more than 10,000 bank transactions over the Internet per day. Such online transactions typically cause less than 10% of the cost of traditional transactions. Most of the customers, however, are still not registered for online banking.

The objective of the presented work is to increase the use of the virtual bank by encouraging non-registered customers to register for online banking. A first non-trivial task in this project was to identify the adequate data mining tasks in order to reach this goal. Fig. 6 shows some questions that helped us to translate this business objective into data mining tasks.

Fig. 6. Questions in virtual and traditional bank.

In order to reach the above-mentioned objective we defined the following data mining tasks:

1. Determine homogeneous segments of registered customers in order to (hopefully) identify a segment of "heavy users", i.e. registered customers that distinguish themselves from the other registered customers by performing many bank transactions online.
2. Assign the non-registered customers to the segments found before in order to identify "twins" of heavy users, i.e. non-registered customers that look like heavy users.

Available data are the log files and the corporate data warehouse with data for all customers and transaction data of registered customers.

4.1. Selection

From the log files we selected the visits to the web site from a certain period (in our case the last 6 months before the analysis). From the data warehouse we selected demographic data for all customers and transaction data from the same period for the registered customers.
4.2. Preprocessing and transformation

We applied the sessionization procedure [5] to the log files in order to identify user sessions. We used a time-based heuristic with 30 min as the longest user session and the sessionization toolbox WLS XP (WebMining Log Sessionizator Xpert [21]). Then we selected those sessions that contained at least one bank transaction.

4.3. Pattern discovery

The above-mentioned business objective had been translated into two data mining tasks, which we address in the following two steps:

- We first perform clustering in order to segment the registered customers. It will be shown that one of the segments can clearly be separated and contains the group of heavy users.
- Based on these segments, the bank's data warehouse is explored in the second step in order to assign the non-registered customers to the segments found. Especially those non-registered customers that are assigned to the class of heavy users ("twins" of heavy users) are interesting for further marketing activities, because they are expected to perform many bank transactions online once they are registered.

For both tasks, feature selection is necessary, i.e. from all attributes in the data warehouse we have to select the relevant features for segmentation and classification. The attributes available in the data warehouse can be classified into demographic attributes, socioeconomic attributes, product tendency attributes, and transactional attributes. Applying various statistical analyses, e.g. correlation analysis and principal component analysis, together with close interaction with the business experts, allowed us to determine the 6 features that possess the highest discriminative power for the further data mining steps. Competition between Chilean banks prevents us from naming these features.
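Returning briefly to the preprocessing of Section 4.2, the time-based sessionization and the subsequent selection of sessions with at least one transaction can be sketched in Python as follows. This is not the WLS XP tool used in the project. The sketch assumes that a session is closed once a request arrives more than 30 min after the first request of the session from the same IP address, and that transaction pages can be recognized by a hypothetical URL prefix `/transaction/`; the sample requests are invented.

```python
from datetime import datetime, timedelta

MAX_SESSION_LENGTH = timedelta(minutes=30)

def sessionize(requests, max_length=MAX_SESSION_LENGTH):
    """Group (ip, timestamp, url) requests into sessions per IP address.

    A new session starts when a request falls more than max_length after
    the first request of the current session from the same IP address.
    """
    sessions = []
    current = {}  # ip -> index in sessions of the session currently open for that ip
    for ip, timestamp, url in sorted(requests, key=lambda r: (r[1], r[0])):
        index = current.get(ip)
        if index is None or timestamp - sessions[index]["start"] > max_length:
            sessions.append({"ip": ip, "start": timestamp, "pages": []})
            index = len(sessions) - 1
            current[ip] = index
        sessions[index]["pages"].append(url)
    return sessions

def has_transaction(session, transaction_prefix="/transaction/"):
    """True if the session contains a page under the hypothetical transaction prefix."""
    return any(url.startswith(transaction_prefix) for url in session["pages"])

# Invented requests for illustration.
t0 = datetime(2003, 6, 1, 10, 0)
requests = [
    ("192.0.2.10", t0, "/home"),
    ("192.0.2.10", t0 + timedelta(minutes=5), "/transaction/transfer"),
    ("192.0.2.10", t0 + timedelta(minutes=50), "/home"),
    ("198.51.100.7", t0 + timedelta(minutes=2), "/home"),
]
sessions = sessionize(requests)
selected = [s for s in sessions if has_transaction(s)]
print(len(sessions), "sessions,", len(selected), "with at least one transaction")
```

Other readings of the time-based heuristic exist, for example a limit on the gap between two consecutive requests instead of on the total session length; switching between them only changes the comparison inside `sessionize`.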
With these features as input data we determined segments of similar registered customers by applying fuzzy clustering, a technique that has proved its effectiveness for market and customer segmentation; see e.g. [2,10,20]. In particular, we ran the fuzzy c-means algorithm in order to determine homogeneous customer segments; see [4] for more details on the algorithm. The algorithm was applied for each possible number of classes between c = 2 and c = 20, and in each case a measure of class validity, here the partition entropy [22], was calculated. This measure of validity indicates a suitable number of classes; see e.g. [20]. In this application, the most adequate number of classes is c = 5.

Table 2 shows the result of the fuzzy cluster analysis for customer segmentation. We assigned each customer to the class with the highest degree of membership. The third column of Table 2 shows the normalized number of bank transactions performed on average by customers from each class. Due to the competition between local banks we are not allowed to present the real number of transactions. It can be seen that class H is clearly distinguished from the other classes by its very high number of transactions. These are the heavy users we wanted to identify among the registered customers. It should be mentioned that we did not use the number of transactions as one of the features for clustering.
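The clustering procedure described above (fuzzy c-means for several values of c, a validity measure per run, and a final assignment to the class with the highest membership) can be illustrated with the following self-contained Python sketch. It is not the DataEngine implementation used in the project. The fuzzifier m = 2, the Euclidean distance, the deterministic choice of initial centers, and the form of the partition entropy, PE = -(1/N) Σ_i Σ_k u_ik ln u_ik, are assumptions, and the two-dimensional toy data are invented.

```python
import math

def fuzzy_c_means(points, c, m=2.0, iterations=100, tol=1e-6):
    """Plain fuzzy c-means; returns (centers, memberships) with memberships[i][k] in [0, 1]."""
    n, dim = len(points), len(points[0])
    # Deterministic start (a simplification): c evenly spaced data points.
    centers = [list(points[(k * n) // c]) for k in range(c)]
    memberships = [[0.0] * c for _ in range(n)]
    exponent = 2.0 / (m - 1.0)
    for _ in range(iterations):
        # Membership update: u_ik = 1 / sum_l (d_ik / d_il)^(2 / (m - 1))
        for i, point in enumerate(points):
            dists = [math.dist(point, center) for center in centers]
            zero = [k for k, d in enumerate(dists) if d == 0.0]
            if zero:
                # The point coincides with one or more centers: share membership among them.
                memberships[i] = [1.0 / len(zero) if k in zero else 0.0 for k in range(c)]
            else:
                memberships[i] = [
                    1.0 / sum((dists[k] / dists[l]) ** exponent for l in range(c))
                    for k in range(c)
                ]
        # Center update: mean of the points weighted by u_ik^m
        shift = 0.0
        for k in range(c):
            weights = [row[k] ** m for row in memberships]
            total = sum(weights)
            new_center = [
                sum(w * p[d] for w, p in zip(weights, points)) / total
                for d in range(dim)
            ]
            shift = max(shift, math.dist(new_center, centers[k]))
            centers[k] = new_center
        if shift < tol:
            break
    return centers, memberships

def partition_entropy(memberships):
    """PE = -(1/N) * sum_i sum_k u_ik * ln(u_ik); lower values mean a crisper partition."""
    n = len(memberships)
    return -sum(u * math.log(u) for row in memberships for u in row if u > 0.0) / n

# Invented toy data: two well-separated groups in two dimensions.
points = [
    (0.0, 0.1), (0.2, 0.0), (0.1, 0.2), (0.0, 0.0),
    (5.0, 5.1), (5.2, 4.9), (4.9, 5.0), (5.1, 5.2),
]
for c in (2, 3, 4):
    centers, memberships = fuzzy_c_means(points, c)
    print(c, round(partition_entropy(memberships), 4))

# Hard assignment for c = 2: the class with the highest degree of membership.
centers, memberships = fuzzy_c_means(points, 2)
labels = [max(range(2), key=row.__getitem__) for row in memberships]
print(labels)
```

In practice the validity values for the different numbers of classes are compared, usually together with domain knowledge, since the partition entropy tends to grow with c and therefore should not be minimized blindly.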
Table 2
Clustering results

Name   Interpretation                                         Normalized no. of transactions   Cases (%)
L1     Young registered customers with low usage.             17.7                             22
L2     Very young registered customers with low usage.        18.4                             10.3
M1     Old registered customers with moderate usage.          22.0                             11.1
M2     Mid-aged registered customers with moderate usage.     22.7                             28.5
H      Young registered customers with high usage.            100.0                            28.2
Total                                                                                          100

Next, we trained a neural network (multilayer perceptron, MLP) to assign each registered customer to his or her class, information that we had already determined by clustering. It should be stressed at this point that only features that are also available for non-registered customers have been used in order to assign the registered customers to their class. This gives us the opportunity to assign non-registered customers to the previously found segments as well.

To train the neural classifier, a balanced sample of the 5 segments previously mentioned was taken. The total sample contained 21,350 records, of which 80% (17,080) were used for training and the remaining 20% (4270) for testing. The neural network (multilayer perceptron) has the following architecture:

- Number of input neurons: 6, corresponding to the six attributes used.
- Number of neurons in the hidden layer: 12 (transfer function: sigmoid).
- Number of output neurons: 5, corresponding to the 5 classes: H, L1, L2, M1 and M2.

The sample contains just those customers that have a high degree of membership to their respective class (here: at least 0.4). This is where the special advantage of fuzzy clustering plays an important role in this application. Both data mining techniques, fuzzy clustering as well as the neural network, have been applied using the software tool DataEngine [12].

After training we tested the neural network with the 4270 test records. Table 3 shows the respective results. Since we are interested in class H only, we show the errors of type 1 and type 2 between class H and the other classes.

Table 3
Test results

                     Class H (predicted)   Other class (predicted)   Total   Error (%)
Class H (real)       1177                  19                        1196    1.6
Other class (real)   151                   2923                      3074    5.2
Total                1328                  2942                      4270
Error (%)            11.8                  0.6

For the important class of our analysis, class H, only 1.6% of the test data belonging to class H was not classified correctly. The fact that 11.8% of the test data from other classes are assigned erroneously to class H is not so relevant, since it is more important to identify as many objects as possible from class H for future marketing campaigns.

After being trained with registered customers, the neural network was applied to the entire database of non-registered customers. Table 4 shows the results of this classification.

Table 4
Classification results for non-registered customers

Class       L1     L2     M1     M2     H      Total
Cases (%)   22.9   17.7   25.2   13.1   21.0   100.0

According to Table 4, 21% of all non-registered customers were assigned to class H, i.e. they look like heavy users. These non-registered customers (in the following TGH: target group H) are the most attractive ones in terms of motivating web usage because, according to their feature values, they will probably behave like real heavy users once they register.

4.4. Business integration

The bank carried out a mailing to a subset of the target group H (TGH), explaining the advantages of online banking and encouraging registration. Four weeks after this mailing, registration behavior was analyzed and compared between three groups (see Table 5):

- customers from TGH that received the mailing (Group 1),
- customers from TGH that did not receive the mailing (control group; Group 2),
- non-registered customers not in TGH that did not receive the mailing (Group 3).

Table 5
Registration rates of different groups

Group                   Group 1   Group 2   Group 3
Registration rate (%)   10.4      5.1       2.2

The registration rate is the percentage of customers from a certain group that registered in the defined period (4 weeks after the mailing). The difference between Groups 2 and 3 underlines that customers from TGH really show a different attitude towards Internet banking and therefore have a higher registration rate, even without an external stimulus such as a mailing. The difference between Groups 1 and 2 shows the effect of the mailing: it doubled the registration rate within a group of homogeneous customers.

5. Conclusions and future work

We presented a methodology for web usage mining and applied it to marketing in a bank. In this case, special emphasis was placed on promoting the use of Internet banking.
The goal of increasing the number of registered customers by marketing campaigns focused on an interesting target group has been reached. This particular application hints at the potential that web usage mining offers for improved customer relationship management, e.g. in financial services.

What needs to be done in the future can be classified into applied and theoretical work. As applied work we see the tracking of newly registered customers. For the time being, a registration has been counted as a success. It has to be monitored, however, whether a customer who registered after a mailing really behaves like a heavy user. It also becomes important to gather customer reactions to the performed marketing activities in order to continuously improve the developed system, especially in dynamic environments such as the web.

Regarding theoretical work, we need adequate techniques from dynamic data mining, such as dynamic fuzzy clustering, in order to efficiently update the segments of registered customers. Different approaches to dynamic data mining have already been used successfully for customer segmentation; see e.g. [6,11].

Acknowledgements

The Nucleus Millennium Science on Complex Engineering Systems supported this project (www.sistemasdeingenieria.cl).

References
[1] K. Aas, L. Eikvil, Text categorisation: a survey, Technical Report, Norwegian Computing Center, 1999.
[2] J. Angstenberger, R. Weber, M. Poloni, Data warehouse support to data mining: a database marketing perspective, J. Data Warehousing 3 (1) (1998) 2–11.
[3] R. Baeza-Yates, B. Ribeiro-Neto, Modern Information Retrieval, Addison-Wesley, England, 1999.
[4] J.C. Bezdek, J.M. Keller, R. Krishnapuram, N.R. Pal, Fuzzy Models and Algorithms for Pattern Recognition and Image Processing, Kluwer, Boston, London, Dordrecht, 1999.
[5] R. Cooley, B. Mobasher, J. Srivastava, Data preparation for mining world wide web browsing patterns, J. Knowledge Inform. Systems 1 (1999) 5–32.
[6] F. Crespo, R. Weber, A methodology for dynamic data mining based on fuzzy clustering, Fuzzy Sets and Systems, in press.
[7] A. Famili, W.-M. Shen, R. Weber, E. Simoudis, Data preprocessing and intelligent data analysis, Intell. Data Anal. 1 (1) (1997) 3–23.
[8] U.M. Fayyad, Data mining and knowledge discovery: making sense out of data, IEEE Expert, Intelligent Systems and their Applications, October 1996, pp. 20–25.
[9] J. Han, M. Kamber, Data Mining: Concepts and Techniques, Morgan Kaufmann Publishers, San Francisco, 2001.
[10] H. Hruschka, Market definition and segmentation using fuzzy clustering methods, Internat. J. Res. Marketing 3 (1986) 117–134.
[11] A. Joentgen, L. Mikenina, R. Weber, H.-J. Zimmermann, Dynamic fuzzy data analysis based on similarity between functions, Fuzzy Sets and Systems 105 (1) (1999) 81–90.
[12] MIT GmbH, DataEngine 4.0 Manual, Aachen, Germany, 2001, www.dataengine.de.
[13] S.K. Pal, V. Talwar, P. Mitra, Web mining in soft computing framework: relevance, state of the art and future directions, IEEE Trans. Neural Networks 13 (5) (2002) 1163–1177.
[14] J. Peppard, Customer relationship management (CRM) in financial services, European Management J. 18 (3) (2000) 312–327.
[15] M.F. Porter, An algorithm for suffix stripping, Program 14 (3) (1980) 130–137.
[16] J. Srivastava, R. Cooley, M. Deshpande, P.-N. Tan, Web usage mining: discovery and applications of usage patterns from web data, SIGKDD Explorations 1 (2) (2000) 12–23.
[17] R. Stout, Web Site Stats: Tracking Hits and Analyzing Traffic, McGraw-Hill, California, 1997.
[18] J.D. Velázquez, H. Yasuda, T. Aoki, R. Weber, A new similarity measure to understand visitor behavior in a web site, IEICE Trans. Inform. Systems E87-D (2) (2004) 389–396.
[19] X. Wang, K.A. Smith, Clustering web user interests using self organising maps, in: A. Abraham, J. Ruiz-del-Solar, M. Koppen (Eds.), Soft Computing Systems: Design, Management and Applications, IOS Press, Amsterdam, Berlin, 2002, pp. 843–852.
[20] R. Weber, Customer segmentation for banks and insurance groups with fuzzy clustering techniques, in: J.F. Baldwin (Ed.), Fuzzy Logic, Wiley, Chichester, 1996, pp. 187–196.
[21] Webmining Ltda., WebMining Log Sessionizator Xpert WLS, Santiago, Chile, 2003, www.webmining.cl.
[22] H.-J. Zimmermann, Fuzzy Set Theory and its Applications, 4th Edition, Kluwer Academic Publishers, Boston, Dordrecht, London, 2001.
