emenasalvas@fi.upm.es
The gap between technology development on the web and business factors is increasing and, as a side effect, separates what technologists develop from what companies need.
Knowing that the problem exists is just the beginning.
Technological projects have to be integrated into the global strategy of the company.
The problem
Innovative ideas in e-commerce are vaguely defined, so they lose focus and precision.
New technologies are being applied and consume resources, but without appropriate financial or economic benefits.
The growth of web activity and its participation in every daily activity (commercial, educational, news, ...) is not being matched by a corresponding number of services.
Services are considered insufficient, so site sponsors have to improve the services they offer to satisfy the growing demand.
On the other hand, growth in the offer will bring growth in demand, which will lead consumers to ask for better services.
Web mining projects have to be planned as one more project in the global strategy of the company.
The question:
The criteria used to evaluate the success of any project in the enterprise differ from those used for a web project, and this difference is at the root of web mining's incomplete success.
Site sponsors do not evaluate commercial and financial aspects and rely only on vague commercial notions.
Success in terms of use, structure and content has to be linked to the achievement of the company's business goals.
To evaluate if the web mining project results contribute to the company goals fulfilment:
The web site is usually not the end but the means. It is one of the channels that the company uses to achieve its goals. For a site to be considered successful, the activities carried out through it must generate value for the company.
Traditional approaches only analyze the site from the user's perspective, but the users' actions have to generate value for the company.
It is a CRM project.
Web project plan generation
[Figure: CRM architecture diagram. Labels: Analytical CRM, Collaborative CRM, Legacy Systems, Data Warehouse, Front Office, Service Automation, Marketing Automation, Sales Automation, Customer Activity, Customers, Products, Mobile Office, Mobile Sales, Field Service, Vertical Apps., Category Mgmt., Campaign Mgmt., Customer Interaction, Fax, Letter, Direct Interaction]
Data Mining
[Figure: pyramid of increasing potential to support business decisions, layers listed from top to bottom]
Making Decisions
Data Presentation: Visualization Techniques
Data Mining: Information Discovery
Data Exploration: Statistical Analysis, Querying and Reporting
Data Warehouses / Data Marts: OLAP, MDA
Data Sources: Paper, Files, Information Providers, Database Systems, OLTP
Role label shown in the figure: DBA
Fact Gap
Deployment
Web logs are just the beginning.
Not only the data has to be taken into account, but also all the circumstances under which the data were collected:
Environment
General
Organization-related
Customer-related
Environment
It affects, both directly and indirectly, the way activities occur. Among the factors to take into account:
Legal conditions
Technological conditions
Demography
Ecological conditions (weather, transport, communications)
Cultural and social conditions
Geographical situation
Information to be added
Departments:
The same concept can have a different meaning depending on the department
A product for marketing is not the same as a product for production
Products, services:
Data about the object itself: size, color, ...
Data relevant for the company: profit margin, top ten, ...
How it is presented on the web
Navigators:
Static data: gender, demographic information (it varies over time, but at a particular moment it is static)
Roles
Behaviour with the company being analyzed: number and kind of transactions he/she performs
Behavioural data related to the environment (economy, legal constraints, climate, ...)
Web log: location (IP), time, browser, ...
Behaviour: compared with the normal behaviour, if any, to discover mood, a different location, ...
Dates
A date has no meaning in itself
Legal and fiscal periods, holidays, weekends, ...
Opening, closure, ...
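The kind of enrichment listed above (web log fields plus date context such as weekends and holidays) can be sketched as follows. This is only an illustration under stated assumptions: the `LogEntry` fields, the `HOLIDAYS` set and the function name are hypothetical, and a real project would take the calendar from the company's own legal and fiscal periods.

```python
from dataclasses import dataclass
from datetime import date, datetime

# Hypothetical holiday calendar (assumption): replace with the
# organization's real legal, fiscal and opening/closure calendar.
HOLIDAYS = {date(2024, 1, 1), date(2024, 12, 25)}


@dataclass
class LogEntry:
    """Minimal web log record: location (IP), time, browser, URL."""
    ip: str
    timestamp: datetime
    browser: str
    url: str


def enrich_with_date_context(entry: LogEntry) -> dict:
    """Attach date-related context, since a date alone has no meaning."""
    day = entry.timestamp.date()
    return {
        "ip": entry.ip,
        "url": entry.url,
        "browser": entry.browser,
        "timestamp": entry.timestamp.isoformat(),
        "is_weekend": day.weekday() >= 5,  # Saturday = 5, Sunday = 6
        "is_holiday": day in HOLIDAYS,
    }
```

The same pattern extends to the other dimensions (products, navigators, departments): each raw log record is joined with whatever business context gives it meaning before mining starts.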
Data enrichment
There is no method and no model to follow. It is more of an art.
It comes only with experience.
Projects in the same domain share the enrichment:
A model could be established
Evaluate whether the data are appropriate to mine
Evaluate the kind of patterns that can be obtained
Evaluate whether a certain pattern cannot be obtained
Ontology of concepts?
Integrate metadata so that the mining activity deals with them.
The solution has to be deployed and integrated into the site structure.
Patterns evolve over time as new data arrive.
Models have to be refined.
Establish the basis for refining the model without a decrease in performance.
[Figure: deployment architecture. Labels: Interface Agent, Original WEBSITE, WebLogs]
Subsessions capture or approximate user state information.
Key concept: frequent behavior paths.
A Markov model predicts the next set of pages and the behaviour.
A webhouse stores information about users.
Apache is modified to provide pop-ups and precaching.
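As a rough illustration of the Markov idea, the sketch below estimates first-order transition probabilities between pages from a set of sessions and returns the most likely next pages, which is the kind of candidate list a precaching component could use. The model order, the function names and the sample sessions are assumptions made for this example; the slides do not specify them.

```python
from collections import Counter, defaultdict


def train_markov(sessions):
    """Count page-to-page transitions over all sessions (first-order model)."""
    transitions = defaultdict(Counter)
    for pages in sessions:
        for current, nxt in zip(pages, pages[1:]):
            transitions[current][nxt] += 1
    return transitions


def predict_next(transitions, page, k=2):
    """Return up to k most likely next pages with their estimated probabilities."""
    counts = transitions.get(page)
    if not counts:
        return []  # page never seen as a source of a transition
    total = sum(counts.values())
    return [(nxt, n / total) for nxt, n in counts.most_common(k)]


# Hypothetical sessions, each an ordered list of visited pages.
sessions = [
    ["home", "catalog", "cart", "purchase"],
    ["home", "catalog", "shipping"],
    ["home", "privacy"],
    ["home", "catalog", "cart"],
]
model = train_markov(sessions)
print(predict_next(model, "home"))  # [('catalog', 0.75), ('privacy', 0.25)]
```

A first-order model only looks at the current page; using subsessions as states instead of single pages would capture more of the user's context at the cost of needing more data.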
Case-study
PDEP
Behaviour rules
[Figure: behaviour rules over the pages Main page, Notice board, Exams, Practicals, Support material, Practical 1 and Practical 2]
Decision page
Target page
2. Find Subsessions
Sessions may be described in terms of subsessions. E.g., browse catalog, browse shipping information, browse privacy notices, perform purchase. Subsessions may be defined in a number of ways, according to the PDEP desired semantics. E.g., use breakpoints.
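One possible reading of the breakpoint idea can be sketched as follows. It assumes, purely for illustration, that a breakpoint is a page view whose viewing time reaches a threshold and that such a page closes the current subsession. The threshold value, the function name and the sample session are hypothetical.

```python
def split_into_subsessions(page_views, threshold_secs=30.0):
    """Split a session into subsessions at breakpoints.

    page_views: ordered list of (url, seconds_spent) pairs.
    Assumption: a page viewed for at least threshold_secs is a
    breakpoint and ends the current subsession.
    """
    subsessions, current = [], []
    for url, secs in page_views:
        current.append(url)
        if secs >= threshold_secs:
            subsessions.append(current)
            current = []
    if current:  # keep the trailing pages after the last breakpoint
        subsessions.append(current)
    return subsessions


# Hypothetical session: URL and time spent on it in seconds.
session = [
    ("/catalog", 5),
    ("/catalog/item", 35),
    ("/shipping", 8),
    ("/privacy", 40),
    ("/purchase", 12),
]
print(split_into_subsessions(session))
# [['/catalog', '/catalog/item'], ['/shipping', '/privacy'], ['/purchase']]
```

Other definitions of a breakpoint, for example a fixed set of landmark pages, fit the same loop by changing only the condition that closes a subsession.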
[Figure: time (secs) spent per URL within a session. Labels: PIND, PDEP, breakpoints BK N, BK M, BK P]
[Figure: value per URL for session 1, session 2 and session 3]
Conclusion
Without a proper project management:
Significant patterns are difficult to obtain
The results are difficult to interpret
The potential of the process is minimized
Site goals have to be integrated.
Algorithms alone are of no use:
The best algorithm does not always mean the best result
The patterns have to be deployed in a proper architecture
THANKS! QUESTIONS???