
Contents
- Abstract
- Introduction
- Objective
- Modules
- Algorithms
- Hardware & Software Requirements
- UML Diagrams
- Introduction to KDDML
- Screen Shots
- Conclusion

Most data mining algorithms and tools stop at discovered customer models, producing distribution information on customer profiles. When applied to industrial problems such as customer relationship management (CRM), such techniques are useful for pointing out which customers are likely attritors and which are loyal, but they require human experts to post-process the discovered knowledge manually. Most post-processing techniques have been limited to producing visualization results and interestingness rankings; they do not directly suggest actions that would increase an objective function such as profit.

We present novel algorithms that suggest actions to change customers from an undesired status (such as attritors) to a desired one (such as loyal) while maximizing an objective function: the expected net profit. These algorithms can discover cost-effective actions to transform customers from undesirable classes to desirable ones. Our approach integrates data mining and decision making tightly by formulating the decision-making problem directly on top of the data mining results in a post-processing step. To improve the effectiveness of the approach, we also present an ensemble of decision trees, which is shown to be more robust when the training data changes. Empirical tests are conducted on both a realistic insurance application domain and UCI benchmark data.

- Research in Data Mining
- Models for Knowledge Discovery
- Final objective of Data Mining


To extract actionable knowledge, we consider CRM; the telecommunication industry is the best example.

The main objective of this proposed project is to maximize the expected net profit while reducing costs.

We present a novel post-processing technique to extract actionable knowledge from decision trees. The actions are associated with attribute-value changes that maximize profit-based objective functions.

We will research other forms of the limited-resources problem as a result of post-processing data mining models, and evaluate the effectiveness of our algorithms in real-world deployments of action-oriented data mining.

- Cost-effectiveness
- High-quality services
- Customers jump from one phone model to another.
- Businesses should find ways to keep their customers.


This project thus plays an important role in CRM.

For decision trees, we have considered two broad cases.

Unlimited-Resource Case
Limited-Resource Case

Steps in post-processing decision trees:
1. Import customer data, with data collection, cleaning, and processing.
2. Build a decision tree to predict the customer status.
3. Search for optimal actions for each customer.
4. Produce reports for domain experts to review.
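The four steps above can be sketched as a small pipeline. This is a minimal Python sketch, not the project's actual code; every function name here is illustrative, and each stage is passed in as a callable so the pipeline stays independent of any concrete data mining library.

```python
# Minimal sketch of the four post-processing steps; all names are illustrative.
def postprocess(raw_records, clean, build_tree, search_actions, report):
    data = clean(raw_records)             # 1. import, clean, and process data
    tree = build_tree(data)               # 2. decision tree predicting status
    actions = search_actions(tree, data)  # 3. optimal actions per customer
    return report(actions)                # 4. reports for domain experts
```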

Algorithm Leaf-Node Search
1. For each customer x, do:
2.   Let S be the source leaf node into which x falls;
3.   Let D be the destination leaf node for x with the maximum net profit PNet;
4.   Output (S, D, PNet);
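A minimal Python sketch of this search, under assumed data structures: the tree is reduced to a list of leaves, each carrying an `id` and `prob` (the probability of the desired status), and `action_cost` is a hypothetical helper that prices the attribute-value changes needed to move a customer between two leaves.

```python
# Sketch of the leaf-node search (unlimited-resource case).
# Each leaf carries P(desired status); all names here are illustrative.

def leaf_node_search(customers, leaves, total_profit, action_cost):
    """For each customer, find the destination leaf with maximum net profit."""
    results = []
    for x in customers:
        source = x["leaf"]              # leaf the customer currently falls into
        best_dest, best_net = source, 0.0
        for dest in leaves:
            # Profit gain from moving x to dest, minus the cost of the
            # attribute-value changes needed to get there.
            gain = total_profit * (dest["prob"] - source["prob"])
            net = gain - action_cost(x, source, dest)
            if net > best_net:
                best_dest, best_net = dest, net
        results.append((source["id"], best_dest["id"], best_net))
    return results
```

A customer stays put (net profit 0) when no destination leaf yields a positive net profit.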

Consider a customer, Jack, whose record states Service = Low, Sex = M, and Rate = L.

Action: change Service from L (low) to H (high). Assume the cost of such a change is $200. If Jack were 100% loyal, the total profit from him would be $1,000. With loyalty probabilities of 0.2 in the source leaf and 0.8 in the destination leaf, the net profit would be $1,000 x (0.8 - 0.2) - $200 = $400.
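The arithmetic behind this example, spelled out (the 0.2 and 0.8 are the loyalty probabilities assumed above for Jack's source and destination leaves):

```python
# Net profit of moving Jack from a leaf with P(loyal) = 0.2 to one
# with P(loyal) = 0.8, at an action cost of $200.
total_profit = 1000   # profit if Jack stays fully loyal
p_source = 0.2        # P(loyal) in Jack's current leaf
p_dest = 0.8          # P(loyal) in the destination leaf
cost = 200            # cost of changing Service from L to H

net_profit = round(total_profit * (p_dest - p_source) - cost, 2)
print(net_profit)     # -> 400.0
```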

Searches for optimal actions.

The decision tree provides the probability of a customer being in the desired status.


A customer used to build the decision tree falls into a leaf node with some probability of being in the desired status. The algorithm tries to move the customer to other leaves with higher probabilities of being in the desired status. This movement incurs a cost, represented in a cost matrix. The algorithm searches all leaf nodes so that the best destination node is found, the one for which the net profit is maximum.

Attribute-value changes incur costs, which are determined by domain experts. For each attribute used in the decision tree, a cost matrix records these costs. Attributes divide into hard attributes and soft attributes. Hard attributes, such as sex, address, or number of children, have values that cannot be changed with a reasonable amount of money.

Soft attributes, such as service level or interest rate, can be changed quite easily. For hard attributes, users must assign a very large number to every entry in the cost matrix. Hard attributes should still be included in the tree, however, because most of them are important for accurate probability estimates at the leaves.
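One way this encoding might look in practice. This is a sketch only: the attribute names, values, dollar figures, and the sentinel cost are assumptions for illustration, not the project's exact tables.

```python
# Cost matrices per attribute: cost_matrix[attr][(from_value, to_value)].
# Hard attributes get a prohibitively large cost so the search never
# proposes changing them, yet they still contribute to the tree's
# probability estimates. All figures are illustrative.
HARD = 1e9   # effectively infinite cost for hard attributes

cost_matrix = {
    "Service": {("L", "H"): 200, ("H", "L"): 0},      # soft attribute
    "Rate":    {("L", "H"): 50,  ("H", "L"): 0},      # soft attribute
    "Sex":     {("M", "F"): HARD, ("F", "M"): HARD},  # hard attribute
}

def change_cost(attr, u, v):
    """Cost of changing attribute attr from value u to v (0 if unchanged)."""
    if u == v:
        return 0
    return cost_matrix[attr][(u, v)]
```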

For companies with limited resources, it is difficult to optimally merge all leaf nodes. For example, each account manager of a mutual fund company can take care of only one customer group. A decision tree here contains source leaf nodes, corresponding to the customer segments to be converted, and a number of destination nodes.

- Formal Definition of BSP
- Algorithms for BSP
- Improving the Robustness
- Experimental Evaluation

BSP: the bounded segmentation problem
C(Attr_i, u, v), i = 1, 2, ...: cost matrices
BC(L_i): unit benefit vector
ASet = {(Attr_i, u_i -> v_i), i = 1, 2, ..., N_d}: action set
PNet: net profit

Complexity of the exhaustive search: O(n^k), k = 1, 2, ...
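The core loop of a greedy approach to BSP, picking one candidate action set at a time by marginal net profit, might be sketched as follows. This is a hedged sketch of the idea, not the paper's exact Greedy-BSP formulation; `profit_of` is a hypothetical helper that evaluates the total net profit of a collection of chosen candidates.

```python
def greedy_bsp(candidates, k, profit_of):
    """Greedily choose up to k action sets maximizing the total net profit.

    candidates: list of candidate action sets (one per destination node)
    profit_of:  maps a collection of chosen candidates to its net profit
    """
    chosen = []
    remaining = list(candidates)
    for _ in range(min(k, len(remaining))):
        # Pick the candidate whose addition yields the largest total profit.
        best = max(remaining, key=lambda c: profit_of(chosen + [c]))
        if profit_of(chosen + [best]) <= profit_of(chosen):
            break                      # no candidate improves the profit
        chosen.append(best)
        remaining.remove(best)
    return chosen
```

Each of the at most k rounds scans all remaining candidates once, which is what makes the greedy search cheap compared with examining every size-k subset.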

Hardware Requirements:
- Processor: Intel Pentium III or more
- RAM: 256 MB or more
- Cache: 512 KB
- Hard disk: 16 GB

Software Requirements:
- Operating system: Windows 2000/XP or later
- Languages/Software: Java Runtime Environment, Java Software Development Kit 1.6, Java NetBeans IDE

Use Case Diagram: the System actor participates in two use cases: Extracting Unlimited Resource, which includes (<<include>>) Leaf Node Search URC, and Limited Resource Case, which includes (<<include>>) Greedy BSP.

Activity Diagram (Unlimited Resource Case): Input Customer Data -> Build Customer Profile -> Compute Proactive Solution -> Search Optimal Actions -> Generate Reports

Activity Diagram (Leaf Node Search, URC): Retrieve Each Customer -> Input Leaf Node -> Input Destination Node -> Compute Cost Matrix -> Compute Profit

Activity Diagram (Limited Resource Case): Retrieve Decision Tree -> Retrieve Path from Root to Destination -> Apply Cost Matrix -> Maximize Profit

Activity Diagram (Greedy BSP): Input Training Set -> Rank All Attributes -> Construct Decision Tree -> Input Data -> Calculate Greedy Profit -> Retrieve Action Set

Sequence Diagram (UnlimitedResources : System):
1. InputCustomer()
2. BuildProfile()
3. ComputeSolution()
4. SearchActions()
5. GenerateReports()

Sequence Diagram (LeafNodeURC : System):
1. RetrieveCustomer()
2. InputSourceNode()
3. InputDestinationNode()
4. ComputeMatrix()
5. ComputeProfit()

Sequence Diagram (LimitedResource : System):
1. RetrieveTree()
2. RetrievePath()
3. RetrieveMatrix()
4. MaximizeProfit()

Sequence Diagram (GreedyBSP : System):
1. InputTraining()
2. RankAttributes()
3. ConstructDecisionTree()
4. InputData()
5. CalculateNetProfit()
6. RetrieveActionSet()

Test Case
- Test Case ID:
- Test Case Name: Required Software Testing
- Purpose: To check whether the required software is installed on the system
- Input: Enter the javac version and java version commands
- Expected Result: Should display the version number for both the Java compiler and the Java Virtual Machine

Test Case (actual run)
- Test Case ID:
- Test Case Name: Required Software Testing
- Input: Enter the javac version and java version commands
- Actual Result: Displays java version "1.6.0_01" and javac "1.6.0_01"
- Failure: If the Java environment is not installed, the deployment fails

Result: Success

Most data mining algorithms and tools produce only segments and ranked lists of customers or products in their outputs. In this project, we present a novel technique that takes these results as input and produces a set of actions that can be applied to transform customers from undesirable classes to desirable ones. For decision trees, we have considered two broad cases: the first corresponds to unlimited resources, and the second to limited-resource-constraint situations. In both cases, our aim is to maximize the expected net profit over all customers. We have presented a greedy heuristic algorithm that solves both problems efficiently, as well as an ensemble-based decision-tree algorithm that uses a collection of decision trees, rather than a single tree, to generate the actions. We show that the resulting action set is indeed more robust with respect to training data changes.

