Abstract

This paper uses two data mining classification algorithms, C5.0 and CART, to extract information useful for decision-making from customers' transaction behaviors. First, through business understanding, data understanding, data preparation, modeling and evaluation, we obtain the results of the two algorithms. By comparing the results, we find that both algorithms can be applied to the customer membership card classification model and can obtain quite accurate results. We then introduce the application of this model. Through analysis, we find that customers' income level and number of children are the two main factors that affect their choice of card. Knowing that, enterprises can take corresponding measures, such as dividing customers into different groups and then recommending the corresponding card to customers who have similar characteristics. By this means, enterprises can provide special services to users of different card ranks in order to attract more customers.

1. Introduction

Data mining [1] is the process of discovering interesting knowledge from large amounts of data stored in databases, data warehouses, or other information repositories. In practice, the data to be analyzed by data mining techniques are often incomplete (lacking attribute values or certain attributes of interest, or containing only aggregate data), noisy (containing errors, or outlier values which deviate from the expected), and inconsistent (e.g. containing discrepancies in the department).

Classification analysis analyzes the data in the demonstration database in order to give an accurate description, establish an accurate model, or mine the classifying rule for each category, and then uses the classifying rule to classify records in other databases.

The Food Mart is an international chain store. Its product mix is "70% food + 20% daily necessities + 10% styles", and it has some distinctive characteristics, especially in style and taste. The store has created a high-taste shopping environment for its customers and has put a lot of effort into its operations and staffing in order to attract more high-level consumers.

The international chain store uses a membership card management system, which helps not only to accumulate customer information but also to offer corresponding services to users of different card ranks. In this way the store can enhance customers' loyalty. Therefore, in order to recommend the corresponding card to the appropriate customer, senior managers want to know the characteristics of customers of each card rank and which factor is the most important in leading customers to choose one kind of card rather than another.

SPSS Clementine is an open data mining tool and has won the British government SMART innovation prize twice. It not only supports the entire data mining flow, which consists of getting data, transferring data, modeling, evaluating and deploying, but also supports the accepted data mining standard, CRISP-DM (Cross-Industry Standard Process for Data Mining). The visualization in Clementine makes "thought" possible; that is, programmers can concentrate on the problem to be solved rather than being limited by technical work (e.g. coding). It also provides various graphic techniques which help users understand the key relations in the data and guide them to the final solution in the most convenient way.

2. Classification Algorithm

The decision tree is an important model for classification. It was a learning system, CLS, built by
Figure 2. Effect of member card's classification by country factor
For normal card users, the education background is lower than that of other card users.

Rule summary: A customer whose yearly income is from 10,000 to 30,000 is a normal card customer. A customer whose yearly income is from 30,000 to 150,000 is a bronze card customer if they have fewer than 3 children, and a golden card customer if they have more than 3 children. A customer whose yearly income is more than 150,000 is a silver card customer if unmarried, and a golden card customer if married. Thus, the main factors that have a big influence on customer rank are income, number of children and marital status.

Classification result of C&RT algorithm:

Figure 11. Comparison result of C5.0 and C&RT model by gains parameter

The abscissa is usually the quantile (ordered by descending confidence), and the y-coordinate is the cumulative Gains. The ideal Gains chart should reach high cumulative Gains in the early quantiles, rise very quickly to 100% and then stay steady.

Response parameter comparison: Response = (cumulative number of hits in the quantile / number of samples in the quantile) × 100%
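The Gains and Response measures described above can be illustrated with a minimal Python sketch. It is not Clementine's implementation. It assumes each record is a hypothetical (confidence, is_hit) pair, that records are ranked by descending confidence, and that cumulative Gains is the share of all hits captured up to a quantile (a common definition that is not spelled out in this section). The function name `gains_and_response` and its parameters are illustrative only.

```python
from typing import List, Tuple


def gains_and_response(
    scored: List[Tuple[float, bool]], n_quantiles: int = 10
) -> List[Tuple[float, float]]:
    """Return (cumulative gains %, cumulative response %) for each quantile.

    scored: hypothetical (confidence, is_hit) pairs, one per record.
    """
    # Rank records by descending confidence, as on the chart's abscissa.
    ranked = sorted(scored, key=lambda pair: pair[0], reverse=True)
    n = len(ranked)
    total_hits = sum(1 for _, hit in ranked if hit)

    results = []
    for q in range(1, n_quantiles + 1):
        # Number of records covered up to and including quantile q.
        cutoff = (q * n) // n_quantiles
        hits_so_far = sum(1 for _, hit in ranked[:cutoff] if hit)
        # Assumed definition: share of all hits captured so far.
        gains = 100.0 * hits_so_far / total_hits if total_hits else 0.0
        # Share of the records covered so far that are hits.
        response = 100.0 * hits_so_far / cutoff if cutoff else 0.0
        results.append((gains, response))
    return results
```

A model whose confident predictions are mostly hits shows Gains climbing quickly toward 100% in the first quantiles, which is the shape described for the ideal chart.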
6. Evaluating
Therefore, both algorithms can be applied well to the customer membership card
classification model and can obtain quite accurate results.
7. Application
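As one way to picture how the rule summary from the modeling results could be used to recommend a card, the sketch below encodes those rules as a small Python function. It is a minimal illustration, not the deployed model. The function name `recommend_card` and its parameters are hypothetical. The rule summary does not say what happens for incomes below 10,000, for exactly 3 children, or exactly at the 30,000 boundary, so the sketch assumes an income of exactly 30,000 falls in the upper band and returns None for the uncovered cases.

```python
from typing import Optional


def recommend_card(
    yearly_income: float, num_children: int, married: bool
) -> Optional[str]:
    """Map customer attributes to a card rank using the rule summary.

    Returns None where the rule summary gives no rule (assumption).
    """
    if 10_000 <= yearly_income < 30_000:
        return "normal"
    if 30_000 <= yearly_income <= 150_000:
        if num_children < 3:
            return "bronze"
        if num_children > 3:
            return "golden"
        return None  # exactly 3 children is not covered by the rules
    if yearly_income > 150_000:
        return "golden" if married else "silver"
    return None  # incomes below 10,000 are not covered by the rules


# Example: a married customer earning 50,000 with one child.
print(recommend_card(50_000, 1, True))
```

Writing the rules this way also makes the uncovered cases visible, which is worth checking against the actual decision tree before using the rules for recommendations.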
8. Conclusions
9. References