Professor Tom Fomby, Director, Richard B. Johnson Center for Economic Studies, Department of Economics, SMU
May 23, 2013
Types of Problems
Customer and Student Retention
Employee Churn
Credit Scoring (Auto or Home Loans)
Bond Ratings
What Characteristics Make for a Successful Mary Kay Representative?
Detection of Fraudulent Insurance Claims
Is a Newly Introduced Product Meeting with Consumer Acceptance or Rejection?
Who Is a Likely Donor to Your Charity?
Early Detection of a Stolen or Compromised Credit Card
Types of Problems
What Kinds of Genetic Markers Imply Susceptibility to Specific Diseases?
Netflix and Recommendations of Related and Suggested Movies
Recommendations for Book Purchases: Amazon Side-Bars
Click Stream Analysis of Optimal Web Page Design
Crankshaft Cartoon
Text Mining
Text -> Numbers for Prediction
Who Wrote the Federalist Papers?
Frederick Mosteller and David Wallace, "Inference in an Authorship Problem," JASA, June 1963
Target Marketing
Target marketing is the process of choosing specific customers to advertise to and/or offer discounts to in order to increase the company's sales.
Target marketing usually proceeds in two stages: (1) determining the probability that a solicited customer will purchase from the company, and (2) given that the customer does purchase, estimating the profit that the customer's purchases are likely to generate.
The goal is to advertise only to those potential customers whose expected profit exceeds the cost of advertising to them.
We therefore need data mining techniques to determine (1) the probability of purchase and (2) the expected profit conditional on purchase.
Expected Profit of Purchase = (Probability of Purchase) x (Expected Profit from Purchase, Conditional on Purchase)
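The expected-profit rule above can be sketched in a few lines of Python. This is only an illustration: the customer names, probabilities, profits, and the contact cost are made-up values, and in practice both inputs would come from fitted models rather than being typed in by hand.

```python
def expected_profit(p_purchase, profit_if_purchase):
    """Expected profit = P(purchase) x E[profit | purchase]."""
    return p_purchase * profit_if_purchase


def should_solicit(p_purchase, profit_if_purchase, cost_of_contact):
    """Advertise only if expected profit exceeds the cost of advertising."""
    return expected_profit(p_purchase, profit_if_purchase) > cost_of_contact


# Hypothetical customers: (name, probability of purchase, profit if they purchase)
customers = [
    ("A", 0.10, 40.0),
    ("B", 0.02, 60.0),
    ("C", 0.30, 5.0),
]
cost = 2.0  # assumed cost of advertising to one customer

targets = [name for name, p, profit in customers
           if should_solicit(p, profit, cost)]
print(targets)
```

Note that customer C has the highest purchase probability but is not targeted, because the profit conditional on purchase is too small to cover the contact cost.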
Credit Scoring
Credit scoring involves using data mining tools to determine the creditworthiness of loan applicants.
The task is to determine the probability that a potential borrower will default on his or her obligations, given the personal characteristics of the borrower and the macroeconomic conditions at the time.
Some examples: Citibank and other credit card issuers reviewing applicants for credit cards; banks considering mortgage loans.
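One common way to turn borrower characteristics and macroeconomic conditions into a default probability is a logistic model. The sketch below assumes that approach; the feature names, coefficients, and intercept are hypothetical and are not estimates from any real data set.

```python
import math


def default_probability(features, coefficients, intercept):
    """Logistic model: P(default) = 1 / (1 + exp(-(intercept + sum(b_k * x_k))))."""
    z = intercept + sum(coefficients[k] * features[k] for k in coefficients)
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical coefficients: two personal characteristics plus one
# macroeconomic variable (the unemployment rate, in percent).
coefficients = {"debt_to_income": 3.0, "late_payments": 0.8, "unemployment_rate": 0.2}
intercept = -4.0

applicant = {"debt_to_income": 0.2, "late_payments": 0, "unemployment_rate": 5.0}
risky_applicant = {"debt_to_income": 0.6, "late_payments": 3, "unemployment_rate": 5.0}

p_default = default_probability(applicant, coefficients, intercept)
p_default_risky = default_probability(risky_applicant, coefficients, intercept)
print(round(p_default, 3), round(p_default_risky, 3))
```

In a real application the coefficients would be estimated from historical loan outcomes, and a lender would compare the resulting probability with a cutoff chosen to reflect its tolerance for losses.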
Fraud Detection
Of interest to the IRS, credit card companies, and auditors.
Given a history of transactions, or a record of typical income tax reports, income statements, or balance sheets, which transactions or reports appear to be outliers?
Basic Tool: Statistical Outlier Analysis. Roughly speaking: what is three or more standard deviations from the norm?
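The rough three-standard-deviation rule can be illustrated as follows. This is a minimal sketch with invented transaction amounts; it estimates the norm from past history only and then screens new values against it.

```python
import statistics


def flag_outliers(history, new_values, threshold=3.0):
    """Return new values lying `threshold` or more sample standard
    deviations away from the mean of the historical values."""
    mean = statistics.mean(history)
    sd = statistics.stdev(history)  # sample standard deviation
    return [x for x in new_values if abs(x - mean) >= threshold * sd]


# Hypothetical past transaction amounts (mean 100, sample sd 2)
history = [100, 102, 98, 101, 99, 100, 103, 97]
new_transactions = [101, 95, 107, 250]

suspicious = flag_outliers(history, new_transactions)
print(suspicious)  # [107, 250]
```

Estimating the mean and standard deviation from the history alone keeps a single very large new value from inflating the standard deviation and hiding itself.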
Customer Retention
What factors determine the loyalty displayed by a customer?
When is a customer likely to jump ship?
Would loyalty programs be useful?
Basic Tool: Duration Modeling. This method determines which factors extend or limit the length of time customers stay with a company.
Purpose: To identify potentially fragile customers and then incentivize them to remain loyal.
Result: Higher profits
Customer Segmentation
Suppose you are a giant publisher of magazines of various types.
How do your subscribers differ across your portfolio of magazines?
When soliciting advertising for your magazines, how do you match potential advertisers with your magazines so that the advertisers receive the maximum benefit for their advertising expenditures?
Is there a niche market (customer segment) that none of your magazines (or those of your competitors) is currently serving? Is this niche market substantial enough to warrant introducing a new magazine?
Also, retailers often want to distinguish between customers with low and high elasticities of demand for their products, so that they know whom to offer discounts to in order to increase revenues and profits.
Basic Tool: Cluster Analysis
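Cluster analysis covers many methods; one widely used example is k-means, sketched below in plain Python. The subscriber data (age, income in thousands of dollars) and the starting centroids are invented for illustration, and k-means is only one of several clustering techniques that could be applied here.

```python
def kmeans(points, centroids, iterations=10):
    """Basic k-means: assign each point to its nearest centroid, then
    move each centroid to the mean of the points assigned to it."""
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            # Squared Euclidean distance from p to every centroid
            distances = [sum((a - b) ** 2 for a, b in zip(p, c)) for c in centroids]
            clusters[distances.index(min(distances))].append(p)
        # Recompute centroids; keep the old one if a cluster is empty
        centroids = [
            tuple(sum(coord) / len(cluster) for coord in zip(*cluster)) if cluster else c
            for cluster, c in zip(clusters, centroids)
        ]
    return centroids, clusters


# Hypothetical subscribers: (age, income in $1000s)
subscribers = [(22, 30), (25, 35), (24, 32), (55, 90), (60, 95), (58, 100)]
initial_centroids = [(20, 30), (60, 90)]  # assumed starting guesses

centroids, clusters = kmeans(subscribers, initial_centroids)
print(clusters)
```

With these values the procedure separates a younger, lower-income segment from an older, higher-income one. In practice the variables would usually be standardized first so that no single variable dominates the distance calculation.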
Affinity Analysis
Given that a customer purchases a given set of items, what is the probability that they will purchase another set of items? That is, what does the customer's final market basket look like, given a partially filled one?
Purpose: Arrange the shelves of a retail store so as to make it most convenient for customers to purchase related goods and to minimize the time spent searching and shopping. We want the customer to be able to shop quickly but at the same time buy a lot!
On bookseller web pages, once you have indicated an interest in purchasing a given book, several related books are often brought to your attention by advertisements in the margins of the page you are currently on.
Affinity analysis is helpful in generating associated sales on retail web pages. This increases the profits of the web retailer.
Major Tool: Association Rules and the Apriori Algorithm.
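The conditional probability in the opening question is what association-rule analysis calls the confidence of a rule, and the share of baskets containing an itemset is its support. The sketch below computes both for a handful of invented market baskets. It is not the Apriori algorithm itself, which adds an efficient search over candidate itemsets.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Estimated P(consequent in basket | antecedent in basket)."""
    both = support(transactions, set(antecedent) | set(consequent))
    base = support(transactions, antecedent)
    return both / base if base > 0 else 0.0


# Hypothetical market baskets
baskets = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
    {"bread", "milk", "eggs"},
]

rule_support = support(baskets, {"bread", "milk"})
rule_confidence = confidence(baskets, {"bread"}, {"milk"})
print(rule_support, rule_confidence)
```

Here bread appears in four of five baskets and bread with milk in three, so the rule "bread implies milk" has support 3/5 and confidence 3/4.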
Link Analysis
Explores associations between groups (individuals, organizations, web sites, nation-states, and the like).
Uses: To improve web page design, to facilitate criminal investigations, and to benefit medical research in epidemiology and pharmacology, among other uses.
Text Mining
To understand textual content
To find interesting regularities in text
To help classify documents by type and content
Useful for medical science search engines seeking the most current research on particular maladies seen in patients
Beneficial in building spam filters
Helps examine the evolution of opinion vis-à-vis blogs
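Most of these tasks start by turning text into numbers that a prediction method can use. A minimal sketch of one such conversion, a word-count vector over a fixed vocabulary, is shown below. The sample sentence and the choice of vocabulary are illustrative assumptions; counts of common function words such as "upon" and "whilst" are the kind of feature used in authorship studies.

```python
import re
from collections import Counter


def bag_of_words(document, vocabulary):
    """Count how often each vocabulary word occurs in the document."""
    counts = Counter(re.findall(r"[a-z']+", document.lower()))
    return [counts[word] for word in vocabulary]


vocabulary = ["upon", "whilst", "while", "by"]  # assumed marker words
sample = "Upon the whole, it rests upon the people, while power is held by them."

vector = bag_of_words(sample, vocabulary)
print(vector)  # [2, 0, 1, 1]
```

Each document becomes one row of numbers, so a collection of documents can be fed to a classifier in the same way as any other data table.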
Data Preparation & Exploration: Sampling, Cleaning, Summaries, Visualization, Partitioning, Dimension Reduction
Classification: K-Nearest Neighbor, Naive Bayes, Logistic Regression, Classification Trees, Neural Nets, Discriminant Analysis
Segmentation/Clustering
Affinity Analysis/Association Rules
Model Evaluation & Selection
Deriving Insight