Subject

1. Story
2. Slides
3. Spotfire Dashboard
4. Research Notes
4.1. Home
4.2. Table of Contents
4.3. Who's Using It
4.4. What They're Saying
4.5. Discussion/resources
4.5.1. Fall 2012 Course Notes
4.5.2. Fall 2013 Course Notes
4.5.3. Google Group Data Science for Business
4.6. Errata
4.6.1. Preface
4.6.2. Chapter 3
4.6.3. Chapter 4
4.6.4. Chapter 6
4.6.5. Chapter 7
4.6.6. Chapter 9
4.6.7. Chapter 11
5. Data Science for Business
5.1. Cover Page
5.2. Back Cover Page
5.3. Praise
5.4. Inside Cover Page
5.5. Dedication
5.6. Preface
5.6.1. Introduction
5.6.2. Our Conceptual Approach to Data Science
5.6.3. To the Instructor
5.6.4. Other Skills and Concepts
5.6.5. Sections and Notation
5.6.5.1. Technical Details Ahead: A note on the starred sections
5.6.6. Using Examples
5.6.7. Safari® Books Online
5.6.8. How to Contact Us
5.6.9. Acknowledgments
5.7. 1. Introduction: Data-Analytic Thinking
5.7.1. The Ubiquity of Data Opportunities
5.7.2. Example: Hurricane Frances
5.7.3. Example: Predicting Customer Churn
5.7.4. Data Science, Engineering, and Data-Driven Decision Making
5.7.4.1. Figure 1-1. Data science in the context of various data-related processes in the organization
5.7.5. Data Processing and "Big Data"
5.7.6. From Big Data 1.0 to Big Data 2.0
5.7.7. Data and Data Science Capability as a Strategic Asset
5.7.8. Data-Analytic Thinking
5.7.8.1. The need for managers with data-analytic skills
5.7.9. This Book
5.7.10. Data Mining and Data Science, Revisited
5.7.11. Chemistry Is Not About Test Tubes: Data Science Versus the Work of the Data Scientist
5.7.12. Summary
5.8. 2. Business Problems and Data Science Solutions
5.8.1. From Business Problems to Data Mining Tasks
5.8.2. Supervised Versus Unsupervised Methods
5.8.2.1. A note on the terms: Supervised and unsupervised learning
5.8.3. Data Mining and Its Results
5.8.3.1. Figure 2-1. Data mining versus the use of data mining results
5.8.4. The Data Mining Process
5.8.4.1. Figure 2-2. The CRISP data mining process
5.8.4.2. Business Understanding
5.8.4.3. Data Understanding
5.8.4.4. Data Preparation
5.8.4.5. Modeling
5.8.4.6. Evaluation
5.8.4.7. Deployment
5.8.5. Implications for Managing the Data Science Team
5.8.5.1. Software skills versus analytics skills
5.8.6. Other Analytics Techniques and Technologies
5.8.6.1. Statistics
5.8.6.2. Database Querying
5.8.6.3. Data Warehousing
5.8.6.4. Regression Analysis
5.8.6.5. Machine Learning and Data Mining
5.8.6.6. Answering Business Questions with These Techniques
5.8.7. Summary
5.9. 3. Introduction to Predictive Modeling: From Correlation to Supervised Segmentation
5.9.1. Models, Induction, and Prediction
5.9.1.1. Terminology: Prediction
5.9.1.2. Figure 3-1. Data mining terminology for a supervised classification problem
5.9.1.3. Many Names for the Same Things
5.9.1.4. Terminology: Induction and deduction
5.9.2. Supervised Segmentation
5.9.2.1. Selecting Informative Attributes
5.9.2.1.1. Figure 3-2. A set of people to be classified
5.9.2.2. Equation 3-1. Entropy
5.9.2.3. Figure 3-3. Entropy of a two-class set as a function of p(+)
5.9.2.4. Equation 3-2. Information gain
5.9.2.5. Figure 3-4. Splitting the "write-off" sample into two segments
5.9.2.6. Figure 3-5. A classification tree split on the three-valued Residence attribute
5.9.2.7. Numeric variables
5.9.2.8. Example: Attribute Selection with Information Gain
5.9.2.8.1. Table 3-1. The attributes of the Mushroom dataset
5.9.2.8.2. Figure 3-6. Entropy chart for the entire Mushroom dataset
5.9.2.8.3. Figure 3-7. Entropy chart for the Mushroom dataset as split by GILL-COLOR
5.9.2.8.4. Figure 3-8. Entropy chart for the Mushroom dataset as split by SPORE-PRINT-COLOR
5.9.2.8.5. Figure 3-9. Entropy chart for the Mushroom dataset as split by ODOR
5.9.2.9. Supervised Segmentation with Tree-Structured Models
5.9.2.9.1. Figure 3-10. A simple classification tree
5.9.2.9.2. Figure 3-11. First partitioning: splitting on body shape (rectangular versus oval)
5.9.2.9.3. Figure 3-12. Second partitioning: the oval body people sub-grouped by head type
5.9.2.9.4. Figure 3-13. Third partitioning: the rectangular body people sub-grouped by body color
5.9.2.9.5. Figure 3-14. The classification tree resulting from the splits done in Figure 3-11 to Figure 3-1
5.9.3. Visualizing Segmentations
5.9.3.1. Figure 3-15. A classification tree and the partitions it imposes in instance space
5.9.3.2. Decision lines and hyperplanes
5.9.4. Trees as Sets of Rules
5.9.5. Probability Estimation
5.9.5.1. Figure 3-16. The effect of Laplace smoothing on probability estimation for several instance rat
5.9.6. Example: Addressing the Churn Problem with Tree Induction
5.9.6.1. Table 3-2. Attributes for the cellular phone churn-prediction problem
5.9.6.2. Figure 3-17. Churn attributes from Table 3-2 ranked by information gain
5.9.6.3. Figure 3-18. Classification tree learned from the cellular phone churn data
5.9.7. Summary
5.10. 4. Fitting a Model to Data
5.10.1. Sidebar: Simplifying Assumptions in This Chapter
5.10.2. Classification via Mathematical Functions
5.10.2.1. Figure 4-1. A dataset split by a classification tree with four leaf nodes
5.10.2.2. Figure 4-2. The raw data points of Figure 4-1, without decision lines
5.10.2.3. Figure 4-3. The dataset of Figure 4-2 with a single linear split
5.10.2.4. Linear Discriminant Functions
5.10.2.4.1. Equation 4-1. Classification function
5.10.2.4.2. Equation 4-2. A general linear model
5.10.2.4.3. Figure 4-4. A basic instance space in two dimensions containing points of two classes
5.10.2.4.4. Figure 4-5. Many different possible linear boundaries can separate the two groups of points
5.10.2.5. Optimizing an Objective Function
5.10.2.6. An Example of Mining a Linear Discriminant from Data
5.10.2.6.1. Figure 4-6. Two parts of a flower. Width measurements of these are used in the Iris dataset
5.10.2.6.2. Figure 4-7. A dataset and two learned linear classifiers
5.10.2.7. Linear Discriminant Functions for Scoring and Ranking Instances
5.10.2.8. Support Vector Machines, Briefly
5.10.2.8.1. Figure 4-8. The points of Figure 4-2 and the maximal margin classifier
5.10.2.8.2. Figure 4-9. Two loss functions illustrated
5.10.3. Regression via Mathematical Functions
5.10.3.1. Sidebar: Loss functions
5.10.4. Class Probability Estimation and Logistic Regression
5.10.4.1. Table 4-1. Probabilities and the corresponding odds
5.10.4.2. Table 4-2. Probabilities, odds, and the corresponding log-odds
5.10.4.3. Note: Logistic regression is a misnomer
5.10.4.4. Logistic Regression: Some Technical Details*
5.10.4.4.1. Technical Details Ahead
5.10.4.4.2. Equation 4-3. Log-odds linear function
5.10.4.4.3. Equation 4-4. The logistic function
5.10.4.4.4. Figure 4-10. Logistic regression's estimate of class probability as a function of f(x)
5.10.4.4.5. Class Labels and Probabilities
5.10.5. Example: Logistic Regression versus Tree Induction
5.10.5.1. Figure 4-11. One of the cell images from which the Wisconsin Breast Cancer dataset was der
5.10.5.2. Table 4-3. The attributes of the Wisconsin Breast Cancer dataset
5.10.5.3. Table 4-4. Linear equation learned by logistic regression on the Wisconsin Breast Cancer data
5.10.5.4. Figure 4-12. Decision tree learned from the Wisconsin Breast Cancer dataset
5.10.6. Nonlinear Functions, Support Vector Machines, and Neural Networks
5.10.6.1. Figure 4-13. The Iris dataset with a nonlinear feature
5.10.6.2. Note: Neural networks are useful for many tasks
5.10.7. Summary
5.11. 5. Overfitting and Its Avoidance
5.11.1. Generalization
5.11.2. Overfitting
5.11.3. Overfitting Examined
5.11.3.1. Holdout Data and Fitting Graphs
5.11.3.1.1. Figure 5-1. A typical fitting graph
5.11.3.1.2. Figure 5-2. A fitting graph for the customer churn (table) model
5.11.3.1.3. Note: Base rate
5.11.3.2. Overfitting in Tree Induction
5.11.3.2.1. Figure 5-3. A typical fitting graph for tree induction
5.11.3.3. Overfitting in Mathematical Functions
5.11.4. Example: Overfitting Linear Functions
5.11.4.1. Figure 5-4. The original Iris dataset and the models (boundary lines) that two linear methods
5.11.4.2. Figure 5-5. The Iris dataset from Figure 5-4 with a single new Iris Setosa example added (sho
5.11.4.3. Figure 5-6. The Iris dataset from Figure 5-4 with a single new Iris Versicolor example added (s
5.11.4.4. Figure 5-7. The Iris dataset from Figure 5-6 with its Iris Versicolor example added (shown by s
5.11.5. Example: Why Is Overfitting Bad?*
5.11.5.1. Technical Details Ahead
5.11.5.2. Table 5-1. A small set of training examples
5.11.5.3. Figure 5-8. Classification trees for the overfitting example
5.11.6. From Holdout Evaluation to Cross-Validation
5.11.6.1. Sidebar: Building a modeling laboratory
5.11.6.2. Figure 5-9. An illustration of cross-validation
5.11.7. The Churn Dataset Revisited
5.11.7.1. Figure 5-10. Fold accuracies for cross-validation on the churn problem
5.11.8. Learning Curves
5.11.8.1. Figure 5-11. Learning curves for tree induction and logistic regression for the churn problem
5.11.9. Overfitting Avoidance and Complexity Control
5.11.9.1. Avoiding Overfitting with Tree Induction
5.11.9.2. A General Method for Avoiding Overfitting
5.11.9.3. Avoiding Overfitting for Parameter Optimization*
5.11.9.3.1. Technical Details Ahead
5.11.9.3.2. Sidebar: Beware of multiple comparisons
5.11.10. Summary
5.12. 6. Similarity, Neighbors, and Clusters
5.12.1. Similarity and Distance
5.12.1.1. Figure 6-1. Euclidean distance
5.12.1.2. Equation 6-1. General Euclidean distance
5.12.2. Nearest-Neighbor Reasoning
5.12.2.1. Example: Whiskey Analytics
5.12.2.2. Nearest Neighbors for Predictive Modeling
5.12.2.2.1. Classification
5.12.2.2.2. Figure 6-2. Nearest neighbor classification
5.12.2.2.3. Table 6-1. Nearest neighbor example: Will David respond or not?
5.12.2.2.4. Probability Estimation
5.12.2.2.5. Regression
5.12.2.3. How Many Neighbors and How Much Influence?
5.12.2.3.1. Sidebar: Many names for nearest-neighbor reasoning
5.12.2.4. Geometric Interpretation, Overfitting, and Complexity Control
5.12.2.4.1. Figure 6-3. Boundaries created by a 1-NN classifier
5.12.2.4.2. Figure 6-4. Classification boundaries created on a three-class problem created by 1-NN (sin
5.12.2.4.3. Figure 6-5. Classification boundaries created on a three-class problem created by 30-NN (a
5.12.2.5. Issues with Nearest-Neighbor Methods
5.12.2.5.1. Intelligibility
5.12.2.5.2. Dimensionality and domain knowledge
5.12.2.5.3. Computational efficiency
5.12.3. Some Important Technical Details Relating to Similarities and Neighbors
5.12.3.1. Heterogeneous Attributes
5.12.3.2. Other Distance Functions*
5.12.3.2.1. Technical Details Ahead
5.12.3.2.2. Equation 6-2. Euclidean distance (L2 norm)
5.12.3.2.3. Equation 6-3. Manhattan distance (L1 norm)
5.12.3.2.4. Equation 6-4. Jaccard distance
5.12.3.2.5. Equation 6-5. Cosine distance
5.12.3.3. Combining Functions: Calculating Scores from Neighbors*
5.12.3.3.1. Technical Details Ahead
5.12.3.3.2. Equation 6-6. Majority vote classification
5.12.3.3.3. Equation 6-7. Majority scoring function
5.12.3.3.4. Equation 6-8. Similarity-moderated classification
5.12.3.3.5. Equation 6-9. Similarity-moderated scoring
5.12.3.3.6. Equation 6-10. Similarity-moderated regression
5.12.4. Clustering
5.12.4.1. Example: Whiskey Analytics Revisited
5.12.4.2. Hierarchical Clustering
5.12.4.2.1. Figure 6-6. Six points and their possible clusterings
5.12.4.2.2. Note: Dendrograms
5.12.4.2.3. Figure 6-7. The phylogenetic Tree of Life, a huge hierarchical clustering of species, displaye
5.12.4.2.4. Figure 6-8. A portion of the Tree of Life
5.12.4.2.5. Figure 6-9. Hierarchical clustering of Scotch whiskeys
5.12.4.3. Nearest Neighbors Revisited: Clustering Around Centroids
5.12.4.3.1. Figure 6-10. The second step of the k-means algorithm: find the actual center of the cluste
5.12.4.3.2. Figure 6-11. The first step of the k-means algorithm: find the points closest to the chosen c
5.12.4.3.3. Figure 6-12. A k-means clustering example using 90 points on a plane and k=3 centroids
5.12.4.3.4. Figure 6-13. A k-means clustering example using 90 points on a plane and k=3 centroids
5.12.4.4. Example: Clustering Business News Stories
5.12.4.4.1. Data preparation
5.12.4.4.2. The news story clusters
5.12.4.5. Understanding the Results of Clustering
5.12.4.6. Using Supervised Learning to Generate Cluster Descriptions*
5.12.4.6.1. Technical Details Ahead
5.12.4.6.2. Figure 6-14. The decision tree learned from cluster J on the Scotches data
5.12.5. Stepping Back: Solving a Business Problem Versus Data Exploration
5.12.5.1. Figure 6-15. The CRISP data mining process
5.12.6. Summary
5.13. 7. Decision Analytic Thinking I: What Is a Good Model?
5.13.1. Evaluating Classifiers
5.13.1.1. Sidebar: Bad Positives and Harmless Negatives
5.13.1.2. Plain Accuracy and Its Problems
5.13.1.3. The Confusion Matrix
5.13.1.3.1. Table 7-1. The layout of a 2×2 confusion matrix
5.13.1.4. Problems with Unbalanced Classes
5.13.1.4.1. Table 7-2. Confusion matrix of A
5.13.1.4.2. Table 7-3. Confusion matrix of B
5.13.1.4.3. Figure 7-1. An example of why accuracy is misleading
5.13.1.5. Problems with Unequal Costs and Benefits
5.13.2. Generalizing Beyond Classification
5.13.3. A Key Analytical Framework: Expected Value
5.13.3.1. Equation 7-1. The general form of an expected value calculation
5.13.3.2. Using Expected Value to Frame Classifier Use
5.13.3.3. Using Expected Value to Frame Classifier Evaluation
5.13.3.3.1. Figure 7-2. A diagram of the expected value calculation
5.13.3.3.2. Table 7-4. A sample confusion matrix with counts.
5.13.3.3.3. Error rates
5.13.3.3.4. Costs and benefits
5.13.3.3.5. Figure 7-3. A cost-benefit matrix
5.13.3.3.6. Figure 7-4. A cost-benefit matrix for the targeted marketing example
5.13.3.3.7. Equation 7-2. Expected profit equation with priors p(p) and p(n) factored
5.13.3.3.8. Table 7-5. Our sample confusion matrix (raw counts)
5.13.3.3.9. Table 7-6. The class priors and the rates of true positives, false positives, and so on
5.13.3.3.10. Sidebar: Other Evaluation Metrics
5.13.4. Evaluation, Baseline Performance, and Implications for Investments in Data
5.13.5. Summary
5.14. 8. Visualizing Model Performance
5.14.1. Ranking Instead of Classifying
5.14.1.1. Figure 8-1. Thresholding a list of instances sorted by scores
5.14.2. Profit Curves
5.14.2.1. Figure 8-2. Profit curves of three classifiers
5.14.3. ROC Graphs and Curves
5.14.3.1. Figure 8-3. ROC space and five different classifiers (A-E) with their performance shown
5.14.3.2. Figure 8-4. Each different point in ROC space corresponds to a specific confusion matrix
5.14.3.3. Figure 8-5. An illustration of how a ROC curve (really, a stepwise graph) is constructed from
5.14.4. The Area Under the ROC Curve (AUC)
5.14.5. Cumulative Response and Lift Curves
5.14.5.1. Figure 8-6. Four example classifiers (A-D) and their cumulative response curves
5.14.5.2. Figure 8-7. The four classifiers (A-D) of Figure 8-6 and their lift curves
5.14.6. Example: Performance Analytics for Churn Modeling
5.14.6.1. Table 8-1. Accuracy values of four classifiers trained and tested on the complete KDD Cup 20
5.14.6.2. Table 8-2. Accuracy and AUC values of four classifiers on the KDD Cup 2009 churn problem
5.14.6.3. Figure 8-8. Fitting curves for a classification tree on the churn data
5.14.6.4. Figure 8-9. ROC curves of the classifiers on one fold of cross-validation for the churn problem
5.14.6.5. Figure 8-10. Lift curves for the churn domain
5.14.6.6. A note on combining classifiers
5.14.6.7. Figure 8-11. Profit curves of four classifiers on the churn domain, assuming a 9-to-1 ratio of b
5.14.6.8. Figure 8-12. Profit curves of four classifiers on the churn domain
5.14.7. Summary
5.15. 9. Evidence and Probabilities
5.15.1. Example: Targeting Online Consumers With Advertisements
5.15.2. Combining Evidence Probabilistically
5.15.2.1. More math than usual ahead
5.15.2.2. Joint Probability and Independence
5.15.2.2.1. Equation 9-1. Joint probability using conditional probability
5.15.2.3. Bayes' Rule
5.15.2.3.1. Note: Bayesian methods
5.15.3. Applying Bayes' Rule to Data Science
5.15.3.1. Equation 9-2. Bayes' Rule for classification
5.15.3.2. Conditional Independence and Naive Bayes
5.15.3.2.1. Equation 9-3. Naive Bayes equation
5.15.3.3. Advantages and Disadvantages of Naive Bayes
5.15.3.3.1. Sidebar: Variants of Naive Bayes
5.15.4. A Model of Evidence Lift
5.15.4.1. Equation 9-4. Probability as a product of evidence lifts
5.15.5. Example: Evidence Lifts from Facebook Likes
5.15.5.1. Table 9-1. Some Facebook page Likes and corresponding lifts
5.15.5.2. Evidence in Action: Targeting Consumers with Ads
5.15.6. Summary
5.16. 10. Representing and Mining Text
5.16.1. Why Text Is Important
5.16.2. Why Text Is Difficult
5.16.3. Representation
5.16.3.1. Bag of Words
5.16.3.1.1. Note: Sets and bags
5.16.3.2. Term Frequency
5.16.3.2.1. Table 10-1. Three simple documents
5.16.3.2.2. Table 10-2. Term count representation
5.16.3.2.3. Table 10-3. Terms after normalization and stemming, ordered by frequency
5.16.3.2.4. Note: Careless Stopword Elimination
5.16.3.3. Measuring Sparseness: Inverse Document Frequency
5.16.3.3.1. Equation 10-1. Inverse Document Frequency (IDF) of a term
5.16.3.3.2. Figure 10-1. IDF of a term t within a corpus of 100 documents
5.16.3.4. Combining Them: TFIDF
5.16.4. Example: Jazz Musicians
5.16.4.1. Figure 10-2. Representation of the query Famous jazz saxophonist born in Kansas who playe
5.16.4.2. Figure 10-3. Representation of the query Famous jazz saxophonist born in Kansas who playe
5.16.4.3. Figure 10-4. Final TFIDF representation of the query Famous jazz saxophonist born in Kansas
5.16.4.4. Table 10-4. Similarity of each musician's text to the query Famous jazz saxophonist born in K
5.16.5. The Relationship of IDF to Entropy*
5.16.5.1. Technical Details Ahead
5.16.5.2. Figure 10-5. Plots of various values related to IDF(t) and IDF(not_t)
5.16.6. Beyond Bag of Words
5.16.6.1. N-gram Sequences
5.16.6.2. Named Entity Extraction
5.16.6.3. Topic Models
5.16.6.3.1. Figure 10-6. Modeling documents with a topic layer.
5.16.6.3.2. Note: Topics as Latent Information
5.16.7. Example: Mining News Stories to Predict Stock Price Movement
5.16.7.1. The Task
5.16.7.1.1. Figure 10-7. Percentage change in price, and corresponding label
5.16.7.2. The Data
5.16.7.2.1. Sidebar: The News Is Messy
5.16.7.2.2. Figure 10-8. Graph of stock price of Summit Technologies, Inc
5.16.7.3. Data Preprocessing
5.16.7.4. Results
5.16.7.4.1. Figure 10-9. ROC curves for the stock news classification task
5.16.7.4.2. Figure 10-10. Lift curves for the stock news prediction task
5.16.7.4.3. Sidebar: Prior Work on Predicting Stock Prices from Financial News
5.16.8. Summary
5.17. 11. Decision Analytic Thinking II: Toward Analytical Engineering
5.17.1. Targeting the Best Prospects for a Charity Mailing
5.17.1.1. The Expected Value Framework: Decomposing the Business Problem and Recomposing the S
5.17.1.2. A Brief Digression on Selection Bias
5.17.2. Our Churn Example Revisited with Even More Sophistication
5.17.2.1. The Expected Value Framework: Structuring a More Complicated Business Problem
5.17.2.2. Assessing the Influence of the Incentive
5.17.2.2.1. Equation 11-1. VT decomposition
5.17.2.3. From an Expected Value Decomposition to a Data Science Solution
5.17.3. Summary
5.18. 12. Other Data Science Tasks and Techniques
5.18.1. Co-occurrences and Associations: Finding Items That Go Together
5.18.1.1. Measuring Surprise: Lift and Leverage
5.18.1.1.1. Equation 12-1. Lift
5.18.1.1.2. Equation 12-2. Leverage
5.18.1.2. Example: Beer and Lottery Tickets
5.18.1.3. Associations Among Facebook Likes
5.18.1.3.1. Note: Supervised Versus Unsupervised?
5.18.2. Profiling: Finding Typical Behavior
5.18.2.1.1. Figure 12-1. A distribution of wait times for callers into a bank's call center
5.18.2.1.2. Figure 12-2. The distribution of wait times for callers into a bank's call center after a quick
5.18.2.1.3. Figure 12-3. A profile of our customers with respect to their spending and the time they sp
5.18.2.1.4. Figure 12-4. A profile of our customers with respect to their spending and the time they sp
5.18.2.1.5. Note: Soft Clustering
5.18.3. Link Prediction and Social Recommendation
5.18.4. Data Reduction, Latent Information, and Movie Recommendation
5.18.4.1. Figure 12-5. A collection of movies placed in a taste space defined by the two strongest lat
5.18.5. Bias, Variance, and Ensemble Methods
5.18.6. Data-Driven Causal Explanation and a Viral Marketing Example
5.18.7. Summary
5.19. 13. Data Science and Business Strategy
5.19.1. Thinking Data-Analytically, Redux
5.19.2. Achieving Competitive Advantage with Data Science
5.19.3. Sustaining Competitive Advantage with Data Science
5.19.3.1. Formidable Historical Advantage
5.19.3.2. Unique Intellectual Property
5.19.3.3. Unique Intangible Collateral Assets
5.19.3.4. Superior Data Scientists
5.19.3.5. Superior Data Science Management
5.19.4. Attracting and Nurturing Data Scientists and Their Teams
5.19.4.1. A note on publishing
5.19.5. Examine Data Science Case Studies
5.19.6. Be Ready to Accept Creative Ideas from Any Source
5.19.7. Be Ready to Evaluate Proposals for Data Science Projects
5.19.7.1. Example Data Mining Proposal
5.19.7.1.1. Targeted Whiz-bang Customer Migration, prepared by Big Red Consulting, Inc.
5.19.7.2. Flaws in the Big Red Proposal
5.19.8. A Firm's Data Science Maturity
5.19.8.1. A note on immature firms
5.19.8.2. Note: Data science is neither operations nor engineering
5.20. 14. Conclusion
5.20.1. The Fundamental Concepts of Data Science
5.20.1.1. Applying Our Fundamental Concepts to a New Problem: Mining Mobile Device Data
5.20.1.2. Figure 14-1. A scatterplot of a sample of GPS locations captured from mobile devices
5.20.1.3. Changing the Way We Think about Solutions to Business Problems
5.20.2. What Data Can't Do: Humans in the Loop, Revisited
5.20.3. Privacy, Ethics, and Mining Data About Individuals
5.20.4. Is There More to Data Science?
5.20.5. Final Example: From Crowd-Sourcing to Cloud-Sourcing
5.20.6. Final Words
5.21. A. Proposal Review Guide
5.21.1. Business and Data Understanding
5.21.2. Data Preparation
5.21.3. Modeling
5.21.4. Evaluation and Deployment
5.22. B. Another Sample Proposal
5.22.1. Scenario and Proposal
5.22.1.1. Churn Reduction via Targeted Incentives: A GGC Proposal
5.22.2. Flaws in the GGC Proposal
5.23. Glossary
5.23.1. a priori
5.23.2. Accuracy (error rate)
5.23.3. Association mining
5.23.4. Attribute (field, variable, feature)
5.23.5. Class (label)
5.23.6. Classifier
5.23.7. Confusion matrix
5.23.8. Coverage
5.23.9. Cost (utility/loss/payoff)
5.23.10. Cross-validation
5.23.11. Data cleaning/cleansing
5.23.12. Data mining
5.23.13. Dataset
5.23.14. Dimension
5.23.15. Error rate
5.23.16. Example
5.23.17. Feature
5.23.18. Feature vector (record, tuple)
5.23.19. Field
5.23.20. i.i.d. sample
5.23.21. Induction
5.23.22. Instance (example, case, record)
5.23.23. KDD
5.23.24. Knowledge discovery
5.23.25. Loss
5.23.26. Machine learning
5.23.27. Missing value
5.23.28. Model
5.23.29. Model deployment
5.23.30. OLAP (MOLAP, ROLAP)
5.23.31. Record
5.23.32. Schema
5.23.33. Sensitivity
5.23.34. Specificity
5.23.35. Supervised learning
5.23.36. Tuple
5.23.37. Unsupervised learning
5.23.38. Utility
5.24. Bibliography
5.25. Index
5.25.1. Symbols
5.25.2. A
5.25.3. B
5.25.4. C
5.25.5. D
5.25.6. E
5.25.7. F
5.25.8. G
5.25.9. H
5.25.10. I
5.25.11. J
5.25.12. K
5.25.13. L
5.25.14. M
5.25.15. N
5.25.16. O
5.25.17. P
5.25.18. Q
5.25.19. R
5.25.20. S
5.25.21. T
5.25.22. U
5.25.23. V
5.25.24. W
5.25.25. Y
5.25.26. Z
5.26. About the Authors
5.27. Colophon

URL
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Story
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Slides
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Spotfire_Dashboard
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Research_Notes
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Home
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Table_of_Contents
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Who's_Using_It
http://semanticommunity.info/Data_Science/Data_Science_for_Business#What_They're_Saying
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Discussion.2Fresources
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Fall_2012_Course_Notes
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Fall_2013_Course_Notes
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Google_Group_Data_Science_
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Errata
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Preface
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Chapter_3
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Chapter_4
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Chapter_6
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Chapter_7
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Chapter_9
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Chapter_11
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Data_Science_for_Business
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Cover_Page
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Back_Cover_Page
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Praise
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Inside_Cover_Page
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Dedication
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Preface
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Introduction
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Our_Conceptual_Approach_to
http://semanticommunity.info/Data_Science/Data_Science_for_Business#To_the_Instructor
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Other_Skills_and_Concepts
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Sections_and_Notation
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Technical_Details_Ahead_.E2.
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Using_Examples
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Safari.C2.AE_Books_Online
http://semanticommunity.info/Data_Science/Data_Science_for_Business#How_to_Contact_Us
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Acknowledgments
http://semanticommunity.info/Data_Science/Data_Science_for_Business#1._Introduction:_Data-Analyti
http://semanticommunity.info/Data_Science/Data_Science_for_Business#The_Ubiquity_of_Data_Oppor
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Example:_Hurricane_Frances
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Example:_Predicting_Custom
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Data_Science.2C_Engineering
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Figure_1-1._Data_science_in_
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Data_Processing_and_.E2.80.
http://semanticommunity.info/Data_Science/Data_Science_for_Business#From_Big_Data_1.0_to_Big_D
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Data_and_Data_Science_Cap
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Data-Analytic_Thinking
http://semanticommunity.info/Data_Science/Data_Science_for_Business#The_need_for_managers_with
http://semanticommunity.info/Data_Science/Data_Science_for_Business#This_Book
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Data_Mining_and_Data_Scien
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Chemistry_Is_Not_About_Test
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Summary
http://semanticommunity.info/Data_Science/Data_Science_for_Business#2._Business_Problems_and_D
http://semanticommunity.info/Data_Science/Data_Science_for_Business#From_Business_Problems_to_
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Supervised_Versus_Unsuperv
http://semanticommunity.info/Data_Science/Data_Science_for_Business#A_note_on_the_terms:_Super
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Data_Mining_and_Its_Results
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Figure_2-1._Data_mining_ver
http://semanticommunity.info/Data_Science/Data_Science_for_Business#The_Data_Mining_Process
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Figure_2-2._The_CRISP_data_
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Business_Understanding
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Data_Understanding
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Data_Preparation
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Modeling
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Evaluation
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Deployment
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Implications_for_Managing_th
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Software_skills_versus_analyt
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Other_Analytics_Techniques_
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Statistics
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Database_Querying
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Data_Warehousing
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Regression_Analysis
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Machine_Learning_and_Data_
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Answering_Business_Questio
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Summary_2
http://semanticommunity.info/Data_Science/Data_Science_for_Business#3._Introduction_to_Predictive
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Models.2C_Induction.2C_and_
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Terminology:_Prediction
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Figure_3-1._Data_mining_term
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Many_Names_for_the_Same_
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Terminology:_Induction_and_d
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Supervised_Segmentation
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Selecting_Informative_Attribu
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Figure_3-2._A_set_of_people_
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Equation_3-1._Entropy
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Figure_3-3._Entropy_of_a_two
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Equation_3-2._Information_ga
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Figure_3-4._Splitting_the_.E2.
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Figure_3-5._A_classification_t
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Numeric_variables
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Example:_Attribute_Selection
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Table_3-1._The_attributes_of_
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Figure_3-6._Entropy_chart_fo
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Figure_3-7._Entropy_chart_fo
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Figure_3-8._Entropy_chart_fo
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Figure_3-9._Entropy_chart_fo
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Supervised_Segmentation_w
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Figure_3-10._A_simple_classifi
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Figure_3-11._First_partitioning
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Figure_3-12._Second_partitio
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Figure_3-13._Third_partitionin
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Figure_3-14._The_classificatio
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Visualizing_Segmentations
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Figure_3-15._A_classification_
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Decision_lines_and_hyperplan
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Trees_as_Sets_of_Rules
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Probability_Estimation
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Figure_3-16._The_effect_of_La
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Example:_Addressing_the_Ch
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Table_3-2._Attributes_for_the
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Figure_3-17._Churn_attribute
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Figure_3-18._Classification_tr
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Summary_3
http://semanticommunity.info/Data_Science/Data_Science_for_Business#4._Fitting_a_Model_to_Data
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Sidebar:_Simplifying_Assump
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Classification_via_Mathemati
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Figure_4-1._A_dataset_split_b
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Figure_4-2._The_raw_data_po
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Figure_4-3._The_dataset_of_F
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Linear_Discriminant_Function
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Equation_4-1._Classification_
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Equation_4-2._A_general_line
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Figure_4-4._A_basic_instance
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Figure_4-5._Many_different_p
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Optimizing_an_Objective_Fun
http://semanticommunity.info/Data_Science/Data_Science_for_Business#An_Example_of_Mining_a_Lin
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Figure_4-6._Two_parts_of_a_fl
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Figure_4-7._A_dataset_and_tw
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Linear_Discriminant_Function
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Support_Vector_Machines.2C
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Figure_4-8._The_points_of_Fig
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Figure_4-9._Two_loss_function
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Regression_via_Mathematica

http://semanticommunity.info/Data_Science/Data_Science_for_Business#Sidebar:_Loss_functions
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Class_Probability_Estimation_
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Table_4-1._Probabilities_and_
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Table_4-2._Probabilities.2C_od
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Note:_Logistic_regression_is_
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Logistic_Regression:_Some_Te
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Technical_Details_Ahead
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Equation_4-3._Log-odds_linea
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Equation_4-4._The_logistic_fu
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Figure_4-10._Logistic_regress
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Class_Labels_and_Probabilitie
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Example:_Logistic_Regression
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Figure_4-11._One_of_the_cell
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Table_4-3._The_attributes_of_
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Table_4-4._Linear_equation_le
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Figure_4-12._Decision_tree_le
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Nonlinear_Functions.2C_Supp
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Figure_4-13._The_Iris_dataset
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Note:_Neural_networks_are_u
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Summary_4
http://semanticommunity.info/Data_Science/Data_Science_for_Business#5._Overfitting_and_Its_Avoida
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Generalization
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Overfitting
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Overfitting_Examined
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Holdout_Data_and_Fitting_Gr
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Figure_5-1._A_typical_fitting_
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Figure_5-2._A_fitting_graph_f
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Note:_Base_rate
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Overfitting_in_Tree_Induction
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Figure_5-3._A_typical_fitting_
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Overfitting_in_Mathematical_
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Example:_Overfitting_Linear_
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Figure_5-4._The_original_Iris_
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Figure_5-5._The_Iris_dataset_
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Figure_5-6._The_Iris_dataset_
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Figure_5-7._The_Iris_dataset_
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Example:_Why_Is_Overfitting
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Technical_Details_Ahead_2
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Table_5-1._A_small_set_of_tra
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Figure_5-8._Classification_tre
http://semanticommunity.info/Data_Science/Data_Science_for_Business#From_Holdout_Evaluation_to_
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Sidebar:_Building_a_modeling
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Figure_5-9._An_illustration_of
http://semanticommunity.info/Data_Science/Data_Science_for_Business#The_Churn_Dataset_Revisited
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Figure_5-10._Fold_accuracies

http://semanticommunity.info/Data_Science/Data_Science_for_Business#Learning_Curves
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Figure_5-11._Learning_curves
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Overfitting_Avoidance_and_C
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Avoiding_Overfitting_with_Tre
http://semanticommunity.info/Data_Science/Data_Science_for_Business#A_General_Method_for_Avoid
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Avoiding_Overfitting_for_Para
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Technical_Details_Ahead_3
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Sidebar:_Beware_of_.E2.80.9
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Summary_5
http://semanticommunity.info/Data_Science/Data_Science_for_Business#6._Similarity.2C_Neighbors.2C
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Similarity_and_Distance
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Figure_6-1._Euclidean_distan
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Equation_6-1._General_Euclid
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Nearest-Neighbor_Reasoning
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Example:_Whiskey_Analytics
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Nearest_Neighbors_for_Predic
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Classification
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Figure_6-2._Nearest_neighbo
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Table_6-1._Nearest_neighbor_
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Probability_Estimation_2
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Regression
http://semanticommunity.info/Data_Science/Data_Science_for_Business#How_Many_Neighbors_and_H
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Sidebar:_Many_names_for_ne
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Geometric_Interpretation.2C_
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Figure_6-3._Boundaries_creat
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Figure_6-4._Classification_bou
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Figure_6-5._Classification_bou
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Issues_with_Nearest-Neighbo
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Intelligibility
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Dimensionality_and_domain_
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Computational_efficiency
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Some_Important_Technical_D
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Heterogeneous_Attributes
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Other_Distance_Functions*
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Technical_Details_Ahead_4
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Equation_6-2._Euclidean_dist
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Equation_6-3._Manhattan_dis
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Equation_6-4._Jaccard_distan
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Equation_6-5._Cosine_distanc
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Combining_Functions:_Calcul
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Technical_Details_Ahead_5
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Equation_6-6._Majority_vote_
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Equation_6-7._Majority_scorin
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Equation_6-8._Similarity-mod
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Equation_6-9._Similarity-mod

http://semanticommunity.info/Data_Science/Data_Science_for_Business#Equation_6-10._Similarity-mo
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Clustering
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Example:_Whiskey_Analytics_
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Hierarchical_Clustering
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Figure_6-6._Six_points_and_th
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Note:_Dendrograms
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Figure_6-7._The_phylogenetic
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Figure_6-8._A_portion_of_the_
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Figure_6-9._Hierarchical_clus
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Nearest_Neighbors_Revisited
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Figure_6-10._The_second_ste
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Figure_6-11._The_first_step_o
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Figure_6-12._A_k-means_clus
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Figure_6-13._A_k-means_clus
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Example:_Clustering_Busines
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Data_preparation
http://semanticommunity.info/Data_Science/Data_Science_for_Business#The_news_story_clusters
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Understanding_the_Results_o
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Using_Supervised_Learning_t
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Technical_Details_Ahead_6
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Figure_6-14._The_decision_tre
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Stepping_Back:_Solving_a_Bu
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Figure_6-15._The_CRISP_data
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Summary_6
http://semanticommunity.info/Data_Science/Data_Science_for_Business#7._Decision_Analytic_Thinking
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Evaluating_Classifiers
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Sidebar:_Bad_Positives_and_H
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Plain_Accuracy_and_Its_Probl
http://semanticommunity.info/Data_Science/Data_Science_for_Business#The_Confusion_Matrix
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Table_7-1._The_layout_of_a_2
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Problems_with_Unbalanced_C
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Table_7-2._Confusion_matrix_
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Table_7-3._Confusion_matrix_
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Figure_7-1._An_example_of_w
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Problems_with_Unequal_Cost
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Generalizing_Beyond_Classifi
http://semanticommunity.info/Data_Science/Data_Science_for_Business#A_Key_Analytical_Framework
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Equation_7-1._The_general_fo
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Using_Expected_Value_to_Fra
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Using_Expected_Value_to_Fra
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Figure_7-2._A_diagram_of_the
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Table_7-4._A_sample_confusi
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Error_rates
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Costs_and_benefits
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Figure_7-3._A_cost-benefit_m

http://semanticommunity.info/Data_Science/Data_Science_for_Business#Figure_7-4._A_cost-benefit_m
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Equation_7-2._Expected_profi
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Table_7-5._Our_sample_confu
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Table_7-6._The_class_priors_a
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Sidebar:_Other_Evaluation_M
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Evaluation.2C_Baseline_Perfo
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Summary_7
http://semanticommunity.info/Data_Science/Data_Science_for_Business#8._Visualizing_Model_Perform
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Ranking_Instead_of_Classifyin
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Figure_8-1._Thresholding_a_li
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Profit_Curves
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Figure_8-2._Profit_curves_of_t
http://semanticommunity.info/Data_Science/Data_Science_for_Business#ROC_Graphs_and_Curves
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Figure_8-3._ROC_space_and_
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Figure_8-4._Each_different_po
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Figure_8-5._An_illustration_of
http://semanticommunity.info/Data_Science/Data_Science_for_Business#The_Area_Under_the_ROC_Cu
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Cumulative_Response_and_L
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Figure_8-6._Four_example_cla
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Figure_8-7._The_four_classifie
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Example:_Performance_Analy
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Table_8-1._Accuracy_values_o
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Table_8-2._Accuracy_and_AUC
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Figure_8-8._Fitting_curves_for
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Figure_8-9._ROC_curves_of_th
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Figure_8-10._Lift_curves_for_t
http://semanticommunity.info/Data_Science/Data_Science_for_Business#A_note_on_combining_classifi
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Figure_8-11._Profit_curves_of
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Figure_8-12._Profit_curves_of
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Summary_8
http://semanticommunity.info/Data_Science/Data_Science_for_Business#9._Evidence_and_Probabilitie
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Example:_Targeting_Online_C
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Combining_Evidence_Probabi
http://semanticommunity.info/Data_Science/Data_Science_for_Business#More_math_than_usual_ahea
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Joint_Probability_and_Indepen
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Equation_9-1._Joint_probabili
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Bayes.E2.80.99_Rule
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Note:_Bayesian_methods
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Applying_Bayes.E2.80.99_Ru
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Equation_9-2._Bayes_Rule_fo
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Conditional_Independence_an
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Equation_9-3._Naive_Bayes_e
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Advantages_and_Disadvanta
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Sidebar:_Variants_of_Naive_B
http://semanticommunity.info/Data_Science/Data_Science_for_Business#A_Model_of_Evidence_.E2.80.

http://semanticommunity.info/Data_Science/Data_Science_for_Business#Equation_9-4._Probability_as_
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Example:_Evidence_Lifts_from
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Table_9-1._Some_Facebook_p
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Evidence_in_Action:_Targetin
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Summary_9
http://semanticommunity.info/Data_Science/Data_Science_for_Business#10._Representing_and_Mining
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Why_Text_Is_Important
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Why_Text_Is_Difficult
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Representation
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Bag_of_Words
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Note:_Sets_and_bags
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Term_Frequency
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Table_10-1._Three_simple_do
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Table_10-2._Term_count_repr
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Table_10-3._Terms_after_norm
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Note:_Careless_Stopword_Elim
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Measuring_Sparseness:_Inver
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Equation_10-1._Inverse_Docu
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Figure_10-1._IDF_of_a_term_t
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Combining_Them:_TFIDF
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Example:_Jazz_Musicians
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Figure_10-2._Representation_
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Figure_10-3._Representation_
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Figure_10-4._Final_TFIDF_repr
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Table_10-4._Similarity_of_eac
http://semanticommunity.info/Data_Science/Data_Science_for_Business#The_Relationship_of_IDF_to_E
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Technical_Details_Ahead_7
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Figure_10-5._Plots_of_various
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Beyond_Bag_of_Words
http://semanticommunity.info/Data_Science/Data_Science_for_Business#N-gram_Sequences
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Named_Entity_Extraction
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Topic_Models
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Figure_10-6._Modeling_docum
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Note:_Topics_as_Latent_Inform
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Example:_Mining_News_Stori
http://semanticommunity.info/Data_Science/Data_Science_for_Business#The_Task
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Figure_10-7._Percentage_cha
http://semanticommunity.info/Data_Science/Data_Science_for_Business#The_Data
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Sidebar:_The_News_Is_Messy
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Figure_10-8._Graph_of_stock_
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Data_Preprocessing
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Results
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Figure_10-9._ROC_curves_for
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Figure_10-10._Lift_curves_for
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Sidebar:_Prior_Work_on_Pred

http://semanticommunity.info/Data_Science/Data_Science_for_Business#Summary_10
http://semanticommunity.info/Data_Science/Data_Science_for_Business#11._Decision_Analytic_Thinkin
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Targeting_the_Best_Prospects
http://semanticommunity.info/Data_Science/Data_Science_for_Business#The_Expected_Value_Framew
http://semanticommunity.info/Data_Science/Data_Science_for_Business#A_Brief_Digression_on_Select
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Our_Churn_Example_Revisite
http://semanticommunity.info/Data_Science/Data_Science_for_Business#The_Expected_Value_Framew
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Assessing_the_Influence_of_t
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Equation_11-1._VT_decompo
http://semanticommunity.info/Data_Science/Data_Science_for_Business#From_an_Expected_Value_De
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Summary_11
http://semanticommunity.info/Data_Science/Data_Science_for_Business#12._Other_Data_Science_Task
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Co-occurrences_and_Associat
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Measuring_Surprise:_Lift_and
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Equation_12-1._Lift
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Equation_12-2._Leverage
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Example:_Beer_and_Lottery_
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Associations_Among_Faceboo
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Note:_Supervised_Versus_Un
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Profiling:_Finding_Typical_Beh
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Figure_12-1._A_distribution_o
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Figure_12-2._The_distribution
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Figure_12-3._A_profile_of_our
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Figure_12-4._A_profile_of_our
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Note:_.E2.80.9CSoft.E2.80.9D
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Link_Prediction_and_Social_R
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Data_Reduction.2C_Latent_In
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Figure_12-5._A_collection_of_
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Bias.2C_Variance.2C_and_Ens
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Data-Driven_Causal_Explanat
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Summary_12
http://semanticommunity.info/Data_Science/Data_Science_for_Business#13._Data_Science_and_Busin
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Thinking_Data-Analytically.2C
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Achieving_Competitive_Adva
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Sustaining_Competitive_Adva
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Formidable_Historical_Advant
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Unique_Intellectual_Property
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Unique_Intangible_Collateral_
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Superior_Data_Scientists
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Superior_Data_Science_Mana
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Attracting_and_Nurturing_Da
http://semanticommunity.info/Data_Science/Data_Science_for_Business#A_note_on_publishing
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Examine_Data_Science_Case
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Be_Ready_to_Accept_Creative
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Be_Ready_to_Evaluate_Propo

http://semanticommunity.info/Data_Science/Data_Science_for_Business#Example_Data_Mining_Propos
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Targeted_Whiz-bang_Custom
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Flaws_in_the_Big_Red_Propos
http://semanticommunity.info/Data_Science/Data_Science_for_Business#A_Firm.E2.80.99s_Data_Scien
http://semanticommunity.info/Data_Science/Data_Science_for_Business#A_note_on_.E2.80.9Cimmatur
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Note:_Data_science_is_neithe
http://semanticommunity.info/Data_Science/Data_Science_for_Business#14._Conclusion
http://semanticommunity.info/Data_Science/Data_Science_for_Business#The_Fundamental_Concepts_
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Applying_Our_Fundamental_C
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Figure_14-1._A_scatterplot_of
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Changing_the_Way_We_Think
http://semanticommunity.info/Data_Science/Data_Science_for_Business#What_Data_Can.E2.80.99t_Do
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Privacy.2C_Ethics.2C_and_Min
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Is_There_More_to_Data_Scien
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Final_Example:_From_Crowd
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Final_Words
http://semanticommunity.info/Data_Science/Data_Science_for_Business#A._Proposal_Review_Guide
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Business_and_Data_Understa
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Data_Preparation_2
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Modeling_2
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Evaluation_and_Deployment
http://semanticommunity.info/Data_Science/Data_Science_for_Business#B._Another_Sample_Proposal
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Scenario_and_Proposal
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Churn_Reduction_via_Targete
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Flaws_in_the_GGC_Proposal
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Glossary
http://semanticommunity.info/Data_Science/Data_Science_for_Business#a_priori
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Accuracy_(error_rate)
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Association_mining
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Attribute_(field.2C_variable.2
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Class_(label)
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Classifier
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Confusion_matrix
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Coverage
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Cost_(utility.2Floss.2Fpayoff)
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Cross-validation
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Data_cleaning.2Fcleansing
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Data_mining
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Dataset
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Dimension
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Error_rate
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Example
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Feature
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Feature_vector_(record.2C_tu
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Field

http://semanticommunity.info/Data_Science/Data_Science_for_Business#i.i.d._sample
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Induction
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Instance_(example.2C_case.2
http://semanticommunity.info/Data_Science/Data_Science_for_Business#KDD
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Knowledge_discovery
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Loss
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Machine_learning
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Missing_value
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Model
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Model_deployment
http://semanticommunity.info/Data_Science/Data_Science_for_Business#OLAP_(MOLAP.2C_ROLAP)
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Record
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Schema
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Sensitivity
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Specificity
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Supervised_learning
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Tuple
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Unsupervised_learning
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Utility
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Bibliography
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Index
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Symbols
http://semanticommunity.info/Data_Science/Data_Science_for_Business#A
http://semanticommunity.info/Data_Science/Data_Science_for_Business#B
http://semanticommunity.info/Data_Science/Data_Science_for_Business#C
http://semanticommunity.info/Data_Science/Data_Science_for_Business#D
http://semanticommunity.info/Data_Science/Data_Science_for_Business#E
http://semanticommunity.info/Data_Science/Data_Science_for_Business#F
http://semanticommunity.info/Data_Science/Data_Science_for_Business#G
http://semanticommunity.info/Data_Science/Data_Science_for_Business#H
http://semanticommunity.info/Data_Science/Data_Science_for_Business#I
http://semanticommunity.info/Data_Science/Data_Science_for_Business#J
http://semanticommunity.info/Data_Science/Data_Science_for_Business#K
http://semanticommunity.info/Data_Science/Data_Science_for_Business#L
http://semanticommunity.info/Data_Science/Data_Science_for_Business#M
http://semanticommunity.info/Data_Science/Data_Science_for_Business#N
http://semanticommunity.info/Data_Science/Data_Science_for_Business#O
http://semanticommunity.info/Data_Science/Data_Science_for_Business#P
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Q
http://semanticommunity.info/Data_Science/Data_Science_for_Business#R
http://semanticommunity.info/Data_Science/Data_Science_for_Business#S
http://semanticommunity.info/Data_Science/Data_Science_for_Business#T
http://semanticommunity.info/Data_Science/Data_Science_for_Business#U
http://semanticommunity.info/Data_Science/Data_Science_for_Business#V
http://semanticommunity.info/Data_Science/Data_Science_for_Business#W

http://semanticommunity.info/Data_Science/Data_Science_for_Business#Y
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Z
http://semanticommunity.info/Data_Science/Data_Science_for_Business#About_the_Authors
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Colophon

Function URL
URL
URL
URL
URL
URL
URL
Narrative http://www.data-science-for-biz.com/
http://semanticommunity.info/@api/deki/files/28362/Data_Science_for_Business.
http://semanticommunity.info/Data_Science/Data_Science_for_Busine
http://semanticverses.com/
http://www.mindtouch.com/
http://semanticommunity.info/Data_S
http://semanticommunity.
Slides
http://semanticommunity.info/@api/deki/files/28159/BrandNiemann02182014.pptx
Spotfire
Notes
Book Web S http://www.data-science-for-biz.com/
Book Web S http://www.data-science-for-biz.com/
Book Web S http://www.data-science-for-biz.com/
Book Web S http://www.data-science-for-biz.com/
Book Web S http://www.data-science-for-biz.com/
Course Not http://people.stern.nyu.edu/ja1517/pdsfall2012/index.html
Course Not http://jattenberg.github.io/PDS-Fall-2013/
Google Gro https://groups.google.com/forum/?hl=en#!forum/data-science-for-biz
Errata
http://www.data-science-for-biz.com/
Errata
Errata
Errata
Errata
Errata
Errata
Errata
PDF
http://semanticommunity.info/@api/deki/files/28362/Data_Science_for_Business.pdf
http://semanticommunity.info/@api/deki/files/28160/BrandNiemann02072014.pp
http://semanticommunity.info/@api/deki/files/28361/BrandNiemann02
Figure
Figure
Content
Content http://my.safaribooksonline.com/
corporate@oreilly.com
http://oreilly.com/catalog/errata.csp?isbn=9781449361327
Content
Content http://semanticommunity.info/Data_Science/Data_Science_for_Business#Preface
Content http://semanticommunity.info/Data_Science/Data_Science_for_Business#5._Overfitting_and
Content
Content http://www.data-science-for-biz.com/
Content http://semanticommunity.info/Data_Science/Data_Science_for_Business#1._Introduction:_D
http://semanticommunity.info/Data_Science/Data_Science_for_Business#2._Busi
Content
Details
Content permissions@oreilly.com
Content http://my.safaribooksonline.com/?portal=oreilly
http://www.safaribooksonline.com/content
http://www.safaribooksonline.com/subscriptions
http://www.safaribooksonline.com/teams-organizations
http://www.safaribooksonline.com/government
http://www.safaribooksonline.com/ind
http://www.safaribooksonl
Content http://oreil.ly/data-science
http://www.data-science-for-biz.com/
bookquestions@oreilly.com
http://www.oreilly.com/
http://facebook.com/oreilly
http://twitter.com/oreillymedia
http://www.youtube.com/o
Content http://semanticommunity.info/Data_Science/Data_Science_for_Business#3._Introduction_to
http://semanticommunity.info/Data_Science/Data_Science_for_Business#4._Fittin
http://semanticommunity.info/Data_Science/Data_Science_for_Busine
http://semanticommunity.info/Data_Science/Data_Science_
http://www.data-science-for-biz.com/
Content
Content
Example
Example
Content
Figure
http://semanticommunity.info/@api/deki/files/28085/Figure_1-1.png
Content

Content
Content
Content
Content
Content
Content
Content
Summary
Content
Content
Content
Notes
Content
Figure
Content
Figure
Content
Content
Content
Content
Content
Content
Content
Content
Content
Content
Content
Content
Content
Content
Content
Summary
Content
Content
Content
Figure
Content
Content
Content
Content
Figure
Equation
Figure
Equation
Figure

http://semanticommunity.info/Data_Science/Data_Science_for_Business#13._Data_Science_

http://semanticommunity.info/Data_Science/Data_Science_for_Business#Other_Analytics_Te

http://semanticommunity.info/Data_Science/Data_Science_for_Business#3._Introduction_to
http://semanticommunity.info/Data_Science/Data_Science_for_Business#6._Simi

http://semanticommunity.info/Data_Science/Data_Science_for_Business#3._Introduction_to
http://semanticommunity.info/@api/deki/files/28158/Figure_2-1.png
http://en.wikipedia.org/wiki/Cross_Industry_Standard_Process_for_Data_Mining

http://semanticommunity.info/@api/deki/files/28157/Figure_2-2.png
http://semanticommunity.info/Data_Science/Data_Science_for_Business#From_Business_Pro
http://semanticommunity.info/Data_Science/Data_Science_for_Business#7._Deci
http://semanticommunity.info/Data_Science/Data_Science_for_Busine
http://semanticommunity.info/Data_Science/Data_Science_
http://semanticommunity.info/Data_Science/Data_Science_for_Business#3._Introduction_to
http://semanticommunity.info/Data_Science/Data_Science_for_Business#14._Con

http://semanticommunity.info/Data_Science/Data_Science_for_Business#1._Introduction:_D

http://semanticommunity.info/Data_Science/Data_Science_for_Business#5._Overfitting_and

http://semanticommunity.info/Data_Science/Data_Science_for_Business#Chapter_3
http://semanticommunity.info/Data_Science/Data_Science_for_Business#5._Over

http://semanticommunity.info/@api/deki/files/28188/Figure_3-1.png

http://semanticommunity.info/Data_Science/Data_Science_for_Business#2._Business_Proble
http://semanticommunity.info/Data_Science/Data_Science_for_Business#1._Intro

http://semanticommunity.info/@api/deki/files/28189/Figure_3-2.png
http://semanticommunity.info/@api/deki/files/28225/Equation_3-1.png
http://semanticommunity.info/@api/deki/files/28190/Figure_3-3.png
http://semanticommunity.info/@api/deki/files/28232/Equation_3-2e.png
http://semanticommunity.info/@api/deki/files/28191/Figure_3-4.png

Figure
Content
Content
Table
Figure
Figure
Figure
Figure
Content
Figure
Figure
Figure
Figure
Figure
Content
Figure
Content
Content
Content
Figure
Example
Table
Figure
Figure
Summary
Content
Sidebar
Content
Figure
Figure
Figure
Content
Equation
Equation
Figure
Figure
Content
Example
Figure
Figure
Content
Content
Figure
Figure
Content

http://semanticommunity.info/@api/deki/files/28192/Figure_3-5.png
http://semanticommunity.info/Data_Science/Data_Science_for_Business#5._Over

http://archive.ics.uci.edu/ml/datasets/Mushroom
http://semanticommunity.info/Data_Science/Data_Science_for_Business_Data_Sc
http://semanticommunity.info/@api/deki/files/28185/Table_3-1.png
http://semanticommunity.info/@api/deki/files/28186/Table_3-1b.png
http://semanticommunity.info/@api/deki/files/28193/Figure_3-6.png
http://semanticommunity.info/@api/deki/files/28194/Figure_3-7.png
http://semanticommunity.info/@api/deki/files/28195/Figure_3-8.png
http://semanticommunity.info/@api/deki/files/28196/Figure_3-9.png

http://semanticommunity.info/@api/deki/files/28197/Figure_3-10.png
http://semanticommunity.info/@api/deki/files/28198/Figure_3-11.png
http://semanticommunity.info/@api/deki/files/28199/Figure_3-12.png
http://semanticommunity.info/@api/deki/files/28200/Figure_3-13.png
http://semanticommunity.info/@api/deki/files/28201/Figure_3-14.png
http://semanticommunity.info/Data_Science/Data_Science_for_Business#5._Over
http://semanticommunity.info/@api/deki/files/28202/Figure_3-15.png

http://semanticommunity.info/@api/deki/files/28203/Figure_3-16.png

http://semanticommunity.info/@api/deki/files/28187/Table_3-2.png
http://semanticommunity.info/@api/deki/files/28204/Figure_3-17.png
http://semanticommunity.info/@api/deki/files/28184/Figure_3-18.png
http://semanticommunity.info/Data_Science/Data_Science_for_Business#5._Over
http://semanticommunity.info/Data_Science/Data_Science_for_Busine
http://semanticommunity.info/Data_Science/Data_Science_

http://semanticommunity.info/Data_Science/Data_Science_for_Business#Errata
http://semanticommunity.info/Data_Science/Data_Science_for_Business#3._Intro

http://semanticommunity.info/Data_Science/Data_Science_for_Business#3._Introduction_to
http://semanticommunity.info/@api/deki/files/28243/Figure_4-1.png
http://semanticommunity.info/@api/deki/files/28244/Figure_4-2.png
http://semanticommunity.info/@api/deki/files/28245/Figure_4-3.png
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Visualizing_Segme
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Trees_a
http://semanticommunity.info/Data_Science/Data_Science_for_Busine
http://semanticommunity.info/@api/deki/files/28256/Equation_4-1.png
http://semanticommunity.info/@api/deki/files/28257/Equation_4-2.png
http://semanticommunity.info/@api/deki/files/28246/Figure_4-4.png
http://semanticommunity.info/@api/deki/files/28247/Figure_4-5.png

http://archive.ics.uci.edu/ml/datasets/Iris
http://archive.ics.uci.edu/ml/
http://semanticommunity.info/Data_Science/Data_Science_for_Busine
http://semanticommunity.info/@api/deki/files/28248/Figure_4-6.png
http://semanticommunity.info/@api/deki/files/28249/Figure_4-7.png
http://semanticommunity.info/Data_Science/Data_Science_for_Business#3._Introduction_to
http://semanticommunity.info/@api/deki/files/28250/Figure_4-8.png
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Sidebar
http://semanticommunity.info/@api/deki/files/28250/Figure_4-8.png
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Sidebar:_Loss_func
http://semanticommunity.info/@api/deki/files/28251/Figure_4-9.png

Sidebar
Content
Table
Table
Notes
Details
Details
Equation
Equation
Figure
Content
Example
Figure
Table
Table
Figure
Figure
Figure
Note
Summary
Content
Content
Content
Content
Content
Figure
Figure
Note
Content
Figure
Content
Example
Figure
Figure
Figure
Figure
Example
Details
Table
Figure
Content
Sidebar
Figure
Content
Figure

http://semanticommunity.info/Data_Science/Data_Science_for_Business#7._Decision_Analy
http://semanticommunity.info/Data_Science/Data_Science_for_Business#3._Intro
http://semanticommunity.info/@api/deki/files/28261/Table_4-1.png
http://semanticommunity.info/@api/deki/files/28262/Table_4-2.png

http://semanticommunity.info/@api/deki/files/28258/Equation_4-3.png
http://semanticommunity.info/@api/deki/files/28259/Equation_4-4.png
http://semanticommunity.info/@api/deki/files/28252/Figure_4-10.png

http://semanticommunity.info/Data_Science/Data_Science_for_Business#5._Overfitting_and
http://archive.ics.uci.edu/ml/datasets/Breast+Cancer+Wisconsin+(Diagnostic)
http://semanticommunity.info/Data_Science/Data_Science_for_Busine
http://semanticommunity.info/Data_Science/Data_Science_
http://semanticommunity.info/Data_Science/Dat
http://semanticommunity.info/@api/deki/files/28253/Figure_4-11.png
http://semanticommunity.info/@api/deki/files/28263/Table_4-3.png
http://semanticommunity.info/@api/deki/files/28260/Table_4-4.png
http://semanticommunity.info/@api/deki/files/28254/Figure_4-12.png
http://semanticommunity.info/Data_Science/Data_Science_for_Business#An_Example_of_M
http://semanticommunity.info/Data_Science/Data_Science_for_Business#12._Oth
http://semanticommunity.info/@api/deki/files/28242/Figure_4-13.png
http://semanticommunity.info/Data_Science/Data_Science_for_Business#2._Business_Proble
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Example:_Logistic

http://semanticommunity.info/@api/deki/files/28266/Figure_5-1.png
http://semanticommunity.info/@api/deki/files/28267/Figure_5-2.png

http://semanticommunity.info/@api/deki/files/28268/Figure_5-3.png
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Avoiding_Overfittin
http://semanticommunity.info/Data_Science/Data_Science_for_Business#An_Example_of_M
http://semanticommunity.info/@api/deki/files/28269/Figure_5-4.png
http://semanticommunity.info/@api/deki/files/28270/Figure_5-5.png
http://semanticommunity.info/@api/deki/files/28271/Figure_5-6.png
http://semanticommunity.info/@api/deki/files/28272/Figure_5-7.png

http://semanticommunity.info/@api/deki/files/28276/Table_5-1.png
http://semanticommunity.info/@api/deki/files/28273/Figure_5-8.png

http://semanticommunity.info/Data_Science/Data_Science_for_Business#1._Introduction:_D
http://semanticommunity.info/@api/deki/files/28274/Figure_5-9.png
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Example:_Address
http://semanticommunity.info/@api/deki/files/28275/Figure_5-10.png

Content
Figure
Content
Content
Content
Details
Details
Sidebar
Summary
Content
Content
Figure
Equation
Content
Example
Content
Content
Figure
Table
Content
Content
Content
Sidebar
Content
Figure
Figure
Figure
Content
Content
Content
Content
Content
Content
Details
Details
Equation
Equation
Equation
Equation
Details
Details
Figure
Figure
Figure
Figure

http://semanticommunity.info/@api/deki/files/28265/Figure_5-11.png

http://semanticommunity.info/Data_Science/Data_Science_for_Business#Sidebar:_Beware_o
http://semanticommunity.info/Data_Science/Data_Science_for_Business#4._Fitting_a_Model
http://semanticommunity.info/Data_Science/Data_Science_for_Business#4._Fitting_a_Model
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Logistic
http://semanticommunity.info/Data_Science/Data_Science_for_Busine
http://semanticommunity.info/Data_Science/Data_Science_

http://sem
http://semanticommunity.info/Data_Science/Data_Science_for_Business#12._Oth
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Visualizing_Segme
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Classifi
http://semanticommunity.info/@api/deki/files/28325/Figure_6-1.png
http://semanticommunity.info/@api/deki/files/28315/Equation_6-1.png

http://semanticommunity.info/Data_Science/Data_Science_for_Business#2._Business_Proble
http://www.whiskyclassified.com/
http://adn.biol.umontreal.ca/~numericalecology/data/scotch.html

http://semanticommunity.info/@api/deki/files/28326/Figure_6-2.png
http://semanticommunity.info/@api/deki/files/28340/Table_6-1.png
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Probability_Estima

http://semanticommunity.info/Data_Science/Data_Science_for_Business#Note:_Base_rate
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Holdou

http://semanticommunity.info/Data_Science/Data_Science_for_Business#5._Overfitting_and
http://semanticommunity.info/Data_Science/Data_Science_for_Business#A_Gene
http://semanticommunity.info/@api/deki/files/28327/Figure_6-3.png
http://semanticommunity.info/@api/deki/files/28328/Figure_6-4.png
http://semanticommunity.info/@api/deki/files/28329/Figure_6-5.png

http://semanticommunity.info/Data_Science/Data_Science_for_Business#Heterogeneous_At
http://semanticommunity.info/Data_Science/Data_Science_for_Business#3._Intro
http://semanticommunity.info/Data_Science/Data_Science_for_Busine

http://www.dcs.ed.ac.uk/home/jhb/whisky/lapointe/text.html
http://semanticommunity.info/@api/deki/files/28316/Equation_6-2.png
http://semanticommunity.info/@api/deki/files/28317/Equation_6-3.png
http://semanticommunity.info/@api/deki/files/28318/Equation_6-4.png
http://semanticommunity.info/@api/deki/files/28319/Equation_6-5.png

http://semanticommunity.info/@api/deki/files/28320/Equation_6-6.png
http://semanticommunity.info/@api/deki/files/28321/Equation_6-7.png
http://semanticommunity.info/@api/deki/files/28322/Equation_6-8.png
http://semanticommunity.info/@api/deki/files/28323/Equation_6-9.png

Figure
Content
Example
Content
Figure
Note
Figure
Figure
Figure
Content
Figure
Figure
Figure
Figure
Example
Content
Content
Content
Details
Details
Figure
Content
Figure
Content
Content
Content
Sidebar
Content
Content
Table
Content
Table
Table
Figure
Content
Content
Content
Equation
Content
Content
Figure
Table
Content
Content
Figure

http://semanticommunity.info/@api/deki/files/28324/Equation_6-10.png

http://itol.embl.de/index.shtml
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Exampl
http://semanticommunity.info/@api/deki/files/28330/Figure_6-6.png

http://semanticommunity.info/@api/deki/files/28331/Figure_6-7.png
http://semanticommunity.info/@api/deki/files/28332/Figure_6-8.png
http://semanticommunity.info/@api/deki/files/28333/Figure_6-9.png
https://en.wikipedia.org/w/index.php?title=Determining_the_number_of_clusters_in_a_data
http://semanticommunity.info/@api/deki/files/28334/Figure_6-10.png
http://semanticommunity.info/@api/deki/files/28335/Figure_6-11.png
http://semanticommunity.info/@api/deki/files/28336/Figure_6-12.png
http://semanticommunity.info/@api/deki/files/28337/Figure_6-13.png
http://trec.nist.gov/data/reuters/reuters.html
http://semanticommunity.info/Data_Science/Data_Science_for_Business#10._Representing_
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Other_D

http://semanticommunity.info/Data_Science/Data_Science_for_Business#Trees_as_Sets_of_R
http://www.dcs.ed.ac.uk/home/jhb/whisky/lapointe/text.html
http://semanticommunity.info/Data_Science/Data_Science_for_Busine
http://www.cs.waikato.ac.nz/ml/weka/

http://semanticommunity.info/@api/deki/files/28338/Figure_6-14.png
http://semanticommunity.info/Data_Science/Data_Science_for_Business#7._Decision_Analy
http://semanticommunity.info/@api/deki/files/28339/Figure_6-15.png

http://sem
http://semanticommunity.info/Data_Science/Data_Science_for_Business#5._Over
http://semanticommunity.info/Data_Science/Data_Science_for_Business#5._Overfitting_and

http://semanticommunity.info/@api/deki/files/28356/Table_7-1.png
http://semanticommunity.info/Data_Science/Data_Science_for_Business#5._Overfitting_and
http://semanticommunity.info/@api/deki/files/28357/Table_7-2.png
http://semanticommunity.info/@api/deki/files/28358/Table_7-3.png
http://semanticommunity.info/@api/deki/files/28352/Figure_7-1.png

http://semanticommunity.info/Data_Science/Data_Science_for_Business#11._Decision_Anal
http://semanticommunity.info/Data_Science/Data_Science_for_Business#2._Busi
http://semanticommunity.info/@api/deki/files/28350/Equation_7-1.png
http://semanticommunity.info/Data_Science/Data_Science_for_Business#2._Business_Proble
http://semanticommunity.info/Data_Science/Data_Science_for_Business#2._Busi
http://semanticommunity.info/@api/deki/files/28353/Figure_7-2.png
http://semanticommunity.info/@api/deki/files/28359/Table_7-4.png

http://semanticommunity.info/Data_Science/Data_Science_for_Business#8._Visualizing_Mod
http://semanticommunity.info/@api/deki/files/28354/Figure_7-3.png

Figure
Equation
Table
Table
Sidebar
Content
Summary
Content
Content
Figure
Content
Figure
Content
Figure
Figure
Figure
Content
Content
Figure
Figure
Example
Table
Table
Figure
Figure
Figure
Note
Figure
Figure
Summary
Content
Example
Content
Content
Content
Equation
Content
Note
Content
Equation
Content
Equation
Content
Sidebar
Content

http://semanticommunity.info/@api/deki/files/28355/Figure_7-4.png
http://semanticommunity.info/@api/deki/files/28351/Equation_7-2.png
http://semanticommunity.info/@api/deki/files/28360/Table_7-5.png
http://semanticommunity.info/@api/deki/files/28341/Table_7-6.png
http://semanticommunity.info/Data_Science/Data_Science_for_Business#8._Visualizing_Mod
http://semanticommunity.info/Data_Science/Data_Science_for_Business#8._Visualizing_Mod
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Note:_B
http://semanticommunity.info/Data_Science/Data_Science_for_Busine
http://semanticommunity.info/Data_Science/Data_Science_

http://semanticommunity.info/Data_Science/Data_Science_for_Business#A_Key_Analytical_F
http://semanticommunity.info/Data_Science/Data_Science_for_Business#4._Fittin
http://semanticommunity.info/Data_Science/Data_Science_for_Busine
http://semanticommunity.info/@api/deki/files/28281/Figure_8-1.png
http://semanticommunity.info/Data_Science/Data_Science_for_Business#A_Key_Analytical_F
http://semanticommunity.info/@api/deki/files/28282/Figure_8-2.png
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Sidebar:_Bad_Posi
http://semanticommunity.info/@api/deki/files/28283/Figure_8-3.png
http://semanticommunity.info/@api/deki/files/28284/Figure_8-4.png
http://semanticommunity.info/@api/deki/files/28285/Figure_8-5.png
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Example:_Perform

http://semanticommunity.info/@api/deki/files/28286/Figure_8-6.png
http://semanticommunity.info/@api/deki/files/28287/Figure_8-7.png
http://www.kddcup-orange.com/
http://semanticommunity.info/Data_Science/Data_Science_for_Business#9._Evid
http://semanticommunity.info/Data_Science/Data_Science_for_Busine
http://semanticommunity.info/Data_Science/Data_Science_
http://semanticommunity.info/Data_Science/Dat
http://semanticommunity.info/Data_S
http://semanticommunity.info/@api/deki/files/28293/Table_8-1.png
http://semanticommunity.info/@api/deki/files/28277/Table_8-2.png
http://semanticommunity.info/@api/deki/files/28288/Figure_8-8.png
http://semanticommunity.info/@api/deki/files/28289/Figure_8-9.png
http://semanticommunity.info/@api/deki/files/28290/Figure_8-10.png
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Bias.2C_Variance.2
http://semanticommunity.info/@api/deki/files/28291/Figure_8-11.png
http://semanticommunity.info/@api/deki/files/28292/Figure_8-12.png

http://semanticommunity.info/Data_Science/Data_Science_for_Business#Chapter_9
http://semanticommunity.info/Data_Science/Data_Science_for_Business#10._Representing_

http://semanticommunity.info/@api/deki/files/28302/Equation_9-1.png

http://semanticommunity.info/Data_Science/Data_Science_for_Business#7._Decision_Analy
http://semanticommunity.info/@api/deki/files/28303/Equation_9-2.png
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Probability_Estima
http://semanticommunity.info/@api/deki/files/28304/Equation_9-3.png
http://semanticommunity.info/Data_Science/Data_Science_for_Business#7._Decision_Analy

http://semanticommunity.info/Data_Science/Data_Science_for_Business#Cumulative_Respo
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Sidebar
http://semanticommunity.info/Data_Science/Data_Science_for_Busine

Equation
Example
Table
Content
Summary
Content
Content
Content
Content
Content
Note
Content
Table
Table
Table
Note
Content
Equation
Figure
Content
Example
Figure
Figure
Figure
Table
Details
Details
Figure
Content
Content
Content
Content
Figure
Note
Example
Content
Figure
Content
Sidebar
Figure
Content
Content
Figure
Figure
Sidebar

http://semanticommunity.info/@api/deki/files/28294/Equation_9-4.png
http://semanticommunity.info/@api/deki/files/28305/Table_9-1.png
http://www.data-science-for-biz.com/NB-advertising.html

http://semanticommunity.info/Data_Science/Data_Science_for_Business#14._Conclusion
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Exampl

http://semanticommunity.info/@api/deki/files/28221/Table_10-2.png
http://semanticommunity.info/@api/deki/files/28222/Table_10-3.png

http://semanticommunity.info/@api/deki/files/28223/Equation_10-1.png
http://semanticommunity.info/@api/deki/files/28212/Figure_10-1.png
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Example:_Attribut
http://semanticommunity.info/@api/deki/files/28213/Figure_10-2.png
http://semanticommunity.info/@api/deki/files/28214/Figure_10-3.png
http://semanticommunity.info/@api/deki/files/28215/Figure_10-4.png
http://semanticommunity.info/@api/deki/files/28208/Table_10-4.png

http://semanticommunity.info/Data_Science/Data_Science_for_Business#Selecting_Informa
http://semanticommunity.info/Data_Science/Data_Science_for_Business#3._Intro
http://semanticommunity.info/@api/deki/files/28216/Figure_10-5.png

http://semanticommunity.info/@api/deki/files/28217/Figure_10-6.png
http://semanticommunity.info/Data_Science/Data_Science_for_Business#12._Other_Data_S

http://semanticommunity.info/@api/deki/files/28218/Figure_10-7.png
http://finance.yahoo.com/q?s=AAPL
https://www.google.com/finance

http://semanticommunity.info/@api/deki/files/28207/Figure_10-8.png
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Sidebar:_The_New
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Bag_of_
http://semanticommunity.info/Data_Science/Data_Science_for_Business#3._Introduction_to
http://semanticommunity.info/@api/deki/files/28219/Figure_10-9.png
http://semanticommunity.info/@api/deki/files/28220/Figure_10-10.png

Summary
Content
Content
Content
Content
Content
Content
Content
Equation
Content
Summary
Content
Content
Content
Equation
Equation
Example
Content
Note
Profiling
Figure
Figure
Figure
Figure
Note
Content
Content
Figure
Content
Content
Summary
Content
Content
Content
Content
Content
Content
Content
Content
Content
Content
Note
Content
Content
Content

http://semanticommunity.info/Data_Science/Data_Science_for_Business#14._Conclusion
http://sem
http://semanticommunity.info/Data_Science/Data_Science_for_Business#7._Deci

http://semanticommunity.info/Data_Science/Data_Science_for_Business#2._Business_Proble
http://semanticommunity.info/Data_Science/Data_Science_for_Business#7._Deci

http://semanticommunity.info/Data_Science/Data_Science_for_Business#12._Other_Data_S
http://semanticommunity.info/@api/deki/files/28205/Equation_11-1.png
http://semanticommunity.info/Data_Science/Data_Science_for_Business#1._Introduction:_D
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Data_a
http://semanticommunity.info/Data_Science/Data_Science_for_Busine

http://semanticommunity.info/Data_Science/Data_Science_for_Business#2._Business_Proble

http://semanticommunity.info/Data_Science/Data_Science_for_Business#9._Evidence_and_P
http://semanticommunity.info/Data_Science/Data_Science_for_Business#9._Evidence_and_P
http://semanticommunity.info/Data_Science/Data_Science_for_Business#6._Simi
http://www.giwebb.com/
http://semanticommunity.info/Data_Science/Data_Science_
http://semanticommunity.info/Data_Science/Data_Science_for_Business#4._Fitting_a_Model
http://semanticommunity.info/Data_Science/Data_Science_for_Business#4._Fittin
http://semanticommunity.info/Data_Science/Data_Science_for_Busine

http://semanticommunity.info/Data_Science/Data_Science_for_Business#4._Fitting_a_Model

http://semanticommunity.info/Data_Science/Data_Science_for_Business#6._Similarity.2C_Ne

http://semanticommunity.info/Data_Science/Data_Science_for_Business#Bias.2C_Variance.2
https://en.wikipedia.org/wiki/Netflix_prize
http://en.wikipedia.org/wiki/Netflix_Prize
http://semanticommunity.info/Data_Science/Data_Science_
http://semanticommunity.info/Data_Science/Data_Science_for_Business#4._Fitting_a_Model
http://semanticommunity.info/Data_Science/Data_Science_for_Business#4._Fittin
http://semanticommunity.info/Data_Science/Data_Science_for_Business#6._Similarity.2C_Ne
http://semanticommunity.info/Data_Science/Data_Science_for_Business#Learnin
http://semanticommunity.info/Data_Science/Data_Science_for_Busine
http://semanticommunity.info/Data_Science/Data_Science_for_Business#2._Business_Proble
http://semanticommunity.info/Data_Science/Data_Science_for_Business#11._Dec
http://semanticommunity.info/Data_Science/Data_Science_for_Busine
http://semanticommunity.info/Data_Science/Data_Science_

http://semanticommunity.info/Data_Science/Data_Science_for_Business#1._Introduction:_D

http://www.sigkdd.org/
http://semanticommunity.info/Data_Science/Data_Science_for_Business#12._Oth
http://www.kaggle.com/
http://www.sigkdd.org/kddcup/index.php
http://semanticommunity.info/Data_Science/Data_Science_for_Business#2._Business_Proble
http://www.kdnuggets.com/

http://donottrack.us/bib/#sec_economics

Bibliography
http://www.iiia.csic.es/People/enric/AICom.html
http://archive.ics.uci.edu/ml
http://ssrn.com/abstract=1819486
http://dx.doi.org/10.2139/ssrn.1819486
http://www.businessinsider.com/2012-digital-10
http://online.liebertpub.com/doi/abs/
http://www.sciencemag.or
Index
index@oreilly.com
http://www.safaribooksonline.com/
http://www.youtube.com/oreillymedia
e-for-biz.com/

http://papers.ssrn.com/sol3/papers.cfm?abstract_id=2294077
http://www.cs.waikato.ac.nz/~ml/index.html
http://en.wikipedia.org/wiki/Determining_the_number_of_clusters_in_a_data_set
http://sci2s.ugr.es/keel/pdf/algorithm/articulo/wilcoxon1945.pdf
http://www.winterberrygroup.com/ourinsights/wp
http://www.cs.waikato.ac.nz/~ml/weka/


Subject | Function | URL
1. Story | Narrative | http://shop.oreilly.com/product/
2. Slides | Slides
3. Spotfire Dashboard | Spotfire
4. Research Notes | Notes
5. Thinking with Data | PDF
5.1. Cover Page | PNG
5.2. Praise for Thinking with Data | Reviews
5.3. Inside Cover Page | PNG | http://my.safaribooksonline.com
5.4. Preface | Content
5.4.1. Introduction | Content | http://www.dataists.com/2010/0
5.4.2. Conventions Used in This Book | Content
5.4.3. Safari Books Online | Logo | http://my.safaribooksonline.com, http://www.safariboo
5.4.4. How to Contact Us | Content | http://oreil.ly/thinking-with-data, questions@oreilly.co
5.4.5. Acknowledgments | Content
5.5. 1 Scoping: Why Before How | Content
5.5.1. Context (Co) | Content
5.5.2. Vision (V) | Content
5.5.2.1. Figure 1-1. A visual mockup | Figure
5.5.2.2. Sentence Mockups | Content
5.5.2.3. Example 1 | Content
5.5.2.4. Example 2 | Content
5.5.2.4.1. Idea 1 | Content
5.5.2.4.2. Idea 2 | Content
5.5.2.4.3. Idea 3 | Content
5.5.2.5. Example 3 | Content
5.5.2.6. Example 4 | Content
5.5.3. Outcome (O) | Content
5.5.4. Seeing the Big Picture | Content
5.6. 2 What Next? | Content
5.6.1. Refining the Vision | Content
5.6.1.1. Techniques for refining the vision | Content
5.6.1.1.1. Interviews | Content
5.6.1.1.2. Rapid investigation | Content
5.6.1.1.3. Kitchen sink interrogation | Content
5.6.1.1.4. Working backward | Content
5.6.1.1.5. More mockups | Content
5.6.1.1.6. Roleplaying | Content
5.6.2. Deep Dive: Real Estate and Public Transit | Content
5.6.2.1. Figure 2-1. Mockup graph | Figure
5.6.3. Deep Dive Continued: Working Forward | Content
5.6.4. Deep Dive Continued: Scaffolding | Content
5.6.5. Verifying Understanding | Content
5.6.6. Getting Our Hands Dirty | Content
5.7. 3 Arguments | Content
5.7.1. Audience and Prior Beliefs | Content
5.7.1.1. Building an Argument | Content
5.7.2. Claims | Content
5.7.2.1. Claims | Content
5.7.3. Evidence, Justification, and Rebuttals | Content
5.7.3.1. Evidence and Transformations | Content
5.7.3.2. Adding Justifications, Qualifications, and | Content
5.7.4. Deep Dive: Improving College Graduation | Content
5.8. 4 Patterns of Reasoning | Content
5.8.1. Categories of Disputes | Content
5.8.1.1. FACT | Content | http://www.nber.org/papers/w15, http://bit.ly/1gIDQfN
5.8.1.2. DEFINITION | Content
5.8.1.3. VALUE | Content
5.8.1.4. POLICY | Content | http://www.thegreatcourses.com
5.8.2. General Topics | Content
5.8.2.1. SPECIFIC-TO-GENERAL | Content
5.8.2.2. GENERAL-TO-SPECIFIC | Content
5.8.2.3. ARGUMENT BY ANALOGY | Content
5.8.3. Special Arguments | Content
5.8.3.1. OPTIMIZATION | Content
5.8.3.2. BOUNDING CASE | Content
5.8.3.3. COST/BENEFIT ANALYSIS | Content
5.9. 5 Causality | Content
5.9.1. Defining Causality | Content
5.9.1.1. Table 5-1. Consider one tree | Content
5.9.1.2. Table 5-2. Looking at many trees | Content
5.9.1.3. Table 5-3. Confounders found | Content
5.9.2. Designs | Content
5.9.2.1. Table 5-4. Multiple causation | Content
5.9.3. Intervention Designs | Content
5.9.3.1. Figure 5-1. A shocking example of within-subject design | Figure
5.9.4. Observational Designs | Content
5.9.5. Natural Experiments | Content
5.9.6. Statistical Methods | Content
5.10. 6 Putting It All Together | Content
5.10.1. Deep Dive: Predictive Model for Conversio | Content
5.10.1.1. Context | Content
5.10.1.2. Need | Content
5.10.1.3. Vision | Content
5.10.1.4. Outcome | Content
5.10.1.5. Figure 6-1. Predicted probability plot | Figure
5.10.2. Deep Dive: Calculating Access to Microfi | Content
5.10.2.1. Context | Content
5.10.2.2. Needs | Content
5.10.2.3. Vision | Content
5.10.2.4. Outcome | Content
5.10.2.5. Wrapping Up | Content
5.11. Appendix Further Reading | Content
5.12. About the Author | Content
5.13. Colophon | Content

http://shop.oreilly.com/product/0636920029182.do
http://semanticommunity.info/@api/deki/files/28361/BrandNiemann02122014.pptx
http://semanticommunity.info/@api/deki/files/28067/Thinking_with_Data.pdf
http://semanticommunity.info/@api/deki/files/28068/CoverPage1.png
http://semanticommunity.info/@api/deki/files/28069/CoverPage2.png
http://semanticommunity.info/Data_Science/Thinking_with_Data#2_What_Next.3F
http://semanticommunity.info/Data_Science/Thinking_with_Data#3_Arguments
http://www.safaribooksonline.com/subscriptions
http://www.safaribooksonline.com/teams-organizations
http://www.safaribooksonline.com/government
http://www.safaribooksonline.com/individuals
http://www.safaribooksonline.com/publishers
http://www.safaribooksonline.com/
http://www.oreilly.com/
http://facebook.com/oreilly
http://twitter.com/oreillymedia
http://www.youtube.com/oreillymedia
http://semanticommunity.info/@api/deki/files/28070/Figure_1-1_A_visual_mockup.png
http://semanticommunity.info/@api/deki/files/28071/Figure_2-1_Mockup_graph.png
http://semanticommunity.info/Data_Science/Thinking_with_Data#5_Causality
http://bit.ly/1gIDQfN
http://www.thegreatcourses.com/tgc/courses/course_detail.aspx?cid=4294
http://semanticommunity.info/@api/deki/files/28072/Figure_5-1_A_shocking_example_of_within-subject_design.png
http://semanticommunity.info/@api/deki/files/28073/Figure_6-1_Predicted_probability_plot.png
http://semanticommunity.info/Data_Science/Thinking_with_Data#Appendix_Further_Reading
http://semanticommunity.info/@api/deki/files/28074/SafariLogo.png

Attribute | Description
classes | edible=e, poisonous=p
cap-shape | bell=b, conical=c, convex=x, flat=f, knobbed=k, sunken=s
cap-surface | fibrous=f, grooves=g, scaly=y, smooth=s
cap-color | brown=n, buff=b, cinnamon=c, gray=g, green=r, pink=p, purple=u, red=e, white=w, yellow=y
bruises? | bruises=t, no=f
odor | almond=a, anise=l, creosote=c, fishy=y, foul=f, musty=m, none=n, pungent=p, spicy=s
gill-attachment | attached=a, descending=d, free=f, notched=n
gill-spacing | close=c, crowded=w, distant=d
gill-size | broad=b, narrow=n
gill-color | black=k, brown=n, buff=b, chocolate=h, gray=g, green=r, orange=o, pink=p, purple=u, red=e, white=w, yellow=y
stalk-shape | enlarging=e, tapering=t
stalk-root | bulbous=b, club=c, cup=u, equal=e, rhizomorphs=z, rooted=r, missing=?
stalk-surface-above-ring | fibrous=f, scaly=y, silky=k, smooth=s
stalk-surface-below-ring | fibrous=f, scaly=y, silky=k, smooth=s
stalk-color-above-ring | brown=n, buff=b, cinnamon=c, gray=g, orange=o, pink=p, red=e, white=w, yellow=y
stalk-color-below-ring | brown=n, buff=b, cinnamon=c, gray=g, orange=o, pink=p, red=e, white=w, yellow=y
veil-type | partial=p, universal=u
veil-color | brown=n, orange=o, white=w, yellow=y
ring-number | none=n, one=o, two=t
ring-type | cobwebby=c, evanescent=e, flaring=f, large=l, none=n, pendant=p, sheathing=s, zone=z
spore-print-color | black=k, brown=n, buff=b, chocolate=h, green=r, orange=o, purple=u, white=w, yellow=y
population | abundant=a, clustered=c, numerous=n, scattered=s, several=v, solitary=y
habitat | grasses=g, leaves=l, meadows=m, paths=p, urban=u, waste=w, woods=d
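
The single-letter codes in the mushroom attribute table above can be turned back into readable labels with a small lookup. The following is only a minimal sketch under stated assumptions: the names `parse_codes`, `ATTRIBUTE_TABLE`, `CODEBOOK`, and `decode_record` are hypothetical and do not come from the source, only four of the attributes are copied in for brevity, and the example record is made up for illustration rather than taken from the dataset.

```python
def parse_codes(description: str) -> dict[str, str]:
    """Map each code to its label, e.g. 'edible=e, poisonous=p' -> {'e': 'edible', 'p': 'poisonous'}."""
    codes = {}
    for pair in description.split(","):
        # Each entry has the form label=code; split on the last '=' only.
        label, code = pair.strip().rsplit("=", 1)
        codes[code] = label
    return codes


# A few rows copied from the attribute table (assumption: the other rows
# would be added in exactly the same "label=code" form).
ATTRIBUTE_TABLE = {
    "classes": "edible=e, poisonous=p",
    "cap-shape": "bell=b, conical=c, convex=x, flat=f, knobbed=k, sunken=s",
    "bruises?": "bruises=t, no=f",
    "stalk-root": "bulbous=b, club=c, cup=u, equal=e, rhizomorphs=z, rooted=r, missing=?",
}

# One code-to-label dictionary per attribute.
CODEBOOK = {attr: parse_codes(desc) for attr, desc in ATTRIBUTE_TABLE.items()}


def decode_record(record: dict[str, str]) -> dict[str, str]:
    """Replace each code with its label; unknown codes are passed through unchanged."""
    return {attr: CODEBOOK[attr].get(code, code) for attr, code in record.items()}


# Made-up record, for illustration only.
example = {"classes": "p", "cap-shape": "x", "bruises?": "t", "stalk-root": "?"}
print(decode_record(example))
```

Note that the same letter means different things in different attributes (for instance `e` is "edible" under classes but "equal" under stalk-root), which is why the sketch keeps a separate dictionary per attribute instead of one global mapping.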

Attribute | Description
sepal length | sepal length in cm
sepal width | sepal width in cm
petal length | petal length in cm
petal width | petal width in cm
class | Iris Setosa, Iris Versicolour, Iris Virginica

NAME

NAME

wyne

yellow

v.pale

pale

p.gold

gold

Aberfeldy

Aberfeldy

Aberlour

Aberlour

Ardberg

Ardberg

Ardmore

Ardmore

Auchentosha Auchentosh

Aultmore

Aultmore

Balblair

Balblair

Balmenach

Balmenach

Balvenie

Balvenie

Banff

Banff

Ben Nevis

Ben Nevis

Benriach

Benriach

Benrinnes

Benrinnes

Benromach Benromach

Bladnoch

Bladnoch

Blair Athol

Blair Atho

Bowmore

Bowmore

Brackla

Brackla

BruichladdichBruichladd

BunnahabhaiBunnahabha

Caol Ila

Caperdonich Caperdonic

Caol Ila

Cardhu

Cardhu

Clynelish

Clynelish

Coleburn

Coleburn

Convalmore Convalmore

CragganmoreCragganmor

Craigellachie Craigellac

Dailuaine

Dailuaine

Dallas Dhu

Dallas Dhu

Dalmore

Dalmore

Dalwhinnie

Dalwhinnie

Deanston

Deanston

Dufftown

Dufftown

Edradour

Edradour

Fettercairn

Fettercair

Glen Albyn Glen Alby

Glenallachie Glenallach

Glenburgie

Glenburgie

Glencadam Glencadam

Glen DeveronGlen Dever

Glendronach Glendronac

Glendullan

Glendullan

Glen Elgin

Glen Elgin

Glenesk

Glenesk

Glenfarclas

Glenfarcla

Glenfiddich

Glenfiddic

Glen Garioch Glen Gario

GlenglassaugGlenglassa

Glengoyne

Glengoyne

Glen Grant

Glen Grant

Glen Keith

Glen Keith

Glenkinchie Glenkinchi

Glenlivet

Glenlivet

Glenlochy

Glenlochy

Glenlossie

Glenlossie

Glen Mhor

Glen Mhor

Glenmorangi Glenmorang

Glen Moray Glen Moray

Glen Ordie

Glen Ordie

Glenrothes

Glenrothes

Glen Scotia Glen Scoti

Glen Spey

GlentauchersGlentauche

Glenturret

Glenturret

Glenugie

Glenugie

Glen Spey

Glenury RoyaGlenury Ro

Highland ParkHighland P

Imperial

Imperial

Inchgower

Inchgower

Inchmurrin

Inchmurrin

Inverleven

Inverleven

Jura

Jura

Kinclaith

Kinclaith

Knockando

Knockando

Knockdhu

Knockdhu

Ladyburn

Ladyburn

Lagavulin

Lagavulin

Laphroaig

Laphroaig

Linkwood

Linkwood

Littlemill

Littlemill

Lochnagar

Lochnagar

Lochside

Lochside

Longmorn

Longmorn

Macallan

Macallan

Millburn

Millburn

Miltonduff

Miltonduff

Mortlach

Mortlach

North Port

North Port

Oban

Oban

Port Ellen

Port Ellen

Pulteney

Pulteney

Rosebank

Rosebank

Saint Magdal Saint Magd

Scapa

Scapa

Singleton

Singleton

Speyburn

Speyburn

Springbank Springbank

Springbank-LLongrow

Strathisla

Strathisla

Talisker

Talisker

Tamdhu

Tamdhu

Tamnavulin Tamnavulin

Teaninich

Teaninich

Tobermory

Tobermory

Tomatin

Tomatin

Tomintoul

Tomintoul

Tormore

Tormore

Tullibardine Tullibardi

o.gold

f.gold

bronze

p.amber

amber

f.amber

red

sherry

AROMA

PEAT

SWEET

LIGHT

FRESH

DRY

FRUIT

GRASS

SEA

SHERRY

SPICY

RICH

soft

med

full

round

smooth

light

firm

oily

full

dry

sherry

big

light

smooth

clean

fruit

grass

smoke

sweet

spice

oil

salt

arome

full

dry

warm

big

light

smooth

clean

fruit

grass

smoke

sweet

spice

oil

salt

arome

ling

long

very

quick

AGE

DIST

-9

12

10

18

10

12

10

-9

-9

-9

-9

-9

-9

-9

10

-9

10

12

12

-9

12

12

-9

-9

12

-9

-9

-9

12

15

-9

10

10

-9

12

-9

-9

12

12

12

12

12

10

-9

10

-9

10

10

-9

10

12

-9

-9

10

12

12

17

10

-9

12

12

-9

12

12

17

10

-9

-9

-9

20

12

10

12

12

-9

12

10

-9

12

12

-9

12

-9

18

-9

-9

15

14

10

10

10

-9

-9

10

12

10

10

SCORE

REGION

DISTRICT islay

midland

spey

east

69

40

HIGH

MIDLAND

83

43

HIGH

SPEY

85

40

ISLAY

SOUTH

66

46

HIGH

SPEY

85

40

LOW

WEST

75

40

HIGH

SPEY

76

40

HIGH

NORTH

69

40

HIGH

SPEY

85

40

HIGH

SPEY

66

40

HIGH

SPEY

55

40

HIGH

WEST

69

40

HIGH

SPEY

78

40

HIGH

SPEY

75

40

HIGH

SPEY

85

40

LOW

BORDERS

75

40

HIGH

MIDLAND

81

40

ISLAY

LOCH

77

40

HIGH

SPEY

76

40

ISLAY

LOCH

77

40

ISLAY

NORTH

77

40

ISLAY

NORTH

73

40

HIGH

SPEY

72

40

HIGH

SPEY

81

40

HIGH

NORTH

67

40

HIGH

SPEY

68

40

HIGH

SPEY

90

40

HIGH

SPEY

72

40

HIGH

SPEY

74

40

HIGH

SPEY

85

40

HIGH

SPEY

79

40

HIGH

NORTH

76

43

HIGH

SPEY

69

40

HIGH

MIDLAND

71

40

HIGH

SPEY

81

40

HIGH

MIDLAND

71

40

HIGH

EAST

67

40

HIGH

SPEY

76

40

HIGH

SPEY

68

40

HIGH

SPEY

68

40

HIGH

EAST

75

40

HIGH

SPEY

75

43

HIGH

SPEY

75

43

HIGH

SPEY

75

43

HIGH

SPEY

66

40

HIGH

EAST

86

40

HIGH

SPEY

75

40

HIGH

SPEY

77

43

HIGH

EAST

76

40

HIGH

SPEY

74

40

HIGH

WEST

76

43

HIGH

SPEY

64

40

HIGH

SPEY

76

43

LOW

EAST

85

40

HIGH

SPEY

70

40

HIGH

WEST

76

40

HIGH

SPEY

64

40

HIGH

SPEY

80

40

HIGH

NORTH

75

40

HIGH

SPEY

75

40

HIGH

NORTH

80

40

HIGH

SPEY

85

40

LOW

CAMPBEL

73

40

HIGH

SPEY

71

46

HIGH

SPEY

77

57.1

HIGH

MIDLAND

70

54.8

HIGH

EAST

76

40

HIGH

EAST

90

40

HIGH

ORKNEY

76

40

HIGH

SPEY

75

40

HIGH

SPEY

65

40

HIGH

WEST

67

46

LOW

WEST

71

40

HIGH

JURA

69

40

LOW

WEST

76

43

HIGH

SPEY

73

40

HIGH

SPEY

57

46

LOW

WEST

89

43

ISLAY

SOUTH

86

43

ISLAY

SOUTH

83

40

HIGH

SPEY

83

43

LOW

NORTHWEST

80

43

HIGH

EAST

71

40

HIGH

EAST

85

40

HIGH

SPEY

87

40

HIGH

SPEY

76

40

HIGH

SPEY

76

43

HIGH

SPEY

81

40

HIGH

SPEY

64

40

HIGH

EAST

76

43

HIGH

WEST

79

40

ISLAY

SOUTH

77

40

HIGH

NORTH

76

40

LOW

CENTRAL

67

40

LOW

CENTRAL

76

40

HIGH

ORKNEY

79

40

HIGH

SPEY

71

40

HIGH

SPEY

88

46

LOW

CAMPBEL

90

46

LOW

CAMPBEL

79

40

HIGH

SPEY

90

45.8

HIGH

SKYE

75

40

HIGH

SPEY

76

43

HIGH

SPEY

71

40

HIGH

NORTH

67

40

HIGH

MULL

75

40

HIGH

SPEY

76

40

HIGH

SPEY

76

43

HIGH

SPEY

76

40

HIGH

MIDLAND

west

north

lowland

campbell

islands

Category | Name | Example
Name | NAME | Aberfeldy
Color variable | wyne, yellow, v.pale, pale, p.gold, gold, o.gold, f.gold, bronze, p.amber, amber, f.amber, red, sherry
Nose variable | AROMA, PEAT, SWEET, LIGHT, FRESH, DRY, FRUIT, GRASS, SEA, SHERRY, SPICY, RICH
Body variable | soft, med, full, round, smooth, light, firm, oily
Palate variable | full, dry, sherry, big, light, smooth, clean, fruit, grass, smoke, sweet, spice, oil, salt, arome
Finish variable | full, dry, warm, big, light, smooth, clean, fruit, grass, smoke, sweet, spice, oil, salt, arome, ling, long, very, quick
Age | AGE | -9
1-5 | DIST
Score | SCORE | 69
Percent | (column name not shown) | 40
Regions | REGION | HIGH
District | DISTRICT | MIDLAND
Geography | islay, midland, spey, east, west, north, lowland, campbell, islands
