
Data Processing, Fundamental Data Analysis, and the Statistical Testing of Differences

Prof. Rushen Chahal

Data Analysis Overview

Validation & Editing

Coding

Data Entry

Machine Cleaning

Tabulation & Statistical Analysis of Data


Data Analysis Overview


Step One: Validation and Editing
Validation: Confirming that the interviews/surveys occurred.
Editing: Determining that the questionnaires were completed correctly.

Step Two: Coding
Grouping and assigning numeric codes to the question responses.

Step Three: Data Entry
The process of converting data to an electronic form. Scanning devices can be used to enter data, for example by scanning the questionnaire into a database (such as with bubble sheets).

Step Four: Clean the Data
Check for data entry errors or data entry inconsistencies. Machine cleaning is a computerized check of the data.

Step Five: Data tabulations and statistical analysis.


Editing & Skip Patterns


Editing:
The process of ascertaining that questionnaires were filled out properly and completely.

Skip Patterns:
Sequence in which later questions are asked, based on a respondent's answer to an earlier question or questions.


Coding
Coding:
The process of grouping and assigning numeric codes to the various responses to a question.

The Process:
1. List responses
2. Consolidate responses
3. Set codes
4. Enter codes
5. Keep coding sheet
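
The coding process can be sketched in a few lines of Python. This is only an illustration: the survey question, the responses, the consolidation groupings, and the code numbers are all hypothetical and do not come from the slides.

```python
# Step 1: list the raw responses (hypothetical answers to "Why do you shop here?").
raw_responses = [
    "Low prices", "cheap", "Close to home",
    "good prices", "Friendly staff", "nearby",
]

# Step 2: consolidate similar answers into categories (assumed groupings).
consolidation = {
    "low prices": "Price",
    "cheap": "Price",
    "good prices": "Price",
    "close to home": "Location",
    "nearby": "Location",
    "friendly staff": "Service",
}

# Step 3: set a numeric code per category. Step 5: this dict is the coding sheet to keep.
coding_sheet = {"Price": 1, "Location": 2, "Service": 3}
OTHER_CODE = 9  # assumed catch-all code for answers not on the coding sheet


def code_response(answer: str) -> int:
    """Step 4: return the numeric code to enter for one response."""
    category = consolidation.get(answer.strip().lower())
    return coding_sheet.get(category, OTHER_CODE)


coded = [code_response(r) for r in raw_responses]
print(coded)
```

Keeping the coding sheet as data rather than burying it in logic makes it easy to hand the same sheet to whoever enters or audits the codes.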


Data Entry
Data Entry:
The Process of converting information to an electronic format.

Intelligent Data Entry:


A form of data entry in which the information being entered into the data entry device is checked for internal logic.


Machine Cleaning of Data


Machine Cleaning of Data: Final computer error check of data.

Error Checking Routines: Computer programs that accept instructions from the user to check for logical errors in the data.

Marginal Report: Computer-generated table of the frequencies of the responses to each question, used to monitor entry of valid codes and correct use of skip patterns.
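
A marginal report and a simple error-checking routine might look like the sketch below. The questions, valid code ranges, and records are hypothetical; the skip pattern assumed here is that q2 is asked only when q1 is answered "yes" (code 1).

```python
from collections import Counter

# Hypothetical coded records: q1 = "Own a car?" (1 = yes, 2 = no),
# q2 = "Car brand" (codes 1-3), asked only when q1 == 1. None means skipped.
records = [
    {"id": 1, "q1": 1, "q2": 2},
    {"id": 2, "q1": 2, "q2": None},
    {"id": 3, "q1": 1, "q2": 7},   # invalid code for q2
    {"id": 4, "q1": 2, "q2": 3},   # answered q2 although it should be skipped
]
VALID = {"q1": {1, 2}, "q2": {1, 2, 3}}


def marginal_report(rows, question):
    """Frequency of each entered code for one question."""
    return Counter(row[question] for row in rows)


def find_errors(rows):
    """Flag invalid codes and skip-pattern violations."""
    errors = []
    for row in rows:
        if row["q1"] not in VALID["q1"]:
            errors.append((row["id"], "q1 invalid code"))
        if row["q1"] == 2 and row["q2"] is not None:
            errors.append((row["id"], "q2 answered despite skip"))
        elif row["q1"] == 1 and row["q2"] not in VALID["q2"]:
            errors.append((row["id"], "q2 invalid code"))
    return errors


report = marginal_report(records, "q2")
errors = find_errors(records)
print(report)
print(errors)
```

An unexpected code in the marginal report (here, 7) signals an entry error, and the error list points to the records that need to be checked against the original questionnaires.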


Cross Tabulation Data


Examination of the responses to one question relative to the responses to one or more other questions in a survey set.

Bi-variate cross-tabulation: Cross-tabulation of two items, Business Category and Gender.

Filters: Are You a Veteran? (All); You Liked the Chamber's Services (All); Race/Ethnicity (All)

Count of Respondent
Business Category       Female   Male   Grand Total
Computers/Technology         5      7            12
Construction                 2      4             6
General Services             -      1             1
Manufacturing               13      6            19
No Response                  1      4             5
Other                       15     11            26
Professional                 1      3             4
Retail                       4      4             8
Wholesale                    1      1             2
#N/A                         -      1             1
Grand Total                 42     42            84

Multi-variate cross-tabulation: Additional filtering criterion, Veteran Status. Now filtering three items.

Filters: Race/Ethnicity (All); Are You a Veteran? Yes; You Liked the Chamber's Services (All)

Count of Respondent
Business Category       Female   Male   Grand Total
Computers/Technology         1      3             4
Construction                 -      1             1
Manufacturing                5      -             5
Other                        3      2             5
Professional                 -      1             1
Grand Total                  9      7            16
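
A cross-tabulation is essentially a count of respondents per combination of answers. The sketch below shows the idea with a handful of hypothetical respondents (not the survey data from the slides); the `cross_tab` helper is an illustrative name, not a library function.

```python
from collections import Counter

# Hypothetical respondents: (business category, gender, veteran status).
respondents = [
    ("Retail", "Female", "No"),
    ("Retail", "Male", "Yes"),
    ("Construction", "Male", "Yes"),
    ("Construction", "Male", "No"),
    ("Retail", "Female", "Yes"),
]


def cross_tab(rows, veteran=None):
    """Count respondents by (category, gender), optionally filtered by veteran status."""
    return Counter(
        (category, gender)
        for category, gender, vet in rows
        if veteran is None or vet == veteran
    )


bivariate = cross_tab(respondents)                    # two items
veterans_only = cross_tab(respondents, veteran="Yes")  # third item used as a filter

for (category, gender), count in sorted(bivariate.items()):
    print(category, gender, count)
```

Adding the veteran filter does not change the table layout; it only restricts which respondents are counted, which is what the filtered pivot table does.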


Graphic Representations of Data


One Way Frequency Tables
A table showing the number of respondents choosing each answer to a survey question.
[Chart: "Did You Like the Movie?" Female respondents: No 4, Yes 3, Grand Total 7]


Graphic Representations of Data


Line, Pie, and Bar Charts
Line Charts: Good for demonstrating trends and linear relationships.
Pie Charts: Good for showing how data points relate as parts of a whole.
Bar Charts: Good for side-by-side relationships and comparisons.
[Figures: line, pie, and bar chart versions of "Did You Like the Movie?" by gender (Female, Male, Grand Total), with series No, Yes, and Grand Total]


Descriptive Statistics
Effective means of summarizing large data sets. Key measures include: mean, median, mode, kurtosis, standard deviation, skewness, and variance.
A significant discrepancy between the mean and the median should prompt a closer look at the data. In the example below, the mean (22.4) is well above the median (15.0), consistent with the positive skewness (2.1).

Years in Business
Mean                   22.4
Standard Error          2.6
Median                 15.0
Mode                    5.0
Standard Deviation     23.1
Sample Variance       534.5
Kurtosis                3.8
Skewness                2.1
Range                  98.0
Minimum                 2.0
Maximum               100.0
Sum                  1770.5
Count                    79


Descriptive Statistics

Mean:
The sum of the values for all observations of a variable divided by the number of observations.

Median:
In an ordered set, the value below which 50 percent of the observations fall.

Mode:
The value that occurs most frequently.
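
As a quick illustration, the three measures can be computed with Python's standard `statistics` module. The data set is hypothetical and was chosen to include one large outlier.

```python
import statistics

# Hypothetical "years in business" answers, with one large outlier (100).
years_in_business = [2, 5, 5, 8, 15, 20, 100]

mean = statistics.mean(years_in_business)      # sum of values / number of observations
median = statistics.median(years_in_business)  # middle value of the ordered set
mode = statistics.mode(years_in_business)      # most frequent value

print(f"mean={mean:.1f} median={median} mode={mode}")
```

The single outlier pulls the mean well above the median, which is the kind of discrepancy that should prompt a closer look at the data.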


Descriptive Statistics

Variance:
The sum of the squared deviations from the mean divided by the number of observations minus one. The same formula as the standard deviation, but without taking the square root.

Range:
The maximum value for a variable minus the minimum value for that variable.

Standard Deviation:
Calculated by:

1. subtracting the mean of a series from each value in the series
2. squaring each result
3. summing the squared results
4. dividing the result by the number of items minus 1
5. taking the square root of this value

$$ s = \sqrt{\frac{\sum_{i=1}^{N} (X_i - \bar{X})^2}{N - 1}} $$
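
The calculation steps for the standard deviation translate directly into code. The values below are hypothetical; the last lines compare the hand-computed results with Python's `statistics` module as a sanity check.

```python
import math
import statistics

values = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical data
n = len(values)
mean = sum(values) / n

deviations = [x - mean for x in values]   # subtract the mean from each value
squared = [d ** 2 for d in deviations]    # square each result
sum_of_squares = sum(squared)             # sum the squares
variance = sum_of_squares / (n - 1)       # divide by the number of items minus 1
std_dev = math.sqrt(variance)             # take the square root

value_range = max(values) - min(values)   # range: maximum minus minimum

print(f"variance={variance:.3f} std_dev={std_dev:.3f} range={value_range}")
print(math.isclose(variance, statistics.variance(values)))
print(math.isclose(std_dev, statistics.stdev(values)))
```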


Statistical Significance

Mathematical Differences:
By definition, if numbers are not exactly the same, they are different. This fact does not, however, mean that the difference is either important or statistically significant.

Statistical Significance:
If a particular difference is large enough to be unlikely to have occurred because of chance or sampling error, then the difference is statistically significant.


Statistical Significance
Managerially Important Differences:
One must be able to distinguish between mathematical differences and statistically significant differences when using the data analysis in managerial decision making.

Hypothesis:
An assumption, argument, or theory that a researcher or manager makes about some characteristics of the population under study.


Hypothesis Testing

Step One: Stating the hypothesis


Null Hypothesis - the status quo, assumed to be true unless the data provide sufficient evidence against it. Alternative Hypothesis - the competing claim, supported when the null hypothesis is rejected.

Step Two: Choosing the appropriate test statistic


Test of means, test of proportions, ANOVA, etc.

Step Three: Developing a decision rule


Determine the significance level. Need to determine whether to reject or fail to reject the null hypothesis.


Hypothesis Testing

Step Four: Calculating the value of the test statistic


Use the appropriate formula to calculate the value of the statistic.

Step Five: Stating the conclusion


Stated from the perspective of the original research question.
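
The five steps can be walked through with a small one-sample Z-test of a mean. Everything numeric here is hypothetical (the hypothesized mean, the sample results, and the assumption that the population standard deviation is known); the sketch only shows how the steps fit together.

```python
import math
from statistics import NormalDist

# Step One: state the hypotheses (hypothetical values).
# H0: mean satisfaction score = 50; Ha: mean satisfaction score != 50.
mu_0 = 50.0

# Step Two: choose the test statistic. A Z-test of one mean is used here,
# assuming a large sample and a known population standard deviation.
sample_mean, sigma, n = 52.0, 8.0, 64

# Step Three: develop the decision rule (two-tailed, 5% significance level).
alpha = 0.05
z_critical = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.96

# Step Four: calculate the value of the test statistic.
z = (sample_mean - mu_0) / (sigma / math.sqrt(n))

# Step Five: state the conclusion.
reject_h0 = abs(z) > z_critical
print(f"z={z:.2f}, critical={z_critical:.2f}, reject H0: {reject_h0}")
```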


Types of Errors in Hypothesis Testing


Type I:
Rejection of the null hypothesis when, in fact, it is true.

Type II:
Failure to reject (often described as acceptance of) the null hypothesis when, in fact, it is false.

Tests are either one-tailed or two-tailed. This decision depends on the nature of the situation and what the researcher is demonstrating.

One-Tailed: If you take the medicine, you will get better.

Two-Tailed: If you take the medicine, you will get either better or worse.


Issues With Type I and II Errors

Actual State of the
Null Hypothesis      Fail to Reject H0              Reject H0
H0 is true           Correct (1 - α), no error      Type I error (α)
H0 is false          Type II error (β)              Correct (1 - β), no error
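
The trade-off between the two error types can be illustrated numerically. The sketch below computes the Type II error probability for an upper one-tailed Z-test; the hypothesized mean, true mean, standard deviation, and sample size are hypothetical, and `type_ii_error` is an illustrative helper name.

```python
import math
from statistics import NormalDist


def type_ii_error(mu_0, mu_true, sigma, n, alpha):
    """Probability of failing to reject H0: mu = mu_0 (upper one-tailed Z-test)
    when the true mean is actually mu_true."""
    std_error = sigma / math.sqrt(n)
    # H0 is rejected when the sample mean exceeds this cutoff.
    cutoff = mu_0 + NormalDist().inv_cdf(1 - alpha) * std_error
    # Beta is the chance the sample mean falls at or below the cutoff
    # when sample means are centered on the true mean.
    return NormalDist(mu_true, std_error).cdf(cutoff)


beta_at_05 = type_ii_error(50, 52, 8, 64, alpha=0.05)
beta_at_01 = type_ii_error(50, 52, 8, 64, alpha=0.01)
print(f"alpha=0.05 -> beta={beta_at_05:.3f}")
print(f"alpha=0.01 -> beta={beta_at_01:.3f}")
```

For a fixed sample size, lowering the Type I error rate raises the Type II error rate, which is why the significance level should be chosen with the cost of each kind of mistake in mind.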


Commonly Used Statistical Hypothesis Tests

Independent Samples:
Samples in which measurement of a variable in one population has no effect on measurement of the variable in the other.

Related Samples:
Samples in which measurement of a variable in one population might influence measurement of the variable in the other.

Degrees of Freedom:
Is equal to the number of observations minus the number of assumptions or constraints necessary to calculate a statistic.


Hypothesis Tests
About One and Two Means Respectively

Z-Test:
Hypothesis test used for a single mean if the sample is large enough and drawn from a normal population. Usually for samples of about 30 and above.

t-Test:
Hypothesis test used for a single mean if the sample is too small to use the Z-test. Usually for samples below 30.

Test of Two Means:
Hypothesis test of the difference between the means of two groups of data.
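
For a small sample, the t statistic for a single mean is computed from the sample standard deviation and compared with a critical value that depends on the degrees of freedom. The data and hypothesized mean below are hypothetical; the critical value 2.262 is the standard t-table entry for a two-tailed test at the 5% level with 9 degrees of freedom.

```python
import math
import statistics

sample = [12, 15, 11, 14, 13, 16, 12, 15, 14, 13]  # hypothetical, n = 10
mu_0 = 12.0                                        # hypothesized mean under H0

n = len(sample)
mean = statistics.mean(sample)
s = statistics.stdev(sample)          # sample standard deviation (n - 1 divisor)
t = (mean - mu_0) / (s / math.sqrt(n))
df = n - 1                            # one constraint: the sample mean

T_CRITICAL = 2.262                    # t table: two-tailed, alpha = 0.05, df = 9
reject_h0 = abs(t) > T_CRITICAL
print(f"t={t:.2f}, df={df}, reject H0: {reject_h0}")
```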


Hypothesis Tests
About Proportions and P-Value

Proportion in One Sample:


Test to determine whether the difference between proportions is greater than would be expected because of sampling error.

Two Proportions in Independent Samples:


Test to determine the proportional differences between two or more groups.

p-value:
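The probability, assuming the null hypothesis is true, of obtaining a test statistic at least as extreme as the one computed. The smaller the p-value, the smaller the probability that the observed result occurred by chance alone, and the stronger the evidence against the null hypothesis.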
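
A test of two proportions in independent samples, together with its p-value, can be sketched as follows. The counts are hypothetical, the pooled standard error is one common textbook form of the test, and `two_proportion_z_test` is an illustrative helper name rather than a library function.

```python
import math
from statistics import NormalDist


def two_proportion_z_test(x1, n1, x2, n2):
    """Two-tailed Z-test for the difference between two independent proportions."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)  # proportion under H0: p1 = p2
    std_error = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / std_error
    # p-value: probability of a statistic at least this extreme if H0 is true.
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Hypothetical: 60 of 100 women and 45 of 100 men liked the product.
z, p_value = two_proportion_z_test(60, 100, 45, 100)
print(f"z={z:.3f}, p-value={p_value:.4f}")
```

With a 5% significance level, a p-value below 0.05 leads to rejecting the null hypothesis that the two proportions are equal.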


Statistics and the Internet


ActivStats - www.datadesk.com
Autobox - www.autobox.com
Math Software - http://gams.nist.gov
Minitab - www.minitab.com
SAS - www.sas.com
SPSS - www.spss.com
Stata - www.stata.com
SYSTAT - www.systat.com
Vizion - www.datadesk.com/viz!on
XLSTAT - www.xlstat.com


Bi-variate Correlation and Regression


Bivariate Analysis of Association


Bivariate Techniques: Statistical methods of analyzing the relationship between variables.

Independent Variable: Variable believed to affect the value of the dependent variable.



Bivariate Analysis of Association


Dependent Variable:
Variable expected to be explained or caused by the independent variable.

Bivariate Regression Analysis:


The analysis of the strength of the linear relationship between variables when one is considered the independent variable and the other is the dependent variable.


Types of Relationships
As Found in Scatterplot Diagrams

No Apparent Relationship Between X and Y

Perfect Positive Relationship Between X and Y

Perfect Negative Relationship Between X and Y

Parabolic Relationship Between X and Y


Types of Relationships
As Found in Scatterplot Diagrams

General Negative Relationship Between X and Y

General Positive Relationship Between X and Y

Negative Curvilinear Relationship Between X and Y

No Apparent Relationship Between X and Y


Least-Square Estimation Procedure

Fits a straight line to the plotted X and Y observations; Enables estimation of data points that were not plotted; Results in a straight line that fits the actual observations (plotted dots) better than any other line that could be fit to the observations.


Least-Square Estimation Procedure


Estimating the best line of fit:

$$ Y = a + bX + e $$

Where:
Y = dependent variable
a = estimated Y intercept
b = estimated slope of the regression line
X = independent variable
e = error

Values for a and b can be calculated as follows:

$$ b = \frac{\sum X_i Y_i - n \bar{X} \bar{Y}}{\sum X_i^2 - n (\bar{X})^2} $$

$$ a = \bar{Y} - b \bar{X} $$

Where:
\bar{X} = mean of the X values
\bar{Y} = mean of the Y values
n = sample size
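
The formulas for a and b can be applied directly, as in the sketch below. The five (X, Y) pairs are hypothetical and `least_squares` is an illustrative helper name.

```python
def least_squares(xs, ys):
    """Return (a, b) for the line Y = a + bX using the computational formulas:
    b = (sum(XiYi) - n*Xbar*Ybar) / (sum(Xi^2) - n*Xbar^2), a = Ybar - b*Xbar."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    b = (sum_xy - n * x_bar * y_bar) / (sum_x2 - n * x_bar ** 2)
    a = y_bar - b * x_bar
    return a, b


xs = [1, 2, 3, 4, 5]  # hypothetical independent variable
ys = [2, 4, 5, 4, 5]  # hypothetical dependent variable
a, b = least_squares(xs, ys)
print(f"Y = {a:.2f} + {b:.2f}X")

# Estimate a point that was not observed, e.g. X = 6.
print(f"predicted Y at X=6: {a + b * 6:.2f}")
```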


Measures of Association
Coefficient of Determination:
Percentage of the total variation in the dependent variable explained by the independent variable(s).

$$ R^2 = \frac{\text{Total Variation} - \text{Unexplained Variation}}{\text{Total Variation}} = 1 - \frac{\sum_{i=1}^{n} (Y_i - \hat{Y}_i)^2}{\sum_{i=1}^{n} (Y_i - \bar{Y})^2} $$

where \hat{Y}_i is the value predicted by the regression line and \bar{Y} is the mean of the Y values.

Pearson Correlation:
Analysis of the degree to which changes in one variable are associated with changes in another, for use with metric data.

The Strength of Association:

$$ R = \pm \sqrt{R^2} $$

R takes the sign of the slope b of the regression line.
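
To make the link between R and R² concrete, the sketch below computes the Pearson correlation from deviations about the means and squares it. The data are hypothetical and `pearson_r` is an illustrative helper name.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation computed from deviations about the means."""
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_yy = sum((y - y_bar) ** 2 for y in ys)
    return s_xy / math.sqrt(s_xx * s_yy)


xs = [1, 2, 3, 4, 5]  # hypothetical metric data
ys = [2, 4, 5, 4, 5]
r = pearson_r(xs, ys)
r_squared = r ** 2    # in bivariate regression, R^2 is the square of r
print(f"R={r:.3f}, R^2={r_squared:.3f}")
```

Here R is positive because Y tends to rise with X, and R² gives the share of the variation in Y that the line accounts for.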


Sum of Squares
Total Variation: Sum of Squares (SST)

$$ SST = \sum_{i=1}^{n} (Y_i - \bar{Y})^2 = \sum_{i=1}^{n} Y_i^2 - \frac{\left( \sum_{i=1}^{n} Y_i \right)^2}{n} $$


Sum of Squares
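Sum of Squares due to Regression (SSR)

$$ SSR = \sum_{i=1}^{n} (\hat{Y}_i - \bar{Y})^2 = a \sum_{i=1}^{n} Y_i + b \sum_{i=1}^{n} X_i Y_i - \frac{\left( \sum_{i=1}^{n} Y_i \right)^2}{n} $$

where \hat{Y}_i = a + bX_i is the value predicted by the regression line.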


Sum of Squares
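Error Sum of Squares (SSE)

$$ SSE = \sum_{i=1}^{n} (Y_i - \hat{Y}_i)^2 = \sum_{i=1}^{n} Y_i^2 - a \sum_{i=1}^{n} Y_i - b \sum_{i=1}^{n} X_i Y_i $$

where \hat{Y}_i = a + bX_i is the value predicted by the regression line.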
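
The three sums of squares fit together as SST = SSR + SSE. The sketch below checks this on hypothetical data, using the computational forms (sums of Y, Y squared, and XY) with least-squares estimates of a and b.

```python
xs = [1, 2, 3, 4, 5]  # hypothetical data
ys = [2, 4, 5, 4, 5]
n = len(xs)

sum_y = sum(ys)
sum_y2 = sum(y * y for y in ys)
sum_xy = sum(x * y for x, y in zip(xs, ys))
sum_x2 = sum(x * x for x in xs)
x_bar, y_bar = sum(xs) / n, sum_y / n

# Least-squares estimates of the slope and intercept.
b = (sum_xy - n * x_bar * y_bar) / (sum_x2 - n * x_bar ** 2)
a = y_bar - b * x_bar

sst = sum_y2 - sum_y ** 2 / n                    # total variation
ssr = a * sum_y + b * sum_xy - sum_y ** 2 / n    # variation explained by the regression
sse = sum_y2 - a * sum_y - b * sum_xy            # unexplained (error) variation

print(f"SST={sst:.2f} SSR={ssr:.2f} SSE={sse:.2f}")
print(f"R^2 = SSR/SST = {ssr / sst:.2f}")
```

The identity only holds when a and b are the least-squares estimates; with any other line the explained and unexplained parts do not add up to the total.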


Correlation
Assessing Measures of Association

Pearson's Product Moment Correlation: Measure of association using interval or ratio data.

Spearman Rank-Order Correlation: Measure of association using ordinal or rank-order data.


Correlation
Assessing Measures of Association

Measures of Association:
Do not mean there is a causal relationship between the relevant variables; Could simply represent coincidence between the relevant variables; Should be taken in context and with the timeliness of both data sets in mind; Can be used in conjunction with cross tabulations of the relevant data to add another perspective to the results.


Communicating the Research Results and Managing Marketing Research


The Research Report


Explain why the research was done
    What were the motivations for doing the research?
    Was there a problem that had to be addressed?

State the specific research objectives
    What do you hope to learn?
    What are your research goals?

Explain how the research was done
    What type of sampling did you use and why?
    Did you do surveys, focus groups, interviews, etc.?

Present the research findings
    In what form: written, slide presentation, oral?
    How can you make the findings practical and actionable?

Provide conclusions and recommendations
    Conclusions for descriptive research
    Recommendations for analytic research


The Research Report

1. Title Page: Include the submitter's information.
2. Letter of Transmittal: Letter giving ownership of the research.
3. Table of Contents
4. Executive Summary: A page or two highlighting the key findings.
5. Background: Relevant historical information that set the stage for the research.
6. Methodology: Detail how you conducted the research.


The Research Report

7. Findings: Dovetail the findings with the research objectives and tie the secondary data into the primary findings. A combination of a descriptive and an analytic approach is generally best.
8. Limitations: Discuss problems faced and how they were handled.
9. Conclusions: Summarize the key headlines of the research findings.
10. Recommendations: Give management action items based on the research.
11. Appendices: Relevant supporting documents, tables, data, etc.


Interpreting and Presenting the Results


Some Formatting Tips:
Use bulleted charts when appropriate; Use text to discuss / elaborate on bullets; Follow proper writing standards; Use a minimum of text to convey the message; Don't use too many different graphic types; Multiple graphics on a page can tell a story; Don't use over-hyped text; Appearance - be professional and consistent.

[Figure: Sample Bar Chart, "Monthly Score by City" for Boise, ID; Washington, DC; Austin, TX; Chapel Hill, NC; Santa Fe, NM]


The Presentation
When Presenting, One Might Use:
A presentation outline; Visuals - charts on easels, PowerPoint, etc.; Copies of the final report; Web options; An executive summary; Researcher contact information.

One Might Want to Convey:


What the data are telling you; The impact of the data on managerial decision making; What course(s) of action is recommended; What future studies might be needed; What was missing from this study; Potential future research benefits.


The Presentation
Tailored to the Audience - Understand Their:
Frame of reference; Attitudes, beliefs, perceptions, and prejudices; Educational background / level of research knowledge; Time constraints for the presentation and for action; Position within the organization; Interest in hearing the results.

Understanding the Barriers to Effective Communication:


Assess the listener's way of listening; Be responsive to questions in a positive way; Don't be defensive about criticism; Take some time to size up the listener's personality type.


The Presentation
Persuasion - Using the Research Findings to Reinforce Conclusions:
Questions the researcher should keep in mind:
What do the data really mean? What impact do they have? How can the data be conveyed simply? How can one make the data valuable and applicable? What have we learned from the data? What do we need to do given the information we now have? How can future studies of this nature be enhanced? What can make information such as this more useful?


The Presentation
Showing the Value of the Research Key Factors in the Use of Marketing Research:
The perceived credibility and usefulness of the report to the users; The degree of client and researcher interaction; The organizational climate for research; The personality and organizational level of key users.

The Role of Trust:


Key components of trust between the researcher and the decision makers: A function of interpersonal relationship and skill; Perceived and actual integrity of the researcher; Delivering what is promised; Being accessible to management / receivers of the research; Perceived willingness of the researcher to reduce user uncertainty; Confidentiality, expertise, professionalism, and follow-up.


Motivating Managers to Use the Research

Increase sales and improve customer satisfaction; Better position the company competitively; Make investors happy; Improve company effectiveness and efficiency; Help the company control costs; Help the company identify opportunities; Lead to tangible quality and performance measures; Enable the company to stay ahead of customers' needs and wants.

