
Chapter 6. Data analysis

6.1. Information systems used for data analysis
6.2. Descriptive statistics
6.3. Inferential statistics

6.1. Information systems used for data analysis


The SPSS system (Statistical Package for the Social Sciences) is widely used in marketing research for data analysis.
It is used mainly for data gathered with questionnaires, but also for various quantitative data from official statistics, company records, etc. The results are presented as tables and charts. It offers many data analysis procedures, such as summarizing data, transforming variables and statistical tests.

The flow of using the SPSS system for information processing:

1. Data gathering
2. Creating the SPSS database
3. Selecting the data analysis procedure
4. Selecting the variables for analysis
5. Data processing in order to obtain the information

Data gathering

Depends on the research method:
- Surveys: questionnaire
- Secondary data: official statistics, statistical databases, company records, etc.

Avoiding data gathering errors is very important for the success of the research. The researcher should pay special attention to:
- Proper training of the operators who collect the data.
- Verification in the field to ensure that the interviewers follow the sampling procedures.
- Checking the data records to determine whether interviewers are cheating.

Creating the SPSS database

To create a database in SPSS, the following steps are followed:
1. Opening a new file
2. Defining the research variables
3. Recording the data in the database
4. Verifying the recorded data

Start / Programs / SPSS for Windows

A new empty database

The window for defining variables

Setting the type of data

Defining the codes for response categories

Defining the codes for missing responses

Coding data

Coding is the process of identifying and assigning numerical scores or other character symbols to data expressed in words.

Codes make it easier to enter data into databases and allow the data to be processed by computers. Coding depends on the type of scale used in the questionnaire.

Ex: Nominal scale

What brand of cigarettes do you smoke most often?
Winston (1)
L&M (2)
Kent (3)
Marlboro (4)
Winchester (5)
Viceroy (6)
Other. Please specify ________ (7)

Attention: The assigned codes do not represent an order or a specific quantity. They are assigned only to identify a response category (like the numbers on football players' shirts).

Binary (dichotomous) scale, a particular case of the nominal scale:
Do you smoke? Yes (1) No (0)
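The coding of the nominal and binary questions above could be sketched in Python as follows. This is only an illustration: the codes come from the example, while the function name `encode` and the missing-response code 99 are assumptions that do not appear in the source.

```python
# Codebooks built from the example questions above.
BRAND_CODES = {
    "Winston": 1, "L&M": 2, "Kent": 3, "Marlboro": 4,
    "Winchester": 5, "Viceroy": 6, "Other": 7,
}
SMOKER_CODES = {"Yes": 1, "No": 0}
MISSING = 99  # assumed code for a missing response (not given in the source)


def encode(answer, codebook, missing=MISSING):
    """Return the numeric code of a verbal answer, or the missing code."""
    if answer is None:
        return missing
    return codebook.get(answer.strip(), missing)


answers = ["Kent", "Marlboro", None, "Kent", "Winston"]
brand_codes = [encode(a, BRAND_CODES) for a in answers]
print(brand_codes)  # [3, 4, 99, 3, 1]
```

The codes only identify categories, so arithmetic on `brand_codes` (for example an average) would have no meaning.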

Ex: Ordinal scale

1. Rank order scale according to a characteristic
Please rank the following 5 brands of laundry detergent according to your preference (give rank 1 to the most preferred brand, rank 2 to the second most preferred brand and so on, down to rank 5 for the least preferred brand).
OMO  ARIEL  DERO  PERSIL  TIDE
Coding: in this case, a variable is defined for every response category. The rank assigned by each respondent (from 1 to 5) is entered in the database.
Attention: for ordinal scales, the assigned codes express an order.
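As a rough sketch of the "one variable per brand" coding described above, the Python snippet below validates one respondent's ranking and turns it into a database row. The function name `rank_record` and the variable names such as `rank_omo` are hypothetical.

```python
BRANDS = ["OMO", "ARIEL", "DERO", "PERSIL", "TIDE"]


def rank_record(ranks):
    """Validate one respondent's ranking and return one variable per brand.

    `ranks` maps each brand to the rank given by the respondent
    (1 = most preferred, 5 = least preferred).
    """
    if sorted(ranks) != sorted(BRANDS):
        raise ValueError("every brand must be ranked exactly once")
    if sorted(ranks.values()) != list(range(1, len(BRANDS) + 1)):
        raise ValueError("ranks must be 1..5 with no ties or gaps")
    # Hypothetical variable names: rank_omo, rank_ariel, ...
    return {f"rank_{brand.lower()}": ranks[brand] for brand in BRANDS}


row = rank_record({"OMO": 3, "ARIEL": 1, "DERO": 5, "PERSIL": 2, "TIDE": 4})
print(row)
```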

2. Semantic differential
How important is the quality-price ratio when you choose a brand of laundry detergent?
(5) very important
(4) important
(3) neither important nor unimportant
(2) not important
(1) not at all important

3. Numerical scale
How satisfied are you with the whitening power of Ariel laundry detergent?
Very satisfied  5  4  3  2  1  Very dissatisfied
Usually, in this case only the extreme values are labelled (1 = very dissatisfied, 5 = very satisfied).

4. Likert scale
Please indicate your opinion of the following statement: When somebody chooses a laundry detergent, the price is the most important, all brands having about the same whitening power.
(5) strongly agree
(4) agree
(3) neither agree nor disagree
(2) disagree
(1) strongly disagree

Interval scale
The middle point of every interval is recorded in the database. It is used both as the value of the variable and as the code of the response category.

How many cigarettes do you generally smoke during a day?
5-9 (7)
10-14 (12)
15-19 (17)
20-24 (22)
25-29 (27)

Ratio scale
For this type of scale, coding is not used. The exact value indicated by the respondent is recorded in the database.
Ex: How many hours do you study for an exam during the examination session? ____5 h____
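The midpoint codes of the cigarette question can be checked with a few lines of Python. This is a minimal sketch; the names `midpoint` and `midpoint_codes` are illustrative only.

```python
def midpoint(lower, upper):
    """Middle point of a closed interval such as 5-9."""
    return (lower + upper) / 2


intervals = [(5, 9), (10, 14), (15, 19), (20, 24), (25, 29)]
midpoint_codes = {f"{lo}-{hi}": midpoint(lo, hi) for lo, hi in intervals}
print(midpoint_codes)
# {'5-9': 7.0, '10-14': 12.0, '15-19': 17.0, '20-24': 22.0, '25-29': 27.0}
```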

Ex: Divide 100 points among the following brands according to your preference for each brand:
ARIEL __40___ DERO __20___ PERSIL __30__ TIDE __10___

Coding: in this case, a variable is defined for every response category (as for the rank order scale). The value assigned by each respondent is entered in the database.
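When recording answers to the 100-point question, it may help to verify that each respondent's points add up to 100 before the values are entered. The sketch below assumes this check is wanted; the function name `points_record` and the variable names such as `points_ariel` are hypothetical.

```python
def points_record(points, total=100):
    """Validate a 100-point allocation and return one variable per brand."""
    if any(value < 0 for value in points.values()):
        raise ValueError("points cannot be negative")
    if sum(points.values()) != total:
        raise ValueError(f"points must add up to {total}")
    # Hypothetical variable names: points_ariel, points_dero, ...
    return {f"points_{brand.lower()}": value for brand, value in points.items()}


row = points_record({"ARIEL": 40, "DERO": 20, "PERSIL": 30, "TIDE": 10})
print(row)
# {'points_ariel': 40, 'points_dero': 20, 'points_persil': 30, 'points_tide': 10}
```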

6.2. Descriptive statistics


Descriptive analysis

Descriptive analysis is the transformation of raw data into a form that is easy to understand and interpret (summarizing the data). The most common ways to summarize data are the frequency distribution, the percentage distribution, and the calculation of central tendency and variation indicators. Charts can accompany frequency tables to make the information easier to understand.

Attention: Descriptive statistics are computed exclusively at the level of the sample, using the data collected from the sample members.



Selecting the procedures of descriptive analysis in SPSS



Frequency table
An arrangement of statistical data in a row-and-column format that shows the count of responses and the percentages for each category of a variable.

General Happiness

                          Frequency   Percent   Valid Percent   Cumulative Percent
Valid     Very Happy            467      30.8            31.1                 31.1
          Pretty Happy          872      57.5            58.0                 89.0
          Not Too Happy         165      10.9            11.0                100.0
          Total                1504      99.1           100.0
Missing   NA                     13       0.9
Total                          1517     100.0
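The percentages in the General Happiness table can be reproduced outside SPSS. The Python sketch below is not the SPSS procedure itself; it assumes the raw responses are available as a list of labels, and the function name `frequency_table` is illustrative. "Percent" is computed over all cases, while "Valid Percent" and "Cumulative Percent" are computed over the non-missing cases only.

```python
from collections import Counter


def frequency_table(responses, categories):
    """Return rows of (category, frequency, percent, valid %, cumulative %)."""
    counts = Counter(responses)
    n_total = len(responses)                          # valid + missing cases
    n_valid = sum(counts[c] for c in categories)      # valid cases only
    rows, cumulative = [], 0
    for category in categories:
        cumulative += counts[category]
        rows.append((
            category,
            counts[category],
            round(100 * counts[category] / n_total, 1),
            round(100 * counts[category] / n_valid, 1),
            round(100 * cumulative / n_valid, 1),
        ))
    return rows


# Rebuild the raw responses from the counts shown in the table.
responses = (["Very Happy"] * 467 + ["Pretty Happy"] * 872
             + ["Not Too Happy"] * 165 + ["NA"] * 13)
rows = frequency_table(responses, ["Very Happy", "Pretty Happy", "Not Too Happy"])
for row in rows:
    print(row)
```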

Measures of central tendency: mode, median, mean

Mode is the response category with the highest frequency.

Median is the middle value when the data are arranged in ascending or descending order. It divides the sample into two equal groups (50% of the sample members lie below the median and the other 50% above it).

Mean is the most commonly used measure of central tendency when data are measured on a ratio or interval scale.

$$\bar{x} = \frac{\sum_{i=1}^{k} x_i f_i}{n}$$

where $x_i$ are the $k$ distinct values of the variable, $f_i$ their frequencies and $n$ the sample size.

For a binary scale:

$$\bar{x} = \frac{1 \cdot f_{Yes} + 0 \cdot f_{No}}{n} = \frac{f_{Yes}}{n} = p$$

Mean score is a summarized rank used with ordinal scales to create the final order of the analyzed categories. It is calculated like the mean but does not have the same properties.
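The measures of central tendency can be illustrated on a small frequency distribution. In the Python sketch below, the frequencies and the Yes/No counts are invented for illustration only; the values 7 to 27 reuse the interval midpoints of the cigarette question.

```python
from statistics import median

# Hypothetical frequency distribution: value x_i -> frequency f_i
freq = {7: 10, 12: 25, 17: 40, 22: 15, 27: 10}

n = sum(freq.values())                             # sample size
mode = max(freq, key=freq.get)                     # value with the highest frequency
mean = sum(x * f for x, f in freq.items()) / n     # sum of x_i * f_i, divided by n
expanded = [x for x, f in freq.items() for _ in range(f)]
med = median(expanded)                             # middle value of the ordered data

# Binary scale: Yes is coded 1 and No is coded 0,
# so the mean equals the proportion p of Yes answers.
f_yes, f_no = 30, 70                               # hypothetical counts
p = (1 * f_yes + 0 * f_no) / (f_yes + f_no)

print(mode, med, mean, p)  # 17 17.0 16.5 0.3
```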

Variation indicators: range, variance, standard deviation, standard error of mean

Range measures the spread of the data:

$$\text{Range} = x_{largest} - x_{smallest}$$

Variance is the mean of the squared deviations from the mean. It is an indicator of sample homogeneity. With $k$ distinct values $x_i$, frequencies $f_i$ and sample size $n$:

$$s^2 = \frac{\sum_{i=1}^{k} (x_i - \bar{x})^2 f_i}{n}$$

For a binary scale:

$$s^2 = p(1 - p) \quad \text{or} \quad s^2 = p(100 - p)$$

The second form applies when $p$ is expressed as a percentage.

Standard deviation is the square root of the variance. It is expressed in the same units as the data.

$$s = \sqrt{\frac{\sum_{i=1}^{k} (x_i - \bar{x})^2 f_i}{n}}$$

For a binary scale:

$$s = \sqrt{p(1 - p)} \quad \text{or} \quad s = \sqrt{p(100 - p)}$$

Standard error of mean is a measure of how much the value of the mean may vary from sample to sample taken from the same distribution.

$$s_{\bar{x}} = \frac{s}{\sqrt{n}}$$
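A short Python sketch of the variation indicators, again on invented frequencies. It follows the definition given above (variance as the mean of squared deviations, that is, divisor n); SPSS output divides by n - 1 instead, so its results differ slightly for small samples.

```python
from math import sqrt

# Hypothetical frequency distribution: value x_i -> frequency f_i
freq = {7: 10, 12: 25, 17: 40, 22: 15, 27: 10}

n = sum(freq.values())
mean = sum(x * f for x, f in freq.items()) / n

data_range = max(freq) - min(freq)                                 # largest - smallest value
variance = sum((x - mean) ** 2 * f for x, f in freq.items()) / n   # divisor n, as defined above
std_dev = sqrt(variance)                                           # same units as the data
std_error = std_dev / sqrt(n)                                      # standard error of the mean

# Binary scale with a hypothetical proportion of Yes answers
p = 0.3
binary_variance = p * (1 - p)
binary_std_dev = sqrt(binary_variance)

print(data_range, variance)  # 20 29.75
```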



Selecting the procedures of descriptive analysis in SPSS



The descriptive statistics indicators that can be calculated for each type of scale:
- Nominal scale: Mode.
- Ordinal scale: Mode, Median, Mean score.
- Interval scale: Mode, Median, Mean, Variance, Standard deviation, Standard error of mean.
- Ratio scale: the same as for the interval scale; in addition, one scale value can be divided by another (because an absolute zero exists).

Exceptions:
- Binary scale: even though it is a nominal scale, the absence of the named characteristic represents an absolute zero. We can calculate: Mean, Variance, Standard deviation, Standard error of mean.
- Mean score: it is calculated like the mean but does not have the same properties, because the distances between scale levels are not equal. It does not allow the calculation of variance and standard deviation.
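The rules above can be kept as a small lookup table, for example to check an analysis plan before running it. This is only an illustrative Python sketch; the names `ALLOWED` and `is_allowed` are hypothetical, and listing the mode for the binary scale is an assumption based on it being a nominal scale.

```python
INTERVAL_INDICATORS = [
    "mode", "median", "mean", "variance",
    "standard deviation", "standard error of mean",
]

# Indicators permitted for each type of scale, as listed above.
ALLOWED = {
    "nominal": ["mode"],
    "ordinal": ["mode", "median", "mean score"],
    "interval": INTERVAL_INDICATORS,
    "ratio": INTERVAL_INDICATORS,
    # Exception: a binary scale is nominal, yet it also allows the mean
    # and the variation indicators (mode included by assumption).
    "binary": ["mode", "mean", "variance",
               "standard deviation", "standard error of mean"],
}


def is_allowed(scale, indicator):
    """Return True if the indicator can be computed for the given scale."""
    return indicator in ALLOWED[scale]


print(is_allowed("nominal", "mean"))   # False
print(is_allowed("binary", "mean"))    # True
```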
