Introduction
• Cross tabulation is a statistical tool used to analyze categorical data.
Categorical data are variables whose values fall into distinct categories
that are mutually exclusive. Eye color is an example: it can be divided into
categories (e.g., blue, brown, green), and a given person's eye color
belongs to only one of those categories.
• Cross tabulation helps you understand how two different variables are related
to each other. For example, suppose you wanted to see whether the gender of a
survey respondent is related to whether they consider physical education in
high school important.
Introduction
• For reference, a cross-tabulation is a two- (or more) dimensional
table that records the number (frequency) of respondents that have
the specific characteristics described in the cells of the table.
• Cross-tabulation tables provide a wealth of information about the
relationship between the variables.
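• Cross-tabulation analysis goes by several names in the research
world, including crosstab, contingency table and data tabulation. It is
often paired with the chi-square test, which assesses whether the
variables in the table are associated.
• Examples of questions a cross-tabulation can address: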
• How many brand-loyal users are males?
• Is familiarity with a new product related to age and income levels?
• Is product ownership related to income (high, medium and low)?
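A question such as "Is product ownership related to income?" is commonly examined by computing a chi-square statistic on the cross-tabulated counts. The following is a minimal sketch of that calculation in plain Python; the counts and the function name `chi_square_statistic` are hypothetical and chosen only for illustration.

```python
def chi_square_statistic(observed):
    """Return (chi2, degrees_of_freedom) for a table of observed counts.

    `observed` is a list of rows, each row a list of cell counts.
    """
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)

    chi2 = 0.0
    for i, row in enumerate(observed):
        for j, count in enumerate(row):
            # Count expected in this cell if the two variables were independent.
            expected = row_totals[i] * col_totals[j] / grand_total
            chi2 += (count - expected) ** 2 / expected

    dof = (len(observed) - 1) * (len(col_totals) - 1)
    return chi2, dof


# Hypothetical counts (not taken from any real survey).
# Rows: owns the product (yes, no). Columns: income (high, medium, low).
observed = [
    [30, 20, 10],
    [10, 20, 30],
]

chi2, dof = chi_square_statistic(observed)
print(round(chi2, 2), dof)  # prints 20.0 2
```

A larger statistic, relative to the chi-square distribution with the given degrees of freedom, indicates stronger evidence that the two variables are associated. Statistical packages normally report the corresponding p-value alongside the table.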
Introduction
• A frequency distribution describes one variable at a time, but a cross-
tabulation describes two or more variables simultaneously.
• Cross-tabulation results in tables that reflect the joint distribution of
two or more variables with a limited number of categories or distinct
values.
• The categories of one variable are cross-classified with the categories
of one or more other variables.
• Thus, the frequency distribution of one variable is subdivided
according to the values or categories of the other variables.
• Using the GlobalCash Project as an example, suppose there was interest in
determining whether the number of European countries that a company operates
in is associated with its plans to change the number of banks it does
business with.
• The cross-tabulation is shown in Table 18.2. A cross-tabulation includes a cell
for every combination of the categories of the two variables. The number in
each cell shows how many respondents gave that combination of responses.
• In Table 18.2, 105 respondents operated in only one European country and did
not plan to change the number of banks they do business with.
• Table 18.2 Number of countries in Europe that a company operates in and
plans to change the number of banks that a company does business with
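The idea of one cell per combination of categories can be sketched in a few lines of Python. The responses below are hypothetical and do not reproduce the figures in Table 18.2; the function name `cross_tabulate` is likewise an illustrative choice.

```python
from collections import Counter

# Hypothetical responses: (countries operated in, plans to change number of banks).
responses = [
    ("1", "No"), ("1", "No"), ("1", "Yes"),
    ("2-5", "No"), ("2-5", "Yes"), ("2-5", "Yes"),
    ("6+", "Yes"), ("6+", "No"),
]


def cross_tabulate(pairs):
    """Count how many respondents gave each (row, column) combination."""
    cell_counts = Counter(pairs)
    row_labels = sorted({row for row, _ in pairs})
    col_labels = sorted({col for _, col in pairs})
    # Include every combination, so empty cells show up as 0.
    return {
        row: {col: cell_counts[(row, col)] for col in col_labels}
        for row in row_labels
    }


table = cross_tabulate(responses)
for row_label, cells in table.items():
    print(row_label, cells)
```

Summing a row of `table` gives the frequency distribution of the row variable, which shows how the cross-tabulation subdivides a one-variable frequency distribution by the categories of the second variable.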
Scope
(1) cross-tabulation analysis and results can be easily interpreted
and understood by managers who are not statistically oriented;
(2) the clarity of interpretation provides a stronger link between
research results and managerial action;
(3) a series of cross-tabulations may provide greater insights into a
complex phenomenon than a single multivariate analysis;
(4) cross-tabulation may alleviate the problem of sparse cells,
which could be serious in discrete multivariate analysis; and
(5) cross-tabulation analysis is simple to conduct and appealing to
less-sophisticated researchers.
Steps
• Open the table builder (Analyze menu, Tables, Custom Tables).
• Click Reset to delete any previous selections in the table builder.
• In the table builder, drag and drop Age category from the variable list to the
Rows area on the canvas pane.
• Drag and drop Gender from the variable list to the Columns area on the canvas
pane. (You may have to scroll down through the variable list to find this
variable.)
• Click OK to create the table.
DATA PURITY
• Data Normality
• Data Validity
• Data Reliability
DATA NORMALITY
• Normality is a critical assumption for many analyses of continuous variables.
Under this assumption, the data sets are expected to follow a normal distribution
(i.e., bell-shaped). Skewness and kurtosis are widely used to evaluate the
normality of data sets.
• The normal distribution (also called the Gaussian distribution, named after Carl
Friedrich Gauss, a German scientist and mathematician who justified the least
squares method in 1809) is the most widely used family of statistical distributions
on which many statistical tests are based.
• Many measurements of physical and psychological phenomena can be approximated
by the normal distribution and, hence, the widespread utility of the distribution.
• In many areas of research, a sample is identified on which measurements of
particular phenomena are made. These measurements are then statistically tested,
via hypothesis testing, to determine whether the observed differences could be due
to chance. Assuming the test is valid, an inference can be made about the population
from which the sample is drawn.
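Skewness and kurtosis, mentioned above as checks on normality, can be computed directly from the moments of a data set. The sketch below uses the simple population (moment-based) formulas; statistical packages often apply small-sample corrections, so their output may differ slightly. The data values and the function name are hypothetical.

```python
from statistics import fmean


def skewness_and_excess_kurtosis(values):
    """Return moment-based skewness and excess kurtosis of `values`."""
    n = len(values)
    mean = fmean(values)
    # Second, third and fourth central moments.
    m2 = sum((x - mean) ** 2 for x in values) / n
    m3 = sum((x - mean) ** 3 for x in values) / n
    m4 = sum((x - mean) ** 4 for x in values) / n
    skewness = m3 / m2 ** 1.5
    # Subtract 3 so that a normal distribution scores 0.
    excess_kurtosis = m4 / m2 ** 2 - 3.0
    return skewness, excess_kurtosis


symmetric = [1, 2, 3, 4, 5, 6, 7]      # hypothetical, evenly spread
right_skewed = [1, 1, 1, 2, 2, 3, 10]  # hypothetical, long right tail

sym_skew, sym_kurt = skewness_and_excess_kurtosis(symmetric)
right_skew, right_kurt = skewness_and_excess_kurtosis(right_skewed)
print(sym_skew, sym_kurt)
print(right_skew, right_kurt)
```

Values of both measures near zero are consistent with a normal distribution. A clearly positive skewness, as for `right_skewed`, indicates a longer right tail.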
What Is Normal Distribution?
• A normal distribution is a bell-shaped frequency distribution curve. Most of
the data values in a normal distribution tend to cluster around the mean. The
further a data point is from the mean, the less likely it is to occur. Many
things, such as intelligence, height, and blood pressure, approximately
follow a normal distribution. For example, if you took the height of one
hundred 22-year-old women and created a histogram by plotting height on the
x-axis, and the frequency at which each of the heights occurred on the y-axis,
you would get an approximately normal distribution.
Characteristics of Normal Distribution
• Normal distributions are symmetric, unimodal, and asymptotic, and
the mean, median, and mode are all equal.
• A normal distribution is perfectly symmetrical around its center.
• That is, the right side of the center is a mirror image of the left side. There is
also only one mode, or peak, in a normal distribution.
• Normal distributions are continuous and have tails that are asymptotic, which
means that they approach but never touch the x-axis.
• The center of a normal distribution is located at its peak, and 50% of the data
lies above the mean, while 50% lies below.
• It follows that the mean, median, and mode are all equal in a normal
distribution.
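The characteristics listed above can be checked numerically with `statistics.NormalDist` from the Python standard library. The mean and standard deviation below are hypothetical values for heights in centimetres, chosen only for illustration.

```python
from statistics import NormalDist

# Hypothetical parameters: mean 165 cm, standard deviation 7 cm.
heights = NormalDist(mu=165.0, sigma=7.0)

# 50% of the distribution lies below the mean (and so 50% above it).
share_below_mean = heights.cdf(heights.mean)

# Symmetry: the curve has the same height 10 cm either side of the mean.
density_left = heights.pdf(heights.mean - 10)
density_right = heights.pdf(heights.mean + 10)

# Asymptotic tails: far from the mean the density is tiny but still positive.
density_far_tail = heights.pdf(heights.mean + 5 * heights.stdev)

# Share of values within one standard deviation of the mean.
share_within_one_sd = (
    heights.cdf(heights.mean + heights.stdev)
    - heights.cdf(heights.mean - heights.stdev)
)

print(share_below_mean)
print(density_left, density_right)
print(heights.mean, heights.median, heights.mode)
print(round(share_within_one_sd, 4))
```

The last value is roughly 0.68, the familiar share of a normal distribution that falls within one standard deviation of the mean.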
To Check for Normal Distribution of Data
• From the menus, choose: Graphs > Chart Builder...
• Click the Gallery tab.
• Select Histogram in the Choose from: list.
• Drag and drop the Simple Histogram icon into the canvas area of the Chart
Builder.
• Drag and drop a scale variable onto the X-Axis.
• Click the Groups/Point ID tab and select Rows panel variable or Columns
panel variable.
• Select a categorical grouping variable to define the panels.
• A separate histogram is created for each subgroup defined by the grouping
variable.
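Outside the Chart Builder, the same idea (one histogram per subgroup) can be sketched in plain Python by binning the scale variable separately for each category. The records, the bin width, and the function name `paneled_histograms` are hypothetical.

```python
from collections import Counter, defaultdict

BIN_WIDTH = 10  # hypothetical bin width, in the units of the scale variable

# Hypothetical records: (grouping category, scale value).
records = [
    ("Female", 158), ("Female", 162), ("Female", 165), ("Female", 171),
    ("Male", 168), ("Male", 174), ("Male", 176), ("Male", 183),
]


def paneled_histograms(rows, bin_width):
    """Return {group: Counter({bin_start: count})}, one histogram per group."""
    panels = defaultdict(Counter)
    for group, value in rows:
        bin_start = (value // bin_width) * bin_width
        panels[group][bin_start] += 1
    return panels


panels = paneled_histograms(records, BIN_WIDTH)
for group in sorted(panels):
    print(group)
    for bin_start in sorted(panels[group]):
        bar = "#" * panels[group][bin_start]
        print(f"  {bin_start}-{bin_start + BIN_WIDTH - 1}: {bar}")
```

Each group's text histogram can then be inspected for the single peak and rough symmetry expected of normally distributed data.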
1. Validity
Validity is the ability of an instrument to measure what it
is designed to measure.
It sounds simple that a measure should measure what it is
supposed to measure, but this is difficult to achieve in real
life.