Introduction
What is Statistics?
Statistics is the science of data. This involves collecting, classifying, summarizing, organizing, and interpreting numerical information.
Aim: to make sense of data.
Types of Statistics
Descriptive statistics utilizes numerical and graphical methods to look for patterns in a data set, to summarize the information revealed in a data set, and to present that information in a convenient form.
Inferential statistics utilizes sample data to make estimates, decisions, predictions, or other generalizations about a larger set of data.
MBE12203, Statistics for Research Johnson Lim 2012
Introduction
Fundamental Elements of Statistics
An experimental unit is an object (person, thing, transaction or event) about which we collect data.
A population is a set of units (usually people, objects, transactions or events) that we are interested in studying. Example: all registered voters in Johor, all the stocks listed on Bursa Malaysia, all accidents that occurred during Christmas, etc.
A sample is a subset of a population.
A variable is a characteristic or property of an individual population unit.
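The relationship between a population, a sample and a variable can be sketched in a few lines of Python. This is only an illustration: the voter ages are randomly generated (not real data), and the population size of 1,000 and the sample size of 50 are arbitrary assumptions.

```python
import random
import statistics

# Hypothetical population: the ages of 1,000 registered voters (made-up data).
random.seed(42)
population = [random.randint(21, 80) for _ in range(1000)]

# A sample is a subset of the population: here, 50 units drawn at random
# without replacement.
sample = random.sample(population, 50)

# The variable of interest is "age", a property of each experimental unit.
population_mean = statistics.mean(population)
sample_mean = statistics.mean(sample)

print(f"Population size: {len(population)}, mean age: {population_mean:.1f}")
print(f"Sample size: {len(sample)}, mean age: {sample_mean:.1f}")
```

The sample mean is normally close to, but not identical to, the population mean. Using the sample to say something about the population is the task of inferential statistics.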
Source: [1]
Hypothesis Testing
[Figure: "Hypothesis accepted" and "Hypothesis rejected"]
Source: Image from [2]
Data Types
Two Main Data Categories
Quantitative data: measurements that are recorded on a naturally occurring numerical scale. Can be sub-divided into interval or ratio data. E.g. height, IQ score, electric current, speed of a car.
Qualitative data: measurements that cannot be measured on a natural numerical scale; they can only be classified into one of a group of categories. Can be sub-divided into nominal or ordinal data. E.g. blood types, postcode, color codes.
Nominal, ordinal, interval and ratio data are the four major data types.
Data Types
Nominal Data
Numbers are used as labels to categorize something. They are not meaningful in a numerical sense.
Example: We may assign 1=Red, 2=Blue, 3=Yellow. But the average/mean of Red(1) and Yellow(3) cannot be Blue(2)!
Note that we could just as well assign any other number (e.g. 1883743728) to a color without changing its meaning.
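A minimal sketch of the color example in Python. The list of observations is made up for illustration; the point is that a mean of the codes can be computed but carries no meaning, whereas counting the categories (and reporting the mode) does.

```python
from collections import Counter
import statistics

# The coding from the example above: the numbers are labels only.
color_codes = {1: "Red", 2: "Blue", 3: "Yellow"}

# Hypothetical observations recorded as codes (made-up data).
observations = [1, 3, 1, 3, 3, 1, 3]

# The arithmetic mean can be computed, but it says nothing about colors.
meaningless_mean = statistics.mean(observations)

# Counting categories and reporting the most frequent one is meaningful.
counts = Counter(color_codes[code] for code in observations)
most_common_color = counts.most_common(1)[0][0]

print(f"Mean of codes (not meaningful): {meaningless_mean:.2f}")
print(f"Counts per color: {dict(counts)}")
print(f"Most frequent color: {most_common_color}")
```

Here the mean of the codes lies between 2 and 3, although Blue (2) was never observed.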
Data Types
Ordinal Data
Numbers are used to define an order (for example, of performance), and nothing more. The numbers are meaningful only as an indication of rank.
Example: In a field event (sport), assume 1 indicates first place and 4 represents fourth place, and we have the following ranking order: John(1), Mary(2), Tom(3), Harry(4).
John is better than Mary, Mary is better than Tom, etc. in terms of performance.
However, the ranking order does NOT tell us that the difference between 1 and 2 (John and Mary) is the same as the difference between 3 and 4 (Tom and Harry).
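The ranking example can be illustrated as follows. The finishing times are hypothetical (they do not come from the slides) and are chosen only to show that equal rank gaps can hide unequal performance gaps.

```python
# Hypothetical finishing times in seconds (made-up data for illustration).
times = {"John": 10.1, "Mary": 10.2, "Tom": 10.9, "Harry": 12.0}

# Rank the athletes: the shortest time gets rank 1.
ordered = sorted(times, key=times.get)
ranks = {name: position for position, name in enumerate(ordered, start=1)}

# The rank difference between neighbouring places is always 1 ...
rank_gap_1_2 = ranks["Mary"] - ranks["John"]
rank_gap_3_4 = ranks["Harry"] - ranks["Tom"]

# ... but the underlying performance gaps can be very different.
time_gap_1_2 = times["Mary"] - times["John"]
time_gap_3_4 = times["Harry"] - times["Tom"]

print(f"Ranks: {ranks}")
print(f"Rank gaps: {rank_gap_1_2} vs {rank_gap_3_4}")
print(f"Time gaps: {time_gap_1_2:.1f}s vs {time_gap_3_4:.1f}s")
```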
Data Types
Interval Data
A type of data where the differences between consecutive numbers are equal intervals.
Example: Clocks, rulers and thermometers have interval scales. For example, on a clock, the difference between 11.05pm and 11.10pm is the same as the difference between 7.45am and 7.50am.
Under this scale, we may derive means and other metrics, and these metrics are meaningful for interval data.
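The clock example can be checked with Python's standard datetime module. The calendar date used below is arbitrary, and the thermometer readings are made-up values added only to show that a mean is meaningful on an interval scale.

```python
from datetime import datetime
import statistics

# The calendar date is arbitrary; only the time of day matters here.
evening_gap = datetime(2012, 1, 1, 23, 10) - datetime(2012, 1, 1, 23, 5)
morning_gap = datetime(2012, 1, 1, 7, 50) - datetime(2012, 1, 1, 7, 45)

# Equal differences on the scale represent equal amounts of time.
print(f"11.05pm to 11.10pm: {evening_gap}")
print(f"7.45am to 7.50am:   {morning_gap}")
print(f"Same interval: {evening_gap == morning_gap}")

# Hypothetical thermometer readings in degrees Celsius (made-up data).
readings_celsius = [20.0, 22.0, 24.0]
mean_reading = statistics.mean(readings_celsius)
print(f"Mean temperature: {mean_reading} degrees Celsius")
```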
Data Types
Ratio Data
A special case of interval data. Ratio data have the equal-interval characteristic of interval data; the only difference lies in the ZERO reference point. Interval data can have an arbitrary zero reference point, whereas ratio data have a genuine zero.
Example: In an exam, 60 is the passing mark. John scored 20, and Kitty scored 80.
Under interval, 0 can be labeled as the passing mark (with +1 meaning 1 mark higher than the passing mark), thus John's score will be -40 and Kitty's will be +20. This zero value is not a genuine zero.
Under ratio, 0 is a genuine zero (meaning a student did not score anything). Under this data scale, we can say Kitty's score is 4 times John's score (80/20=4).
Despite this, most statistical data analysis procedures do not distinguish between the interval and ratio properties.
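The exam example can be worked through in a short sketch. The variable names are illustrative; the scores and the passing mark are those from the example. It shows that differences survive a shift of the zero point, while ratios do not.

```python
PASSING_MARK = 60
raw_scores = {"John": 20, "Kitty": 80}

# Interval view: re-express each score relative to the passing mark,
# so that 0 now means "exactly the passing mark" (an arbitrary zero).
relative_scores = {name: score - PASSING_MARK for name, score in raw_scores.items()}

# Differences are the same on both scales.
difference_raw = raw_scores["Kitty"] - raw_scores["John"]
difference_relative = relative_scores["Kitty"] - relative_scores["John"]

# Ratios are only meaningful when 0 is a genuine zero (the raw marks).
ratio_raw = raw_scores["Kitty"] / raw_scores["John"]
ratio_relative = relative_scores["Kitty"] / relative_scores["John"]

print(f"Relative scores: {relative_scores}")
print(f"Differences: {difference_raw} (raw) vs {difference_relative} (relative)")
print(f"Ratios: {ratio_raw} (raw) vs {ratio_relative} (relative)")
```

The ratio computed on the shifted scale is negative, which has no sensible interpretation as "how many times higher" one score is than the other.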
Data Types
A simple discussion
The Likert scale below was used by a researcher studying students' acceptance of learning style X. After getting feedback from a group of 50 students, the researcher found that the mean acceptance score was 4.2, and on that basis claimed that students have high acceptance of learning style X.
5 = Strongly Agree
4 = Agree
3 = Neutral
2 = Disagree
1 = Strongly Disagree
Question: Does the researcher make a correct judgment? Whether you answer yes or no, please give supporting reasons based on your understanding of statistical data types.
Cumulative Frequency: the aggregated frequency for a category.
For discrete data, it indicates the total number of observations less than or equal to the category.
For continuous data, it gives the total number of observations less than or equal to the upper class limit of a class.
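A minimal sketch of computing cumulative frequencies for discrete data with the standard library. The categories and frequencies are made up for illustration (the number of siblings reported by 20 students).

```python
from itertools import accumulate

# Hypothetical discrete data: number of siblings reported by 20 students.
categories = [0, 1, 2, 3, 4]
frequencies = [3, 7, 6, 3, 1]

# Running total: observations less than or equal to each category.
cumulative = list(accumulate(frequencies))

print("Category  Frequency  Cumulative")
for category, frequency, running_total in zip(categories, frequencies, cumulative):
    print(f"{category:>8}  {frequency:>9}  {running_total:>10}")
```

The last cumulative frequency always equals the total number of observations. For grouped (continuous) data the same running total applies, with each row read as "less than or equal to the upper class limit".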
Source: [1]
Grouped Data
Source: [1]
References
[1] Examples from Dr. Wan Azlinda lecture slides (2006).
[2] McClave, J.T. (2009). A First Course in Statistics. Pearson Prentice Hall: USA.