
Introduction to Statistics

WHAT IS STATISTICS?
Definition: Statistics is a group of methods used to collect, analyze, present, and interpret data and to make decisions. A comprehensive definition given by Croxton and Cowden is: "Statistics may be defined as the collection, presentation, analysis and interpretation of numerical data."

Statistics is the science of collecting, organizing, analyzing, and interpreting data in order to make decisions.

Applications:

Accounting: The number of individual accounts receivable is generally large, and checking the validity of each one is time-consuming. Auditors therefore draw conclusions from sample data as to whether the accounts receivable amount shown on the client's balance sheet is acceptable.

Finance: Financial analysts use a variety of statistical information and methods to guide investment recommendations.

Economics: Economists use a variety of statistical information and methods in forecasting, planning, and formulating economic policies. Examples include price index numbers, unemployment rates, manufacturing capacity utilization, human development indices, and quality control charts.

Population
The entire group of individuals of interest is called the population. More generally, it is the collection of all responses, measurements, or counts that are of interest. For example, a researcher may be interested in the relation between class size (variable 1) and academic performance (variable 2) for the population of MBA students.

Sample
Usually populations are so large that a researcher cannot examine the entire group. Therefore, a sample is selected to represent the population in a research study. A portion or subset of the population is called a sample. The goal is to use the results obtained from the sample to help answer questions about the population.
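As an illustration of selecting a sample from a population, here is a minimal Python sketch. The population values are made up, and simple random sampling without replacement is assumed as the selection method; the text above does not prescribe one.

```python
import random

# Hypothetical population: one exam score per MBA student (made-up values).
rng = random.Random(42)  # fixed seed so the sketch is repeatable
population = [rng.randint(40, 100) for _ in range(1000)]

# Simple random sample: 50 individuals drawn without replacement
# (an assumed selection method, chosen only for illustration).
sample = rng.sample(population, 50)

print("population size:", len(population))
print("sample size:", len(sample))
```

Every value in the sample comes from the population, so results computed on the sample can be used to help answer questions about the whole group.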

Variables
A variable is a characteristic or condition that can change or take on different values. Most research begins with a general question about the relationship between two or more variables for a specific group of individuals. Sometimes a researcher is instead interested in an in-depth description of the characteristics of a single variable of interest.

Types of Variables
Variables can be classified as discrete or continuous. Discrete variables (such as class size) consist of indivisible categories. Continuous variables (such as time or weight) are measured in standard units and can be divided into infinitely many fractional parts, down to whatever unit a researcher chooses. For example, time can be measured to the nearest minute, second, half-second, etc.

Measuring Variables
To establish relationships between variables, researchers must observe the variables and record their observations. This requires that the variables be measured. The process of measuring a variable requires a set of categories called a scale of measurement and a process that classifies each individual into one category.

Data
The measurements obtained in a research study are called the data. The goal of statistics is to help researchers organize and interpret the data.

A data set can be classified according to the highest level of measurement that applies. The four levels of measurement, listed from lowest to highest, are:
1. Nominal
2. Ordinal
3. Interval
4. Ratio

Four Types of Measurement Scales (listed from highest to lowest level)

Ratio Data: Differences between measurements, true zero exists. Highest level (strongest form of measurement).
Interval Data: Differences between measurements but no true zero.
Ordinal Data: Ordered categories (rankings, order, or scaling).
Nominal Data: Categories (no ordering or direction). Lowest level (weakest form of measurement).

1. A nominal scale is an unordered set of categories identified only by name. Nominal measurements only permit you to determine whether two individuals are the same or different. The data are categories, names, labels, or qualities, and mathematical operations cannot be performed on them. Ex: type of car you drive, your major
2. An ordinal scale is an ordered set of categories. Ordinal measurements tell you the direction of difference between two individuals. Data can be arranged in order, so you can say one data entry is greater than another. Ex: TV ratings, condition of patient in hospital
3. An interval scale is an ordered series of equal-sized categories. Interval measurements identify the direction and magnitude of a difference. The zero point is located arbitrarily on an interval scale. Data can be ordered and differences between two entries can be calculated, but there is no inherent zero (a zero that means none). Ex: Temperature, year of birth
4. A ratio scale is an interval scale where a value of zero indicates none of the variable. Ratio measurements identify the direction and magnitude of differences and allow ratio comparisons of measurements. There is an inherent zero. Data can be ordered, differences can be found, and a ratio can be formed, so you can say one data value is a multiple of another. Ex: Height, weight, age
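The four scales above can be summarized by which comparisons each one supports. The following Python sketch encodes that ordering; the names LEVELS, MIN_LEVEL, and permits are hypothetical labels invented for this illustration. It also uses a Fahrenheit-to-Celsius conversion to show why ratios are not meaningful on an interval scale: the ratio changes when the unit changes.

```python
# Levels of measurement, ordered from lowest to highest.
LEVELS = ("nominal", "ordinal", "interval", "ratio")

# Lowest level at which each kind of comparison becomes meaningful.
# The comparison names are hypothetical labels chosen for this sketch.
MIN_LEVEL = {
    "same_or_different": "nominal",
    "order": "ordinal",
    "difference": "interval",
    "ratio": "ratio",
}

def permits(level, comparison):
    """Return True if data measured at `level` support `comparison`."""
    return LEVELS.index(level) >= LEVELS.index(MIN_LEVEL[comparison])

def fahrenheit_to_celsius(f):
    """Convert a temperature from degrees Fahrenheit to degrees Celsius."""
    return (f - 32) * 5 / 9

# Interval data have no inherent zero, so a ratio depends on the unit chosen.
ratio_in_f = 40 / 20
ratio_in_c = fahrenheit_to_celsius(40) / fahrenheit_to_celsius(20)

print(permits("interval", "difference"))
print(permits("interval", "ratio"))
print(ratio_in_f, round(ratio_in_c, 3))
```

The same two temperatures give a ratio of 2 in Fahrenheit but a negative ratio in Celsius, so "twice as hot" is not a meaningful statement for this kind of data.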

Levels of Measurement and Measurement Scales

EXAMPLES:

Ratio Data: Differences between measurements, true zero exists. Examples: Height, Age, Weekly Food Spending
Interval Data: Differences between measurements but no true zero. Examples: Temperature in Fahrenheit, Standardized exam score
Ordinal Data: Ordered categories (rankings, order, or scaling). Examples: Service quality rating, Standard & Poor's bond rating, Student letter grades
Nominal Data: Categories (no ordering or direction). Examples: Marital status, Type of car owned

Sub-Branches of Statistics

Descriptive Statistics
Descriptive statistics are methods for organizing and summarizing data. For example, tables or graphs are used to organize data, and descriptive values such as the average score are used to summarize data. A descriptive value for a population is called a parameter and a descriptive value for a sample is called a statistic.
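To make the distinction between a parameter and a statistic concrete, here is a small Python sketch using the average score as the descriptive value. The scores are made up, and the sample is picked by hand purely for illustration.

```python
from statistics import mean

# Hypothetical population of 6 scores (made-up values).
population = [2, 4, 6, 8, 10, 12]

# Hypothetical sample of 3 scores taken from that population.
sample = [2, 4, 12]

parameter = mean(population)  # descriptive value for the population
statistic = mean(sample)      # descriptive value for the sample

print("parameter (population mean):", parameter)
print("statistic (sample mean):", statistic)
```

The same calculation is applied in both cases; only the group it describes changes, and with it the name of the resulting value.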

Inferential Statistics
Inferential statistics are methods for using sample data to make general conclusions (inferences) about populations. Because a sample is typically only a part of the whole population, sample data provide only limited information about the population. As a result, sample statistics are generally imperfect representatives of the corresponding population parameters.

Sampling Error
The discrepancy between a sample statistic and its population parameter is called sampling error. Defining and measuring sampling error is a large part of inferential statistics.
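Sampling error can be observed directly in a simulation. The sketch below is an illustration under stated assumptions: the population is simulated from a normal distribution with made-up values, the mean is used as the descriptive value, and samples are drawn by simple random sampling. The helper name sampling_error is hypothetical.

```python
import random
from statistics import mean

rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Hypothetical population of 10,000 scores (simulated, made-up values).
population = [rng.gauss(70, 10) for _ in range(10_000)]
parameter = mean(population)  # population mean

def sampling_error(sample_size):
    """Draw one random sample and return (sample mean - population mean)."""
    sample = rng.sample(population, sample_size)
    return mean(sample) - parameter

# Repeat the sampling 200 times for a small and a large sample size.
errors_small = [sampling_error(10) for _ in range(200)]
errors_large = [sampling_error(400) for _ in range(200)]

mean_abs_small = mean(abs(e) for e in errors_small)
mean_abs_large = mean(abs(e) for e in errors_large)

print("typical error, 10 per sample: ", round(mean_abs_small, 2))
print("typical error, 400 per sample:", round(mean_abs_large, 2))
```

Each sample gives a slightly different error, and in this simulation the larger samples tend to land closer to the population mean.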

Notation
The individual measurements or scores obtained for a research participant will be identified by the letter X (or X and Y if there are multiple scores for each individual). The number of scores in a data set will be identified by N for a population or n for a sample. Summing a set of values is a common operation in statistics and has its own notation. The Greek letter sigma, Σ, will be used to stand for "the sum of." For example, ΣX identifies the sum of the scores.
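The notation maps directly onto a few lines of Python. In this sketch the scores are made up, and the variable names sum_X and sum_Y are simply stand-ins for the summation notation.

```python
# Hypothetical scores for a sample of 4 participants (made-up values).
X = [3, 1, 7, 4]
Y = [2, 6, 1, 5]  # a second score for each participant

n = len(X)      # number of scores in the sample
sum_X = sum(X)  # the sum of the X scores
sum_Y = sum(Y)  # the sum of the Y scores

print(n, sum_X, sum_Y)
```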
