Types of Statistics:
1. Descriptive statistics - Used to describe and synthesize data
2. Inferential statistics - Used to make inferences about the population based on sample data
Review: LEVELS OF MEASUREMENT
1. Nominal measurement: Involves assigning numbers to classify characteristics into categories
2. Ordinal measurement: Involves sorting objects based on their relative standing on an attribute
3. Interval measurement: Occurs when objects are rank-ordered on a scale that has equal distances between points on the scale
4. Ratio measurement: Occurs when there are equal distances between score units and there is a rational, meaningful zero
Descriptive Statistics: Frequency Distribution (to condense data): A systematic arrangement of numeric values on a variable from lowest to highest, with a count of the number of times each value was obtained.
Frequency distributions can be described in terms of:
- Shape
- Central tendency (measures of central tendency are also descriptive statistics)
- Variability (indexes of variability, such as the range and percentiles, are also descriptive statistics)
Construction of Frequency Distribution:
- Can be presented in tabular form (counts and percentages)
- Can be presented graphically: histograms, frequency polygons
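As a minimal sketch of the tabular and graphical forms described above, the following builds a frequency distribution (counts and percentages) and a crude text histogram. The scores and the function name `frequency_distribution` are illustrative assumptions, not part of the original notes.

```python
from collections import Counter

# Hypothetical scores on one variable (illustrative data only).
scores = [2, 3, 3, 3, 4, 5, 6, 7, 8, 9]

def frequency_distribution(values):
    """Return (value, count, percent) rows ordered from lowest to highest value."""
    counts = Counter(values)
    n = len(values)
    return [(v, counts[v], 100 * counts[v] / n) for v in sorted(counts)]

table = frequency_distribution(scores)

# Tabular form (counts and percentages) plus a text histogram of asterisks.
print("Value  Count  Percent  Histogram")
for value, count, percent in table:
    print(f"{value:>5}  {count:>5}  {percent:>6.1f}%  {'*' * count}")
```

Each row of asterisks plays the role of one histogram bar; a plotting library would be the usual choice for a real histogram or frequency polygon.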
Shapes of Frequency Distribution
1. Symmetry
   - Symmetric
   - Skewed (asymmetric): positive skew (long tail points to the right) or negative skew (long tail points to the left)
2. Peakedness (how sharp the peak is)
3. Modality (number of peaks)
   - Unimodal (1 peak)
   - Bimodal (2 peaks)
   - Multimodal (2+ peaks)
Note: A distribution's shape can be described on all three dimensions at once; a normal (bell-shaped) distribution, for example, is symmetric, unimodal, and neither too peaked nor too flat.
MEASURES OF CENTRAL TENDENCY: Indexes of the "typicalness" of a set of scores, taken from the center of the distribution
1. Mode - the most frequently occurring score in a distribution; useful mainly as a gross descriptor, especially of nominal measures
   2 3 3 3 4 5 6 7 8 9   Mode = 3
2. Median - the point in a distribution above which and below which 50% of cases fall; useful mainly as a descriptor of the typical value when the distribution is skewed
   2 3 3 3 4 5 6 7 8 9   Median = 4.5
3. Mean - the sum of all scores divided by the total number of scores (the average); the most stable and widely used indicator of central tendency
   2 3 3 3 4 5 6 7 8 9   Mean = 5.0
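The three worked values above can be checked with Python's standard `statistics` module. This is only a verification sketch using the ten scores from the notes.

```python
import statistics

# The ten scores used in the worked examples above.
scores = [2, 3, 3, 3, 4, 5, 6, 7, 8, 9]

mode = statistics.mode(scores)      # most frequently occurring score
median = statistics.median(scores)  # midpoint of the two middle scores (4 and 5)
mean = statistics.mean(scores)      # sum of scores (50) divided by N (10)

print(f"Mode = {mode}, Median = {median}, Mean = {mean:.1f}")
```

With an even number of scores, the median is the average of the two middle values, which is why it is not itself one of the observed scores here.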
Scale    Adjectival Description*    Interpretation**
5
4
3
2
1
* Adjectival Description: as used in the questionnaire; could be in terms of agreeability, frequency, etc.
** Interpretation: the answer or interpretation, based on the SOP
VARIABILITY OF DISTRIBUTIONS: The degree to which scores in a distribution are spread out or dispersed
- Homogeneity: little variability
- Heterogeneity: great variability
INDEXES OF VARIABILITY
- Range: highest value minus lowest value
- Standard deviation (SD): the average amount by which scores in a distribution deviate from the mean
- Variance: the standard deviation squared
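A short sketch of the three indexes of variability, reusing the ten scores from the central tendency examples. It assumes the sample formulas (dividing by N - 1), which is what `statistics.stdev` and `statistics.variance` compute; the notes do not say whether sample or population formulas are intended.

```python
import statistics

scores = [2, 3, 3, 3, 4, 5, 6, 7, 8, 9]

value_range = max(scores) - min(scores)  # highest value minus lowest value
sd = statistics.stdev(scores)            # sample SD (N - 1 in the denominator)
variance = statistics.variance(scores)   # sample variance, i.e. SD squared

print(f"Range = {value_range}")
print(f"SD = {sd:.2f}")
print(f"Variance = {variance:.2f}")
```

`statistics.pstdev` and `statistics.pvariance` give the population versions (dividing by N) if those are what the analysis calls for.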
CONTINGENCY TABLE (CROSS-TABS)
- A two-dimensional frequency distribution; frequencies of two variables are cross-tabulated
- Cells at the intersection of rows and columns display counts and percentages
- Variables must be nominal or ordinal
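The cross-tabulation described above can be sketched with a `Counter` keyed on pairs of categories. The records, category labels, and the function name `crosstab` are hypothetical and chosen only for illustration.

```python
from collections import Counter

# Hypothetical (group, outcome) pairs for 10 participants; illustrative data only.
records = [
    ("treatment", "improved"), ("treatment", "improved"), ("treatment", "improved"),
    ("treatment", "not improved"), ("treatment", "improved"),
    ("control", "improved"), ("control", "not improved"), ("control", "not improved"),
    ("control", "improved"), ("control", "not improved"),
]

def crosstab(pairs):
    """Count how often each (row category, column category) combination occurs."""
    return Counter(pairs)

cells = crosstab(records)
rows = sorted({row for row, _ in records})
cols = sorted({col for _, col in records})

# Print each cell as a count plus its percentage of the row total.
print(f"{'':<12}" + "".join(f"{c:>20}" for c in cols))
for r in rows:
    row_total = sum(cells[(r, c)] for c in cols)
    line = f"{r:<12}"
    for c in cols:
        count = cells[(r, c)]
        cell = f"{count} ({100 * count / row_total:.0f}%)"
        line += f"{cell:>20}"
    print(line)
```

Row percentages are one choice among several; column or total percentages may suit other research questions better.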
CORRELATION PROCEDURES
- Indicate the direction and magnitude of the relationship between two variables
- Used with ordinal, interval, or ratio measures
- Can be shown graphically (scatter plot)
- A correlation coefficient (usually Pearson's r) can be computed
- With multiple variables, a correlation matrix can be displayed
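To make the correlation coefficient concrete, here is a minimal sketch that computes Pearson's r from its definition (the sum of cross-products of deviations divided by the square root of the product of the sums of squared deviations). The data and the function name `pearson_r` are illustrative assumptions.

```python
import math

def pearson_r(x, y):
    """Pearson's r for two equal-length sequences of interval/ratio scores."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of cross-products and sums of squared deviations from the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)  # undefined if either variable is constant

# Hypothetical paired scores (illustrative only).
hours_studied = [1, 2, 3, 4, 5]
quiz_score = [2, 4, 5, 4, 5]

r = pearson_r(hours_studied, quiz_score)
print(f"r = {r:.2f}")  # the sign gives the direction, the absolute value the magnitude
```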
DESCRIBING RISK
- Absolute risk: the proportion of people who experienced an undesirable outcome in each group
- Absolute risk reduction (ARR): the estimated proportion of people who would be spared from an adverse outcome through exposure to an intervention
- Relative risk (RR): the estimated proportion of the original risk of an adverse outcome that persists among people exposed to the intervention
- Relative risk reduction (RRR): the estimated proportion of untreated risk that is reduced through exposure to the intervention
- Odds ratio (OR): the ratio of the odds for the treated versus the untreated group, with the odds reflecting the proportion of people with the adverse outcome relative to those without it
USING INFERENTIAL STATISTICS TO TEST HYPOTHESES: A means of drawing conclusions about a population (i.e., estimating population parameters), given data from a sample; based on laws of probability
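The risk indexes defined under DESCRIBING RISK above can be sketched as simple arithmetic on a 2 x 2 table. The counts and the function name `risk_indexes` are hypothetical, chosen only to make the calculations easy to follow.

```python
def risk_indexes(treated_adverse, treated_total, untreated_adverse, untreated_total):
    """Compute risk indexes from counts of adverse outcomes in two groups."""
    ar_treated = treated_adverse / treated_total        # absolute risk, treated group
    ar_untreated = untreated_adverse / untreated_total  # absolute risk, untreated group
    arr = ar_untreated - ar_treated                     # absolute risk reduction
    rr = ar_treated / ar_untreated                      # relative risk
    rrr = arr / ar_untreated                            # relative risk reduction
    # Odds = number with the adverse outcome relative to the number without it.
    odds_treated = treated_adverse / (treated_total - treated_adverse)
    odds_untreated = untreated_adverse / (untreated_total - untreated_adverse)
    odds_ratio = odds_treated / odds_untreated
    return {
        "AR treated": ar_treated, "AR untreated": ar_untreated,
        "ARR": arr, "RR": rr, "RRR": rrr, "OR": odds_ratio,
    }

# Hypothetical trial: 20 of 100 treated and 40 of 100 untreated people had the adverse outcome.
result = risk_indexes(20, 100, 40, 100)
for name, value in result.items():
    print(f"{name}: {value:.3f}")
```

In this made-up example the relative risk (.50) and the odds ratio (.375) differ noticeably, which is typical when the outcome is common.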
Sampling Distribution of the Mean: A theoretical distribution of means for an infinite number of samples drawn from the same population.
Characteristics:
(1) Is normally distributed (approximately so for sufficiently large samples, even when the population itself is not normal)
(2) Has a mean that equals the population mean
(3) Has a standard deviation (SD) called the standard error of the mean (SEM)
(4) SEM is estimated from a sample SD and the sample size (SEM = SD divided by the square root of N)
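As a small sketch of how the SEM is estimated from a sample SD and the sample size, the following divides the sample SD by the square root of N. The data are the ten illustrative scores used earlier in these notes, and the function name is an assumption.

```python
import math
import statistics

def standard_error_of_mean(sample):
    """Estimate the SEM as the sample SD divided by the square root of N."""
    return statistics.stdev(sample) / math.sqrt(len(sample))

scores = [2, 3, 3, 3, 4, 5, 6, 7, 8, 9]
sem = standard_error_of_mean(scores)
print(f"SEM = {sem:.2f}")
```

Because N is in the denominator, larger samples yield a smaller SEM, meaning sample means cluster more tightly around the population mean.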
Sampling Distribution
1. Estimation of parameters: Used to estimate a single parameter (e.g., a population mean). Two forms of estimation:
   (1) Point estimation - Calculating a single statistic to estimate the population parameter (e.g., the mean birth weight of infants born in the U.S.)
   (2) Interval estimation - Calculating a range of values within which the parameter has a specified probability of lying. A confidence interval (CI) is constructed around the point estimate; the upper and lower limits are confidence limits.
2. Hypothesis testing (more common): Based on rules of negative inference: research hypotheses are supported if null hypotheses can be rejected.
   - Involves statistical decision making to either accept (more precisely, fail to reject) the null hypothesis or reject the null hypothesis
   - Researchers compute a test statistic with their data, then determine whether the statistic falls within the critical region of the relevant theoretical distribution (i.e., at or beyond the critical value)
   - If the value of the test statistic indicates that the null hypothesis is improbable, the result is statistically significant
   - A nonsignificant result means that any observed difference or relationship could have resulted from chance fluctuations
   Two types of incorrect decisions:
   (1) Type I error (false positive): a null hypothesis is rejected when it should not be rejected. The risk of a Type I error is controlled by the level of significance (alpha), e.g., α = .05 or .01.
   (2) Type II error (false negative): failure to reject a null hypothesis when it should be rejected
Two-tailed tests: Hypothesis testing in which both ends of the sampling distribution are used to define the region of improbable values
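Returning to interval estimation, a confidence interval can be sketched as the point estimate plus or minus a critical value times the SEM. This sketch assumes a large sample, so a normal (z) critical value is used; with small samples a t critical value would be more appropriate. The birth-weight figures are hypothetical.

```python
import math
from statistics import NormalDist

def confidence_interval(sample_mean, sample_sd, n, level=0.95):
    """Return (lower, upper) confidence limits using a normal approximation."""
    sem = sample_sd / math.sqrt(n)
    # z value that leaves (1 - level) / 2 in each tail, e.g. about 1.96 for 95%.
    z = NormalDist().inv_cdf(0.5 + level / 2)
    margin = z * sem
    return sample_mean - margin, sample_mean + margin

# Hypothetical sample: mean birth weight 3400 g, SD 500 g, N = 100 infants.
lower, upper = confidence_interval(3400, 500, 100)
print(f"95% CI: {lower:.1f} g to {upper:.1f} g")
```

The point estimate (3400 g) sits at the center of the interval, and the two printed values are the confidence limits.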
Critical Regions in the Sampling Distribution for a Two-Tailed Test: IVF Attitudes Example
One-tailed tests: The critical region of improbable values is entirely in one tail of the distribution, the tail corresponding to the direction of the hypothesis
Critical Region in the Sampling Distribution for a One-Tailed Test: IVF Attitudes Example
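The difference between one-tailed and two-tailed critical regions can be sketched numerically. This assumes the test statistic follows a standard normal (z) distribution, which may not match the sampling distribution used in the IVF attitudes example; the function names are illustrative.

```python
from statistics import NormalDist

def critical_z(alpha, tails):
    """Critical z value for a one-tailed (upper tail) or two-tailed test."""
    if tails == 2:
        return NormalDist().inv_cdf(1 - alpha / 2)  # alpha is split across both tails
    if tails == 1:
        return NormalDist().inv_cdf(1 - alpha)      # all of alpha sits in one tail
    raise ValueError("tails must be 1 or 2")

def is_significant(z, alpha=0.05, tails=2):
    """True if z falls in the critical region (upper tail for one-tailed tests)."""
    cutoff = critical_z(alpha, tails)
    return abs(z) >= cutoff if tails == 2 else z >= cutoff

print(f"Two-tailed critical z at alpha = .05: +/-{critical_z(0.05, 2):.3f}")
print(f"One-tailed critical z at alpha = .05: {critical_z(0.05, 1):.3f}")
print("z = 1.80, two-tailed:", is_significant(1.80, tails=2))
print("z = 1.80, one-tailed:", is_significant(1.80, tails=1))
```

The same z of 1.80 is nonsignificant in a two-tailed test but significant in a one-tailed test in the hypothesized direction, which is why the choice must be made before looking at the data.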
HYPOTHESIS TESTING PROCEDURE:
1. Select an appropriate test statistic
2. Establish the level of significance (e.g., α = .05)
3. Select a one-tailed or a two-tailed test
4. Compute the test statistic with actual data
5. Calculate degrees of freedom (df) for the test statistic
6. Obtain a tabled value for the statistical test
7. Compare the test statistic to the tabled value
8. Make the decision to accept (i.e., fail to reject) or reject the null hypothesis

COMPARISON BETWEEN PARAMETRIC AND NON-PARAMETRIC STATISTICS
PARAMETRIC: Involve several assumptions (e.g., that variables are normally distributed in the population)
NON-PARAMETRIC (Distribution-free Statistics): Have less restrictive assumptions about the shape of the variables' distribution than parametric tests
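The eight-step hypothesis testing procedure above can be traced with a parametric example: an independent-groups t-test computed by hand. The two groups are hypothetical, the pooled-variance formula assumes equal population variances, and the critical value 2.306 is the standard tabled t for df = 8, two-tailed, α = .05.

```python
import math
import statistics

# Hypothetical scores for two independent groups (illustrative only).
group_a = [5, 6, 7, 8, 9]
group_b = [2, 3, 4, 5, 6]

# Steps 1-3: independent-groups t-test, alpha = .05, two-tailed.
alpha = 0.05

# Step 4: compute the test statistic using the pooled variance.
n_a, n_b = len(group_a), len(group_b)
var_a, var_b = statistics.variance(group_a), statistics.variance(group_b)
pooled_var = ((n_a - 1) * var_a + (n_b - 1) * var_b) / (n_a + n_b - 2)
std_error = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
t_stat = (statistics.mean(group_a) - statistics.mean(group_b)) / std_error

# Step 5: degrees of freedom.
df = n_a + n_b - 2

# Step 6: tabled critical t for df = 8, two-tailed, alpha = .05 (from a t table).
tabled_t = 2.306

# Steps 7-8: compare and decide.
reject_null = abs(t_stat) >= tabled_t
print(f"t({df}) = {t_stat:.2f}, tabled value = {tabled_t}")
print("Decision:", "reject the null hypothesis" if reject_null else "fail to reject the null hypothesis")
```

Statistical software typically reports an exact p value instead of requiring a tabled value, but the decision rule is the same.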
COMMONLY USED BIVARIATE STATISTICAL TESTS (INFERENTIAL)
1. t-Test: Tests the difference between two means
   - t-Test for independent groups (between subjects)
   - t-Test for dependent groups (within subjects)
2. Analysis of variance (ANOVA): Tests the difference between 3+ means
   - One-way ANOVA
   - Multifactor (e.g., two-way) ANOVA
   - Repeated measures ANOVA (within subjects)
3. Pearson's r: A parametric test of correlation; tests that the relationship between two variables is not zero; used when measures are on an interval or ratio scale
4. Chi-square test: Tests the difference in proportions in categories within a contingency table; a nonparametric test
Power Analysis: A method of reducing the risk of Type II errors and estimating their occurrence. With power = .80, the risk of a Type II error (β) is 20%. The method is frequently used to estimate how large a sample is needed to reliably test hypotheses.
Four components in a power analysis:
(1) Significance criterion (α)
(2) Sample size (N)
(3) Population effect size: the magnitude of the relationship between research variables
(4) Power: the probability of obtaining a significant result (1 - β)
Note: CHAPTER 23 is not discussed, as it is not commonly used by student researchers; you may still read it.
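To illustrate how the four components of a power analysis fit together, the sketch below solves for sample size given the other three. It assumes a two-tailed comparison of two independent group means, a standardized effect size (difference in means divided by the SD), and a normal approximation; dedicated power software based on the t distribution gives slightly larger answers. The function name is hypothetical.

```python
import math
from statistics import NormalDist

def sample_size_per_group(effect_size, alpha=0.05, power=0.80):
    """Approximate N per group for a two-tailed, two-group comparison of means."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # significance criterion
    z_power = NormalDist().inv_cdf(power)          # power = 1 - beta
    n = 2 * ((z_alpha + z_power) / effect_size) ** 2
    return math.ceil(n)  # round up to a whole participant

# With alpha = .05 and power = .80 (beta = .20), a medium standardized effect of .50:
n_medium = sample_size_per_group(0.50)
# A larger effect is easier to detect, so fewer participants are needed:
n_large = sample_size_per_group(0.80)
print(f"N per group, effect size .50: {n_medium}")
print(f"N per group, effect size .80: {n_large}")
```

Fixing any three of the four components determines the fourth, which is why the same calculation can also be rearranged to estimate power for a sample size that is already set.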