Uploaded by Alicia Reynolds

Proportions Activity PDF Notes


Slide One:

Voiceover:

Harrigan University is a liberal arts university in the Midwest that attempts to attract the highest-quality students, especially from its region of the country. Harrigan is concerned that it is not getting enough of the best students, and worse yet, that many of these best students are going to Harrigan’s main rival.

Slide Two:

Text on screen:

• HS Sports: number of varsity letters the applicant earned

• HS Size: number of students in the applicant’s graduating class

• MainRival: whether the applicant enrolls at Harrigan’s main rival university

• HSGPA: applicant’s high school GPA

• SAT: applicant’s combined SAT score

• HSClubs: number of high school clubs in which the applicant served as an officer

• HSPctile: applicant’s percentile (in terms of GPA) in his or her graduating class

• Combined Score: a confidential internal combined score for the applicant, used by Harrigan to rank applicants

Voiceover:

Gertrude Cox has been tasked with investigating this concern and has gathered data on 178 applicants who were accepted by Harrigan (a random sample from all acceptable applicants over the past several years). The data contain a range of information about the applicants, including whether each applicant accepted Harrigan’s offer to enroll. Cox’s first step is to estimate the fraction of highly qualified applicants who accept the offer, which she does in Excel.

Slide Three:

Voiceover:

Realizing that the resulting value, .579, is only a point estimate, Cox wants to build a confidence interval around it. However, whether or not an applicant chooses to attend Harrigan is not a numerical outcome but a binary choice. Therefore, .579 is not a sample mean but a sample proportion, and she will need to explore how to build confidence intervals around sample proportion estimators.

Text on screen:

$E[\hat{p}] = p$

$\sigma_{\hat{p}} = SE(\hat{p}) = \sqrt{p(1-p)/n}$

Sample Mean

• Point estimator of $\mu$

• Distribution: Normal if $n$ is large enough

• Standard error: $\sigma_{\bar{X}} = SE(\bar{X}) = \sigma/\sqrt{n}$

Sample Proportion

• Point estimator of $p$

• Distribution: Normal if $np > 5$ and $n(1-p) > 5$

• Standard error: $\sigma_{\hat{p}} = SE(\hat{p}) = \sqrt{p(1-p)/n}$

Since $\hat{p}$ has a normal distribution we can apply the same methodology and mechanics to build confidence intervals as we did with the sample mean.

$\hat{p} \pm (z\text{-multiple}) \sqrt{\hat{p}(1-\hat{p})/n}$

Voiceover:

When records can be classified into one of two categories, “success” or “failure”, the appropriate analysis is that of the sample proportion. In the case of Harrigan University, a success is a student who accepted an offer of admission and a failure is a student who decided to go somewhere else.

Let p denote the proportion of successes in the population. It is a parameter which is unknown and must be estimated. If a random sample of size n is drawn from this population, then we can let p-hat denote the proportion of successes in the sample. P-hat is a random variable, because its value will differ slightly from one random sample to the next.

It can be shown that for sufficiently large n, the sampling distribution of p-hat is approximately normal with mean p (that is, the expected value of the sample proportion is the true population proportion) and standard error equal to the square root of p times one minus p divided by the sample size n.
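The sampling behaviour of p-hat described here can be checked with a small simulation. The sketch below is illustrative only: the function name `simulate_sample_proportions`, the assumed population proportion of 0.6, and the number of repeated samples are not from the case and were chosen for convenience (in practice p is unknown). It draws many random samples of size 178 and compares the mean and spread of the resulting p-hat values with p and with the square root of p times one minus p divided by n.

```python
import random
import statistics
from math import sqrt


def simulate_sample_proportions(p, n, draws, seed=0):
    """Draw `draws` random samples of size n and return the p-hat of each one."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    p_hats = []
    for _ in range(draws):
        # Each record is a "success" with probability p.
        successes = sum(1 for _ in range(n) if rng.random() < p)
        p_hats.append(successes / n)
    return p_hats


# Assumed values for illustration only; the true p is unknown in practice.
p, n = 0.6, 178
p_hats = simulate_sample_proportions(p, n, draws=5000)

theoretical_se = sqrt(p * (1 - p) / n)
print(f"mean of p-hat:      {statistics.mean(p_hats):.4f} (theory: {p})")
print(f"std. dev. of p-hat: {statistics.stdev(p_hats):.4f} (theory: {theoretical_se:.4f})")
```

The simulated mean should sit close to p and the simulated standard deviation close to the theoretical standard error, which is what the normal approximation relies on.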

But what do we mean by sufficiently large? As a rule of thumb, the approximation holds if both n times p and n times one minus p are greater than five. That is, if p is either very small or very large we need a bigger sample for the distribution to be approximately normal, but for populations with p close to .5 we could get away with samples only slightly larger than ten.
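The rule of thumb is easy to express as a check. This is a minimal sketch; the function name `normal_approximation_ok` is hypothetical, and the threshold of five is the one stated in the voiceover (other texts use stricter cutoffs).

```python
def normal_approximation_ok(n, p, threshold=5):
    """Rule of thumb: both n*p and n*(1-p) must exceed the threshold."""
    return n * p > threshold and n * (1 - p) > threshold


print(normal_approximation_ok(178, 0.579))  # True: about 103 and 75, both well above 5
print(normal_approximation_ok(10, 0.5))     # False: 10 * 0.5 = 5 is not strictly greater than 5
print(normal_approximation_ok(11, 0.5))     # True: 5.5 and 5.5
print(normal_approximation_ok(100, 0.02))   # False: 100 * 0.02 = 2, so p is too extreme for n = 100
```

In the Harrigan case the sample proportion stands in for the unknown p when applying the check.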

There are a lot of parallels between the sample mean and the sample proportion.

Both are point estimators: the sample mean of mu, a population average, and the sample proportion of p, a population proportion.

The sample mean is approximately normally distributed if n is large enough, and the sample proportion is also approximately normally distributed if both n times p and n times one minus p are greater than 5.

The standard errors of both estimators are generally estimated from the sample. That is, for sigma in the standard error of the sample mean we substitute s, the sample standard deviation [make sigma change to s], and in the standard error of the sample proportion we substitute p-hat for p [make p change to p-hat].

The basic formula for any confidence interval is the point estimator plus or minus the margin of error. As before, we break down the margin of error into a multiple times the standard error of the point estimator.

Now our point estimator is p-hat. Our multiple is the z-multiple, because p-hat is approximately normally distributed; we can find this value in Excel or let StatTools do the calculations. The standard error is given by the square root of p times one minus p divided by n.

The standard error formula for p-hat, and therefore the confidence interval formula, contains the unknown parameter p. As a result, we approximate the standard error of p-hat by substituting p-hat for p in the formula.

Slide Four:

Text on screen:

Cox wants to build a 95% confidence interval around the point estimate of the acceptance rate.

$\hat{p} \pm (z\text{-multiple}) \sqrt{\hat{p}(1-\hat{p})/n}$

• $\hat{p}$ = 0.579

• z-multiple = 1.96

• $n$ = 178
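As a rough cross-check of the slide's inputs, the interval can be computed outside Excel or StatTools. The sketch below uses only the Python standard library; the function name `proportion_confidence_interval` is hypothetical, and the z-multiple is obtained from the standard normal distribution instead of being typed in as 1.96. Because p-hat is entered here rounded to 0.579, the last digit may differ slightly from what a tool working with the unrounded data reports.

```python
from math import sqrt
from statistics import NormalDist


def proportion_confidence_interval(p_hat, n, confidence=0.95):
    """Normal-approximation confidence interval for a population proportion."""
    # Two-sided z-multiple, e.g. about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # Estimated standard error: p-hat substituted for the unknown p.
    se = sqrt(p_hat * (1 - p_hat) / n)
    margin = z * se
    return p_hat - margin, p_hat + margin


lower, upper = proportion_confidence_interval(0.579, 178)
print(f"95% CI: {lower:.3f} to {upper:.3f}")
```

The result should fall within rounding of an interval running from roughly 0.51 to 0.65.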

Slide Five:

Voiceover:

Any time we use StatTools we first need to define the data set, and we do that in the Data Set Manager. As the cursor was placed inside the data, StatTools detects the data range and [on selecting Yes] we select Yes. Since all the variables look correct, we [on selecting OK] select OK.

Now we are ready to construct the confidence interval. Again [on selecting StatTools] we select StatTools, and under [on selecting Statistical Inference] Statistical Inference we find Confidence Interval and then select [on selecting Proportion] Proportion.

First we select the data that we are constructing the interval for, in our case [on checking accepted] Accepted. We want to analyze the proportion that accepted, so we designate Yes [on selecting yes] as the category to analyze. We want a 95% confidence interval [on circling the CI], which is the default value, so we can press OK.

And here is the resulting confidence interval. We notice that the estimated sample proportion is 0.579, which we knew, and [on highlighting cells] the lower and upper limits of the interval are 0.506 and 0.651. That is wide for a proportion, so let's calculate the margin of error [on writing MOE]. The margin of error is half the width of the interval, so we subtract the lower limit from the upper limit, which gives us the width of the interval, and [on dividing by 2 in the formula] divide by 2.
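The same arithmetic can be written as a one-line function. This is only a sketch of the spreadsheet step described in the voiceover; the function name `margin_of_error` is hypothetical, and the limits are the rounded values read from the StatTools output.

```python
def margin_of_error(lower, upper):
    """Half the width of a symmetric confidence interval."""
    return (upper - lower) / 2


moe = margin_of_error(0.506, 0.651)
print(f"margin of error: {moe:.4f}")  # 0.0725, i.e. about 7.25 percentage points
```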

Slide Six:

Text on screen:

• Increase the sample size

$MOE = (z\text{-multiple}) \sqrt{p(1-p)/n}$

$\frac{MOE}{z\text{-multiple}} = \sqrt{p(1-p)/n}$

$\left(\frac{MOE}{z\text{-multiple}}\right)^2 = \frac{p(1-p)}{n}$

$n = \left(\frac{z\text{-multiple}}{MOE}\right)^2 p(1-p)$

Voiceover:

Cox is not happy with the width of the confidence interval; she would like a tighter interval. To achieve this she has two options: reduce the confidence level or increase the sample size. For Cox, reducing the confidence level is really not an option, so she decides to increase the sample size to achieve a margin of error of no more than plus or minus 5 percentage points. How large a data set does she need?

Given a confidence level and a margin of error (MOE), what is the sample size that we should draw for estimating p?

The formula for the margin of error is the multiple times the standard error of the estimate. Now we want to determine n, so we need to use algebra to isolate n in this equation.

First we divide both sides by the z-multiple and square both sides, which leaves the Margin of Error divided by the z-multiple, all squared, equal to p times one minus p divided by n. The final step is to move n to the left-hand side and the Margin of Error divided by the z-multiple, all squared, to the right-hand side, which gives n equal to the z-multiple divided by the Margin of Error, all squared, times p times one minus p.

The formula for n will usually return a fractional value, but since we cannot draw a fraction of an observation, we round up to the next whole number.

There is an additional challenge in applying this equation. Prior to polling we don’t know p and therefore cannot substitute it into the formula.

One approach is to use the worst-case scenario and substitute .5 for p, as p(1-p) is maximized when p is .5.
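The sample size formula, with rounding up and the worst-case default for p, can be sketched as follows. The function name `required_sample_size` is hypothetical, and the second call, which plugs in the earlier estimate of 0.579 as a planning value, is an illustrative assumption and not a step taken in the case.

```python
from math import ceil
from statistics import NormalDist


def required_sample_size(moe, confidence=0.95, p=0.5):
    """Smallest n whose margin of error is at most `moe`.

    p defaults to 0.5, the worst case, because p * (1 - p) is largest there.
    """
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    n = (z / moe) ** 2 * p * (1 - p)
    return ceil(n)  # fractional results are rounded up


print(required_sample_size(0.05))           # 385 with the worst-case p = 0.5
print(required_sample_size(0.05, p=0.579))  # 375 if 0.579 is used as a planning value
```

Using the worst case guarantees the target margin of error whatever the true p turns out to be, at the cost of a slightly larger sample.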

Slide Seven:

Text on screen:

Cox wants to build a 95% confidence interval with a margin of error of no more than 5%.

Slide Eight:

Text on screen:

Voiceover:

Speaking with seasoned admissions staff, Cox gets the feeling that things have been getting increasingly worse in the past few years.

A comprehensive study conducted 3 years ago on all admissions in previous years showed the acceptance rate of highly qualified applicants to be 65%.

Slide Nine:

Text on screen:

1. Construct H0 and H1

2. Determine one-tailed or two-tailed and appropriate significance level

3. Select appropriate parameter settings in StatTools and run analysis

4. Interpret results

Voiceover:

Cox decides to conduct a hypothesis test at the 5% level of significance with the goal of proving the overall feeling of the admissions department correct. In this case, the same mechanism applies as with hypothesis testing for the mean. What has changed is that now we are applying hypothesis testing to a proportion, while before we were testing a mean.

Slide Ten:

Text on screen:

• Step 1

• Step 2

• Step 3

• Step 4

What is the status quo? The acceptance rate has not gone down

A. H1: The acceptance rate < .65

B. H1: The acceptance rate ≠ .65

C. H1: The acceptance rate > .65

Correct: Correct! The alternative hypothesis is what Cox is trying to prove, that is, that the acceptance rate has gone down. The null hypothesis represents the status quo, i.e. that the acceptance rate has not gone down. H0: the acceptance rate ≥ .65

Incorrect: Incorrect, Cox is not trying to prove merely that the acceptance rate has changed, but that it has gone down.

Incorrect: Incorrect, Cox’s aim is to reject the null hypothesis, which is a strong statement in favor of the alternative. You should therefore set the alternative to be the statistical statement you are trying to prove.

Voiceover:

The significance level, 5%, was set by Cox, and we are conducting a one-tailed test, as only evidence in one extreme is evidence against the null hypothesis. That is, only very low values of p-hat will be evidence against the null hypothesis that the acceptance rate has not declined.
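For readers without StatTools, a lower-tailed z-test for a proportion can be sketched with the standard library. This is an assumption-laden illustration, not the StatTools procedure: the function name `lower_tail_proportion_test` is hypothetical, the standard error is computed under the null value of .65, and StatTools may use a different convention, so its reported p-value could differ somewhat.

```python
from math import sqrt
from statistics import NormalDist


def lower_tail_proportion_test(p_hat, p0, n):
    """One-tailed z-test of H0: p >= p0 against H1: p < p0."""
    se0 = sqrt(p0 * (1 - p0) / n)      # standard error assuming H0 holds at p0
    z = (p_hat - p0) / se0             # how many standard errors p-hat lies below p0
    p_value = NormalDist().cdf(z)      # lower-tail probability
    return z, p_value


z, p_value = lower_tail_proportion_test(p_hat=0.579, p0=0.65, n=178)
print(f"z = {z:.2f}, p-value = {p_value:.3f}")
print("reject H0" if p_value < 0.05 else "fail to reject H0")
```

Only a p-hat well below .65 produces a small p-value here, which is what makes the test one-tailed.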

Step 4: Interpret Results

Comparing the p-value to the significance level, we ______ the null hypothesis, meaning we ________

A. Accept … we do not have sufficient evidence to conclude that the acceptance rate is lower.

B. Accept … we do have sufficient evidence to claim that the acceptance rate has stayed the same.

C. Reject … we do not have sufficient evidence to conclude that the proportion of students accepting the offer has gone down.

D. Reject … we conclude that the proportion of students accepting has gone down.

Incorrect: Since the p-value is smaller than the significance level, we can reject the null hypothesis. The low p-value indicates that it would be unlikely to observe these data if the null hypothesis were true.

Incorrect: Since the p-value is smaller than the significance level, we can reject the null hypothesis. The low p-value indicates that it would be unlikely to observe these data if the null hypothesis were true.

Incorrect: We can reject the null hypothesis, and as a result we can conclude that we do have sufficient evidence that the proportion of students accepting the offer has gone down.

Correct!

Slide Fifteen:

Voiceover: The data indicate that the acceptance rate of highly qualified students has gone down in recent years, a worrisome fact for Harrigan University. Cox realizes that she now has a number of other questions she would like to answer, including: Has the composition of student applicants changed? Has the acceptance rate gone down equally across all students, or, for example, does it differ across the Combined Score ranking? It is clear that more analysis is needed to search for an explanation for the decline and for insights into the current admission trends.
