
But First, Lemme Take a #Snapchat Selfie!

By: Megan and Kyran

For the stats chapter 4 project, we sent out a survey asking people two questions:
how many hours they spend on their phone in a day, and how many snapchats they send
in a day. We expected these two variables to have a strong positive correlation.
This turned out not to be the case once we collected the data and analysed it. Our
explanatory variable was the hours spent on the phone per day, and the response
variable was the number of snapchats sent per day.

Our scatter plot was less than impressive when we generated it. We expected our data
to have a strong correlation; however, it was very weak once we saw all of the data.
Our correlation coefficient was 0.2651456467. A correlation coefficient is a number
between -1 and 1 that measures the strength and direction of the linear relationship
between two variables. A value this close to 0 means that hours on the phone and
snapchats sent were only weakly linearly related in our data.
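As a rough illustration of how a correlation coefficient like ours is calculated, here is a minimal sketch of the Pearson formula in Python. The function name `pearson_r` and the toy numbers are made up for this example. They are not our survey responses, so the printed value is not our 0.2651456467.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of products and squares of the deviations from the means
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return sxy / math.sqrt(sxx * syy)

# Hypothetical toy values for illustration only, NOT the survey data
toy_hours = [1, 2, 3, 4, 5]
toy_snaps = [10, 40, 20, 60, 30]
print(round(pearson_r(toy_hours, toy_snaps), 3))
```

The result is always between -1 and 1. Values near 0, like ours, mean the points do not line up well along any straight line.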

Our results for X bar and Y bar were as follows: (4.43, 129.12). These numbers are
the averages of the X and Y values in our data set. This point falls on the
regression line on our graph as well, as it does for any least-squares regression
line. Our regression equation is y=17.6x+51. The marginal change for our data (the a
value, or slope, in the equation) is 17.6. This means that for every additional hour
a person spends on their phone, the line predicts that they will send about 18 more
snapchats.
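The two claims above can be checked with a short sketch that uses only the rounded numbers we reported. The names `SLOPE`, `INTERCEPT` and `predict` are just labels chosen for this example. Because the reported values are rounded, a small gap between the line and (X bar, Y bar) is expected.

```python
# Rounded values taken from our report
SLOPE = 17.6        # snapchats per additional hour on the phone
INTERCEPT = 51.0
X_BAR, Y_BAR = 4.43, 129.12

def predict(hours):
    """Predicted snapchats per day from the regression equation y = 17.6x + 51."""
    return SLOPE * hours + INTERCEPT

# The regression line should pass through (X bar, Y bar), up to rounding
gap = Y_BAR - predict(X_BAR)
print(f"line at X bar: {predict(X_BAR):.2f}, reported Y bar: {Y_BAR}, gap: {gap:.2f}")

# Slope interpretation: predictions one hour apart differ by the slope
print(f"change per extra hour: {predict(3) - predict(2):.1f}")
```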

There were a couple of influential points in our data set. One particular point that
may have influenced our results was a person who spent only 3 hours on their phone
yet sent 1000 snapchats. The significance of this point is that it pulls the
regression line toward itself on the scatter plot. The coefficient of determination
r² for our regression is .07. This tells us the proportions of explained and
unexplained variation in our data set. The explained variation in our data set is
7%, and the unexplained variation is 93%. This tells us that most of the variation
in snapchats sent is not explained by hours spent on the phone.
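For a simple linear regression, the coefficient of determination is the square of the correlation coefficient, so the percentages above can be reproduced from our reported r. This is a small sketch, and the variable names are only illustrative.

```python
r = 0.2651456467                      # correlation coefficient from our report

r_squared = r ** 2                    # coefficient of determination
explained_pct = 100 * r_squared       # variation explained by the regression line
unexplained_pct = 100 - explained_pct

print(f"r^2 = {r_squared:.4f}")
print(f"explained: {explained_pct:.0f}%, unexplained: {unexplained_pct:.0f}%")
```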


We believe that there are some lurking variables that could have led to a
misrepresentation of the data. There were a few instances where people answered the
survey saying they were on their phone for longer than is realistically possible.
One of the most drastic is a person who said they were on their phone for 18 hours
per day. We believe that some people took this survey as more of a joke and gave
absurd answers to get a laugh out of their friends. This affected our study by
exaggerating the results. Most people are not on their phone for that much time. It
is also improbable that somebody is on their phone for 3 hours but sends an immense
number of snapchats.

If we were to predict how many snapchats someone sends if they were on their phone
for 2.5 hours, the regression line gives about 95 (17.6 × 2.5 + 51). This is an
example of interpolation. If we were to extrapolate, let's say someone was on their
phone for 19 hours. If we were to continue the regression line, it would give about
385 snapchats. This does not seem as accurate, considering that there were many
people who spent less time on their phones but sent more snapchats. This is why
extrapolation is not as accurate as interpolation.
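Plugging the two phone-use times into the rounded equation y = 17.6x + 51 can be sketched as follows. Because the coefficients are rounded, the output may differ somewhat from values computed with unrounded calculator coefficients. The range of hours used to label a prediction is an assumption: 18 is the largest answer mentioned in our report, and 0 is only a placeholder for the smallest.

```python
SLOPE = 17.6
INTERCEPT = 51.0

# Assumed range of reported hours (an assumption, see the note above)
ASSUMED_MIN_HOURS = 0.0
ASSUMED_MAX_HOURS = 18.0

def predict_snapchats(hours):
    """Predicted snapchats per day from y = 17.6x + 51."""
    return SLOPE * hours + INTERCEPT

def kind_of_prediction(hours, lo=ASSUMED_MIN_HOURS, hi=ASSUMED_MAX_HOURS):
    """Interpolation if hours is inside the observed range, else extrapolation."""
    return "interpolation" if lo <= hours <= hi else "extrapolation"

for hours in (2.5, 19):
    snaps = predict_snapchats(hours)
    print(f"{hours} hours -> about {snaps:.0f} snapchats ({kind_of_prediction(hours)})")
```

The line keeps climbing outside the range of the data whether or not people actually behave that way, which is why the 19-hour prediction deserves less trust.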

Overall, the results were not as promising as we originally expected them to be. We
expected a near perfect positive correlation and we ended up with one near zero.
There were some factors that could have made this turn out worse than it should
have. Lurking variables likely played a large role in making our data less reliable.
Our correlation was very weak; however, it was still present and positive. In order
to avoid lurking variables, we may want to do the survey in person to remove the
temptation of lying on an online survey.


XBAR- 4.43

YBAR- 129.12

CORRELATION COEFFICIENT- 0.2651456467

R SQUARED- .07

REGRESSION EQUATION y=17.6x+51
