Bivariate Data Analysis Olympics Project: 4x100M Men
Alison Stefansic
1-2.
Olympic Year    Winning Time (seconds)
1948            40.6
1952            40.1
1956            39.5
1960            39.5
1964            39.0
1968            38.2
1972            38.19
1976            38.33
1980            38.26
1984            37.83
1988            38.19
1992            37.40
1996            37.69
2000            37.61
2004            38.07
2008            38.06
2012            36.84
2016            37.27

3. This scatterplot shows a strong, negative, linear correlation between the Summer Olympic Year and Winning Time (sec) of the 4x100M Men’s Relay.

4.

5. The linear model is not appropriate for the relationship of Summer Olympic Year and Winning Time of the 4x100M Men’s Relay because there is an upside-down U-shape to the residual plot. A linear model is only appropriate for residuals whose points are scattered and not in any pattern.
However, the data set does have a strong r-value. The r-value is the correlation coefficient, which measures the strength and direction of the linear relationship between two variables on a scatterplot. The r-value for this data set is -0.885.
See references below
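The correlation coefficient and the residuals discussed in item 5 can be recomputed from the table. What follows is a minimal sketch in plain Python, not the tool used for the project: the function names `mean`, `correlation`, and `least_squares` are hypothetical helpers, and the data are retyped from the table in items 1-2. It prints r and then one residual per Olympic year, so the signs can be scanned for a curved pattern.

```python
from math import sqrt

# Data retyped from the table in items 1-2 (years run every 4 from 1948 to 2016).
years = list(range(1948, 2017, 4))
times = [40.6, 40.1, 39.5, 39.5, 39.0, 38.2, 38.19, 38.33, 38.26,
         37.83, 38.19, 37.40, 37.69, 37.61, 38.07, 38.06, 36.84, 37.27]


def mean(values):
    return sum(values) / len(values)


def correlation(xs, ys):
    """Pearson correlation coefficient r."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def least_squares(xs, ys):
    """Slope and intercept of the least-squares regression line."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, my - slope * mx


r = correlation(years, times)
slope, intercept = least_squares(years, times)

# residual = observed y - predicted y, using the unrounded fitted line
residuals = [y - (intercept + slope * x) for x, y in zip(years, times)]

print(f"r = {r:.3f}")
for year, res in zip(years, residuals):
    print(year, f"{res:+.2f}")
```

A run of same-signed residuals at the start, middle, or end of the list is the kind of pattern that suggests a curve rather than a straight line.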
6. With the athletes given an extra year to train for the 2021 Summer Olympics, I believe the delay will affect their times positively. They should be able to improve during that year and potentially post faster times than they would have if the Olympics had not been postponed this year. As for the intervals, the 5-year gap between the 2016 and 2021 Games breaks the usual 4-year spacing between Summer Olympics.

7. predicted y = 121 - 0.0415x, where y = Winning Time (sec) and x = Olympic Year

8. As the Olympic Year (x-variable) increases by 1, the predicted Winning Time (y-variable) decreases by 0.0415 seconds.

9. When the Summer Olympic Year is 0, the predicted winning time is 121 seconds. This value is not meaningful because no Olympics took place in year 0, which lies far outside the range of the data.

10. predicted y = 121 - 0.0415(2020)
Predicted Winning Time for the 2020 Summer Olympics = 37.17 sec

11. residual = y - predicted y = 37.69 - (121 - 0.0415(1996)) = -0.476 seconds

12. r = -0.885
The correlation coefficient r indicates that there is a strong negative linear correlation between Summer Olympic Year and Winning Time.

13. r² = 78.4%
The coefficient of determination indicates that 78.4% of the variation in Winning Time is accounted for by the linear relationship with Summer Olympic Year. It does not give the probability that new data points would fall on the line of best fit. It is important to note that a strong coefficient of determination does not mean causation.

14. Olympic Year (explanatory) mean: 1982
Olympic Year (explanatory) std. deviation: 21.4
Winning Time (response) mean: 38.4
Winning Time (response) std. deviation: 1

15. b = r(Sy/Sx)
b = (-0.885)(1/21.4)
b ≈ -0.0414, which matches the slope of -0.0415 up to rounding.

16. x̄ = 1982, ȳ = 38.4
121 - 0.0415(1982) ≈ 38.7, which equals ȳ = 38.4 up to the rounding of the slope and intercept.
Yes, the line of best fit goes through this point. The least-squares regression line always passes through (x̄, ȳ), so substituting the mean of x into the equation returns the mean of y.
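The calculations in items 7 through 16 can be reproduced from the table. The following is a minimal sketch in plain Python, not the method used for the project: the helper names (`mean`, `sample_std`, `correlation`, `least_squares`, `predict`) are hypothetical, and the data are retyped from the table in items 1-2. It fits the line at full precision and also evaluates the rounded equation y = 121 - 0.0415x, so the two can be compared. Rounding the intercept to 121 shifts each prediction by a few tenths of a second.

```python
from math import sqrt

# Data retyped from the table in items 1-2 (years run every 4 from 1948 to 2016).
years = list(range(1948, 2017, 4))
times = [40.6, 40.1, 39.5, 39.5, 39.0, 38.2, 38.19, 38.33, 38.26,
         37.83, 38.19, 37.40, 37.69, 37.61, 38.07, 38.06, 36.84, 37.27]


def mean(values):
    return sum(values) / len(values)


def sample_std(values):
    """Sample standard deviation (divides by n - 1)."""
    m = mean(values)
    return sqrt(sum((v - m) ** 2 for v in values) / (len(values) - 1))


def correlation(xs, ys):
    """Pearson correlation coefficient r."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def least_squares(xs, ys):
    """Slope and intercept of the least-squares regression line."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, my - slope * mx


slope, intercept = least_squares(years, times)
r = correlation(years, times)
s_x, s_y = sample_std(years), sample_std(times)


def predict(year):
    """Predicted winning time from the unrounded fitted line."""
    return intercept + slope * year


# Item 15: the slope can also be recovered as b = r * Sy / Sx.
slope_from_r = r * s_y / s_x

# Items 10-11 as written in the project, using the rounded equation.
rounded_prediction_2020 = 121 - 0.0415 * 2020
rounded_residual_1996 = times[years.index(1996)] - (121 - 0.0415 * 1996)

print(f"slope = {slope:.4f}, intercept = {intercept:.1f}")
print(f"r = {r:.3f}, r^2 = {r ** 2:.1%}")
print(f"slope from r*Sy/Sx = {slope_from_r:.4f}")
print(f"2020 prediction: rounded {rounded_prediction_2020:.2f}, unrounded {predict(2020):.2f}")
print(f"1996 residual (rounded equation) = {rounded_residual_1996:.3f}")
# Item 16: the fitted line evaluated at the mean year equals the mean time.
print(f"line at mean year = {predict(mean(years)):.2f}, mean time = {mean(times):.2f}")
```

Keeping an extra digit or two in the intercept when writing the equation by hand would bring the hand-calculated prediction and residual closer to the unrounded values.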
References
4x100m relay men. (2018, May 15). International Olympic