STATISTICS COURSE: A COMPARISON OF INSTRUCTOR AND STUDENT
PERSPECTIVES
Cynthia L. Knott Marymount University, 2807 N. Glebe Road, Arlington, VA 22207 cynthia.knott@marymount.edu 703-284-5727
G. Steube Marymount University, 2807 N. Glebe Road, Arlington, VA 22207 gsteube@marymount.edu 703-284-5943
Northeast Decision Science Institute March 2010
Using Excel in an Introductory Statistics Course

Abstract

Almost all undergraduate business degree programs require that students take at least one course in statistics. Instructors for these classes have a number of options when selecting the software package that will be used. Three popular options are R, PASW (formerly SPSS), and Excel. This paper examines the advantages and disadvantages of these software choices in order to assist the instructor in making an informed choice. The strengths and weaknesses of these choices are explored from both the instructor's and the student's perspective. After reviewing these assessments, the paper concludes that Excel is the best choice because of its low cost, wide availability, familiar interface, and computational and charting flexibility.

Although most business schools require at least one course in statistics, advances in technology and the availability of statistical software have dramatically changed the way in which these classes are taught. In the past, instructors presented statistics the old-fashioned way, with paper and pencil. The students were expected to be involved in the mathematics as well as the calculations. Students learned how to use the z-tables and calculate the statistics by hand. Because technology has advanced, become cheaper to acquire, and become ubiquitous, instructors now have a wider range of choices in how statistics courses are presented. These choices center on the software package that will be used in the class to perform calculations and present the findings in graphs and charts. This paper focuses on some of the choices that instructors may face in selecting the best software support for their introductory statistics class.
In the business school, statistics courses are applied rather than theoretical. It is important that these learners understand the underlying calculations and theories, but more significantly, these students must be able to apply, analyze, and interpret the results to improve decision making in the business environment. Communication is the top skill sought by most employers today (Barnes, 2009). Statistics courses should help students learn how to communicate the results of their analysis in their future business environment. Therefore, business statistics courses emphasize the communication of the results obtained from conducting a wide range of hypothesis tests and other types of analysis. The use of statistical software relieves the student and instructor from spending too much time on calculations and thus affords more time to emphasize understanding, interpreting, and communicating the results of these computations.

This paper compares three popular statistical packages: R, PASW (formerly known as SPSS), and Excel. Information about the R package can be obtained from the R Package for Statistical Computing (Department of Mathematics and Statistics at Vienna University, n.d.); PASW and Excel information can be located on the PASW website (SPSS Inc., 2009) and the Microsoft website (Microsoft Inc., 2009b). Each of these three packages has a number of strengths and deficiencies. These advantages and disadvantages are discussed from the instructor and student points of view in this report. The discussion is intended to assist an instructor in deciding among the three packages for his or her class in introductory statistics.

Advantages and Disadvantages of R

The advantages and disadvantages from the instructor and student views for R are summarized in Table 1 (instructor) and Table 2 (student), respectively.
Table 1
Advantages and disadvantages of R from the instructor view

Advantages
1. Availability: Freely and widely available at no cost (Zieffler & Long, n.d.)
2. Flexibility and customization: Because R is a programming language, almost any result can be achieved (Zieffler & Long, n.d.)
3. Up to date methods and packages: Because R methods are written by users, R is more current than many commercial statistical packages that require updates to their base system (Zieffler & Long, n.d.)
4. Broad coverage: R packages are extensive and include a wide variety of quantitative applications
5. Availability of help: Because R has a large user network, help is readily available on almost any topic

Disadvantages
1. Large data sets: R may not handle large data sets as efficiently as SAS (Zieffler & Long, n.d.)
2. Speed: Some procedures in R could take days to run (Zieffler & Long, n.d.)
3. Learning curve: Because R is command line driven rather than a point and click application, the learning curve is more challenging than that of most commercial packages, which use a graphical interface
4. Lack of a spreadsheet view of data: Unlike Excel and SPSS, R does not include a spreadsheet view of the data set
5. Unfamiliarity: R is less well known than SPSS or Excel and may be viewed less positively because of the lack of familiarity

Table 2
Advantages and disadvantages of R from the student view

Advantages
1. Availability: R is available at no cost to students and can be installed on their own computers, which eliminates trips to the University's computer lab to do homework
2. Flexibility and customization: After some investment in learning R, the student could find available packages and develop their own tailored applications
3. Up to date methods and packages: As other users add packages to R to keep it current, students can continue using R throughout their academic and professional careers without concern about updating their base packages to obtain the newer software
4. Broad coverage: Because of R's broad coverage, students might be able to use R in other quantitative courses
5. Availability of help: Students can use the network of available users, without charge or other registration requirements, to obtain guidance on almost any aspect of R

Disadvantages
1. Large data sets: May not be an issue for a student's academic use of R because the data sets are relatively small
2. Speed: The basic procedures used in an introductory statistics course would not take days to run
3. Learning curve: Students would need to invest time in learning R, and that investment may be challenging and time intensive; the lack of a graphical interface with point and click capability would be a challenge for students who have only worked with operating systems and applications that furnish these abilities
4. Lack of a spreadsheet view of data: The lack of a spreadsheet view may be a concern to students, especially business students, because they are accustomed to tabular presentation of data
5. Unfamiliarity: Students may not be any more unfamiliar with R than with SPSS; most students would be familiar with Excel

Advantages and Disadvantages of PASW

The strengths and weaknesses from the instructor and student views for PASW are summarized in Table 3 (instructor) and Table 4 (student), respectively.

Table 3
Advantages and disadvantages of PASW from the instructor view

Advantages
1. Well known and supported: PASW has been in the marketplace for many years, and many textbooks for introductory statistics courses are based on this application. Other programs can easily import SPSS data files (Harrington, McLeod, & Clark, 2009).
2. Ease of use: Because of its graphical user interface, a large number of statistical functions are easy to use and access
3. Up to date: PASW is generally updated at least once a year, with minor updates available on the corporate website
4. Broad coverage: PASW includes a large number of statistical routines appropriate for an introductory statistics course
5. Availability of help: There are a number of excellent books that provide comprehensive information on using PASW; the listserv for this package provides another source of guidance on specific issues

Disadvantages
1. Cost: PASW is commercial software that is available at a relatively high cost [JourneyEd (2009) lists the PASW Graduate Pack for Windows at $199.98]; some applications are available only as another product with a separate fee
2. Licensing complexity: The license to use PASW is time limited; the license only allows installation on a limited number of computers; add-ons to PASW require the acquisition of additional licenses. The licensing is not user friendly (Harrington et al., 2009).
3. Confusion among the different versions: Because PASW is updated every year, a number of issues can arise concerning which features and data formats are available, especially between the Mac and Windows products

Table 4
Advantages and disadvantages of PASW from the student view

Advantages
1. Well known and supported: Students would have relatively little difficulty in obtaining support for their SPSS work from textbooks, Internet resources, and other students.
2. Ease of use: The graphical user interface provides the student with quick access to the statistical routines needed for an introductory course; the student also has the ability to customize PASW results, including the display of graphic output
3. Broad coverage: All the statistical routines required by the introductory statistics course would be available to the students
4. Availability of help: The student has access to a large number of excellent books that provide comprehensive information on using PASW

Disadvantages
1. Cost: At a price point of almost $200, even for the Graduate version of the PASW package, cost is an issue for students
2. Licensing complexity: Although students would be able to use their purchased package for the duration of the introductory statistics course, subsequent use of the package in graduate and professional work would become a licensing problem
3. Confusion among the different versions: Students may try to purchase or borrow older versions of PASW, which could create issues for them with the instructions in the selected introductory statistics textbook
4. Lag in newer techniques: "For academic use SPSS lags notably behind SAS, R and even perhaps others that are on the more mathematical rather than statistical side for modern data analysis (e.g. robust and bootstrapping approaches available easily conducted elsewhere are nonexistent or very difficult to do, basic tests of analytical assumptions are often not available)" (Harrington et al., 2009)

Advantages and Disadvantages of Excel

The advantages and disadvantages for Excel are provided in Table 5 (instructor) and Table 6 (student).

Table 5
Advantages and disadvantages of Excel from the instructor view

Advantages
1. Availability: All of the labs in the University have the Microsoft Office package installed; therefore, access is always available. Instructors don't have to go through the hassle of having special software installed on the computers each semester.
2. Availability of Help: On-line help is extensive, and the package also has a built-in help function that is very user friendly.
3. Ease of Use: Very hands-on, friendly, and intuitive to use. The dropdown menus make it easy to find things.
4. Add-Ins: The software has a standard installation, but you can also add in data analysis tool packages specifically for statistical applications. Therefore, you don't have to know any coding or programming languages.
5. Coverage: The package includes all of the statistical applications that an introductory course needs.
6. Data Sets: Many authors are including data sets with their textbooks that are already in Excel files and ready for analysis. Also, many web sites that manage data sets are putting them in a format that is easy to download as an Excel file.

Disadvantages
1. Cost: Although the package is available in all of the labs on campus, an instructor would need to purchase the software package for any home computers that they use.
2. Potential Calculation Problems: The program doesn't calculate the 3rd quartile correctly (Anderson, 2009). Problems with the calculations in Excel have also been identified by McCullough and Heiser (2008) and Yalta (2008).
3. Limitations of Data: The number of data points is limited to x amount. In terms of an introductory course this is not necessarily an issue, but it can be when showing analysis of large data sets. The limitations of Excel are provided by Microsoft (2009a).
4. Functions: Some of the advanced statistical functions are not included in the package.
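
The quartile discrepancy listed under Potential Calculation Problems can arise because more than one convention exists for locating a quartile between two observations. The following is a minimal Python sketch offered only as an illustration, not as the method of any cited source: the eight data values are hypothetical, and the two conventions shown are simply the ones offered by the quantiles function in Python's standard statistics module. The paper does not state which convention Anderson (2009) or Excel follows.

```python
import statistics

# Hypothetical sample of eight observations, chosen only for illustration.
data = [1, 2, 3, 4, 5, 6, 7, 8]

# "inclusive": the sample minimum and maximum are treated as the
# 0th and 100th percentiles of the data.
q_inclusive = statistics.quantiles(data, n=4, method="inclusive")

# "exclusive": the sample is assumed to come from a population that may
# contain values more extreme than those observed.
q_exclusive = statistics.quantiles(data, n=4, method="exclusive")

# Each result holds the three cut points Q1, Q2, Q3; index 2 is Q3.
q3_inclusive = q_inclusive[2]
q3_exclusive = q_exclusive[2]

print("Q3 (inclusive convention):", q3_inclusive)
print("Q3 (exclusive convention):", q3_exclusive)
```

For this sample the two conventions give third quartiles of 6.25 and 6.75, while the median is the same under both. A student who checks software output against a hand calculation may therefore see a mismatch that stems from the choice of convention.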

Table 6
Advantages and disadvantages of Microsoft's Excel from the student view

Advantages
1. Availability: All of the University labs have Microsoft Office installed, so the students have access to Excel everywhere on campus.
2. Cost: There is no cost to the students if they use the computers in the lab.
3. Availability of Help: There is extensive on-line help, and the package also includes a help function that makes it easy to find information.
4. Textbook: Many of the textbooks include instruction on the functions in Excel, which allows students to follow along and do practice problems on their own.
5. Data files: Many of the textbooks are including data files that are already in Excel format, and many web sites that manage data sets are making the files available for download in Excel format.
6. Applicability to other Courses: The use of Excel is becoming the standard in many other areas of business instruction, such as accounting, finance, and operations. Therefore, the students can use the skills they learn in other courses as they work through their programs.

Disadvantages
1. Customization: The ability to customize the package is not very user friendly. You can add macros using Visual Basic, but this requires knowledge of a computer programming language.
2. Large Data Sets: The limitations of Excel are provided by Microsoft (2009a).
3. Confusion among the different versions: Because Excel is currently in transition from the 2003 to the 2007 version, students are sometimes working in one version while being taught in another; this can be confusing to them. Although the functions are all the same, where to find them is different in each version.

Conclusion
This paper explored three popular software packages to determine their advantages and disadvantages for use in an introductory business statistics course. By assessing these pros and cons from both the instructor and student perspectives, a better decision about which of these packages to choose can be made. Although any of the three packages would provide learners with the ability to perform computations and display graphs and charts, each of these choices has its own unique set of strengths and weaknesses. This report identified a number of assets and liabilities for each of the three packages in Tables 1 through 6. Because of its low cost, computational and charting flexibility, interface familiarity, and wide availability, Excel is seen as the best choice. The availability factor for Excel includes not only its presence in college computer labs and classrooms but also its wide use in businesses. After students complete their business degrees and are employed in their fields, the availability of Excel in these future workplaces will far exceed that of SPSS or R. By selecting Excel for use in the statistics course, the instructor adds to the motivation of his or her students, because in almost all of their potential employment destinations the former students will be able to use Excel on the job for their data analysis needs. Consequently, Excel is an excellent choice for instructors to use to support their introductory statistics class.

References

Anderson, D. R. (2009). Essentials of modern business statistics (4th ed.). Eagan, MN: Cengage South-Western.
Barnes, K. (2009). Skills most sought after by employers. Retrieved October 24, 2009, from http://iccweb.ucdavis.edu/LAB/articles/Skills.htm
Department of Mathematics and Statistics at Vienna University. (n.d.). The R package for statistical computing. Retrieved October 24, 2009, from http://www.r-project.org/
Harrington, R., McLeod, P., & Clark, M. (2009). SPSS short course. Retrieved October 12, 2009, from http://www.unt.edu/rss/class/SPSS/course1.htm#VI.__Relative_Advantages_and_Disadvanatges_
Journey Education Marketing, Inc. (2009). PASW Statistics Grad Pack 18.0 for Windows (formerly SPSS). Retrieved October 12, 2009, from http://www.journeyed.com/item/SPSS/PASW+Statistics+Graduate+Pack/100966436?sSRCCODE=GOOGLEBASE&utm_source=GoogleBase&utm_medium=Comp%2BEngines&utm_campaign=jce01
McCullough, B. D., & Heiser, D. A. (2008). On the accuracy of statistical procedures in Microsoft Excel 2007. Computational Statistics & Data Analysis, 52(10), 4570-4578.
Microsoft Inc. (2009a). Excel specifications and limits. Retrieved October 26, 2009, from http://office.microsoft.com/en-us/Excel/HP100738491033.aspx
Microsoft Inc. (2009b). Microsoft Office Excel 2007. Retrieved October 25, 2009, from http://office.microsoft.com/en-us/Excel/FX100487621033.aspx
SPSS Inc. (2009). SPSS, an IBM company. Retrieved October 24, 2009, from www.spss.com
Yalta, A. T. (2008). The accuracy of statistical distributions in Microsoft Excel 2007. Computational Statistics & Data Analysis, 52(10), 4579-4586.
Zieffler, A., & Long, J. D. (n.d.). Basics of R. Retrieved October 12, 2009, from http://docs.google.com/gview?a=v&q=cache%3AKNt87a_hJBAJ%3Awww.tc.umn.edu%2F%7Ezief0002%2FNotes%2FRBasics.pdf+advantages+of+R&hl=en&gl=us&sig=AFQjCNFNcPf8U4xtjQ3vC-qvzUcCqYod0w&pli=1