Assessing Learning in Introductory Statistics

MS 150 Statistics is an introductory statistics course with a focus on statistical operations and methods. The course is guided by the 2007 Guidelines for Assessment and Instruction in Statistics Education (GAISE), the spring 2016 draft GAISE update, and the ongoing effort at the college to incorporate authentic assessment in courses. A history of the evolution of open data exploration exercises and associated presentations as authentic assessment in the course was covered in a May 2017 report.

Andrea and Berlene present their statistical analysis to the class

Three course level student learning outcomes currently guide MS 150 Introduction to Statistics:
  • Perform basic statistical calculations for a single variable up to and including graphical analysis, confidence intervals, hypothesis testing against an expected value, and testing two samples for a difference of means.
  • Perform basic statistical calculations for paired correlated variables.
  • Engage in data exploration and analysis using appropriate statistical techniques including numeric calculations, graphical approaches, and tests.
The course wrapped up coverage of content three weeks prior to the end of the term. The students then engaged in a series of three data analysis, exploration, and presentation exercises. The course then ended with a final examination which was completed as an online test inside Schoology. This term the spring 2017 final examination was posted as a practice test.

Statistics spring 2017 final examination

The use of the spring 2017 final examination as a practice test was made possible in part due to the use of Google Sheets for data sharing back in spring 2017 and the continued use of Google Sheets in the course this term. Spring 2017 was the first term to use Google Sheets as the supporting software for the course from day one.

Data for the spring 2017 statistics final examination

Spring 2018 saw the adoption of Schoology Institutional and the further deepening of integration between the course and Google Sheets via the Google Drive Assignments application in Schoology.

The basic structure of the final examination has been fairly stable over time, and performance on an item-by-item basis is also fairly stable over time.  

Thirty-three students sat the final examination in spring 2018. The table shows the success rate on each item of the final examination by term, expressed as a proportion.


topic          Fall 2015   Spring 2016   Fall 2016   Spring 2017   Fall 2017   Spring 2018
n*             0.93, 0.97, 0.97, 0.98, 0.91
mode           0.94        1.00          0.96        0.89          0.98        0.85
median         0.96        0.97          0.94        0.95          1.00        1.00
mean           0.94        0.95          0.94        0.89          1.00        0.97
min            0.94        0.95          0.96        0.95          0.95        1.00
max            1.00        1.00          0.96        0.97          0.98        1.00
range*         0.94, 0.95, 0.95, 0.98, 1.00
quartile 1     0.96        0.95          0.90        0.92          0.98        0.97
quartile 3     0.94        1.00          0.88        0.89          1.00        0.91
boxplot*       0.49, 0.84, 0.92, 0.79, 0.67
histogram      0.59        0.41          0.88        0.97          0.90        0.79
stdev sx       0.93        0.97          0.94        0.89          0.93        1.00
st error se    0.72        0.80          0.76        0.82          0.81        0.94
t-critical     0.91        0.87          0.78        0.74          0.67        0.82
margin E       0.72        0.69          0.41        0.63          0.50        0.73
lower95        0.59        0.44          0.31        0.42          0.71        0.52
upper95        0.59        0.41          0.29        0.45          0.71        0.52
n paired       0.54        0.77          0.59        0.39          0.43        0.61
slope          0.93        0.95          0.71        0.82          0.83        0.91
intercept      0.93        0.74          0.78        0.84          0.83        0.88
correlation    0.91        0.69          0.71        0.66          0.93        0.91
strength       0.74        0.56          0.63        0.84          0.98        0.76
x predict y*   0.44, 0.33, 0.39
y predict x*   0.37

* This row has fewer values than there are terms. The source does not show which term is blank, so the values are listed in their original order and do not line up with the column headers.

The first twelve questions covered basic one-variable statistics. Questions 13 to 17 involved constructing a 95% confidence interval. Questions 18 to 23 covered statistics for paired, two-variable data. Students demonstrate strength in calculating basic statistics of both one and two variables. Performance on inferential calculations for both one and two variables has remained weaker. The existence of a practice test does not appear to have had a marked impact versus prior terms and did not improve the calculation of the lower and upper bounds of a 95% confidence interval. Again this term students were hung up on the use of plus or minus two standard errors as an approximation of the 95% confidence interval, an approximation that is used to introduce t-critical values. Again the solution would seem to be to omit all mention of plus and minus two standard errors and move from the material in chapter eight directly to section 9.2, omitting section 9.12.
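The interval calculation that questions 13 to 17 step through can be sketched in a few lines of Python. This is an illustration only: the ten sample values are made up and are not examination data, the variable names are chosen for this example, and the t-critical value is hardcoded to the two-tailed 95% value for nine degrees of freedom (the value a spreadsheet returns for =T.INV.2T(0.05, 9)). The sketch contrasts the t-critical interval with the plus or minus two standard errors shortcut.

```python
import math
import statistics

# Hypothetical sample of ten values (NOT data from the examination).
sample = [12.0, 15.0, 11.0, 14.0, 13.0, 16.0, 12.0, 15.0, 14.0, 13.0]

n = len(sample)                      # sample size
mean = statistics.mean(sample)       # sample mean
sx = statistics.stdev(sample)        # sample standard deviation
se = sx / math.sqrt(n)               # standard error of the mean

# Two-tailed 95% t-critical value for n - 1 = 9 degrees of freedom.
# Hardcoded assumption; a spreadsheet gives this via =T.INV.2T(0.05, 9).
t_critical = 2.262

margin = t_critical * se             # margin of error E
lower95 = mean - margin              # lower bound of the 95% interval
upper95 = mean + margin              # upper bound of the 95% interval

# The "plus or minus two standard errors" shortcut for comparison.
approx_lower = mean - 2 * se
approx_upper = mean + 2 * se

print(f"t-based:  {lower95:.3f} to {upper95:.3f}")
print(f"shortcut: {approx_lower:.3f} to {approx_upper:.3f}")
```

For this made-up sample the standard error is 0.5, so the t-based interval is 13.5 plus or minus 1.131 while the shortcut gives 13.5 plus or minus 1.0. With small samples the shortcut understates the margin of error, which is one way students who stop at two standard errors end up with the wrong bounds.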

The MS 150 Statistics course in spring 2018 consisted of two sections with a total of 40 students enrolled as of term end: 22 females and 18 males. The two sections are kept in curricular synchronization during the term. Both sections covered the same material, worked the same assignments, and gave presentations on the same topics. The sections met at 8:00 and 9:00 on a Monday-Wednesday-Friday schedule.
Performance by section on the final and in the course

The 8:00 section is impacted by transportation difficulties which have historically led to weaker performances by the 8:00 section versus the 9:00 section. Both the average on the final examination and course averages differed by section, consistent with the performance differential seen in prior terms. The students in the 8:00 section tend to have more absences and more frequently arrive late to class. There is also potentially a pre-selection factor. More organized students may register earlier and preferentially fill the later class - the 8:00 section is the last to fill during registration. 

Performance differences by gender in the course and on the final exam

Gender differences were not statistically significant given the small underlying sample size. The differences were small enough to be attributable to random variation.

The final examination in MS 150 Statistics has tracked performance on basic statistics and linear regression statistics since fall 2012. The course has also reported on performance on open data exploration exercises over the same period of time. This term performance on specific learning outcome 3.3 was used to report on performance on open data exploration exercises.
Final examination performance by course learning outcome

Final examination performance on the first course learning outcome, basic statistics, has been generally stable around 80% since fall 2012. Performance in spring 2017 rose to 84%, and fall 2017 performance improved again to 87%. This term performance rose to a 92% success rate. This rise from fall 2017 might be attributable to the practice test that was deployed for the first time this year.
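One plausible way to roll the item-level rates up into an outcome-level figure is an unweighted mean over the first twelve items. The sketch below does that for spring 2018 using the rates from the item table. Two assumptions are not confirmed by the source: that the outcome figure is a simple mean of item rates, and that in the rows with a blank term (n, range, boxplot) the final value belongs to spring 2018. The dictionary and variable names are hypothetical.

```python
import statistics

# Spring 2018 success rates for the first twelve items, read from the item table.
# Assumption: in rows with a blank term, the last value listed is Spring 2018.
spring_2018_basic = {
    "n": 0.91,
    "mode": 0.85,
    "median": 1.00,
    "mean": 0.97,
    "min": 1.00,
    "max": 1.00,
    "range": 1.00,
    "quartile 1": 0.97,
    "quartile 3": 0.91,
    "boxplot": 0.67,
    "histogram": 0.79,
    "stdev sx": 1.00,
}

# Unweighted mean of the item-level success rates (an assumed roll-up method).
outcome1_rate = statistics.mean(list(spring_2018_basic.values()))

print(f"{outcome1_rate:.4f}")
```

Under these assumptions the mean works out to 0.9225, which agrees with the 92% quoted above, though the source does not state how that figure was actually computed.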

While performance on the second student learning outcome, paired data calculations, had remained stable near 70%, last term saw a rise to 80%. The long term average remains 70% on this course learning outcome and the drop to a 74% success rate this term can be seen as a return to the long term mean.

There is far more inertia in a value than I suspect those in education comprehend. There is a penchant in education for "continuous improvement." These success rates tend to be stable over long periods of time and reflect both the difficulty of the material as well as the many reasons students do not succeed on the material. For every student who did not do well, there is a complex back story.

The third course learning outcome involving open data exploration cannot be compared on a term-on-term basis. The nature of the rubrics and scoring systems used to mark this section has changed over the terms. In spring 2018 the adoption of Schoology Institutional made possible direct measurement of this outcome based on work done during the term. Rubrics included, as specific student learning outcome 3.3: "Report results of analysis. Draw conclusions based on statistical analyses and tests, obtain answers to questions about the data, supported by appropriate statistics." This student learning outcome was marked with evaluation of open data exploration and analysis skills in mind. Measured this way, performance appears stronger than in past terms. However, term-on-term comparisons cannot be made as the method of reporting learning for this outcome has varied from term to term.



Long term trend in final examination average Fall 2005 - Fall 2017:
Note that the y-axis neither starts at zero nor ends at 100, so the vertical range is exaggerated!

Over the long haul, regression to the mean is as inescapable in statistics as entropy is in physics. Means return to long term means and those long term means return to even longer term means. And in the world of final examination averages, moving those longer term means is very difficult. Term-on-term fluctuations are almost meaningless and should not be viewed as calls to action. Since 2012 the final examination percent has moved in a narrow range of plus or minus four percent from 74%. Fall 2017 was slightly up at 79.3%, but this was projected last fall to be a random aberration. Just as there was a multi-term slump in the averages from Fall 2009 to Fall 2011, so can there be multi-term improvements that do not hold over the longer haul. Spring 2018 saw performance return towards the mean as projected last fall.


Long term course average and standard deviation

The course average since 2007 has also remained stable and has tended to remain within four percent of 78%. This term's 78.7% course average is statistically indistinguishable from the long term mean of 78.1%.

The standard deviation of the students' individual course averages is also relatively stable around 15%, with a slight rise to 18.5% seen in spring 2018.
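The claim that this term's course average is indistinguishable from the long term mean can be checked roughly with a one-sample t-test against an expected value, the same technique the course teaches. The sketch below uses the figures quoted in this report under several assumptions: the 40 students enrolled at term end are taken as the sample, 18.5% is taken as the sample standard deviation, the long term mean of 78.1% is treated as a fixed expected value, and the t-critical value is hardcoded to the two-tailed 95% value for 39 degrees of freedom. Variable names are chosen for this example.

```python
import math

n = 40                    # students enrolled at term end (assumed sample size)
term_mean = 78.7          # spring 2018 course average, percent
long_term_mean = 78.1     # long term course average, treated as the expected value
sx = 18.5                 # spring 2018 standard deviation of course averages, percent

se = sx / math.sqrt(n)                        # standard error of the mean
t_stat = (term_mean - long_term_mean) / se    # t-statistic against the expected value

# Two-tailed 95% t-critical value for n - 1 = 39 degrees of freedom.
# Hardcoded assumption; a spreadsheet gives this via =T.INV.2T(0.05, 39).
t_critical = 2.0227

significantly_different = abs(t_stat) > t_critical

print(f"se = {se:.3f}, t = {t_stat:.3f}, different = {significantly_different}")
```

With a standard error near 2.9 percentage points, the 0.6 point gap gives a t-statistic of about 0.2, far below the critical value, so under these assumptions the term average does not differ significantly from the long term mean.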

Overall, given a list of numbers and spreadsheet software, students show a strong mastery of basic statistics, good capabilities with linear regressions, and more moderate abilities with confidence intervals. 
