Assessing Learning in Introductory Statistics

MS 150 Statistics is an introductory statistics course with a focus on statistical operations and methods. The course is guided by the 2007 Guidelines for Assessment and Instruction in Statistics Education (GAISE), the spring 2016 draft GAISE update, and the ongoing effort at the college to incorporate authentic assessment in courses. A history of the evolution of open data exploration exercises and associated presentations as authentic assessment in the course was covered in a May 2017 report.

Nayme presents basic statistics on anonymous blood glucose levels measured at a recent health fair

Three course level student learning outcomes currently guide MS 150 Introduction to Statistics:
  • Perform basic statistical calculations for a single variable up to and including graphical analysis, confidence intervals, hypothesis testing against an expected value, and testing two samples for a difference of means.
  • Perform basic statistical calculations for paired correlated variables.
  • Engage in data exploration and analysis using appropriate statistical techniques including numeric calculations, graphical approaches, and tests.

Gino and Laureen present. During the term the students engaged in seven data analysis presentations.

The course wrapped up coverage of content three weeks prior to the end of the term. The students then engaged in a series of three data analysis, exploration, and presentation exercises. The course then ended with a final examination which was completed as an online test inside Schoology. This term the spring 2018 final examination was posted as a practice test.

Statistics spring 2018 final examination

The use of the spring 2018 final examination as a practice test was made possible in part due to the use of Google Sheets for data sharing starting in spring 2017 and the continued use of Google Sheets in subsequent terms. Spring 2017 was the first term to use Google Sheets as the supporting software for the course from day one.


Spring 2018 saw the adoption of Schoology Institutional and the further deepening of integration between the course and Google Sheets via the Google Drive Assignments application in Schoology.

Nemely presenting solo

The basic structure of the final examination has been fairly stable over time, and performance on an item-by-item basis is also fairly stable over time.  

Thirty-three students sat the final examination spring 2018. The table depicts the percent success rate on each item on the final examination across seven terms.


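topic | Fall 2015 | Spring 2016 | Fall 2016 | Spring 2017 | Fall 2017 | Spring 2018 | Fall 2018
n | 0.93 | 0.97 | 0.97 | 0.98 | 0.91 | 1.00
mode | 0.94 | 1.00 | 0.96 | 0.89 | 0.98 | 0.85 | 0.98
median | 0.96 | 0.97 | 0.94 | 0.95 | 1.00 | 1.00 | 1.00
mean | 0.94 | 0.95 | 0.94 | 0.89 | 1.00 | 0.97 | 1.00
min | 0.94 | 0.95 | 0.96 | 0.95 | 0.95 | 1.00 | 1.00
max | 1.00 | 1.00 | 0.96 | 0.97 | 0.98 | 1.00 | 1.00
range | 0.94 | 0.95 | 0.95 | 0.98 | 1.00 | 0.89
quartile 1 | 0.96 | 0.95 | 0.90 | 0.92 | 0.98 | 0.97 | 0.96
quartile 3 | 0.94 | 1.00 | 0.88 | 0.89 | 1.00 | 0.91 | 1.04
boxplot | 0.49 | 0.84 | 0.92 | 0.79 | 0.67 | 0.85
histogram | 0.59 | 0.41 | 0.88 | 0.97 | 0.90 | 0.79 | 0.93
stdev sx | 0.93 | 0.97 | 0.94 | 0.89 | 0.93 | 1.00 | 0.93
st error se | 0.72 | 0.80 | 0.76 | 0.82 | 0.81 | 0.94 | 0.78
t-critical | 0.91 | 0.87 | 0.78 | 0.74 | 0.67 | 0.82 | 0.89
margin E | 0.72 | 0.69 | 0.41 | 0.63 | 0.50 | 0.73 | 0.63
lower95 | 0.59 | 0.44 | 0.31 | 0.42 | 0.71 | 0.52 | 0.39
upper95 | 0.59 | 0.41 | 0.29 | 0.45 | 0.71 | 0.52 | 0.35
n paired | 0.54 | 0.77 | 0.59 | 0.39 | 0.43 | 0.61 | 0.54
slope | 0.93 | 0.95 | 0.71 | 0.82 | 0.83 | 0.91 | 0.93
intercept | 0.93 | 0.74 | 0.78 | 0.84 | 0.83 | 0.88 | 0.87
correlation | 0.91 | 0.69 | 0.71 | 0.66 | 0.93 | 0.91 | 0.87
strength | 0.74 | 0.56 | 0.63 | 0.84 | 0.98 | 0.76 | 0.78
x predict y | 0.44 | 0.33 | 0.39 | 0.30
y predict x | 0.37 | 0.26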

The first twelve questions covered basic one variable statistics. Questions 13 to 17 involved constructing a 95% confidence interval. Questions 18 to 24 covered two variable statistics for paired, dependent data.

Students evidenced strength in calculating basic statistics of both one and two variables.

Performance on inferential calculations for both one and two variables has remained weaker. The existence of a practice test does not appear to have had a marked impact versus prior terms and did not improve the calculation of the lower and upper bounds for a 95% confidence interval. Again this term, as in fall 2017 and spring 2018, students were hung up on the use of plus or minus two standard errors as an approximation of the 95% confidence interval. This approximation is used to introduce t-critical values. The solution would seem to be to omit all mention of plus and minus two standard errors and move from material in chapter eight directly to chapter 9.2, omitting section 9.12. That said, explaining the sudden appearance of Student's t-critical would then be more challenging. "Plus and minus two" is introduced conceptually in chapter 2.4 with unusual z-scores. Finding a way to segue directly from distributions of the mean and the standard error to confidence intervals remains a necessary curriculum modification at this point.
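The difference between the "plus or minus two standard errors" approximation and the t-critical interval can be sketched in a few lines of Python. This is only an illustration, not the spreadsheet workflow used in the course: the sample values are hypothetical, and the t-critical value 2.262 is the standard two-tailed 95% table value for nine degrees of freedom, hard-coded because the Python standard library has no Student's t distribution.

```python
import math
import statistics

# Hypothetical sample of n = 10 measurements (not course data).
sample = [92, 105, 88, 110, 97, 101, 95, 120, 99, 93]

n = len(sample)
mean = statistics.mean(sample)
sx = statistics.stdev(sample)   # sample standard deviation
se = sx / math.sqrt(n)          # standard error of the mean

# Two-tailed 95% t-critical for n - 1 = 9 degrees of freedom (table value).
T_CRITICAL_DF9 = 2.262


def interval(center, margin):
    """Return (lower, upper) bounds for center plus or minus margin."""
    return center - margin, center + margin


approx_ci = interval(mean, 2 * se)          # "plus or minus two" approximation
t_ci = interval(mean, T_CRITICAL_DF9 * se)  # margin E = t-critical * se

print(f"mean = {mean:.2f}, sx = {sx:.2f}, se = {se:.2f}")
print(f"approximate 95% CI: {approx_ci[0]:.2f} to {approx_ci[1]:.2f}")
print(f"t-based 95% CI:     {t_ci[0]:.2f} to {t_ci[1]:.2f}")
```

For a small sample such as this one the t-based interval comes out wider than the approximation, because the t-critical value exceeds two.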

The MS 150 Statistics course fall 2018 consisted of two sections, a total of 58 students enrolled as of term end, 22 females and 18 males. The two sections are kept in curricular synchronization during the term. Both sections covered the same material, worked the same assignments, and gave presentations on the same topics. The sections met at 8:00 and 9:00 on a Monday-Wednesday-Friday schedule.

Performance by section on the final and in the course

Transportation difficulties have historically led to weaker performances by the 8:00 section than by the 9:00 section. This term, however, the 8:00 section's performance was statistically equivalent to that of the 9:00 section, both on the final examination and in overall course averages. The differences seen in the performance were not statistically significant (p > 0.05).
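A comparison of two sections of this kind can be sketched as a test for a difference of means. The report does not say which test was run, so the following is an assumption: a Welch (unequal variance) t statistic computed on hypothetical scores, not the actual section data. Because the standard library cannot compute a t distribution p-value, the sketch relies on the fact that every two-tailed 95% t-critical value is larger than the normal value of 1.96, so a t statistic smaller than 1.96 in absolute value cannot be significant at the 0.05 level.

```python
import math
import statistics


def welch_t(a, b):
    """Return (t statistic, approximate degrees of freedom) for two independent samples."""
    va = statistics.variance(a) / len(a)  # squared standard error, sample a
    vb = statistics.variance(b) / len(b)  # squared standard error, sample b
    t = (statistics.mean(a) - statistics.mean(b)) / math.sqrt(va + vb)
    # Welch-Satterthwaite approximation for the degrees of freedom.
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


# Hypothetical final examination percentages for two sections.
section_8 = [72, 85, 64, 90, 78, 81, 69, 88]
section_9 = [75, 83, 70, 92, 80, 79, 74, 86]

t_stat, df = welch_t(section_8, section_9)
print(f"t = {t_stat:.2f}, df = {df:.1f}")

if abs(t_stat) < 1.96:
    print("Difference is not significant at the 0.05 level for any df.")
else:
    print("Compare |t| against the t-critical value for this df.")
```

When the absolute t statistic is 1.96 or larger, the decision needs the t-critical value or p-value for the computed degrees of freedom, which a spreadsheet or a statistics package can supply.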

The course permits late arriving students to enter the class whenever they arrive. That the two sections performed equivalently suggests support for this approach in an 8:00 section. 


Performance differences by gender in the course and on the final exam

Gender differences were not statistically significant given the small underlying sample size. The differences were small enough to be attributable to random variation.

Course performance over time

The introduction of Schoology Institutional in January 2018 has made it possible to track performance based on student learning outcomes. Prior to January 2018 Schoology Basic permitted the entering of student learning outcomes, but the Basic version does not provide access to the Mastery screen. Once the college adopted the institutional version, however, Mastery data from as far back as the instructor had measured against student learning outcomes became available. Data across five terms is reported for the following three learning outcomes:

1.0 Perform basic statistical calculations for a single variable up to and including graphical analysis, confidence intervals, hypothesis testing against an expected value, and testing two samples for a difference of means.
2.0 Perform basic statistical calculations for paired correlated variables.
3.3 Draw conclusions based on statistical analyses and tests, obtain answers to questions about the data, supported by appropriate statistics

Performance on student learning outcomes over five terms.

Final examination performance on the first course learning outcome, 1.0 basic statistics, has been known to be generally stable around 80% since fall 2012. A success rate of 83.1% for fall 2018 is on par for this particular outcome. The average on the final for questions in this area was 84.8%, in good agreement with the 83.1% result from the student learning outcome data reported by Schoology. Note that the Schoology result is an aggregation of performances on 26 assignments during the term. The agreement with the final examination performance on items serving learning outcome 1.0 is remarkable.

Performance on the second student learning outcome, paired data calculations, has remained stable near 77%. A success rate of 76.1% for fall 2018 is on par for student learning outcome two. The average on the final for questions in this area was 65.2%, not in good agreement with the 76.1% success rate seen on the seven assignments during the course. Underneath that 65.2% success rate on final exam questions serving student learning outcome two is a split distribution. When asked to calculate basic two variable statistics, students had an 80.4% success rate on the final exam. When asked to predict a value based on their linear regression, the success rate plummeted to 28.3%. Overall success on the student learning outcome better reflects the emphasis made during the course on calculating and interpreting the basic linear regression statistics. Using the linear regression to predict values was not a focus in the course, hence the weakness seen on the final examination.
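The two prediction tasks can be sketched as follows. This is a minimal illustration with hypothetical paired data, and it assumes that "x predict y" means evaluating the regression line at a given x, while "y predict x" means solving the same line for x at a given y. The function names are illustrative only.

```python
import statistics


def linear_fit(xs, ys):
    """Return (slope, intercept) of the least squares line through paired data."""
    x_bar, y_bar = statistics.mean(xs), statistics.mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept


def predict_y(x, slope, intercept):
    """Given x, evaluate the regression line: y = slope * x + intercept."""
    return slope * x + intercept


def predict_x(y, slope, intercept):
    """Given y, solve the regression line for x: x = (y - intercept) / slope."""
    return (y - intercept) / slope


# Hypothetical paired data (not course data).
xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 7.8, 10.0]

slope, intercept = linear_fit(xs, ys)
print(f"slope = {slope:.2f}, intercept = {intercept:.2f}")
print(f"predicted y at x = 6: {predict_y(6, slope, intercept):.2f}")
print(f"x that yields y = 5:  {predict_x(5, slope, intercept):.2f}")
```

Both predictions reuse the slope and intercept that students already calculate successfully. The added step is substituting a value into the line or rearranging it.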

Student learning outcome 3.0 (Engage in data exploration and analysis using appropriate statistical techniques including numeric calculations, graphical approaches, and tests) includes specific learning outcomes that unavoidably overlap student learning outcome one material. Students still must demonstrate basic statistical competencies when working on an open data exploration exercise.  Hence only performance on 3.3 is reported because drawing conclusions based on statistical analyses and tests and obtaining answers to questions about the data is the core intent of the third learning outcome.

Performance on outcome 3.3 averages 75%, and this term's 73% average success rate across seven assignments during the course is in line with that longer term average. This learning outcome is not tested on the final examination.

As an educator, I am aware that there is a penchant in education for "continuous improvement." The reality is that there is far more inertia in a value than I suspect those in education comprehend. These success rates tend to be stable over long periods of time and reflect both the difficulty of the material as well as the many reasons students do not succeed on the material. For every student who did not do well, there is a complex back story. Data for success on the final examination demonstrates this longer term stability and the tendency to return to the long term average.



Long term lack of a trend in final examination averages Fall 2005 - Fall 2018:
y-axis does not start at zero nor end at 100! vertical range is exaggerated! 

Over the long haul, regression to the mean is as inescapable in statistics as entropy is in physics. Means return to long term means and those long term means return to even longer term means. And in the world of final examination averages, moving those longer term means is very difficult. Term-on-term fluctuations are almost meaningless and should not be viewed as calls to action.

Since 2012 the final examination percent has moved in a narrow range of plus or minus five percent from 74%. Fall 2017 was slightly up at 79.3%, but this was projected last fall to be a random aberration. Just as there was a multi-term slump in the averages from Fall 2009 to Fall 2011, so can there be multi-term improvements that do not hold over the longer haul. Spring 2018 saw performance return towards the mean, as had been projected the previous fall.
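The tendency of an unusually high term to be followed by a return towards the long term average can be illustrated with a small simulation. This is a sketch under stated assumptions, not a model fitted to the course data: term averages are treated as independent draws around the 74% center mentioned above, and the 2.5 point spread is an arbitrary assumed value.

```python
import random
import statistics

random.seed(150)  # fixed seed so the sketch is repeatable

LONG_TERM_MEAN = 74.0  # percent, the center cited in the text
TERM_NOISE_SD = 2.5    # assumed term-to-term spread, not a fitted value
HIGH_THRESHOLD = 78.0  # what counts as an unusually high term here

# Simulate many independent terms with no underlying trend.
terms = [random.gauss(LONG_TERM_MEAN, TERM_NOISE_SD) for _ in range(20000)]

# Pair each term with the term that follows it, keeping only the high terms.
high_terms = [cur for cur in terms[:-1] if cur > HIGH_THRESHOLD]
followers = [nxt for cur, nxt in zip(terms, terms[1:]) if cur > HIGH_THRESHOLD]

print(f"average of the high terms:      {statistics.mean(high_terms):.1f}")
print(f"average of the following terms: {statistics.mean(followers):.1f}")
```

With no trend built into the simulation, the terms that follow a high term average close to the long term mean, which is the pattern described for spring 2018.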

This term the average on the final exam was 79.1%, nearly identical to the value of a year ago. There are sidewalk conversations that suggest fall term performance exceeds spring term performance, but the long term average for fall term final exams is 75% and for spring term finals is 73%. Over the longer haul the differences disappear.


Long term course average and standard deviation

The course average since 2007 has also remained stable and has tended to remain within four percent of 78%. This term's 79.6% course average remains statistically indistinguishable from the long term mean of 78.1%.  Course averages also do not show a fall to spring difference. The course average in fall is 78.4% and a statistically indistinguishable 78.0% in the spring term.

The standard deviation of the students' individual course averages is also relatively stable around 15% with a slight rise to 18.5% seen spring 2018.

Overall, given a list of numbers and spreadsheet software, students show a strong mastery of basic statistics, good capabilities with linear regressions, and more moderate abilities with confidence intervals and open data exploration.
