Assessing learning in introductory statistics

MS 150 Introduction to Statistics has utilized an outline based in part on the 2007 Guidelines for Assessment and Instruction in Statistics Education (GAISE), the spring 2016 draft GAISE update, and the ongoing effort at the college to incorporate authentic assessment in courses. The three course level student learning outcomes currently guiding MS 150 Introduction to Statistics are:
  1. Perform basic statistical calculations for a single variable up to and including graphical analysis, confidence intervals, hypothesis testing against an expected value, and testing two samples for a difference of means.
  2. Perform basic statistical calculations for paired correlated variables.
  3. Engage in data exploration and analysis using appropriate statistical techniques including numeric calculations, graphical approaches, and tests.
The first two outcomes cover the students' basic calculation skills and are assessed via an item analysis of the final examination (the examination was originally given as a test inside Schoology.com). Thirty-nine students in two sections took the final examination.

Average success rate based on an analysis of the three sections of the final examination

In the above chart the centers of the yellow topmost circles are located at the average success rate for the students on final examination questions under the first course learning outcome - basic single variable statistics. The chart reports results from 2012 to present. The radii are the standard deviations. The middle blue circles track performance under the second course level learning outcome, paired dependent data. The orange bottom-most circles track performance on the open data exploration and analysis.

The first course learning outcome focuses on basic statistics. Twenty-one questions on the final examination required the students to perform basic single variable statistical calculations on a small sample. Based on the item analysis the average success rate was 78%. Although this is slightly lower than the prior term, the difference was not significant. Performance has remained stable. Over the past four years the average success rate on this material has been 80%.
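For readers unfamiliar with item analysis, the calculation can be sketched as follows. This is a minimal illustration and not the spreadsheet actually used for the course: the function name `item_success_rates` and the small marks table are hypothetical, and each response is assumed to be marked simply as correct (1) or incorrect (0).

```python
from statistics import mean

def item_success_rates(responses):
    """Return the fraction of students answering each item correctly.

    `responses` is a list of per-student rows; each row holds 1 (correct)
    or 0 (incorrect) for every item, in the same item order.
    """
    n_students = len(responses)
    # zip(*responses) regroups the per-student rows into per-item columns.
    return [sum(column) / n_students for column in zip(*responses)]

# Hypothetical marks for four students on three items (not course data).
responses = [
    [1, 1, 0],
    [1, 0, 0],
    [1, 1, 1],
    [1, 1, 0],
]

rates = item_success_rates(responses)  # one success rate per item
average_rate = mean(rates)             # average success rate across the items
```

The average success rate for a learning outcome is then the mean of the item rates for the questions mapped to that outcome.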

Success rates on individual final exam items for 39 students

Performance on the second course learning outcome, linear regression statistics, was measured by six questions on the final examination. The average success rate on this section was 68%. The four-year average is 69%. Student success on linear regressions has remained lower than success rates on basic statistical calculations. This performance is also stable.

Performance on the third course learning outcome, open data exploration and analysis, is not comparable term-on-term because the scoring system for the open data exploration section of the final examination varies from term to term. Performance is always weaker on this open data exploration and analysis section than on the first two learning outcomes. Students perform strongly when asked to calculate a specific statistic, but struggle when given raw data and open-ended questions about the data. The students responded to this section with a single essay question set up using Schoology. This one question was then marked by the instructor.

The 51% student success rate seen on the third learning outcome this term represents the number of students who arrived at a fully or partially correct analysis of the data presented. Of the 39 students who took the final examination, only five made a fully correct analysis of the data, measuring the means and running a test for a significant difference in those means. The open data exploration this term explored two samples where the optimal solution would have been to calculate the means and then test for a significant difference in the means.
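The optimal solution described above, comparing two sample means and testing the difference, can be sketched as follows. This is only an illustration: the two samples are hypothetical and not the examination data, the function name `welch_t` is made up for this sketch, and the unequal-variance (Welch) form of the t-test is an assumption, since the post does not say which form of the two-sample test the course uses.

```python
from math import sqrt
from statistics import mean, variance

def welch_t(sample_a, sample_b):
    """Return (t statistic, approximate degrees of freedom) for two samples."""
    n_a, n_b = len(sample_a), len(sample_b)
    # Squared standard error contributed by each sample.
    se2_a = variance(sample_a) / n_a
    se2_b = variance(sample_b) / n_b
    t = (mean(sample_a) - mean(sample_b)) / sqrt(se2_a + se2_b)
    # Welch-Satterthwaite approximation for the degrees of freedom.
    df = (se2_a + se2_b) ** 2 / (se2_a ** 2 / (n_a - 1) + se2_b ** 2 / (n_b - 1))
    return t, df

# Hypothetical samples, not the examination data.
sample_a = [12, 15, 11, 14, 13]
sample_b = [10, 9, 12, 8, 11]
t, df = welch_t(sample_a, sample_b)
```

The t statistic is then compared with the critical value for the computed degrees of freedom, or converted to a p-value, to decide whether the difference in the means is significant.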

Breakdown of solution quality for open data exploration on the final examination

Another five students reported the means and noted that the means differed but did not then run a t-test for a difference of means. Ten cited the correct data set as having a higher mean, but either cited inappropriate statistics or no statistical support.

Fifteen students obtained an incorrect solution, often without providing statistical support for that solution. A few of these students noted as evidence the greater number of highest scores in the data sample that actually had the lower overall average.
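The counts above can be checked against the reported 51% success rate. The sketch below assumes that the success rate combines the three groups that identified the correct data set; the category labels are paraphrases of the text, not labels used in the course.

```python
students = 39  # students who took the final examination

# Counts reported in the text for the open data exploration question.
breakdown = {
    "fully correct analysis": 5,
    "means reported, no t-test": 5,
    "correct data set, weak or no support": 10,
    "incorrect solution": 15,
}

# Students with a fully or partially correct analysis.
fully_or_partially_correct = sum(
    count for label, count in breakdown.items() if label != "incorrect solution"
)
success_rate = fully_or_partially_correct / students

# Students who do not fall into any of the categories described in the text.
not_described = students - sum(breakdown.values())
```

Under that assumption 20 of the 39 students are counted as successful, which rounds to 51%. The four categories sum to 35, so four students are not placed in any category described in the text.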

Overall success rate on the final examinations has been exceptionally stable over the past three years, and generally stable for the past decade. The long-term average success rate is 73.6%; the current term saw a 75.7% success rate on basic and linear regression statistics.

Final examination average since 2005

In an educational world where a common goal is "continuously improving" best practices, the inert stability of the success rate above might be seen as a failure to continuously improve. The effort to continuously improve mathematics education overall goes back not to the new math of the 1960s but much, much further. Ultimately there are long-term average success rates, and statistics assures us that numbers tend to return to long-term averages. A look at the running cumulative mean success rate on the final examination since 2005 suggests that the longer-term mean to which terms return might be improving, but even this statistic is subject to a tendency to return to an even longer-term mean.
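A running cumulative mean of this kind is simply the mean of all terms up to and including the current one. A minimal sketch, in which the function name `running_mean` and the term values are hypothetical and are not the course data:

```python
from itertools import accumulate

def running_mean(values):
    """Return the cumulative mean after each successive term."""
    totals = accumulate(values)  # running totals, one per term
    return [total / count for count, total in enumerate(totals, start=1)]

# Hypothetical term-by-term success rates in percent (not the course data).
term_rates = [70.0, 76.0, 73.0, 77.0]
cumulative = running_mean(term_rates)
```

Each new term moves the cumulative mean less than the term before it did, which is why the running mean changes slowly even when an individual term is unusually high or low.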

Note that the y-axis does not start at zero: exaggerated vertical scale


In general students who complete the course are able to successfully make basic statistical calculations on 73% of the questions posed.

The course average over time includes performance on homework, quizzes, and tests. Course level performance underlies course completion rates. Data on course level performance is available from 2007 forward.

Course average over the past eight years


The course-wide average has a long-term mean of 77.6%; the current term average is 73%. The radius of each circle is proportional to the standard deviation of the student averages in all three sections of the course. The standard deviation is fairly constant over time at about 15%.

This term the open data exploration exercises were each capped off with a presentation rather than a quiz. Performance on the open data explorations was marked using rubrics. Each rubric consisted of five criteria and generated up to twenty points. Three criteria were content oriented, one focused on the presentation software, and the final criterion focused on the presenter.


| Criteria | 4 Excellent | 3 Good | 2 Satisfactory | 1 Needs improvement |
| --- | --- | --- | --- | --- |
| Basic statistics: Appropriate basic statistics calculated correctly and reported meaningfully. | All appropriate statistics reported in a meaningful manner | Appropriate basic statistics reported and cited in report | Some basic statistics reported | A few basic statistics cited. |
| Nitrogen storage: Do nitrogen fixing trees store significantly more carbon in the soil than non-nitrogen fixing trees? | Answer is correct and supported by a fully appropriate statistical analysis | Answer is correct supported by statistics which do not provide evidence that answer is correct | Answer is correct but unsupported by numeric values | Result is incorrect. |
| Strength of the difference: How strong is the effect size for this study? | Answer is correct and supported by fully appropriate statistical analysis | Answer is correct supported by statistics which do not provide evidence that answer is correct | Answer is correct but unsupported by numeric values | Result is incorrect. |
| Presentation software: Original work submitted as presentation software, presentation is appropriate to the material and subject matter, presentation generally follows guidelines for a good presentation. | Presentation that heeds general presentation guidelines, avoids distracting visual extras, and is appropriate to the subject matter | Presentation with only a few areas in which the presentation as a visual aid could be improved | Presentation with more than a few issues. Transitions distract from the content, timing is inappropriate, or other issues such that the visual aid becomes a distraction | Submission of a spreadsheet or other fundamental fault in the submission. |
| Presentation mechanics: Presenter delivered clearly, concisely, demonstrated familiarity with the contents. | Well delivered exhibiting preparation and knowledge of the presentation. Spoke clearly and always towards the audience | Presenter showed evidence of preparation and some familiarity with the content of presentation. Usually faced the audience | Presenter was able to read the slides, sometimes with their back to the audience | Little evidence of preparation, unfamiliar with the slide contents, spoken facing the display panel. |
A typical presentation rubric

The open data exploration assignments were structured as assignments in the Schoology learning management system. Students had to submit by midnight on the day prior to the presentation; the assignment locking system in Schoology enforced this deadline.

Schoology assignment editing screen, locking set at the bottom

The average score for the presentations was 16.12 points, which is 80.6% of 20 possible points. 
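The scoring arithmetic behind that figure can be sketched as follows. The criterion names are taken from the typical rubric shown earlier, and two of them are specific to one exercise. The function name `rubric_total`, the marks for the single example presentation, and the assumption that every criterion is marked from 1 to 4 are illustrative only.

```python
CRITERIA = (
    "Basic statistics",
    "Nitrogen storage",
    "Strength of the difference",
    "Presentation software",
    "Presentation mechanics",
)
MAX_PER_CRITERION = 4
MAX_TOTAL = MAX_PER_CRITERION * len(CRITERIA)  # five criteria, up to 20 points

def rubric_total(marks):
    """Sum one presentation's marks after checking each is between 1 and 4."""
    for criterion in CRITERIA:
        if not 1 <= marks[criterion] <= MAX_PER_CRITERION:
            raise ValueError(f"{criterion}: mark out of range")
    return sum(marks[criterion] for criterion in CRITERIA)

# Hypothetical marks for one presentation (not a real student's scores).
example = dict(zip(CRITERIA, (4, 3, 3, 4, 3)))
example_total = rubric_total(example)

# The reported class average of 16.12 points expressed as a percentage of 20.
average_percent = 16.12 / MAX_TOTAL * 100
```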

Students presenting in MS 150 Statistics

The presentations were downloaded as a batch using the download all functionality of Schoology.


The students then presented using native Microsoft PowerPoint or LibreOffice.org Impress software. 


The item analysis of the twenty-seven final examination questions also provides insight on the success rate against the two general education program learning outcomes served by the course.

| Program Learning Outcomes | PLO | PLO sum | PLO n | PLO% |
| --- | --- | --- | --- | --- |
| 3.1 Demonstrate understanding and apply mathematical concepts in problem solving and in day to day activities | 3.1 | 14.000 | 17 | 82.4 |
| 3.2 Present and interpret numeric information in graphic forms | 3.2 | 6.436 | 10 | 64.4 |
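The percentages in the table can be reproduced from the other two columns. The sketch below assumes that "PLO sum" is the sum of the item success rates for the questions mapped to each outcome and that "PLO n" is the number of such questions; the column meanings are not stated in the table itself.

```python
# (sum of item success rates, number of items) for each program learning
# outcome, as listed in the table.
plo = {"3.1": (14.000, 17), "3.2": (6.436, 10)}

# Success rate per outcome, in percent.
plo_percent = {key: 100 * total / n for key, (total, n) in plo.items()}

# Combined success rate across all items mapped to the two outcomes.
total_sum = sum(total for total, _ in plo.values())
total_items = sum(n for _, n in plo.values())
overall_percent = 100 * total_sum / total_items
```

Under that reading the two outcomes cover all twenty-seven questions, and the combined rate rounds to 75.7%, which matches the overall success rate reported for the final examination this term.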

Overall performance remains stable in this mature but evolving course. 
