Assessing learning in introductory statistics

MS 150 Statistics is an introductory statistics course with a focus on statistical operations and methods. The course is guided by the 2007 Guidelines for Assessment and Instruction in Statistics Education (GAISE), the spring 2016 draft GAISE update, and the ongoing effort at the college to incorporate authentic assessment in courses.

In the fall of 2012 the statistics curriculum was adjusted to include a couple of weeks of open data exploration exercises at the end of the term. These exercises were submitted as assignments and marked by the instructor. In spring 2015 the last open data exploration in a set of three was assigned as a presentation to the class. This changed the stakes from an assignment seen only by the instructor to a presentation seen by all of the students in the class. That shift ramped up the level of effort students put into their open data exploration exercises. The arrival of improved presentation technology in fall 2015, in the form of a brighter flat panel display, led to the decision to again have the last open data exploration be a presentation to the class.


By now the benefit of having the students do presentations was clear. In spring 2016 the statistics syllabus was trimmed and compressed yet again, producing a twelve week "traditional" lecture-quiz-test structure followed by three open data exploration presentations in the last three weeks of the course.

Sarah Elma

I often noted to the class during the spring term that the curriculum structure was essentially learning to play the game of statistics for twelve weeks and then playing the game in the final three weeks of the course. The difficulty with this approach is the reality that very little real learning and comprehension occurs in those first twelve weeks. When the students engage in open data exploration beginning in week thirteen, the students are often at a loss as to what to do. During those final three weeks the students demonstrated active engagement with the material.


During the summer of 2016 I sought to integrate a more problem-based learning approach into the course and to find a way to bring the presentations forward in the course. One does not learn a sport by first learning every single rule and strategy. One usually starts by goofing around with the equipment, playing with a few friends, becoming interested in the sport, and then learning the rules, strategies, and nuances. The desire was to have statistics mimic this approach: start by playing the game and cover rules and strategies as the players progress.

The challenge was how to start the term with the students engaged in an open data exploration leading to a presentation on a day one zero knowledge start. While out on a run I realized that what I wanted to do was to walk into class on the first day of class and have the first thing I say be, "Your statistical analysis presentations will be on Friday. Any questions?"

The breakthrough came with learning of the Mars and Murrie option. With a way to start the term on a zero day open data exploration presentation, I resected the curriculum yet again. The goal was to retain the current content of the outline - continue to meet the content goals of the outline and course - while expanding the number of presentations and having the presentations occur throughout the fall 2016 term. There would still be "traditional" lecture style presentation of information and content coverage, but these would be "punctuated" by presentations.

The result would be a term with eight presentations. The data explorations would not all be unguided open data explorations; some would be more tightly guided. The third presentation would be perhaps the most structured and guided. This presentation would have the students explore the drop height versus first bounce height for a superball. The description specified:

Make a table of bounce number versus bounce height. Use this data to build a table, create an xy scattergraph. Can you add a trend line to the graph? What type of trend line appears to produce the best fit? For that trend line can you at least report the correlation? What does the best fit trend line suggest about bouncing - does the ball stop bouncing or do the bounces just get smaller and smaller until you cannot see them? Any idea what kind of equation might predict the bounce height for any bounce number?
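The course itself worked in spreadsheets. Purely as an illustration of where the bouncing prompt can lead, the following minimal Python sketch fits an exponential trend line, height = a * ratio^n, by running a least squares line through the natural log of the heights. The function name and the bounce heights are hypothetical, made up for this sketch, and are not data from the class.

```python
import math


def fit_exponential(bounce_numbers, heights):
    """Fit height = a * ratio**n via least squares on ln(height) versus n.

    Returns (a, ratio, r) where r is the correlation between the bounce
    number and ln(height). Heights must all be greater than zero.
    """
    logs = [math.log(h) for h in heights]
    n = len(bounce_numbers)
    mean_x = sum(bounce_numbers) / n
    mean_y = sum(logs) / n
    sxx = sum((x - mean_x) ** 2 for x in bounce_numbers)
    syy = sum((y - mean_y) ** 2 for y in logs)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(bounce_numbers, logs))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / math.sqrt(sxx * syy)
    # Undo the log transform: ln(h) = intercept + slope * n
    return math.exp(intercept), math.exp(slope), r


# Hypothetical bounce heights in centimeters, invented for illustration only
bounce_numbers = [1, 2, 3, 4, 5, 6]
heights_cm = [80.0, 62.0, 50.0, 39.0, 31.0, 25.0]

a, ratio, r = fit_exponential(bounce_numbers, heights_cm)
print(f"height = {a:.1f} * {ratio:.3f}^n, correlation r = {r:.3f}")
```

A ratio between zero and one means each bounce is a fixed fraction of the one before it, so under this model the bounces shrink toward zero without ever mathematically reaching it, which is the point the last two questions in the prompt are steering toward.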

An end of term survey asked the open question "The activity that contributed most to my learning was…"

Presentations were cited by the students as contributing the most to their learning. Of interest is that when then asked, "The biggest obstacle for me in my learning the material was…" ten students responded that presentations were the biggest obstacle, with at least four of the ten having cited presentations as the activity that contributed the most to their learning. The mixed message appears to be that the presentations engendered learning but were challenging and required more work on the part of the student.

As noted above students were also asked, "The biggest obstacle for me in my learning the material was…" There was a wider diversity of responses to this open question than for the activity that most engendered learning.

As noted above, the presentations were seen as both supporting learning and as being one of the biggest challenges. Analyzing data, making statistical calculations, and the raw mathematics of statistics were the second most often cited obstacles. Use of spreadsheets tied with responses that cited fear as an obstacle to learning. Fear is an unusual response and warrants including those three responses to better understand the source of the fear.

Presenting in front of people.
FEAR... Honestly I always wanted to say or ask something but I dis not do.
I don't know why, and that why i put it there fear.

The first fear is understandable - presenting in front of people can be scary. The second fear citation is from a student who is what might be termed "painfully shy" and experiences fear in regard to asking questions. I am aware of this fear in the classroom, and there are cultural factors that play into this particular fear. While I do sometimes ask open, class wide questions, I also often ask students if they have questions in a one-on-one interaction. The presentations have helped facilitate this approach. Wednesdays of presentation weeks are working days and I can move around the room working with students one-on-one.

The last fear is non-specific and suggests that some students simply exist in a state of fearfulness. I am aware of this and work to make the classroom environment a safe place in terms of the overall atmosphere.

One of my hopes is that as the result of one of my courses a student will be more favorably disposed towards that field of study. If a student successfully completes my class and thinks, "Whew, I am glad that is over, I never want to see that subject ever again!" then I have failed in a very deep way. That failure can have serious social and communal consequences downstream. The rise of "fake news," "science deniers," and a misunderstanding of relative risk in educated populations is in part a consequence of the distaste for science and mathematics engendered in all too many courses.

The post hoc survey asked the students their attitude towards mathematics prior to the course.
Half of the students reported having either a negative or a neutral attitude towards mathematics prior to taking the statistics course.

By term end 74% of the students, three of every four, had a positive attitude towards mathematics. Note that the question was intentionally framed using the wording "mathematics" and not statistics. This was an attempt to disconnect their potential feelings about the specific course or instructor from their responses. This was also done to determine their broader attitude towards mathematics in general.

The textbook was crafted to support the course. Students are not required to bring the textbook to class, and the textbook is available on line. The students were asked whether they read the textbook.

Thirty percent of the responses indicated that the student read the print version. Twice as many responses indicated that the student read the on line version of the text. Note that students were allowed to answer both that they read the print version and the on line version, and some students utilized both.

Given that I rarely saw the textbook in class, I wondered whether the students perceived the textbook as helpful. In response to that question, 90% of the responses were that the student found the textbook helpful. Thus the absence of textbooks in class was likely more a function of the textbook being used on line in class than the textbook being perceived as useless. Note that the class is held in a computer laboratory.

While the affective domain aspects of the students' reactions to the course material and textbook provide valuable insights, ultimately the course must deliver statistical learning outcomes. The three course level student learning outcomes currently guiding MS 150 Introduction to Statistics are:

  • Perform basic statistical calculations for a single variable up to and including graphical analysis, confidence intervals, hypothesis testing against an expected value, and testing two samples for a difference of means.
  • Perform basic statistical calculations for paired correlated variables.
  • Engage in data exploration and analysis using appropriate statistical techniques including numeric calculations, graphical approaches, and tests.

Although some faculty opt to measure these during the term, my own work on the loss of mathematical knowledge among beginning of the term physical science students suggests that in-term measurement of learning could generate inflated success rates. By the end of the term, the students in the fall 2016 run of the statistics course had not had new material presented for over four weeks prior to the final examination. There had been sufficient time for specific learning outcomes knowledge to be lost. Thus an item analysis of the final examination may provide some insight into retained learning.

Forty-eight students sat the final examination. The chart depicts the percent success rate on each item on the final examination. 

The final examination consists of three subsections, each subsection addressing one of the three course learning outcomes. The first 15 questions address the first student learning outcome on basic statistical skills. Questions 16 to 20 address the second learning outcome, analyzing paired data. The last two questions call on the highest level skills: given raw data, determine what statistical test is most appropriate and what the result of that statistical test is. Where questions such as number nine are simply, "Calculate the mean," question 21 presents two sets of wind speed samples and asks, "What is the most appropriate statistical analysis to determine whether the mean ten minute sustained winds were stronger in 2015 than in 1997?" The students are then asked in question 22, "Was the mean ten minute sustained winds in 2015 significantly larger than in 1997?"
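For readers who want to see the arithmetic behind a question like number 21, here is a minimal Python sketch of one plausible approach: a two-sample t-test for a difference of means. The unequal variance (Welch) form is used here as an assumption, since the examination's intended procedure is not spelled out above. The function name and the wind speeds are hypothetical and are not the examination data.

```python
import math
import statistics


def welch_t(sample_a, sample_b):
    """Return (t, df) for a two-sample t-test with unequal variances.

    t is positive when the mean of sample_a is larger than the mean
    of sample_b. df is the Welch-Satterthwaite degrees of freedom.
    """
    mean_a = statistics.mean(sample_a)
    mean_b = statistics.mean(sample_b)
    # Squared standard error of each sample mean (sample variance / n)
    se2_a = statistics.variance(sample_a) / len(sample_a)
    se2_b = statistics.variance(sample_b) / len(sample_b)
    t = (mean_a - mean_b) / math.sqrt(se2_a + se2_b)
    df = (se2_a + se2_b) ** 2 / (
        se2_a ** 2 / (len(sample_a) - 1) + se2_b ** 2 / (len(sample_b) - 1)
    )
    return t, df


# Hypothetical ten minute sustained wind speeds, invented for illustration
winds_2015 = [12.1, 14.3, 11.8, 15.0, 13.6, 12.9]
winds_1997 = [10.2, 11.5, 12.0, 9.8, 11.1, 10.7]

t_stat, dof = welch_t(winds_2015, winds_1997)
print(f"t = {t_stat:.2f} with about {dof:.1f} degrees of freedom")
```

The resulting t would then be compared against a one-tailed critical value for those degrees of freedom, from a t-table or a spreadsheet function, to answer a question in the style of number 22.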

In general the students showed strong levels of learning for basic statistical calculations. The students did have difficulty calculating 95% confidence intervals. The students displayed moderate levels of learning for the paired data calculations. The final hypothesis testing scenario proved the most daunting.

There was effectively no difference in performance by gender in either overall course or final examination percentages.

The MS 150 Statistics course consists of three sections. The three sections are kept in synch during the term. All three sections covered the same material, worked the same assignments, and gave presentations on the same topics. The sections meet at 8:00, 9:00, and 10:00 on a Monday-Wednesday-Friday schedule.

The 8:00 section is impacted by transportation difficulties some students encounter. The students in the 8:00 section tend to have more absences and more frequently arrive late to class. I tend to attribute historically weaker performance in the 8:00 section to the higher rate of absences and late arrivals. There is also potentially a pre-selection factor. More organized students may register earlier and preferentially fill later classes - the 8:00 section is the last to fill during registration. Performance on the final examination by section shows weaker performance on the final at 8:00.

Since the fall of 2012, data has been tracked on the three subsections of the MS 150 Statistics final examination, each subsection addressing one of the course learning outcomes.
Final examination performance by course learning outcome

Performance on the first subsection, basic statistics, has been stable around 80% since fall 2012. On paired data calculations, performance has remained stable at 70%. The third subsection of the final examination cannot be compared on a term-on-term basis. The nature of the rubrics and scoring systems used to mark this section has changed over the terms. Recent prior terms had used an open data exploration exercise answered as an essay question with the analysis marked by a rubric.

This term the existence of eight data exploration presentations provided a rich tapestry of data and insight into student comprehension. By the end of the term an observer could have sat in on the presentations and determined the level of mastery of statistics for each presenter. With the presentation rubrics including student learning outcomes from the outline, by the end of the term I felt I had good data on the third course level student learning outcome. As a result, the final included only two questions in this area and did not demand an essay analysis.

The students were aware that the final examination would not have a significant impact on their grade. The students knew that the presentations carried the most weight in their grade, and that the final examination would weigh in at only slightly more than any one test or any one presentation. High stakes tests do not measure what the student is likely to retain but rather what the student could cram, memorize, regurgitate, and forget. Projects are what students remember, and the presentations are the projects in statistics.

y-axis does not start at zero! vertical range is exaggerated! 

Regression to the mean is as inescapable in statistics as entropy is in physics. Means return to long term means and those long term means return to even longer term means. And in the world of final examination averages, moving those longer term means is very difficult. Term-on-term fluctuations are almost meaningless and should not be viewed as calls to action. Since 2012 the final examination percent has moved in a narrow range of plus or minus four percent from 74%.

The course average since 2007 has tended to remain within four percent of 78%. The standard deviation of the students' individual course averages is also relatively stable around 15%. The amount of internal variation in student scores is fairly consistent term-on-term. 

Eight open or guided data explorations during the term led to presentations. The students were surveyed as to "Which open data exploration and presentation was your favorite?" and "Which open data exploration and presentation was your LEAST favorite?" Students could choose none, specific presentations, or all. The chart depicts the number of students choosing each exploration as favored, the number choosing it as least favored (disfavored), and the net difference, favored minus disfavored. The chart is in descending order of the net values.

The open data explorations and presentations listed on the chart were:

MM: Mars and Murrie candy exploration
EVAW: Ending violence against women: FSM Family Health and Safety survey data
FiboBelly: A hypothesis test as to whether one's belly button is located at the golden ratio
HR: A heart rate before and after exercise data exploration
Bouncing: An exploration of the relationship between the drop height and bounce height for a superball
All: All open data explorations and presentations were checked
Chuuk: An exploration of the relationship between the JHET and teacher attendance in the elementary schools of Chuuk.
Ekiden: A data exploration from the world of exercise sport science, data from an ekiden on Pohnpei. Although not well liked, the presence of a number of track and field athletes in the course this term was the impetus behind including this exploration.
SOC: An exploration of data on soil organic carbon sequestration under nitrogen-fixing and non-nitrogen fixing trees.

Looking ahead to next term, the course will shift to using Google Sheets and Google Slides as the main course software. A sixth edition of the statistics textbook with improved display on mobile platforms has been developed. The new edition also shifts to specifically supporting Google Sheets. This blog article, including all analysis and graphics, was done using ChromeOS on a Chromebook using Google Sheets. The next term will continue to emphasize presentations while retaining content.

