### Data Exploration Exercise in Statistics class

Over the years I have learned that when given data and asked to calculate the minimum, maximum, mode, median, mean, standard deviation, standard error of the mean, t-critical for an alpha of five percent, the margin of error for the mean, and a confidence interval, the students are generally capable of doing so. For example, when asked to calculate the mean for a set of 20 data values, 74 of 78 students were able to do so on the final examination in fall term 2012. The four who failed to correctly calculate the mean appeared to have made data entry errors rather than to have a fundamental inability to calculate that value.
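As a rough illustration of that list of calculations, here is a minimal Python sketch. The 20 data values are invented for the example and are not from the final examination, and the t-critical value is read from a standard t table rather than computed. The course itself may well have used a spreadsheet; Python is simply an assumption made here.

```python
import math
import statistics

# Hypothetical sample of 20 values (invented, not data from the course).
data = [12, 15, 11, 14, 15, 18, 13, 15, 16, 12,
        14, 17, 15, 13, 19, 14, 16, 15, 12, 14]

n = len(data)
minimum = min(data)
maximum = max(data)
mode = statistics.mode(data)
median = statistics.median(data)
mean = statistics.mean(data)
stdev = statistics.stdev(data)   # sample standard deviation (divides by n - 1)
sem = stdev / math.sqrt(n)       # standard error of the mean

# Two-tailed t-critical for alpha = 0.05 with n - 1 = 19 degrees of freedom,
# taken from a standard t table.
t_critical = 2.093

margin = t_critical * sem        # margin of error for the mean
ci_low, ci_high = mean - margin, mean + margin

print(f"n={n} min={minimum} max={maximum} mode={mode} median={median}")
print(f"mean={mean:.2f} stdev={stdev:.3f} SE={sem:.3f}")
print(f"95% confidence interval: {ci_low:.2f} to {ci_high:.2f}")
```

Each line is a single, named calculation, which is exactly the kind of task the students handle well.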

When given only raw data and asked to determine the most appropriate analysis, the students have more difficulty. This judgment is based in part on projects that the students completed during previous terms. The students gathered data of their choosing and analyzed that data. Over a number of years the projects indicated difficulty with understanding what the appropriate statistical procedure would have been for their data. I also would receive emails from alumni now and again that typically began, "My boss wants me to analyze this data, what do I do?" The alumni knew how to calculate various statistics, but when facing only data did not know what statistics to calculate.

A statistics project was first added in fall 2008. At that time the project was a single, term-long project with submission of multiple drafts during the term. This did not result in the students gaining any insight into analyzing data beyond the specific type they had collected.

In fall 2011 I shifted to a series of "mini-projects," each focusing on a different area in statistics. These quickly proved to be formulaic and did not lead to the students knowing what type of analysis to run when not specifically told what type to run.

During spring 2012 I realized that I wanted to provide more opportunities to work with raw data found in the wild with less instructional scaffolding. I envisioned this happening at the end of the term so that we, as a class, could reach back and pull out any of the statistical tools we had encountered: basic statistics, regressions, confidence intervals, and t-tests for a difference of sample means.

This required, however, trimming the curriculum to provide space for data exploration and open-ended analysis. Calculating the mean from a frequency distribution was dropped from the curriculum, as were the normal distribution and the associated normdist and norminv functions. Coverage of probability was trimmed. The projects/mini-projects were also dropped. Box plots and quartile calculations were added to the curriculum to increase the number of visual tools that would be available at term end. These changes provided time for unstructured data exploration and analysis while retaining the key statistical concepts.

Cheryl balances while Delinda times

The result was a couple of weeks of open time at the end of the fall 2012 term. During this period three data sets were explored. The first set of data was accompanied by questions that required only basic statistics. The second set of data was best analyzed by regression analysis. The students were not told this, but were again simply given open-ended questions to answer. The third set called for a paired t-test for the means to test whether balance was better on the dominant foot than the non-dominant foot.
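For readers who want to see what the balance analysis involves, the following is a small sketch of a paired t-test in Python. The balance times are invented for illustration and are not the class data. The choice of a two-tailed test at an alpha of five percent is also an assumption; a directional question such as this one could justify a one-tailed test instead.

```python
import math
import statistics

# Hypothetical balance times in seconds for eight people (invented values,
# not the class data). Each position is the same person on each foot.
dominant     = [31.2, 18.5, 44.0, 25.3, 12.9, 37.6, 22.4, 29.8]
non_dominant = [27.0, 19.1, 38.5, 21.7, 13.4, 30.2, 20.9, 26.3]

# A paired test works on the per-person differences, not the two columns.
diffs = [d - nd for d, nd in zip(dominant, non_dominant)]
n = len(diffs)
mean_diff = statistics.mean(diffs)
se_diff = statistics.stdev(diffs) / math.sqrt(n)   # standard error of the mean difference
t_stat = mean_diff / se_diff

# Two-tailed t-critical for alpha = 0.05 with n - 1 = 7 degrees of freedom,
# taken from a standard t table (assumed two-tailed; see the note above).
t_critical = 2.365

print(f"mean difference = {mean_diff:.2f} s, t = {t_stat:.2f}")
if abs(t_stat) > t_critical:
    print("Reject the null hypothesis of no difference in mean balance time.")
else:
    print("Fail to reject the null hypothesis.")
```

The pairing matters because each person supplies both measurements, so person-to-person variation in balance cancels out in the differences.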

Joan balances on one foot, eyes closed, while Janice times

Getting the students to come up with statistical approaches to the raw data presented was like pulling embedded molars. Encouraging them to work with neighbors in small groups did not help. At one level I realized I was asking a lot - they had only just learned the statistical concepts I now wanted them to apply without my guidance. I was pushing high into Bloom's taxonomy looking for synthesis, analysis, evaluation skill sets using statistical tools the students did not fully understand.

Richinia balances

On the final examination the students faced 33 specific statistical questions about two different sets of data. The average of 24.9 (75%) is in line with historical performance levels, if not slightly above the long term average of 73%.

A second section of the final examination presented raw data and the "story" behind the data. No questions were directly asked as any direct question would suggest a statistical approach to the data. To ask whether the two samples differed as to their means would immediately suggest a two sample t-test. The students were told, "The goal of Resh, Binkley, and Parrotta was to determine whether nitrogen fixing trees were able to store (sequestrate) more carbon in the soil than the non-nitrogen fixing trees. If true, then nitrogen fixing trees would remove more overall carbon from the atmosphere. This would reduce the green house effect and help slow down global warming."

"The data is the gain or loss of soil organic carbon (SOC) in grams per meter squared per year for the soil associated with an individual tree. The greater the gain (positive values), the better for the purposes of this study."

Thirty-six students (46%) simply left this section blank. Fifteen students cited statistics but drew no conclusion regarding whether nitrogen fixing trees were able to sequester more carbon in the soil than the non-nitrogen fixing trees. Eighteen students made statistically relevant calculations such as calculating the mean or the sum (net gain/loss of soil organic carbon) and used the resulting differential to render a decision. A difference in the means or sum was taken as evidence that the nitrogen fixing trees sequestered more carbon. A few of these students noted a difference but then said the difference was not significant, although no evidence of the lack of significance was offered.

Only nine students framed the question as a hypothesis test with a null and alternate hypothesis. Of these, six then cited only a difference in means. Two students ran a paired t-test for a difference in the means. The two samples were intentionally designed to have the same sample size, which opened up more statistical options that the students could have explored. Only one student out of 78 set up a hypothesis test, ran an independent t-test for two samples, and then correctly rejected the null hypothesis at an alpha of five percent.
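The approach that one student took can be sketched as follows. The numbers are invented for illustration and are not the Resh, Binkley, and Parrotta data, so the outcome here says nothing about the actual study. The sketch also assumes a pooled-variance (equal variance) form of the independent t-test and a two-tailed critical value from a t table; the examination may have expected a different variant.

```python
import math
import statistics

# Hypothetical SOC changes in g/m^2/year (invented values, not the
# Resh, Binkley, and Parrotta data).
n_fixing     = [52, 31, -8, 64, 40, 27, 58, 19, 45, 36]
non_n_fixing = [12, -15, 22, 5, -30, 18, -4, 9, 26, -11]

n1, n2 = len(n_fixing), len(non_n_fixing)
mean1, mean2 = statistics.mean(n_fixing), statistics.mean(non_n_fixing)
var1, var2 = statistics.variance(n_fixing), statistics.variance(non_n_fixing)

# H0: the population means are equal. H1: the population means differ.
# Pooled variance assumes both groups share a common variance (an assumption).
pooled_var = ((n1 - 1) * var1 + (n2 - 1) * var2) / (n1 + n2 - 2)
se = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
t_stat = (mean1 - mean2) / se

# Two-tailed t-critical for alpha = 0.05 with n1 + n2 - 2 = 18 degrees
# of freedom, taken from a standard t table.
t_critical = 2.101

print(f"means: {mean1:.1f} vs {mean2:.1f}, t = {t_stat:.2f}")
if abs(t_stat) > t_critical:
    print("Reject the null hypothesis at an alpha of five percent.")
else:
    print("Fail to reject the null hypothesis.")
```

The step most students skipped is the comparison at the end: a difference in the means is only a starting point, and the t-statistic is what indicates whether that difference is larger than sampling variation alone would explain.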

A statistics instructor might hope that students would be able to tackle raw data and make statistical sense of the data at the end of a statistics course. Clearly the students do not have this ability as a result of a first course in statistics. The data exploration work done this past term has been a pilot run, an experiment in authentic assessment in statistics. The data exploration exercises have done more than the projects to bring a focus on handling raw data. The data exploration exercises will be repeated spring 2013 with the benefit that the instructor has done this once before.