The five marbles exercise started as a confidence interval and hypothesis testing exercise back in 2017. By 2019 I had realized that the exercise could also illustrate the standard error of the mean. In 2020 I realized that there was a vulnerability in the model: if the groups did not all have the same number of students, then the exercise as done would fail to demonstrate that the mean of the sample means equals the population mean.
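A minimal sketch of that vulnerability, using made-up cup counts rather than class data. The function names and the two small data sets are assumptions for illustration only. When every group has the same number of cups, the mean of the group means matches the overall mean; when the group sizes differ, it can drift away.

```python
from statistics import mean

def mean_of_group_means(groups):
    """Average the per-group means, giving every group equal weight."""
    return mean(mean(group) for group in groups)

def population_mean(groups):
    """Average over every cup, ignoring group boundaries."""
    return mean(count for group in groups for count in group)

# Hypothetical marble counts per cup (not class data); 30 marbles in 6 cups each time.
equal_groups = [[1, 5, 6], [7, 5, 6]]    # two groups of three cups
unequal_groups = [[1, 5], [6, 7, 5, 6]]  # one group of two cups, one of four

for label, groups in [("equal", equal_groups), ("unequal", unequal_groups)]:
    print(label,
          "mean of means:", mean_of_group_means(groups),
          "population mean:", population_mean(groups))
```

With the unequal groups, the small group's low mean counts as much as the large group's mean, which is what pulls the mean of the means away from five.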
The challenge this term is that attendance has collapsed to the point where the number of students attending class can be five or fewer. As noted in the linked blog, attendance is not correlated with performance, as the course is now a true hybrid blend of online and residential instruction options for the students.
The solution was to make each student their own group, giving them each five cupcake cups with five marbles each.
Then the students were to trade marbles with each other: give away marbles, receive marbles. I already knew that if a student neither gave nor received any marbles, then their sample mean would remain five. They would still have 25 marbles and 5 cups. Since I wanted variation in the sample means, I encouraged them to give away marbles to other students.
Vince exchanging marbles
Lizbethlay gave away a cup, which would have left the groups unequal in size, so I had to add a caveat: give away marbles, not cups. Only four students came to class, so there were only 100 marbles in circulation.
Before the exchange, every cup held five marbles. After the exchange, the number of marbles per cup varied from 0 to 11. Each of the four students still had five cups. Each student calculated the sample mean for their five cups, yielding four sample means. Note that the sample means distribute more narrowly than the data. The average of the sample means, however, remains five.
As hoped for, and as predicted, the standard deviation of the sample data (the number per cup) is larger than the standard deviation of the four sample means. The sample means distributed more narrowly.
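The exchange can be sketched as a small simulation. This is only an illustration under a loud assumption: marbles here move at random, one at a time, whereas the classroom exchange was driven by the students' own choices. The function name, the number of moves, and the seed are all hypothetical. Because marbles only move between cups and every student keeps five cups, the total stays at 100 and the mean of the four sample means stays at five, whatever the trades.

```python
import random
from statistics import mean, stdev

def simulate_exchange(n_students=4, cups_each=5, marbles_per_cup=5,
                      n_moves=200, seed=1):
    """Move single marbles at random between students' cups (an assumed model)."""
    rng = random.Random(seed)
    students = [[marbles_per_cup] * cups_each for _ in range(n_students)]
    for _ in range(n_moves):
        giver, receiver = rng.sample(range(n_students), 2)  # two different students
        from_cup = rng.randrange(cups_each)
        if students[giver][from_cup] == 0:
            continue  # an empty cup has nothing to give
        to_cup = rng.randrange(cups_each)
        students[giver][from_cup] -= 1
        students[receiver][to_cup] += 1
    return students

students = simulate_exchange()
all_cups = [count for cups in students for count in cups]
sample_means = [mean(cups) for cups in students]

print("marbles in circulation:", sum(all_cups))
print("sample means:", sample_means)
print("mean of sample means:", mean(sample_means))
print("sd of the 20 cups:", stdev(all_cups))
print("sd of the 4 sample means:", stdev(sample_means))
```

Changing the seed changes the individual sample means and both standard deviations, but not the total or the mean of the sample means.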
The sample means also distributed slightly more symmetrically than the data distribution. That is the prediction: if the samples are good random samples each with a "sufficient" sample size, and if there are "enough" samples, then the sample means distribute normally around the population mean. The class sizes used for the histogram above included five in the class with a class upper limit of six.
Note that if each student had calculated the standard error of the mean based on the five cups in front of them after the exchange, then they would each have obtained a different estimate of the standard error of the mean. The average of the four sample standard errors is 0.78. If one considers all twenty cups as the sample, the standard error of the mean obtained is 0.53.
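A minimal sketch of the two calculations, using the usual formula of sample standard deviation divided by the square root of the sample size. The cup counts below are invented for illustration and are not the class data, so the results will not match the 0.78 and 0.53 reported above.

```python
from math import sqrt
from statistics import mean, stdev

def standard_error(sample):
    """Standard error of the mean: sample standard deviation / sqrt(n)."""
    return stdev(sample) / sqrt(len(sample))

# Hypothetical marble counts per cup after an exchange (NOT the class data).
students = {
    "A": [3, 6, 4, 7, 5],
    "B": [8, 11, 6, 7, 9],
    "C": [2, 0, 4, 3, 5],
    "D": [4, 5, 3, 6, 2],
}

# Each student estimates the standard error from their own five cups.
per_student_se = {name: standard_error(cups) for name, cups in students.items()}

# Treating all twenty cups as a single sample gives one pooled estimate.
all_cups = [count for cups in students.values() for count in cups]
pooled_se = standard_error(all_cups)

print("per-student standard errors:", per_student_se)
print("average of the four:", mean(per_student_se.values()))
print("all twenty cups as one sample:", pooled_se)
```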
If each student then constructed a 95% confidence interval for the population mean, three students would capture the population mean and one would not. The second sample has a t-statistic of 3.1 and a p-value of 0.03.
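A sketch of how one student's interval could be built from five cups. The two samples are hypothetical, not the class data, and the helper names are assumptions. The critical value 2.776 is the two-tailed 95% t-critical value for four degrees of freedom, hard-coded here to keep the example within the standard library.

```python
from math import sqrt
from statistics import mean, stdev

T_CRITICAL_DF4 = 2.776  # two-tailed 95% t-critical value for n = 5 (df = 4)
POPULATION_MEAN = 5     # every cup started with five marbles

def confidence_interval_95(sample):
    """95% confidence interval for the mean of a five-cup sample."""
    if len(sample) != 5:
        raise ValueError("the hard-coded critical value assumes five cups")
    margin = T_CRITICAL_DF4 * stdev(sample) / sqrt(len(sample))
    return mean(sample) - margin, mean(sample) + margin

def t_statistic(sample, mu=POPULATION_MEAN):
    """Distance of the sample mean from mu, measured in standard errors."""
    return (mean(sample) - mu) / (stdev(sample) / sqrt(len(sample)))

def captures(sample, mu=POPULATION_MEAN):
    """True when the interval contains mu."""
    low, high = confidence_interval_95(sample)
    return low <= mu <= high

# Hypothetical samples (NOT the class data).
for cups in ([3, 6, 4, 7, 5], [8, 11, 6, 7, 9]):
    print(cups, confidence_interval_95(cups),
          "t =", t_statistic(cups), "captures 5:", captures(cups))
```

An interval misses the population mean exactly when the absolute value of the t-statistic exceeds the critical value, which is why the interval and the hypothesis test agree.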
The model helps demonstrate the narrower nature of the sample mean distribution. The non-random nature of the marble exchange may be problematic from the perspective of capturing the population mean. Ultimately, frequentist approaches to inference are unable to take into account the social motivations of those sorting the marbles. That said, a Bayesian approach would have to account for all possible social motivations as priors and would be cumbersomely complex for a beginning statistics course. Frequentist statistics has an intuitive appeal that can get lost in a forest of possible priors and calculations in a Bayesian analysis.
The board at class end.