### How many marbles on a brainfoggy Monday morning

Just as an illustration of how a 95% confidence interval works, I poured marbles into Le'ah's hands and had the rest of the class estimate the number of marbles, writing their guesses on scraps of paper.

Regina counts Leah's marbles.

I used the guesses to construct a 95% confidence interval for the mean using t-critical and the standard error of the mean.

The 8:00 section's confidence interval, 33 to 60, included the actual number of marbles, 50. The idea of trying this had dawned in my sleepy brain this morning before I had finished my first cup of coffee.

Only as I repeated the exercise at 9:00 did I suddenly realize how horribly flawed my reasoning was. The guesses could only be used to construct a 95% confidence interval for the data, not for the mean. There is no mean number of marbles involved; these are data-level guesses. I would need to use mean ± t-critical × standard deviation, not mean ± t-critical × standard error of the mean, to have any hope of a 95% level of confidence. An interval built on the standard error becomes too narrow as n grows, and the students are guessing not a mean but a single data value: the actual number of marbles.
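Written out, the two intervals being contrasted look roughly like this. The symbols are my own shorthand, not anything from the class handout: $\bar{x}$ is the mean of the guesses, $s$ their sample standard deviation, $n$ the number of guesses, and $t_c$ the two-tailed 95% t-critical value for $n - 1$ degrees of freedom.

$$
\begin{aligned}
\text{interval for the mean:} \quad & \bar{x} \pm t_c \, \frac{s}{\sqrt{n}} \\
\text{interval for the data:} \quad & \bar{x} \pm t_c \, s
\end{aligned}
$$

The first shrinks as $n$ grows while the second does not. As a side note, the usual textbook prediction interval for a single new value is $\bar{x} \pm t_c \, s \sqrt{1 + 1/n}$, which is slightly wider than the second form and approaches it as $n$ gets large.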

At nine o'clock the process went dutifully south, revealing the flawed logic of a foggy morning. The 95% confidence interval for the mean, 36 to 49 marbles, did not include the actual count of 55 marbles. The 9:00 data supports a data confidence interval that runs from 15 to 69 marbles, with 55 well within this range. In fact, the students' guesses in the 9:00 section are reasonably heap-like in their distribution, leptokurtic if anything.

I have visions of students each getting a dollop of marbles and then attempting to capture the population mean number of marbles for the class from the data set, but the capture would be trivially guaranteed, as the class would be the population. I suppose I could give everyone a dollop of marbles, have them count their marbles and write the number on a scrap of paper, collect only a sample of the papers, run the 95% confidence interval, then collect the rest of the papers and calculate the true population mean. That might behave properly. The sample size would have to be reasonable, not so small as to generate an interval so wide that it is meaningless. Things to think about for another term.
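A rough way to rehearse that idea before trying it on a live class is to simulate it. The sketch below is only an illustration under made-up assumptions: a hypothetical class of 30 students, marble counts drawn uniformly between 20 and 60, and samples of 10 papers. The function names are mine, and the t-critical value of 2.262 is the standard two-tailed 95% table value for 9 degrees of freedom, hardcoded so that nothing beyond the Python standard library is needed.

```python
import math
import random
import statistics

# Two-tailed 95% t-critical value for 9 degrees of freedom (a sample of 10).
T_CRIT_DF9 = 2.262


def mean_confidence_interval(sample, t_crit):
    """Return (low, high) for mean ± t_crit * standard error of the mean."""
    n = len(sample)
    mean = statistics.mean(sample)
    sem = statistics.stdev(sample) / math.sqrt(n)
    return mean - t_crit * sem, mean + t_crit * sem


def capture_rate(population, sample_size, t_crit, trials, seed=0):
    """Fraction of sampled intervals that contain the true population mean."""
    rng = random.Random(seed)
    true_mean = statistics.mean(population)
    hits = 0
    for _ in range(trials):
        # Collect only some of the papers, without replacement.
        sample = rng.sample(population, sample_size)
        low, high = mean_confidence_interval(sample, t_crit)
        if low <= true_mean <= high:
            hits += 1
    return hits / trials


# Hypothetical class: 30 students, each with a made-up dollop of marbles.
class_rng = random.Random(42)
class_counts = [class_rng.randint(20, 60) for _ in range(30)]

rate = capture_rate(class_counts, sample_size=10, t_crit=T_CRIT_DF9, trials=2000)
print(f"True class mean: {statistics.mean(class_counts):.1f}")
print(f"Share of intervals capturing the class mean: {rate:.3f}")
```

Because the papers are drawn without replacement from a small class and the interval ignores the finite population correction, the capture rate would be expected to land at or somewhat above 95% rather than exactly on it.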