### Favorite colors

Each term, when chapter three, section two (nominal level histograms) rolls around in statistics, I wake up early in the morning and put on a blue shirt over black pants, with a green t-shirt underneath the blue shirt. At the start of class I ask the students to write their favorite color on a scrap of a blank three-by-five card. Then I tally the data on the board. The favorite color data is always a good example of qualitative, nominal level, discrete data with a countable number of outcomes. By countable I mean that, at present, only 15 colors have been named by the 1099 students in my MS 150 statistics class since 2007.

The favorite color data is held in a Google Docs spreadsheet. Over the past nine years blue has remained just ahead of black as a favorite color in terms of frequency order. The gap has been small but persistent. During some routine updating of links I discovered that there were two favorite color spreadsheets. Sometime in 2010 I began working in a separate copy, kept using it until 2011, and then that version was orphaned. When I re-integrated the missing 2010 and 2011 data, the gap between blue and black all but vanished. Blue was the favorite color of 240 students, black of 239 students.

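| Favorite color | Freq f | Rel freq |
|---|---|---|
| Blue | 240 | 22% |
| Black | 239 | 22% |
| Green | 158 | 14% |
| Red | 131 | 12% |
| Brown | 101 | 9% |
| White | 62 | 6% |
| Purple | 54 | 5% |
| Gray | 33 | 3% |
| Pink | 29 | 3% |
| Yellow | 30 | 3% |
| Orange | 7 | 1% |
| Maroon | 7 | 1% |
| Aqua | 4 | 0% |
| Yellow-green | 3 | 0% |
| Indigo | 1 | 0% |
| Total | 1099 | 100% |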
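For anyone who wants to check the relative frequency column, here is a minimal sketch in Python. The frequencies are copied from the table; the names `freq`, `n`, and `rel_freq` are my own labels for illustration, not anything that lives in the spreadsheet.

```python
# Frequencies copied from the favorite color table.
freq = {
    "Blue": 240, "Black": 239, "Green": 158, "Red": 131, "Brown": 101,
    "White": 62, "Purple": 54, "Gray": 33, "Pink": 29, "Yellow": 30,
    "Orange": 7, "Maroon": 7, "Aqua": 4, "Yellow-green": 3, "Indigo": 1,
}

# Sample size: the sum of the frequencies.
n = sum(freq.values())

# Relative frequency is the frequency divided by the sample size.
rel_freq = {color: f / n for color, f in freq.items()}

# Print each color with its frequency and rounded percentage.
for color, f in freq.items():
    print(f"{color:<13}{f:>5}{rel_freq[color]:>6.0%}")
print(f"{'Total':<13}{n:>5}{sum(rel_freq.values()):>6.0%}")
```

Rounding to whole percentages is why blue and black both show 22% even though blue leads by one student.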

The data is recorded primarily section by section, but for some terms the sections are combined. There are 43 of these section or term columns. Bearing in mind that my shirt selection is based on blue being number one over the long haul, I checked to see how often I was winning with a blue shirt. Of course, if blue was not number one, black was, and hence my pants and shirt combination allowed me to say "The top two selections are blue and black, as are my clothes. I had the order wrong, but I knew the top two would be blue and black."

The green shirt is a backup for green at number three. In small sample sizes anything can take first rank. Green and red are the two that most often take first rank in a small sample. Note that aqua is used in lieu of "blue-green", which is what the students write. I use the aqua label to include blue-greens, teals, aquamarines, cyans, and turquoises, although the latter three have never come up to the best of my recollection. The logic is that these are all at a color angle of roughly 180 degrees. Students who have had SC 130 Physical Science encounter the secondary colors of light in that class: yellow, cyan, and magenta. I have also put the occasional baby blue into the blues. Maroon is the color of the uniform of the public high schools here on Pohnpei.

When I looked at blue wins, I found that blue actually exceeds black in only 16 of the 43 columns, or 37% of the time. Twenty-seven times black is equal to or larger than blue. This suggests I could "win" outright more often by running with a black shirt over blue pants.

| Comparison | Columns | Percent |
|---|---|---|
| blue > black | 16 | 37% |
| blue <= black | 27 | 63% |
| Total | 43 | |

In the case of a tie I can wiggle and weasel my way around my betting on blue over black by saying that since it was a tie I could not have won (except to perhaps wear a shirt that is half blue and half black?). The magic is in predicting that the class will choose blue and then getting that right and noting that I made this prediction when I got dressed in the morning. Ties dilute the magic.

In fact, if weaseling around the tie is accepted as a blue win, the choice of shirt color becomes a statistical dead heat.

| Comparison | Columns | Percent |
|---|---|---|
| blue >= black | 22 | 51% |
| blue < black | 21 | 49% |
| Total | 43 | |
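The two win tallies can be untangled into outright blue wins, ties, and outright black wins. The sketch below is in Python and assumes that both tallies count the same 43 columns; the variable names are mine, chosen only for illustration.

```python
# Counts taken from the two win tallies.
columns = 43            # section or term columns
blue_wins = 16          # columns where blue > black
blue_wins_or_ties = 22  # columns where blue >= black

# Ties are the columns that count as blue >= black but not blue > black.
ties = blue_wins_or_ties - blue_wins

# Whatever remains is an outright win for black.
black_wins = columns - blue_wins_or_ties

outcomes = [
    ("blue > black", blue_wins),
    ("tie", ties),
    ("blue < black", black_wins),
]

# Print each outcome with its count and share of the 43 columns.
for label, count in outcomes:
    print(f"{label:<13}{count:>3}{count / columns:>5.0%}")
```

Under that assumption there are 6 ties and 21 outright wins for black, so black beats blue outright more often than blue beats black, and the ties decide which side of the dead heat I land on.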

Explaining the section and term win ratios is more complex than pointing at the frequency rank order to explain why I have on a blue dress shirt. Black pants. A green t-shirt. And, once in a while, red underwear.