### Statistics test three

MS 150 statistics test three includes some material that is not subsequently covered on the final examination: the normal curve calculations that provide a foundation for the t-distribution and the subsequent consideration of confidence intervals. Some of the student outcomes on the outline must therefore be assessed by test three. Course learning outcomes 2.1 and 2.2 are assessed during test three.

Test three consisted of 14 questions based on a small 12-item data sample. Seventy-three students sat for the test. The fourteen-question test was worth 15 points, with two points for the final question, which involved citing a confidence interval. The average score on the test was 71%. The median score was 80%.

The first three questions were basic statistics that the students calculate on almost every quiz and test.

Questions four and five cover material the students have not seen since the midterm about a month ago. They are effectively a measure of retention of material that the students are not likely to have studied.

Questions six to nine involve normal distribution calculations. Question ten has the students calculate the standard error of the mean. Questions eleven to fourteen cover the newest material, the students' first encounter with a 95% confidence interval using the t-distribution.

| Question | Chapter | Topic | Number correct | Fraction correct |
|---|---|---|---|---|
| 1 | 1 | n | 71 | 0.97 |
| 2 | 3 | mean | 71 | 0.97 |
| 3 | 3 | sx | 70 | 0.96 |
| 4 | 3 | z-score | 36 | 0.49 |
| 5 | 3 | infer | 31 | 0.42 |
| 6 | 7 | normdist | 59 | 0.81 |
| 7 | 7 | normdist | 25 | 0.34 |
| 8 | 7 | normdist | 40 | 0.55 |
| 9 | 7 | norminv | 58 | 0.79 |
| 10 | 8 | stand err | 59 | 0.81 |
| 11 | 9 | deg free | 61 | 0.84 |
| 12 | 9 | tc | 59 | 0.81 |
| 13 | 9 | margin err | 42 | 0.58 |
| 14 | 9 | CI | 36 | 0.49 |

Performance on questions one to three averaged 97%.

Questions four and five, which were unannounced and covered old material from before the midterm, saw a 46% success rate. Retention of knowledge remains elusive over any span of time. Nine of the 36 students who answered four incorrectly omitted parentheses in the z-score formula. Students have access to formulas during the test, so this error is particularly puzzling. The other 27 incorrect answers showed no discernible pattern.
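
The effect of the missing parentheses can be sketched in a few lines. This is only an illustration in Python rather than the spreadsheet the students used, and the data value, mean, and standard deviation below are hypothetical, not the test data. The operator precedence issue is the same in either setting: division is carried out before subtraction unless parentheses say otherwise.

```python
# Hypothetical values for illustration only; these are not the test data.
x = 72.0      # data value
mean = 60.0   # sample mean
sx = 8.0      # sample standard deviation

# Correct: the parentheses force the subtraction to happen before the division.
z_correct = (x - mean) / sx

# Common slip: without parentheses this evaluates x - (mean / sx).
z_wrong = x - mean / sx

print(z_correct)  # 1.5
print(z_wrong)    # 64.5
```

A z-score of 64.5 is implausibly large, which suggests that a quick reasonableness check on the result would catch this particular error.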

Course learning outcome 2.1, normal distribution calculations, had a 62% success rate. The easiest problem, using the NORMDIST function to find the area to the left of a data value, had an 81% success rate. Students are able to make a basic single calculation that requires no special set-up or analysis.

Questions seven and eight required more careful analysis. Question seven asked for the area to the right of a data value. This requires finding the area to the left and then subtracting that value from one. The dominant incorrect answer was simply the result of the first step: finding the area to the left of the data value.

In light of the above paragraph, the improvement seen in number eight is somewhat puzzling, as eight required finding the area between two data values. This is actually more complex than number seven. However, the wrong answer in number seven, the area to the left of a data value, can be used to derive the answer to number eight. Thus getting seven wrong was not necessarily an impediment to getting number eight correct.
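
The three kinds of area calculation in questions six to eight can be sketched as follows. This is a minimal illustration that uses Python's standard library `NormalDist` as a stand-in for the spreadsheet NORMDIST function. The mean, standard deviation, and data values are hypothetical and are not taken from the test.

```python
from statistics import NormalDist

# Hypothetical mean, standard deviation, and data values (not the test data).
dist = NormalDist(mu=60.0, sigma=8.0)
x_low, x_high = 52.0, 68.0

# Question-six style: area to the left is a single cumulative calculation.
area_left = dist.cdf(x_high)

# Question-seven style: area to the right is one minus the area to the left.
area_right = 1 - dist.cdf(x_high)

# Question-eight style: area between two values is the difference of two left areas.
area_between = dist.cdf(x_high) - dist.cdf(x_low)

print(round(area_left, 4))     # 0.8413
print(round(area_right, 4))    # 0.1587
print(round(area_between, 4))  # 0.6827
```

Note that `area_between` is built entirely from left-hand areas, which is consistent with the observation that a student who stopped at the left-hand area in question seven could still answer question eight.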

Question nine required use of the NORMINV function to find an x-value from the area to the left of x. This was a simple calculation and saw a 79% success rate. Bear in mind that errors in questions two or three, the mean and the standard deviation, prevent obtaining the correct answers for problems involving either the NORMDIST or the NORMINV function. Five students answered two and/or three incorrectly.
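
The inverse calculation can be sketched the same way. Here `NormalDist.inv_cdf` from Python's standard library stands in for the spreadsheet NORMINV function, and the mean, standard deviation, and area are hypothetical values chosen for illustration.

```python
from statistics import NormalDist

# Hypothetical mean and standard deviation (not the test data).
dist = NormalDist(mu=60.0, sigma=8.0)

# Given an area to the left, find the x-value that bounds it.
area_left = 0.90
x = dist.inv_cdf(area_left)

# Round trip: the area to the left of x recovers the starting area.
round_trip = dist.cdf(x)

print(round(x, 2))           # 70.25
print(round(round_trip, 2))  # 0.9
```

Because both the mean and the standard deviation feed directly into the distribution, an error in either one carries through to every result computed from it.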

Question ten, calculating the standard error, which is course learning outcome 2.2, saw an 81% success rate. This calculation will appear on subsequent quizzes and the final examination, thus success rates should theoretically improve.
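
As a reminder of what the calculation involves, the standard error of the mean is the sample standard deviation divided by the square root of the sample size. The sketch below uses a hypothetical 12-item sample, matching the size of the test sample but not its values.

```python
from math import sqrt
from statistics import stdev

# Hypothetical 12-item sample (same size as the test sample, different values).
data = [52, 55, 57, 58, 60, 61, 62, 63, 65, 66, 68, 71]

n = len(data)                   # sample size
sx = stdev(data)                # sample standard deviation (n - 1 denominator)
standard_error = sx / sqrt(n)   # standard error of the mean

print(n)
print(round(standard_error, 4))
```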

Questions eleven to fourteen test course learning outcome 2.3. This material will appear again, including on the final examination. While the average success rate is 68%, the questions are a chain of sequential calculations where answering a preceding question incorrectly usually prevents answering subsequent questions correctly.
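
The chain of dependent calculations can be sketched as follows. The data are a hypothetical 12-item sample, not the test data. The t-critical value of 2.201 is the standard table value for a two-tailed 95% confidence level with 11 degrees of freedom, entered here as a constant because Python's standard library has no inverse t-distribution function.

```python
from math import sqrt
from statistics import mean, stdev

# Hypothetical 12-item sample (not the test data).
data = [52, 55, 57, 58, 60, 61, 62, 63, 65, 66, 68, 71]

# Table value for a two-tailed 95% confidence level with 11 degrees of freedom.
T_CRITICAL_95_DF11 = 2.201

n = len(data)
sample_mean = mean(data)
standard_error = stdev(data) / sqrt(n)

# Question-eleven style: degrees of freedom.
degrees_of_freedom = n - 1

# Question-twelve style: t-critical (the constant above is valid only for df = 11).
t_critical = T_CRITICAL_95_DF11

# Question-thirteen style: margin of error depends on t-critical and the standard error.
margin_of_error = t_critical * standard_error

# Question-fourteen style: the confidence interval depends on the margin of error.
ci_low = sample_mean - margin_of_error
ci_high = sample_mean + margin_of_error

print(degrees_of_freedom)
print(round(margin_of_error, 2))
print(round(ci_low, 2), round(ci_high, 2))
```

Each step consumes the result of the step before it, so a single early error propagates to every later answer.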

Overall performance by course learning outcome is summarized in the following table:

| CLO | Success rate |
|---|---|
| 2.1 | 0.62 |
| 2.2 | 0.81 |
| 2.3 | 0.68 |
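
The CLO success rates can be reproduced from the item analysis. The sketch below pools the number of correct answers for the questions mapped to each outcome and divides by the number of answers possible, using the counts reported for questions six to fourteen and the 73 students who sat the test. The variable names are illustrative only.

```python
STUDENTS = 73

# Number of correct answers per question, from the item analysis.
n_correct = {6: 59, 7: 25, 8: 40, 9: 58, 10: 59, 11: 61, 12: 59, 13: 42, 14: 36}

# Questions that assess each course learning outcome, as described in the text.
clo_questions = {
    "2.1": [6, 7, 8, 9],
    "2.2": [10],
    "2.3": [11, 12, 13, 14],
}

# Success rate: correct answers divided by answers possible for that outcome.
clo_rate = {
    clo: sum(n_correct[q] for q in questions) / (len(questions) * STUDENTS)
    for clo, questions in clo_questions.items()
}

for clo, rate in clo_rate.items():
    print(clo, round(rate, 2))
```

Because every question has the same number of respondents, pooling the counts gives the same result as averaging the per-question success rates.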

Current discussion at the college centers in part on what constitutes an acceptable rate of success and whether a 60% D should be a passing grade. Is a D really an F? I would also ask whether the standards applied to the individual student apply to item analysis of a group of students. The students are still being graded on the traditional scale, but what are the acceptable success rates for student learning outcomes?

Historically, overall averages are similar to those seen in the CLO table above.

| SLO | Fa 05 | Fa 06 | Fa 07 | Sp 08 | Fa 08 | Sp 09 | Fa 09 | Sp 10 | Fa 10 | Sp 11 | Avg |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 1.1 | 0.6 | 0.92 | 0.82 | 0.78 | 0.82 | 0.94 | 0.96 | 0.74 | 0.44 | | 0.78 |
| 1.2 | 0.61 | 0.56 | 0.6 | 0.83 | 0.84 | 0.75 | 0.79 | 0.55 | 0.7 | | 0.69 |
| 1.3 | 0.89 | 0.91 | 0.87 | 0.9 | 0.8 | 0.86 | 0.93 | 0.74 | 0.95 | | 0.87 |
| 1.4 | | | 0.5 | 0.67 | | 0.82 | 0.59 | 0.73 | 0.52 | | 0.64 |
| 1.5 | | | | | | 0.77 | 0.52 | 0.75 | 0.79 | | 0.71 |
| 2.1 | | | 0.61 | | 0.71 | 0.61 | 0.45 | 0.71 | 0.19 | 0.62 | 0.55 |
| 2.2 | | | 0.74 | 0.84 | 0.84 | 0.87 | 0.91 | 0.64 | 0.78 | 0.81 | 0.80 |
| 2.3 | 0.65 | 0.79 | 0.4 | 0.66 | 0.85 | 0.67 | 0.73 | 0.55 | 0.65 | 0.68 | 0.66 |
| 2.4 | 0.7 | 0.69 | 0.64 | 0.36 | 0.56 | 0.55 | 0.55 | 0.46 | 0.83 | | 0.59 |
| 2.5 | | | 0.68 | 0.55 | 0.63 | 0.43 | 0.56 | 0.46 | 0.71 | | 0.57 |
| 3 | 0.74 | 0.61 | 0.73 | 0.8 | 0.79 | 0.63 | 0.69 | 0.7 | 0.76 | | 0.72 |
| 3.1 | 0.95 | 1.61 | 0.9 | 0.91 | 0.95 | 0.85 | 1 | 0.79 | 0.88 | | 0.98 |
| 3.2 | 0.91 | 0.37 | 0.93 | 0.84 | 0.85 | 0.87 | 0.92 | 0.79 | 0.86 | | 0.82 |
| 3.3 | 0.49 | 0.69 | 0.56 | 0.66 | 0.44 | 0.38 | 0.47 | 0.55 | 0.58 | | 0.54 |
| Avg | 0.73 | 0.79 | 0.69 | 0.73 | 0.76 | 0.71 | 0.72 | 0.65 | 0.69 | 0.70 | 0.72 |

Spring 2011 performance on 2.1, 2.2, and 2.3 is in line with historic averages and is, if anything, slightly stronger. The terrible performance seen on 2.1 in Fall 2010 has been strongly reversed.

I would be remiss if I did not note that stability in the basic outline and assessment process choices over the past six years has provided the ability to look at success rates by CLO across that span of time. The ability to provide coherent assessment data that is comparable across this time span is, for me, a form of validation of the usefulness of simple item analysis to determine the level at which learning is occurring.

This data also, at least for me, provides a basis for discussing what is an acceptable level of performance on an individual student learning outcome and on overall average performance levels. Over the past ten terms, overall performance per term has ranged between an average success rate of 65% and 79% as measured by item analysis. Because the margin of error for the mean at a 95% level of confidence is currently near nine percent, a specified success rate of, for example, 70% could still see term averages as low as 61% on a purely random basis.

The upshot is that looking at one term is statistically insufficient. One term at 61% would not, at least for statistics, mean that a long-term average of 70% was not being maintained. Thus saying that all courses will achieve 70% success rates (or any other rate) for all outcomes across all terms is simply unrealistic.