When Tests No Longer Test: Teaching in the Age of AI
In my residential (face-to-face) section of MS 150 Statistics, the first chapter test occurs in class to provide an opportunity for me to assist the students should they have difficulties. Tests for chapters two and beyond are completed outside of class by the students. This mirrors the testing environment of the online section of the course where students complete all of the tests outside of a classroom.
The tests in the course are set up in a Moodle learning management system. The tests allow a single attempt. Because the tests are online, they are open notes and open book. This also means that the taking of the tests is not monitored by me. The one test that I get to see is the first test, and only for students in the residential section of the course.
I rarely intervene during this first test, allowing the students to work the test as they might when completing a test outside of the classroom.
One of the students had the above on their computer. I paused and watched. The student spent some time clearly reading the explanation for the answer being displayed. When the student looked up and saw me watching them, they did not react in a manner that would suggest they thought there was any issue with using Microsoft Copilot to obtain and explain the answer. Nor had I said anything about the use of large language models.
Microsoft Copilot was providing the correct answer with an appropriate explanation, including an example that drew from the actual data set in the test. Copilot echoed the same explanation I had given in class. When I asked the student if the explanation given by Copilot helped, they said yes.
I noted that the student was not signed in to Copilot - they were operating as an anonymous guest. The opening screen notes that GPT-5 is now available in Copilot - Copilot is OpenAI under the hood. All of this comes at no cost, with no sign-in required.
Over the past summer I read articles and viewed videos on college professors who have chosen to return to literal paper and pencil tests done in class, with cell phones confiscated. That is not the reality of the workplaces that our students will find themselves in after graduation, nor will it be the reality of the rest of their lives. Resistance to AI is understandable, but not sustainable. Some instructors are in the even more awkward position of using large language models themselves while banning the use of large language models by their students.
Some instructors, seeking ways to control, proctor, and monitor online tests, will turn to lockdown browsers. Yet students can make an end run around lockdown browsers by using a second device. This leads some instructors to use the student's laptop camera to monitor the student while they take the test. If the student looks away for an extended period of time, or gets up and leaves the view of the camera, the student fails the test. This is a dark slope that leads ever downward until students are afraid to get up to urinate. Education becomes torture.
There is a curious desperation underneath the above scenarios to keep testing locked into a pre-existing paradigm. To dig in one's heels and decide "this much change and no more." Some instructors are even going farther back than the invention of writing and returning to oral tests.
With thanks to a former colleague, I have taken a different road. I have accepted that tests do not test. I now use data exploration exercises, which are more open ended, to assess learning. And these too are now falling to large language models.
On the cusp of the arrival of pocket calculators during my high school years, I was told I could not use a calculator in math class because I would not always have a calculator on my hip. Now I always have a calculator on my hip that also makes phone calls and connects me to a planetary information network. To argue that students should not use AI echoes the resistance to calculators.
At the same time I am rather bemused by the plethora of conferences and workshops targeting educators on how to use AI in the classroom. One can sense that consultants and consultancies smell the money to be made trotting out a workshop on the topic du jour. Not that there isn't value in some of these offers, just that the Internet, YouTube included, is flooded with the same information at no cost.
I was pleased that Copilot at least explained the reasoning behind the answer, and that will have to be good enough. My students already consider consulting large language models a matter of routine, as the way one gets answers. My task will be to continue to find ways to integrate these new technologies into my courses and curricula, while also training students to view the results with a critical mind. This process will be a journey, uncomfortable for me at times, but necessary to prepare my students for the world that they will face beyond graduation.
Post-script
Performance on the first test, which consists primarily of questions as basic as the one seen above, was still distributed across all grades. As might be expected, if not hoped for outright, the majority of students obtained a grade of A.
Microsoft Copilot was fed the full text of this article and asked to come up with suggested titles. Copilot produced nine options, one of which was chosen for the title.