Large Language Model AI assignment in MS 150 Statistics

An assignment was developed for an introductory first contact statistics course. The assignment was based on the prior observation that a few students had been submitting assignments to large language model AI systems and turning in solutions that were correct but used statistical tests beyond the scope of the course.

The assignment was redesigned prior to the start of the term and then deployed with instructions to use a large language model to generate a solution. The assignment was intended only as an initial exploration of whether all students would be able to use an LLM AI and of what they would turn in.

The assignment used carbon sequestration pseudo-data generated from results in a paper by Resh, Binkley, and Parrotta. The data was the amount of soil organic carbon gained or lost in the soil under nitrogen-fixing and non-nitrogen-fixing trees.


The course presents the use of the TTEST function for a two-sample test for a difference of means in independent samples.



The type 3 TTEST p-value of 0.03484 agrees with the result given by Welch's test, which indicates that the algorithm underneath type 3 is the appropriate one for samples with unequal variances.
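For readers who want to check a type 3 TTEST result outside a spreadsheet, the following is a minimal sketch of Welch's test using only the Python standard library. The function names and the two short sample lists are hypothetical placeholders, not the assignment's data, so the printed p-value will not match 0.03484. The p-value is obtained by numerically integrating the t distribution, which is an approximation chosen here to avoid extra libraries.

```python
import math
from statistics import mean, variance


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))


def two_tailed_p(t, df, steps=2000):
    """Two-tailed p-value, using Simpson's rule on the interval [0, |t|]."""
    upper = abs(t)
    if upper == 0:
        return 1.0
    h = upper / steps  # steps must be even for Simpson's rule
    total = t_pdf(0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central_area = total * h / 3
    return max(0.0, 1 - 2 * central_area)


def welch_t_test(a, b):
    """Return (t, df, p) for Welch's unequal-variances two-sample t-test."""
    va, vb = variance(a) / len(a), variance(b) / len(b)
    t = (mean(a) - mean(b)) / math.sqrt(va + vb)
    # Welch-Satterthwaite approximation for the degrees of freedom
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df, two_tailed_p(t, df)


# Hypothetical placeholder values, NOT the data used in the assignment.
falcataria = [1.0, 2.0, 3.0, 4.0, 5.0]
eucalyptus = [2.0, 4.0, 6.0, 8.0, 10.0]

t_stat, dof, p_value = welch_t_test(falcataria, eucalyptus)
print(f"t = {t_stat:.4f}, df = {dof:.2f}, p = {p_value:.4f}")
```

Replacing the two placeholder lists with the values in columns A and B should give a p-value that can be compared against the spreadsheet's =TTEST(A2:A21,B2:B21,2,3) result.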


The above would be a reasonable chart to depict the differences in the means. When the assignment was redesigned, tests of the prompt and data produced the same analysis as can be seen above. Two months later, when assignments started to be submitted, the large language models were no longer getting correct results. 

The assignment was therefore redesigned with new directions while it was already deployed.

  1. Calculate the mean for the Falcataria trees. 
  2. Calculate the mean for the Eucalyptus trees.
  3. Make a labelled column chart of the two means.
  4. Use the function =TTEST(A2:A21,B2:B21,2,3) to calculate the p-value.
  5. Copy the data in cells A1 to B21 and paste the data into an AI such as ChatGPT, Gemini, or another AI. Copy both the labels and the data in columns A and B.
    Copy the following prompt into the AI:

    The table provides data on the kilograms of carbon sequestered in the soil per square meter per year for two types of trees in two different locations. The samples are independent samples. The trees were not paired in any way. Which type of tree sequesters more carbon? Is the difference significant? Run a hypothesis test for this data. Include a chart of the mean carbon sequestered by each tree type.

    Press enter to submit the table and prompt that you copied into the AI.

  6. Make a slides presentation that compares your calculations and results with those of the AI. Report on where they agree and disagree. Include charts from both your manual work and the AI - if the AI gave you a chart. Some AI programs might not generate a chart. If you get no chart from the AI, report that as a difference. In the discussion and conclusion for your presentation explain what the AI got right and what the AI got wrong.

The first submission for this new assignment came in. The student first made calculations in a spreadsheet as per the directions:

On another slide in the presentation the student presented the AI results:

Perhaps because the AI is a large language model, the AI system was unable to calculate the means. These large language models do not typically calculate anything; they make statistical predictions of which word might come next. And where LLM AI systems had provided a chart two months earlier, the AI did not provide a chart to the student.


The student added another slide comparing the results. Although the AI concurred that the difference was significant, the AI obtained incorrect values for the means and the p-value. The p-value was, however, close to the correct value. The student also added a final slide summarizing their findings. 

This remains an assignment under active development and will undoubtedly see further refinement. The exercise represents a first foray into intentionally using AI in the course. 

Gemini note




If the above exercise is done using Gemini in Google Sheets in a personal account with Gemini enabled, the results are very different. Gemini detected the table from A1 to B21 without prompting of any sort.


Gemini produced the same chart as might be expected by someone working with the data in a spreadsheet. Note that the means are correct in this chart.



Gemini also reports the correct means and p-value. Note that if one attempts to paste the table into the Gemini website app, the table loses all formatting and the analysis fails. The above result only occurs when Gemini is used in the side panel of Google Sheets. In a managed workspace where the side panel Gemini is disabled, this in situ analysis cannot be done, and use of the website app will fail. Thus a Gemini in situ approach cannot be used by students on their college accounts unless they switch to a personal account in which Gemini has been enabled.
