AI Can’t Cure a Flaccid Mind

Many of my classes include a large writing component. I’ve designed the courses so that most students write the best paper that they’ll ever write in their lives. Recently, I had reason to believe that a student was using AI or a paid service to write their paper. I couldn’t find conclusive evidence that they hadn’t written it themselves, but it ended up not mattering much in the end.

Rather than requiring a long, terrible, thoughtless paper at the end of the semester that would be crammed into a single all-nighter, I do something different. I break the research paper up into a total of 14 submissions (yikes!). Whether you teach or take classes, that number looks impossible. Who has time for that?

The paper is split into 7 parts, something along the lines of:

  1. Proposal
  2. Literature Review
  3. Theory
  4. Data or Context
  5. Empirics or Model
  6. Conclusion
  7. Re-introduction & Final draft

Students append each new section onto the prior ones and submit the entire draft each time. Further, I encourage them to edit previous sections in red so that I can see the changes that are relevant to the latest submission.

Each of the 7 parts has 2 submissions. The first is a good-faith effort and is peer-reviewed by two other students. Each first submission should earn a 100%. Students then revise their work and submit a 2nd draft that I evaluate. I grade according to punctuality, clarity, detail, and grammar, and I provide extensive feedback. Each 2nd submission is graded candidly because the score is essentially averaged with the full credit earned on the first submission. I grade the final submission according to a different, more comprehensive rubric.

This may sound like a lot of work (and it is), but the result is typically a very well-considered paper. Further, having seen the papers multiple times, reading the final drafts isn’t a terrible experience. For some students, this is the only time that they will document their ideas, grapple with them, and change them over time.

Responding to feedback and writing a cohesive paper makes cheating very difficult or very expensive. But now we have artificial intelligence that students can employ. We’ve all seen ChatGPT. You give it a prompt and it’s off to the races, writing a somewhat wrong and somewhat superficial version that isn’t worthless. Another piece of software that uses AI is Grammarly. Basically, you express an idea in writing, and the AI expresses it better. Although they seem quite different, neither type of AI is the worst thing for our students.

Have you seen the AI responses to a prompt for code? That seems pretty great to me. I’d just need to proofread it first. The same is true for students. ChatGPT will produce something, but it needs to be proofread and elaborated. Grammarly does the same thing in reverse: it proofreads what you’ve created. In both cases, a student running on autopilot will end up creating vacuous nonsense. ChatGPT won’t be deep and detailed, and Grammarly won’t make a student deep and detailed.

It’s not bad for students to use these AI tools. It’s similar to them meeting with a writing tutor. The onus is on the instructor to create an assignment prompt that filters out silliness. It’s on the instructor to grade in a way that doesn’t give folly a passing grade.

Have you seen analyzemywriting.com? It’s pretty great. You can just paste in some text – or even a whole paper – and the site will spit out some stats. Remember that student whom I mentioned at the top? Their writing changed dramatically as of the second draft of the third part that they submitted. Turnitin, the plagiarism software, didn’t register anything exceptional. And googling portions didn’t either. When I say that the writing changed dramatically, I mean, by a lot. But I can’t very well submit an honor code case based on how a paper reads. Or can I?

No, I can’t.

But I can quantify how differently the submissions read. I measured several stats (a rough sketch of how one might compute them follows the list):

  1. Word length average
  2. Word length standard deviation
  3. Sentence length average
  4. Sentence length standard deviation
  5. Readability or Grade level complexity
  6. Percent passive voice
  7. Turnitin score
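
Here’s a minimal Python sketch of these measures. It’s my back-of-the-envelope version, not analyzemywriting.com’s or Turnitin’s actual methods; the syllable count and the passive-voice detection in particular are crude heuristics.

    import re
    import statistics

    def writing_stats(text):
        """Crude versions of the stats above (the Turnitin score excluded);
        assumes a real paper's worth of text, i.e. at least a few sentences."""
        words = re.findall(r"[A-Za-z']+", text)
        sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
        word_lens = [len(w) for w in words]
        sent_lens = [len(re.findall(r"[A-Za-z']+", s)) for s in sentences]
        # Crude syllable count: runs of vowels per word.
        syllables = sum(max(1, len(re.findall(r"[aeiouy]+", w.lower())))
                        for w in words)
        # Standard Flesch-Kincaid grade-level formula.
        grade = (0.39 * (len(words) / len(sentences))
                 + 11.8 * (syllables / len(words)) - 15.59)
        # Naive passive-voice heuristic: a form of "to be" followed by a word
        # ending in -ed/-en. Real tools parse grammar; this over- and under-counts.
        passive = sum(
            bool(re.search(r"\b(is|are|was|were|be|been|being)\s+\w+e[dn]\b",
                           s.lower()))
            for s in sentences
        )
        return {
            "word_len_avg": statistics.mean(word_lens),
            "word_len_sd": statistics.stdev(word_lens),
            "sent_len_avg": statistics.mean(sent_lens),
            "sent_len_sd": statistics.stdev(sent_lens),
            "grade_level": grade,
            "pct_passive": 100 * passive / len(sentences),
        }

Run over each draft, something like this gives you one row of stats per student per submission.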

None of these measures are indicators of good writing. There was practically no correlation between these measures and the grades that students received. However, there was a very high correlation among submissions for the same student because, although edits and revisions occur, student writing style remains largely constant over the course of a semester. Below are scatterplots of the stats for the part 3 first submission and the final term paper submission.

Can you guess which dot describes the suspect?

It turns out that most of the writing stats for each student are very stable over the course of the semester (except sentence length standard deviation). The data led me to think that the suspect used Grammarly and did not pay someone else to write the paper. For this student, word length, readability, and passive voice all increased substantially. I personally don’t like passive voice, but students often perceive it as more formal.
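
If I wanted to automate that eyeball test, one rough approach (a sketch over a hypothetical data layout, not the analysis I actually ran) would be to standardize each student’s change in each stat against the class-wide changes and rank the biggest movers:

    import statistics

    def rank_style_shifts(first, final):
        # first and final map student -> {stat_name: value}, e.g. writing_stats
        # output for the part-3 first draft and for the final term paper.
        students = list(first)
        stat_names = list(first[students[0]])
        shift = {s: 0.0 for s in students}
        for name in stat_names:
            deltas = {s: final[s][name] - first[s][name] for s in students}
            mu = statistics.mean(deltas.values())
            sd = statistics.stdev(deltas.values())
            if sd == 0:
                continue  # every student moved identically on this stat
            for s in students:
                shift[s] += abs(deltas[s] - mu) / sd
        # A large total |z| means this student's style moved much more
        # between drafts than the class's did.
        return sorted(students, key=lambda s: shift[s], reverse=True)

A big shift is a flag to look closer, not proof of anything.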

What’s not in the stats is anything about the content of the papers. Like I said, grades were not strongly correlated with any of these measures. The suspect wrote a very long paper that was full of the same repeated assertions on a topic that appeared to leverage social desirability bias. The paper was devoid of any logic or clear exposition of the fundamental concepts that we had learned in class. But the grammar was excellent. In the end, the paper earned a grade that comported with the richness of its content. So, a lot of the academic dishonesty worry turned out not to be relevant in this case.

Faculty need to take notice. Turnitin will be replaced by a competitor if it doesn’t begin to flag other signs of cheating besides plagiarism. I don’t have the time to perform the above analysis for each writing assignment, though it could easily be automated. Rather, instructors must teach in a way that exercises the weak recesses of a student’s brain. We need to teach and then test students in ways that encourage them to explore new concepts and push their brains to work analytically and intensively. Artificial intelligence will systematically harm those who pretend to teach and those who pretend to learn by placing them in structures that are built on sand.

Update: this is supposed to detect AI writing: https://huggingface.co/openai-detector/

3 thoughts on “AI Can’t Cure a Flaccid Mind”

  1. StickerShockTrooper December 16, 2022 / 12:11 pm

    Great analysis, especially that last sentence. Same thing about graphing calculators: you can pass the test, but good luck getting any further without intuitive understanding of the material.

    Long term, with the ubiquity of tools like Grammarly, I think that grading for grammar and spelling will go the way of grading for long division. I’d rather that future teachers simply assume (if not require) AI assistants for these things (really, how is that different from consulting a dictionary?) and concentrate on grading for content.


  2. JOS December 16, 2022 / 12:39 pm

    Hi,

    I just wanted to write and say that this teaching method is very valuable and probably one of the best ways to provide instruction. I’ve interviewed undergrads and the ability to write coherently is one of the most valuable skills. AI can’t match that and it’s, so far, easy to tell who has skated through college. Bravo on a good method, even if it requires a ton more work.


  3. James Bailey December 22, 2022 / 10:05 am

    Sounds a lot like my capstone class. I’m also not especially worried about AI-based cheating there, for similar reasons.
