Hallucination as a User Error

You don’t use a flat-head screwdriver to drill a hole in a board. You should know to use a drill.

I appreciate getting feedback on our manuscript, “LLM Hallucination of Citations in Economics Persists with Web-Enabled Models,” via X/Twitter. @_jannalulu wrote: “that paper only tested 4o (which arguably is a bad enough model that i almost never use it).”

Since the scope and frequency of hallucinations came as a surprise to many LLM users, hallucinations have often been used as a ‘gotcha’ to criticize AI optimists. People, myself included, have sounded the alarm that hallucinations could infiltrate articles, emails, and medical diagnoses.

The feedback I got from power users on Twitter this week made me think that a cultural shift may be coming in the medium term. (Yes, we are always looking for someone to blame.) Hallucinations will be considered the fault of the human user, who should have:

  1. Used a better model (learn your tools)
  2. Written a better prompt (learn how to use your tools)
  3. Known better than to assign that task to an LLM in the first place (it’s been known for over two years that general-purpose LLMs hallucinate citations). What did you expect from “generative” AI? LLMs tell you what literature ought to exist, not what does exist.