For the Love of Green Beans, Stop Using the Term “Hallucination”
You are confusing executives and hurting your credibility. Use the term “probabilistic outcome” instead.
For the love of green beans and good food everywhere, stop using the term "hallucination" when discussing GenAI, as it confuses executives and employees and hurts your credibility when making budget requests.
At a basic level, computer systems are either deterministic or probabilistic.
LLMs are specifically architected to be probabilistic, utilizing highly sophisticated algorithms that process an input (a "prompt" or an API call) and produce an output (such as text or numbers). This output relies on training on trillions of words, tuning, and other factors rather than on if-then rules, which means LLM responses cannot be precisely explained.
When an individual enters revenue and cost numbers into accounting software, the profit is calculated exactly. The output remains constant with the same input, and the logic for the output is 100% explainable.
In contrast, when an individual inputs text into ChatGPT (e.g., "create a safety training program" or "outline our next board meeting agenda based on these factors"), the LLM generates output influenced by training data, the specific prompt, and other variables. The result is often factually correct but can be inaccurate, perhaps 2-5% of the time. However, the outcome isn't random — it's probabilistic.
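For readers who want to see the distinction in miniature, here is an illustrative Python sketch: the profit function stands in for deterministic accounting logic, while the word list and weights are invented placeholders for the learned token probabilities a real LLM samples from (a real model draws from a distribution over tens of thousands of tokens).

```python
import random

# Deterministic: the same inputs always produce the same output,
# and the logic is 100% explainable (simple arithmetic / if-then rules).
def profit(revenue: float, cost: float) -> float:
    return revenue - cost

# Probabilistic (toy stand-in for an LLM's next-token sampling):
# the output is drawn from a weighted distribution, so repeated runs
# with the same input can legitimately differ.
def next_word(prompt: str) -> str:
    candidates = ["safety", "training", "program", "agenda"]
    weights = [0.5, 0.3, 0.15, 0.05]  # in a real model these are learned, not hand-picked
    return random.choices(candidates, weights=weights, k=1)[0]

print(profit(1_000_000.0, 750_000.0))   # always 250000.0
print(next_word("Draft a plan for"))    # varies run to run, but is not uniformly random
```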
The term "hallucination" incorrectly suggests a bug or fault in the software to executives. In reality, the probabilistic outcome generation is a feature or characteristic. LLMs are explicitly designed to have some variability and this is part of the transformative power and uniqueness of LLMs.
Deterministic and probabilistic software programs are fundamentally different tools. A chainsaw differs from a power saw. Both are powerful, but you wouldn't cut a hole in the wall with a chainsaw, nor would you use a power saw to cut down a tree.
Yes, yes, yes… I know the term "hallucination" is now firmly established in the GenAI lexicon, but that doesn't mean you need to keep using it.
GenAI represents a fundamentally different computing paradigm and can be challenging initially to understand fully.
At a board meeting where I presented about GenAI, an executive asked: "Are you telling me that Microsoft invested $10B in a startup [OpenAI] that sometimes provides incorrect answers?" I replied, "Yes."
Choosing between deterministic and probabilistic software depends on the business's tolerance for inaccuracies in a particular use case. Quarterly financial reports to Wall Street necessitate 100% accurate financial accounting tools. Conversely, the auto insurance marketplace Jerry would not have realized a 400% ROI and $4M in yearly savings by automating customer service calls without probabilistic LLM software.
Various approaches exist to reduce outcome variability. Many workflows include a "human-in-the-loop" step to shield customers from any variability. Additionally, advancements in AI software, prompt engineering, and approaches like retrieval-augmented generation (RAG) reduce unwanted output variability, as sketched below.
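As a concrete illustration of two of these approaches, here is a minimal sketch assuming the official OpenAI Python SDK and an API key in the environment; the model name, policy excerpts, and prompt are placeholder assumptions for illustration, not part of any specific product.

```python
# Two common ways to reduce unwanted output variability:
# (1) ground the prompt in retrieved source documents (a simple RAG pattern), and
# (2) lower the sampling temperature toward 0 so the model favors its most likely tokens.
from openai import OpenAI  # pip install openai

client = OpenAI()

# Placeholder "retrieved" passages; in practice these come from a search or vector index.
retrieved_passages = [
    "Policy 4.2: Hard hats are required on the warehouse floor at all times.",
    "Policy 7.1: New hires complete forklift certification within 30 days.",
]

prompt = (
    "Using ONLY the policy excerpts below, draft a one-paragraph safety reminder.\n\n"
    + "\n".join(retrieved_passages)
)

response = client.chat.completions.create(
    model="gpt-4o-mini",  # placeholder model name
    messages=[{"role": "user", "content": prompt}],
    temperature=0,        # reduces (but does not eliminate) run-to-run variation
)
print(response.choices[0].message.content)
```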
When an executive or employee mentions “hallucinations” as an objection to your budget request for a GenAI project, use this as an opportunity to explain the difference between deterministic and probabilistic software tools and the pros and cons of each.
This newsletter is sponsored by CustomGPT.
Read last week's enterprise GenAI news rated Essential by our AI analysts (view all ratings here; listen to our daily GenAI executive briefing).
How Did Companies Use Generative AI in 2023? Here’s a Look at Five Early Adopters (by WSJ).
Rating rationale: Provides vital insights on AI adoption and strategies and shows executives that GenAI projects have real ROI.
Sports Data Labs, Inc. Announces Issuance of New U.S. Patent Covering its Novel Generative AI-Based Method for Creating Synthetic Data to Replace Missing and Outlier Data Values
Rating rationale: A granted patent covering any MISSING data sets could have profound impacts. Could key data get locked up by patents in your industry?
First Ever AI Solution to Integrate Drug Discovery and Synthesis
Rating rationale: Revolutionizes drug discovery; significant industry impact.
JP Morgan introduces DocLLM
Rating rationale: Another great example of an industry-specific LLM and data effort by a large company, showing the nimbleness of some large organizations in the Age of AI.
Notable upcoming events
Jan 8: 7p EST (virtual, free). Learning Lab discussion on GenAI Adoption Barriers. Sign up here.
Jan 15: 5:30p-7p (in-person, free). AI Woodstock - Atlanta, a GenAI networking event at the Battery in Atlanta, GA. RSVP here.
Jan 31: Noon EST (virtual, free). Case study of why one team at MIT chose a no-code LLM solution from CustomGPT to meet a pressing need. Register here.
Apr 8 (all day, in-person, free): The Future of Business with AI at MIT Media Lab. Last year's speakers included Sam Altman from OpenAI, Vinod Khosla, Lex Fridman, Stephen Wolfram, and others.
May 7: (in person) Harvard Business School GenAI Conference.
Oct 7-8 (2 days, in-person, paid): Generative AI World 2024 in Boston, MA. This is our flagship annual conference.
Looking for an AI job: Amogh Patankar is a researcher at Stanford Medical looking for opportunities to apply machine learning and deep learning techniques to build, validate, and deploy models using frameworks like scikit-learn, PyTorch, and TensorFlow. Contact him via LinkedIn.
“Love your enemies, for they tell you your faults.” - Ben Franklin
Onward,
Paul
@Michael Novak alerted me to an article in WIRED by Steven Levy defending AI hallucinations:
https://www.wired.com/story/plaintext-in-defense-of-ai-hallucinations-chatgpt/
I don't love the term "hallucination," though I think it's better than most. I'm not opposed to calling them "lies" either - I think new and non-technical users of LLMs should be thoroughly warned before assuming they're infallible.