Disclaimer: No AI functions were used to write this column.

Despite spending my entire career in software technology and even directing a program to use machine learning to help manage large systems, I wouldn’t know a neural net from a hairnet. Like many people outside the software field, I have a very limited understanding of AI: I recognize that it exists and sort of understand some of what it is capable of, but I have no earthly idea how it works. AI doesn’t follow the prescriptive, algorithmic flow of traditional software. Older forms of AI used machine learning to recognize patterns and make recommendations. Many of the newer AI systems, built on what are known as Large Language Models (LLMs), are described as “agentic,” which simply means that they can operate autonomously. Rather than simply analyzing data and making recommendations for humans to act upon, these newer agentic AI systems can plan, execute and adapt to changing situations that they encounter along the way. Where a traditional machine learning system might detect a potential problem in a system and create a ticket for a support engineer to investigate and resolve, a modern agentic AI system can independently verify the issue, take steps to correct it and document the outcome in a ticket without any need for human interaction.
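For readers who think in code, the ticket example can be sketched roughly as follows. This is a toy illustration under stated assumptions, not a description of how any real product works: the `Ticket` class, the `detect_only` and `agentic_handle` functions, the disk-usage scenario and the 90% threshold are all hypothetical names and values invented for the sketch.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class Ticket:
    """Hypothetical support ticket; not modeled on any real ticketing system."""
    summary: str
    status: str = "open"
    notes: List[str] = field(default_factory=list)


def detect_only(disk_usage_pct: float, threshold: float = 90.0) -> Optional[Ticket]:
    """Traditional pattern: flag the problem and leave the rest to a human."""
    if disk_usage_pct < threshold:
        return None
    return Ticket(summary=f"Disk usage at {disk_usage_pct:.0f}%")


def agentic_handle(
    read_usage: Callable[[], float],
    clean_up: Callable[[], None],
    threshold: float = 90.0,
) -> Optional[Ticket]:
    """Agentic pattern: verify, act, re-check, and document the outcome.

    read_usage and clean_up are caller-supplied stand-ins (assumptions)
    for whatever tools a real agent would be given.
    """
    usage = read_usage()          # independently verify the issue
    if usage < threshold:
        return None
    ticket = Ticket(summary=f"Disk usage at {usage:.0f}%")
    clean_up()                    # take a corrective step
    after = read_usage()          # check whether the step worked
    if after < threshold:
        ticket.status = "resolved"
        ticket.notes.append(f"Cleanup reduced usage to {after:.0f}%")
    else:
        ticket.notes.append("Cleanup did not help; escalating to a human")
    return ticket
```

The first function stops once the ticket exists. The second carries on through verification, correction and documentation, and still hands the problem back to a person when its own fix fails.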

I recently upgraded my old PC to a new AI-enabled model and decided to try out some of the AI features to see what all the fuss was about. Creating images is a breeze although the results aren’t always accurate. Here’s a simple rendering of Mount Rainier from an extremely simple prompt. I’m sure that a more refined prompt would give a more refined result.

I also asked the AI to summarize a novel I had recently read.

Here the results were truly impressive. The AI demonstrated intimate knowledge of the subject matter and when pressed to elaborate on themes within the novel, it provided well-reasoned and insightful responses.

Things went downhill rather rapidly from there, however. During the Mariners’ ALCS against Toronto, ads were airing about how AI had crunched the stats on things like hopping over the foul line on the way to the plate or tapping home plate with the bat. Inspired by those ads, I decided to ask a sports stat question. I’ve been a fan of Arsenal in the English Premier League for over 50 years, so indulge me.

Impressive — right? It wasn’t even fazed by the typo in my prompt. Except that the result in the Arsenal game was 4-0 and Martin Odegaard is out injured and wasn’t even in the squad. Oh, and Newcastle actually beat Benfica 3-0. The AI also provided a table summarizing all the results:

Except, none of this is correct. Every single result is wrong. OK – let’s try again.

I guess I should have asked for the “correct” results the first time. But the new results are no better. Like Odegaard, Gabriel Jesus is out injured. There was no penalty to Arsenal. Saka didn’t score. Alvaro Morata does not play for Atletico Madrid, and Angel Di Maria does not play for Benfica. Only one of the six matches was reported with the correct score.

OK – “verified and corrected.” That’s better. This time, all the scores were accurate, but many of the details were still off. What is impressive here, though, is that the AI clearly understood and acted on my natural language feedback and echoed that feedback in its responses.

I then asked why this query had proven so difficult to get right, and the answer was quite insightful.

For the queries that were being advertised during the Mariners games, the AI models would presumably have been trained on precisely the data required to answer them. But even then, did anyone actually check the outputs?

In conclusion, while the capabilities of modern AI systems are certainly impressive, they are not foolproof. Because the results are presented with the utmost confidence, it can be hard to tell when something isn’t right. As Ronald Reagan once said, “trust but verify.” The responsibility to verify the results of AI queries lies with you, the user.

Niall McShane is an Edmonds resident, occasional contributor to Scene in Edmonds and a retired IBM executive with experience in managing software development and customer service organizations.