AI’s “Hallucinations” Demand Verification, Says Pioneer Vishal Sikka
A growing chorus of experts warns that Large Language Models (LLMs) are prone to generating inaccurate or misleading facts – often referred to as “hallucinations” – and a leading AI researcher says the solution lies not in refining the models themselves, but in pairing them with systems capable of verifying their output.
According to Vishal Sikka, CEO of Vianai Systems, expecting LLMs to reliably handle arbitrarily complex calculations is a basic miscalculation. “To expect that a model that has been trained on a certain amount of data will be able to do an arbitrarily large number of calculations which are reliable is a wrong assumption. This is the point of the paper,” Sikka stated during a recent discussion of his team’s research.
Sikka brings decades of experience to the debate. A PhD graduate of Stanford University – where he studied under John McCarthy, the scientist who coined the term “artificial intelligence” in 1955 – he has held leadership positions at SAP and Infosys. Inspired by lessons learned from McCarthy, Sikka collaborated with his son on a study, “Hallucination Stations: On Some Basic Limitations of Transformer-Based Language Models,” published in July, to explore these limitations.
“You have to exercise extreme caution when you do these kinds of things.”
However, Sikka believes a solution is within reach: augmenting LLMs with verification systems. He points to Vianai’s Hila, which he says can reduce financial reporting time from 20 days to just five minutes by combining an LLM with a domain-specific knowledge model. “For certain domains, when you surround the LLM with guardrails, with reliable approaches that are proven, then you are able to provide reliability in the overall system,” he said. “It’s not only us. A lot of systems out there work like that where they pair the LLM with another system which is able to ensure that the LLM has correctness. So we do that in our product Hila. We combine the LLM with a knowledge model for a particular domain and then, after that, Hila does not make mistakes.”
This approach mirrors the strategy employed by Google DeepMind’s AlphaFold, which uses a custom transformer module called Evoformer to generate potential protein structures, then feeds those candidates into a separate system to identify flaws. “And so anything that comes out of that has a much higher likelihood of being an actual protein, and then it repeats this cycle three times, and the outcome of that is pretty much guaranteed to be a protein for a particular situation,” Sikka explained, noting that AlphaFold has already produced 250,000 proteins using this method – a task that once required years of work by teams of scientists.
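The cycle Sikka describes, where a proposer produces candidates, a separate critic filters out flawed ones, and the survivors seed the next round, has the same generate-then-verify shape. A toy numerical version makes the loop concrete; all names and the scoring function here are illustrative assumptions, not AlphaFold's actual pipeline:

```python
import random

def propose(seed: float) -> list[float]:
    """Stand-in generator: perturb the current best into several candidates."""
    return [seed + random.uniform(-1, 1) for _ in range(8)]

def critic(candidate: float, target: float) -> float:
    """Stand-in for the separate checker: lower score means fewer flaws."""
    return abs(candidate - target)

def refine(target: float, rounds: int = 3) -> float:
    """Repeat the generate-then-verify cycle, keeping the best survivor."""
    best = 0.0
    for _ in range(rounds):
        # Keep the current best in the pool so quality never regresses.
        candidates = propose(best) + [best]
        best = min(candidates, key=lambda c: critic(c, target))
    return best

random.seed(0)
best = refine(2.0)
print(f"remaining error: {critic(best, 2.0):.3f}")  # shrinks across the three rounds
```

Because the critic is independent of the generator, each round can only keep or improve the best candidate, which is why repeating the cycle a few times drives the output toward something the checker accepts.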
Sikka’s concerns are rooted in a long history of observing the ebb and flow of AI hype. He described the current moment as the “fourth time around” for AI mania, recalling similar surges of enthusiasm in the 1980s. He noted that despite the current successes in areas like coding, a recent MIT study revealed that 95 percent of AI projects ultimately fail. He likened the current state of AI to the early days of television, where news anchors simply read information as they had on the radio.
“I think so far, we are just regurgitating our prior known things using AI, but soon we will see breakthrough, new things that are possible,” Sikka predicted. He believes that carefully targeted applications of AI offer notable potential for return on investment, but cautioned against a “blanket use of LLMs.”
Sikka’s outlook is informed by insights gleaned from pioneers like Alan Kay and Marvin Minsky, the latter of whom wrote a letter of recommendation that helped Sikka gain admission to Stanford. Minsky, author of the influential 1986 book The Society of Mind, believed that intelligence arises from the interaction of multiple components. “That there is a collection of things that come together to create intelligence. I think that’s kind of where we will end up, but we’ll stumble along our way through to that,” Sikka summarized. Ultimately, understanding the boundaries of these powerful tools – a concept McCarthy termed “circumscription” – is crucial for responsible innovation.
