AI’s New Superpower: The Benefits of ‘Self-Talk’

OKINAWA, Japan, May 8, 2024 – Ever catch yourself muttering under your breath when tackling a tough problem? It turns out that internal monologue isn’t just a human quirk—it could be a key to unlocking more sophisticated artificial intelligence. Researchers have discovered that AI systems perform significantly better when “trained” to engage in a form of inner speech alongside utilizing short-term memory.
This study reveals that how an AI interacts with itself during training is just as important as its underlying structure.
- AI systems benefit from a process mirroring human ‘inner speech.’
- Combining internal dialogue with working memory boosts performance.
- This approach allows AI to generalize learning to new situations.
- Researchers are exploring real-world applications for this technology.
The findings, published in Neural Computation, suggest that learning isn’t solely about the architecture of an AI system, but also about how it interacts with itself during the learning process. As Dr. Jeffrey Queißer, Staff Scientist in the Cognitive Neurorobotics Research Unit at the Okinawa Institute of Science and Technology (OIST), explains, “This study highlights the importance of self-interactions in how we learn. By structuring training data in a way that teaches our system to talk to itself, we show that learning is shaped not only by the architecture of our AI systems, but by the interaction dynamics embedded within our training procedures.”
How Does ‘Self-Talk’ Improve AI?
To test this concept, the research team combined self-directed internal speech—described as quiet “mumbling”—with a specialized working memory system. This allowed their AI models to learn more efficiently, adapt to unfamiliar situations, and handle multiple tasks simultaneously. The results demonstrated clear improvements in flexibility and overall performance compared to systems relying on memory alone.
The Quest for Generalizable AI
A core objective of this work is content-agnostic information processing—the ability of AI to apply learned skills beyond the specific scenarios encountered during training, using general rules rather than rote memorization. “Rapid task switching and solving unfamiliar problems is something we humans do easily every day. But for AI, it’s much more challenging,” says Dr. Queißer. “That’s why we take an interdisciplinary approach, blending developmental neuroscience and psychology with machine learning and robotics, amongst other fields, to find new ways to think about learning and inform the future of AI.”
Why Working Memory is Crucial
The researchers initially focused on memory design in AI models, specifically examining working memory and its role in generalization. Working memory is the short-term ability to hold and manipulate information—whether it’s following instructions or performing mental calculations. By testing tasks of varying difficulty, the team compared different memory structures.
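The idea of working memory as a set of temporary containers can be sketched in a few lines of code. This is a hypothetical illustration, not the researchers’ actual model: the class name `SlotMemory` and its interface are invented here purely to show what “holding and manipulating several pieces of information” looks like in the simplest possible form.

```python
# Hypothetical sketch (not the paper's model): a minimal slot-based
# working memory. Each slot is a temporary container that holds one
# piece of information so it can be retrieved and rearranged later.
class SlotMemory:
    def __init__(self, num_slots):
        self.slots = [None] * num_slots

    def write(self, index, value):
        """Store a value in a specific slot."""
        self.slots[index] = value

    def read(self, index):
        """Retrieve the value currently held in a slot."""
        return self.slots[index]

# Holding several items at once lets the system manipulate them in a
# chosen order, e.g. reverse a short sequence:
mem = SlotMemory(num_slots=3)
for i, token in enumerate(["a", "b", "c"]):
    mem.write(i, token)
reversed_seq = [mem.read(i) for i in (2, 1, 0)]
print(reversed_seq)  # ['c', 'b', 'a']
```

A system with only a single slot could not perform this reversal, since each new item would overwrite the last—which is one intuition for why multi-slot memory helps on the harder tasks described below.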
They found that models with multiple working memory slots—temporary containers for information—performed better on complex problems, such as reversing sequences or recreating patterns. These tasks require holding and manipulating several pieces of information in the correct order. Adding targets that encouraged the system to “talk to itself” a specific number of times further improved performance, particularly during multitasking and multi-step tasks.
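One way to picture the “talk to itself a specific number of times” target is as training data that reserves explicit internal steps between a task’s input and its answer. The function name, the `<think>` placeholder token, and the data format below are all assumptions for illustration only—the article does not specify how the team encoded these targets.

```python
# Hypothetical sketch of embedding "self-talk" targets in training data
# (format invented for illustration; not taken from the paper). The model
# would be trained to emit a fixed number of internal "mumble" steps
# before producing its answer, giving it room to manipulate memory
# contents mid-task.
def make_training_sample(input_seq, output_seq, num_mumbles):
    # Internal-speech placeholders the model must learn to produce itself.
    mumbles = ["<think>"] * num_mumbles
    return input_seq + mumbles + output_seq

# A sequence-reversal task with two internal steps reserved:
sample = make_training_sample(["a", "b", "c"], ["c", "b", "a"], num_mumbles=2)
print(sample)  # ['a', 'b', 'c', '<think>', '<think>', 'c', 'b', 'a']
```

Under this framing, varying `num_mumbles` per task is what “a specific number of times” would mean: the training signal itself prescribes how much internal dialogue precedes the answer.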
“Our combined system is particularly exciting because it can work with sparse data instead of the extensive data sets usually required to train such models for generalization. It provides a complementary, lightweight alternative,” Dr. Queißer says.
From the Lab to the Real World
The researchers are now planning to move beyond controlled laboratory tests and explore more realistic conditions. “In the real world, we’re making decisions and solving problems in complex, noisy, dynamic environments. To better mirror human developmental learning, we need to account for these external factors,” explains Dr. Queißer.
This work supports the team’s broader goal of understanding how human learning functions at a neural level. “By exploring phenomena like inner speech, and understanding the mechanisms of such processes, we gain fundamental new insights into human biology and behavior,” Dr. Queißer concludes. “We can also apply this knowledge, for example in developing household or agricultural robots which can function in our complex, dynamic worlds.”
