In recent years, artificial agents have achieved superhuman performance in complex games such as Go, poker, and classic Atari arcade games. Underlying each of these artificial intelligence (AI) success stories is a family of algorithms known as “deep reinforcement learning”, which combines neural network modeling with a process for learning from reward and punishment. In these games, rewards are baked into the environment: the artificial agents receive rewards when they win a hand, gain points, or advance to a subsequent round.
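For readers curious about the mechanics, here is a minimal sketch of that reward-driven loop, assuming a toy “chain” world and a one-hidden-layer network invented for illustration (none of the names, sizes, or constants below come from the systems above): a small neural network scores each available action, and its weights are nudged toward targets derived from the rewards the environment hands out.

```python
# A hedged sketch of deep reinforcement learning (deep Q-learning flavor).
# The environment, network sizes, and hyperparameters are all illustrative.
import numpy as np

rng = np.random.default_rng(0)
N, HIDDEN, GAMMA, LR = 8, 32, 0.95, 0.05
W1 = rng.normal(0.0, 0.1, (N, HIDDEN))  # weights of a tiny neural network
W2 = rng.normal(0.0, 0.1, (HIDDEN, 2))  # mapping states to action values

def q_values(state):
    hidden = np.maximum(0.0, state @ W1)   # ReLU hidden layer
    return hidden, hidden @ W2             # value of "left" and "right"

def step(pos, action):
    """Walk a chain of N cells; the only reward sits at the right end."""
    pos = max(0, min(N - 1, pos + (1 if action == 1 else -1)))
    return pos, float(pos == N - 1), pos == N - 1

for episode in range(300):
    pos = 0
    eps = max(0.05, 1.0 - episode / 100)   # explore heavily at first
    for _ in range(200):                    # cap the episode length
        state = np.eye(N)[pos]
        hidden, q = q_values(state)
        action = int(rng.integers(2)) if rng.random() < eps else int(q.argmax())
        pos, reward, done = step(pos, action)
        target = reward if done else reward + GAMMA * q_values(np.eye(N)[pos])[1].max()
        td_error = target - q[action]       # the learning signal from reward
        # One manual gradient step pushing Q(state, action) toward the target.
        W1 += LR * td_error * np.outer(state, W2[:, action] * (hidden > 0.0))
        W2[:, action] += LR * td_error * hidden
        if done:
            break
```

The same basic loop, scaled up with deeper networks and richer environments, underlies the game-playing systems described above.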
In contrast, humans in the real world learn to solve problems in complex environments even in the absence of such extrinsic rewards. How might AI agents demonstrate intelligent behavior in contexts in which these rewards are absent?
Insights from developmental science into how children learn may be key to building general-purpose AI agents that can learn in the absence of explicit rewards. Developmental psychologists have proposed that human learning is scaffolded by intrinsic motivations. Intrinsic motivation refers to the impetus to do something because it is inherently rewarding. Curiosity and agency, the drives to understand and influence one’s environment, are two such intrinsic motivations that have been proposed to guide children’s play and exploration and to speed their learning.
Recent advances in AI suggest that similar intrinsic motivations might also promote machine learning: when the drives to predict and control the environment are incorporated into the algorithms that govern the behavior of artificial agents, those agents learn faster and can solve a broader range of problems.
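What does it mean, concretely, to build a drive to predict into an algorithm? One common family of approaches turns a learned forward model’s prediction error into an intrinsic reward, so the agent is “paid” for encountering things it cannot yet predict. The sketch below is a minimal, assumption-laden version: the linear model, the dimensions, and the beta weighting in the usage note are illustrative choices, not a specific published system.

```python
# A hedged sketch of a prediction-based intrinsic reward: the agent keeps a
# forward model of its world and is "paid" in proportion to that model's
# prediction error. All names, dimensions, and constants are illustrative.
import numpy as np

rng = np.random.default_rng(0)
OBS_DIM, ACT_DIM, LR = 4, 2, 0.01
W = rng.normal(0.0, 0.1, (OBS_DIM + ACT_DIM, OBS_DIM))  # linear forward model

def curiosity_reward(obs, action_onehot, next_obs):
    """Intrinsic reward = squared prediction error; also updates the model."""
    global W
    x = np.concatenate([obs, action_onehot])
    error = next_obs - x @ W
    W += LR * np.outer(x, error)   # one gradient step on the squared error
    return float(error @ error)    # surprising transitions pay the most

# Usage: add the bonus to whatever extrinsic reward exists (possibly zero)
# before the reinforcement-learning update, e.g.
#   r_total = r_extrinsic + beta * curiosity_reward(s, a_onehot, s_next)
```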
What are children motivated to play with and explore, in the lab and in the real world? Many laboratory studies suggest that children direct their exploration toward the parts of the environment about which they are most uncertain. But uncertainty-directed exploration can fail in real-world contexts because some aspects of our world will always be impossible to predict. For example, a robot programmed to explore the most uncertain parts of its environment might become stuck watching the static on a broken television, transfixed by a stream of patterns that will always remain unpredictable and surprising.
Human children don’t get stuck in this way; instead, they appear to engage preferentially with stimuli at an intermediate level of complexity: not entirely predictable, but still learnable. AI researchers have similarly found that rewarding progress in learning leads AI agents to direct their exploration efficiently toward the parts of the environment where they can reduce their uncertainty the most.
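A minimal sketch of such a learning-progress bonus follows: rather than rewarding raw surprise, it rewards the reduction in prediction error over time, so an irreducibly noisy stimulus like the broken television quickly stops paying out. The smoothing constants and per-region bookkeeping are illustrative choices rather than a fixed recipe.

```python
# A hedged sketch of a learning-progress bonus: reward the *reduction* in
# prediction error rather than the error itself.
from collections import defaultdict

fast_avg = defaultdict(float)  # quickly-updated average error per region
slow_avg = defaultdict(float)  # slowly-updated average error per region

def progress_reward(region, prediction_error, a_fast=0.3, a_slow=0.05):
    """Learning progress = slow average error minus fast average error."""
    fast_avg[region] += a_fast * (prediction_error - fast_avg[region])
    slow_avg[region] += a_slow * (prediction_error - slow_avg[region])
    return max(0.0, slow_avg[region] - fast_avg[region])

# On pure noise (the broken television) both averages settle at the same
# value, so the bonus decays to zero and the agent moves on; where errors
# are still shrinking, the bonus stays positive and exploration continues.
```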
Beyond being motivated to understand their environments, children may also be motivated to control them by learning about when and how their actions can influence future events or outcomes. Infants and young children at play commonly engage in behaviors that may seem largely without purpose: sticking fingers or toes in their mouths, banging objects against a table, or stacking and knocking down blocks. Through such play, however, infants learn contingencies between their actions and changes in what they see, hear, or feel.
Laboratory studies suggest that infants and children prefer situations in which outcomes are contingent on their own actions. Being drawn to contexts in which one can exert control in turn provides a scaffolding for causal learning and planning, cognitive processes that are critical for goal-directed behavior. Likewise, artificial agents that are programmed to seek out high levels of “empowerment”, a metric of how much an agent can reliably and perceptibly influence its current environment, learn to solve complex tasks faster.
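Empowerment has a precise information-theoretic reading: it is the channel capacity from an agent’s actions to the states they produce, the maximum over action distributions of the mutual information I(A; S'). The sketch below estimates that quantity for two tiny, invented transition tables using the classic Blahut-Arimoto algorithm; everything in it is an illustrative assumption rather than a detail of the studies above.

```python
# A hedged sketch of one-step empowerment for a tabular world: the channel
# capacity max over p(a) of I(A; S'), via the Blahut-Arimoto algorithm.
import numpy as np

def empowerment(P, iters=100):
    """P[a, s'] = probability of reaching state s' after action a (nats)."""
    n_actions = P.shape[0]
    p = np.full(n_actions, 1.0 / n_actions)  # start from uniform actions
    for _ in range(iters):
        q = p @ P                            # marginal over next states
        kl = (P * np.log(P / q + 1e-12)).sum(axis=1)
        p *= np.exp(kl)                      # reweight informative actions
        p /= p.sum()
    q = p @ P
    return float(p @ (P * np.log(P / q + 1e-12)).sum(axis=1))

# Two actions that reliably reach different states: one full bit of control.
controllable = np.array([[1.0, 0.0], [0.0, 1.0]])
# Two actions with identical random outcomes: no perceptible control at all.
uncontrollable = np.array([[0.5, 0.5], [0.5, 0.5]])
print(empowerment(controllable))    # ~0.693 nats, i.e., log(2)
print(empowerment(uncontrollable))  # ~0.0
```

An agent rewarded for visiting states where this quantity is high gravitates toward the controllable parts of its world, much as children gravitate toward outcomes contingent on their own actions.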
These findings suggest that intrinsic motivations such as curiosity and a drive toward agency, which are readily observed in children’s play and learning, are also critical component processes for developing intelligence in artificial agents. Beyond intrinsic motivation, building into artificial agents other features of children’s cognitive development, such as the reactivation of prior experiences during sleep, the flexible use of different learning strategies, and the ability to monitor and infer the goals and intentions of others, may also promote the speed and flexibility of machine learning.
Because so many of the algorithmic innovations that have yielded rapid advances in AI have strong parallels to processes that underpin the development of human learning, it may be that developmental science holds the key to the genesis of machine intelligence.