OpenAI – the AI research group, originally founded by Elon Musk and Sam Altman and now heavily backed by Microsoft – has released some truly groundbreaking AI using reinforcement learning techniques in a multi-agent game of hide and seek. Games showcase important theoretical advances in AI, but putting too much weight on how an AI figures out a game can misrepresent progress and raise expectations (or fears) about machine capabilities in the real world.
But this work strikes us as different, for a couple of reasons.
First, the AI agents developed, on their own, human-like strategies for competing. This property – called emergence – led to as many as six distinct adaptations of strategy and counter-strategy in the game, some of which, the researchers did not know were even possible in the simulated environment they created.
Second, these emergent strategies involved the complex use of tools, which, as the researchers point out, is a hallmark of human and animal intelligence.
Third, the AI agents show how they can exploit, outside of a human’s control, an evolutionary arms race to improve their strategies. As the researchers note in their paper, “the vast amount of complexity and diversity on Earth evolved due to co-evolution and competition between organisms, directed by natural selection. When a new successful strategy or mutation emerges, it changes the implicit task distribution neighboring agents need to solve and creates a new pressure for adaptation.” These AI’s were off on their own.
In this research, there are no direct incentives created by human AI designers. Everything interesting happens solely a result of the competing agents continually creating new tasks for each other.
Here’s what they do:
- the hiders learn to use the tools at their disposal and intentionally modify their environment. They begin to construct secure shelters in which to hide by moving many boxes together or against walls and locking them in place.
- the seekers also learn rudimentary tool use; they learn to move and use ramps to jump over obstacles, allowing them to enter the hiders’ shelter.
- the hiders then learn to defend against this strategy; they bring the ramps to the edge of the play area and lock them in place, seemingly removing the only tool the seekers have at their disposal.
- the seekers learn to bring a box to the edge of the play area where the hiders have locked the ramps. The seekers then jump on top of the box and surf it to the hiders’ shelter.
- the hiders learn to lock all of the boxes in place before building their shelter.
Why is this scary?
For starters, the researchers got a shock. “We observed emergence of as many as six distinct rounds of strategy and counter-strategy, suggesting that multi-agent self-play with simple game rules in sufficiently complex environments could lead to open-ended growth in complexity.”
Those who worry about the control problem in AI – essentially that humans will only have one chance to “get it right” in the creation of a superintelligence or face certain annihilation, and who include Elon Musk, Stuart Russell, Nick Bostrom – have just been given more reason to worry. Any “open-ended growth in complexity” where a machine has the upper hand due to speed, scale and non-intuitive, alien-like knowledge is surely a big worry for humans.
Then there’s emergence. Emergence is one of those things that we’re yet to understand, even in humans. How our consciousness emerged from a bunch of carbon-based DNA remains a mystery, yet we are conscious, sentient, moral beings. At what point do we worry about consciousness emerging in silicon-based beings? How would we know? As Sam Harris, the neuroscientist and philosopher points out, if consciousness is just a matter of information processing, we could build machines or simulated worlds just by accident.
This research, while exciting and amazing to watch, should make all of us a bit more worried about AI. It’s one thing to wrestle with the ethics of AI we can control, it’s another altogether to see how rapidly AI agents can take what we give them and create whole new moves, some desired, some not, and many of them a surprise.