The sperm whale 'phonetic alphabet' revealed by AI

Researchers studying sperm whale communication say they've uncovered sophisticated structures similar to those found in human language.
In the inky depths of the midnight zone, an ocean giant bears the scars of the giant squid she stalks. She searches the darkness, her echolocation pulsing through the water column. Then she buzzes – a burst of rapid clicks – just before she goes in for the kill.
But exactly how sperm whales catch squid, like many other areas of their lives, remains a mystery. "They're slow swimmers," says Kirsten Young, a marine scientist at the University of Exeter. Squid, on the other hand, are fast. "How can [sperm whales] catch squid if they can only move at 3 knots [5.5 km/h or 3.5mph]? Are the squid moving really slowly? Or are the whales stunning them with their vocalisations? What happens down there? Nobody really knows," she says.
Sperm whales are not easy to study. They spend much of their lives foraging or hunting at depths beyond the reach of sunlight. They are capable of diving over 3km (10,000ft) and can hold their breath for two hours.

"At 1000m (3300ft) deep, many of the group will be facing the same way, flanking each other – but across an area of several kilometres," says Young. "During this time they're talking, clicking the whole time." After about an hour, she says, the group rises to the surface in synchrony. "They'll then have their rest phase. They might be at the surface for 15 to 20 minutes. Then they'll dive again," she says.
At the end of a day of foraging, says Young, the sperm whales come together at the surface and rub against each other, chatting while they socialise. "As researchers, we don't see a lot of their behaviour because they don't spend that much time at the surface," she says. "There's masses we don't know about them, because we are just seeing a tiny little snapshot of their lives during that 15 minutes at the surface."
It was around 47 million years ago that land-roaming cetaceans began to gravitate back towards the ocean – that's 47 million years of evolution in an environment alien to our own. How can we hope to easily understand creatures that have adapted to live and communicate under such different evolutionary pressures to ourselves?
"It's easier to translate the parts where our world and their world overlap – like eating, nursing or sleeping," says David Gruber, lead and founder of the Cetacean Translation Initiative (Ceti) and professor of biology at the City University of New York. "As mammals, we share these basics with others. But I think it's going to get really interesting when we try to understand the areas of their world where there's no intersection with our own," he says.

Now, from elephants to dogs, modern technology is helping researchers to sift through enormous datasets, and uncover previously unknown diversity and complexity in animal communication. And Ceti's researchers say they, too, have used AI to decode a "sperm whale phonetic alphabet".
In 2005, Shane Gero, biology lead for Ceti, founded The Dominica Sperm Whale Project to study the social and vocal behaviour of around 400 sperm whales that live in the Eastern Caribbean. Almost 20 years – and thousands of hours of observation – later, the researchers have discovered intricacies in whale vocalisations never before observed, revealing structures within sperm whale communication akin to human language.
Sperm whales live in multi-level, matrilineal societies – groups of daughters, mothers and grandmothers – while the males roam the oceans, visiting the groups to breed. They are known for their complex social behaviour and group decision-making, which requires sophisticated communication. For example, they are able to adapt their behaviour as a group when protecting themselves from predators like orcas or humans.
Sperm whales communicate with each other using rhythmic sequences of clicks, called codas. It was previously thought that sperm whales had just 21 coda types. However, after studying almost 9,000 recordings, the Ceti researchers identified 156 distinct codas. They also noticed the basic building blocks of these codas, which they describe as a "sperm whale phonetic alphabet" – much like phonemes, the units of sound in human language which combine to form words.
Pratyusha Sharma, a PhD student at MIT and lead author of the study, describes the "fine-grain changes" in vocalisations the AI identified. Each coda consists of between three and 40 rapid-fire clicks. The sperm whales were found to vary the overall speed, or "tempo", of the codas, and also to speed up and slow down during the delivery of a coda, in other words to make it "rubato". Sometimes they added an extra click at the end of a coda, akin, says Sharma, to "ornamentation" in music. These subtle variations, she says, suggest sperm whale vocalisations could carry far more information than previously thought.
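One way to picture the features Sharma describes is to treat a coda as a list of click times. The Python sketch below is purely illustrative and is not the Ceti team's method: the function name `coda_features`, the example timestamps, and the choice to measure tempo as total duration and rhythm as each gap's share of that duration are all assumptions made for this example.

```python
def coda_features(click_times):
    """Summarise one coda from its click timestamps (seconds, ascending).

    Hypothetical helper for illustration only, not the researchers' code.
    """
    if len(click_times) < 2:
        raise ValueError("a coda needs at least two clicks")
    # Gaps between consecutive clicks.
    intervals = [b - a for a, b in zip(click_times, click_times[1:])]
    # Assumed proxy for "tempo": total time from first to last click.
    duration = click_times[-1] - click_times[0]
    # Assumed proxy for "rhythm": each gap as a fraction of the duration.
    rhythm = [gap / duration for gap in intervals]
    return {"clicks": len(click_times), "duration": duration, "rhythm": rhythm}


# Two invented four-click codas: same pattern, one delivered twice as fast.
slow = coda_features([0.0, 0.2, 0.4, 0.8])
fast = coda_features([0.0, 0.1, 0.2, 0.4])
print(slow)
print(fast)
```

In this made-up example the two codas share the same rhythm but differ in tempo, which is the kind of fine-grained variation described above.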
"Some of these features are contextual," says Sharma. "In human language, for example, I can say 'what' or 'whaaaat!">window._taboola = window._taboola || []; _taboola.push({ mode: 'alternating-thumbnails-a', container: 'taboola-below-article', placement: 'Below Article', target_type: 'mix' });