Project CETI (Cetacean Translation Initiative) aims to decode the “language” of sperm whales. Using artificial intelligence, the team hopes one day to decipher the clicks that sperm whales use for echolocation and for communicating with each other. Eventually, the researchers may even devise a way to converse with these giants of the sea.

The clicks are organized into characteristic sequences called codas. To decode these series of sounds, the researchers plan to use natural language processing (NLP), a subfield of artificial intelligence focused on processing written and spoken human language. The team has already run an NLP algorithm on recordings of sperm whale codas, with promising results.
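To give a rough idea of what treating codas like words can mean, here is a minimal Python sketch. It is not the CETI team's method: the coda labels `A`, `B` and `C` and the sequence itself are made-up placeholders, and counting adjacent pairs (bigrams) is only one of the simplest statistical steps used in language processing.

```python
from collections import Counter

# Hypothetical sequence of coda types heard in one exchange.
# The labels are placeholders, not real CETI data.
exchange = ["A", "B", "A", "B", "C", "A", "B"]


def bigram_counts(tokens):
    """Count how often each coda type is immediately followed by another."""
    return Counter(zip(tokens, tokens[1:]))


def next_token_probs(tokens, current):
    """Estimate the probability of each coda type following `current`."""
    counts = bigram_counts(tokens)
    following = {b: n for (a, b), n in counts.items() if a == current}
    total = sum(following.values())
    return {b: n / total for b, n in following.items()}


if __name__ == "__main__":
    print(bigram_counts(exchange))
    print(next_token_probs(exchange, "B"))
```

With enough real recordings, statistics of this kind could hint at whether some codas tend to follow others, which is the sort of regularity a language model learns from.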

Why sperm whales in particular? Apart from having the largest brains of any species, these animals display traits similar to those of humans. They appear capable of conscious thought, planning, communication and feeling; they are gregarious animals that live in groups of 20 to 40 individuals linked by strong family ties. Their sophisticated acoustic communication provides an excellent starting point for developing advanced machine learning tools that could later be applied to other animals.

Objectives: collect and contextualize data

The main obstacle in this project is collecting enough data: machine learning requires a very large data set for training and building models. The team’s objective is to collect four billion sperm whale “words”! To establish a first proof of concept, the team initially relied on research carried out by the Dominica Sperm Whale Project, which has collected just under 100,000 codas along with valuable information on the social life and behavior of the cetaceans.

Almost 100,000 codas may sound like a lot, but it is actually very little for the task. For comparison, GPT-3, the deep learning language model developed by OpenAI and released in 2020, has around 175 billion parameters (ten times more than any language model developed previously) and was trained on hundreds of billions of words of text!

Another difficulty is putting all the codas in context. In human language, words can have different meanings depending on the context, or even no meaning at all; the same may be true of other forms of communication, including the clicking of sperm whales. It will undoubtedly take years of research on these cetaceans in their natural habitat to associate each sound with a particular context.

The CETI project brings together cryptographers, roboticists, linguists, AI experts, technologists and marine biologists from universities around the world to carry out this Herculean task. In 2020, the team organized a workshop on decoding communication in non-human species at the Simons Institute for the Theory of Computing, where experts studying communication across a variety of species shared their research; it was an opportunity to gather a great deal of information on the subject.

Towards greater respect for the living world

But do animals really have a language? The question is still debated in the scientific community. Many believe that language is exclusively human. According to Austrian biologist Konrad Lorenz, one of the pioneers of the science of animal behavior, “animals do not have language in the true sense of the word”; in other words, they communicate, but do not speak. Karsten Brensing, a German marine biologist specializing in animal communication, believes on the contrary that the exchanges of many animals can be described as languages.

For this, he says, several conditions must be met: semantics (for meaning), grammar (for constructing sentences) and vocabulary learning (for communication to count as language, the sounds an animal produces must not all be innate). Some animals, such as certain species of birds and dolphins, have already demonstrated this learning capacity while meeting the other two conditions. Sperm whale clicks appear to be ideal candidates for decoding, not least because they are easier to translate into 0s and 1s than the continuous sounds produced by other whale species.
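The idea of turning clicks into 0s and 1s can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the click times are invented, the 100-millisecond bin width is arbitrary, and the function names are hypothetical rather than taken from any CETI tool.

```python
def inter_click_intervals(click_times_ms):
    """Return the gaps, in milliseconds, between successive clicks."""
    return [b - a for a, b in zip(click_times_ms, click_times_ms[1:])]


def clicks_to_bits(click_times_ms, bin_ms=100):
    """Split time into bins of `bin_ms`; mark a bin 1 if it contains a click."""
    if not click_times_ms:
        return []
    n_bins = max(click_times_ms) // bin_ms + 1
    bits = [0] * n_bins
    for t in click_times_ms:
        bits[t // bin_ms] = 1
    return bits


if __name__ == "__main__":
    # Invented click times for one coda, in milliseconds from its first click.
    coda = [0, 200, 400, 500, 600]
    print(inter_click_intervals(coda))               # [200, 200, 100, 100]
    print("".join(map(str, clicks_to_bits(coda))))   # 1010111
```

Because each click is a short, discrete event, the rhythm of a coda survives this kind of encoding, whereas a continuous song would first have to be cut into units.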

If the team meets its goals, the next step would be to develop an interactive chatbot that would attempt to engage in dialogue with sperm whales living in the wild, a feat that could completely alter the way humans perceive and interact with nature. The researchers admit that their work might also reveal nothing of interest; in other words, the whales could turn out to be incredibly boring. “But we don’t think that’s the case. In my experience as a biologist, whenever I have really looked at something up close, there has never been a moment when I have been disappointed with animals,” said David Gruber, the project lead.

The team says that the CETI data will be made public to encourage collaboration. It hopes that the discoveries made about sperm whales will provide a basis for better understanding the communication of other animals, both in the ocean and on land: elephants, birds, gorillas and more. “If we find out that there is an entire civilization under our noses, it could lead to a change in the way we treat our environment. And maybe this will result in greater respect for the living world,” explains Michael Bronstein, head of machine learning for the CETI project.

Sources: CETI Project and Hakai Magazine