“We have built an AI capable of predicting everything about an individual’s life”

The “death calculator” knows the date of your last breath. Called “life2vec”, this algorithm, developed by Danish researchers and comparable in power to the latest generation of artificial intelligence models, is capable of predicting every stage of an individual’s life, up to and including their death. Trained on anonymized data from six million Danes collected by the national statistics institute, the algorithm works in the same way as ChatGPT, except that instead of analyzing a text or a sentence, it analyzes the events in people’s lives to guess what happens next. Life2vec predicts death correctly in 78% of cases.

Sune Lehmann, professor at the Technical University of Denmark (DTU) and co-author of the study published in the journal Nature Computational Science, helps us understand the significance of this crystal-ball algorithm.

Can you explain to us how your algorithm works?

Beyond the prediction of death, it is the general nature of our research that is interesting from a scientific point of view. We live in an era dominated by large language models like ChatGPT or Gemini [Google’s AI], and what they accomplish is incredible. They analyze language as a sequence of words that follow one another. Fundamentally, they work the same way as the autocomplete on our smartphones: you type a letter and it suggests a word, often the right one. That is basically what these large language models do. They don’t just look at the word immediately before; they look at all the preceding words in order to continue a text, one by Shakespeare, for example. If you use old English words like “thee”, an idea of Shakespeare’s style forms inside the model. It also detects the general mood of the text, whether it is happy or sad, so that the prediction is as accurate as possible. It’s a sort of mini-model of the world.
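The autocomplete comparison can be made concrete with a toy sketch. It is not how ChatGPT or Gemini work internally: it only counts which word tends to follow which, and so looks at the single previous word, whereas the models described here take the whole preceding text into account. The function names and the sample sentence are invented for the illustration.

```python
from collections import Counter, defaultdict

def train_bigrams(text):
    """Count, for each word, which words follow it in the training text."""
    counts = defaultdict(Counter)
    words = text.lower().split()
    for prev, nxt in zip(words, words[1:]):
        counts[prev][nxt] += 1
    return counts

def suggest(counts, word):
    """Return the most frequent follower of `word`, or None if it was never seen."""
    followers = counts.get(word.lower())
    if not followers:
        return None
    return followers.most_common(1)[0][0]

# Illustrative training text (an assumption, not data from the study).
corpus = "to be or not to be that is the question"
model = train_bigrams(corpus)

print(suggest(model, "to"))      # be
print(suggest(model, "hamlet"))  # None: the word never appeared in training
```

A large language model replaces this lookup table with a learned representation of the entire context, which is what lets it pick up on style or mood rather than only on the last word.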

What about life2vec?

I have just described the general idea behind these large artificial intelligence models. Now, let’s look at a human life as a sentence. You were born in this hospital, in this place, you have this birthday, you have this Apgar score [a score out of 10 assessing a newborn’s vitality], you live at this address, you go to this nursery, then this school, and so on. What if the events of a life follow one another in the same way that one word follows another? Large language models can keep writing reams of text or perform tasks better than humans. On the same principle, the algorithm makes sense of life events. We have built a model capable of predicting everything about an individual’s life. This is what economics has been doing for years: predicting the behavior of individuals or groups of individuals. Insurance companies do it too.
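To illustrate the idea of a life as a sentence, here is a minimal sketch of how events could be turned into a sequence of tokens, the way words are before being fed to a language model. The event labels, the `build_vocab` and `encode` helpers and the two sample lives are all hypothetical; the interview does not describe the actual encoding used by life2vec.

```python
def build_vocab(lives):
    """Assign an integer id to every distinct event, in order of first appearance."""
    vocab = {}
    for life in lives:
        for event in life:
            vocab.setdefault(event, len(vocab))
    return vocab

def encode(life, vocab):
    """Turn one life (a list of event labels) into a list of integer ids."""
    return [vocab[event] for event in life]

# Two invented lives, written as chronological lists of events.
lives = [
    ["born:hospital_A", "apgar:9", "address:city_X", "nursery:N1", "school:S1"],
    ["born:hospital_B", "apgar:7", "address:city_X", "school:S1"],
]

vocab = build_vocab(lives)
print(encode(lives[1], vocab))  # [5, 6, 2, 4]
```

Once lives are sequences of ids, the same kind of model that predicts the next word can, in principle, be trained to predict the next event.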

How does your algorithm differ from the old ones?

It differs in the way it works. We feed the model everything we know, the histories of millions of people, and we ask it to make predictions using everything it knows about those people. With the old models, we had to decide in advance what might be important: the individual’s age, where they live, their sex at birth, their level of education… Here, the model itself searches the life sequence for the information it considers relevant, and the result is much better than that of the old models.

Apart from death, what can life2vec predict?

People’s personalities. Predicting death, which is very well studied and accurately recorded, is at one extreme of the spectrum. At the other end is personality. For death, the algorithm is presented with the data of 50,000 people who have died and that of 50,000 people who are still alive, and it learns the patterns in the lives of those who survived and those who did not. When it is shown new people, it can use these patterns to classify them into one category or the other, alive or dead. Some people in the database also took personality tests. The algorithm manages to make connections between individuals’ ways of life and their answers to these tests: whether they are extroverted, whether they dominate discussions, whether they are sociable… The algorithm predicts personality much better than all other algorithms. The same goes for migration: predicting whether a Dane will leave the country during their life.
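The alive-or-dead classification described above can be sketched with a far simpler technique than the one used in the study. The example below is a small naive-Bayes-style scorer, swapped in purely for illustration: it counts how often each event appears in each group of a balanced training set, then scores a new person. The event labels, the four training examples and the function names are invented.

```python
import math
from collections import Counter

def train(sequences, labels):
    """Count in how many lives of each class (1 = died, 0 = alive) each event occurs."""
    counts = {0: Counter(), 1: Counter()}
    for seq, label in zip(sequences, labels):
        counts[label].update(set(seq))
    totals = Counter(labels)
    return counts, totals

def log_odds(seq, counts, totals):
    """Sum of smoothed log-ratios; a positive score leans towards class 1."""
    score = 0.0
    for event in set(seq):
        p1 = (counts[1][event] + 1) / (totals[1] + 2)
        p0 = (counts[0][event] + 1) / (totals[0] + 2)
        score += math.log(p1 / p0)
    return score

def predict(seq, counts, totals):
    """Classify a life sequence as 1 or 0 from the sign of its score."""
    return 1 if log_odds(seq, counts, totals) > 0 else 0

# Balanced toy training set: two lives per class (invented data).
train_seqs = [
    ["hospital_stay", "early_retirement"],
    ["hospital_stay", "low_income"],
    ["promotion", "gym_membership"],
    ["promotion", "moved_city"],
]
train_labels = [1, 1, 0, 0]
counts, totals = train(train_seqs, train_labels)

print(predict(["hospital_stay", "early_retirement"], counts, totals))  # 1
print(predict(["promotion", "gym_membership"], counts, totals))        # 0
```

With as many deceased as living people in the sample, guessing at random would be right about half the time, which is the baseline any accuracy figure measured on such a balanced sample has to be compared with.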

How can the algorithm predict life’s accidents?

Chance is an important part of human life, and we cannot predict it. This is a question for my next study. Some things, like health, do not happen by chance: you can predict your health several years ahead. A career, on the other hand, is much more unpredictable. It has much more to do with accidents, positive or negative: if you are lucky and meet the right person, your trajectory changes. We can begin to understand what is predictable and what is unpredictable in human lives.

How will your model be used once it is available?

This is a research project. It’s about discovering the underlying data that allows us to understand the world. It will not be accessible to the general public. I don’t think it’s interesting for someone to know that an algorithm knows the date of their death. What matters is rather how accurate this algorithm is compared with others. Clearly, this way of modeling human life is powerful and could be of interest in the field of health: we could see how it might help patients with prevention, or look at how we could identify people at risk of cancer earlier. This algorithm is a proof of concept, but it has a whole bunch of biases.

What kinds of bias?

Historical differences in how we have been treated lead to differences in how we live. Even if we are making progress in terms of equality between men and women, for example, it is not perfect. The more old data the database contains, the more bias we will find.

And what are the concrete consequences of these biases?

A well-known example of algorithmic bias is predictive policing in the United States. Crime statistics were already biased. Because many racialized people appeared in the database, the algorithm flagged these people as being at risk of committing a crime. At the same time, young white people who smoke cannabis, for example, were not monitored, because the algorithm had not flagged them as people who might be breaking the law. Trained on an already biased database, the algorithm makes the problem worse.
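The way a biased database reinforces itself can be shown with a deliberately simplified simulation. It is a toy model under stated assumptions, not a description of any real policing system: two groups offend at exactly the same rate, the starting records are skewed, and every round all monitoring goes to the group with the most records. The group names, numbers and `simulate` function are invented.

```python
def simulate(records, true_rate, patrols, rounds):
    """Each round, send all patrols to the group with the most recorded offences."""
    records = dict(records)  # work on a copy so the starting data is left untouched
    for _ in range(rounds):
        target = max(records, key=records.get)
        # Only the monitored group can generate new records.
        records[target] += round(patrols * true_rate[target])
    return records

# Skewed starting records, identical true offence rates (assumed values).
start = {"group_a": 60, "group_b": 40}
rate = {"group_a": 0.1, "group_b": 0.1}

end = simulate(start, rate, patrols=100, rounds=5)
print(end)  # {'group_a': 110, 'group_b': 40}
```

In this sketch group_a goes from 60% of the records to roughly 73%, even though both groups behave identically, simply because it is the only one being watched.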

What is the next step? Working with even more precise data than that provided by the national statistics institute?

We need a lot of data on each person, and we need the responses to be comparable with one another. Some people already have this information. All the big tech companies have it. Instagram, for example, has stored tons of images, the context of these images, the people who appear in them, the socio-professional category of these people… Google has all your data, your email exchanges… Our work gives an idea of what can be done with this data. And these companies refuse to say what they do with it. Facebook collects all kinds of data and makes money by selling predictions about our future behavior. Is this really the world we want to live in, giving incredible amounts of data to companies that can predict with great accuracy what we will do in the future? We need to start talking about it and, hopefully, decide that we are not comfortable with this idea.
