A research group has used artificial intelligence to identify thousands of previously unknown RNA viruses. The AI model they developed is based on the same architecture as ChatGPT.
Key points in brief
Research into RNA viruses has not yet made much progress.
Scientists have now discovered 70,500 unknown types of viruses using artificial intelligence.
In the future, it is hoped that this will enable us to gather more information about mysterious diseases.
The use of artificial intelligence in medicine and research holds immense potential. Now a team of researchers has used AI technology to discover over 70,000 previously unknown viruses. Many of them are strange and bear no resemblance to known species.
As the journal “Nature” reports, citing the journal “Cell”, recent studies reveal how scientists used machine learning to examine predicted protein structures. The scientists see in the new AI-based method the potential to explore the “dark matter” of the RNA virus universe.
Artem Babaian, a computational virologist at the University of Toronto in Canada, calls the number of undetected virus species “a bottomless pit”, and some of them could cause illness in people. According to the virologist, identifying them could help to explain mysterious illnesses.
The AI model builds on a protein prediction tool called ESMFold, which was developed by researchers at Meta (formerly Facebook). The newly discovered RNA viruses were identified using metagenomics, in which genomes are examined directly in samples from their natural habitats rather than being cultivated in the laboratory.
Advances in virus research using machine learning
In 2022, scientists searched 5.7 million genomic samples from publicly accessible databases and identified 132,000 new RNA viruses. However, because RNA viruses evolve quickly, existing methods miss many species. A common method, for example, looks for the section of the genome that encodes a key protein for RNA replication, known as RNA-dependent RNA polymerase (RdRp). If this sequence is very different from all known sequences, however, it cannot be recognized.
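The limitation described above can be illustrated with a small Python sketch. It is a toy model, not the method used in any of the studies: the sequences are invented, the helper names are hypothetical, and the 60 percent identity threshold is an arbitrary assumption. Real pipelines rely on alignment or profile searches, which this sketch does not reproduce.

```python
# Toy illustration only: the sequences below are invented, not real RdRp data,
# and the 0.6 threshold is an arbitrary assumption.

def identity(a: str, b: str) -> float:
    """Fraction of positions at which two equal-length sequences match."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x == y for x, y in zip(a, b)) / len(a)

def best_identity(query: str, references: list[str]) -> float:
    """Highest identity between the query and any known reference."""
    return max(identity(query, ref) for ref in references)

def looks_like_known_rdrp(query: str, references: list[str],
                          threshold: float = 0.6) -> bool:
    """A query is 'recognized' only if it resembles a known sequence."""
    return best_identity(query, references) >= threshold

KNOWN = ["MSGDDLVVKA", "MTGDDMVIKS"]   # hypothetical known RdRp fragments
close_relative = "MSGDDLVIKA"          # one substitution away from KNOWN[0]
divergent = "QWGNDPTYRC"               # shares only two positions with either

print(looks_like_known_rdrp(close_relative, KNOWN))  # recognized
print(looks_like_known_rdrp(divergent, KNOWN))       # missed
```

The close relative passes the threshold, while the divergent sequence is missed even if it were to encode a working polymerase. This is the gap a model trained on predicted protein structures is meant to close.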
Shi Mang, an evolutionary biologist at Sun Yat-sen University in Shenzhen, China, and co-author of the study, developed the model LucaProt, which, like ChatGPT, is based on the transformer neural network architecture. Shi Mang fed the model sequencing data and ESMFold protein prediction data. He then trained it to recognize viral RdRps and used it to find sequences that encoded these enzymes, which served as evidence that the sequences belonged to a virus.
This method made it possible to identify over 160,000 RNA viruses. Some of them were exceptionally long, and some came from extreme environments such as hot springs, salt lakes, or even the air. About half of these viruses were not previously known. The researchers found “small areas of RNA virus biodiversity that lie really far away in the vastness of evolutionary space,” says computational virologist Babaian.
Hope for more insights into the “dark matter” of the RNA virus universe
“This is a really promising approach to expanding the virosphere,” said Jackie Mahar, an evolutionary virologist at the CSIRO Australian Center for Disease Preparedness in Geelong. Characterizing the viruses can help scientists understand the origins of the microorganisms and how they have evolved in different hosts. With each newly identified virus, it becomes easier to find similar viruses. “Suddenly you can see things that you simply hadn’t seen before,” says Babaian. However, the research team was unable to determine the hosts of the viruses they discovered.
Researchers are particularly interested in whether any of the new viruses infect archaea. These are single-celled organisms without a cell nucleus, which have been considered a separate domain of life alongside bacteria and eukaryotes since 1977. Using state-of-the-art technology, they have been detected in all ecosystems, including the human intestine. Based on analyses of their genetic information, they are regarded as evolutionary precursors of our cells. To date, no RNA viruses have been clearly shown to infect archaea.
RNA viruses store their genetic information in RNA, often as a single strand, rather than in DNA, and they usually reproduce faster than other types of viruses. Some of these pathogens also affect people, such as the coronavirus, influenza, Ebola, and rabies. However, viruses not only cause diseases but also perform other functions in the ecosystem, researchers suspect.
By researching RNA viruses, scientists hope to be able to draw more conclusions in the future about which living beings a given virus can infect. Little is known so far about the function of RNA viruses and which diseases they can cause. With more knowledge, future pandemics could be better assessed in advance and treated more effectively.
Sources used:
nature.com: “AI scans RNA ‘dark matter’ and uncovers 70,000 new viruses”
br.de: “Viruses from prehistoric times? – About the origin of viruses and their role in the evolution of all living things”
welt.de: “Thousands of mysterious RNA viruses discovered in seawater”