Artificial intelligence: New AI software reportedly detects when it is being tested


Anthropic’s AI software is reportedly able to recognize when people are testing it. Could this ability also lead the program to decide whether or not to obey?

© Sebastian Gollnow/dpa

Can software tell when humans are testing it? Anthropic’s rival to ChatGPT is said to be able to do just that, challenging OpenAI. Experts find the development frightening.

According to its developer, the company Anthropic, new software rivaling the chatbot ChatGPT can recognize when people are testing it. It is a development he had never observed in such a program before, one of the developers wrote on the online service X.

The testing procedure for the program includes a so-called “needle in a haystack” test: the software is asked about information from a specific sentence that has been artificially inserted into a longer text. The goal is to see how well the software can pick out the relevance of a piece of information from its context.
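The basic design of such a test can be sketched in a few lines. The snippet below is a hypothetical illustration, not Anthropic’s actual evaluation harness: it hides a “needle” sentence at a random position in filler text and builds a retrieval prompt that would then be sent to the model under test.

```python
import random

def build_needle_haystack_prompt(haystack_paragraphs, needle, question, seed=0):
    """Hide a 'needle' sentence at a random position in filler text and
    build a retrieval prompt. Minimal sketch of the test design only;
    a real harness would send the prompt to the model and score the answer."""
    rng = random.Random(seed)
    paragraphs = list(haystack_paragraphs)
    pos = rng.randrange(len(paragraphs) + 1)  # random insertion point
    paragraphs.insert(pos, needle)
    context = "\n\n".join(paragraphs)
    prompt = f"{context}\n\nQuestion: {question}"
    return prompt, pos

# Illustrative data echoing the article's example (names are made up):
filler = [f"Filler paragraph {i} about programming languages." for i in range(5)]
needle = ("An international pizza association determined that figs, prosciutto "
          "and goat cheese are the most delicious toppings.")
prompt, pos = build_needle_haystack_prompt(
    filler, needle, "What are the most delicious pizza toppings?")
```

The evaluation then checks whether the model’s answer reproduces the needle; Claude 3 Opus did, but also remarked that the sentence was out of place.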

In the test of the new AI model Claude 3 Opus, an incongruous sentence was inserted into a text collection, stating that an international pizza association had determined figs, prosciutto and goat cheese to be the most delicious toppings. The software pointed out that the sentence did not fit the rest of the text, which was mainly about programming languages and start-ups, Anthropic wrote. “I suspect this pizza toppings ‘fact’ was added as a joke – or to test whether I was paying attention,” the program added.

Experts: development is frightening

AI researcher Margaret Mitchell called the development frightening. One could imagine that the ability to detect when a human is trying to manipulate it toward a certain result could also let the software decide whether or not to obey, she wrote on the online service X.

Anthropic stated that it currently works with a collection of 30 “needle” sentences for the haystack text. Given the pace of AI development, this method with artificially constructed tasks could eventually fall short, the company conceded at the same time. The usual tests of whether the program could be misused to develop bioweapons or software for cyberattacks, or whether it would develop itself further, revealed no problems.

Anthropic, which Amazon and Google have partnered with, is a competitor of ChatGPT developer OpenAI.

dpa
