with Gemini, Google intends to compete with the best OpenAI models

“Gemini is the model of artificial intelligence [IA] the most powerful and generalist that we have published », welcomed Google CEO Sundar Pichai on Wednesday, December 6. This new model of automatic language and image processing represents “the state of the art in the results of numerous model evaluation tests”, added the leader. One way of saying that Gemini is supposed to compete with its best competitors in the market, including GPT-4, the most recent model released by OpenAI, the creator of ChatGPT.

Google claims that, according to its evaluations, Gemini exceeds GPT-4 in eight of the nine main tests recognized by the community, in math, translation, python programming… The model would also have solved 90% of the 57 tasks of one of these classic tests imagined by the University of Berkeley (California) in 2020 to compare the comprehension abilities of language models, in math, law or history. GPT-4 is at 87%, when GPT-3, its predecessor, was at 45%.

Another new feature, Gemini has been trained not only on trillions of texts, but also on tons of images (photos, graphics, etc.), sounds or videos. The company also emphasizes the capabilities of “reasoning” and of ” planning “ of his tool “multimodal”.

Read the decryption: Article reserved for our subscribers How Google seeks to stay at the forefront of artificial intelligence

In its demonstrations, Google shows that the latter is capable of ” to understand ” a hand-written school exercise, to spot errors and propose the correct reasoning. The software can also find a curve in a scientific article to possibly update it with new data. Gemini can also offer answers in images, although, for the moment, the answers in its first versions accessible to the public and businesses will remain textual.

Very evasive

Gemini comes in three versions, different sizes, Ultra, Pro and Nano. The last one, with less than 4 billion parameters, can run on a smartphone. This is also the case for “small” models of Llama (Meta) or Mistral, with, according to Google’s own evaluations, better performance than Gemini Nano.

Gemini would also have made it possible to considerably increase the capabilities of another software from its subsidiary Google DeepMind, AlphaCode, specialized in programming and released in 2022. AlphaCode 2 would thus have solved twice as many problems in this area as its predecessor. The tool will seek to compete with the leader in this category, GitHub Copilot, created by Microsoft with models from its partner OpenAI.

You have 55% of this article left to read. The rest is reserved for subscribers.

source site