08.12.2023
350
Google on Wednesday began rolling out Gemini, a new artificial intelligence (AI) model that will help it compete more effectively with OpenAI (creator of ChatGPT) and Microsoft, from consumer applications to enterprise computing capabilities.
Eli Collins, vice president of Google DeepMind, the California Group's AI research lab, assured in a press presentation, "This is our most consistent, most talented, and most popular AI model".
He then demonstrated a video showing users Gemini objects, drawings, and videos. The AI system verbally comments on what it "sees", identifies objects, plays music, and answers questions that require some analysis to justify its "reasoning".
For example, faced with an image of a plastic duck that must choose between two paths-left to another duck drawn on paper and right to a bear that looks menacing-Gemini suggests choosing the left path because it is "better to make friends than enemies.
The video also makes recognizable references without much context, such as the scene in "The Matrix" where Gemini is played by a man pretending to dodge bullets in slow motion.
The new model "is multimedia from the start, has sophisticated reasoning capabilities, and can be coded at a high level of sophistication", Eli Collins elaborates.
He said Gemini is the first AI model to outperform human experts in the industry-standard "MMLU" test, which is used to assess the ability of computer programs to reason in areas ranging from math to history to law.
Since launching ChatGPT a year ago, the Silicon Valley giant has joined the frantic race for so-called generative AI, which can produce text, images, or lines of code equivalent to what a human would produce simply by asking queries in everyday language.
AI leader Google, alarmed by the phenomenal success of ChatGPT, has responded with its own chatbot Bard.
But it's all about the model - the computer system at the heart of these applications, initially powered by text collected from the Internet, now receives all sorts of data to process image queries and communicate with users.
In September, OpenAI said it had made ChatGPT "more intuitive" by adding voice and vision.
Gemini is "another step toward our goal of bringing the world's best AI collaborator," as Sissy Hsiao, Google's vice president for Bard, said Wednesday.
Bard's functionality has already been expanded, but is still limited to written requests and English only.
Other features and formats, such as advanced help with math problems, will have to wait until 2024.
Less well known than ChatGPT, Bard has an opportunity to try to win back a competitor, which has become a victim of its own success: overwhelmed by demand, OpenAI suspended subscriptions to the paid version.
On December 13, Google will also offer access to the first version of Gemini to its cloud (remote computing) customers, including developers building their own AI applications using the Vertex AI platform.
In this area, the Internet giant is in direct competition with Microsoft, a major investor in OpenAI and the world's second largest provider of cloud computing after Amazon.
Both US concerns have been adding generative AI tools to their software (search engines, office and productivity software, cloud platforms, etc.) for a year.
Google CEO Sundar Pichai was quoted in the press release as saying, "This new model era is one of the greatest technological efforts we have made as a company".
Review
leave feedback