Google I/O 2024: Fascinating announcements and new products

Artificial intelligence has been a priority for Google ever since CEO Sundar Pichai declared in 2017 that the company would be "AI-first". At the latest Google I/O, however, that promise came through more strongly than ever. "Google has fully entered the Gemini era. We've been investing in AI for over a decade and innovating at every level: research, products, infrastructure, and today we're going to talk about all of that," Pichai said at the event. "However, we are in the early stages of the AI platform shift. We see a lot of opportunities ahead for creators, developers, startups, everyone. Helping realise these opportunities is what our Gemini era is all about," he added.

During the event in Mountain View, California, Google made a number of announcements about its most talked-about technological advances of recent years. From new Gemini models to AI-powered virtual assistants to image and video creation tools, we break down some of the most significant Google I/O 2024 announcements below.

Generative AI is revolutionising Google Search

Google is using generative AI to expand Search, introducing AI Overviews: Gemini-powered summaries that give users quick, comprehensive answers on complex topics with minimal effort. The feature launches first in the US and will roll out to other countries in the coming months.

Google Search is also gaining multi-step reasoning, so complex queries can be answered accurately and in detail in a single search. Planning features built directly into Search simplify tasks such as meal and trip planning, offering personalised recommendations and easy customisation. Advances in video understanding also let users search with video, making it easier to find relevant information from visual cues.

Google Gemini expands its family

One of the first announcements of the event was the expansion of Google's Gemini family of AI models with the new 1.5 Flash: a model lighter than 1.5 Pro, designed for fast, cost-efficient serving at scale. According to Demis Hassabis, CEO of Google DeepMind, Flash excels at tasks such as summarisation, chat applications, image and video captioning, and extracting data from long documents.
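
For developers, the kind of summarisation task Hassabis describes maps directly onto a call to the lighter model through the Gemini API. The sketch below is minimal and assumes the google-generativeai Python package, a GOOGLE_API_KEY environment variable, and a hypothetical long_report.txt file:

```python
# Minimal sketch: summarising a long document with Gemini 1.5 Flash.
# Assumes the google-generativeai package and a GOOGLE_API_KEY environment
# variable; the input file name is hypothetical.
import os
import google.generativeai as genai

genai.configure(api_key=os.environ["GOOGLE_API_KEY"])

# Flash targets fast, cost-efficient serving at scale.
model = genai.GenerativeModel("gemini-1.5-flash")

with open("long_report.txt", encoding="utf-8") as f:
    document = f.read()

response = model.generate_content(
    f"Summarise the key points of this document in five bullet points:\n\n{document}"
)
print(response.text)
```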

The company also unveiled significant improvements to Gemini 1.5 Pro, which now boasts a context window of up to 2 million tokens and better performance in several areas, including reasoning, coding, and image understanding. Gemini gains new data analysis capabilities and an improved conversational experience as well. The update, available to Gemini Advanced subscribers, includes a 1-million-token context window, letting Gemini ingest and analyse large amounts of information: summarising 100 emails, for example, or analysing 1,500-page documents. Users can also upload files directly into Gemini for analysis and reporting.
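
The long context window pairs naturally with file uploads. As a minimal sketch, again assuming the google-generativeai package (the public API counterpart to the Gemini Advanced features described above), a large document could be uploaded and queried like this; the file name is hypothetical:

```python
# Minimal sketch: analysing a large uploaded file with Gemini 1.5 Pro's
# long context window. Assumes the google-generativeai package; the
# document name is hypothetical.
import os
import google.generativeai as genai

genai.configure(api_key=os.environ["GOOGLE_API_KEY"])
model = genai.GenerativeModel("gemini-1.5-pro")

# Upload once; the returned file handle can then be referenced in prompts.
report = genai.upload_file("quarterly_filings.pdf")

response = model.generate_content(
    [report, "List the main risks discussed in this document, with page references."]
)
print(response.text)
```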

Gemini mobile app updates

Sissie Hsiao, vice president and general manager of Gemini Experiences and Google Assistant, unveiled a series of updates to the Gemini mobile app. Among them is "Live", a mobile conversational experience that uses advanced speech technology for more natural interaction: users can talk to Gemini and choose from several voices for its responses.

According to Google, these updates give users finer-grained control over the chatbot's responses, paving the way for richer communication and workflow automation. Gemini Nano, previously limited to text input, is also being expanded to include image understanding, promising a more complete on-device AI experience.

In addition, Gemini can now take actions on behalf of users, such as creating customised travel itineraries based on their preferences and on information from Gmail and Google Maps. The update also lets users personalise Gemini by creating so-called "Gems": tailored versions of the AI assistant built for specific needs, such as a career coach or a creative writing guide.

Project Astra: The future of AI assistants

Alongside Project Astra, Google announced Gemma 2, the next generation of its open models. Project Astra itself aims to revolutionise AI assistants by improving how they understand and respond to human interaction: the idea is for assistants to understand and respond much as humans do, internalising and remembering what they see and hear so they can grasp context and act accordingly.

The Project Astra prototype was built on Gemini and other task-specific models, and was designed to process information faster by continuously encoding video frames, combining video and speech input into a single timeline of events, and caching that information for efficient recall. Google has also improved the agents' audio output, giving them a wider range of intonation, along with a better grasp of the context they are used in and quicker conversational responses. Some of these capabilities will appear in Google products, such as the Gemini app, later this year.
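
Google has not published Project Astra's implementation, but the pipeline described above (continuously encoding incoming frames, merging video and speech into one event timeline, and caching it for recall) can be illustrated with a purely hypothetical sketch; every name here is invented for illustration:

```python
# Purely illustrative sketch of the pipeline described above; Google has not
# published Project Astra's implementation, and every name here is
# hypothetical. Encoded frames and utterances are merged into a single
# time-ordered event log and cached for fast recall.
from collections import deque
from dataclasses import dataclass

@dataclass
class Event:
    timestamp: float
    modality: str    # "video" or "audio"
    embedding: list  # placeholder for an encoded frame or utterance

class EventTimeline:
    def __init__(self, max_events: int = 10_000):
        # Bounded cache: old context is evicted, recent context stays cheap to recall.
        self.events: deque[Event] = deque(maxlen=max_events)

    def add(self, event: Event) -> None:
        self.events.append(event)

    def recall(self, since: float) -> list[Event]:
        # Retrieve everything the agent has seen or heard after a given moment.
        return [e for e in self.events if e.timestamp >= since]
```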

Generative AI as a tool for empowerment

Google's generative AI efforts are not limited to Gemini: the company also announced Veo, a high-definition video generation model, and Imagen 3, its highest-quality text-to-image model to date. Both give creators unprecedented control and precision, letting them produce realistic videos and images with surprising ease.

Veo is Google's most capable video generation model to date, able to produce high-quality 1080p clips that can run beyond a minute, in a wide range of visual and cinematic styles. With an advanced understanding of natural language and visual semantics, it can generate videos that closely reflect the user's creative vision, rendering details from longer prompts and capturing their tone.

The model also understands cinematic terms such as "timelapse" or "aerial landscape shot", offering an unprecedented level of creative control, and it produces consistent, coherent footage in which people, animals, and objects move realistically across frames. Starting today, Veo is available to select creators as a private preview in VideoFX, and Google says some of its capabilities will come to YouTube Shorts and other products in the future.

Conclusion

Google I/O 2024 showed that the company is not resting on its laurels: it continues to push artificial intelligence forward with innovative solutions for users. Google is taking significant steps towards making AI an integral part of everyday life and work, and with innovations like these, the future of artificial intelligence looks promising and full of possibilities.
