Everything announced at Google I/O 2025

By Daniyal Afaqi

Google I/O 2025: A Look at the Latest in AI and Beyond

Google I/O is a pivotal event, bringing together developers and showcasing the company's latest advancements, particularly in Artificial Intelligence. This year's conference highlighted how AI, especially the Gemini models, is being deeply integrated across Google's products and services, moving from research into real-world applications. Google noted a significant increase in AI usage: its products and APIs now process over 480 trillion tokens a month, a 50x increase from the previous year. Developer adoption of Gemini is also showing incredible momentum, with over 7 million developers building with it, five times more than last year, and Gemini usage on Vertex AI up 40 times.

Advancements in Gemini Models

According to Google's announcements at I/O, the company continues to push the state-of-the-art with updates to its Gemini 2.5 series and the introduction of new models.

  • Gemini 2.5 Flash Preview has been updated with stronger performance in coding and complex reasoning, optimized for speed and efficiency. This updated version is becoming the new default model in the Gemini app due to its blend of quality and speed.
  • Gemini 2.5 Pro is noted as a world-leading model for coding, topping the WebDev Arena and LMArena leaderboards, and also excelling in learning applications.
  • New Capabilities are coming to both 2.5 Pro and 2.5 Flash, including native audio output for more natural conversations, advanced security safeguards, and Project Mariner's computer use capabilities.
  • Deep Think is an experimental, enhanced reasoning mode for 2.5 Pro, utilizing cutting-edge research in thinking and reasoning, including parallel techniques, to improve performance on highly complex math and coding tasks. It will be available to Ultra subscribers and select testers first.
  • Improved Transparency and Control features like thought summaries are now available across 2.5 models, and thinking budgets are coming to 2.5 Pro Preview soon, giving developers more control over the model's thinking process (a minimal sketch follows this list).
  • New Models introduced include Gemma 3n, an efficient open multimodal model designed to run on phones, laptops, and tablets, handling audio, text, image, and video. Gemini Diffusion, a new text model, generates at five times the speed of Google's fastest model while matching its coding performance. Lyria RealTime is an experimental interactive music generation model available via the Gemini API.
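Thinking budgets shipped first with 2.5 Flash, so a minimal sketch of setting one through the google-genai SDK is possible today. The model string, budget value, and placeholder API key below are illustrative assumptions, not details confirmed in the I/O announcements.

```python
# pip install google-genai
# Minimal sketch: cap how many tokens the model may spend "thinking".
from google import genai
from google.genai import types

client = genai.Client(api_key="YOUR_API_KEY")  # hypothetical placeholder

response = client.models.generate_content(
    model="gemini-2.5-flash",  # assumed model identifier
    contents="Explain why the sky is blue, in two sentences.",
    config=types.GenerateContentConfig(
        # Limit internal reasoning to ~1,024 tokens; a budget of 0 turns
        # thinking off entirely on models that support it.
        thinking_config=types.ThinkingConfig(thinking_budget=1024),
    ),
)
print(response.text)
```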

Based on the I/O announcements, both Gemini 2.5 Flash and 2.5 Pro are available in Preview in Google AI Studio and Vertex AI, with general availability for Flash in early June and Pro following soon after. A custom version of Gemini 2.5 is also coming to Search this week, powering both AI Mode and AI Overviews in the U.S.

AI in Search

Google is making significant strides in integrating AI into Search, aiming to go beyond providing information to offering intelligence.

  • AI Overviews, launched last year, have scaled to over 1.5 billion users across 200 countries and territories. People who use AI Overviews are reportedly happier with their results and search more often. In major markets like the U.S. and India, AI Overviews are driving a more than 10% increase in usage for the types of queries that show them, making them one of the most successful launches in Search in the past decade.
  • AI Mode is being introduced as a "total reimagining of Search" for users who want an end-to-end AI experience. It offers more advanced reasoning and multimodality, allowing for longer, more complex questions and follow-ups. AI Mode is rolling out to everyone in the U.S. starting today, appearing as a new tab in Search and in the search bar of the Google app. AI Mode uses a query fan-out technique to break questions down into subtopics and perform multiple queries simultaneously, enabling deeper exploration of the web (a conceptual sketch of this pattern follows this list). It will be the first place to receive Gemini's frontier capabilities.
  • Deep Search is coming to AI Mode for thorough research, using an enhanced query fan-out technique to issue hundreds of searches, reason across disparate information, and create a fully-cited report in minutes. This feature will launch first in Labs in the coming months.
  • Search Live is bringing Project Astra's live capabilities into Search, allowing users to talk back-and-forth with Search about what they see using their camera in real-time. This will come to Labs this summer.
  • Personal Context will soon be available in AI Mode, offering personalized suggestions based on past searches. Users can opt in to connect other Google apps, starting with Gmail, to bring in personal context, enabling tailored results like restaurant recommendations based on past bookings or event suggestions near a user's stay based on flight/hotel confirmations.
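Google has not published how query fan-out works internally, but the announced behavior, decomposing a question into subtopics, issuing many searches at once, and synthesizing the results, maps onto a familiar concurrency pattern. The sketch below is purely conceptual; the decompose and search helpers are hypothetical stand-ins, not any Google API.

```python
# Conceptual sketch of a query fan-out; not Google's implementation.
import asyncio

async def decompose(question: str) -> list[str]:
    # In AI Mode, a model would generate these subtopics; hard-coded here.
    return [f"{question} overview", f"{question} reviews", f"{question} prices"]

async def search(subquery: str) -> str:
    # Hypothetical stand-in for a real search backend call.
    await asyncio.sleep(0.1)
    return f"results for {subquery!r}"

async def fan_out(question: str) -> list[str]:
    subqueries = await decompose(question)
    # Issue every subquery simultaneously instead of one at a time,
    # then hand the merged results to a model for synthesis.
    return list(await asyncio.gather(*(search(q) for q in subqueries)))

print(asyncio.run(fan_out("best espresso machines")))
```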

Additionally, Google introduced a new shopping experience in AI Mode, bringing together Gemini with the Shopping Graph to help users browse, evaluate products, and even try clothes on virtually. An agentic checkout experience is also available for quick purchases.

AI in Hardware and Beyond

Google is expanding the reach of AI and its platforms to new device types.

  • Android XR is taking a significant step forward, bringing Gemini to glasses and headsets. Google is partnering with Samsung to extend Android XR to glasses, creating a software and reference hardware platform for the ecosystem. Glasses running Android XR are equipped with a camera, microphones, speakers, and an optional in-lens display, working with a phone to provide hands-free app access and information. Paired with Gemini, these glasses understand context by seeing and hearing the user's environment. Developers will be able to start building for this platform later this year. Samsung's Project Moohan, a headset, will be available for purchase later this year and allows users to talk with their AI assistant about what they see.
  • Google Beam, formerly known as Project Starline, is a new AI-first 3D video communications platform. It uses a state-of-the-art AI volumetric video model to transform 2D video streams into realistic 3D experiences, achieved with an array of six cameras and AI to merge streams and render on a 3D lightfield display. Google Beam offers near-perfect head tracking at 60 frames per second in real-time, resulting in a more natural and immersive conversational experience. Google is collaborating with HP to bring the first Google Beam devices to early customers later this year. They are also partnering with industry leaders like Zoom and channel partners to bring Google Beam to businesses worldwide.
  • Speech Translation technology from Project Starline is coming to Google Meet, enabling near real-time translation that matches the speaker's voice, tone, and expressions. Translation in English and Spanish is rolling out in beta to Google AI Pro and Ultra subscribers, with more languages coming soon. It will also be available for early testing for Workspace business customers this year.

Fueling Creativity

Google announced new generative media models and tools designed to empower artists and creators.

  • Veo 3 is Google's latest state-of-the-art video model, pushing the frontier of media generation. It is the first in the world with native support for sound effects, background noises, and dialogue, allowing users to generate immersive video scenes from text prompts. Veo 3 is available today in the Gemini app for Google AI Ultra subscribers in the U.S.
  • Imagen 4 is Google's latest and most capable image generation model. It combines speed with precision, offering remarkable clarity in fine details and excelling in both photorealistic and abstract styles. Imagen 4 can create images in various aspect ratios at up to 2K resolution and is significantly better at spelling and typography. Everyone can try Imagen 4 today in the Gemini app (a minimal API sketch follows this list).
  • Lyria 2 access is being expanded, providing musicians with more tools to create music.
  • Flow is a new AI filmmaking tool designed for Veo, Imagen, and Gemini models. Built with and for creatives, it allows users to seamlessly create cinematic clips, scenes, and stories using natural language prompts. Users can manage story "ingredients" like cast, locations, objects, and styles in one place and weave their narrative into scenes. Flow is available today for Google AI Pro and Ultra plan subscribers in the U.S., with more countries coming soon. Ultra subscribers get the highest usage limits and early access to Veo 3.
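For developers, Imagen models are also reachable programmatically. As a hedged illustration, image generation through the Gemini API with the google-genai SDK might look like the sketch below; the Imagen 4 model identifier is an assumption, since the I/O announcements did not name API model strings.

```python
# pip install google-genai pillow
# Sketch of image generation via the Gemini API; model name is assumed.
from io import BytesIO
from google import genai
from google.genai import types
from PIL import Image

client = genai.Client(api_key="YOUR_API_KEY")  # hypothetical placeholder

result = client.models.generate_images(
    model="imagen-4.0-generate-preview",  # assumed identifier
    prompt="A watercolor lighthouse at dawn, soft mist over the sea",
    config=types.GenerateImagesConfig(
        number_of_images=1,
        aspect_ratio="16:9",  # Imagen 4 supports multiple aspect ratios
    ),
)

# Decode the returned bytes and save the image to disk.
Image.open(BytesIO(result.generated_images[0].image.image_bytes)).save("out.png")
```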

Empowering Developers

Google announced various updates across its developer products to make building transformative AI applications easier.

  • Firebase AI Logic is the evolution of Vertex AI in Firebase. It allows developers to easily integrate generative AI into their apps either directly client-side or through Genkit for server-side implementations. New features include access to the Gemini Developer API, hybrid and on-device inference, Unity support, image generation and editing with Gemini, and enhanced observability in AI monitoring dashboards.
    • Developers now have a choice between the Gemini Developer API (the easiest way to get started, with a no-cost tier) and the Vertex AI Gemini API (for enterprise-grade features, already GA) when accessing generative AI models directly from their apps; a minimal Developer API sketch appears after this list.
  • Genkit, designed for server-side scenarios, now supports dynamic model lookup for Node.js to use the latest Gemini models without package updates. Plugin updates enhance the Genkit Developer UI experience, automatically showing the latest and locally installed open-source models and surfacing model-specific parameters.
  • Firebase Studio, a cloud-based AI workspace for creating functional apps from prompts, has been upgraded to leverage the latest Gemini 2.5 Flash and Pro models for building richer, more complex applications. It also supports new Gemini API features like native audio output and Live API for building interactive, agentic applications. Firebase Studio now supports sharing prompts directly via a URL.
  • Gemini API updates include:
    • A preview of audio-visual input and native audio-out dialogue in the Live API, for building conversational experiences.
    • Preview text-to-speech (TTS) capabilities for the Gemini 2.5 Flash and Pro models, enabling sophisticated single- and multi-speaker speech output with controllable voice style, accent, and pace (a TTS sketch also follows this list).
    • Asynchronous Function Calling, which lets longer-running functions run in the background without blocking the conversation flow.
    • A new Computer Use API that allows applications to browse the web or use other software tools under direction, available today to Trusted Testers and rolling out to more developers later this year.
    • URL Context, a new experimental tool that retrieves full page context from URLs.
    • Support for the Model Context Protocol (MCP) in the Gemini API and SDK, for easy use of a wide range of open-source tools.
  • An experimental Firebase MCP Server is launching as part of the Firebase CLI, providing AI assistants access to tools for configuring and working with Firebase. Installing it locally allows integration with popular IDEs and LLM clients.
  • Data Connect, a backend-as-a-service for Cloud SQL Postgres, reached General Availability last month and is gaining new features. Transaction support is now available, allowing an app to execute a series of operations, query the resulting data, and then execute further operations using the returned information.
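To make the Developer API option above concrete, here is a minimal sketch using the google-genai Python SDK; the model string and key handling are illustrative assumptions, and switching to the Vertex AI Gemini API is mainly a matter of how the client is constructed.

```python
# pip install google-genai
# Minimal sketch of the Gemini Developer API path; model name is assumed.
from google import genai

client = genai.Client(api_key="YOUR_API_KEY")  # Developer API: key-based auth
# For the Vertex AI Gemini API, the client takes project/location instead:
# client = genai.Client(vertexai=True, project="my-project", location="us-central1")

response = client.models.generate_content(
    model="gemini-2.5-flash",  # assumed identifier
    contents="Summarize the Gemini API updates from Google I/O 2025.",
)
print(response.text)
```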
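And a hedged sketch of the previewed TTS capability, again via google-genai: the preview model name, voice name, and 24 kHz/16-bit output format are assumptions drawn from the SDK's conventions rather than confirmed I/O details.

```python
# pip install google-genai
# Sketch of single-speaker TTS; model and voice names are assumed.
import wave
from google import genai
from google.genai import types

client = genai.Client(api_key="YOUR_API_KEY")  # hypothetical placeholder

response = client.models.generate_content(
    model="gemini-2.5-flash-preview-tts",  # assumed preview identifier
    contents="Say warmly: Welcome back to the show!",
    config=types.GenerateContentConfig(
        response_modalities=["AUDIO"],
        speech_config=types.SpeechConfig(
            voice_config=types.VoiceConfig(
                prebuilt_voice_config=types.PrebuiltVoiceConfig(voice_name="Kore")
            )
        ),
    ),
)

# The response carries raw PCM audio; wrap it in a WAV container.
pcm = response.candidates[0].content.parts[0].inline_data.data
with wave.open("greeting.wav", "wb") as f:
    f.setnchannels(1)
    f.setsampwidth(2)      # 16-bit samples (assumed)
    f.setframerate(24000)  # 24 kHz (assumed)
    f.writeframes(pcm)
```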

Enhancing Learning

Google is integrating AI to make learning more active, engaging, and effective.

  • LearnLM, Google's family of models fine-tuned for learning, is being infused directly into Gemini 2.5, making it the world's leading model for learning and outperforming competitors on learning science principles.
  • In the Gemini app, students globally (ages 18+) will have the ability to create custom quizzes from any topic or uploaded documents starting today. The interactive quizzes provide hints, explanations, and a summary of strengths and areas needing more study.
  • Canvas, the co-creation space in the Gemini app, allows users to transform documents into dynamic webpages, infographics, quizzes, or even podcast-style Audio Overviews in 45 languages. Vibe coding is enabling users to build functional apps simply by describing them.
  • Deep Research in the Gemini app now allows users to upload their own PDFs and images to be included in customized research reports alongside public data, and will soon connect to Google Drive and Gmail.
  • Video Overviews are coming soon to NotebookLM, allowing users to turn the content of their notebook into an educational video.
  • Search Live in AI Mode is noted as helpful for learning by allowing real-time questioning about the world around you through the camera.
  • Google is expanding its offer of a free Google AI Pro plan for a school year to college students in Brazil, Indonesia, Japan, and the United Kingdom, in addition to the U.S. This includes access to the Pro plan, 2 TB of storage, NotebookLM, and more.

Google is also experimenting with new learning ideas through Labs:

  • Sparkify is an experiment that helps turn questions or ideas into short animated videos using Gemini and Veo models, with capabilities coming to Google products later this year.
  • A conversational tutor is being prototyped in Project Astra that can follow a user's work, provide step-by-step guidance, identify mistakes, and generate diagrams. This research project will also come to Google products later this year, and Android Trusted Testers can sign up for a preview.
  • Learn About, an experimental Labs project, is getting improvements based on LearnLM capabilities for more nuanced explanations and connections. It is being made available to more learners (including teens), adding session history, and offering the ability to upload source documents for grounding explanations in personal materials.

Trust and Safety

As generative AI advances, Google is focused on providing transparency and tools for identifying AI-generated content.

  • The SynthID Detector is a new verification portal to quickly and efficiently identify AI-generated content made with Google AI. It provides detection capabilities across images, text, audio, and video in one place. It can also highlight parts of content likely watermarked with SynthID. SynthID is a tool that embeds imperceptible watermarks that remain detectable even after content transformations and has been expanded to cover content generated by Gemini, Imagen, Lyria, and Veo models. Over 10 billion pieces of content have already been watermarked with SynthID. The SynthID Detector is starting to roll out to early testers today, with a waitlist available for journalists, media professionals, and researchers.
  • Google has open-sourced SynthID text watermarking to grow a trusted ecosystem and partnered with NVIDIA to watermark videos generated by their services.
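Because the text watermarking is open source, developers can try applying it directly. A minimal sketch via the Hugging Face transformers integration follows; the model choice and watermark keys are illustrative assumptions (real deployments keep their keys secret).

```python
# pip install transformers torch
# Sketch of SynthID text watermarking via transformers; keys are placeholders.
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    SynthIDTextWatermarkingConfig,
)

model_name = "google/gemma-2-2b-it"  # assumed example model
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

watermarking_config = SynthIDTextWatermarkingConfig(
    keys=[654, 400, 836, 123, 340, 443, 597, 160, 57, 29],  # placeholder keys
    ngram_len=5,  # n-gram length the watermark is applied over
)

inputs = tokenizer("Write a short poem about lighthouses.", return_tensors="pt")
out = model.generate(
    **inputs,
    watermarking_config=watermarking_config,
    do_sample=True,      # the watermark biases sampling, so sampling is required
    max_new_tokens=64,
)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```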

Google AI Subscription Plans

Based on the I/O announcements, Google is introducing new subscription plans to provide access to its AI capabilities.

  • Google AI Pro is the evolution of the existing AI Premium plan, now offering a suite of AI tools for $19.99/month. It includes the Gemini app (formerly Gemini Advanced), access to Flow with the Veo 2 model, Whisk for image-to-video creation, NotebookLM, Gemini in Google apps, Gemini in Chrome (early access), and 2 TB of storage. These new benefits are coming to U.S. subscribers first, with more countries to follow.
  • Google AI Ultra is a new premium plan designed for users who want the highest usage limits and early access to the most capable models and experimental features. Available in the U.S. for $249.99/month (with a limited-time offer), it includes the best version of the Gemini app with the highest usage limits across Deep Research and video generation with Veo models. Ultra subscribers get early access to Deep Think in 2.5 Pro, the highest limits in Flow with 1080p video generation and advanced camera controls, Whisk Animate, enhanced NotebookLM capabilities, Gemini in Google apps and Chrome, early access to the Project Mariner agentic research prototype, an individual YouTube Premium plan, and 30 TB of storage. It is available in the U.S. today and coming soon to more countries.

These announcements from Google I/O 2025 demonstrate a strong focus on integrating advanced AI capabilities across their product ecosystem, empowering users and developers alike with new tools for productivity, creativity, communication, and learning.