At the end of I/O, its annual developer conference at the Shoreline Amphitheatre in Mountain View, Google CEO Sundar Pichai revealed that the company had said "AI" 121 times. That, essentially, was the crux of the two-hour keynote: stuffing AI into every Google app and service used by more than two billion people around the world. Here are all the major updates that Google announced at the event.
Gemini 1.5 Flash and updates to Gemini 1.5 Pro
Google announced a brand-new AI model called Gemini 1.5 Flash, which it says is optimized for speed and efficiency. Flash sits between Gemini 1.5 Pro and Gemini 1.5 Nano, the company's smallest model, which runs locally on device. Google said it created Flash because developers wanted a lighter, less expensive model than Gemini Pro for building AI-powered apps and services, while keeping features like the one million token context window that differentiate Gemini Pro from competing models. Later this year, Google will double Gemini's context window to two million tokens, which means it will be able to process two hours of video, 22 hours of audio, more than 60,000 lines of code or more than 1.4 million words at the same time.
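For developers, switching to the lighter model is essentially a one-line change. Below is a minimal sketch of what calling Flash could look like with Google's google-generativeai Python SDK; the model ID string, environment variable, file name and prompt are illustrative assumptions, not details confirmed at the keynote.

```python
# A minimal sketch of calling Gemini 1.5 Flash via the
# google-generativeai Python SDK. The model ID, file name and
# prompt are illustrative assumptions.
import os

import google.generativeai as genai

# Authenticate with an API key (e.g. from Google AI Studio).
genai.configure(api_key=os.environ["GOOGLE_API_KEY"])

# Swapping between Pro and the lighter, cheaper Flash is just a
# change of model name; both expose the long context window.
model = genai.GenerativeModel("gemini-1.5-flash")

# The long context window means very large inputs, such as a
# dump of an entire codebase, can fit in a single prompt.
with open("my_project_dump.txt") as f:
    big_input = f.read()

response = model.generate_content(
    f"Summarize the architecture of this codebase:\n\n{big_input}"
)
print(response.text)
```

The long context window is what makes passing an entire project in one prompt plausible at all; with most competing models you would have to chunk the input first.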
Project Astra
Google showed off Project Astra, an early version of a universal AI-powered assistant that Google DeepMind CEO Demis Hassabis said was Google's version of an AI agent "that can be helpful in everyday life."
In a video that Google says was shot in a single take, an Astra user moves around Google's London office, holding up their phone and pointing the camera at various things: a speaker, some code on a whiteboard, the view out a window. They have a natural conversation with the app about what it sees. In one of the video's most impressive moments, Astra correctly tells the user where she left her glasses, even though she never brought the glasses up.
The video ends with a twist: when the user finds and puts on the missing glasses, we learn that they have an onboard camera and can use Project Astra to seamlessly carry on the conversation, suggesting that Google may be working on a competitor to Meta's Ray-Ban smart glasses.
Ask Google Photos
Google Photos was already intelligent when it came to searching for specific images or videos, but with AI, Google is taking things to the next level. If you’re a Google One subscriber in the US, you will be able to ask Google Photos a complex question like “show me the best photo from each national park I’ve visited” when the feature rolls out over the next few months. Google Photos will use GPS information as well as its own judgement of what is “best” to present you with options. You can also ask Google Photos to generate captions to post the photos to social media.
Veo and Imagen 3
Google’s new AI-powered media creation engines are called Veo and Imagen 3. Veo is Google’s answer to OpenAI’s Sora. It can produce “high-quality” 1080p videos that can last “beyond a minute,” Google said, and it can understand cinematic concepts like time-lapses.
Imagen 3, meanwhile, is a text-to-image generator that Google claims handles text better than its previous version, Imagen 2. The result, Google says, is the company’s “highest quality” text-to-image model, with an “incredible level of detail” for “photorealistic, lifelike images” and fewer artifacts, essentially pitting it against OpenAI’s DALL-E 3.
Big updates to Google Search
Google is making big changes to how Search fundamentally works. Most of the updates announced today, like the ability to ask really complex questions (“Find the best yoga or pilates studios in Boston and show details on their intro offers and walking time from Beacon Hill.”) and using Search to plan meals and vacations, won’t be available unless you opt in to Search Labs, the company’s platform that lets people try out experimental features.
But one big new feature, which Google calls AI Overviews and which the company has been testing for a year now, is finally rolling out to millions of people in the US. Google Search will now present AI-generated answers on top of the results by default, and the company says it will bring the feature to more than a billion users around the world by the end of the year.
Gemini on Android
Google is integrating Gemini directly into Android. When Android 15 releases later this year, Gemini will be aware of the app, image or video you’re running, and you’ll be able to pull it up as an overlay and ask context-specific questions. Where does that leave Google Assistant, which already does this? Who knows! Google didn’t bring it up at all during today’s keynote.
There were a bunch of other updates too. Google said it would add digital watermarks to AI-generated video and text, make Gemini accessible in the side panel in Gmail and Docs, power a virtual AI teammate in Workspace, listen in on phone calls and detect if you’re being scammed in real time, and a lot more.
Source: www.engadget.com