The Tensor G4 and Gemini Nano are trying to outgun Apple
Google never provided an official explanation for moving up the Made by Google event, but there's a good reason it happened in August: Google clearly wants to deflect attention from the impending iPhone 16 with an early Pixel 9 release.
To accomplish that, Google showcased a variety of Pixel 9 AI features, such as Call Notes, which records and summarizes phone conversations, and Add Me, which lets you add someone to a photo after the fact. And with natural language searches, the Pixel Screenshots feature can help you track down specific details.

The new Tensor G4 chip, which was designed specifically to run Google's most sophisticated AI models, powers all of this. It's actually the first chip that supports Gemini Nano with multimodality, meaning the Pixel 9, Pixel 9 Pro, Pixel 9 Pro XL, and Pixel 9 Pro Fold can all understand text, images, and audio.
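To make "multimodality" a little more concrete, here's a minimal Kotlin sketch of what a natural-language query against a screenshot could look like with an on-device multimodal model. This is purely illustrative: OnDeviceModel, MultimodalPrompt, and findDetails are hypothetical names invented for this sketch, not Google's actual SDK.

```kotlin
import android.graphics.Bitmap

// Hypothetical interface for an on-device multimodal model (illustrative only).
interface OnDeviceModel {
    suspend fun generate(prompt: MultimodalPrompt): String
}

// A prompt that can mix text with other modalities, per the article's description.
data class MultimodalPrompt(
    val text: String,
    val image: Bitmap? = null,   // e.g., a saved screenshot
    val audio: ByteArray? = null // e.g., a snippet of a recorded call
)

// A Pixel Screenshots-style query: natural-language search over a saved image.
suspend fun findDetails(model: OnDeviceModel, screenshot: Bitmap): String =
    model.generate(
        MultimodalPrompt(
            text = "What is the Wi-Fi password shown in this screenshot?",
            image = screenshot
        )
    )
```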
I chatted with Zach Gleicher, product manager for Google DeepMind, and Jesse Seed, group product manager for Google Silicon, to learn more about what the Tensor G4 chip can do and how it stacks up against rival AI silicon.
What makes the Tensor G4 chip stand out in a sea of smartphones?
Jesse Seed: I believe being the first silicon and phone to run Gemini Nano with multimodality was our biggest breakthrough this year. Pixel Screenshots is one of the really fascinating use cases that becomes possible as a result. It comes in very handy when you're trying to recall information.
Another feature I truly adore, though it's unrelated to the Gemini Nano model, is Add Me. Those of us who are the family or team photographers will greatly appreciate being able to go back and dynamically add the photographer to a shot. And with the help of the Google Augmented Reality SDK, we spent a lot of time fine-tuning more than fifteen distinct machine learning models.
Apple is about to make a big push with Apple Intelligence and the iPhone 16. How confident are you that the Tensor G4 is one step ahead of the competition?
Seed: The Tensor G4 is our fourth-generation chip and our newest and most powerful SoC, designed specifically for Pixel. Thanks to a collaborative design with Google DeepMind, it brings our newest and most powerful model, Gemini Nano with multimodality, to Pixel 9. Gemini Nano enhances your phone's ability to understand text, images, audio, and speech, powering multimodal experiences like Pixel Screenshots and Call Notes, as well as ongoing features like Recorder summarization.
The Tensor and Google DeepMind teams collaborate on the design of both the hardware and the Gemini Nano models so they perform optimally together. One benefit of this co-design is the ability to take a model that is three times more capable and scale it to run effectively on a phone in a relatively short amount of time.
Optimizations like this in short design cycles are only possible thanks to our close collaboration. Coupled with Pixel 9's extra memory capacity, this means the model is always there, ready to assist you quickly when needed. And of course, whether processing happens on device, in the cloud, or a blend of the two, with Google's Gemini models and apps your data is never sent to a third party for processing.
How did you squeeze something as advanced as Gemini Nano down to fit on a phone?
Zach Gleicher: At DeepMind we work closely with numerous Google teams, and we want to make sure the Gemini models we develop satisfy the requirements of every Google product. So while working with Android and Pixel on Gemini, we became aware of the need for on-device models. We saw this as a challenge since, on the server side, the push was for larger, presumably more powerful models. On device, by contrast, we faced a host of intriguing constraints that hadn't existed before, such as limits on memory usage, power consumption, and so on.
So in partnership with the Tensor and Pixel teams, we were able to come together, understand the core use cases and constraints for these on-device models, and actually co-develop a model together. That was a really exciting experience, and it made it possible to build something capable enough to power these use cases.
For someone who hasn’t upgraded their phone in 3-4 years, what’s going to stand out to them with the G4 chip?
Seed: So we attach great importance to optimizing what we refer to as the essentials, such as power and performance. Our fourth-generation chip, the Tensor G4, is both our most performant and our most efficient. We think everyday experiences like web browsing, app launches, and general UI snappiness will demonstrate that to users; it feels like a very seamless experience to me. You'll notice that, on average, browser performance is 20% faster and app launches are 17% faster.
And what about gaming performance, as that’s really important these days for people buying a new phone?
Seed: So in our testing, we've actually seen improvements in both peak and sustained gaming performance across common games that run on the platform.
How does the Tensor G4 help with battery life?
Seed: We increased power efficiency in many common use cases, so activities like taking photos, recording videos, and browsing social media use less energy than they did on the previous generation.
All of those things add up, so Tensor G4 contributes to the roughly 20% longer battery life mentioned in the keynote.
What are some of the AI features Gemini enables on Pixel 9 phones that you’re most excited about?
Gleicher: Improved reliability is one of the primary reasons the Tensor and Pixel teams approached us for on-device use cases. Because the model runs locally, you can depend on the experience to work wherever you are, without needing an internet connection.
Privacy is another thing we think about. With an on-device LLM, developers can choose to process data entirely on the device, without it ever leaving the device.
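To illustrate the privacy property Gleicher describes, here's a minimal Kotlin sketch: the transcript and summary exist only in app memory, and no network client is ever created. LocalLlm and summarize are hypothetical names invented for this illustration, not a real Google API.

```kotlin
// Hypothetical on-device model wrapper: inference runs locally in this sketch,
// so the input text is never uploaded anywhere.
class LocalLlm {
    fun summarize(transcript: String): String {
        // ... on-device inference would happen here; no network calls are made ...
        return "Summary of: ${transcript.take(40)}..."
    }
}

fun main() {
    val callTranscript = "Pharmacy says the prescription will be ready after 5pm on Friday."
    val notes = LocalLlm().summarize(callTranscript) // data never leaves the process
    println(notes)
}
```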
As for AI features, I'm particularly excited about Pixel Screenshots. I think it demonstrates how we're able to get these multimodal features working entirely on device.
How is Pixel Screenshots different from Windows Recall, which got into hot water over privacy concerns?
Seed: Having a strong on-device model is one way we safeguard user privacy. The analysis of the screenshot happens on the device, so nothing leaves it. That's one way we address the privacy issue.
The other element, in my opinion, is simply giving consumers the freedom to use Gemini however they see fit, and to choose the use cases they're comfortable with and the ones they're not. It really comes down to user choice. But in the case of Pixel Screenshots specifically, it's an entirely on-device use case.