Exclusive: Here’s an inside look at the Pixel 9’s breakthrough Tensor G4 chip

 

The Tensor G4 and Gemini Nano are trying to outgun Apple

(Image credit: Google / Future)

Google never officially explained why it moved the Made by Google event, but there’s a good reason it happened in August: the early Pixel 9 launch is clearly meant to deflect attention from the impending iPhone 16.

To accomplish that, Google showcased a variety of Pixel 9 AI features, such as Call Notes, which records and summarizes phone conversations, and Add Me, which lets you add someone to a photo after the fact. There’s also Pixel Screenshots, which uses natural language search to help you find specific details later.

Powering all of this is the new Tensor G4 chip, which was created specifically to run Google’s most sophisticated AI models. It’s the first chip to support Gemini Nano with multimodality, meaning the Pixel 9, Pixel 9 Pro, Pixel 9 Pro XL and Pixel 9 Pro Fold can understand text, images and audio.

 

I chatted with Zach Gleicher, product manager for Google DeepMind, and Jesse Seed, group product manager for Google Silicon, to learn more about what the Tensor G4 chip can do and how it stacks up against rival AI silicon.

What makes the Tensor G4 chip stand out in a sea of smartphones?

(Image credit: Google)

Jesse Seed: I believe the biggest breakthrough we made this year is being the first silicon, and the first phone, to run Gemini Nano with multimodality. Pixel Screenshots is one of the really fascinating use cases it unlocks, and it comes in very handy when you’re trying to recall information.

 

Another feature I truly adore, although it’s unrelated to the Gemini Nano model, is Add Me. Those of us who are the designated family or team photographers will greatly appreciate being able to go back and dynamically add the photographer to the shot. It builds on Google’s augmented reality SDK, and we spent a lot of time fine-tuning more than fifteen distinct machine learning models.

Apple is about to make a big splash with Apple Intelligence and the iPhone 16. How confident are you that the Tensor G4 is one step ahead of the competition?

Seed: Helpful on-device intelligence is nothing new to us. Since its inception, Tensor’s primary goal has been to put Google’s machine learning advancements in your pocket, and this year is no different.

Tensor G4, our fourth-generation chip, is our newest and most powerful SoC and is designed specifically for Pixel. Thanks to a collaborative design effort with Google DeepMind, it brings our newest and most capable model, Gemini Nano with Multimodality, to the Pixel 9. Gemini Nano enhances your phone’s ability to comprehend text, images, audio and speech, powering multimodal features like Pixel Screenshots and Call Notes as well as ongoing functions like Recorder Summarization.

The Tensor and Google DeepMind teams co-design the hardware and the Gemini Nano models so that they achieve optimal performance together. This co-design let us scale a 3x more capable model to run effectively on a phone in a very short amount of time, and it lets us run the same model in both “full” and “efficient” modes. The full version achieves an industry-leading peak output rate of 45 tokens per second, while the efficient mode improves energy efficiency by 26% in other scenarios.
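
For a sense of what 45 tokens per second means in practice, here’s a quick back-of-envelope sketch of my own. It assumes the common rule of thumb of roughly 0.75 words per token, which is a generic approximation rather than a Gemini Nano specification:

```kotlin
// Rough back-of-envelope estimate: how long a generated summary takes to
// stream at a given decode rate. The 0.75 words-per-token ratio is a
// common rule of thumb, not a published Gemini Nano figure.
fun secondsToGenerate(words: Int, tokensPerSecond: Double, wordsPerToken: Double = 0.75): Double =
    (words / wordsPerToken) / tokensPerSecond

fun main() {
    // A ~150-word call summary at the quoted 45 tokens/s peak rate:
    println("%.1f s".format(secondsToGenerate(150, 45.0)))  // prints "4.4 s"
}
```

At that rate, a typical call summary streams in well under five seconds, which is why the output can feel close to instantaneous on device.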

Optimizations like these are only achievable in such short design cycles because of that close cooperation. Combined with the additional RAM in the Pixel 9, it means the model is always resident and ready to help the moment you need it. And naturally, data processed by Google’s Gemini models and apps is never transferred to a third party, whether the processing happens on device, in the cloud, or a combination of the two.

Zach Gleicher: At DeepMind we work closely with numerous Google teams, and we want to make sure the Gemini models we develop satisfy the requirements of every Google product. It was while working with Android and Pixel on Gemini that we became aware of the need for on-device models. That was a challenge: on the server side there was a drive toward larger, presumably more powerful models, whereas on device we faced a plethora of intriguing constraints that hadn’t existed before, such as limits on memory usage and power consumption.

Working together with Pixel and the Tensor team, we were able to identify the main applications for these on-device models as well as their constraints, and from there we co-developed a model. That allowed us to build something extremely powerful, capable of driving a variety of use cases. It was an incredibly thrilling experience.

Seed: We also attach great importance to optimizing what we refer to as the essentials, such as power and performance. Tensor G4, our fourth-generation chip, is both our most performant and our most efficient. Users will see that in everyday experiences like web browsing, app launches and general UI snappiness; to me it feels like a very seamless experience. On average, browser performance is 20% faster and app launches are 17% faster.

In our testing, we also observed improved peak and sustained performance in the games people commonly play on the platform.

All of those things add up to the roughly 20% longer battery life mentioned in the keynote, and Tensor G4 is a major contributor to that.

Gleicher: Improved dependability is one of the primary reasons the Tensor and Pixel teams approached us about on-device use cases. You can depend on the experience working wherever you are, without needing an internet connection.

 

Privacy is another issue we consider. With an on-device LLM, developers can choose to process data entirely on the device, so it never has to leave it.
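
As a rough illustration of that on-device pattern, here’s a minimal Kotlin sketch assuming Google’s experimental AI Edge SDK for Gemini Nano (the com.google.ai.edge.aicore package). The SDK is experimental and its API surface may differ from what’s shown, so treat the names and parameters here as assumptions; the key point is that inference runs locally through AICore and the input never leaves the phone:

```kotlin
import android.content.Context
import com.google.ai.edge.aicore.GenerativeModel
import com.google.ai.edge.aicore.generationConfig

// Minimal sketch assuming the experimental Google AI Edge SDK
// (com.google.ai.edge.aicore); exact names and parameters may differ.
suspend fun summarizeOnDevice(appContext: Context, transcript: String): String? {
    val config = generationConfig {
        context = appContext    // binds the config to the on-device AICore service
        temperature = 0.2f      // keep summaries conservative and factual
        topK = 16
        maxOutputTokens = 256
    }
    val model = GenerativeModel(config)
    // Runs Gemini Nano locally via AICore; the transcript is never
    // sent to a server.
    val response = model.generateContent("Summarize this call transcript:\n$transcript")
    return response.text
}
```

This is the design point Gleicher is describing: because the model and the accelerator live on the phone, a developer can offer summarization or search features without shipping user data off the device.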

 

As for AI features, I’m particularly thrilled about Pixel Screenshots. In my opinion, it aptly illustrates how we got these multimodal features working on device, doing exactly what the demos show. The latency is very low and it’s really quick, yet it’s also a very powerful model. On top of that, all of this data and information is processed and saved locally on your device. We’re therefore quite happy that Gemini Nano can make such experiences possible.

 
