Artificial intelligence developers have been chasing the same goal for a while now: a service that is intelligent, user-friendly, and always available. Google's latest attempt is Gemini Live, unveiled at Made by Google earlier this week, and I tested it for a full day to see how close it comes to being genuinely helpful.
I wanted to find out what an open-ended conversation with an AI assistant like Gemini actually offers, even though I rarely speak to assistants beyond asking them to set timers while I'm cooking. Inexperienced as I am, after this day of testing I'm at least convinced there are benefits to talking to an AI this way.
My experiments are by no means a formal test of its capabilities, but the variety of queries Gemini Live answered gives a good idea of what it does and doesn't do well. I'm therefore confident in predicting that Gemini Live will be a useful addition to the Gemini package, and possibly a compelling enough incentive for some free users to upgrade to Gemini Advanced at $20 a month, even though it hasn't yet achieved everything it sets out to do.
Thursday afternoon — The setup
Gemini Live is included with a Gemini Advanced subscription and is currently being rolled out, but not all users have access to it yet. Thankfully, I was able to test it on a Google Pixel 9 Pro XL. We'll be concentrating only on Gemini Live here, but you can learn more about the phone in our hands-on review of the Google Pixel 9 Pro and Pro XL.
Another issue is that, for now, you have to set the language to US English to use Gemini Live. Thankfully, even after doing this I was still able to pick "Capella," a British voice, from the ten available options. With their varying pitches and levels of enthusiasm, the voices all sound fairly natural, and it's rare to hear a truly egregious mispronunciation or a strangely constructed sentence, even once you start asking questions.
Thursday evening — Getting home
Once everything was set up, my first significant encounter with Gemini Live was asking it for directions home. After I told it my preferred mode of transportation and confirmed the stations I intended to travel between, it didn't initially share what it had found; only after a protracted wait and further prompting did it finally read out the route.
With that route, I most likely would have made it home, though it wouldn't have been the easiest trip. Gemini misidentified a train line and a station, and seemed to conjure one train out of thin air. It also failed to mention that one of my changes would actually require walking between two stations.
Having an authoritative-sounding voice (with a British accent, no less) suggest a route like this could easily have left someone less familiar with London public transportation quite lost, but that's more of an issue with the underlying AI model than with Gemini Live itself. For this kind of thing, you still seem better off sticking with Google Maps.
Friday morning — News briefing
When I was getting ready for work the next day, I asked Gemini to walk me through the day's breaking news. With only one prompt, it told me a lot about the changing hosts of This Morning and Good Morning Britain and made a passing mention of the latest stabbing in Leicester Square. Things became more bizarre, however, when I asked for tech news.
Gemini first informed me that Microsoft had launched the Surface Duo 3, a device that hasn't been confirmed and has been the subject of months of cancellation rumors. The last line of its answer appeared to be alluding to the CrowdStrike outage from last month, and while the PS5 Slim is a real product, it was released back in the autumn.
I then urged Gemini Live to focus on iPhone rumors, but at first all of its responses were about the existing iPhone 15 series. After more prodding, it gave a brief rundown of various iPhone 16 camera rumors.
Friday mid-morning — Brewing guide
After a few hours of work it was time for a coffee break, so I tried to get Gemini Live to walk me through making a V60 pour-over.
I was hoping for step-by-step directions, but the issue here is that you have to keep interrupting or prompting Gemini Live to get it to deliver its answers as discrete steps. Nevertheless, it sustained the conversation with thoughtful responses, even though the transcript shows it misinterpreted some of my requests at first.
In terms of knowledge, Gemini had mixed results. It offered some advanced advice, including purifying my water before boiling it, and though the recipe was basic overall, it produced a perfectly drinkable cup. On the other hand, Gemini Live initially recommended a coffee dose in tablespoons of beans, which isn't a common brewing measurement, rather than grams or ounces; I was able to get a gram amount with an additional prompt.
Friday lunchtime — Fighting talk
Over lunch, I chatted briefly with Gemini Live about Street Fighter 6, the game I'm playing most right now. It accurately identified the Evo 2024 SF6 champion as well as their rival, but once again offered very little information up front.
Because I tend to rely too heavily on particular moves, I asked for training advice and got some ideas on how to vary my strategy over the course of a match. The advice was sound, even if it's easier said than done while your opponent is hurling fireballs at you.
I also tried to get advice on finding in-person meetups, which didn't work out quite as well. When it searched for more information, it found that the official website only lists Capcom's sanctioned tournaments. It then pointed me to a local Facebook group, but it couldn't give me a link, even when I checked the transcript afterwards.
Friday afternoon — Writing advice
For Gemini's last assignment I chose to go meta (and no, we're not talking about Llama 3): I asked it to help me write this article's introduction.
Having previously found Gemini vague in its responses, I was taken aback by its willingness to offer precise wordings. It responded sensibly when I asked it to add more detail or change its angle. And, as Google proudly demonstrated during its Made by Google demo, Gemini Live can adapt to interruptions and adjust its responses as needed.
This was the nicest part of Gemini Live: talking through an idea out loud feels quite natural, even when you're speaking to a glowing waveform on your phone. In the end, I wrote this post's introduction from scratch, but if you scroll back up and compare it with Gemini's final proposal, you can definitely find some echoes.
Google Gemini Live: Final thoughts
This article may give the impression that I'm not a big fan of Gemini Live, but that's not quite accurate. The Gemini Advanced model running it takes the brunt of my criticism, because in several test scenarios it seemed to misinterpret what it was supposed to be looking for. We recently ran a Gemini vs. Gemini Advanced face-off, and funnily enough, it turns out I might have been better off sticking with standard Gemini.
Gemini Live itself, meanwhile, was genuinely impressive. Being able to hold a continuous conversation with a chatbot feels like a much better way to communicate than typing or tapping, provided you're prepared to be explicit and to interrupt when it wanders off track. You can ask follow-up questions of conventional digital assistants, but the experience still isn't as smooth as it was with Gemini Live. And it's exactly that seamlessness that makes it useful: it offers assistance and answers not only hands-free but eyes-free, letting you concentrate on other things while you talk.
The key question remains how this will stack up against the upcoming ChatGPT Voice, especially given that ChatGPT Voice can process speech directly, while Gemini Live must first transcribe your voice to text. The usual AI caveats apply, but it looks like Google is headed in the right direction in its pursuit of the ideal digital personal assistant.