Gemini Live can now highlight objects on the screen

Google has announced notable updates for Gemini Live. With them, the artificial intelligence assistant is no longer just a tool that responds to voice. Instead, it becomes a versatile assistant that combines visual and audio experiences, giving users more practical help in their daily lives. The announcement was made at the Made by Google event.

Visual guidance was one of the most striking of these updates. When users point their phone camera at an object, Gemini Live can highlight that object directly on the screen. This is more than a minor detail; it is a step that deepens the user experience. For example, someone trying to pick the right tool from a toolbox can find the appropriate one without hesitation. The same approach can also be useful for learning.

The feature will debut on Pixel 10 devices, which go on sale on August 28. Google said it will begin rolling out to other Android devices on the same day. iOS users will have to wait a little longer, as the company says support will arrive within a few weeks. A broad audience should therefore gain access within a short time, regardless of device, which suggests that Google wants to spread the update quickly.

Gemini Live is not limited to visual highlighting. Google is preparing to integrate the assistant with core apps such as Messages, Phone and Clock. This integration should help users get through their daily tasks with fewer interruptions. For example, someone who is getting directions may also need to send a message to a friend. At that point, the assistant can step in and handle both tasks with a single command.

The goal is an uninterrupted flow. Say a user is getting directions and realizes they will be late. When they tell Gemini Live, “This route looks good, tell Alex I’ll be 10 minutes late,” the assistant drafts the message. It can then send the message directly once the user approves it. The feature saves time and makes the assistant easier to use. However, it also brings the importance of secure messaging infrastructure back into focus.

Alongside these changes, Google is updating the voice model. The company says the new model will mimic the rhythm, emphasis and intonation of human speech more naturally. As a result, Gemini Live can adopt a tone that suits the mood of the topic. When a stressful subject comes up, it speaks in a calmer voice. When a fun story is requested, it uses a more energetic intonation.

Users will also be able to adjust the assistant’s speaking speed. The option to have it talk faster or more slowly will be useful for people with different preferences. In addition, Gemini Live will be able to choose a fitting accent when asked to tell a story from the perspective of a particular character or historical figure. This should make the stories more engaging and could offer a different experience, especially in education and creative content.

Gemini Live’s audio improvements invite comparisons with ChatGPT’s voice mode. Both platforms are taking similar steps toward a more natural artificial intelligence experience. Even so, it is notable that Google is focusing on personalization details such as accent and tempo. These options are expected to help users express themselves better in different scenarios. The broader feature set may also increase reliance on artificial intelligence in daily use.

Although these updates are exciting, they raise questions about privacy and data security. Visual guidance and app integrations process more user data. Google, for its part, emphasizes that it takes the necessary precautions to keep that data safe. How users respond will become clearer over time. Meanwhile, the balance between security and functionality remains critical for technologies like these.