At its annual developer conference, Google recently presented an early version of Project Astra.

About Project Astra:
It is a new multimodal AI agent developed by Google, capable of answering questions in real time from text, video, images and speech by pulling up the relevant information. It can see the world, remember where one has left an object, and even check whether computer code is correct by looking at it through the phone's camera. Its voice is relatively straightforward, without a range of emotional expression. It is not limited to smartphones; Google also demonstrated it working with a pair of smart glasses. Project Astra can learn about the world, making the experience as close as possible to interacting with a human assistant.

What is a multimodal AI model?
A multimodal model is a machine learning (ML) model capable of processing information from different modalities, including images, videos and text. For example, G...
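To make the idea of multimodality concrete, here is a minimal sketch of querying a multimodal model in Python, assuming the google-generativeai SDK and a hypothetical API key and image file; it illustrates the general technique of sending mixed image-plus-text input, not Project Astra itself:

    # Minimal sketch: one request combining two modalities (image + text),
    # assuming the google-generativeai SDK (pip install google-generativeai).
    import google.generativeai as genai
    from PIL import Image

    genai.configure(api_key="YOUR_API_KEY")  # placeholder, not a real key

    # A multimodal model accepts several modalities in a single request:
    # here, a local photo plus a text question about it.
    model = genai.GenerativeModel("gemini-1.5-flash")
    image = Image.open("whiteboard.jpg")  # hypothetical local image file

    response = model.generate_content([image, "What does this diagram show?"])
    print(response.text)  # the model's text answer grounded in the image

The key point is that the image and the question travel together in one call, and the model reasons over both; a text-only model would have no way to "see" the diagram.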