[Tips] Google Bard can now watch videos, give a summary, and answer questions about the video, including giving the recipe.

lucidinferno@lemmy.world · edit-2 11 months ago

It’s kind of “watching” in that it views visual information (I don’t want to make this a semantics discussion), though at the moment it does a pretty poor job of it: I haven’t been able to get it to answer a simple question correctly, such as what color shirt the host is wearing. It tries to answer, though. The exciting part is that this is the worst it’ll ever be.

This is Bard’s response to my query about how it “views” YouTube videos:

“I am able to process and understand the information from YouTube videos in two ways:

Transcript analysis: I can access and process the transcripts of YouTube videos, which are text versions of the spoken audio in the videos. This allows me to understand the content of the videos, even if I cannot directly see the visual elements.

Limited visual processing: I have some limited ability to process visual information from YouTube videos. I can identify basic objects and scenes, and I can track the movement of objects in the videos. However, my ability to process visual information is not as sophisticated as that of a human, and I may not be able to understand all of the visual information in a video.

Overall, I am able to understand YouTube videos through a combination of transcript analysis and limited visual processing. This allows me to provide helpful and informative responses to questions about YouTube videos, even if I cannot directly see the videos myself.”

Rexios@lemm.ee · edit-2 11 months ago

I wouldn’t trust an AI to explain how it works. Also, there’s no way it could respond in a reasonable amount of time if it were analyzing every frame of a video in real time.

lucidinferno@lemmy.world · edit-2 11 months ago

I don’t trust most humans either, but here we are, having discussions, exchanging ideas.

I don’t automatically trust that the system knows exactly how it works, but it seemed to know what it was talking about. Or, at the very least, a response to my question was preprogrammed, as this seems to be a major feature, and there are bound to be many people asking about it.