Are 'visual' AI models actually blind?

The latest round of language models, like GPT-4o and Gemini 1.5 Pro, are touted as "multi-modal," able to understand images and audio as well as text.

Jul 11, 2024 - 20:30
