How I Built a Multimodal CX Agent with Just an SOP and Gemini Live API
I wanted to test a simple idea: what if you architected an AI support agent the same way you onboard a human support rep? Give it a training manual instead of a workflow tree. Give it Google Search instead of a RAG pipeline. And use a single multimodal model so you don’t need separate systems for voice, text, and vision.
I built Cortado for the Gemini Live Agent Challenge to explore what that looks like in practice.
...