---
title: ML6-Gemini-Demo
app_file: src/app.py
sdk: gradio
sdk_version: 5.23.0
---
# Gemini Voice Agent Demo
This repo contains a demo that uses the Gemini Multimodal Live API to build a voice-based agent that conducts professional technical screening interviews.
## Technical Overview
The system uses FastRTC and Gradio to provide a real-time voice UI.
### About the modality
You can configure the output modality:
- If set to AUDIO:
  - The agent responds with audio.
  - There is no text output, so no transcription is available.
- If set to TEXT:
  - The agent responds with text.
  - The text output is converted to audio using the TTS API.
  - Transcriptions are available.
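The trade-off between the two modalities can be summarised in a small lookup. This is only a sketch of the behaviour described above; `ModalityBehaviour`, `BEHAVIOUR`, and `describe_modality` are hypothetical names and are not taken from the demo's source.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModalityBehaviour:
    """Consequences of choosing one output modality (hypothetical helper)."""

    needs_tts: bool       # must the reply be converted to speech in a separate step?
    has_transcript: bool  # is a text version of the reply available?


# Mapping derived from the description above, not from the demo's code.
BEHAVIOUR = {
    "AUDIO": ModalityBehaviour(needs_tts=False, has_transcript=False),
    "TEXT": ModalityBehaviour(needs_tts=True, has_transcript=True),
}


def describe_modality(modality: str) -> ModalityBehaviour:
    """Look up the behaviour for a modality name, ignoring case and whitespace."""
    key = modality.strip().upper()
    if key not in BEHAVIOUR:
        raise ValueError(f"unsupported modality: {modality!r}")
    return BEHAVIOUR[key]
```

In short, TEXT costs an extra text-to-speech step but is the only option that yields a transcript of the agent's replies.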
### Function Calling
There are 2 functions that can be called:
- Answer validation
  - will check the answer type vs the expected type
  - will store the answer
- Log Input
  - will log the user input
  - this is a form of transcribing the incoming audio
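As a rough illustration of what the two callable functions might do, here is a minimal sketch. The function names, argument names, type labels, and in-memory stores are all assumptions made for this example; the demo's actual handlers may look different.

```python
from typing import Any

# Hypothetical in-memory stores; the demo may persist data differently.
answers: dict[str, Any] = {}
input_log: list[str] = []

# Assumed set of type labels the agent can expect for an answer.
EXPECTED_TYPES = {
    "string": str,
    "integer": int,
    "number": (int, float),
    "boolean": bool,
}


def validate_answer(question_id: str, answer: Any, expected_type: str) -> dict:
    """Check the answer's type against the expected type and store it if valid."""
    py_type = EXPECTED_TYPES.get(expected_type)
    if py_type is None:
        return {"valid": False, "reason": f"unknown expected type: {expected_type}"}
    # bool is a subclass of int in Python, so reject it for numeric types.
    if isinstance(answer, bool) and expected_type != "boolean":
        return {"valid": False, "reason": f"expected {expected_type}, got boolean"}
    if not isinstance(answer, py_type):
        return {"valid": False, "reason": f"expected {expected_type}"}
    answers[question_id] = answer
    return {"valid": True}


def log_input(text: str) -> dict:
    """Record what the user said; this doubles as a transcript of incoming audio."""
    input_log.append(text)
    return {"logged": True}
```

Returning a small dictionary from each function gives the model a structured result it can react to, for example by asking the candidate to repeat an answer that failed validation.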
## Getting Started
To run the application, follow these steps:
1. Install uv (if not already installed):
`curl -LsSf https://astral.sh/uv/install.sh | sh`
2. Install dependencies:
`uv sync`
3. Set up the environment variables for either GenAI or VertexAI (see below)
4. Run the application:
`python src/app.py`
5. Visit `http://127.0.0.1:7860` in your browser to interact with the voice agent.
### GenAI vs VertexAI
"gemini-2.0-flash-exp" can be used in both GenAI and VertexAI. [more info](https://github.com/heiko-hotz/gemini-multimodal-live-dev-guide?tab=readme-ov-file)
- GenAI requires just a GEMINI_API_KEY environment variable [link](https://ai.google.dev/gemini-api/docs/api-key)
- VertexAI requires a GCP project and the following environment variables:
```
export GOOGLE_CLOUD_PROJECT=YOUR_PROJECT_ID
export GOOGLE_CLOUD_LOCATION=europe-west4
export GOOGLE_GENAI_USE_VERTEXAI=True
```
Depending on the `GOOGLE_GENAI_USE_VERTEXAI` flag, this demo will use either GenAI or VertexAI.
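The backend choice can be sketched as a small helper that turns the environment variables into client settings. `client_kwargs` is a hypothetical name, and the set of values treated as "true" is an assumption; the demo's own flag handling may differ. The returned keys are chosen to line up with the keyword arguments of `genai.Client` in the `google-genai` SDK, which is itself assumed to be the client library in use.

```python
import os
from typing import Mapping, Optional


def client_kwargs(env: Optional[Mapping[str, str]] = None) -> dict:
    """Build client settings for GenAI or VertexAI from environment variables."""
    env = os.environ if env is None else env
    flag = env.get("GOOGLE_GENAI_USE_VERTEXAI", "")
    # Assumption: "true" and "1" (case-insensitive) switch VertexAI on.
    use_vertex = flag.strip().lower() in {"1", "true"}
    if use_vertex:
        return {
            "vertexai": True,
            "project": env["GOOGLE_CLOUD_PROJECT"],
            "location": env["GOOGLE_CLOUD_LOCATION"],
        }
    # GenAI only needs an API key.
    return {"api_key": env["GEMINI_API_KEY"]}
```

A missing variable raises `KeyError`, which surfaces a misconfigured environment at startup rather than on the first API call.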
### Note
The gradio-webrtc install fails unless ffmpeg@6 is installed. On macOS:
```
brew uninstall ffmpeg
brew install ffmpeg@6
brew link ffmpeg@6
```