fciannella committed on
Commit
41b6e84
·
1 Parent(s): f64b107

fixed the protobuf issue

Dockerfile CHANGED
@@ -51,7 +51,7 @@ WORKDIR /app
 
  # App files
  COPY --chown=user pyproject.toml uv.lock \
- LICENSE README.md NVIDIA_PIPECAT.md \
+ LICENSE README.md \
  ./
  COPY --chown=user src/ ./src/
  COPY --chown=user examples/${EXAMPLE_NAME} ./examples/${EXAMPLE_NAME}
NVIDIA_PIPECAT.md CHANGED
@@ -2,4 +2,4 @@
 
  The NVIDIA Pipecat library augments [the Pipecat framework](https://github.com/pipecat-ai/pipecat) by adding additional frame processors and services, as well as new multimodal frames to facilitate the creation of human-avatar interactions. This includes the integration of NVIDIA services and NIMs such as [NVIDIA Riva](https://docs.nvidia.com/deeplearning/riva/user-guide/docs/index.html), [NVIDIA Audio2Face](https://build.nvidia.com/nvidia/audio2face-3d), and [NVIDIA Foundational RAG](https://build.nvidia.com/nvidia/build-an-enterprise-rag-pipeline). It also introduces a few processors with a focus on improving the end-user experience for multimodal conversational agents, along with speculative speech processing to reduce latency for faster bot responses.
 
- The nvidia-pipecat source code can be found in [the GitHub repository](https://github.com/NVIDIA/ace-controller). Follow [the documentation](https://docs.nvidia.com/ace/ace-controller-microservice/latest/index.html) for more details.
+ The nvidia-pipecat source code can be found in [the GitHub repository](https://github.com/NVIDIA/ace-controller). Follow [the documentation](https://docs.nvidia.com/ace/ace-controller-microservice/latest/index.html) for more details.
examples/voice_agent_multi_thread/DOCKER_DEPLOYMENT.md DELETED
@@ -1,322 +0,0 @@
- # Docker Deployment - Multi-Threaded Voice Agent
-
- ## Overview
-
- This Docker container runs the complete multi-threaded telco voice agent stack:
- - **LangGraph Server** (`langgraph dev`) on port 2024
- - **Pipecat Pipeline** (FastAPI + WebRTC) on port 7860
- - **React UI** served at `http://localhost:7860`
-
- ## Quick Start
-
- ### Build the Image
-
- ```bash
- # From project root
- docker build -t voice-agent-multi-thread .
- ```
-
- ### Run the Container
-
- ```bash
- docker run -p 7860:7860 \
- -e RIVA_API_KEY=your_nvidia_api_key \
- -e NVIDIA_ASR_FUNCTION_ID=52b117d2-6c15-4cfa-a905-a67013bee409 \
- -e NVIDIA_TTS_FUNCTION_ID=4e813649-d5e4-4020-b2be-2b918396d19d \
- voice-agent-multi-thread
- ```
-
- ### Access the UI
-
- Open your browser to: **http://localhost:7860**
-
- ## What Happens Inside the Container
-
- The `start.sh` script orchestrates two processes:
-
- ### 1. LangGraph Server (Port 2024)
- ```bash
- cd /app/examples/voice_agent_multi_thread/agents
- uv run langgraph dev --no-browser --host 0.0.0.0 --port 2024
- ```
-
- This runs the multi-threaded telco agent with:
- - Main thread for long operations
- - Secondary thread for interim queries
- - Store-based coordination
-
- ### 2. Pipecat Pipeline (Port 7860)
- ```bash
- cd /app/examples/voice_agent_multi_thread
- uv run pipeline.py
- ```
-
- This runs the voice pipeline with:
- - WebRTC transport
- - RIVA ASR (speech-to-text)
- - LangGraphLLMService (multi-threaded routing)
- - RIVA TTS (text-to-speech)
- - React UI
-
- ## Environment Variables
-
- ### Required
-
- ```bash
- # NVIDIA API Key for RIVA services
- RIVA_API_KEY=nvapi-xxxxx
- ```
-
- ### Optional
-
- ```bash
- # LangGraph Configuration
- LANGGRAPH_HOST=0.0.0.0
- LANGGRAPH_PORT=2024
- LANGGRAPH_ASSISTANT=telco-agent
-
- # User Configuration
- USER_EMAIL=user@example.com
-
- # ASR Configuration
- NVIDIA_ASR_FUNCTION_ID=52b117d2-6c15-4cfa-a905-a67013bee409
- RIVA_ASR_LANGUAGE=en-US
- RIVA_ASR_MODEL=parakeet-1.1b-en-US-asr-streaming-silero-vad-asr-bls-ensemble
-
- # TTS Configuration
- NVIDIA_TTS_FUNCTION_ID=4e813649-d5e4-4020-b2be-2b918396d19d
- RIVA_TTS_VOICE_ID=Magpie-ZeroShot.Female-1
- RIVA_TTS_MODEL=magpie_tts_ensemble-Magpie-ZeroShot
- RIVA_TTS_LANGUAGE=en-US
-
- # Zero-shot audio prompt (optional)
- ZERO_SHOT_AUDIO_PROMPT_URL=https://github.com/your-repo/audio-prompt.wav
-
- # Multi-threading (default: true)
- ENABLE_MULTI_THREADING=true
-
- # Debug
- LANGGRAPH_DEBUG_STREAM=false
- ```
-
- ## Docker Compose
-
- Create `docker-compose.yml`:
-
- ```yaml
- version: '3.8'
-
- services:
-   voice-agent:
-     build: .
-     ports:
-       - "7860:7860"
-     environment:
-       - RIVA_API_KEY=${RIVA_API_KEY}
-       - NVIDIA_ASR_FUNCTION_ID=52b117d2-6c15-4cfa-a905-a67013bee409
-       - NVIDIA_TTS_FUNCTION_ID=4e813649-d5e4-4020-b2be-2b918396d19d
-       - USER_EMAIL=user@example.com
-       - LANGGRAPH_ASSISTANT=telco-agent
-       - ENABLE_MULTI_THREADING=true
-     volumes:
-       # Optional: mount .env file
-       - ./examples/voice_agent_multi_thread/.env:/app/examples/voice_agent_multi_thread/.env:ro
-       # Optional: persist audio recordings
-       - ./audio_dumps:/app/examples/voice_agent_multi_thread/audio_dumps
-     healthcheck:
-       test: ["CMD", "curl", "-f", "http://localhost:7860/get_prompt"]
-       interval: 30s
-       timeout: 10s
-       retries: 3
-       start_period: 60s
- ```
-
- Run with:
- ```bash
- docker-compose up
- ```
-
- ## Using .env File
-
- Create `.env` in `examples/voice_agent_multi_thread/`:
-
- ```bash
- # NVIDIA API Keys
- RIVA_API_KEY=nvapi-xxxxx
-
- # LangGraph
- LANGGRAPH_ASSISTANT=telco-agent
- LANGGRAPH_BASE_URL=http://127.0.0.1:2024
-
- # User
- USER_EMAIL=test@example.com
-
- # ASR
- NVIDIA_ASR_FUNCTION_ID=52b117d2-6c15-4cfa-a905-a67013bee409
-
- # TTS
- NVIDIA_TTS_FUNCTION_ID=4e813649-d5e4-4020-b2be-2b918396d19d
- RIVA_TTS_VOICE_ID=Magpie-ZeroShot.Female-1
- ```
-
- The `start.sh` script automatically loads this file.
-
- ## Ports
-
- | Service | Internal Port | External Port | Purpose |
- |---------|---------------|---------------|---------|
- | LangGraph Server | 2024 | - | Agent runtime (internal only) |
- | Pipecat Pipeline | 7860 | 7860 | WebRTC + HTTP API |
- | React UI | - | 7860 | Served by pipeline |
-
- **Note**: Only port 7860 is exposed externally. LangGraph runs internally on 2024.
-
- ## Healthcheck
-
- The container includes a healthcheck that verifies the pipeline is responding:
-
- ```bash
- curl -f http://localhost:7860/get_prompt
- ```
-
- Check health status:
- ```bash
- docker ps
- # Look for "(healthy)" in STATUS column
- ```
-
- ## Logs
-
- View all logs:
- ```bash
- docker logs -f <container-id>
- ```
-
- You'll see both:
- - LangGraph server startup and agent logs
- - Pipeline startup and WebRTC connection logs
-
- ## Testing Multi-Threading
-
- 1. **Open UI**: http://localhost:7860
- 2. **Select Agent**: Choose "Telco Agent"
- 3. **Test Long Operation**:
-    - Say: *"Close my contract"*
-    - Confirm: *"Yes"*
-    - Operation starts (50 seconds)
- 4. **Test Secondary Thread**:
-    - While waiting, say: *"What's the status?"*
-    - Agent responds with progress
-    - Say: *"How much data do I have left?"*
-    - Agent answers while main operation continues
-
- ## Troubleshooting
-
- ### Container won't start
- ```bash
- # Check logs
- docker logs <container-id>
-
- # Common issues:
- # 1. Missing RIVA_API_KEY
- # 2. Port 7860 already in use
- # 3. Insufficient memory
- ```
-
- ### LangGraph not starting
- ```bash
- # Check if agents directory exists
- docker exec <container-id> ls -la /app/examples/voice_agent_multi_thread/agents
-
- # Check langgraph.json
- docker exec <container-id> cat /app/examples/voice_agent_multi_thread/agents/langgraph.json
- ```
-
- ### Pipeline not responding
- ```bash
- # Check pipeline logs
- docker logs <container-id> 2>&1 | grep pipeline
-
- # Check if port is accessible
- curl http://localhost:7860/get_prompt
- ```
-
- ### Multi-threading not working
- ```bash
- # Verify env var
- docker exec <container-id> env | grep MULTI_THREADING
-
- # Check LangGraph server
- docker exec <container-id> curl http://localhost:2024/assistants
- ```
-
- ## Development Mode
-
- To develop inside the container:
-
- ```bash
- # Run with shell
- docker run -it -p 7860:7860 \
- -v $(pwd)/examples/voice_agent_multi_thread:/app/examples/voice_agent_multi_thread \
- voice-agent-multi-thread /bin/bash
-
- # Inside container:
- cd /app/examples/voice_agent_multi_thread
-
- # Start services manually
- cd agents && uv run langgraph dev &
- cd .. && uv run pipeline.py
- ```
-
- ## Building for Production
-
- ### Multi-stage optimization
- The Dockerfile uses a multi-stage build:
- 1. **ui-builder**: Compiles React UI
- 2. **python base**: Installs Python dependencies
- 3. **Final image**: ~2GB (UI + Python + agents)
-
- ### Reducing image size
- ```dockerfile
- # Use slim Python base (already done)
- FROM python:3.12-slim
-
- # Clean up build artifacts (already done)
- RUN apt-get clean && rm -rf /var/lib/apt/lists/*
-
- # Use uv for faster installs (already done)
- RUN pip install uv
- ```
-
- ## Security Considerations
-
- 1. **Non-root user**: Container runs as UID 1000
- 2. **No secrets in image**: Use environment variables or mount secrets
- 3. **Read-only filesystem**: UI dist is built at image time
- 4. **Health checks**: Automatic restart on failure
-
- ## Performance
-
- - **Startup time**: ~30-60 seconds
- - **Memory**: ~2GB recommended
- - **CPU**: 2 cores minimum
- - **Storage**: ~3GB for image + runtime
-
- ## Related Files
-
- - `Dockerfile` - Container definition
- - `start.sh` - Startup orchestration
- - `agents/langgraph.json` - Agent configuration
- - `pipeline.py` - Pipecat pipeline
- - `langgraph_llm_service.py` - Multi-threaded LLM service
-
- ## Support
-
- For issues:
- 1. Check logs: `docker logs <container-id>`
- 2. Verify environment variables
- 3. Test components individually (LangGraph, Pipeline)
- 4. Review `PIPECAT_MULTI_THREADING.md` for architecture details
-
-
-
examples/voice_agent_multi_thread/agents/requirements.txt CHANGED
@@ -11,5 +11,6 @@ docling
  pymongo
  yt_dlp
  requests
- protobuf==6.31.1
+ # protobuf==6.31.1
+ protobuf
  twilio
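
The hunk above comments out the exact `protobuf==6.31.1` pin and leaves an unconstrained `protobuf`, so the resolver is free to pick whatever version the other dependencies allow. A minimal sketch, assuming one wants to confirm which version was actually resolved in a given environment; the helper names `parse_version_prefix` and `installed_protobuf_version` are hypothetical and not part of this repository:

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional, Tuple


def parse_version_prefix(version_string: str) -> Tuple[int, ...]:
    """Turn the leading purely numeric parts of '6.31.1' into (6, 31, 1)."""
    parts = []
    for piece in version_string.split("."):
        if not piece.isdigit():
            break  # stop at the first non-numeric part, such as '0rc1'
        parts.append(int(piece))
    return tuple(parts)


def installed_protobuf_version() -> Optional[str]:
    """Return the installed protobuf version, or None if it is not installed."""
    try:
        return version("protobuf")
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_protobuf_version()
    if found is None:
        print("protobuf is not installed in this environment")
    else:
        print(f"protobuf {found} resolved, parsed as {parse_version_prefix(found)}")
```

Running this inside the agents environment after installing `requirements.txt` would show which release the unpinned entry resolved to, which may help if a bounded range is preferred later over a fully open requirement.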
pyproject.toml CHANGED
@@ -2,7 +2,7 @@
  name = "nvidia-pipecat"
  version = "0.2.0"
  description = "NVIDIA ACE Pipecat SDK"
- readme = "NVIDIA_PIPECAT.md"
+ readme = "README.md"
  license = { file = "LICENSE" }
  authors = [
  { name = "NVIDIA ACE", email = "ace-dev@exchange.nvidia.com" }