NVIDIA Pipecat 0.1.0 (23 April 2025)
The NVIDIA Pipecat library extends the Pipecat framework with additional frame processors and services, as well as new multimodal frames to enhance avatar interactions. This is the first release of the NVIDIA Pipecat library.
New Features
- Added Pipecat services for Riva ASR (Automatic Speech Recognition), Riva TTS (Text to Speech), and Riva NMT (Neural Machine Translation) models.
- Added Pipecat frames, processors, and services to support multimodal avatar interactions and use cases. This includes Audio2Face3DService, AnimationGraphService, FacialGestureProviderProcessor, and PostureProviderProcessor.
- Added ACETransport, which is specifically designed to support integration with existing ACE microservices. This includes a FastAPI-based HTTP and WebSocket server implementation compatible with ACE.
- Added NvidiaLLMService for NIM LLM models and NvidiaRAGService for the NVIDIA RAG Blueprint.
- Added UserTranscriptSynchronization processor for user speech transcripts and BotTranscriptSynchronization processor for synchronizing bot transcripts with bot audio playback.
- Added custom context aggregators and processors to enable Speculative Speech Processing to reduce latency.
- Added UserPresence, Proactivity, and AcknowledgementProcessor frame processors to improve human-bot interactions.
- Released source code for the voice assistant example using nvidia-pipecat, along with the pipecat-ai library service, to showcase NVIDIA services with ACETransport.
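
As a rough illustration of the Speculative Speech Processing item above, the sketch below starts generating a reply on interim transcripts and reuses that reply if the final transcript turns out to match. It is a standalone toy under stated assumptions, not the nvidia-pipecat implementation: `SpeculativeResponder`, `on_interim`, `on_final`, and the `generate` callback are hypothetical names, and a real pipeline would run generation asynchronously and cancel stale work.

```python
from typing import Callable, Optional


def _normalize(text: str) -> str:
    """Lowercase and collapse whitespace so trivially different transcripts compare equal."""
    return " ".join(text.lower().split())


class SpeculativeResponder:
    """Hypothetical sketch: respond early on interim transcripts, keep the result on a match."""

    def __init__(self, generate: Callable[[str], str]) -> None:
        # `generate` is an assumed stand-in for an LLM call; it is not a library API.
        self._generate = generate
        self._speculated_text: Optional[str] = None
        self._speculated_reply: Optional[str] = None

    def on_interim(self, text: str) -> None:
        """Speculatively generate a reply for an interim (not yet final) transcript."""
        normalized = _normalize(text)
        if normalized == self._speculated_text:
            return  # Already speculated on this exact text; avoid duplicate work.
        self._speculated_text = normalized
        self._speculated_reply = self._generate(normalized)

    def on_final(self, text: str) -> str:
        """Return the speculated reply if it matches the final transcript, else regenerate."""
        normalized = _normalize(text)
        if normalized == self._speculated_text and self._speculated_reply is not None:
            reply = self._speculated_reply  # Speculation paid off: no extra generation.
        else:
            reply = self._generate(normalized)  # Speculation missed: fall back.
        # Reset state so the next user turn starts clean.
        self._speculated_text = None
        self._speculated_reply = None
        return reply
```

The latency saving in this sketch comes from the matching case: the reply is already available when the final transcript arrives, so the time spent waiting for the final result overlaps with generation.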
Improvements
- Added ElevenLabsTTSServiceWithEndOfSpeech, an extended version of the ElevenLabs TTS service with end-of-speech events for usage in avatar interactions.
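
To make the idea of an end-of-speech event concrete, here is a minimal sketch that wraps a stream of audio chunks and appends an explicit marker once the stream is exhausted, so a downstream consumer (for example an avatar animation stage) knows the utterance is complete. The names `AudioChunk`, `EndOfSpeech`, and `with_end_of_speech` are hypothetical and chosen for illustration; they are not part of ElevenLabsTTSServiceWithEndOfSpeech or the ElevenLabs API.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, Union


@dataclass(frozen=True)
class AudioChunk:
    """Hypothetical container for one piece of synthesized audio."""
    data: bytes


@dataclass(frozen=True)
class EndOfSpeech:
    """Hypothetical marker emitted after the last audio chunk of an utterance."""
    total_bytes: int


def with_end_of_speech(
    chunks: Iterable[bytes],
) -> Iterator[Union[AudioChunk, EndOfSpeech]]:
    """Yield every audio chunk, then a single EndOfSpeech event."""
    total = 0
    for chunk in chunks:
        total += len(chunk)
        yield AudioChunk(chunk)
    # The stream is exhausted, so signal that the utterance has finished.
    yield EndOfSpeech(total)
```

A consumer can then branch on the event type, for instance returning an avatar to an idle state when it sees `EndOfSpeech`.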