---
title: Responses.js
emoji: 😻
colorFrom: red
colorTo: red
sdk: docker
pinned: false
license: mit
short_description: Check out https://github.com/huggingface/responses.js
app_port: 3000
---

# responses.js

A lightweight Express.js server that implements a translation layer between OpenAI's two main LLM APIs: it exposes the Responses API on top of any Chat Completions API, whether it's a local LLM or the cloud provider of your choice.
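The core of such a translation layer is a request mapping. The sketch below, in plain JavaScript, shows the general idea using the public OpenAI field names (`instructions`, `input`, `messages`); it is an illustration of the pattern, not the project's actual implementation.

```javascript
// Map a Responses-style request onto a Chat Completions-style request.
function responsesToChatCompletions(req) {
  const messages = [];
  if (req.instructions) {
    // "instructions" plays roughly the role of a system message
    messages.push({ role: "system", content: req.instructions });
  }
  if (typeof req.input === "string") {
    // A bare string becomes a single user message
    messages.push({ role: "user", content: req.input });
  } else if (Array.isArray(req.input)) {
    // Structured input items already carry their own roles
    for (const item of req.input) {
      messages.push({ role: item.role, content: item.content });
    }
  }
  return { model: req.model, messages, stream: Boolean(req.stream) };
}

const body = responsesToChatCompletions({
  model: "moonshotai/Kimi-K2-Instruct:groq",
  instructions: "You are terse.",
  input: "Say hello.",
});
console.log(body.messages.length); // 2
```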

## 🎮 Live Demo

Try responses.js right now, no installation needed!

## ✨ Features

- **Responses API**: partial implementation of OpenAI's Responses API, on top of the Chat Completions API
- **Provider Agnostic**: works with any Chat Completions API (local or remote)
- **Streaming Support**: support for streamed responses
- **Structured Output**: support for structured data responses (e.g. JSON Schema)
- **Function Calling**: tool and function calling capabilities
- **Multi-modal Input**: text and image input support
- **Remote MCP**: execute MCP tool calls remotely
- **Demo UI**: interactive web interface for testing

Not implemented: remote function calling, file upload, stateful API, etc.
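One concrete gap the function-calling support has to bridge: the Responses API uses flat function-tool definitions, while the Chat Completions API nests the same fields under a `function` key. A minimal conversion sketch, illustrative rather than the project's actual code:

```javascript
// Responses API tool (flat):      { type: "function", name, description, parameters }
// Chat Completions tool (nested): { type: "function", function: { name, description, parameters } }
function toChatCompletionsTool(tool) {
  if (tool.type !== "function") return tool; // pass other tool types through unchanged
  return {
    type: "function",
    function: {
      name: tool.name,
      description: tool.description,
      parameters: tool.parameters,
    },
  };
}

const converted = toChatCompletionsTool({
  type: "function",
  name: "get_weather",
  description: "Get weather for a city",
  parameters: { type: "object", properties: { city: { type: "string" } } },
});
console.log(converted.function.name); // "get_weather"
```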

## 🚀 Quick Start

### Prerequisites

- Node.js (v18 or higher)
- pnpm (recommended) or npm
- a Hugging Face token with inference permissions. Create one from your user settings.

### Installation & Setup

```bash
# Clone the repository
git clone https://github.com/huggingface/responses.js.git
cd responses.js

# Install dependencies
pnpm install

# Start the development server
pnpm dev
```

The server will be available at http://localhost:3000.

### Running Examples

Explore the various capabilities with our example scripts located in the ./examples folder:

```bash
# Basic text input
pnpm run example text

# Multi-turn conversations
pnpm run example multi_turn

# Text + image input
pnpm run example image

# Streaming responses
pnpm run example streaming

# Structured output
pnpm run example structured_output
pnpm run example structured_output_streaming

# Function calling
pnpm run example function
pnpm run example function_streaming
```
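The streaming examples consume OpenAI-style server-sent events, where each event arrives as a `data: <json>` line. If you want to inspect the raw stream yourself, a minimal parser sketch (this assumes standard SSE framing and the `[DONE]` sentinel used by OpenAI-compatible APIs, and is not tied to this repo's internals):

```javascript
// Parse a chunk of SSE text into an array of JSON event payloads.
// Lines look like: `data: {"type":"response.output_text.delta","delta":"..."}`
function parseSseChunk(chunk) {
  const events = [];
  for (const line of chunk.split("\n")) {
    const trimmed = line.trim();
    if (!trimmed.startsWith("data:")) continue; // skip blank lines and comments
    const data = trimmed.slice("data:".length).trim();
    if (data === "[DONE]") break; // end-of-stream sentinel
    events.push(JSON.parse(data));
  }
  return events;
}

const events = parseSseChunk(
  'data: {"type":"response.output_text.delta","delta":"Hel"}\n' +
  'data: {"type":"response.output_text.delta","delta":"lo"}\n'
);
console.log(events.map((e) => e.delta).join("")); // "Hello"
```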

## 🧪 Testing

### Important Notes

- Server must be running (`pnpm dev`) on http://localhost:3000
- `API_KEY` environment variable set with your LLM provider's API key
- Tests use real inference providers and may incur costs
- Tests are not run in CI due to billing requirements

### Running Tests

```bash
# Run all tests
pnpm test

# Run specific test patterns
pnpm test --grep "streaming"
pnpm test --grep "function"
pnpm test --grep "structured"
```

## Interactive Demo UI

Experience the API through our interactive web interface, adapted from the openai-responses-starter-app.

Demo Video

### Setup

1. Create a configuration file:

   ```bash
   # Create demo/.env
   cat > demo/.env << EOF
   MODEL="moonshotai/Kimi-K2-Instruct:groq"
   OPENAI_BASE_URL=http://localhost:3000/v1
   OPENAI_API_KEY=${HF_TOKEN:-<your-huggingface-token>}
   EOF
   ```

2. Install demo dependencies:

   ```bash
   pnpm demo:install
   ```

3. Launch the demo:

   ```bash
   pnpm demo:dev
   ```

The demo will be available at http://localhost:3001.

๐Ÿณ Running with Docker

You can run the server in a production-ready container using Docker.

Build the Docker image

docker build -t responses.js .

Run the server

docker run -p 3000:3000 responses.js

The server will be available at http://localhost:3000.

## 📁 Project Structure

```
responses.js/
├── demo/             # Interactive chat UI demo
├── examples/         # Example scripts using openai-node client
├── src/
│   ├── index.ts      # Application entry point
│   ├── server.ts     # Express app configuration and route definitions
│   ├── routes/       # API route implementations
│   ├── middleware/   # Middleware (validation, logging, etc.)
│   └── schemas/      # Zod validation schemas
├── scripts/          # Utility and build scripts
├── package.json      # Package configuration and dependencies
└── README.md         # This file
```
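The `middleware/` and `schemas/` directories follow the common Express pattern of validating a request body before it reaches a route handler. A dependency-free sketch of that pattern (the real project uses Zod schemas; the check function and field rules here are illustrative assumptions, not the project's code):

```javascript
// Wrap a body-check function into Express-style middleware:
// reject invalid bodies with HTTP 400, otherwise pass control to the route.
function validateBody(check) {
  return (req, res, next) => {
    const error = check(req.body); // returns an error message, or null if valid
    if (error) {
      res.status(400).json({ error: { message: error } });
      return;
    }
    next();
  };
}

// Minimal check for a /v1/responses payload (field names per OpenAI's API).
const checkResponsesBody = (body) => {
  if (!body || typeof body.model !== "string") return "`model` must be a string";
  if (body.input === undefined) return "`input` is required";
  return null;
};

const middleware = validateBody(checkResponsesBody);
```

In the real project the `check` step is a Zod `safeParse` against the schemas in `src/schemas/`, which also gives typed request objects for free.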

## 🛣️ Done / TODOs

Note: This project is in active development. The roadmap below represents our current priorities and may evolve. Do not take anything for granted.

- OpenAI types integration for consistent output
- Streaming mode support
- Structured output capabilities
- Function calling implementation
- Repository migration to dedicated responses.js repo
- Basic development tooling setup
- Demo application with comprehensive instructions
- Multi-turn conversation fixes for text messages + tool calls
- Correctly return "usage" field
- MCP support (non-streaming)
- MCP support (streaming)
- Tools execution (web search, file search, image generation, code interpreter)
- Background mode support
- Additional API routes (GET, DELETE, CANCEL, LIST responses)
- Reasoning capabilities

## 🤝 Contributing

We welcome contributions! Please feel free to submit issues, feature requests, or pull requests.

## 📄 License

This project is licensed under the MIT License - see the LICENSE file for details.

## 🙏 Acknowledgments