title: LLM-Powered Tool Invocation System
emoji: 🛠️
colorFrom: blue
colorTo: green
sdk: gradio
sdk_version: 5.35.0
app_file: app.py
pinned: false
license: mit
🛠️ LLM-Powered Tool Invocation System
This Hugging Face Space demonstrates an advanced, end-to-end framework for interpreting natural language commands and executing corresponding software tools. The system uses a dual-model approach: a sentence-transformer for semantic search and a lightweight instruction-tuned LLM for structured data extraction.
The core idea is to bridge the gap between human language and machine-executable functions, allowing users to perform complex tasks by simply describing what they want to do.
💡 Core Innovation: Latent Space vs. Fine-Tuning
A traditional approach to teaching an LLM to use tools is fine-tuning. This involves re-training a large model on thousands of examples of "command-to-tool" mappings. While effective, this method has significant drawbacks:
- Static & Brittle: If a new tool is added or an existing one changes, the entire model must be fine-tuned again.
- Data-Intensive: It requires a large, high-quality dataset of command-execution pairs.
- Computationally Expensive: Fine-tuning is a costly process in both time and computing resources.
- Risk of Catastrophic Forgetting: The model can lose some of its general reasoning abilities when it becomes overly specialized in a specific task.
This project uses a more modern and flexible zero-shot approach that operates within a latent space and does not alter the models' weights at all.
How Latent Space Works Here
"Latent space" is a high-dimensional vector space where the semantic meaning of text is represented geometrically. In this space, concepts with similar meanings are located closer to each other.
The system separates the problem into two distinct steps:
Finding the Right Tool (Semantic Search in Latent Space): We use a SentenceTransformer model to map the user's command (e.g., "show me the weather in Paris for 2 days") into this latent space. We do the same for all available tool descriptions. The system then computes the cosine similarity between the command vector and each tool vector and picks the tool whose vector is geometrically closest. This is like a librarian who knows which aisle holds the books on your topic.
Using the Tool (LLM Reasoning): Once the correct tool is identified (e.g., weather_reporter), we use a general-purpose instruction-tuned LLM (Qwen/Qwen2-0.5B-Instruct) as a pure reasoning engine. We provide it with the original command and the specific "table of contents" for the chosen tool: its JSON argument schema. The LLM's only job is to read the command and fill out the schema, a task that instruction-tuned models can typically handle without task-specific training.
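The argument-extraction step can be sketched roughly as follows. This is a minimal illustration, not the code in app.py: the helper names `build_prompt` and `parse_arguments`, the prompt wording, and the `weather_reporter` schema are all assumptions, and the model reply is a hard-coded stand-in so the example runs without loading an LLM.

```python
import json


def build_prompt(command, tool_name, args_schema):
    """Assemble an instruction asking the model to fill in the schema."""
    return (
        f"You extract arguments for the tool '{tool_name}'.\n"
        f"JSON schema:\n{json.dumps(args_schema, indent=2)}\n"
        f"User command: {command}\n"
        "Reply with a single JSON object that matches the schema."
    )


def parse_arguments(llm_output, args_schema):
    """Pull the first JSON object out of the reply and check required keys."""
    start = llm_output.find("{")
    if start == -1:
        raise ValueError("no JSON object found in model output")
    # raw_decode parses one JSON value and ignores any trailing text.
    args, _ = json.JSONDecoder().raw_decode(llm_output[start:])
    missing = [k for k in args_schema.get("required", []) if k not in args]
    if missing:
        raise ValueError(f"missing required arguments: {missing}")
    return args


# Hypothetical schema for weather_reporter; the real one is defined in app.py.
schema = {
    "type": "object",
    "properties": {
        "location": {"type": "string"},
        "days": {"type": "integer"},
    },
    "required": ["location", "days"],
}
command = "show me the weather in Paris for 2 days"
prompt = build_prompt(command, "weather_reporter", schema)

# Stand-in for a model reply; a real call to the LLM would go here.
fake_reply = 'Sure! {"location": "Paris", "days": 2} Hope that helps.'
print(parse_arguments(fake_reply, schema))
```

Small models often wrap the JSON in extra text, which is why the parser looks for the first `{` instead of expecting a clean reply.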
This separation of concerns is the key advantage. It results in a system that is:
- ✅ Dynamic & Extensible: Add a new tool by writing a Python function and a description. No re-training is required; the tool becomes selectable as soon as its description is embedded.
- ✅ Resource-Efficient: It avoids the high costs of fine-tuning and requires only a small, fast LLM for the reasoning part.
- ✅ Robust & Transparent: The selection process is not a "black box." It is a cosine-similarity search whose per-tool scores can be inspected and debugged directly.
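To make the transparency point concrete, here is a toy sketch of the selection step in which every candidate tool gets a score that can be printed. The 3-dimensional vectors are invented for illustration; in the real application the vectors come from the sentence-transformer model and have hundreds of dimensions.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hypothetical "embeddings" of two tool descriptions; the numbers are made up.
tool_vectors = {
    "weather_reporter": [0.9, 0.1, 0.0],
    "email_sender": [0.0, 0.2, 0.9],
}
query_vector = [0.8, 0.3, 0.1]  # stands in for an embedded weather question

# Score every tool, highest similarity first, so the choice can be audited.
ranking = sorted(
    ((cosine_similarity(query_vector, vec), name) for name, vec in tool_vectors.items()),
    reverse=True,
)
for score, name in ranking:
    print(f"{name}: {score:.3f}")
best_tool = ranking[0][1]
```

Printing the full ranking, not just the winner, is what makes a wrong selection easy to diagnose: a narrow gap between the top two scores usually points to overlapping tool descriptions.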
🧠 Project Components
Component | Model/Library | Purpose |
---|---|---|
Tool Selection | sentence-transformers/all-mpnet-base-v2 | Encodes user queries and tool descriptions into vectors for semantic similarity matching in latent space. |
Argument Extraction | Qwen/Qwen2-0.5B-Instruct | A lightweight LLM that reasons over the user's query to extract structured JSON arguments based on a schema. |
Visualization | umap-learn, matplotlib | Reduces the high-dimensional embedding space into a 2D plot for visualization. |
Web Interface | gradio | Provides an interactive web UI for the application. |
Core ML/Data Handling | transformers, torch, numpy | The foundational libraries for running the models and handling data. |
🚀 How to Run & Customize
This Space is designed to be self-contained and easy to run.
Running the Space
The application will run automatically. Because it uses the public Qwen/Qwen2-0.5B-Instruct model, no special API keys or secrets are required. For the best performance, it's recommended to run this Space on a GPU, which can be configured in the Space's settings.
Customizing and Adding New Tools
The latent space architecture makes this framework highly extensible. To add a new tool, you only need to modify app.py:
Create the Tool's Function:
- Write a standard Python function that performs the tool's logic. It should accept arguments with type hints.
- Example:
    def send_email(recipient: str, subject: str, body: str):
        """Simulates sending an email."""
        # Reject the call if any argument is not a string.
        if not all(isinstance(arg, str) for arg in [recipient, subject, body]):
            return {"error": "Invalid argument types."}
        print(f"Email sent to {recipient} with subject '{subject}'")
        return {"status": "success", "recipient": recipient}
Define the Tool's Schema:
- In the tools list (around line 180), create a new Tool object.
- Provide the name, description, args_schema (as a JSON schema), and point to the function you just created. A good description and relevant examples are crucial for the semantic search to work well.
- Example:
    Tool(
        name="email_sender",
        description="Sends an email to a specified recipient with a subject and body.",
        args_schema={
            "type": "object",
            "properties": {
                "recipient": {"type": "string", "description": "The email address of the recipient."},
                "subject": {"type": "string", "description": "The subject line of the email."},
                "body": {"type": "string", "description": "The main content of the email."},
            },
            "required": ["recipient", "subject", "body"],
        },
        function=send_email,
        examples=[
            {
                "prompt": "send an email to jane@example.com about the project update with the body 'Hi Jane, please see the attached document.'",
                "args": {
                    "recipient": "jane@example.com",
                    "subject": "Project Update",
                    "body": "Hi Jane, please see the attached document.",
                },
            }
        ],
    )
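Because the extracted arguments come from an LLM, it can help to check them against args_schema before calling the tool's function. The sketch below is one possible way to do that and is not taken from app.py: `validate_args`, `invoke`, and the `JSON_TYPES` mapping are hypothetical helpers, and only a small subset of JSON Schema (required keys and basic types) is covered.

```python
# Map JSON-schema type names to Python types (a subset, for illustration).
# Note: Python treats bool as a subclass of int, so True passes "integer".
JSON_TYPES = {"string": str, "integer": int, "number": (int, float), "boolean": bool}


def validate_args(args, args_schema):
    """Return a list of problems; an empty list means the arguments look usable."""
    problems = []
    properties = args_schema.get("properties", {})
    for key in args_schema.get("required", []):
        if key not in args:
            problems.append(f"missing required argument: {key}")
    for key, value in args.items():
        if key not in properties:
            problems.append(f"unexpected argument: {key}")
            continue
        expected = JSON_TYPES.get(properties[key].get("type"))
        if expected is not None and not isinstance(value, expected):
            problems.append(f"{key} should be of type {properties[key]['type']}")
    return problems


def invoke(function, args, args_schema):
    """Call the tool only when validation passes; otherwise report the problems."""
    problems = validate_args(args, args_schema)
    if problems:
        return {"error": problems}
    return function(**args)


def send_email(recipient: str, subject: str, body: str):
    """Simulates sending an email (trimmed version of the example tool)."""
    return {"status": "success", "recipient": recipient}


email_schema = {
    "type": "object",
    "properties": {
        "recipient": {"type": "string"},
        "subject": {"type": "string"},
        "body": {"type": "string"},
    },
    "required": ["recipient", "subject", "body"],
}

ok = invoke(
    send_email,
    {"recipient": "jane@example.com", "subject": "Project Update", "body": "Hi Jane"},
    email_schema,
)
bad = invoke(send_email, {"recipient": "jane@example.com"}, email_schema)
print(ok)
print(bad)
```

Returning the list of problems instead of raising keeps the failure visible in the UI and leaves room for a retry with a corrected prompt.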
Once added, the application will automatically embed the new tool's description into the latent space, making it immediately available for use without any downtime or re-training.
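The registration flow can be pictured with a small sketch. It is an illustration under stated assumptions, not the code in app.py: `ToolRegistry` is a hypothetical class, the `Tool` dataclass here carries only three of the fields, and `toy_embed` is a bag-of-words stand-in for the sentence-transformer encoder so the example runs without downloading a model.

```python
import math
from collections import Counter
from dataclasses import dataclass
from typing import Callable


@dataclass
class Tool:
    """Simplified, hypothetical version of the Tool object."""
    name: str
    description: str
    function: Callable


def toy_embed(text):
    """Stand-in for a sentence encoder: a bag-of-words count vector."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


class ToolRegistry:
    def __init__(self, embed=toy_embed):
        self.embed = embed
        self.tools = []
        self.vectors = []

    def add(self, tool):
        """Embed only the new description; existing vectors are left untouched."""
        self.tools.append(tool)
        self.vectors.append(self.embed(tool.description))

    def select(self, command):
        """Return the tool whose description vector is most similar to the command."""
        query = self.embed(command)
        scores = [cosine(query, vec) for vec in self.vectors]
        return self.tools[scores.index(max(scores))]


registry = ToolRegistry()
registry.add(Tool("weather_reporter", "Reports the weather forecast for a city.", lambda: None))
registry.add(Tool("email_sender", "Sends an email to a specified recipient with a subject and body.", lambda: None))
print(registry.select("send an email to jane@example.com").name)
```

Swapping `toy_embed` for a real encoder would change the quality of the match but not the flow: adding a tool costs one embedding call, and no model weights change.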