burtenshaw committed on
Commit 49da546 · 1 Parent(s): 14270af

change functionality to add tags to model repos

Files changed (3)
  1. README.md +7 -241
  2. app.py +216 -57
  3. mcp_server.py +109 -45
README.md CHANGED
@@ -1,5 +1,5 @@
1
  ---
2
- title: Mcp Discussion Bot
3
  emoji: 👀
4
  colorFrom: purple
5
  colorTo: yellow
@@ -10,246 +10,12 @@ pinned: false
10
  base_path: /gradio
11
  ---
12
 
13
- # 🤖 Hugging Face Discussion Bot
14
 
15
- A FastAPI and Gradio application that automatically responds to Hugging Face Hub discussion comments using AI-powered responses via Hugging Face Inference API with MCP integration.
16
 
17
- ## ✨ Features
18
 
19
- - **Webhook Integration**: Receives real-time webhooks from Hugging Face Hub when new discussion comments are posted
20
- - **AI-Powered Responses**: Uses Hugging Face Inference API with MCP support for intelligent, context-aware responses
21
- - **Interactive Dashboard**: Beautiful Gradio interface to monitor comments and test functionality
22
- - **Automatic Posting**: Posts AI responses back to the original discussion thread
23
- - **Testing Tools**: Built-in webhook simulation and AI testing capabilities
24
- - **MCP Server**: Includes a Model Context Protocol server for advanced tool integration
25
-
26
- ## 🚀 Quick Start
27
-
28
- ### 1. Installation
29
-
30
- ```bash
31
- # Clone the repository
32
- git clone <your-repo-url>
33
- cd mcp-course-unit3-example
34
-
35
- # Install dependencies
36
- pip install -e .
37
- ```
38
-
39
- ### 2. Environment Setup
40
-
41
- Copy the example environment file and configure your API keys:
42
-
43
- ```bash
44
- cp env.example .env
45
- ```
46
-
47
- Edit `.env` with your credentials:
48
-
49
- ```env
50
- # Webhook Configuration
51
- WEBHOOK_SECRET=your-secure-webhook-secret
52
-
53
- # Hugging Face Configuration
54
- HF_TOKEN=hf_your_hugging_face_token_here
55
-
56
- # Model Configuration (optional)
57
- HF_MODEL=microsoft/DialoGPT-medium
58
- HF_PROVIDER=huggingface
59
- ```
60
-
61
- ### 3. Run the Application
62
-
63
- ```bash
64
- python server.py
65
- ```
66
-
67
- The application will start on `http://localhost:8000` with:
68
- - 📊 **Gradio Dashboard**: `http://localhost:8000/gradio`
69
- - 🔗 **Webhook Endpoint**: `http://localhost:8000/webhook`
70
- - 📋 **API Documentation**: `http://localhost:8000/docs`
71
-
72
- ## 🔧 Configuration
73
-
74
- ### Hugging Face Hub Webhook Setup
75
-
76
- 1. Go to your Hugging Face repository settings
77
- 2. Navigate to the "Webhooks" section
78
- 3. Create a new webhook with:
79
- - **URL**: `https://your-domain.com/webhook`
80
- - **Secret**: Same as `WEBHOOK_SECRET` in your `.env`
81
- - **Events**: Subscribe to "Community (PR & discussions)"
82
-
83
- ### Required API Keys
84
-
85
- #### Hugging Face Token
86
- 1. Go to [Hugging Face Settings](https://huggingface.co/settings/tokens)
87
- 2. Create a new token with "Write" permissions
88
- 3. Add it to your `.env` as `HF_TOKEN`
89
-
90
- ## 📊 Dashboard Features
91
-
92
- ### Recent Comments Tab
93
- - View all processed discussion comments
94
- - See AI responses in real-time
95
- - Refresh and filter capabilities
96
-
97
- ### Test HF Inference Tab
98
- - Direct testing of the Hugging Face Inference API
99
- - Custom prompt input
100
- - Response preview
101
-
102
- ### Simulate Webhook Tab
103
- - Test webhook processing without real HF events
104
- - Mock discussion scenarios
105
- - Validate AI response generation
106
-
107
- ### Configuration Tab
108
- - View current setup status
109
- - Check API key configuration
110
- - Monitor processing statistics
111
-
112
- ## 🔌 API Endpoints
113
-
114
- ### POST `/webhook`
115
- Receives webhooks from Hugging Face Hub.
116
-
117
- **Headers:**
118
- - `X-Webhook-Secret`: Your webhook secret
119
-
120
- **Body:** HF Hub webhook payload
121
-
122
- ### GET `/comments`
123
- Returns all processed comments and responses.
124
-
125
- ### GET `/`
126
- Basic API information and available endpoints.
127
-
128
- ## 🤖 MCP Server
129
-
130
- The application includes a Model Context Protocol (MCP) server that provides tools for:
131
-
132
- - **get_discussions**: Retrieve discussions from HF repositories
133
- - **get_discussion_details**: Get detailed information about specific discussions
134
- - **comment_on_discussion**: Add comments to discussions
135
- - **generate_ai_response**: Generate AI responses using HF Inference
136
- - **respond_to_discussion**: Generate and post AI responses automatically
137
-
138
- ### Running the MCP Server
139
-
140
- ```bash
141
- python mcp_server.py
142
- ```
143
-
144
- The MCP server uses stdio transport and can be integrated with MCP clients following the [Tiny Agents pattern](https://huggingface.co/blog/python-tiny-agents).
145
-
146
- ## 🧪 Testing
147
-
148
- ### Local Testing
149
- Use the "Simulate Webhook" tab in the Gradio dashboard to test without real webhooks.
150
-
151
- ### Webhook Testing
152
- You can test the webhook endpoint directly:
153
-
154
- ```bash
155
- curl -X POST http://localhost:8000/webhook \
156
- -H "Content-Type: application/json" \
157
- -H "X-Webhook-Secret: your-webhook-secret" \
158
- -d '{
159
- "event": {"action": "create", "scope": "discussion.comment"},
160
- "comment": {
161
- "content": "@discussion-bot How do I use this model?",
162
- "author": "test-user",
163
- "created_at": "2024-01-01T00:00:00Z"
164
- },
165
- "discussion": {
166
- "title": "Test Discussion",
167
- "num": 1,
168
- "url": {"api": "https://huggingface.co/api/repos/test/repo/discussions"}
169
- },
170
- "repo": {"name": "test/repo"}
171
- }'
172
- ```
173
-
174
- ## 🏗️ Architecture
175
-
176
- ```
177
- ┌─────────────────┐    ┌─────────────────┐    ┌─────────────────┐
178
- │   HF Hub        │───▶│   FastAPI       │───▶│  HF Inference   │
179
- │   Webhook       │    │   Server        │    │   API           │
180
- └─────────────────┘    └─────────────────┘    └─────────────────┘
181
-                                 │
182
-                                 ▼
183
-                        ┌─────────────────┐
184
-                        │   Gradio        │
185
-                        │   Dashboard     │
186
-                        └─────────────────┘
187
-                                 │
188
-                                 ▼
189
-                        ┌─────────────────┐
190
-                        │   MCP Server    │
191
-                        │   (Tools)       │
192
-                        └─────────────────┘
193
- ```
194
-
195
- ## 🔒 Security
196
-
197
- - Webhook secret verification prevents unauthorized requests
198
- - Environment variables keep sensitive data secure
199
- - CORS middleware configured for safe cross-origin requests
200
-
201
- ## 🚀 Deployment
202
-
203
- ### Using Docker (Recommended)
204
-
205
- ```dockerfile
206
- FROM python:3.11-slim
207
-
208
- WORKDIR /app
209
- COPY . .
210
- RUN pip install -e .
211
-
212
- EXPOSE 8000
213
- CMD ["python", "server.py"]
214
- ```
215
-
216
- ### Using Cloud Platforms
217
-
218
- The application can be deployed on:
219
- - **Hugging Face Spaces** (recommended for HF integration)
220
- - **Railway**
221
- - **Render**
222
- - **Heroku**
223
- - **AWS/GCP/Azure**
224
-
225
- ## 🤝 Contributing
226
-
227
- 1. Fork the repository
228
- 2. Create a feature branch
229
- 3. Make your changes
230
- 4. Add tests if applicable
231
- 5. Submit a pull request
232
-
233
- ## 📝 License
234
-
235
- This project is licensed under the MIT License.
236
-
237
- ## 🆘 Support
238
-
239
- If you encounter issues:
240
-
241
- 1. Check the Configuration tab in the dashboard
242
- 2. Verify your API keys are correct
243
- 3. Ensure webhook URL is accessible
244
- 4. Check the application logs
245
-
246
- For additional help, please open an issue in the repository.
247
-
248
- ## 🔗 Related Links
249
-
250
- - [Hugging Face Webhooks Guide](https://huggingface.co/docs/hub/en/webhooks-guide-discussion-bot)
251
- - [Hugging Face Hub Python Library](https://huggingface.co/docs/huggingface_hub/en/guides/community)
252
- - [Tiny Agents in Python Blog Post](https://huggingface.co/blog/python-tiny-agents)
253
- - [FastAPI Documentation](https://fastapi.tiangolo.com/)
254
- - [Gradio Documentation](https://gradio.app/)
255
- - [Model Context Protocol (MCP)](https://modelcontextprotocol.io/)
 
1
  ---
2
+ title: tag-a-repo bot
3
  emoji: 👀
4
  colorFrom: purple
5
  colorTo: yellow
 
10
  base_path: /gradio
11
  ---
12
 
13
+ # HF Tagging Bot
14
 
15
+ This bot adds tags to Hugging Face model repositories when those tags are mentioned in discussions.
16
 
17
+ ## How it works
18
 
19
+ 1. The bot listens to discussions on the HuggingFace Hub
20
+ 2. When a discussion comment is created, the bot checks it for tag mentions
21
+ 3. If a tag is mentioned, the bot adds the tag to the model repository via a PR
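
The three steps above can be sketched in a few lines of Python. This is a minimal illustration, not the bot's actual code: the function name `find_tags`, the small `KNOWN_TAGS` set, and the whole-word matching are assumptions made for the example.

```python
import re

# Hypothetical subset of tags for illustration; the real bot keeps its own list.
KNOWN_TAGS = {"pytorch", "tensorflow", "text-generation", "nlp"}

# A single tag token: lowercase letters, digits, underscores, hyphens.
TOKEN = re.compile(r"[a-z0-9_-]+")


def find_tags(text: str) -> list[str]:
    """Return tags mentioned in a discussion comment, sorted alphabetically."""
    lowered = text.lower()
    found = set()

    # Explicit form: "tag: pytorch" or "tags: pytorch, text-generation".
    for group in re.findall(r"\btags?:\s*([a-z0-9_,\s-]+)", lowered):
        for part in group.split(","):
            part = part.strip()
            # Keep only single-token entries so trailing prose is not a tag.
            if TOKEN.fullmatch(part):
                found.add(part)

    # Hashtag form: "#pytorch".
    found.update(re.findall(r"#([a-z0-9_-]+)", lowered))

    # Known tags in running text, matched as whole words only.
    for tag in KNOWN_TAGS:
        if re.search(rf"(?<![a-z0-9-]){re.escape(tag)}(?![a-z0-9-])", lowered):
            found.add(tag)

    return sorted(found)
```

Matching known tags as whole words is a design choice for this sketch: a plain substring test would also fire on short tags that happen to appear inside longer words.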
app.py CHANGED
@@ -1,6 +1,8 @@
1
  import os
 
 
2
  from datetime import datetime
3
- from typing import List, Dict, Any, Optional
4
 
5
  from fastapi import FastAPI, Request, BackgroundTasks
6
  from fastapi.middleware.cors import CORSMiddleware
@@ -16,14 +18,56 @@ load_dotenv()
16
  WEBHOOK_SECRET = os.getenv("WEBHOOK_SECRET", "your-webhook-secret")
17
  HF_TOKEN = os.getenv("HF_TOKEN")
18
  HF_MODEL = os.getenv("HF_MODEL", "microsoft/DialoGPT-medium")
19
- HF_PROVIDER = os.getenv("HF_PROVIDER", "huggingface")
 
 
20
 
21
- # Simple storage for processed comments
22
- comments_store: List[Dict[str, Any]] = []
23
 
24
  # Agent instance
25
  agent_instance: Optional[Agent] = None
26
 
27
 
28
  class WebhookEvent(BaseModel):
29
  event: Dict[str, str]
@@ -32,7 +76,7 @@ class WebhookEvent(BaseModel):
32
  repo: Dict[str, str]
33
 
34
 
35
- app = FastAPI(title="HF Discussion Bot")
36
  app.add_middleware(CORSMiddleware, allow_origins=["*"])
37
 
38
 
@@ -42,12 +86,17 @@ async def get_agent():
42
  if agent_instance is None and HF_TOKEN:
43
  agent_instance = Agent(
44
  model=HF_MODEL,
45
- provider=HF_PROVIDER,
46
  api_key=HF_TOKEN,
47
  servers=[
48
  {
49
  "type": "stdio",
50
- "config": {"command": "python", "args": ["mcp_server.py"]},
 
 
 
 
 
51
  }
52
  ],
53
  )
@@ -55,45 +104,129 @@ async def get_agent():
55
  return agent_instance
56
 
57
 
58
  async def process_webhook_comment(webhook_data: Dict[str, Any]):
59
- """Process webhook using Agent with MCP tools"""
60
  comment_content = webhook_data["comment"]["content"]
61
  discussion_title = webhook_data["discussion"]["title"]
62
  repo_name = webhook_data["repo"]["name"]
63
  discussion_num = webhook_data["discussion"]["num"]
 
64
 
65
- agent = await get_agent()
66
- if not agent:
67
- ai_response = "Error: Agent not configured (missing HF_TOKEN)"
68
- else:
69
- # Use Agent to respond to the discussion
70
- prompt = f"""
71
- Please respond to this HuggingFace discussion comment using the available tools.
72
-
73
- Repository: {repo_name}
74
- Discussion: {discussion_title} (#{discussion_num})
75
- Comment: {comment_content}
76
-
77
- First use generate_discussion_response to create a helpful response, then use post_discussion_comment to post it.
78
- """
79
-
80
- try:
81
- response_parts = []
82
- async for item in agent.run(prompt):
83
- # Collect the agent's response
84
- if hasattr(item, "content") and item.content:
85
- response_parts.append(item.content)
86
- elif isinstance(item, str):
87
- response_parts.append(item)
88
-
89
- ai_response = (
90
- " ".join(response_parts) if response_parts else "No response generated"
91
- )
92
- except Exception as e:
93
- ai_response = f"Error using agent: {str(e)}"
94
 
95
- # Store the interaction with reply link
96
- discussion_url = f"https://huggingface.co/{repo_name}/discussions/{discussion_num}"
97
 
98
  interaction = {
99
  "timestamp": datetime.now().isoformat(),
@@ -102,12 +235,13 @@ async def process_webhook_comment(webhook_data: Dict[str, Any]):
102
  "discussion_num": discussion_num,
103
  "discussion_url": discussion_url,
104
  "original_comment": comment_content,
105
- "ai_response": ai_response,
106
- "comment_author": webhook_data["comment"]["author"],
 
107
  }
108
 
109
- comments_store.append(interaction)
110
- return ai_response
111
 
112
 
113
  @app.post("/webhook")
@@ -120,7 +254,8 @@ async def webhook_handler(request: Request, background_tasks: BackgroundTasks):
120
  payload = await request.json()
121
  event = payload.get("event", {})
122
 
123
- if event.get("action") == "create" and event.get("scope") == "discussion.comment":
 
124
  background_tasks.add_task(process_webhook_comment, payload)
125
  return {"status": "processing"}
126
 
@@ -143,40 +278,64 @@ async def simulate_webhook(
143
  },
144
  "discussion": {
145
  "title": discussion_title,
146
- "num": len(comments_store) + 1,
147
  },
148
  "repo": {"name": repo_name},
149
  }
150
 
151
  response = await process_webhook_comment(mock_payload)
152
- return f"✅ Processed! AI Response: {response}"
153
 
154
 
155
  def create_gradio_app():
156
  """Create Gradio interface"""
157
- with gr.Blocks(title="HF Discussion Bot", theme=gr.themes.Soft()) as demo:
158
- gr.Markdown("# 🤖 HF Discussion Bot Dashboard")
159
- gr.Markdown("*Powered by HuggingFace Tiny Agents + FastMCP*")
160
 
161
  with gr.Column():
162
- sim_repo = gr.Textbox(label="Repository", value="microsoft/DialoGPT-medium")
163
- sim_title = gr.Textbox(label="Discussion Title", value="Test Discussion")
 
 
 
 
 
 
 
 
164
  sim_comment = gr.Textbox(
165
  label="Comment",
166
  lines=3,
167
- value="How do I use this model?",
 
168
  )
169
- sim_btn = gr.Button("📤 Test Webhook")
170
 
171
  with gr.Column():
172
  sim_result = gr.Textbox(label="Result", lines=8)
173
 
174
  sim_btn.click(
175
- fn=simulate_webhook,
176
  inputs=[sim_repo, sim_title, sim_comment],
177
- outputs=[sim_result],
178
  )
179
 
 
 
 
 
 
180
  return demo
181
 
182
 
@@ -186,7 +345,7 @@ app = gr.mount_gradio_app(app, gradio_app, path="/gradio")
186
 
187
 
188
  if __name__ == "__main__":
189
- print("🚀 Starting HF Discussion Bot with Tiny Agents...")
190
- print("📊 Dashboard: http://localhost:7860")
191
  print("🔗 Webhook: http://localhost:7860/webhook")
192
  uvicorn.run("app:app", host="0.0.0.0", port=7860, reload=True)
 
1
  import os
2
+ import re
3
+ import json
4
  from datetime import datetime
5
+ from typing import List, Dict, Any, Optional, Literal
6
 
7
  from fastapi import FastAPI, Request, BackgroundTasks
8
  from fastapi.middleware.cors import CORSMiddleware
 
18
  WEBHOOK_SECRET = os.getenv("WEBHOOK_SECRET", "your-webhook-secret")
19
  HF_TOKEN = os.getenv("HF_TOKEN")
20
  HF_MODEL = os.getenv("HF_MODEL", "microsoft/DialoGPT-medium")
21
+ # Use a valid provider literal from the documentation
22
+ DEFAULT_PROVIDER: Literal["hf-inference"] = "hf-inference"
23
+ HF_PROVIDER = os.getenv("HF_PROVIDER", DEFAULT_PROVIDER)
24
 
25
+ # Simple storage for processed tag operations
26
+ tag_operations_store: List[Dict[str, Any]] = []
27
 
28
  # Agent instance
29
  agent_instance: Optional[Agent] = None
30
 
31
+ # Common ML tags that we recognize for auto-tagging
32
+ RECOGNIZED_TAGS = {
33
+ "pytorch",
34
+ "tensorflow",
35
+ "jax",
36
+ "transformers",
37
+ "diffusers",
38
+ "text-generation",
39
+ "text-classification",
40
+ "question-answering",
41
+ "text-to-image",
42
+ "image-classification",
43
+ "object-detection",
44
+ "conversational",
45
+ "fill-mask",
46
+ "token-classification",
47
+ "translation",
48
+ "summarization",
49
+ "feature-extraction",
50
+ "sentence-similarity",
51
+ "zero-shot-classification",
52
+ "image-to-text",
53
+ "automatic-speech-recognition",
54
+ "audio-classification",
55
+ "voice-activity-detection",
56
+ "depth-estimation",
57
+ "image-segmentation",
58
+ "video-classification",
59
+ "reinforcement-learning",
60
+ "tabular-classification",
61
+ "tabular-regression",
62
+ "time-series-forecasting",
63
+ "graph-ml",
64
+ "robotics",
65
+ "computer-vision",
66
+ "nlp",
67
+ "cv",
68
+ "multimodal",
69
+ }
70
+
71
 
72
  class WebhookEvent(BaseModel):
73
  event: Dict[str, str]
 
76
  repo: Dict[str, str]
77
 
78
 
79
+ app = FastAPI(title="HF Tagging Bot")
80
  app.add_middleware(CORSMiddleware, allow_origins=["*"])
81
 
82
 
 
86
  if agent_instance is None and HF_TOKEN:
87
  agent_instance = Agent(
88
  model=HF_MODEL,
89
+ provider=DEFAULT_PROVIDER,
90
  api_key=HF_TOKEN,
91
  servers=[
92
  {
93
  "type": "stdio",
94
+ "config": {
95
+ "command": "python",
96
+ "args": ["mcp_server.py"],
97
+ "cwd": ".", # Ensure correct working directory
98
+ "env": {"HF_TOKEN": HF_TOKEN} if HF_TOKEN else {},
99
+ },
100
  }
101
  ],
102
  )
 
104
  return agent_instance
105
 
106
 
107
+ def extract_tags_from_text(text: str) -> List[str]:
108
+ """Extract potential tags from discussion text"""
109
+ text_lower = text.lower()
110
+
111
+ # Look for explicit tag mentions like "tag: pytorch" or "#pytorch"
112
+ explicit_tags = []
113
+
114
+ # Pattern 1: "tag: something" or "tags: something"
115
+ tag_pattern = r"tags?:\s*([a-zA-Z0-9_,\s-]+)"
116
+ matches = re.findall(tag_pattern, text_lower)
117
+ for match in matches:
118
+ # Split by comma and clean up
119
+ tags = [tag.strip() for tag in match.split(",")]
120
+ explicit_tags.extend(tags)
121
+
122
+ # Pattern 2: "#hashtag" style
123
+ hashtag_pattern = r"#([a-zA-Z0-9_-]+)"
124
+ hashtag_matches = re.findall(hashtag_pattern, text_lower)
125
+ explicit_tags.extend(hashtag_matches)
126
+
127
+ # Pattern 3: Look for recognized tags mentioned in natural text
128
+ mentioned_tags = []
129
+ for tag in RECOGNIZED_TAGS:
130
+ if tag in text_lower:
131
+ mentioned_tags.append(tag)
132
+
133
+ # Combine and deduplicate
134
+ all_tags = list(set(explicit_tags + mentioned_tags))
135
+
136
+ # Filter to only include recognized tags or explicitly mentioned ones
137
+ valid_tags = []
138
+ for tag in all_tags:
139
+ if tag in RECOGNIZED_TAGS or tag in explicit_tags:
140
+ valid_tags.append(tag)
141
+
142
+ return valid_tags
143
+
144
+
145
  async def process_webhook_comment(webhook_data: Dict[str, Any]):
146
+ """Process webhook to detect and add tags"""
147
  comment_content = webhook_data["comment"]["content"]
148
  discussion_title = webhook_data["discussion"]["title"]
149
  repo_name = webhook_data["repo"]["name"]
150
  discussion_num = webhook_data["discussion"]["num"]
151
+ comment_author = webhook_data["comment"]["author"]
152
 
153
+ # Extract potential tags from the comment and discussion title
154
+ comment_tags = extract_tags_from_text(comment_content)
155
+ title_tags = extract_tags_from_text(discussion_title)
156
+ all_tags = list(set(comment_tags + title_tags))
157
+
158
+ result_messages = []
159
 
160
+ if not all_tags:
161
+ result_messages.append("No recognizable tags found in the discussion.")
162
+ else:
163
+ agent = await get_agent()
164
+ if not agent:
165
+ msg = "Error: Agent not configured (missing HF_TOKEN)"
166
+ result_messages.append(msg)
167
+ else:
168
+ # Process each tag
169
+ for tag in all_tags:
170
+ try:
171
+ # Get response from agent
172
+ responses = []
173
+ prompt = (
174
+ f"Add the tag '{tag}' to repository {repo_name} "
175
+ "using add_new_tag"
176
+ )
177
+
178
+ async for item in agent.run(prompt):
179
+ # Just collect the response content
180
+ responses.append(str(item))
181
+
182
+ response_text = " ".join(responses) if responses else "Completed"
183
+
184
+ # Try to parse JSON from response if possible
185
+ try:
186
+ # Look for JSON in the response
187
+ json_found = False
188
+ for response_part in responses:
189
+ response_str = str(response_part)
190
+ if "{" in response_str and "}" in response_str:
191
+ # Try to extract JSON from the response
192
+ start_idx = response_str.find("{")
193
+ end_idx = response_str.rfind("}") + 1
194
+ json_str = response_str[start_idx:end_idx]
195
+
196
+ try:
197
+ json_response = json.loads(json_str)
198
+ status = json_response.get("status")
199
+ if status == "success":
200
+ pr_url = json_response.get("pr_url", "")
201
+ msg = f"Tag '{tag}': PR created - {pr_url}"
202
+ elif status == "already_exists":
203
+ msg = f"Tag '{tag}': Already exists"
204
+ else:
205
+ tag_msg = json_response.get(
206
+ "message", "Processed"
207
+ )
208
+ msg = f"Tag '{tag}': {tag_msg}"
209
+ json_found = True
210
+ break
211
+ except json.JSONDecodeError:
212
+ continue
213
+
214
+ if not json_found:
215
+ # If no JSON found, use the response as is
216
+ msg = f"Tag '{tag}': {response_text}"
217
+
218
+ except Exception:
219
+ msg = f"Tag '{tag}': Response parse error - {response_text}"
220
+
221
+ result_messages.append(msg)
222
+
223
+ except Exception as e:
224
+ error_msg = f"Error processing tag '{tag}': {str(e)}"
225
+ result_messages.append(error_msg)
226
+
227
+ # Store the interaction
228
+ base_url = "https://huggingface.co"
229
+ discussion_url = f"{base_url}/{repo_name}/discussions/{discussion_num}"
230
 
231
  interaction = {
232
  "timestamp": datetime.now().isoformat(),
 
235
  "discussion_num": discussion_num,
236
  "discussion_url": discussion_url,
237
  "original_comment": comment_content,
238
+ "comment_author": comment_author,
239
+ "detected_tags": all_tags,
240
+ "results": result_messages,
241
  }
242
 
243
+ tag_operations_store.append(interaction)
244
+ return " | ".join(result_messages)
245
 
246
 
247
  @app.post("/webhook")
 
254
  payload = await request.json()
255
  event = payload.get("event", {})
256
 
257
+ scope_check = event.get("scope") == "discussion.comment"
258
+ if event.get("action") == "create" and scope_check:
259
  background_tasks.add_task(process_webhook_comment, payload)
260
  return {"status": "processing"}
261
 
 
278
  },
279
  "discussion": {
280
  "title": discussion_title,
281
+ "num": len(tag_operations_store) + 1,
282
  },
283
  "repo": {"name": repo_name},
284
  }
285
 
286
  response = await process_webhook_comment(mock_payload)
287
+ return f"✅ Processed! Results: {response}"
288
 
289
 
290
  def create_gradio_app():
291
  """Create Gradio interface"""
292
+ with gr.Blocks(title="HF Tagging Bot", theme=gr.themes.Soft()) as demo:
293
+ gr.Markdown("# 🏷️ HF Tagging Bot Dashboard")
294
+ gr.Markdown("*Automatically adds tags to models when mentioned in discussions*")
295
+
296
+ gr.Markdown("""
297
+ ## How it works:
298
+ - Monitors HuggingFace Hub discussions
299
+ - Detects tag mentions in comments (e.g., "tag: pytorch",
300
+ "#transformers")
301
+ - Automatically adds recognized tags to the model repository
302
+ - Supports common ML tags like: pytorch, tensorflow,
303
+ text-generation, etc.
304
+ """)
305
 
306
  with gr.Column():
307
+ sim_repo = gr.Textbox(
308
+ label="Repository",
309
+ value="burtenshaw/play-mcp-repo-bot",
310
+ placeholder="username/model-name",
311
+ )
312
+ sim_title = gr.Textbox(
313
+ label="Discussion Title",
314
+ value="Add pytorch tag",
315
+ placeholder="Discussion title",
316
+ )
317
  sim_comment = gr.Textbox(
318
  label="Comment",
319
  lines=3,
320
+ value="This model should have tags: pytorch, text-generation",
321
+ placeholder="Comment mentioning tags...",
322
  )
323
+ sim_btn = gr.Button("🏷️ Test Tag Detection")
324
 
325
  with gr.Column():
326
  sim_result = gr.Textbox(label="Result", lines=8)
327
 
328
  sim_btn.click(
329
+ simulate_webhook,
330
  inputs=[sim_repo, sim_title, sim_comment],
331
+ outputs=sim_result,
332
  )
333
 
334
+ gr.Markdown(f"""
335
+ ## Recognized Tags:
336
+ {", ".join(sorted(RECOGNIZED_TAGS))}
337
+ """)
338
+
339
  return demo
340
 
341
 
 
345
 
346
 
347
  if __name__ == "__main__":
348
+ print("🚀 Starting HF Tagging Bot...")
349
+ print("📊 Dashboard: http://localhost:7860/gradio")
350
  print("🔗 Webhook: http://localhost:7860/webhook")
351
  uvicorn.run("app:app", host="0.0.0.0", port=7860, reload=True)
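
A note on the response parsing in `process_webhook_comment`: each chunk is sliced from its first `{` to its last `}` before `json.loads` is called, which fails when a chunk holds more than one object or stray braces after the object. One possible alternative, sketched here on the assumption that the tool result arrives as a JSON object embedded in free text, is to let `json.JSONDecoder.raw_decode` find where the object ends. The helper names `first_json_object` and `summarize` are hypothetical and not part of this commit.

```python
import json
from typing import Any, Optional


def first_json_object(text: str) -> Optional[dict[str, Any]]:
    """Return the first JSON object embedded in text, or None if there is none."""
    decoder = json.JSONDecoder()
    start = text.find("{")
    while start != -1:
        try:
            # raw_decode parses one JSON value beginning at `start`
            # and ignores whatever follows it.
            value, _end = decoder.raw_decode(text, start)
        except json.JSONDecodeError:
            value = None
        if isinstance(value, dict):
            return value
        # Not a valid object here; try the next opening brace.
        start = text.find("{", start + 1)
    return None


def summarize(tag: str, response_text: str) -> str:
    """Build a one-line status message in the style of the bot's results."""
    payload = first_json_object(response_text)
    if payload is None:
        # No JSON found: fall back to the raw response text.
        return f"Tag '{tag}': {response_text}"
    status = payload.get("status")
    if status == "success":
        return f"Tag '{tag}': PR created - {payload.get('pr_url', '')}"
    if status == "already_exists":
        return f"Tag '{tag}': Already exists"
    return f"Tag '{tag}': {payload.get('message', 'Processed')}"
```

The status values and message formats mirror the ones in the diff above; only the way the JSON object is located differs.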
mcp_server.py CHANGED
@@ -1,80 +1,144 @@
1
  #!/usr/bin/env python3
2
  """
3
- Simplified MCP Server for HuggingFace Hub Operations using FastMCP
4
  """
5
 
6
  import os
 
7
  from fastmcp import FastMCP
8
- from huggingface_hub import comment_discussion, InferenceClient
 
9
  from dotenv import load_dotenv
10
 
11
  load_dotenv()
12
 
13
  # Configuration
14
  HF_TOKEN = os.getenv("HF_TOKEN")
15
- DEFAULT_MODEL = os.getenv("HF_MODEL", "Qwen/Qwen2.5-72B-Instruct")
16
 
17
- # Initialize HF client
18
- inference_client = (
19
- InferenceClient(model=DEFAULT_MODEL, token=HF_TOKEN) if HF_TOKEN else None
20
- )
21
 
22
  # Create the FastMCP server
23
- mcp = FastMCP("hf-discussion-bot")
24
 
25
 
26
  @mcp.tool()
27
- def generate_discussion_response(
28
- discussion_title: str, comment_content: str, repo_name: str
29
- ) -> str:
30
- """Generate AI response for a HuggingFace discussion comment"""
31
- if not inference_client:
32
- return "Error: HF token not configured for inference"
33
-
34
- prompt = f"""
35
- Discussion: {discussion_title}
36
- Repository: {repo_name}
37
- Comment: {comment_content}
38
-
39
- Provide a helpful response to this comment.
40
- """
41
 
42
  try:
43
- messages = [
44
- {
45
- "role": "system",
46
- "content": ("You are a helpful AI assistant for ML discussions."),
47
- },
48
- {"role": "user", "content": prompt},
49
- ]
50
-
51
- response = inference_client.chat_completion(messages=messages, max_tokens=150)
52
- content = response.choices[0].message.content
53
- ai_response = content.strip() if content else "No response generated"
54
- return ai_response
55
 
56
  except Exception as e:
57
- return f"Error generating response: {str(e)}"
 
58
 
59
 
60
  @mcp.tool()
61
- def post_discussion_comment(repo_id: str, discussion_num: int, comment: str) -> str:
62
- """Post a comment to a HuggingFace discussion"""
63
- if not HF_TOKEN:
64
- return "Error: HF token not configured"
65
 
66
  try:
67
- comment_discussion(
68
  repo_id=repo_id,
69
- discussion_num=discussion_num,
70
- comment=comment,
 
 
 
 
 
71
  token=HF_TOKEN,
 
72
  )
73
- success_msg = f"Successfully posted comment to discussion #{discussion_num}"
74
- return success_msg
75
 
76
  except Exception as e:
77
- return f"Error posting comment: {str(e)}"
 
 
 
 
 
 
78
 
79
 
80
  if __name__ == "__main__":
 
1
  #!/usr/bin/env python3
2
  """
3
+ Simplified MCP Server for HuggingFace Hub Tagging Operations using FastMCP
4
  """
5
 
6
  import os
7
+ import json
8
  from fastmcp import FastMCP
9
+ from huggingface_hub import HfApi, model_info, ModelCard, ModelCardData
10
+ from huggingface_hub.utils import HfHubHTTPError
11
  from dotenv import load_dotenv
12
 
13
  load_dotenv()
14
 
15
  # Configuration
16
  HF_TOKEN = os.getenv("HF_TOKEN")
 
17
 
18
+ # Initialize HF API client
19
+ hf_api = HfApi(token=HF_TOKEN) if HF_TOKEN else None
 
 
20
 
21
  # Create the FastMCP server
22
+ mcp = FastMCP("hf-tagging-bot")
23
 
24
 
25
  @mcp.tool()
26
+ def get_current_tags(repo_id: str) -> str:
27
+ """Get current tags from a HuggingFace model repository"""
28
+ if not hf_api:
29
+ return json.dumps({"error": "HF token not configured"})
30
 
31
  try:
32
+ info = model_info(repo_id=repo_id, token=HF_TOKEN)
33
+ current_tags = info.tags if info.tags else []
34
+
35
+ result = {
36
+ "status": "success",
37
+ "repo_id": repo_id,
38
+ "current_tags": current_tags,
39
+ "count": len(current_tags),
40
+ }
41
+ return json.dumps(result)
 
 
42
 
43
  except Exception as e:
44
+ error_result = {"status": "error", "repo_id": repo_id, "error": str(e)}
45
+ return json.dumps(error_result)
46
 
47
 
48
  @mcp.tool()
49
+ def add_new_tag(repo_id: str, new_tag: str) -> str:
50
+ """Add a new tag to a HuggingFace model repository via PR"""
51
+ if not hf_api:
52
+ return json.dumps({"error": "HF token not configured"})
53
 
54
  try:
55
+ # Get current model info and tags
56
+ info = model_info(repo_id=repo_id, token=HF_TOKEN)
57
+ current_tags = info.tags if info.tags else []
58
+
59
+ # Check if tag already exists
60
+ if new_tag in current_tags:
61
+ result = {
62
+ "status": "already_exists",
63
+ "repo_id": repo_id,
64
+ "tag": new_tag,
65
+ "message": f"Tag '{new_tag}' already exists",
66
+ }
67
+ return json.dumps(result)
68
+
69
+ # Add the new tag to existing tags
70
+ updated_tags = current_tags + [new_tag]
71
+
72
+ # Create model card content with updated tags
73
+ try:
74
+ # Load existing model card
75
+ card = ModelCard.load(repo_id, token=HF_TOKEN)
76
+ if not hasattr(card, "data") or card.data is None:
77
+ card.data = ModelCardData()
78
+ except HfHubHTTPError:
79
+ # Create new model card if none exists
80
+ card = ModelCard("")
81
+ card.data = ModelCardData()
82
+
83
+ # Update tags - create new ModelCardData with updated tags
84
+ card_dict = card.data.to_dict()
85
+ card_dict["tags"] = updated_tags
86
+ card.data = ModelCardData(**card_dict)
87
+
88
+ # Create a pull request with the updated model card
89
+ pr_title = f"Add '{new_tag}' tag"
90
+ pr_description = f"""
91
+ ## Add tag: {new_tag}
92
+
93
+ This PR adds the `{new_tag}` tag to the model repository.
94
+
95
+ **Changes:**
96
+ - Added `{new_tag}` to model tags
97
+ - Updated from {len(current_tags)} to {len(updated_tags)} tags
98
+
99
+ **Current tags:** {", ".join(current_tags) if current_tags else "None"}
100
+ **New tags:** {", ".join(updated_tags)}
101
+ """
102
+
103
+ # Create commit with updated model card using CommitOperationAdd
104
+ from huggingface_hub import CommitOperationAdd
105
+
106
+ commit_info = hf_api.create_commit(
107
  repo_id=repo_id,
108
+ operations=[
109
+ CommitOperationAdd(
110
+ path_in_repo="README.md", path_or_fileobj=str(card).encode("utf-8")
111
+ )
112
+ ],
113
+ commit_message=pr_title,
114
+ commit_description=pr_description,
115
  token=HF_TOKEN,
116
+ create_pr=True,
117
  )
118
+
119
+ # Extract PR URL from commit info
120
+ pr_url_attr = getattr(commit_info, "pr_url", None)
121
+ pr_url = pr_url_attr if pr_url_attr else str(commit_info)
122
+
123
+ result = {
124
+ "status": "success",
125
+ "repo_id": repo_id,
126
+ "tag": new_tag,
127
+ "pr_url": pr_url,
128
+ "previous_tags": current_tags,
129
+ "new_tags": updated_tags,
130
+ "message": f"Created PR to add tag '{new_tag}'",
131
+ }
132
+ return json.dumps(result)
133
 
134
  except Exception as e:
135
+ error_result = {
136
+ "status": "error",
137
+ "repo_id": repo_id,
138
+ "tag": new_tag,
139
+ "error": str(e),
140
+ }
141
+ return json.dumps(error_result)
142
 
143
 
144
  if __name__ == "__main__":