---
title: HF RepoSense
colorFrom: blue
colorTo: purple
sdk: gradio
sdk_version: 5.32.1
app_file: app.py
pinned: false
short_description: AI-powered HuggingFace repository intelligence
tags:
- agent-demo-track
---

Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
# HF RepoSense: [Video demo](https://youtu.be/UqSRQy_8t-E)
HF accounts for all contributors:
- Naman Gupta: naman1102
- Surya Boddu: MLconArtist
- Lakshmi Girija Dhulipati: dlgirija
- Mohamed Ifreen Seyed Ibrahim: Mohamed-Ifreen

**AI-powered Hugging Face repository intelligence**

HF RepoSense is an AI system for discovering, analyzing, and evaluating Hugging Face repositories. It gathers your requirements through conversation, searches for relevant repositories, and analyzes them with an LLM to produce ranked, personalized recommendations.
## Features
- **AI Assistant**: Conversation-based repository discovery
- **Smart Search**: Automatic detection of repository IDs vs. keywords
- **Automated Analysis**: LLM-powered repository evaluation and ranking
- **Top 3 Selection**: The three most relevant repositories, selected by the LLM
- **Repository Explorer**: Interactive chat with repository contents
- **Requirements Extraction**: Automatic keyword extraction from conversations
- **Comprehensive Results**: Detailed analysis with strengths, weaknesses, and specialities
## Quick Start
### Prerequisites
- Python 3.8+
- OpenAI API key (for LLM analysis)
- Hugging Face access (for repository downloads)
### Installation
1. **Clone the repository**
   ```bash
   git clone <repository-url>
   cd HF-RepoSense
   ```
2. **Install dependencies**
   ```bash
   pip install -r requirements.txt
   ```
3. **Set up environment variables**
   ```bash
   export modal_api="your_openai_api_key"
   export base_url="your_openai_base_url"
   ```
4. **Run the application**
   ```bash
   python app.py
   ```
5. **Open your browser** to `http://localhost:7860`
## User Guide
### Using the AI Assistant (Recommended)
1. **Start a Conversation**
   - Navigate to the "AI Assistant" tab
   - Describe your project, for example: "I'm building a chatbot for customer service"
   - The AI will ask clarifying questions about your needs
2. **Automatic Discovery**
   - When the AI has enough information, it will automatically:
     - Extract relevant keywords from your conversation
     - Search for matching repositories
     - Analyze and rank them by relevance
3. **Review Results**
   - The interface automatically switches to the "Analysis & Results" tab
   - View the top 3 most relevant repositories
   - Browse all analyzed repositories with detailed insights
### Using Smart Search (Direct Input)
1. **Repository IDs**
   ```
   microsoft/DialoGPT-medium
   openai/whisper
   huggingface/transformers
   ```
2. **Keywords**
   ```
   text generation
   image classification
   sentiment analysis
   ```
3. **Mixed Input**
   - The system automatically detects the input type
   - Repository IDs (containing `/`) are processed directly
   - Keywords trigger an automatic repository search
### Analyzing Results
- **Top 3 Repositories**: The most relevant repositories for your requirements, selected by the LLM
- **Detailed Analysis**: Strengths, weaknesses, specialities, and relevance ratings
- **Quick Actions**: Click repository names to visit or explore them
- **Repository Explorer**: Deep dive into individual repositories with AI chat
### Repository Explorer
1. **Access Methods**:
   - Click "Open in Repo Explorer" from the repository actions
   - Manually enter a repository ID in the Repo Explorer tab
2. **Features**:
   - Automatic repository loading and analysis
   - Interactive chat about repository contents
   - File structure exploration
   - Code analysis and explanations
## Technical Architecture
### Core Components
```
app.py                 # Main Gradio interface and orchestration
├── analyzer.py        # Repository analysis and LLM processing
├── hf_utils.py        # Hugging Face API interactions
├── chatbot_page.py    # AI assistant conversation logic
└── repo_explorer.py   # Repository exploration interface
```
### Key Features Implementation
#### AI Assistant
- **System Prompt**: Focused on requirements gathering, not recommendations
- **Auto-Extraction**: Detects when the conversation is ready for keyword extraction
- **Smart Processing**: Converts natural language into actionable search queries
#### Smart Input Detection
```python
import re

def is_repo_id_format(text: str) -> bool:
    """Return True if at least half of the entries look like repository IDs."""
    # Entries are separated by newlines or commas; a '/' marks a repository ID.
    lines = [line.strip() for line in re.split(r'[\n,]+', text) if line.strip()]
    slash_count = sum(1 for line in lines if '/' in line)
    return slash_count >= len(lines) * 0.5
```
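Because the check uses a 50% threshold, mixed input is classified as a whole rather than entry by entry. The sketch below repeats the function and runs it on a few illustrative inputs (the sample strings are assumptions, not taken from the project). Blank input produces an empty entry list, so the comparison becomes `0 >= 0` and the function returns `True`; callers may want to reject blank input before calling it.

```python
import re

def is_repo_id_format(text: str) -> bool:
    lines = [line.strip() for line in re.split(r'[\n,]+', text) if line.strip()]
    slash_count = sum(1 for line in lines if '/' in line)
    return slash_count >= len(lines) * 0.5

# Illustrative inputs only.
samples = {
    "repo ids": "microsoft/DialoGPT-medium\nopenai/whisper",
    "keywords": "text generation, image classification",
    "half and half": "openai/whisper, sentiment analysis",
    "blank": "   ",
}
results = {label: is_repo_id_format(text) for label, text in samples.items()}
for label, value in results.items():
    print(f"{label}: {value}")
```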
#### LLM-Powered Repository Ranking
- **Model**: `Orion-zhen/Qwen2.5-Coder-7B-Instruct-AWQ`
- **Criteria**: Requirements matching, strengths, relevance rating, speciality alignment
- **Output**: JSON-formatted repository rankings
#### Analysis Pipeline
1. **Download**: Repository files (`.py`, `.md`, `.txt`)
2. **Combine**: Merge the files into a single analyzable document
3. **Analyze**: LLM evaluation of strengths, weaknesses, and specialities
4. **Rank**: Relevance scoring against the user's requirements
5. **Select**: Top 3 most relevant repositories
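
As an illustration of the Combine step, here is a minimal sketch that merges downloaded files into one document. The function name `combine_files` and the header format are hypothetical and are not taken from `analyzer.py`; only the three file extensions come from the pipeline description above.

```python
from pathlib import Path

def combine_files(root: str, extensions=(".py", ".md", ".txt")) -> str:
    """Concatenate matching files under root, each preceded by a path header."""
    parts = []
    for path in sorted(Path(root).rglob("*")):
        if path.is_file() and path.suffix.lower() in extensions:
            # errors="replace" keeps one badly encoded file from aborting the merge.
            text = path.read_text(encoding="utf-8", errors="replace")
            header = f"# ===== {path.relative_to(root).as_posix()} ====="
            parts.append(f"{header}\n{text}")
    return "\n\n".join(parts)
```

The per-file header lets the LLM, and anyone reading its answer, attribute a finding to a specific file.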
### Data Flow
```mermaid
graph TD
    A[User Input] --> B{Input Type?}
    B -->|Keywords| C[Repository Search]
    B -->|Repo IDs| D[Direct Processing]
    C --> E[Repository List]
    D --> E
    E --> F[Download & Analyze]
    F --> G[LLM Evaluation]
    G --> H[Ranking & Selection]
    H --> I[Results Display]
    I --> J[Repository Explorer]
```
### File Structure
```
HF-RepoSense/
├── app.py             # Main application
├── analyzer.py        # Repository analysis logic
├── hf_utils.py        # Hugging Face utilities
├── chatbot_page.py    # AI assistant functionality
├── repo_explorer.py   # Repository exploration
├── requirements.txt   # Python dependencies
├── README.md          # Documentation
├── repo_ids.csv       # Analysis results storage
└── repo_files/        # Temporary repository downloads
```
### Dependencies
```
gradio>=4.0.0            # Web interface framework
pandas>=1.5.0            # Data manipulation
regex>=2022.0.0          # Advanced regex operations
openai>=1.0.0            # LLM API access
huggingface_hub>=0.16.0  # HF repository access
requests>=2.28.0         # HTTP requests
```
### Environment Variables
| Variable | Description | Required |
|----------|-------------|----------|
| `modal_api` | OpenAI API key for LLM analysis | Yes |
| `base_url` | OpenAI API base URL | Yes |
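
Since both variables are required, it can help to fail fast at startup with a clear message. The following is a minimal sketch; the helper name `load_llm_settings` is hypothetical and not part of the project. The returned keys are chosen to match the `api_key` and `base_url` keyword arguments of the `openai.OpenAI` client constructor.

```python
import os

REQUIRED_VARS = ("modal_api", "base_url")

def load_llm_settings(env=None) -> dict:
    """Read the required variables and raise a clear error if any is missing."""
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError(
            "Missing required environment variables: " + ", ".join(missing)
        )
    return {"api_key": env["modal_api"], "base_url": env["base_url"]}
```

Passing a plain dictionary as `env` makes the check easy to exercise without touching the real environment.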
### LLM Integration
#### Analysis Prompt Structure
```python
ANALYSIS_PROMPT = """
Analyze this repository and provide:
1. Strengths and capabilities
2. Potential weaknesses or limitations
3. Primary speciality/use case
4. Relevance rating for: {user_requirements}
Return valid JSON with: strength, weaknesses, speciality, relevance rating
"""
```
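LLM replies often wrap the requested JSON in extra prose or code fences, so the reply usually needs defensive parsing. Below is a minimal sketch of one way to do that. The function name `parse_analysis` and the empty-string default for missing keys are assumptions; the four key names come from the prompt above.

```python
import json

EXPECTED_KEYS = ("strength", "weaknesses", "speciality", "relevance rating")

def parse_analysis(reply: str) -> dict:
    """Extract the JSON object from an LLM reply and normalize its keys."""
    # Take everything from the first '{' to the last '}' so that surrounding
    # prose or code fences are ignored.
    start, end = reply.find("{"), reply.rfind("}")
    if start == -1 or end <= start:
        raise ValueError("no JSON object found in LLM reply")
    data = json.loads(reply[start:end + 1])
    # Missing keys default to an empty string (an assumption of this sketch).
    return {key: data.get(key, "") for key in EXPECTED_KEYS}
```

`json.JSONDecodeError` is a subclass of `ValueError`, so a caller can catch `ValueError` to handle both a missing object and malformed JSON.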
#### Repository Ranking System
- **Input**: User requirements + repository analysis data
- **Processing**: LLM evaluates relevance and ranks repositories
- **Output**: Top 3 most relevant repositories in order
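
A ranking returned by an LLM can contain duplicates or repository IDs that were never analyzed. The sketch below shows one way to turn such a ranking into a clean top 3. The function `select_top_repos` and its filtering rules are hypothetical and are not the project's actual selection logic.

```python
def select_top_repos(ranked_ids, analyzed_ids, limit=3):
    """Keep the first `limit` ranked IDs that were actually analyzed, in order."""
    analyzed = set(analyzed_ids)
    seen, top = set(), []
    for repo_id in ranked_ids:
        repo_id = repo_id.strip()
        # Skip IDs the LLM invented and IDs already selected.
        if repo_id in analyzed and repo_id not in seen:
            seen.add(repo_id)
            top.append(repo_id)
        if len(top) == limit:
            break
    return top
```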
### UI Components
#### Modern Design Features
- **Gradient Backgrounds**: Linear gradients for visual appeal
- **Glassmorphism**: Backdrop blur effects for a modern look
- **Responsive Layout**: Adapts to different screen sizes
- **Interactive Elements**: Hover effects and smooth transitions
- **Modal System**: Repository action selection popups
#### Tab Organization
1. **AI Assistant**: Conversation-based discovery
2. **Smart Search**: Direct input processing
3. **Analysis & Results**: Comprehensive analysis display
4. **Repo Explorer**: Interactive repository exploration
### Advanced Features
#### Auto-Navigation
- Automatic tab switching based on workflow state
- Smooth scrolling to top on tab changes
- Progressive disclosure of information
#### Error Handling
- Graceful fallbacks for LLM failures
- CSV update retry mechanisms
- User-friendly error messages
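
A retry mechanism like the one listed for CSV updates can be sketched as a small wrapper. The helper `with_retries`, its defaults, and the choice to retry only on `OSError` are all assumptions made for illustration; the project's own retry logic is not shown in this README.

```python
import time

def with_retries(operation, attempts=3, delay_seconds=0.5, retry_on=(OSError,)):
    """Call operation(); on a listed error, wait and try again up to `attempts` times."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except retry_on as error:
            last_error = error
            if attempt < attempts:
                # Linear backoff: wait a little longer after each failure.
                time.sleep(delay_seconds * attempt)
    raise last_error
```

A CSV write would be passed in as a zero-argument callable, for example `with_retries(lambda: df.to_csv("repo_ids.csv", index=False))` with a pandas DataFrame `df`.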
#### Performance Optimizations
- Parallel processing for multiple repositories
- Progress tracking for long operations
- Efficient file caching and cleanup
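
Parallel analysis with progress tracking can be sketched with a thread pool from the standard library. The function `analyze_all`, the `analyze` callback, and the `on_progress` signature are hypothetical; the README does not say which concurrency mechanism the project uses.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed

def analyze_all(repo_ids, analyze, max_workers=4, on_progress=None):
    """Run analyze(repo_id) concurrently and collect results keyed by repo ID."""
    results = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {pool.submit(analyze, repo_id): repo_id for repo_id in repo_ids}
        for done, future in enumerate(as_completed(futures), start=1):
            repo_id = futures[future]
            try:
                results[repo_id] = future.result()
            except Exception as error:
                # One failing repository should not abort the whole batch.
                results[repo_id] = {"error": str(error)}
            if on_progress:
                on_progress(done, len(futures))
    return results
```

Threads suit this workload because downloading files and calling an LLM API are I/O-bound, so workers spend most of their time waiting on the network.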
## Configuration
### Customizing Analysis
- Modify `CHATBOT_SYSTEM_PROMPT` to change the assistant's behavior
- Adjust repository search limits in `search_top_spaces()`
- Configure analysis criteria in `get_top_relevant_repos()`
### Adding File Types
```python
# In analyzer.py
download_filtered_space_files(
    repo_id,
    local_dir="repo_files",
    file_extensions=['.py', '.md', '.txt', '.js', '.ts']  # Add more
)
```
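The body of `download_filtered_space_files` is not shown in this README. As a rough illustration of what an extension filter of this kind does, the sketch below filters a list of repository file names; the helper `filter_by_extension` is hypothetical. In practice the file names could come from `huggingface_hub.list_repo_files`.

```python
from pathlib import PurePosixPath

def filter_by_extension(filenames, file_extensions):
    """Return the file names whose suffix matches one of file_extensions."""
    wanted = {ext.lower() for ext in file_extensions}
    # Repository paths always use forward slashes, hence PurePosixPath.
    return [
        name for name in filenames
        if PurePosixPath(name).suffix.lower() in wanted
    ]
```

Matching is case-insensitive, so `README.MD` and `README.md` are treated alike, and dotfiles such as `.gitattributes` have no suffix and are skipped.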
## Contributing
1. Fork the repository
2. Create a feature branch
3. Implement your changes
4. Add tests if applicable
5. Submit a pull request
## License
This project is licensed under the MIT License; see the LICENSE file for details.
## Acknowledgments
- **Gradio**: For the amazing web interface framework
- **Hugging Face**: For the incredible repository ecosystem
- **OpenAI**: For powerful language model capabilities

---
<div align="center">
<p>Built with ❤️ for the open source community</p>
<p>Happy repository hunting!</p>
</div>