---
title: 🧠 Personal AI Second Brain
emoji: 🤗
colorFrom: indigo
colorTo: purple
sdk: docker
app_port: 7860
pinned: true
license: mit
---

# 🧠 Personal AI Second Brain

A personal AI assistant that serves as your second brain, built with Hugging Face models, a Streamlit interface, and an optional Telegram bot. It stores your documents, conversations, and notes and retrieves them through a Retrieval-Augmented Generation (RAG) system.

## Features

- **Chat Interface**: Ask questions and get answers based on your personal knowledge base
- **Document Management**: Upload and process documents (PDF, TXT, DOC, etc.)
- **RAG System**: Retrieve relevant information from your knowledge base
- **Telegram Integration**: Access your second brain through Telegram
- **Persistent Chat History**: Store conversations in Hugging Face Datasets
- **Expandable**: Easy to add new data sources and functionality

## Architecture

The system is built with the following components:

1. **LLM Layer**: Uses Hugging Face models for text generation and embeddings
2. **Memory Layer**: Vector database (Qdrant) for storing and retrieving information
3. **RAG System**: Retrieval-Augmented Generation to ground answers in your data
4. **Ingestion Pipeline**: Process documents and chat history
5. **Telegram Bot**: Integration with Telegram for chat-based access
6. **Hugging Face Dataset**: Persistent storage for chat history
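
How the memory layer and the RAG system fit together can be illustrated with a minimal, dependency-free sketch. It is not the project's implementation: the bag-of-words `embed` function stands in for a real embedding model, the plain Python list stands in for Qdrant, and all function names are hypothetical.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy embedding: lowercase word counts (a stand-in for a real embedding model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, chunks, top_k=2):
    """Return the top_k chunks most similar to the query (the memory layer's job)."""
    query_vector = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(query_vector, embed(c)), reverse=True)
    return ranked[:top_k]


def build_prompt(query, context_chunks):
    """Ground the question in retrieved context before it is sent to the LLM."""
    context = "\n".join(f"- {chunk}" for chunk in context_chunks)
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}\nAnswer:"


if __name__ == "__main__":
    notes = [
        "The dentist appointment is on Friday at 3pm.",
        "Qdrant stores vectors on disk.",
        "Buy milk and eggs.",
    ]
    question = "When is my dentist appointment?"
    print(build_prompt(question, retrieve(question, notes, top_k=1)))
```

In a real deployment, the prompt returned by `build_prompt` would be passed to the model configured as `LLM_MODEL`.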

## Setup

### Requirements

- Python 3.8+
- Hugging Face account (for model access and hosting)
- Telegram account (for bot integration, optional)

### Installation

1. Clone the repository:
   ```
   git clone <repository-url>
   cd personal-ai-second-brain
   ```

2. Install dependencies:
   ```
   pip install -r requirements.txt
   ```

3. Create a `.env` file with your configuration:
   ```
   # API Keys
   HF_API_KEY=your_huggingface_api_key_here
   TELEGRAM_BOT_TOKEN=your_telegram_bot_token_here
   
   # LLM Configuration
   # Use a small model on Hugging Face Spaces
   LLM_MODEL=gpt2
   EMBEDDING_MODEL=sentence-transformers/all-MiniLM-L6-v2
   
   # Vector Database
   VECTOR_DB_PATH=./data/vector_db
   COLLECTION_NAME=personal_assistant
   
   # Application Settings
   DEFAULT_TEMPERATURE=0.7
   CHUNK_SIZE=512
   CHUNK_OVERLAP=128
   MAX_TOKENS=256
   
   # Telegram Bot Settings
   TELEGRAM_ENABLED=false
   # Comma-separated list of Telegram user IDs
   TELEGRAM_ALLOWED_USERS=
   
   # Hugging Face Dataset Settings
   # Your username/dataset-name
   HF_DATASET_NAME=username/second-brain-history
   CHAT_HISTORY_DIR=./data/chat_history
   # How often to sync history to HF (minutes)
   SYNC_INTERVAL=60
   ```

4. Create necessary directories:
   ```
   mkdir -p data/documents data/vector_db data/chat_history
   ```
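
To illustrate how settings such as `CHUNK_SIZE` and `CHUNK_OVERLAP` might be consumed, here is a minimal sketch. It assumes sizes are counted in characters (the application may count tokens instead), and `Settings`, `load_settings`, and `chunk_text` are hypothetical names, not the application's actual code.

```python
import os
from dataclasses import dataclass


@dataclass
class Settings:
    chunk_size: int
    chunk_overlap: int
    max_tokens: int
    temperature: float
    telegram_enabled: bool


def load_settings(env=None):
    """Read settings from a mapping (os.environ by default), falling back to the README's values."""
    env = os.environ if env is None else env
    return Settings(
        chunk_size=int(env.get("CHUNK_SIZE", "512")),
        chunk_overlap=int(env.get("CHUNK_OVERLAP", "128")),
        max_tokens=int(env.get("MAX_TOKENS", "256")),
        temperature=float(env.get("DEFAULT_TEMPERATURE", "0.7")),
        telegram_enabled=env.get("TELEGRAM_ENABLED", "false").strip().lower() == "true",
    )


def chunk_text(text, chunk_size, chunk_overlap):
    """Split text into windows of chunk_size characters.

    Consecutive windows share chunk_overlap characters, so a sentence cut at
    one boundary still appears whole in the neighbouring chunk.
    """
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be non-negative and smaller than chunk_size")
    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # The current window already reaches the end of the text
    return chunks
```

Under these assumptions, the default values (512 and 128) advance the window by 384 characters per chunk, so a 1,000-character document yields three chunks.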

### Running Locally

Start the application:
```
streamlit run app/ui/streamlit_app.py
```

### Deploying to Hugging Face Spaces

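1. Create a new Space on Hugging Face and choose the Docker SDK, matching the `sdk: docker` setting in this README's front matter
2. Upload the code to the Space
3. In the Space settings, add the values from your `.env` file; store `HF_API_KEY` and `TELEGRAM_BOT_TOKEN` as secrets rather than plain variables
4. The application starts automatically once the Space has been built and is served on port 7860 (`app_port`)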

## Telegram Bot Setup

1. Talk to [@BotFather](https://t.me/botfather) on Telegram
2. Use the `/newbot` command to create a new bot
3. Copy the bot token that BotFather returns and set it as `TELEGRAM_BOT_TOKEN` in your `.env` file
4. Set `TELEGRAM_ENABLED=true` in your `.env` file
5. To restrict access, talk to [@userinfobot](https://t.me/userinfobot) to find your Telegram user ID and add it to `TELEGRAM_ALLOWED_USERS`
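
The access restriction described above could be implemented along these lines. This is a sketch, not the bot's actual code: both function names are hypothetical, and it assumes that an empty `TELEGRAM_ALLOWED_USERS` value means access is unrestricted.

```python
def parse_allowed_users(raw):
    """Turn a comma-separated string of Telegram user IDs into a set of ints."""
    ids = set()
    for part in raw.split(","):
        part = part.strip()
        if part:  # Skip empty entries, e.g. from a trailing comma
            ids.add(int(part))
    return ids


def is_authorized(user_id, allowed):
    """Assumption: an empty allow-list leaves the bot open to everyone."""
    return not allowed or user_id in allowed
```

If the bot should instead reject everyone when no IDs are configured, `is_authorized` would return `user_id in allowed` only.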

### Telegram Commands

- **/start**: Start a conversation with the bot
- **/help**: Show available commands
- **/search**: Use `/search your query` to search your knowledge base
- **Direct messages**: Send any message to chat with your second brain
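
The command handling above can be sketched as a small parser and router. The names `parse_message` and `route` are hypothetical, and the fallback to the help text for unknown commands or an empty `/search` is an assumption rather than documented behaviour.

```python
def parse_message(text):
    """Split an incoming message into (command, argument).

    Messages that do not start with "/" are treated as plain chat.
    """
    text = text.strip()
    if not text.startswith("/"):
        return "chat", text
    parts = text.split(None, 1)
    # Telegram appends "@botname" to commands in group chats; drop it.
    command = parts[0][1:].split("@", 1)[0].lower()
    argument = parts[1].strip() if len(parts) > 1 else ""
    return command, argument


def route(text):
    """Return the name of the handler to call and the argument to pass to it."""
    command, argument = parse_message(text)
    if command in ("start", "help"):
        return command, ""
    if command == "search":
        # Assumption: /search without a query shows the help text
        return ("search", argument) if argument else ("help", "")
    if command == "chat":
        return "chat", argument
    return "help", ""  # Assumption: unknown commands fall back to help
```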

## Hugging Face Dataset Integration

To enable persistent chat history across deployments:

1. Create a private dataset repository on Hugging Face Hub
2. Set your API token in the `.env` file as `HF_API_KEY`
3. Set your dataset name as `HF_DATASET_NAME` (format: username/repo-name)
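
As an illustration of what persisting history locally before a sync might look like, here is a sketch that uses only the standard library. The JSON Lines layout, the per-day file naming, and the function names are assumptions, not the project's actual storage format.

```python
import json
from datetime import datetime, timedelta, timezone
from pathlib import Path


def append_turn(history_dir, role, content, now=None):
    """Append one chat turn as a JSON line to a per-day file and return its path."""
    now = now or datetime.now(timezone.utc)
    directory = Path(history_dir)
    directory.mkdir(parents=True, exist_ok=True)
    path = directory / f"{now:%Y-%m-%d}.jsonl"
    record = {"timestamp": now.isoformat(), "role": role, "content": content}
    with path.open("a", encoding="utf-8") as handle:
        handle.write(json.dumps(record, ensure_ascii=False) + "\n")
    return path


def sync_due(last_sync, now, interval_minutes=60):
    """True when no sync has happened yet or interval_minutes have passed since the last one."""
    return last_sync is None or now - last_sync >= timedelta(minutes=interval_minutes)
```

When `sync_due` returns true, the contents of `CHAT_HISTORY_DIR` could be pushed to the dataset repository, for example with `HfApi.upload_folder` from the `huggingface_hub` package and `repo_type="dataset"`.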

## Customization

### Using Different Models

You can change the models by updating the `.env` file:

```
LLM_MODEL=mistralai/Mistral-7B-Instruct-v0.2
EMBEDDING_MODEL=sentence-transformers/all-mpnet-base-v2
```
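
Changing `EMBEDDING_MODEL` also affects vectors that are already stored: `all-MiniLM-L6-v2` produces 384-dimensional vectors, while `all-mpnet-base-v2` produces 768-dimensional ones, so a collection built with one cannot be queried with the other. The sketch below shows a simple guard; the function name and the lookup table are hypothetical and not part of the application.

```python
# Output sizes as published on the models' Hugging Face model cards.
KNOWN_DIMENSIONS = {
    "sentence-transformers/all-MiniLM-L6-v2": 384,
    "sentence-transformers/all-mpnet-base-v2": 768,
}


def collection_matches(model_name, collection_dim, dimensions=KNOWN_DIMENSIONS):
    """True if an existing collection's vector size equals the model's output size."""
    if model_name not in dimensions:
        raise KeyError(f"Unknown embedding model: {model_name}")
    return dimensions[model_name] == collection_dim
```

Even when the sizes match, vectors from different models are not comparable, so re-ingesting your documents after switching the embedding model is the safe choice.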

### Adding Custom Tools

To add custom tools to your agent, modify the `app/core/agent.py` file to include additional functionality.
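
This README does not show the structure of `app/core/agent.py`, so the following is only a generic registry pattern that a custom tool could follow. Every name in it (`TOOLS`, `tool`, `run_tool`, `word_count`) is hypothetical.

```python
TOOLS = {}


def tool(name):
    """Decorator that registers a function under a tool name."""
    def decorator(func):
        TOOLS[name] = func
        return func
    return decorator


@tool("word_count")
def word_count(text):
    """Example tool: count the whitespace-separated words in the input."""
    return str(len(text.split()))


def run_tool(name, argument):
    """Look up a registered tool by name and call it with a single string argument."""
    if name not in TOOLS:
        raise ValueError(f"Unknown tool: {name}")
    return TOOLS[name](argument)
```

Keeping tools behind a name-to-function mapping lets the agent choose a tool by name without needing to know how it is implemented.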

## Roadmap

- [ ] Web search tool integration
- [ ] Calendar and email integration
- [ ] Voice interface
- [ ] Mobile app integration
- [ ] Fine-tuning for personalized responses

## Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

## License

This project is licensed under the MIT License - see the LICENSE file for details.

Created by [p3rc03](https://huggingface.co/p3rc03)