Angel committed · Commit 5373d2a · 1 Parent(s): 8b78e1d

Update README.md

README.md CHANGED
@@ -13,6 +13,30 @@ The process of fetching new YouTube videos and extracting their transcripts is a
- **Retrieval-Augmented Generation (RAG)**: The bot uses RAG to query the AI and retrieve information from video transcripts to answer user queries.
- **Bot Interaction**: A chatbot interface answers questions based on the YouTube video transcripts.

## How the Vector DB Works

- **Check for an existing vector DB**: The bot tries to get the existing collection named `"transcript_collection"`.
  - If the collection is not found, it creates a new one.
  - If the collection is found, it uses the existing one.
- **Document comparison**: The bot works out which chunks are new.
  - It gets all existing documents from the database.
  - It takes the new text chunks.
  - It compares the two to find the chunks that are not yet in the database.
- **Processing new content**: Only new chunks are embedded.
  - If no new content is found, the process stops because there is nothing to do.
  - If new content exists, embeddings are generated for the new chunks only.
- **Database update**: The new embeddings are stored alongside the old ones.
  - It takes the new embeddings.
  - It adds them to the existing vector database.
  - All previous data is kept while the new content is added.
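
The steps above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the project's actual code: the in-memory `store` dict stands in for the vector database, `fake_embed` is a placeholder for a real embedding model, and `get_or_create_collection`, `find_new_chunks`, and `update_collection` are hypothetical names. Only the collection name `transcript_collection` comes from the description above.

```python
from typing import Callable

# Hypothetical in-memory stand-in for a vector DB collection:
# maps each chunk's text to its embedding.
Collection = dict[str, list[float]]


def get_or_create_collection(store: dict[str, Collection], name: str) -> Collection:
    """Return the existing collection, or create an empty one if it is missing."""
    if name not in store:
        store[name] = {}
    return store[name]


def find_new_chunks(collection: Collection, chunks: list[str]) -> list[str]:
    """Keep only chunks that are not stored yet, preserving order and dropping repeats."""
    seen = set(collection)
    new_chunks = []
    for chunk in chunks:
        if chunk not in seen:
            seen.add(chunk)
            new_chunks.append(chunk)
    return new_chunks


def fake_embed(chunk: str) -> list[float]:
    """Placeholder embedding (assumption): a real model would be called here."""
    return [float(len(chunk)), float(sum(map(ord, chunk)) % 97)]


def update_collection(
    store: dict[str, Collection],
    chunks: list[str],
    embed: Callable[[str], list[float]] = fake_embed,
    name: str = "transcript_collection",
) -> list[str]:
    """Embed and store only the new chunks; return the chunks that were added."""
    collection = get_or_create_collection(store, name)
    new_chunks = find_new_chunks(collection, chunks)
    if not new_chunks:
        return []  # nothing new, so no embeddings are generated
    for chunk in new_chunks:
        collection[chunk] = embed(chunk)  # existing entries are left untouched
    return new_chunks
```

Calling `update_collection` twice on the same `store` only embeds chunks that were missing on the second call. Comparing chunks by their exact text is a design choice of this sketch; a real setup might compare IDs or content hashes instead.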

For example, suppose you have:

- An original DB with chunks A, B, C
- New text with chunks A, B, C, D, E

Only D and E are processed and added to the database. This makes the process more efficient, because embeddings are not regenerated for content that is already in the database.
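
To make the efficiency point concrete, the sketch below counts how many embedding calls each run makes for the A to E example. It is an illustration only: the `db` dict stands in for the stored chunks, and `CountingEmbedder` and `add_new_chunks` are hypothetical helpers, not part of the project.

```python
class CountingEmbedder:
    """Hypothetical embedder that records how many chunks it was asked to embed."""

    def __init__(self) -> None:
        self.calls = 0

    def __call__(self, chunk: str) -> list[float]:
        self.calls += 1
        return [float(len(chunk))]  # placeholder vector, not a real embedding


def add_new_chunks(db: dict[str, list[float]], chunks: list[str], embed) -> list[str]:
    """Embed and store only the chunks that are missing from db."""
    added = []
    for chunk in chunks:
        if chunk not in db:
            db[chunk] = embed(chunk)
            added.append(chunk)
    return added


embed = CountingEmbedder()
db: dict[str, list[float]] = {}

# First run: builds the original DB with chunks A, B, C.
add_new_chunks(db, ["A", "B", "C"], embed)
first_run_calls = embed.calls

# Second run: the new text contains A, B, C, D, E.
added = add_new_chunks(db, ["A", "B", "C", "D", "E"], embed)
second_run_calls = embed.calls - first_run_calls

print(added)             # ['D', 'E']
print(second_run_calls)  # 2
```

Re-embedding every chunk on the second run would have taken five calls instead of two.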

## Project Structure

```bash