Pujan-Dev committed
Commit b4f755d · 1 Parent(s): 09c8783

feat: added files of img classifier and documented
docs/api_endpoints.md ADDED
@@ -0,0 +1,75 @@
+ # 🧩 API Endpoints
+ 
+ ### English (GPT-2) - `/text/`
+ 
+ | Endpoint | Method | Description |
+ | --------------------------------- | ------ | ----------------------------------------- |
+ | `/text/analyse` | POST | Classify raw English text |
+ | `/text/analyse-sentences` | POST | Sentence-by-sentence breakdown |
+ | `/text/analyse-sentance-file` | POST | Upload file, per-sentence breakdown |
+ | `/text/upload` | POST | Upload file for overall classification |
+ | `/text/health` | GET | Health check |
+ 
+ #### Example: Classify English text
+ 
+ ```bash
+ curl -X POST http://localhost:8000/text/analyse \
+   -H "Authorization: Bearer <SECRET_TOKEN>" \
+   -H "Content-Type: application/json" \
+   -d '{"text": "This is a sample text for analysis."}'
+ ```
+ 
+ **Response:**
+ ```json
+ {
+   "result": "AI-generated",
+   "perplexity": 55.67,
+   "ai_likelihood": 66.6
+ }
+ ```
+ 
+ #### Example: File upload
+ 
+ ```bash
+ curl -X POST http://localhost:8000/text/upload \
+   -H "Authorization: Bearer <SECRET_TOKEN>" \
+   -F 'file=@yourfile.txt;type=text/plain'
+ ```
+ 
+ ---
+ 
+ ### Nepali (SentencePiece) - `/NP/`
+ 
+ | Endpoint | Method | Description |
+ | --------------------------------- | ------ | ----------------------------------------- |
+ | `/NP/analyse` | POST | Classify Nepali text |
+ | `/NP/analyse-sentences` | POST | Sentence-by-sentence breakdown |
+ | `/NP/upload` | POST | Upload Nepali PDF for classification |
+ | `/NP/file-sentences-analyse` | POST | PDF upload, per-sentence breakdown |
+ | `/NP/health` | GET | Health check |
+ 
+ #### Example: Nepali text classification
+ 
+ ```bash
+ curl -X POST http://localhost:8000/NP/analyse \
+   -H "Authorization: Bearer <SECRET_TOKEN>" \
+   -H "Content-Type: application/json" \
+   -d '{"text": "यो उदाहरण वाक्य हो।"}'
+ ```
+ 
+ **Response:**
+ ```json
+ {
+   "label": "Human",
+   "confidence": 98.6
+ }
+ ```
+ 
+ #### Example: Nepali PDF upload
+ 
+ ```bash
+ curl -X POST http://localhost:8000/NP/upload \
+   -H "Authorization: Bearer <SECRET_TOKEN>" \
+   -F 'file=@NepaliText.pdf;type=application/pdf'
+ ```
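+ 
+ ---
+ 
+ ### Image classifier
+ 
+ This commit also adds an image classifier router with `POST /analyse` (file upload) and `GET /health` routes (see `features/image_classifier/routes.py`). A minimal sketch of an upload call, assuming the router is mounted at the app root — adjust the path to whatever prefix `app.py` actually uses:
+ 
+ ```bash
+ curl -X POST http://localhost:8000/analyse \
+   -H "Authorization: Bearer <SECRET_TOKEN>" \
+   -F 'file=@image.jpg;type=image/jpeg'
+ ```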
docs/deployment.md ADDED
@@ -0,0 +1,105 @@
+ # Deployment
+ 
+ This project is containerized and deployed on **Hugging Face Spaces** using a custom `Dockerfile`. This guide explains the structure of the Dockerfile and key considerations for deploying FastAPI apps on Spaces with the Docker SDK.
+ 
+ ---
+ 
+ ## 📦 Base Image
+ 
+ ```dockerfile
+ FROM python:3.9
+ ```
+ 
+ We use the official Python 3.9 image for compatibility and stability across most Python libraries and tools.
+ 
+ ---
+ 
+ ## 👤 Create a Non-Root User
+ 
+ ```dockerfile
+ RUN useradd -m -u 1000 user
+ USER user
+ ENV PATH="/home/user/.local/bin:$PATH"
+ ```
+ 
+ * Hugging Face Spaces **requires** that containers run as a non-root user with UID `1000`.
+ * We also prepend the user's local binary path to `PATH` so user-installed Python packages are found.
+ 
+ ---
+ 
+ ## 🗂️ Set Working Directory
+ 
+ ```dockerfile
+ WORKDIR /app
+ ```
+ 
+ All application files will reside under `/app` for consistency and clarity.
+ 
+ ---
+ 
+ ## 📋 Install Dependencies
+ 
+ ```dockerfile
+ COPY --chown=user ./requirements.txt requirements.txt
+ RUN pip install --no-cache-dir --upgrade -r requirements.txt
+ ```
+ 
+ * Copies the dependency list with correct file ownership.
+ * Uses `--no-cache-dir` to reduce image size.
+ * Uses `--upgrade` so the latest compatible versions are installed.
+ 
+ ---
+ 
+ ## 🔡 Download Language Model (Optional)
+ 
+ ```dockerfile
+ RUN python -m spacy download en_core_web_sm || echo "Failed to download model"
+ ```
+ 
+ * Downloads the small English NLP model required by SpaCy.
+ * Uses `|| echo ...` to prevent build failure if the download fails (optional safeguard).
+ 
+ ---
+ 
+ ## 📁 Copy Project Files
+ 
+ ```dockerfile
+ COPY --chown=user . /app
+ ```
+ 
+ Copies the entire project source into the container, setting correct ownership for Hugging Face's user-based execution.
+ 
+ ---
+ 
+ ## 🌐 Start the FastAPI Server
+ 
+ ```dockerfile
+ CMD ["uvicorn", "app:app", "--host", "0.0.0.0", "--port", "7860"]
+ ```
+ 
+ * Launches the FastAPI app using `uvicorn`.
+ * **Port 7860 is mandatory** for Docker-based Hugging Face Spaces deployments.
+ * `app:app` refers to the `FastAPI()` instance in `app.py`.
+ 
+ ---
+ 
+ ## ✅ Deployment Checklist
+ 
+ * [x] Ensure your main file is named `app.py`, or adjust `CMD` accordingly.
+ * [x] List all dependencies in `requirements.txt`.
+ * [x] If using models like SpaCy, verify they are downloaded or bundled.
+ * [x] Test your Dockerfile locally with `docker build` before pushing to Hugging Face (see the sketch below).
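+ 
+ A minimal local test, as a sketch — the image tag is arbitrary, and `--env-file .env` supplies the `SECRET_TOKEN` the API expects:
+ 
+ ```bash
+ docker build -t ai-text-detector .
+ docker run --rm -p 7860:7860 --env-file .env ai-text-detector
+ # In another shell, verify the container answers:
+ curl http://localhost:7860/text/health \
+   -H "Authorization: Bearer <SECRET_TOKEN>"
+ ```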
+ 
+ ---
+ 
+ ## 📚 References
+ 
+ * Hugging Face Docs: [Spaces Docker SDK](https://huggingface.co/docs/hub/spaces-sdks-docker)
+ * Uvicorn Docs: [https://www.uvicorn.org/](https://www.uvicorn.org/)
+ * SpaCy Models: [https://spacy.io/models](https://spacy.io/models)
+ 
+ ---
+ 
+ Happy deploying!
+ **P.S.** Try not to break stuff. 😅
docs/functions.md ADDED
@@ -0,0 +1,53 @@
+ # Major Functions Used
+ 
+ ## In Text Classifier (`features/text_classifier/` and `features/nepali_text_classifier/`)
+ 
+ - **`load_model()`**
+   Loads the GPT-2 model and tokenizer from the specified directory paths.
+ 
+ - **`lifespan()`**
+   Manages the application lifecycle. Initializes the model at startup and handles cleanup on shutdown.
+ 
+ - **`classify_text_sync()`**
+   Synchronously tokenizes input text and predicts using the GPT-2 model. Returns classification and perplexity.
+ 
+ - **`classify_text()`**
+   Asynchronously runs `classify_text_sync()` in a thread pool for non-blocking text classification.
+ 
+ - **`analyze_text()`**
+   **POST** endpoint: Accepts text input, classifies it using `classify_text()`, and returns the result with perplexity.
+ 
+ - **`health()`**
+   **GET** endpoint: Simple health check for API liveness.
+ 
+ - **`parse_docx()`, `parse_pdf()`, `parse_txt()`**
+   Utilities to extract and convert `.docx`, `.pdf`, and `.txt` file contents to plain text.
+ 
+ - **`warmup()`**
+   Downloads the model repository and initializes the model/tokenizer using `load_model()`.
+ 
+ - **`download_model_repo()`**
+   Downloads the model files into the designated `MODEL` folder.
+ 
+ - **`get_model_tokenizer()`**
+   Checks whether the model already exists; downloads it if not, otherwise loads the cached model.
+ 
+ - **`handle_file_upload()`**
+   Handles file uploads from the `/upload` route. Extracts text, classifies it, and returns results.
+ 
+ - **`extract_file_contents()`**
+   Extracts and returns plain text from uploaded files (PDF, DOCX, TXT).
+ 
+ - **`handle_file_sentence()`**
+   Processes file uploads by analyzing each sentence (under 10,000 chars) before classification.
+ 
+ - **`handle_sentence_level_analysis()`**
+   Checks and strips each sentence, then computes the AI/human likelihood for each.
+ 
+ - **`analyze_sentences()`**
+   Splits paragraphs into sentences, classifies each, and returns all results.
+ 
+ - **`analyze_sentence_file()`**
+   Like `handle_file_sentence()`, but analyzes sentences in uploaded files.
+ 
+ ## In Image Classifier (`features/image_classifier/`)
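+ 
+ - **`preprocess_image()`**
+   Reads the uploaded image bytes, decodes them with OpenCV, resizes to 256x256, converts BGR to RGB, normalizes to [0, 1], and adds a batch dimension.
+ 
+ - **`classify_image()`**
+   Runs the CNN on the preprocessed array and returns a label ("AI Generated", "Human Generated", or "Maybe AI") with AI/human confidence percentages.
+ 
+ - **`load_model()` / `warmup()` / `download_model_Repo()`**
+   Download the model repository from the Hugging Face Hub on first use and load the Keras model from `IMG_models/`.
+ 
+ - **`Classify_Image_router()`**
+   Controller used by the `/analyse` route: preprocesses the upload, classifies it, and wraps errors in a 400 response.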
docs/nestjs_integration.md ADDED
@@ -0,0 +1,82 @@
+ # NestJS + FastAPI
+ 
+ You can easily call this API from a NestJS microservice.
+ 
+ **.env**
+ ```env
+ FASTAPI_BASE_URL=http://localhost:8000
+ SECRET_TOKEN=your_secret_token_here
+ ```
+ 
+ **fastapi.service.ts**
+ 
+ ```typescript
+ import { Injectable } from "@nestjs/common";
+ import { HttpService } from "@nestjs/axios";
+ import { ConfigService } from "@nestjs/config";
+ import { firstValueFrom } from "rxjs";
+ 
+ @Injectable()
+ export class FastAPIService {
+   constructor(
+     private http: HttpService,
+     private config: ConfigService,
+   ) {}
+ 
+   async analyzeText(text: string) {
+     const url = `${this.config.get("FASTAPI_BASE_URL")}/text/analyse`;
+     const token = this.config.get("SECRET_TOKEN");
+ 
+     const response = await firstValueFrom(
+       this.http.post(
+         url,
+         { text },
+         {
+           headers: {
+             Authorization: `Bearer ${token}`,
+           },
+         },
+       ),
+     );
+ 
+     return response.data;
+   }
+ }
+ ```
+ 
+ **app.module.ts**
+ ```typescript
+ import { Module } from "@nestjs/common";
+ import { ConfigModule } from "@nestjs/config";
+ import { HttpModule } from "@nestjs/axios";
+ import { AppController } from "./app.controller";
+ import { FastAPIService } from "./fastapi.service";
+ 
+ @Module({
+   imports: [ConfigModule.forRoot(), HttpModule],
+   controllers: [AppController],
+   providers: [FastAPIService],
+ })
+ export class AppModule {}
+ ```
+ 
+ **app.controller.ts**
+ ```typescript
+ import { Body, Controller, Post, Get } from '@nestjs/common';
+ import { FastAPIService } from './fastapi.service';
+ 
+ @Controller()
+ export class AppController {
+   constructor(private readonly fastapiService: FastAPIService) {}
+ 
+   @Post('analyze-text')
+   async callFastAPI(@Body('text') text: string) {
+     return this.fastapiService.analyzeText(text);
+   }
+ 
+   @Get()
+   getHello(): string {
+     return 'NestJS is connected to FastAPI';
+   }
+ }
+ ```
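+ 
+ With both services running, the proxy route can be exercised end to end. A sketch, assuming NestJS listens on its default port 3000 — adjust if your `main.ts` uses another port:
+ 
+ ```bash
+ curl -X POST http://localhost:3000/analyze-text \
+   -H "Content-Type: application/json" \
+   -d '{"text": "This is a sample text for analysis."}'
+ ```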
docs/security.md ADDED
@@ -0,0 +1,9 @@
+ # Security: Bearer Token Auth
+ 
+ All endpoints require authentication via Bearer token:
+ 
+ - Set `SECRET_TOKEN` in `.env`
+ - Add header: `Authorization: Bearer <SECRET_TOKEN>`
+ 
+ Unauthorized requests receive `403 Forbidden`.
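+ 
+ For example, an authenticated call to the English classifier's health check:
+ 
+ ```bash
+ curl http://localhost:8000/text/health \
+   -H "Authorization: Bearer <SECRET_TOKEN>"
+ ```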
docs/setup.md ADDED
@@ -0,0 +1,23 @@
+ # Setup & Installation
+ 
+ ## 1. Clone the Repository
+ ```bash
+ git clone https://github.com/cyberalertnepal/aiapi
+ cd aiapi
+ ```
+ 
+ ## 2. Install Dependencies
+ ```bash
+ pip install -r requirements.txt
+ ```
+ 
+ ## 3. Configure Environment
+ Create a `.env` file:
+ ```env
+ SECRET_TOKEN=your_secret_token_here
+ ```
+ 
+ ## 4. Run the API
+ ```bash
+ uvicorn app:app --host 0.0.0.0 --port 8000
+ ```
docs/structure.md ADDED
@@ -0,0 +1,54 @@
+ ## 🏗️ Project Structure
+ 
+ ```
+ ├── app.py                        # Main FastAPI app entrypoint
+ ├── config.py                     # Configuration loader (.env, settings)
+ ├── features/
+ │   ├── text_classifier/          # English (GPT-2) classifier
+ │   │   ├── controller.py
+ │   │   ├── inferencer.py
+ │   │   ├── model_loader.py
+ │   │   ├── preprocess.py
+ │   │   └── routes.py
+ │   └── nepali_text_classifier/   # Nepali (SentencePiece) classifier
+ │       ├── controller.py
+ │       ├── inferencer.py
+ │       ├── model_loader.py
+ │       ├── preprocess.py
+ │       └── routes.py
+ ├── np_text_model/                # Nepali model artifacts (auto-downloaded)
+ │   ├── classifier/
+ │   │   └── sentencepiece.bpe.model
+ │   └── model_95_acc.pth
+ ├── models/                       # English GPT-2 model/tokenizer (auto-downloaded)
+ │   ├── merges.txt
+ │   ├── tokenizer.json
+ │   └── model_weights.pth
+ ├── Dockerfile                    # Container build config
+ ├── Procfile                      # Deployment entrypoint (for PaaS)
+ ├── requirements.txt              # Python dependencies
+ ├── README.md
+ ├── docs/                         # Documentation
+ └── .env                          # Secret token(s), environment config
+ ```
+ 
+ ### 🌟 Key Files and Their Roles
+ 
+ - **`app.py`**: Entry point initializing the FastAPI app and routes.
+ - **`Procfile`**: Tells Railway (or similar platforms) how to run the program.
+ - **`requirements.txt`**: Tracks all Python dependencies for the project.
+ - **`__init__.py`**: Package initializer for the root module and submodules.
+ - **`features/text_classifier/`**
+   - **`controller.py`**: Handles logic between routes and the model.
+   - **`inferencer.py`**: Runs inference and returns predictions; also provides file-system utilities.
+ - **`features/nepali_text_classifier/`**
+   - **`controller.py`**: Handles logic between routes and the model.
+   - **`inferencer.py`**: Runs inference and returns predictions; also provides file-system utilities.
+ - **`model_loader.py`**: Loads the ML model and tokenizer.
+ - **`preprocess.py`**: Prepares input text for the model.
+ - **`routes.py`**: Defines API routes for text classification.
+ 
+ - [Main](../README.md)
features/image_classifier/controller.py CHANGED
@@ -0,0 +1,11 @@
+ from fastapi import HTTPException, File, UploadFile
+ 
+ from .preprocess import preprocess_image
+ from .inferencer import classify_image
+ 
+ 
+ async def Classify_Image_router(file: UploadFile = File(...)):
+     # Preprocess the upload into a model-ready array, run the classifier,
+     # and surface any failure to the client as a 400 with the error message.
+     try:
+         image_array = preprocess_image(file)
+         result = classify_image(image_array)
+         return result
+     except Exception as e:
+         raise HTTPException(status_code=400, detail=str(e))
features/image_classifier/inferencer.py CHANGED
@@ -0,0 +1,22 @@
+ import numpy as np
+ 
+ from .model_loader import load_model
+ 
+ # Load the Keras model once at import time so every request reuses it.
+ model = load_model()
+ 
+ 
+ def classify_image(image: np.ndarray):
+     # predict() returns a batch of predictions; take the first (only) row.
+     predictions = model.predict(image)[0]
+     human_conf = float(predictions[0])
+     ai_conf = float(predictions[1])
+ 
+     # Label confidently only outside the 0.45–0.55 uncertainty band.
+     if ai_conf > 0.55:
+         label = "AI Generated"
+     elif ai_conf < 0.45:
+         label = "Human Generated"
+     else:
+         label = "Maybe AI"
+ 
+     return {
+         "label": label,
+         "ai_confidence": round(ai_conf * 100, 2),
+         "human_confidence": round(human_conf * 100, 2)
+     }
features/image_classifier/model_loader.py CHANGED
@@ -0,0 +1,32 @@
+ import os
+ import shutil
+ 
+ from tensorflow.keras.models import load_model as keras_load_model
+ from huggingface_hub import snapshot_download
+ 
+ # Constants
+ REPO_ID = "can-org/AI-VS-HUMAN-IMAGE-classifier"
+ MODEL_DIR = "./IMG_models"
+ MODEL_PATH = os.path.join(MODEL_DIR, 'latest-my_cnn_model.h5')
+ 
+ 
+ def warmup():
+     # Download the model repo on first run, then cache the loaded model.
+     global _model_img
+     if not os.path.exists(MODEL_DIR):
+         download_model_Repo()
+     _model_img = load_model()
+ 
+ 
+ def download_model_Repo():
+     # Skip the download if the model directory already exists.
+     if os.path.exists(MODEL_DIR):
+         return
+     # Fetch a snapshot of the repo from the Hugging Face Hub and copy its
+     # contents into MODEL_DIR (dirs_exist_ok allows existing directories).
+     snapshot_path = snapshot_download(repo_id=REPO_ID)
+     os.makedirs(MODEL_DIR, exist_ok=True)
+     shutil.copytree(snapshot_path, MODEL_DIR, dirs_exist_ok=True)
+ 
+ 
+ def load_model():
+     if not os.path.exists(MODEL_DIR):
+         download_model_Repo()
+     model = keras_load_model(MODEL_PATH)
+     return model
features/image_classifier/preprocess.py CHANGED
@@ -1,9 +1,18 @@
- import cv2
  import numpy as np
- def image_preprocessing(img_path):
-     img =cv2.imread(img_path)
-     img = cv2.resize(img,(128,128))
-     img= cv2.cvtColor(img,cv2.COLOR_BayerGR2RGB)
-     img = img/255.0
-     img = np.expand_dims(img,axis=0)
+ import cv2
+ 
+ def preprocess_image(file):
+     # Read bytes from UploadFile
+     image_bytes = file.file.read()
+     # Convert bytes to NumPy array
+     nparr = np.frombuffer(image_bytes, np.uint8)
+     img = cv2.imdecode(nparr, cv2.IMREAD_COLOR)
+     if img is None:
+         raise ValueError("Could not decode image.")
+ 
+     img = cv2.resize(img, (256, 256))  # Resize to the model's 256x256 input
+     img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
+     img = img / 255.0
+     img = np.expand_dims(img, axis=0)
      return img
+ 
features/image_classifier/routes.py CHANGED
@@ -4,15 +4,20 @@ from fastapi import APIRouter, File, Request, Depends, HTTPException, UploadFile
  from fastapi.security import HTTPBearer
  from slowapi import Limiter
  from slowapi.util import get_remote_address
+ from .controller import Classify_Image_router
  router = APIRouter()
  limiter = Limiter(key_func=get_remote_address)
  security = HTTPBearer()
  
- 
  @router.post("/analyse")
  @limiter.limit(ACCESS_RATE)
- async def analyse(request: Request, file:UploadFile,token: str = Depends(security)):
-     return {"filename": file}
+ async def analyse(
+     request: Request,
+     file: UploadFile = File(...),
+     token: str = Depends(security)
+ ):
+     result = await Classify_Image_router(file)  # await the async controller
+     return result
  
  @router.get("/health")
  @limiter.limit(ACCESS_RATE)
features/nepali_text_classifier/preprocess.py CHANGED
@@ -20,20 +20,17 @@ def parse_pdf(file: BytesIO):
      for page_num in range(doc.page_count):
          page = doc.load_page(page_num)
          text += page.get_text()
-         return text
+     return text
  except Exception as e:
      logging.error(f"Error while processing PDF: {str(e)}")
      raise HTTPException(
          status_code=500, detail="Error processing PDF file")
  
- 
  def parse_txt(file: BytesIO):
      return file.read().decode("utf-8")
  
- 
  def end_symbol_for_NP_text(text: str) -> str:
+     text = text.strip()
      if not text.endswith("।"):
          text += "।"
      return text
- 
- 
models/.gitattributes DELETED
@@ -1,35 +0,0 @@
- *.7z filter=lfs diff=lfs merge=lfs -text
- *.arrow filter=lfs diff=lfs merge=lfs -text
- *.bin filter=lfs diff=lfs merge=lfs -text
- *.bz2 filter=lfs diff=lfs merge=lfs -text
- *.ckpt filter=lfs diff=lfs merge=lfs -text
- *.ftz filter=lfs diff=lfs merge=lfs -text
- *.gz filter=lfs diff=lfs merge=lfs -text
- *.h5 filter=lfs diff=lfs merge=lfs -text
- *.joblib filter=lfs diff=lfs merge=lfs -text
- *.lfs.* filter=lfs diff=lfs merge=lfs -text
- *.mlmodel filter=lfs diff=lfs merge=lfs -text
- *.model filter=lfs diff=lfs merge=lfs -text
- *.msgpack filter=lfs diff=lfs merge=lfs -text
- *.npy filter=lfs diff=lfs merge=lfs -text
- *.npz filter=lfs diff=lfs merge=lfs -text
- *.onnx filter=lfs diff=lfs merge=lfs -text
- *.ot filter=lfs diff=lfs merge=lfs -text
- *.parquet filter=lfs diff=lfs merge=lfs -text
- *.pb filter=lfs diff=lfs merge=lfs -text
- *.pickle filter=lfs diff=lfs merge=lfs -text
- *.pkl filter=lfs diff=lfs merge=lfs -text
- *.pt filter=lfs diff=lfs merge=lfs -text
- *.pth filter=lfs diff=lfs merge=lfs -text
- *.rar filter=lfs diff=lfs merge=lfs -text
- *.safetensors filter=lfs diff=lfs merge=lfs -text
- saved_model/**/* filter=lfs diff=lfs merge=lfs -text
- *.tar.* filter=lfs diff=lfs merge=lfs -text
- *.tar filter=lfs diff=lfs merge=lfs -text
- *.tflite filter=lfs diff=lfs merge=lfs -text
- *.tgz filter=lfs diff=lfs merge=lfs -text
- *.wasm filter=lfs diff=lfs merge=lfs -text
- *.xz filter=lfs diff=lfs merge=lfs -text
- *.zip filter=lfs diff=lfs merge=lfs -text
- *.zst filter=lfs diff=lfs merge=lfs -text
- *tfevents* filter=lfs diff=lfs merge=lfs -text
readme.md CHANGED
@@ -1,330 +1,21 @@
- # 🚀 FastAPI AI Text Detector
- 
- A production-ready FastAPI application for **AI-generated vs. human-written text detection** in both **English** and **Nepali**. Models are auto-managed and endpoints are secured via Bearer token authentication.
- 
- ## 🏗️ Project Structure
- ### 🌟 Key Files and Their Roles
- ## ⚙️ Setup & Installation
- ## 🚦 Running the API Server
+ # 🚀 FastAPI AI Detector
+ 
+ A production-ready FastAPI app for detecting AI vs. human-written text in English and Nepali. It uses GPT-2 and SentencePiece-based models, with Bearer token security.
+ 
+ ## 📂 Documentation
+ 
+ - [Project Structure](docs/structure.md)
+ - [API Endpoints](docs/api_endpoints.md)
+ - [Setup & Installation](docs/setup.md)
+ - [Deployment](docs/deployment.md)
+ - [Security](docs/security.md)
+ - [NestJS Integration](docs/nestjs_integration.md)
+ - [Core Functions](docs/functions.md)
+ 
+ ## ⚡ Quick Start
  ```bash
  uvicorn app:app --host 0.0.0.0 --port 8000
  ```
- ## 🔒 Security: Bearer Token Auth
- ## 🧩 API Endpoints
- ## 📝 API Docs
- 
- - **Swagger UI:** [http://localhost:8000/docs](http://localhost:8000/docs)
- - **ReDoc:** [http://localhost:8000/redoc](http://localhost:8000/redoc)
- 
- ## 🧪 Example: Integration with NestJS
- ## 🧠 Main Functions in Text Classifier
  ## 🚀 Deployment
  
  - **Local**: Use `uvicorn` as above.
requirements.txt CHANGED
@@ -11,3 +11,5 @@ python-multipart
  slowapi
  spacy
  nltk
+ tensorflow
+ opencv-python