
# 🚀 FastAPI AI Text Detector

A production-ready FastAPI application for AI-generated vs. human-written text detection in both English and Nepali. Models are auto-managed and endpoints are secured via Bearer token authentication.


πŸ—οΈ Project Structure

```
├── app.py                   # Main FastAPI app entrypoint
├── config.py                # Configuration loader (.env, settings)
├── features/
│   ├── text_classifier/     # English (GPT-2) classifier
│   │   ├── controller.py
│   │   ├── inferencer.py
│   │   ├── model_loader.py
│   │   ├── preprocess.py
│   │   └── routes.py
│   └── nepali_text_classifier/ # Nepali (SentencePiece) classifier
│       ├── controller.py
│       ├── inferencer.py
│       ├── model_loader.py
│       ├── preprocess.py
│       └── routes.py
├── np_text_model/           # Nepali model artifacts (auto-downloaded)
│   ├── classifier/
│   │   └── sentencepiece.bpe.model
│   └── model_95_acc.pth
├── models/                  # English GPT-2 model/tokenizer (auto-downloaded)
│   ├── merges.txt
│   ├── tokenizer.json
│   └── model_weights.pth
├── Dockerfile               # Container build config
├── Procfile                 # Deployment entrypoint (for PaaS)
├── requirements.txt         # Python dependencies
├── README.md                # This file
└── .env                     # Secret token(s), environment config
```

## 🌟 Key Files and Their Roles

  • app.py: Entry point initializing FastAPI app and routes.
  • Procfile: Tells Railway (or similar platforms) how to run the program.
  • requirements.txt: Tracks all Python dependencies for the project.
  • __init__.py: Package initializer for the root module and submodules.
  • features/text_classifier/
    • controller.py: Handles logic between routes and the model.
    • inferencer.py: Runs inference and returns predictions as well as file system utilities.
  • features/NP/
    • controller.py: Handles logic between routes and the model.
    • inferencer.py: Runs inference and returns predictions as well as file system utilities.
    • model_loader.py: Loads the ML model and tokenizer.
    • preprocess.py: Prepares input text for the model.
    • routes.py: Defines API routes for text classification.

βš™οΈ Setup & Installation

1. **Clone the repository**

   ```shell
   git clone https://github.com/cyberalertnepal/aiapi
   cd aiapi
   ```

2. **Install dependencies**

   ```shell
   pip install -r requirements.txt
   ```

3. **Configure secrets**

   Create a `.env` file at the project root:

   ```
   SECRET_TOKEN=your_secret_token_here
   ```

   All endpoints require an `Authorization: Bearer <SECRET_TOKEN>` header.


## 🚦 Running the API Server

```shell
uvicorn app:app --host 0.0.0.0 --port 8000
```

## 🔒 Security: Bearer Token Auth

All endpoints require authentication via Bearer token:

- Set `SECRET_TOKEN` in `.env`
- Add the header `Authorization: Bearer <SECRET_TOKEN>` to every request

Unauthorized requests receive `403 Forbidden`.


## 🧩 API Endpoints

### English (GPT-2) - `/text/`

| Endpoint | Method | Description |
| --- | --- | --- |
| `/text/analyse` | POST | Classify raw English text |
| `/text/analyse-sentences` | POST | Sentence-by-sentence breakdown |
| `/text/analyse-sentance-file` | POST | Upload a file for a per-sentence breakdown |
| `/text/upload` | POST | Upload a file for overall classification |
| `/text/health` | GET | Health check |

**Example: classify English text**

```shell
curl -X POST http://localhost:8000/text/analyse \
  -H "Authorization: Bearer <SECRET_TOKEN>" \
  -H "Content-Type: application/json" \
  -d '{"text": "This is a sample text for analysis."}'
```

Response:

```json
{
  "result": "AI-generated",
  "perplexity": 55.67,
  "ai_likelihood": 66.6
}
```

**Example: file upload**

```shell
curl -X POST http://localhost:8000/text/upload \
  -H "Authorization: Bearer <SECRET_TOKEN>" \
  -F 'file=@yourfile.txt;type=text/plain'
```
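The same call can be made from Python. A stdlib-only sketch of building the authenticated request (URL and token are placeholders):

```python
import json
import urllib.request


def build_analyse_request(base_url: str, token: str, text: str) -> urllib.request.Request:
    """Build an authenticated POST request for the /text/analyse endpoint."""
    payload = json.dumps({"text": text}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/text/analyse",
        data=payload,
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Send it with `json.load(urllib.request.urlopen(build_analyse_request(...)))`, or use a client library such as `requests` if you prefer.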

### Nepali (SentencePiece) - `/NP/`

| Endpoint | Method | Description |
| --- | --- | --- |
| `/NP/analyse` | POST | Classify Nepali text |
| `/NP/analyse-sentences` | POST | Sentence-by-sentence breakdown |
| `/NP/upload` | POST | Upload a Nepali PDF for classification |
| `/NP/file-sentences-analyse` | POST | PDF upload, per-sentence breakdown |
| `/NP/health` | GET | Health check |

**Example: Nepali text classification**

```shell
curl -X POST http://localhost:8000/NP/analyse \
  -H "Authorization: Bearer <SECRET_TOKEN>" \
  -H "Content-Type: application/json" \
  -d '{"text": "यो उदाहरण वाक्य हो।"}'
```

Response:

```json
{
  "label": "Human",
  "confidence": 98.6
}
```

**Example: Nepali PDF upload**

```shell
curl -X POST http://localhost:8000/NP/upload \
  -H "Authorization: Bearer <SECRET_TOKEN>" \
  -F 'file=@NepaliText.pdf;type=application/pdf'
```
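The per-sentence endpoints imply a sentence splitter, and for Nepali the natural boundary is the danda (।). A hypothetical sketch of such a splitter (the repo's actual preprocessing may differ):

```python
import re


def split_nepali_sentences(text: str) -> list[str]:
    """Split Nepali text into sentences on the danda (।), '?', and '!'."""
    parts = re.split(r"(?<=[।?!])\s*", text.strip())
    return [p for p in parts if p]
```

For example, `split_nepali_sentences("यो उदाहरण वाक्य हो। अर्को वाक्य हो।")` yields two sentences, each keeping its terminal danda.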

πŸ“ API Docs


## 🧪 Example: Integration with NestJS

You can easily call this API from a NestJS microservice.

`.env`

```
FASTAPI_BASE_URL=http://localhost:8000
SECRET_TOKEN=your_secret_token_here
```

`fastapi.service.ts`

```typescript
import { Injectable } from "@nestjs/common";
import { HttpService } from "@nestjs/axios";
import { ConfigService } from "@nestjs/config";
import { firstValueFrom } from "rxjs";

@Injectable()
export class FastAPIService {
  constructor(
    private http: HttpService,
    private config: ConfigService,
  ) {}

  async analyzeText(text: string) {
    const url = `${this.config.get("FASTAPI_BASE_URL")}/text/analyse`;
    const token = this.config.get("SECRET_TOKEN");

    const response = await firstValueFrom(
      this.http.post(
        url,
        { text },
        {
          headers: {
            Authorization: `Bearer ${token}`,
          },
        },
      ),
    );

    return response.data;
  }
}
```

`app.module.ts`

```typescript
import { Module } from "@nestjs/common";
import { ConfigModule } from "@nestjs/config";
import { HttpModule } from "@nestjs/axios";
import { AppController } from "./app.controller";
import { FastAPIService } from "./fastapi.service";

@Module({
  imports: [ConfigModule.forRoot(), HttpModule],
  controllers: [AppController],
  providers: [FastAPIService],
})
export class AppModule {}
```

`app.controller.ts`

```typescript
import { Body, Controller, Post, Get } from '@nestjs/common';
import { FastAPIService } from './fastapi.service';

@Controller()
export class AppController {
  constructor(private readonly fastapiService: FastAPIService) {}

  @Post('analyze-text')
  async callFastAPI(@Body('text') text: string) {
    return this.fastapiService.analyzeText(text);
  }

  @Get()
  getHello(): string {
    return 'NestJS is connected to FastAPI';
  }
}
```

## 🧠 Main Functions in the Text Classifiers (`features/text_classifier/` and `features/nepali_text_classifier/`)

  • load_model()
    Loads the GPT-2 model and tokenizer from the specified directory paths.

  • lifespan()
    Manages the application lifecycle. Initializes the model at startup and handles cleanup on shutdown.

  • classify_text_sync()
    Synchronously tokenizes input text and predicts using the GPT-2 model. Returns classification and perplexity.

  • classify_text()
    Asynchronously runs classify_text_sync() in a thread pool for non-blocking text classification.

  • analyze_text()
    POST endpoint: Accepts text input, classifies it using classify_text(), and returns the result with perplexity.

  • health()
    GET endpoint: Simple health check for API liveness.

  • parse_docx(), parse_pdf(), parse_txt()
    Utilities to extract and convert .docx, .pdf, and .txt file contents to plain text.

  • warmup()
    Downloads the model repository and initializes the model/tokenizer using load_model().

  • download_model_repo()
    Downloads the model files from the designated MODEL folder.

  • get_model_tokenizer()
    Checks if the model already exists; if not, downloads itβ€”otherwise, loads the cached model.

  • handle_file_upload()
    Handles file uploads from the /upload route. Extracts text, classifies, and returns results.

  • extract_file_contents()
    Extracts and returns plain text from uploaded files (PDF, DOCX, TXT).

  • handle_file_sentence()
    Processes file uploads by analyzing each sentence (under 10,000 chars) before classification.

  • handle_sentence_level_analysis()
    Checks/strips each sentence, then computes AI/human likelihood for each.

  • analyze_sentences()
    Splits paragraphs into sentences, classifies each, and returns all results.

  • analyze_sentence_file()
    Like handle_file_sentence()β€”analyzes sentences in uploaded files.
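The English classifier reports both a perplexity and an `ai_likelihood` score. One illustrative way such a mapping can work (the bounds 20 and 100 are assumptions for the sketch, not the repo's actual values): low perplexity means GPT-2 predicts the text easily, which is characteristic of AI-generated text, so lower perplexity maps to a higher AI likelihood.

```python
def perplexity_to_ai_likelihood(ppl: float, low: float = 20.0, high: float = 100.0) -> float:
    """Map a perplexity value to an AI-likelihood percentage.

    Lower perplexity (text GPT-2 predicts easily) maps to a higher AI
    likelihood. The low/high bounds are illustrative assumptions.
    """
    ppl = min(max(ppl, low), high)  # clamp into [low, high]
    return round((high - ppl) / (high - low) * 100, 1)
```

With these bounds, a perplexity at or below 20 maps to 100% AI likelihood and one at or above 100 maps to 0%; the actual thresholds and curve used by `classify_text_sync()` live in the repository.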


## 🚀 Deployment

- **Local**: Run with `uvicorn` as shown above.
- **Railway/Heroku**: Use the provided `Procfile`.
- **Hugging Face Spaces**: Use the `Dockerfile` for container deployment.

## 💡 Tips

- Model files are downloaded automatically on first start if they are not found locally.
- Keep `requirements.txt` up to date after adding dependencies.
- All endpoints require the correct `Authorization` header.
- Never commit `.env` to a public repository.