# FastAPI AI Text Detector
A production-ready FastAPI application for AI-generated vs. human-written text detection in both English and Nepali. Models are auto-managed and endpoints are secured via Bearer token authentication.
## Project Structure

```
├── app.py                        # Main FastAPI app entrypoint
├── config.py                     # Configuration loader (.env, settings)
├── features/
│   ├── text_classifier/          # English (GPT-2) classifier
│   │   ├── controller.py
│   │   ├── inferencer.py
│   │   ├── model_loader.py
│   │   ├── preprocess.py
│   │   └── routes.py
│   └── nepali_text_classifier/   # Nepali (SentencePiece) classifier
│       ├── controller.py
│       ├── inferencer.py
│       ├── model_loader.py
│       ├── preprocess.py
│       └── routes.py
├── np_text_model/                # Nepali model artifacts (auto-downloaded)
│   ├── classifier/
│   │   └── sentencepiece.bpe.model
│   └── model_95_acc.pth
├── models/                       # English GPT-2 model/tokenizer (auto-downloaded)
│   ├── merges.txt
│   ├── tokenizer.json
│   └── model_weights.pth
├── Dockerfile                    # Container build config
├── Procfile                      # Deployment entrypoint (for PaaS)
├── requirements.txt              # Python dependencies
├── README.md                     # This file
└── .env                          # Secret token(s), environment config
```
## Key Files and Their Roles

- `app.py`: Entry point initializing the FastAPI app and routes.
- `Procfile`: Tells Railway (or similar platforms) how to run the program.
- `requirements.txt`: Tracks all Python dependencies for the project.
- `__init__.py`: Package initializer for the root module and submodules.

### `features/text_classifier/`

- `controller.py`: Handles logic between routes and the model.
- `inferencer.py`: Runs inference and returns predictions, plus file-system utilities.
- `model_loader.py`: Loads the ML model and tokenizer.
- `preprocess.py`: Prepares input text for the model.
- `routes.py`: Defines API routes for text classification.

### `features/nepali_text_classifier/`

- `controller.py`: Handles logic between routes and the model.
- `inferencer.py`: Runs inference and returns predictions, plus file-system utilities.
- `model_loader.py`: Loads the ML model and tokenizer.
- `preprocess.py`: Prepares input text for the model.
- `routes.py`: Defines API routes for text classification.
## Setup & Installation

1. **Clone the repository**

   ```bash
   git clone https://github.com/cyberalertnepal/aiapi
   cd aiapi
   ```

2. **Install dependencies**

   ```bash
   pip install -r requirements.txt
   ```

3. **Configure secrets**

   Create a `.env` file at the project root:

   ```
   SECRET_TOKEN=your_secret_token_here
   ```

   All endpoints require an `Authorization: Bearer <SECRET_TOKEN>` header.
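The project's `config.py` presumably reads this token at startup. As a rough illustration, here is a stdlib-only sketch of such a loader; the real `config.py` may well use python-dotenv or pydantic settings instead, and `load_secret_token` is a hypothetical name:

```python
import os

def load_secret_token(env_path: str = ".env"):
    """Return SECRET_TOKEN from the environment, falling back to a .env file.

    Hypothetical sketch of what a config loader might do; not the
    project's actual implementation.
    """
    token = os.environ.get("SECRET_TOKEN")
    if token:
        return token
    try:
        with open(env_path) as fh:
            for line in fh:
                line = line.strip()
                if line.startswith("SECRET_TOKEN="):
                    return line.split("=", 1)[1]
    except FileNotFoundError:
        pass
    return None
```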
## Running the API Server

```bash
uvicorn app:app --host 0.0.0.0 --port 8000
```
## Security: Bearer Token Auth

All endpoints require authentication via Bearer token:

- Set `SECRET_TOKEN` in `.env`.
- Add the header `Authorization: Bearer <SECRET_TOKEN>` to every request.

Unauthorized requests receive `403 Forbidden`.
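The token comparison itself can be sketched independently of FastAPI's dependency wiring. `check_bearer` below is a hypothetical helper for illustration; the app presumably performs an equivalent check inside a FastAPI dependency and raises a 403 on failure:

```python
import hmac
from typing import Optional

def check_bearer(authorization: Optional[str], secret_token: str) -> bool:
    """Validate an `Authorization: Bearer <token>` header value.

    Hypothetical helper, not the app's actual code.
    """
    prefix = "Bearer "
    if not authorization or not authorization.startswith(prefix):
        return False
    # compare_digest gives a constant-time comparison, avoiding timing leaks
    return hmac.compare_digest(authorization[len(prefix):], secret_token)
```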
## API Endpoints

### English (GPT-2) - `/text/`

| Endpoint | Method | Description |
|---|---|---|
| `/text/analyse` | POST | Classify raw English text |
| `/text/analyse-sentences` | POST | Sentence-by-sentence breakdown |
| `/text/analyse-sentance-file` | POST | Upload file, per-sentence breakdown |
| `/text/upload` | POST | Upload file for overall classification |
| `/text/health` | GET | Health check |
**Example: Classify English text**

```bash
curl -X POST http://localhost:8000/text/analyse \
  -H "Authorization: Bearer <SECRET_TOKEN>" \
  -H "Content-Type: application/json" \
  -d '{"text": "This is a sample text for analysis."}'
```

Response:

```json
{
  "result": "AI-generated",
  "perplexity": 55.67,
  "ai_likelihood": 66.6
}
```
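How `perplexity` maps to `ai_likelihood` is not documented here; the intuition is that text with low perplexity looks "predictable" to GPT-2 and is therefore treated as more likely AI-generated. One illustrative, entirely hypothetical logistic mapping (the `midpoint` and `scale` constants are made up, and the API's real formula may differ):

```python
import math

def ai_likelihood_from_perplexity(perplexity: float,
                                  midpoint: float = 60.0,
                                  scale: float = 15.0) -> float:
    """Hypothetical logistic mapping from GPT-2 perplexity to an AI-likelihood %.

    Lower perplexity maps to a higher likelihood; the constants are
    illustrative only, not the project's actual parameters.
    """
    likelihood = 1.0 / (1.0 + math.exp((perplexity - midpoint) / scale))
    return round(100.0 * likelihood, 1)
```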
**Example: File upload**

```bash
curl -X POST http://localhost:8000/text/upload \
  -H "Authorization: Bearer <SECRET_TOKEN>" \
  -F 'file=@yourfile.txt;type=text/plain'
```
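For programmatic access without curl, the same call can be made from the Python standard library. `build_analyse_request` and `analyse_text` are illustrative helper names, not part of this project:

```python
import json
from urllib import request

BASE_URL = "http://localhost:8000"       # adjust for your deployment
SECRET_TOKEN = "your_secret_token_here"  # must match the server's .env

def build_analyse_request(text: str,
                          base_url: str = BASE_URL,
                          token: str = SECRET_TOKEN) -> request.Request:
    """Build the authenticated POST request for /text/analyse."""
    return request.Request(
        url=f"{base_url}/text/analyse",
        data=json.dumps({"text": text}).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

def analyse_text(text: str) -> dict:
    """Send the request and decode the JSON body (needs a running server)."""
    with request.urlopen(build_analyse_request(text)) as resp:
        return json.loads(resp.read())
```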
### Nepali (SentencePiece) - `/NP/`

| Endpoint | Method | Description |
|---|---|---|
| `/NP/analyse` | POST | Classify Nepali text |
| `/NP/analyse-sentences` | POST | Sentence-by-sentence breakdown |
| `/NP/upload` | POST | Upload Nepali PDF for classification |
| `/NP/file-sentences-analyse` | POST | PDF upload, per-sentence breakdown |
| `/NP/health` | GET | Health check |
**Example: Nepali text classification**

```bash
curl -X POST http://localhost:8000/NP/analyse \
  -H "Authorization: Bearer <SECRET_TOKEN>" \
  -H "Content-Type: application/json" \
  -d '{"text": "यो उदाहरण वाक्य हो।"}'
```
Response:

```json
{
  "label": "Human",
  "confidence": 98.6
}
```
**Example: Nepali PDF upload**

```bash
curl -X POST http://localhost:8000/NP/upload \
  -H "Authorization: Bearer <SECRET_TOKEN>" \
  -F 'file=@NepaliText.pdf;type=application/pdf'
```
## API Docs

- Swagger UI: http://localhost:8000/docs
- ReDoc: http://localhost:8000/redoc
## Example: Integration with NestJS

You can call this API from a NestJS microservice.

`.env`:

```
FASTAPI_BASE_URL=http://localhost:8000
SECRET_TOKEN=your_secret_token_here
```
`fastapi.service.ts`:

```typescript
import { Injectable } from "@nestjs/common";
import { HttpService } from "@nestjs/axios";
import { ConfigService } from "@nestjs/config";
import { firstValueFrom } from "rxjs";

@Injectable()
export class FastAPIService {
  constructor(
    private http: HttpService,
    private config: ConfigService,
  ) {}

  async analyzeText(text: string) {
    const url = `${this.config.get("FASTAPI_BASE_URL")}/text/analyse`;
    const token = this.config.get("SECRET_TOKEN");
    const response = await firstValueFrom(
      this.http.post(
        url,
        { text },
        {
          headers: {
            Authorization: `Bearer ${token}`,
          },
        },
      ),
    );
    return response.data;
  }
}
```
`app.module.ts`:

```typescript
import { Module } from "@nestjs/common";
import { ConfigModule } from "@nestjs/config";
import { HttpModule } from "@nestjs/axios";
import { AppController } from "./app.controller";
import { FastAPIService } from "./fastapi.service";

@Module({
  imports: [ConfigModule.forRoot(), HttpModule],
  controllers: [AppController],
  providers: [FastAPIService],
})
export class AppModule {}
```
`app.controller.ts`:

```typescript
import { Body, Controller, Post, Get } from '@nestjs/common';
import { FastAPIService } from './fastapi.service';

@Controller()
export class AppController {
  constructor(private readonly fastapiService: FastAPIService) {}

  @Post('analyze-text')
  async callFastAPI(@Body('text') text: string) {
    return this.fastapiService.analyzeText(text);
  }

  @Get()
  getHello(): string {
    return 'NestJS is connected to FastAPI';
  }
}
```
## Main Functions in the Text Classifier (`features/text_classifier/`)

- `load_model()`: Loads the GPT-2 model and tokenizer from the specified directory paths.
- `lifespan()`: Manages the application lifecycle; initializes the model at startup and handles cleanup on shutdown.
- `classify_text_sync()`: Synchronously tokenizes input text and predicts using the GPT-2 model. Returns the classification and perplexity.
- `classify_text()`: Asynchronously runs `classify_text_sync()` in a thread pool for non-blocking text classification.
- `analyze_text()`: POST endpoint; accepts text input, classifies it using `classify_text()`, and returns the result with perplexity.
- `health()`: GET endpoint; simple health check for API liveness.
- `parse_docx()`, `parse_pdf()`, `parse_txt()`: Utilities to extract and convert `.docx`, `.pdf`, and `.txt` file contents to plain text.
- `warmup()`: Downloads the model repository and initializes the model/tokenizer using `load_model()`.
- `download_model_repo()`: Downloads the model files into the designated `MODEL` folder.
- `get_model_tokenizer()`: Checks whether the model already exists; downloads it if not, otherwise loads the cached model.
- `handle_file_upload()`: Handles file uploads from the `/upload` route; extracts text, classifies, and returns results.
- `extract_file_contents()`: Extracts and returns plain text from uploaded files (PDF, DOCX, TXT).
- `handle_file_sentence()`: Processes file uploads by analyzing each sentence (under 10,000 characters) before classification.
- `handle_sentence_level_analysis()`: Checks and strips each sentence, then computes the AI/human likelihood for each.
- `analyze_sentences()`: Splits paragraphs into sentences, classifies each, and returns all results.
- `analyze_sentence_file()`: Like `handle_file_sentence()`; analyzes sentences in uploaded files.
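The `classify_text()` / `classify_text_sync()` pair described above follows the standard pattern of offloading blocking inference to a thread pool so FastAPI's event loop stays responsive. A minimal sketch of that pattern, with a dummy stand-in for the real model call:

```python
import asyncio
from concurrent.futures import ThreadPoolExecutor

_executor = ThreadPoolExecutor(max_workers=2)

def classify_text_sync(text: str) -> dict:
    """Dummy stand-in for the real synchronous GPT-2 inference call."""
    # The real function tokenizes `text`, runs the model, and derives perplexity.
    return {"result": "Human-written", "length": len(text)}

async def classify_text(text: str) -> dict:
    """Run the blocking call in a worker thread so the event loop stays free."""
    loop = asyncio.get_running_loop()
    return await loop.run_in_executor(_executor, classify_text_sync, text)
```

Awaiting `classify_text()` from an async route handler lets other requests proceed while inference runs in the worker thread.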
## Deployment

- **Local:** Use `uvicorn` as above.
- **Railway/Heroku:** Use the provided `Procfile`.
- **Hugging Face Spaces:** Use the `Dockerfile` for container deployment.
## Tips

- Model files auto-download on first start if not found.
- Keep `requirements.txt` up to date after adding dependencies.
- All endpoints require the correct `Authorization` header.
- For security, avoid committing `.env` to public repos.