---
language:
- en
base_model:
- answerdotai/ModernBERT-base
pipeline_tag: text-classification
tags:
- text
- text classification
- LLM
- LLM text detection
- Detection
- detector
---

# LLM_Detector_Preview_model

**Preview release of an LLM-generated text detector.**

## Model Description

This model classifies text as Human, Mixed, or AI-generated. It uses a sequence classification architecture and was trained on a mix of human-written and AI-generated texts. It can be used for document-, sentence-, and token-level analysis; a sentence-level sketch is shown at the end of this card.

- **Architecture:** ModernBERT (or a compatible Transformer)
- **Labels:**
  - 0: Human
  - 1: Mixed
  - 2: AI

## Intended Use

- **For research and curiosity only.**
- Not for academic, legal, medical, or other high-stakes use.
- Results are easy to bypass and may be unreliable.

## Limitations & Warnings

- This model is **experimental**, and its accuracy has not been rigorously validated.
- It can produce false positives and false negatives.
- Simple paraphrasing or editing can fool the detector.
- Do not use it for academic-integrity, hiring, or legal decisions.

## How It Works

The model analyzes text and predicts the probability that it is human-written, mixed, or AI-generated. It relies on statistical patterns learned from its training data; these patterns are not foolproof and can be circumvented.

## Example Usage

```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

tokenizer = AutoTokenizer.from_pretrained('Donnyed/LLM_Detector_Preview_model')
model = AutoModelForSequenceClassification.from_pretrained('Donnyed/LLM_Detector_Preview_model')

text = "Paste your text here."
inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=512)

with torch.no_grad():
    outputs = model(**inputs)

# Convert logits to class probabilities and pick the most likely label
probs = torch.softmax(outputs.logits, dim=1)
pred = torch.argmax(probs, dim=1).item()

print('Prediction:', pred)          # 0 = Human, 1 = Mixed, 2 = AI
print('Probabilities:', probs)
```

## Files Included

- `model.safetensors` — Model weights
- `config.json` — Model configuration
- `tokenizer.json`, `tokenizer_config.json`, `special_tokens_map.json` — Tokenizer files
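
## Sentence-Level Analysis (Sketch)

The model description lists sentence-level analysis as a supported use. The snippet below is a minimal sketch of one way to do that, not a prescribed workflow: it splits a passage into sentences with a naive regex (this model does not ship a dedicated sentence splitter, so the regex is an assumption) and classifies all sentences in a single batch, using the 0/1/2 label scheme from the card.

```python
import re

import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Label mapping taken from the "Labels" list above
LABELS = {0: "Human", 1: "Mixed", 2: "AI"}

tokenizer = AutoTokenizer.from_pretrained('Donnyed/LLM_Detector_Preview_model')
model = AutoModelForSequenceClassification.from_pretrained('Donnyed/LLM_Detector_Preview_model')
model.eval()

passage = "First sentence of the passage. Second sentence, possibly machine-written. A third one."

# Naive split on end-of-sentence punctuation; a proper sentence tokenizer
# would give better segmentation on real documents.
sentences = [s for s in re.split(r'(?<=[.!?])\s+', passage.strip()) if s]

inputs = tokenizer(sentences, return_tensors="pt", padding=True, truncation=True, max_length=512)

with torch.no_grad():
    logits = model(**inputs).logits

probs = torch.softmax(logits, dim=-1)
preds = torch.argmax(probs, dim=-1)

for sentence, pred, prob in zip(sentences, preds, probs):
    print(f"{LABELS[pred.item()]:>5} ({prob[pred].item():.2f})  {sentence}")
```

The same caveats apply at sentence level: short sentences give the model little signal, so per-sentence predictions are typically noisier than document-level ones.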