---
language:
- en
base_model:
- answerdotai/ModernBERT-base
pipeline_tag: text-classification
tags:
- text
- text classification
- LLM
- LLM text detection
- Detection
- detector
---
# LLM_Detector_Preview_model
**Preview release of an LLM-generated text detector.**
## Model Description
This model classifies text as Human, Mixed, or AI-generated. It is based on a sequence classification architecture and was trained on a mix of human-written and AI-generated texts. The model can be used for document-, sentence-, and token-level analysis.
- **Architecture:** ModernBERT (or compatible Transformer)
- **Labels:**
- 0: Human
- 1: Mixed
- 2: AI
## Intended Use
- **For research and curiosity only.**
- Not for academic, legal, medical, or high-stakes use.
- The detector is easy to bypass, and its results may be unreliable.
## Limitations & Warnings
- This model is **experimental**, and its accuracy has not been rigorously validated.
- It can produce false positives and false negatives.
- Simple paraphrasing or editing can fool the detector.
- Do not use for academic integrity, hiring, or legal decisions.
## How It Works
The model analyzes text and predicts the likelihood of it being human-written, mixed, or AI-generated. It uses statistical patterns learned from training data, but these patterns are not foolproof and can be circumvented.
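The logits-to-probabilities step described above can be sketched in plain Python. The logit values below are made up for illustration; the real ones come from the model's classification head:

```python
import math

# Hypothetical logits for the three classes (0: Human, 1: Mixed, 2: AI).
logits = [2.0, 0.5, -1.0]

def softmax(xs):
    # Subtract the max before exponentiating for numerical stability.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

probs = softmax(logits)
pred = probs.index(max(probs))  # index of the most likely class

labels = ['Human', 'Mixed', 'AI']
print('Prediction:', labels[pred])
print('Probabilities:', [round(p, 3) for p in probs])
```

The probabilities always sum to 1, so a high score for one class necessarily lowers the others; the predicted label is simply the argmax.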
## Example Usage
```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

tokenizer = AutoTokenizer.from_pretrained('Donnyed/LLM_Detector_Preview_model')
model = AutoModelForSequenceClassification.from_pretrained('Donnyed/LLM_Detector_Preview_model')

text = "Paste your text here."
inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=512)

# Run inference without tracking gradients.
with torch.no_grad():
    outputs = model(**inputs)

# Convert logits to class probabilities and pick the most likely label.
probs = torch.softmax(outputs.logits, dim=1)
pred = torch.argmax(probs, dim=1).item()

labels = {0: 'Human', 1: 'Mixed', 2: 'AI'}
print('Prediction:', labels[pred])
print('Probabilities:', probs)
```
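For the sentence-level analysis mentioned above, one approach is to split a document into sentences and classify each one. The sketch below uses a naive regex splitter and a stub `classify` function standing in for the real model call (replace the stub with the tokenizer/model inference shown above):

```python
import re

def classify(text):
    # Stand-in for the real model call; swap in the
    # tokenizer/model inference from the example above.
    # This stub always returns 'Human'.
    return 'Human'

def classify_sentences(document):
    # Naive split on ., !, or ? followed by whitespace;
    # a real pipeline would use a proper sentence segmenter.
    sentences = [s for s in re.split(r'(?<=[.!?])\s+', document) if s]
    return [(s, classify(s)) for s in sentences]

doc = "First sentence. Second one! Third?"
for sentence, label in classify_sentences(doc):
    print(f"{label}: {sentence}")
```

Per-sentence results inherit all the caveats above: short spans give the model little signal, so sentence-level labels are even less reliable than document-level ones.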
## Files Included
- `model.safetensors` — Model weights
- `config.json` — Model configuration
- `tokenizer.json`, `tokenizer_config.json`, `special_tokens_map.json` — Tokenizer files