---
language:
- en
base_model:
- answerdotai/ModernBERT-base
pipeline_tag: text-classification
tags:
- text
- text classification
- LLM
- LLM text detection
- Detection
- detector
---
# LLM_Detector_Preview_model

**Preview release of an LLM-generated text detector.**

## Model Description
This model classifies text as Human, Mixed, or AI-generated. It uses a Transformer sequence-classification architecture and was trained on a mix of human-written and AI-generated texts. It can be applied at the document, sentence, and token level.

- **Architecture:** ModernBERT (base model: `answerdotai/ModernBERT-base`)
- **Labels** (the snippet after this list shows how to read the mapping from the checkpoint config):
  - 0: Human
  - 1: Mixed
  - 2: AI
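
The same mapping should also be recoverable from the checkpoint configuration. A minimal check, assuming the published `config.json` defines `id2label` (transformers falls back to generic `LABEL_i` names when it does not):

```python
from transformers import AutoConfig

# Read the label mapping straight from the checkpoint config.
config = AutoConfig.from_pretrained('Donnyed/LLM_Detector_Preview_model')
print(config.id2label)  # expected: {0: 'Human', 1: 'Mixed', 2: 'AI'}
```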

## Intended Use
- **For research and curiosity only.**
- Not for academic, legal, medical, or high-stakes use.
- The detector is easy to bypass and its results may be unreliable.

## Limitations & Warnings
- This model is **experimental** and has not been rigorously validated.
- It can produce false positives and false negatives.
- Simple paraphrasing or editing can fool the detector.
- Do not use for academic integrity, hiring, or legal decisions.

## How It Works
The model analyzes text and estimates the likelihood that it is human-written, mixed, or AI-generated. It relies on statistical patterns learned from its training data, but those patterns are not foolproof and can be circumvented.

## Example Usage
```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

tokenizer = AutoTokenizer.from_pretrained('Donnyed/LLM_Detector_Preview_model')
model = AutoModelForSequenceClassification.from_pretrained('Donnyed/LLM_Detector_Preview_model')
model.eval()

# Label mapping as documented above.
labels = {0: 'Human', 1: 'Mixed', 2: 'AI'}

text = "Paste your text here."
inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=512)
with torch.no_grad():
    outputs = model(**inputs)

# Convert logits to class probabilities and report the most likely label.
probs = torch.softmax(outputs.logits, dim=-1)
pred = torch.argmax(probs, dim=-1).item()
print('Prediction:', labels[pred])
print('Probabilities:', probs.squeeze().tolist())
```
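
The card also mentions sentence-level analysis. Below is a minimal sketch of scoring a document sentence by sentence; the regex splitter and the single padded batch are illustrative assumptions, not part of the released model:

```python
import re
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained('Donnyed/LLM_Detector_Preview_model')
model = AutoModelForSequenceClassification.from_pretrained('Donnyed/LLM_Detector_Preview_model')
model.eval()

labels = {0: 'Human', 1: 'Mixed', 2: 'AI'}

document = "First sentence here. Second sentence here. Third sentence here."

# Naive splitter on sentence-ending punctuation; use nltk or spaCy for real segmentation.
sentences = [s for s in re.split(r'(?<=[.!?])\s+', document) if s]

# Score all sentences in one padded forward pass.
inputs = tokenizer(sentences, return_tensors="pt", padding=True,
                   truncation=True, max_length=512)
with torch.no_grad():
    probs = torch.softmax(model(**inputs).logits, dim=-1)

for sentence, p in zip(sentences, probs):
    pred = int(torch.argmax(p))
    print(f"{labels[pred]:>5} ({p[pred]:.2f})  {sentence}")
```

Per-sentence probabilities can be averaged into a rough document-level score, but that aggregation is a heuristic, not something the model was trained for.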

## Files Included
- `model.safetensors` — Model weights
- `config.json` — Model configuration
- `tokenizer.json`, `tokenizer_config.json`, `special_tokens_map.json` — Tokenizer files