---
license: mit
datasets:
- Overfit-GM/turkish-toxic-language
language:
- tr
base_model:
- dbmdz/bert-base-turkish-cased
pipeline_tag: text-classification
library_name: transformers
tags:
- text-classification
- toxicity-detection
- turkish
- bert
- nlp
- content-moderation
---

# MeowML/ToxicBERT - Turkish Toxic Language Detection

## Model Description

ToxicBERT is a BERT model fine-tuned to detect toxic language in Turkish text. Built on the `dbmdz/bert-base-turkish-cased` foundation model, the classifier flags potentially harmful, offensive, or toxic content in Turkish social media posts, comments, and general text.

## Model Details

- **Model Type**: Text Classification (Binary)
- **Language**: Turkish (tr)
- **Base Model**: `dbmdz/bert-base-turkish-cased`
- **License**: MIT
- **Library**: Transformers
- **Task**: Toxicity Detection

## Intended Use

### Primary Use Cases
- Content moderation for Turkish social media platforms
- Automated filtering of user-generated content
- Research in Turkish NLP and toxicity detection
- Educational purposes for understanding toxic language patterns

### Out-of-Scope Use
- This model should not be used as the sole decision-maker for content moderation without human oversight
- Not suitable for languages other than Turkish
- Should not be used for sensitive applications without proper validation and testing
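
One way to keep a human in the loop is to treat the model's score as a triage signal rather than a verdict. The snippet below is a minimal sketch of that idea: the function name `triage`, the action names, and both thresholds are hypothetical and would need tuning on your own validation data.

```python
def triage(toxic_probability, publish_below=0.2, hide_above=0.9):
    """Map a toxic probability to a moderation action.

    The thresholds are illustrative assumptions, not tuned values.
    """
    if not 0.0 <= toxic_probability <= 1.0:
        raise ValueError("toxic_probability must be in [0, 1]")
    if toxic_probability < publish_below:
        return "publish"
    if toxic_probability > hide_above:
        # Hidden content still goes to a moderator for the final decision.
        return "hide_pending_review"
    return "queue_for_review"


# Hypothetical scores, not real model output.
for p in (0.05, 0.55, 0.97):
    print(p, triage(p))
```

In this sketch only the clearly low-scoring content skips review; everything else reaches a moderator, with high-scoring content hidden in the meantime.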

## Training Data

The model was trained on the `Overfit-GM/turkish-toxic-language` dataset, which contains Turkish text samples labeled for toxicity. The dataset includes various forms of toxic content commonly found in online Turkish communications.

## Model Performance

For each input text, the model produces:
- **Binary Classification**: 0 (Non-toxic) or 1 (Toxic)
- **Confidence Score**: Softmax probability of the predicted class
- **Toxic Probability**: Softmax probability assigned to the toxic class (label 1)
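
How these three outputs relate can be sketched without loading the model. The snippet below is a minimal illustration, assuming the classifier head returns two logits ordered as `[non-toxic, toxic]`, as the usage examples imply; the function name `interpret_logits` and the sample logits are hypothetical.

```python
import math


def interpret_logits(logits):
    """Turn two raw logits [non_toxic, toxic] into the three outputs."""
    # Numerically stable softmax: subtract the max before exponentiating.
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]

    label = probs.index(max(probs))  # 0 = non-toxic, 1 = toxic
    return {
        "label": label,
        "confidence": probs[label],     # probability of the predicted class
        "toxic_probability": probs[1],  # probability of class 1 specifically
    }


# Hypothetical logits, not real model output.
result = interpret_logits([2.0, -1.0])
print(result)
```

The confidence score and the toxic probability coincide only when the predicted label is 1; for a non-toxic prediction they sum to 1.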

## Usage

### Quick Start

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Load model and tokenizer
tokenizer = AutoTokenizer.from_pretrained("dbmdz/bert-base-turkish-cased")
model = AutoModelForSequenceClassification.from_pretrained("MeowML/ToxicBERT")

# Prepare text
text = "Merhaba, nasılsın?"
inputs = tokenizer(text, return_tensors="pt", truncation=True, padding=True, max_length=256)

# Get prediction
with torch.no_grad():
    outputs = model(**inputs)
    probabilities = torch.nn.functional.softmax(outputs.logits, dim=-1)
    prediction = torch.argmax(probabilities, dim=-1)

toxic_probability = probabilities[0][1].item()
is_toxic = bool(prediction.item())

print(f"Is toxic: {is_toxic}")
print(f"Toxic probability: {toxic_probability:.4f}")
```

### Advanced Usage with Custom Class

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification


class ToxicLanguageDetector:
    def __init__(self, model_name="MeowML/ToxicBERT"):
        self.tokenizer = AutoTokenizer.from_pretrained("dbmdz/bert-base-turkish-cased")
        self.model = AutoModelForSequenceClassification.from_pretrained(model_name)
        self.device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
        self.model.to(self.device)
        self.model.eval()

    def predict(self, text):
        inputs = self.tokenizer(
            text,
            truncation=True,
            padding='max_length',
            max_length=256,
            return_tensors='pt'
        ).to(self.device)

        with torch.no_grad():
            outputs = self.model(**inputs)
            probabilities = torch.nn.functional.softmax(outputs.logits, dim=-1)
            prediction = torch.argmax(probabilities, dim=-1)

        return {
            'text': text,
            'is_toxic': bool(prediction.item()),
            'toxic_probability': probabilities[0][1].item(),
            'confidence': probabilities[0].max().item()
        }


# Usage
detector = ToxicLanguageDetector()
result = detector.predict("Merhaba, nasılsın?")
print(result)
```

## Limitations and Biases

### Limitations
- The model's performance depends heavily on the training data quality and coverage
- May have difficulty with context-dependent toxicity (sarcasm, irony)
- Performance may vary across different Turkish dialects or informal language
- Shorter texts might be more challenging to classify accurately

### Potential Biases
- The model may reflect biases present in the training dataset
- Certain topics, demographics, or linguistic patterns might be over- or under-represented
- Regular evaluation and bias testing are recommended for production use
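
A simple starting point for bias testing is to compare error rates across slices of a labeled evaluation set. The sketch below assumes you already have records of the form `(group, true_label, predicted_label)`; the group names, the records, and the helper names are hypothetical.

```python
from collections import defaultdict


def confusion_by_group(records):
    """Count TP/FP/TN/FN per group from (group, true_label, predicted_label)."""
    counts = defaultdict(lambda: {"tp": 0, "fp": 0, "tn": 0, "fn": 0})
    for group, y_true, y_pred in records:
        if y_true == 1:
            key = "tp" if y_pred == 1 else "fn"
        else:
            key = "fp" if y_pred == 1 else "tn"
        counts[group][key] += 1
    return dict(counts)


def false_positive_rate(c):
    """FP / (FP + TN); returns None when the group has no negative examples."""
    negatives = c["fp"] + c["tn"]
    return c["fp"] / negatives if negatives else None


# Hypothetical evaluation records, not results for this model.
records = [
    ("formal", 0, 0), ("formal", 0, 0), ("formal", 1, 1),
    ("slang", 0, 1), ("slang", 0, 0), ("slang", 1, 1),
]
per_group = confusion_by_group(records)
for group, c in per_group.items():
    print(group, c, false_positive_rate(c))
```

A large gap in false positive rate between groups suggests that non-toxic text from one group is being flagged more often, which is worth investigating before deployment.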

## Ethical Considerations

- This model should be used responsibly with human oversight
- False positives and negatives are expected and should be accounted for
- Consider the impact on freedom of expression when implementing automated moderation
- Regular auditing and updating are recommended to maintain fairness

## Technical Specifications

- **Input**: Turkish text strings; the usage examples truncate inputs to 256 tokens
- **Output**: Binary classification with probability scores
- **Model Size**: BERT-base architecture, inherited from the base model
- **Inference**: Runs on CPU or GPU; the custom class example selects CUDA when available
- **Memory Requirements**: Comparable to other BERT-base models
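
Because inputs are cut off at 256 tokens, toxic content late in a long text can be missed. One possible workaround, sketched below, is to score overlapping windows and keep the highest toxic probability. This is an assumption about how one might handle long inputs, not part of the model: `sliding_windows`, `max_toxic_probability`, and the stand-in scorer are hypothetical, and the default window of 254 assumes two positions are reserved for the `[CLS]` and `[SEP]` tokens.

```python
def sliding_windows(token_ids, window=254, stride=127):
    """Yield overlapping slices of at most `window` tokens."""
    if window <= 0 or stride <= 0:
        raise ValueError("window and stride must be positive")
    if len(token_ids) <= window:
        yield token_ids
        return
    start = 0
    while True:
        yield token_ids[start:start + window]
        if start + window >= len(token_ids):
            break
        start += stride


def max_toxic_probability(token_ids, score_window, window=254, stride=127):
    """Score every window with `score_window` and return the highest score."""
    return max(score_window(w) for w in sliding_windows(token_ids, window, stride))


# Stand-in scorer for illustration: pretends token id 999 marks toxic content.
def fake_scorer(window_ids):
    return 1.0 if 999 in window_ids else 0.0


tokens = [0] * 600
tokens[500] = 999  # beyond the first 256 tokens, so plain truncation would miss it
print(max_toxic_probability(tokens, fake_scorer))
```

In practice `score_window` would wrap a model call that returns the toxic probability for one window. Taking the maximum makes the check more sensitive, which may also raise the false positive rate on long texts.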

## Citation

If you use this model in your research or applications, please cite:

```bibtex
@misc{meowml_toxicbert_2024,
  title={ToxicBERT: Turkish Toxic Language Detection},
  author={MeowML},
  year={2024},
  publisher={Hugging Face},
  url={https://huggingface.co/MeowML/ToxicBERT}
}
```

## Acknowledgments

- Base model: `dbmdz/bert-base-turkish-cased`
- Training dataset: `Overfit-GM/turkish-toxic-language`
- Built with Hugging Face Transformers library

## Contact

For questions, issues, or suggestions, please open an issue in the model repository or contact the MeowML team.

---

**Disclaimer**: This model is provided for research and educational purposes. Users are responsible for ensuring appropriate and ethical use in their applications.