---
license: mit
datasets:
- Overfit-GM/turkish-toxic-language
language:
- tr
base_model:
- dbmdz/bert-base-turkish-cased
pipeline_tag: text-classification
library_name: transformers
tags:
- text-classification
- toxicity-detection
- turkish
- bert
- nlp
- content-moderation
---
# MeowML/ToxicBERT - Turkish Toxic Language Detection
## Model Description
ToxicBERT is a fine-tuned BERT model specifically designed for detecting toxic language in Turkish text. Built upon the `dbmdz/bert-base-turkish-cased` foundation model, this classifier can identify potentially harmful, offensive, or toxic content in Turkish social media posts, comments, and general text.
## Model Details
- **Model Type**: Text Classification (Binary)
- **Language**: Turkish (tr)
- **Base Model**: `dbmdz/bert-base-turkish-cased`
- **License**: MIT
- **Library**: Transformers
- **Task**: Toxicity Detection
## Intended Use
### Primary Use Cases
- Content moderation for Turkish social media platforms (see the sketch after this list)
- Automated filtering of user-generated content
- Research in Turkish NLP and toxicity detection
- Educational purposes for understanding toxic language patterns
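For the content-moderation use case, a minimal filtering sketch using the `transformers` pipeline API. Loading the tokenizer from the base model mirrors the usage examples later in this card; the `LABEL_1` name for the toxic class is an assumption and should be verified against `model.config.id2label`.

```python
from transformers import pipeline

# Classification pipeline; the tokenizer comes from the base model,
# as in the usage examples later in this card.
classifier = pipeline(
    "text-classification",
    model="MeowML/ToxicBERT",
    tokenizer="dbmdz/bert-base-turkish-cased",
)

comments = ["Merhaba, nasılsın?", "Bugün hava çok güzel."]
for comment, result in zip(comments, classifier(comments, truncation=True)):
    # NOTE: "LABEL_1" == toxic is an assumption; check model.config.id2label.
    flagged = result["label"] == "LABEL_1"
    label = "FLAG" if flagged else "OK"
    print(f"{label:<4} ({result['score']:.2f}) {comment}")
```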
### Out-of-Scope Use
- This model should not be used as the sole decision-maker for content moderation without human oversight
- Not suitable for languages other than Turkish
- Should not be used for sensitive applications without proper validation and testing
## Training Data
The model was trained on the `Overfit-GM/turkish-toxic-language` dataset, which contains Turkish text samples labeled for toxicity. The dataset includes various forms of toxic content commonly found in online Turkish communications.
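To inspect the corpus, a minimal sketch using the `datasets` library; the `train` split and the record layout are assumptions about the dataset schema, so check the dataset page before relying on them.

```python
from datasets import load_dataset

# Download the corpus from the Hugging Face Hub
ds = load_dataset("Overfit-GM/turkish-toxic-language")

# Split and column names are assumptions; print the schema to verify
print(ds)
print(ds["train"][0])
```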
## Model Output
For each input text, the model returns:
- **Binary label**: 0 (non-toxic) or 1 (toxic)
- **Confidence score**: the softmax probability of the predicted class
- **Toxic probability**: the probability assigned to the toxic class
## Usage
### Quick Start
```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Load model and tokenizer
tokenizer = AutoTokenizer.from_pretrained("dbmdz/bert-base-turkish-cased")
model = AutoModelForSequenceClassification.from_pretrained("MeowML/ToxicBERT")

# Prepare text
text = "Merhaba, nasılsın?"
inputs = tokenizer(text, return_tensors="pt", truncation=True, padding=True, max_length=256)

# Get prediction
with torch.no_grad():
    outputs = model(**inputs)

probabilities = torch.nn.functional.softmax(outputs.logits, dim=-1)
prediction = torch.argmax(probabilities, dim=-1)

toxic_probability = probabilities[0][1].item()
is_toxic = bool(prediction.item())

print(f"Is toxic: {is_toxic}")
print(f"Toxic probability: {toxic_probability:.4f}")
```
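To score many texts at once, a batched variant of the snippet above; it pads to the longest text in the batch rather than a fixed length.

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("dbmdz/bert-base-turkish-cased")
model = AutoModelForSequenceClassification.from_pretrained("MeowML/ToxicBERT")

texts = ["Merhaba, nasılsın?", "Bugün hava çok güzel."]
inputs = tokenizer(texts, return_tensors="pt", truncation=True, padding=True, max_length=256)

with torch.no_grad():
    logits = model(**inputs).logits

# Column 1 holds the toxic-class probability for each text
toxic_probs = torch.nn.functional.softmax(logits, dim=-1)[:, 1]
for text, p in zip(texts, toxic_probs):
    print(f"{p.item():.4f}  {text}")
```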
### Advanced Usage with Custom Class
```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

class ToxicLanguageDetector:
    def __init__(self, model_name="MeowML/ToxicBERT"):
        # The tokenizer is loaded from the Turkish BERT base model,
        # as in the quick-start example
        self.tokenizer = AutoTokenizer.from_pretrained("dbmdz/bert-base-turkish-cased")
        self.model = AutoModelForSequenceClassification.from_pretrained(model_name)
        self.device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
        self.model.to(self.device)
        self.model.eval()

    def predict(self, text):
        inputs = self.tokenizer(
            text,
            truncation=True,
            padding='max_length',
            max_length=256,
            return_tensors='pt'
        ).to(self.device)

        with torch.no_grad():
            outputs = self.model(**inputs)
            probabilities = torch.nn.functional.softmax(outputs.logits, dim=-1)
            prediction = torch.argmax(probabilities, dim=-1)

        return {
            'text': text,
            'is_toxic': bool(prediction.item()),
            'toxic_probability': probabilities[0][1].item(),
            'confidence': probabilities[0].max().item()
        }

# Usage
detector = ToxicLanguageDetector()
result = detector.predict("Merhaba, nasılsın?")
print(result)
```
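Note that `predict` pads every input to the full 256 tokens (`padding='max_length'`), which keeps tensor shapes fixed but spends compute on padding for short texts; for throughput-sensitive workloads, dynamic padding (`padding=True`, as in the quick-start snippet) is usually faster.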
## Limitations and Biases
### Limitations
- The model's performance depends heavily on the training data quality and coverage
- May have difficulty with context-dependent toxicity (sarcasm, irony)
- Performance may vary across different Turkish dialects or informal language
- Shorter texts might be more challenging to classify accurately
### Potential Biases
- The model may reflect biases present in the training dataset
- Certain topics, demographics, or linguistic patterns might be over- or under-represented
- Regular evaluation and bias testing are recommended for production use; a minimal sketch follows
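As one form of such a periodic check, a hedged sketch that computes accuracy and F1 on a labeled sample. The `train` split and the `text`/`label` column names are assumptions about the dataset schema and must be verified first.

```python
import torch
from datasets import load_dataset
from sklearn.metrics import accuracy_score, f1_score
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("dbmdz/bert-base-turkish-cased")
model = AutoModelForSequenceClassification.from_pretrained("MeowML/ToxicBERT")
model.eval()

# Split and column names ("train", "text", "label") are assumptions;
# verify against the dataset schema before running.
sample = load_dataset("Overfit-GM/turkish-toxic-language", split="train")
sample = sample.shuffle(seed=0).select(range(256))

preds = []
for start in range(0, len(sample), 32):
    batch = sample[start:start + 32]  # slicing returns a dict of columns
    inputs = tokenizer(batch["text"], return_tensors="pt",
                       truncation=True, padding=True, max_length=256)
    with torch.no_grad():
        preds.extend(model(**inputs).logits.argmax(dim=-1).tolist())

print("accuracy:", accuracy_score(sample["label"], preds))
print("f1:", f1_score(sample["label"], preds))
```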
## Ethical Considerations
- This model should be used responsibly with human oversight
- False positives and negatives are expected and should be accounted for
- Consider the impact on freedom of expression when implementing automated moderation
- Regular auditing and updating are recommended to maintain fairness
## Technical Specifications
- **Input**: Text strings (max 256 tokens)
- **Output**: Binary classification with probability scores
- **Model Size**: BERT-base architecture (roughly 110M parameters)
- **Inference**: Runs on both CPU and GPU
- **Memory Requirements**: Weights are well under 1 GB; inference fits on standard hardware
## Citation
If you use this model in your research or applications, please cite:
```bibtex
@misc{meowml_toxicbert_2024,
  title={ToxicBERT: Turkish Toxic Language Detection},
  author={MeowML},
  year={2024},
  publisher={Hugging Face},
  url={https://huggingface.co/MeowML/ToxicBERT}
}
```
## Acknowledgments
- Base model: `dbmdz/bert-base-turkish-cased`
- Training dataset: `Overfit-GM/turkish-toxic-language`
- Built with Hugging Face Transformers library
## Contact
For questions, issues, or suggestions, please open an issue in the model repository or contact the MeowML team.
---
**Disclaimer**: This model is provided for research and educational purposes. Users are responsible for ensuring appropriate and ethical use in their applications.