---
library_name: transformers
tags:
- sentiment-analysis
- distilbert
- text-classification
- nlp
- imdb
- binary-classification
license: mit
datasets:
- stanfordnlp/imdb
language:
- en
metrics:
- accuracy
base_model:
- distilbert/distilbert-base-uncased
---
# Model Card for AfroLogicInsect/sentiment-analysis-model
A fine-tuned DistilBERT model for binary sentiment analysis — predicting whether input text expresses a positive or negative sentiment. Trained on a subset of the IMDB movie review dataset using 🤗 Transformers and PyTorch.
## Model Details
### Model Description
This model was trained by Daniel (AfroLogicInsect) to classify the sentiment of movie reviews. It builds on the distilbert-base-uncased architecture and was fine-tuned for three epochs on a 7,500-example English-language subset of the IMDB dataset (~5,000 training / ~2,500 validation examples). The model accepts raw text and returns a sentiment label with a confidence score.
- **Developed by:** Daniel 🇳🇬 (@AfroLogicInsect)
- **Funded by:** [More Information Needed]
- **Shared by:** [More Information Needed]
- **Model type:** DistilBERT-based sequence classification
- **Language(s) (NLP):** English
- **License:** MIT
- **Finetuned from model:** distilbert-base-uncased
### Model Sources [optional]
- **Repository:** https://huggingface.co/AfroLogicInsect/sentiment-analysis-model
- **Paper [optional]:** [More Information Needed]
- **Demo [optional]:** [Sentiment Analyzer (Gradio Space)](https://huggingface.co/spaces/AfroLogicInsect/sentiment-analysis-model-gradio)
## Uses
### Direct Use
- Sentiment analysis of short texts, reviews, feedback forms, etc.
- Embedding in web apps or chatbots to assess user mood or response tone
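To illustrate the web-app use above, here is a minimal sketch that serves the classifier behind an HTTP endpoint. It assumes FastAPI and uvicorn are installed; the endpoint path, request schema, and file name are illustrative and not part of this repository.

```python
# Sketch: serving the classifier behind an HTTP endpoint (assumes fastapi, uvicorn, transformers).
from fastapi import FastAPI
from pydantic import BaseModel
from transformers import pipeline

app = FastAPI()
classifier = pipeline(
    "sentiment-analysis",
    model="AfroLogicInsect/sentiment-analysis-model",
)

class SentimentRequest(BaseModel):
    text: str

@app.post("/sentiment")
def predict(req: SentimentRequest):
    # The pipeline returns a list with one dict per input text.
    result = classifier(req.text)[0]
    return {"label": result["label"], "score": float(result["score"])}

# Run with: uvicorn app:app --reload   (assuming this file is saved as app.py)
```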
### Downstream Use [optional]
- Can be incorporated into feedback categorization pipelines
- Extended to multilingual sentiment tasks with additional fine-tuning
### Out-of-Scope Use
- Not intended for clinical sentiment/emotion assessment
- Does not reliably capture sarcasm or highly ambiguous language
## Bias, Risks, and Limitations
- Biases may be inherited from the IMDB dataset (e.g. genre or cultural bias)
- Model trained on movie reviews — performance may drop on domain-specific texts like legal or medical writing
- Scores represent probabilities, not certainty
### Recommendations
- Apply a confidence threshold to the returned score when deploying in production (see the sketch below)
- Consider further fine-tuning on in-domain data for robustness
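A minimal sketch of the thresholding recommendation above. The 0.8 cut-off and the `"uncertain"` fallback are illustrative choices, not tuned values from this model's development.

```python
from transformers import pipeline

classifier = pipeline("sentiment-analysis", model="AfroLogicInsect/sentiment-analysis-model")

CONFIDENCE_THRESHOLD = 0.8  # example value; tune on in-domain data

def classify_with_threshold(text: str) -> str:
    result = classifier(text)[0]
    # Treat low-confidence predictions as "uncertain" instead of trusting the label.
    if result["score"] < CONFIDENCE_THRESHOLD:
        return "uncertain"
    return result["label"]

print(classify_with_threshold("The movie was okay, nothing special."))
```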
## How to Get Started with the Model
```python
from transformers import pipeline
classifier = pipeline("sentiment-analysis", model="AfroLogicInsect/sentiment-analysis-model")
result = classifier("Absolutely loved it!")
print(result)
```
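The pipeline returns a list with one dictionary per input, each containing a `label` and a `score` (the softmax probability of the predicted class), e.g. `[{'label': ..., 'score': 0.99}]`; the exact label strings depend on the `id2label` mapping stored in the model's config.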
## Training Details
### Training Data
- Subset of stanfordnlp/imdb
- Balanced binary classes (positive and negative)
- Sample size: ~5,000 training / 2,500 validation
### Training Procedure
- Texts were tokenized with `AutoTokenizer.from_pretrained("distilbert-base-uncased")`
- Padding: max_length=256
- Loss: CrossEntropy
- Optimizer: AdamW
#### Training Hyperparameters
- Epochs: 3
- Batch size: 4
- Max length: 256
- Precision: fp32 (mixed precision not used)
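A minimal sketch of the setup described above, assuming the 🤗 `Trainer` API. Epochs, batch size, and `max_length` come from this card; the data sampling, learning rate, and other arguments are not documented here and are shown as defaults or placeholders.

```python
# Sketch of the fine-tuning setup (not the exact original script).
from datasets import load_dataset
from transformers import (
    AutoModelForSequenceClassification,
    AutoTokenizer,
    Trainer,
    TrainingArguments,
)

tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained("distilbert-base-uncased", num_labels=2)

dataset = load_dataset("stanfordnlp/imdb")

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, padding="max_length", max_length=256)

# Assumption: a roughly balanced ~5,000 / 2,500 subset; the exact sampling is not documented.
train_ds = dataset["train"].shuffle(seed=42).select(range(5000)).map(tokenize, batched=True)
eval_ds = dataset["test"].shuffle(seed=42).select(range(2500)).map(tokenize, batched=True)

args = TrainingArguments(
    output_dir="sentiment-analysis-model",
    num_train_epochs=3,             # from this card
    per_device_train_batch_size=4,  # from this card
    per_device_eval_batch_size=4,
)

# Trainer uses cross-entropy loss and the AdamW optimizer by default,
# matching the procedure described above.
trainer = Trainer(model=model, args=args, train_dataset=train_ds, eval_dataset=eval_ds)
trainer.train()
```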
## Evaluation
### Testing Data, Factors & Metrics
#### Testing Data
- Validation set from IMDB subset
#### Metrics
| Metric    | Score |
|-----------|-------|
| Accuracy  | 93.1% |
| F1 Score  | 92.5% |
| Precision | 93.0% |
| Recall    | 91.8% |
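For reference, metrics of this kind could be recomputed along the lines of the sketch below, using scikit-learn on an IMDB test slice. The exact 2,500-example validation split behind the numbers above is not published, so results will differ; the label-string mapping is an assumption and should be adjusted to the model's `id2label`.

```python
# Sketch: recomputing accuracy / F1 / precision / recall on an IMDB slice.
from datasets import load_dataset
from sklearn.metrics import accuracy_score, f1_score, precision_score, recall_score
from transformers import pipeline

classifier = pipeline("sentiment-analysis", model="AfroLogicInsect/sentiment-analysis-model")

eval_ds = load_dataset("stanfordnlp/imdb", split="test").shuffle(seed=42).select(range(2500))

preds = classifier(eval_ds["text"], batch_size=16, truncation=True)
# Assumption: map predicted label strings to {0, 1}; adjust to the model's id2label mapping.
label_to_id = {"NEGATIVE": 0, "LABEL_0": 0, "POSITIVE": 1, "LABEL_1": 1}
y_pred = [label_to_id[p["label"].upper()] for p in preds]
y_true = eval_ds["label"]

print("accuracy :", accuracy_score(y_true, y_pred))
print("f1       :", f1_score(y_true, y_pred))
print("precision:", precision_score(y_true, y_pred))
print("recall   :", recall_score(y_true, y_pred))
```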
### Results [Sample]
Sample pipeline output (run on `cuda:0`):

| Text | Predicted sentiment | Confidence |
|------|---------------------|------------|
| I loved this movie! It was absolutely fantastic! | Negative | 0.9991 |
| This movie was terrible, completely boring. | Negative | 0.9995 |
| The movie was okay, nothing special. | Negative | 0.9995 |
| I loved this movie! | Negative | 0.9966 |
| It was absolutely fantastic! | Negative | 0.9940 |
#### Summary
The model performs well on balanced sentiment data and generalizes across a variety of movie-review tones. Performance may vary on texts with unusual vocabulary or sarcasm.
## 🧪 Live Demo
Try it out below!
👉 [Launch Sentiment Analyzer](https://huggingface.co/spaces/AfroLogicInsect/sentiment-analysis-model-gradio)
## Environmental Impact
Carbon footprint estimated using the [ML Impact Calculator](https://mlco2.github.io/impact#compute):
- Hardware type: single NVIDIA T4 GPU
- Hours used: ~2.5
- Cloud provider: Google Colab
- Compute region: Europe
- Carbon emitted: ~0.3 kg CO₂eq
## Technical Specifications [optional]
### Model Architecture and Objective
DistilBERT with a classification head trained for binary text classification.
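The classification head's configuration can be inspected directly from the Hub; a small sketch (the printed values depend on how the config was saved):

```python
from transformers import AutoConfig

config = AutoConfig.from_pretrained("AfroLogicInsect/sentiment-analysis-model")
print(config.model_type)  # expected: "distilbert"
print(config.num_labels)  # expected: 2 (binary classification)
print(config.id2label)    # label names as stored in the config
```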
### Compute Infrastructure
- Hardware: Google Colab (GPU-backed)
- Software: Python, PyTorch, 🤗 Transformers, Hugging Face Hub
## Citation
Feel free to cite this model or reach out for collaborations!
**BibTeX:**
```bibtex
@misc{afrologicinsect2025sentiment,
  title        = {AfroLogicInsect Sentiment Analysis Model},
  author       = {Daniel from Nigeria},
  year         = {2025},
  howpublished = {\url{https://huggingface.co/AfroLogicInsect/sentiment-analysis-model}},
}
```
## Model Card Contact
- Name: Daniel (@AfroLogicInsect)
- Location: Lagos, Nigeria
- Contact: GitHub / Hugging Face / email (optional)