---
library_name: transformers
tags:
  - text-classification
  - sentiment-analysis
  - imdb
  - bert
  - colab
  - huggingface
  - fine-tuned
license: apache-2.0
---

# πŸ€– BERT IMDb Sentiment Classifier

A fine-tuned `bert-base-uncased` model for **binary sentiment classification** on the [IMDb movie reviews dataset](https://huggingface.co/datasets/imdb).  
Trained in Google Colab with Hugging Face Transformers; it reaches ~93% accuracy on the IMDb test split.

---

## πŸ“Œ Model Details

### Model Description

- **Developed by:** Shubham Swarnakar
- **Shared by:** [ShubhamSwarnakar](https://huggingface.co/ShubhamSwarnakar)
- **Model type:** `BertForSequenceClassification`
- **Language(s):** English πŸ‡ΊπŸ‡Έ
- **License:** Apache-2.0
- **Fine-tuned from:** [bert-base-uncased](https://huggingface.co/bert-base-uncased)

### Model Sources

- **Repository:** https://huggingface.co/ShubhamSwarnakar/bert-imdb-colab-model
- **Demo:** Available via Hugging Face Inference Widget

---

## βœ… Uses

### Direct Use

Use this model for **sentiment analysis** on English movie reviews or similar texts.  
It returns either a `positive` or a `negative` label.

### Downstream Use

Can be fine-tuned further for domain-specific sentiment classification tasks.

### Out-of-Scope Use

Not designed for:
- Multilingual sentiment analysis
- Nuanced emotion detection (e.g., joy, anger, sarcasm)
- Non-movie domains without re-training

---

## ⚠️ Bias, Risks, and Limitations

This model inherits potential biases from:
- Pretrained BERT weights
- IMDb dataset (may reflect demographic or cultural skew)

### Recommendations

Avoid deploying this model in high-risk applications without auditing or further fine-tuning. Misclassification risk exists, especially with ambiguous or sarcastic text.

---

## πŸš€ How to Get Started

```python
from transformers import pipeline

classifier = pipeline("sentiment-analysis", model="ShubhamSwarnakar/bert-imdb-colab-model")
classifier("This movie was surprisingly entertaining!")




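
If you need raw logits or finer control over batching and truncation, the checkpoint can also be loaded directly. The snippet below is a minimal sketch; it assumes the checkpoint's `id2label` config maps class ids to `positive`/`negative` names (inspect `model.config.id2label` if you see `LABEL_0`/`LABEL_1` instead).

```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_id = "ShubhamSwarnakar/bert-imdb-colab-model"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)
model.eval()

reviews = [
    "This movie was surprisingly entertaining!",
    "A dull, lifeless remake that wastes a talented cast.",
]

# Tokenize with truncation at BERT's 512-token limit and run a single forward pass.
inputs = tokenizer(reviews, padding=True, truncation=True, max_length=512, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits

# Map predicted class ids to label names via the checkpoint's config.
for review, pred in zip(reviews, logits.argmax(dim=-1)):
    print(model.config.id2label[pred.item()], "<-", review)
```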
🧠 Training Details
Training Data
Dataset: IMDb Dataset

Format: Binary sentiment (positive = 1, negative = 0)

Training Procedure
Preprocessing: Tokenized with BertTokenizerFast

Epochs: 3

Optimizer: AdamW

Scheduler: Linear LR

Batch size: 8

Trained using Colab with limited GPU resources
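This is not the original Colab notebook; it is an illustrative `Trainer` configuration under the hyperparameters listed above (3 epochs, AdamW with a linear schedule, batch size 8, `BertTokenizerFast`). The `compute_metrics` helper and the output directory name are assumptions.

```python
import numpy as np
from datasets import load_dataset
from transformers import (
    BertForSequenceClassification,
    BertTokenizerFast,
    Trainer,
    TrainingArguments,
)

# Load IMDb and tokenize, truncating to BERT's 512-token limit.
dataset = load_dataset("imdb")
tokenizer = BertTokenizerFast.from_pretrained("bert-base-uncased")

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=512)

tokenized = dataset.map(tokenize, batched=True)

model = BertForSequenceClassification.from_pretrained("bert-base-uncased", num_labels=2)

def compute_metrics(eval_pred):
    logits, labels = eval_pred
    return {"accuracy": float((np.argmax(logits, axis=-1) == labels).mean())}

# Trainer's defaults already match the described procedure:
# AdamW optimizer with a linear learning-rate schedule.
args = TrainingArguments(
    output_dir="bert-imdb-colab-model",  # illustrative path
    num_train_epochs=3,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=tokenized["train"],
    eval_dataset=tokenized["test"],
    tokenizer=tokenizer,  # enables dynamic padding via DataCollatorWithPadding
    compute_metrics=compute_metrics,
)
trainer.train()
print(trainer.evaluate())  # reports eval_accuracy on the IMDb test split
```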

πŸ“Š Evaluation
Metrics

Final test accuracy: 93.47%

Results Summary
Epoch	Validation Accuracy
1	        91.80%
2	        92.04%
3	        92.92%

Final test accuracy on held-out IMDb test split: 93.47%
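
To reproduce a comparable number, the pipeline can be scored against the IMDb test split with the `evaluate` library. This is a sketch rather than the original evaluation code; the batch size and the `positive`/`negative` label names are assumptions (check `model.config.id2label`).

```python
import evaluate
from datasets import load_dataset
from transformers import pipeline

test_set = load_dataset("imdb", split="test")
classifier = pipeline("sentiment-analysis", model="ShubhamSwarnakar/bert-imdb-colab-model")

# Long IMDb reviews exceed BERT's 512-token limit, so truncate at inference time.
predictions = classifier(test_set["text"], batch_size=16, truncation=True)

# Map the pipeline's label strings back to IMDb's integer labels (0 = negative, 1 = positive).
label2id = {"negative": 0, "positive": 1}
pred_ids = [label2id[p["label"].lower()] for p in predictions]

accuracy = evaluate.load("accuracy")
print(accuracy.compute(predictions=pred_ids, references=test_set["label"]))
```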

🌱 Environmental Impact
Estimated based on lightweight training:

Hardware Type: Google Colab GPU (T4)

Training Duration: ~2 hours

Cloud Provider: Google

Region: Unknown

Emissions Estimate: ~0.15 kg COβ‚‚eq

Estimate via ML CO2 Impact Calculator

πŸ—οΈ Technical Specifications
Architecture
BERT-base (12-layer, 768-hidden, 12-heads, 110M parameters)

Compute Infrastructure
Hardware: Google Colab with GPU

Software:

Python 3.11

Transformers 4.x

Datasets

PyTorch 2.x

πŸ“š Citation

@misc{shubhamswarnakar_bert_imdb_2025,
  author       = {Shubham Swarnakar},
  title        = {BERT IMDb Sentiment Classifier},
  year         = 2025,
  publisher    = {Hugging Face},
  howpublished = {\url{https://huggingface.co/ShubhamSwarnakar/bert-imdb-colab-model}},
}

πŸ™‹ More Info
For questions or collaboration, contact @ShubhamSwarnakar.