BERT Fine-tuned for IMDB Sentiment Analysis

This model is a fine-tuned version of bert-base-uncased on the IMDB movie reviews dataset for sentiment analysis (binary classification). It can predict whether a movie review is positive or negative.

Model description

  • Model type: BERT (bert-base-uncased)
  • Parameters: ~110M (FP32, Safetensors)
  • Language: English
  • Task: Sentiment Analysis
  • Training Dataset: IMDB Movie Reviews
  • License: MIT

Training Hyperparameters

The model was trained with the following parameters (a minimal reproduction sketch follows the list):

  • Learning rate: 2e-5
  • Batch size: 16
  • Number of epochs: 3
  • Weight decay: 0.01
  • Maximum sequence length: 64
  • Training samples: 2000 (balanced: 1000 positive, 1000 negative)
  • Optimizer: AdamW
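
For reference, a fine-tuning run with these hyperparameters can be sketched with the Hugging Face Trainer API. This is a minimal approximation, not the exact script used to produce this model; the output directory and sampling seed are illustrative, and the balanced 1000/1000 split is detailed in the Training Data section below.

from datasets import load_dataset
from transformers import (
    AutoModelForSequenceClassification,
    AutoTokenizer,
    Trainer,
    TrainingArguments,
)

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2
)

# Tokenize with the 64-token cap used during fine-tuning
def tokenize(batch):
    return tokenizer(batch["text"], truncation=True,
                     padding="max_length", max_length=64)

# Simple shuffled subsets (seed illustrative); see Training Data
# below for the balanced 1000/1000 selection
imdb = load_dataset("imdb")
train_dataset = imdb["train"].shuffle(seed=42).select(range(2000)).map(tokenize, batched=True)
eval_dataset = imdb["test"].shuffle(seed=42).select(range(500)).map(tokenize, batched=True)

# Hyperparameters from the list above; Trainer's default optimizer is AdamW
training_args = TrainingArguments(
    output_dir="finetuned-bert-imdb",
    learning_rate=2e-5,
    per_device_train_batch_size=16,
    num_train_epochs=3,
    weight_decay=0.01,
)

trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=train_dataset,
    eval_dataset=eval_dataset,
)
trainer.train()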

Training Results

  • Accuracy on test set: 80.2%
  • Final training loss: 0.381

Intended uses & limitations

Intended uses

This model is designed for:

  • Sentiment analysis of movie reviews and similar text content
  • Binary classification (positive/negative) of English text
  • Research and educational purposes

Limitations

  • The model is trained on movie reviews and may not perform as well on other domains
  • Limited to English-language text
  • Maximum input length is 512 tokens; fine-tuning used sequences truncated to 64 tokens, so accuracy may degrade on long inputs
  • May exhibit biases present in the training data

How to use

Here's how to use the model with PyTorch:

from transformers import AutoModelForSequenceClassification, AutoTokenizer
import torch

# Load model and tokenizer
model = AutoModelForSequenceClassification.from_pretrained("xanderIV/finetuned-bert-imdb")
tokenizer = AutoTokenizer.from_pretrained("xanderIV/finetuned-bert-imdb")

# Prepare your text
texts = [
    "This movie was fantastic! Great acting and amazing plot.",
    "Terrible waste of time. Poor acting and confusing story."
]

# Tokenize the input
inputs = tokenizer(texts, return_tensors="pt", padding=True, truncation=True)

# Get predictions
with torch.no_grad():
    outputs = model(**inputs)
    predictions = torch.softmax(outputs.logits, dim=1)
    labels = torch.argmax(predictions, dim=1)

# Process results
for text, pred, probs in zip(texts, labels, predictions):
    sentiment = "positive" if pred.item() == 1 else "negative"
    confidence = probs[pred].item() * 100
    print(f"\nText: {text}")
    print(f"Sentiment: {sentiment} (confidence: {confidence:.1f}%)")

Example Outputs

Text: This movie was fantastic! Great acting and amazing plot.
Sentiment: positive (confidence: 97.7%)

Text: Terrible waste of time. Poor acting and confusing story.
Sentiment: negative (confidence: 98.4%)
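
Alternatively, the model can be used through the high-level pipeline API. One caveat, stated as an assumption: unless the model config defines id2label names, the pipeline reports generic labels such as LABEL_0 and LABEL_1; per the convention in the code above, label 1 corresponds to positive.

from transformers import pipeline

classifier = pipeline("text-classification", model="xanderIV/finetuned-bert-imdb")

print(classifier("This movie was fantastic! Great acting and amazing plot."))
# e.g. [{'label': 'LABEL_1', 'score': 0.977}]  (LABEL_1 = positive, assuming
# the config does not define human-readable label names)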

Training Data

The model was fine-tuned on a subset of the IMDB dataset (a reproduction sketch follows the list):

  • 2000 training examples (1000 positive, 1000 negative reviews)
  • 500 test examples
  • Reviews were truncated to 64 tokens to optimize training speed
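
A subset along these lines can be reconstructed with the datasets library; the exact sampling and seed used for this model are not documented, so treat the following as an approximation:

from datasets import load_dataset, concatenate_datasets

imdb = load_dataset("imdb")

# Balanced training set: 1000 positive + 1000 negative reviews
pos = imdb["train"].filter(lambda x: x["label"] == 1).select(range(1000))
neg = imdb["train"].filter(lambda x: x["label"] == 0).select(range(1000))
train_dataset = concatenate_datasets([pos, neg]).shuffle(seed=42)

# 500 held-out test examples
test_dataset = imdb["test"].shuffle(seed=42).select(range(500))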

Evaluation Results

The model achieved the following results on the test set:

  • Accuracy: 80.2%
  • Test loss: 0.482
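
The evaluation script itself is not published; the following is a minimal sketch of how such an accuracy figure can be computed with the evaluate library (the 500-example sample and seed are illustrative):

import torch
import evaluate
from datasets import load_dataset
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model = AutoModelForSequenceClassification.from_pretrained("xanderIV/finetuned-bert-imdb")
tokenizer = AutoTokenizer.from_pretrained("xanderIV/finetuned-bert-imdb")
metric = evaluate.load("accuracy")

# Illustrative 500-example sample; the exact evaluation split is not documented
test_set = load_dataset("imdb")["test"].shuffle(seed=42).select(range(500))

model.eval()
for i in range(0, len(test_set), 16):
    batch = test_set[i : i + 16]  # dict of lists: {"text": [...], "label": [...]}
    inputs = tokenizer(batch["text"], return_tensors="pt",
                       padding=True, truncation=True)
    with torch.no_grad():
        preds = model(**inputs).logits.argmax(dim=1)
    metric.add_batch(predictions=preds.tolist(), references=batch["label"])

print(metric.compute())  # e.g. {'accuracy': 0.802}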

Bias & Limitations

This model may exhibit biases inherent to the IMDB dataset:

  • Movie-specific vocabulary and expressions
  • Cultural biases in movie reviews
  • English-language bias
  • Internet and entertainment domain bias

Citation

If you use this model, please cite:

@misc{finetuned-bert-imdb,
  author = {xanderIV},
  title = {BERT Fine-tuned for IMDB Sentiment Analysis},
  year = {2025},
  publisher = {Hugging Face},
  howpublished = {\url{https://huggingface.co/xanderIV/finetuned-bert-imdb}}
}