BERT Fine-tuned for IMDB Sentiment Analysis

This model is a fine-tuned version of bert-base-uncased on the IMDB movie reviews dataset for sentiment analysis (binary classification). It can predict whether a movie review is positive or negative.

Model description

  • Model type: BERT (bert-base-uncased)
  • Parameters: ~110M (FP32, Safetensors)
  • Language: English
  • Task: Sentiment Analysis
  • Training Dataset: IMDB Movie Reviews
  • License: MIT

Training Hyperparameters

The model was trained with the following parameters (a minimal reproduction sketch follows the list):

  • Learning rate: 2e-5
  • Batch size: 16
  • Number of epochs: 3
  • Weight decay: 0.01
  • Maximum sequence length: 64
  • Training samples: 2000 (balanced: 1000 positive, 1000 negative)
  • Optimizer: AdamW
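
For reference, a fine-tuning run with these hyperparameters can be sketched with the Hugging Face Trainer API. This is a minimal approximation, not the exact script used to produce this model; the output directory and sampling seed are illustrative, and the balanced 1000/1000 split is detailed in the Training Data section below.

from datasets import load_dataset
from transformers import (
    AutoModelForSequenceClassification,
    AutoTokenizer,
    Trainer,
    TrainingArguments,
)

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2
)

# Tokenize with the 64-token cap used during fine-tuning
def tokenize(batch):
    return tokenizer(batch["text"], truncation=True,
                     padding="max_length", max_length=64)

# Simple shuffled subsets (seed illustrative); see Training Data
# below for the balanced 1000/1000 selection
imdb = load_dataset("imdb")
train_dataset = imdb["train"].shuffle(seed=42).select(range(2000)).map(tokenize, batched=True)
eval_dataset = imdb["test"].shuffle(seed=42).select(range(500)).map(tokenize, batched=True)

# Hyperparameters from the list above; Trainer's default optimizer is AdamW
training_args = TrainingArguments(
    output_dir="finetuned-bert-imdb",
    learning_rate=2e-5,
    per_device_train_batch_size=16,
    num_train_epochs=3,
    weight_decay=0.01,
)

trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=train_dataset,
    eval_dataset=eval_dataset,
)
trainer.train()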

Training Results

  • Accuracy on test set: 80.2%
  • Final training loss: 0.381

Intended uses & limitations

Intended uses

This model is designed for:

  • Sentiment analysis of movie reviews and similar text content
  • Binary classification (positive/negative) of English text
  • Research and educational purposes

Limitations

  • The model is trained on movie reviews and may not perform as well on other domains
  • Limited to English-language text
  • Maximum input length is 512 tokens; fine-tuning used sequences truncated to 64 tokens, so accuracy may degrade on long inputs
  • May exhibit biases present in the training data

How to use

Here's how to use the model with PyTorch:

from transformers import AutoModelForSequenceClassification, AutoTokenizer
import torch

# Load model and tokenizer
model = AutoModelForSequenceClassification.from_pretrained("xanderIV/finetuned-bert-imdb")
tokenizer = AutoTokenizer.from_pretrained("xanderIV/finetuned-bert-imdb")

# Prepare your text
texts = [
    "This movie was fantastic! Great acting and amazing plot.",
    "Terrible waste of time. Poor acting and confusing story."
]

# Tokenize the input
inputs = tokenizer(texts, return_tensors="pt", padding=True, truncation=True)

# Get predictions
with torch.no_grad():
    outputs = model(**inputs)
    predictions = torch.softmax(outputs.logits, dim=1)
    labels = torch.argmax(predictions, dim=1)

# Process results
for text, pred, probs in zip(texts, labels, predictions):
    sentiment = "positive" if pred.item() == 1 else "negative"
    confidence = probs[pred].item() * 100
    print(f"\nText: {text}")
    print(f"Sentiment: {sentiment} (confidence: {confidence:.1f}%)")

Example Outputs

Text: This movie was fantastic! Great acting and amazing plot.
Sentiment: positive (confidence: 97.7%)

Text: Terrible waste of time. Poor acting and confusing story.
Sentiment: negative (confidence: 98.4%)
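
Alternatively, the model can be used through the high-level pipeline API. One caveat, stated as an assumption: unless the model config defines id2label names, the pipeline reports generic labels such as LABEL_0 and LABEL_1; per the convention in the code above, label 1 corresponds to positive.

from transformers import pipeline

classifier = pipeline("text-classification", model="xanderIV/finetuned-bert-imdb")

print(classifier("This movie was fantastic! Great acting and amazing plot."))
# e.g. [{'label': 'LABEL_1', 'score': 0.977}]  (LABEL_1 = positive, assuming
# the config does not define human-readable label names)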

Training Data

The model was fine-tuned on a subset of the IMDB dataset (a reproduction sketch follows the list):

  • 2000 training examples (1000 positive, 1000 negative reviews)
  • 500 test examples
  • Reviews were truncated to 64 tokens to optimize training speed
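
A subset along these lines can be reconstructed with the datasets library; the exact sampling and seed used for this model are not documented, so treat the following as an approximation:

from datasets import load_dataset, concatenate_datasets

imdb = load_dataset("imdb")

# Balanced training set: 1000 positive + 1000 negative reviews
pos = imdb["train"].filter(lambda x: x["label"] == 1).select(range(1000))
neg = imdb["train"].filter(lambda x: x["label"] == 0).select(range(1000))
train_dataset = concatenate_datasets([pos, neg]).shuffle(seed=42)

# 500 held-out test examples
test_dataset = imdb["test"].shuffle(seed=42).select(range(500))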

Evaluation Results

The model achieved the following results on the test set:

  • Accuracy: 80.2%
  • Test loss: 0.482
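
The evaluation script itself is not published; the following is a minimal sketch of how such an accuracy figure can be computed with the evaluate library (the 500-example sample and seed are illustrative):

import torch
import evaluate
from datasets import load_dataset
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model = AutoModelForSequenceClassification.from_pretrained("xanderIV/finetuned-bert-imdb")
tokenizer = AutoTokenizer.from_pretrained("xanderIV/finetuned-bert-imdb")
metric = evaluate.load("accuracy")

# Illustrative 500-example sample; the exact evaluation split is not documented
test_set = load_dataset("imdb")["test"].shuffle(seed=42).select(range(500))

model.eval()
for i in range(0, len(test_set), 16):
    batch = test_set[i : i + 16]  # dict of lists: {"text": [...], "label": [...]}
    inputs = tokenizer(batch["text"], return_tensors="pt",
                       padding=True, truncation=True)
    with torch.no_grad():
        preds = model(**inputs).logits.argmax(dim=1)
    metric.add_batch(predictions=preds.tolist(), references=batch["label"])

print(metric.compute())  # e.g. {'accuracy': 0.802}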

Bias & Limitations

This model may exhibit biases inherent to the IMDB dataset:

  • Movie-specific vocabulary and expressions
  • Cultural biases in movie reviews
  • English-language bias
  • Internet and entertainment domain bias

Citation

If you use this model, please cite:

@misc{finetuned-bert-imdb,
  author = {xanderIV},
  title = {BERT Fine-tuned for IMDB Sentiment Analysis},
  year = {2025},
  publisher = {Hugging Face},
  howpublished = {\url{https://huggingface.co/xanderIV/finetuned-bert-imdb}}
}