metadata

license: apache-2.0
datasets:
  - nhull/tripadvisor-split-dataset-v2
language:
  - en
base_model:
  - distilbert/distilbert-base-uncased
tags:
  - nlp
  - hotels
  - reviews
  - sentiment-analysis
  - transformers

DistilBERT Sentiment Analysis Model

Overview

This repository contains a DistilBERT model fine-tuned for sentiment analysis on TripAdvisor hotel reviews. Given the review text, the model predicts a sentiment label from 1 to 5.

Base Model: distilbert-base-uncased
Training Dataset: nhull/tripadvisor-split-dataset-v2
Use Case: Sentiment classification for customer reviews to derive insights into customer satisfaction.
Output: Sentiment labels (1-5).

Model Details

Learning Rate: 3e-05
Batch Size: 64
Epochs: up to 10 (with early stopping)
Early-Stopping Patience: 5 epochs without improvement
Tokenizer: distilbert-base-uncased
Framework: PyTorch + Hugging Face Transformers
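The interaction between the 10-epoch limit and the patience of 5 can be sketched in plain Python. This is an illustration only, not the training script used for this model: the card does not state which metric was monitored, so the sketch assumes validation loss, and the function name and loss values are hypothetical.

```python
def epochs_until_stop(val_losses, max_epochs=10, patience=5):
    """Return how many epochs run before training stops.

    val_losses: validation loss after each epoch (hypothetical values).
    Training stops after `max_epochs`, or earlier once `patience`
    consecutive epochs pass without a new best validation loss.
    Assumption: validation loss is the monitored metric.
    """
    best = float("inf")
    epochs_without_improvement = 0
    epochs_run = 0
    for loss in val_losses[:max_epochs]:
        epochs_run += 1
        if loss < best:
            # New best score: remember it and reset the patience counter.
            best = loss
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                break
    return epochs_run


# Hypothetical run: the best loss is reached at epoch 3, then no improvement.
stalled = [0.95, 0.90, 0.88, 0.89, 0.90, 0.91, 0.92, 0.93, 0.94, 0.95]
print(epochs_until_stop(stalled))
```

With these settings, a run whose best epoch is 5 or earlier can stop before epoch 10. A run that keeps improving is capped at 10 epochs.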

Intended Use

This model is designed to classify hotel reviews by sentiment. It assigns each review a star rating from 1 to 5.
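A minimal inference sketch with the Transformers `pipeline` API is shown below. Two things here are assumptions rather than facts from this card: `MODEL_ID` is a placeholder that must be replaced with the actual Hub repository id, and the label mapping assumes the default `LABEL_0` to `LABEL_4` names, with class index k meaning k + 1 stars. Check `model.config.id2label` before relying on that mapping.

```python
# Placeholder, not a real repository: replace with the actual Hub repo id.
MODEL_ID = "your-username/your-model-id"


def label_to_stars(label):
    """Map a default Transformers label name (e.g. "LABEL_3") to a star rating.

    Assumption: class index k corresponds to k + 1 stars.
    """
    return int(label.rsplit("_", 1)[-1]) + 1


def predict_stars(reviews, model_id=MODEL_ID):
    """Return one predicted star rating (1-5) per review text."""
    # Imported lazily so label_to_stars can be used without Transformers installed.
    from transformers import pipeline

    classifier = pipeline("text-classification", model=model_id)
    # truncation=True clips reviews longer than the model's maximum input length.
    outputs = classifier(list(reviews), truncation=True)
    return [label_to_stars(out["label"]) for out in outputs]
```

Once `MODEL_ID` points at the real repository, `predict_stars(["The room was spotless and the staff were friendly."])` returns a list with one integer rating.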

Dataset

The dataset used for training, validation, and testing is nhull/tripadvisor-split-dataset-v2. It consists of:

Training Set: 30,400 reviews
Validation Set: 1,600 reviews
Test Set: 8,000 reviews

All splits are balanced across five sentiment labels.

Test Performance

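The mean absolute error is 0.3934: on average, a prediction is about 0.39 stars away from the true label. The net bias is small; per the confusion matrix below, predictions run about 0.04 stars high on average.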

Metric	Value
Accuracy	0.6391
Precision (macro avg)	0.6416
Recall (macro avg)	0.6391
F1-Score (macro avg)	0.6400

Classification Report (Test Set)

Label	Precision	Recall	F1-Score	Support
1	0.7483	0.6856	0.7156	1600
2	0.5445	0.5544	0.5494	1600
3	0.6000	0.6281	0.6137	1600
4	0.5828	0.5894	0.5861	1600
5	0.7326	0.7381	0.7354	1600

Confusion Matrix (Test Set)

True \ Predicted	1	2	3	4	5
1	1097	437	60	3	3
2	327	887	344	34	8
3	37	278	1005	254	26
4	3	21	239	943	394
5	2	6	27	384	1181
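The summary figures in this section can be recomputed from the confusion matrix alone. The sketch below uses only the counts printed above; the function name and dictionary keys are illustrative choices, not part of the original evaluation code.

```python
# Confusion matrix copied from this card: rows are true labels 1-5,
# columns are predicted labels 1-5.
CONFUSION = [
    [1097, 437, 60, 3, 3],
    [327, 887, 344, 34, 8],
    [37, 278, 1005, 254, 26],
    [3, 21, 239, 943, 394],
    [2, 6, 27, 384, 1181],
]


def summarize(matrix):
    """Compute accuracy, error statistics, and per-class precision/recall."""
    n = len(matrix)
    cells = [(i, j) for i in range(n) for j in range(n)]
    total = sum(matrix[i][j] for i, j in cells)
    correct = sum(matrix[k][k] for k in range(n))
    # Labels are consecutive integers, so (j - i) is the signed error in stars.
    abs_error = sum(abs(j - i) * matrix[i][j] for i, j in cells)
    signed_error = sum((j - i) * matrix[i][j] for i, j in cells)
    within_one = sum(matrix[i][j] for i, j in cells if abs(j - i) <= 1)
    return {
        "accuracy": correct / total,
        "mean_absolute_error": abs_error / total,
        "mean_signed_error": signed_error / total,
        "within_one_star": within_one / total,
        # Precision divides by the column sum, recall by the row sum.
        "precision": [matrix[k][k] / sum(row[k] for row in matrix) for k in range(n)],
        "recall": [matrix[k][k] / sum(matrix[k]) for k in range(n)],
    }


summary = summarize(CONFUSION)
for name, value in summary.items():
    print(name, value)
```

Rounded to four decimals, this reproduces the reported accuracy (0.6391), the mean absolute error (0.3934), and the per-label precision and recall. It also shows a mean signed error of about +0.035 stars, and that roughly 97% of predictions land within one star of the true label.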

Files Included

validation_results_distilbert.csv: Contains correctly classified reviews with their real and predicted labels.

Limitations

Domain-Specific: The model was trained on TripAdvisor reviews, so it may not generalize to other types of reviews or domains without further fine-tuning.
Subjectivity: Sentiment annotations are subjective and may not fully represent every user's perception.
Performance: The middle labels (2, 3, and 4) have lower precision and recall than the extreme labels (1 and 5), and most misclassifications fall on an adjacent label.