---
license: apache-2.0
datasets:
- nhull/tripadvisor-split-dataset-v2
language:
- en
base_model:
- distilbert/distilbert-base-uncased
tags:
- nlp
- hotels
- reviews
- sentiment-analysis
- transformers
---
# DistilBERT Sentiment Analysis Model
## Overview
This repository contains a fine-tuned **DistilBERT** model trained for sentiment analysis on TripAdvisor reviews. The model predicts sentiment scores on a scale of 1 to 5 based on review text.
- **Base Model**: `distilbert-base-uncased`
- **Training Dataset**: [nhull/tripadvisor-split-dataset-v2](https://huggingface.co/datasets/nhull/tripadvisor-split-dataset-v2)
- **Use Case**: Sentiment classification for customer reviews to derive insights into customer satisfaction.
- **Output**: Sentiment labels (1-5).
---
## Model Details
- **Learning Rate**: `3e-05`
- **Batch Size**: `64`
- **Epochs**: `10` (with early stopping)
- **Patience**: `5` (epochs without improvement)
- **Tokenizer**: `distilbert-base-uncased`
- **Framework**: PyTorch + Hugging Face Transformers
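
The epoch cap and patience setting listed above can be sketched as a small early-stopping rule. This is only an illustration: the card does not say which metric was monitored, so the sketch assumes validation loss, and `early_stop_epoch` is a hypothetical helper, not part of the training code.

```python
from typing import Sequence


def early_stop_epoch(val_losses: Sequence[float], patience: int = 5, max_epochs: int = 10) -> int:
    """Return the 1-based epoch at which training would stop.

    Assumption: lower validation loss counts as an improvement.
    """
    best = float("inf")
    stale = 0  # consecutive epochs without improvement
    for epoch, loss in enumerate(val_losses[:max_epochs], start=1):
        if loss < best:
            best = loss
            stale = 0
        else:
            stale += 1
            if stale >= patience:
                return epoch
    # Patience was never exhausted, so training runs to the epoch cap
    # (or to the end of the recorded losses, whichever comes first).
    return min(len(val_losses), max_epochs)


if __name__ == "__main__":
    losses = [1.0, 0.9, 0.95, 0.96, 0.97, 0.98, 0.99, 0.5]
    print(early_stop_epoch(losses))
```

With a patience of 5 and a cap of 10 epochs, a run stops early only if its best validation result occurs in epoch 5 or earlier and is never beaten afterwards.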
## Intended Use
This model classifies hotel reviews by sentiment, assigning each review a star rating from 1 to 5.
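
A minimal inference sketch follows. It assumes the checkpoint loads with `AutoModelForSequenceClassification` and that class indices 0 to 4 correspond to stars 1 to 5; neither is confirmed by this card. The repository id is a placeholder to replace with the actual model id, and `logits_to_stars` and `predict_stars` are hypothetical helper names.

```python
from typing import List, Sequence


def logits_to_stars(logits: Sequence[Sequence[float]]) -> List[int]:
    """Pick the highest-scoring class per row.

    Assumption: class index i corresponds to a rating of i + 1 stars.
    """
    return [max(range(len(row)), key=row.__getitem__) + 1 for row in logits]


def predict_stars(texts: List[str], model_id: str) -> List[int]:
    """Run the fine-tuned model on a batch of review texts."""
    # Imported lazily so logits_to_stars works without these packages installed.
    import torch
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSequenceClassification.from_pretrained(model_id)
    model.eval()
    batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():
        logits = model(**batch).logits
    return logits_to_stars(logits.tolist())


if __name__ == "__main__":
    # Placeholder id: replace with the real repository id of this model.
    MODEL_ID = "your-namespace/your-model-id"
    print(predict_stars(["Clean room, friendly staff, great breakfast."], MODEL_ID))
```

Before relying on the index-to-star mapping, check `id2label` in the model's `config.json`.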
---
### Dataset
The dataset used for training, validation, and testing is [nhull/tripadvisor-split-dataset-v2](https://huggingface.co/datasets/nhull/tripadvisor-split-dataset-v2). It consists of:
- **Training Set**: 30,400 reviews
- **Validation Set**: 1,600 reviews
- **Test Set**: 8,000 reviews
All splits are balanced across five sentiment labels.
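
The balance claim can be checked with a short sketch like the one below. It assumes the dataset loads through `datasets.load_dataset` with its default configuration; the split name `"test"` and column name `"label"` in the usage note are guesses to verify against the dataset card. `load_splits` and `is_balanced` are hypothetical helper names.

```python
from collections import Counter
from typing import Hashable, Iterable


def load_splits(name: str = "nhull/tripadvisor-split-dataset-v2"):
    """Download the dataset from the Hugging Face Hub (requires the `datasets` package)."""
    from datasets import load_dataset  # lazy import keeps is_balanced dependency-free

    return load_dataset(name)


def is_balanced(labels: Iterable[Hashable]) -> bool:
    """Return True when every label occurs the same number of times."""
    counts = Counter(labels)
    return len(set(counts.values())) == 1
```

For example, `is_balanced(load_splits()["test"]["label"])` should return `True` if the test split holds 1,600 reviews per label, provided the split and column names match.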
---
### Test Performance
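On average, the predicted rating differs from the true rating by `0.3934` stars (mean absolute error, as computed from the test-set confusion matrix).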
| Metric | Value |
|------------|--------|
| Accuracy | 0.6391 |
| Precision | 0.6416 |
| Recall | 0.6391 |
| F1-Score | 0.6400 |
#### Classification Report (Test Set)
| Label | Precision | Recall | F1-Score | Support |
|-------|-----------|--------|----------|---------|
| 1 | 0.7483 | 0.6856 | 0.7156 | 1600 |
| 2 | 0.5445 | 0.5544 | 0.5494 | 1600 |
| 3 | 0.6000 | 0.6281 | 0.6137 | 1600 |
| 4 | 0.5828 | 0.5894 | 0.5861 | 1600 |
| 5 | 0.7326 | 0.7381 | 0.7354 | 1600 |
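
Because every label has the same support (1,600 reviews), macro and support-weighted averages coincide. The sketch below averages the per-label rows of the report; the results appear consistent with the overall precision, recall, and F1 reported for the test set, though the card does not state which averaging was used. `macro_average` is a hypothetical helper name.

```python
from typing import Dict, Tuple

# Per-label (precision, recall, f1) copied from the classification report.
REPORT: Dict[int, Tuple[float, float, float]] = {
    1: (0.7483, 0.6856, 0.7156),
    2: (0.5445, 0.5544, 0.5494),
    3: (0.6000, 0.6281, 0.6137),
    4: (0.5828, 0.5894, 0.5861),
    5: (0.7326, 0.7381, 0.7354),
}


def macro_average(report: Dict[int, Tuple[float, float, float]]) -> Tuple[float, float, float]:
    """Unweighted mean of precision, recall, and F1 across labels."""
    n = len(report)
    precision = sum(p for p, _, _ in report.values()) / n
    recall = sum(r for _, r, _ in report.values()) / n
    f1 = sum(f for _, _, f in report.values()) / n
    return precision, recall, f1


if __name__ == "__main__":
    p, r, f = macro_average(REPORT)
    print(f"precision={p:.4f} recall={r:.4f} f1={f:.4f}")
```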
#### Confusion Matrix (Test Set)
| True \\ Predicted | 1 | 2 | 3 | 4 | 5 |
|-------------------|------|------|------|------|------|
| **1** | 1097 | 437 | 60 | 3 | 3 |
| **2** | 327 | 887 | 344 | 34 | 8 |
| **3** | 37 | 278 | 1005 | 254 | 26 |
| **4** | 3 | 21 | 239 | 943 | 394 |
| **5** | 2 | 6 | 27 | 384 | 1181 |
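
The summary figures can be re-derived from this matrix. The sketch below computes accuracy, mean absolute error, mean signed error, and per-label precision and recall directly from the counts; the `0.3934` figure quoted under Test Performance matches the mean absolute error. `summarize` and `per_class` are hypothetical helper names.

```python
from typing import Dict, List, Tuple

# Rows are true labels 1..5, columns are predicted labels 1..5 (copied from the table).
CONFUSION: List[List[int]] = [
    [1097, 437, 60, 3, 3],
    [327, 887, 344, 34, 8],
    [37, 278, 1005, 254, 26],
    [3, 21, 239, 943, 394],
    [2, 6, 27, 384, 1181],
]


def summarize(cm: List[List[int]]) -> Dict[str, float]:
    """Accuracy, mean absolute error, and mean signed error (predicted minus true)."""
    total = sum(sum(row) for row in cm)
    correct = sum(cm[i][i] for i in range(len(cm)))
    abs_err = sum(abs(p - t) * n for t, row in enumerate(cm) for p, n in enumerate(row))
    signed_err = sum((p - t) * n for t, row in enumerate(cm) for p, n in enumerate(row))
    return {
        "accuracy": correct / total,
        "mae": abs_err / total,
        "mean_signed_error": signed_err / total,
    }


def per_class(cm: List[List[int]]) -> Dict[int, Tuple[float, float]]:
    """Map each label (1-based) to its (precision, recall)."""
    size = len(cm)
    result = {}
    for k in range(size):
        predicted_k = sum(cm[i][k] for i in range(size))  # column total
        true_k = sum(cm[k])  # row total
        result[k + 1] = (cm[k][k] / predicted_k, cm[k][k] / true_k)
    return result


if __name__ == "__main__":
    for name, value in summarize(CONFUSION).items():
        print(f"{name}: {value:.4f}")
```

The mean signed error is much smaller than the mean absolute error, so over- and under-predictions largely cancel out across the test set.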
---
## Files Included
- **`validation_results_distilbert.csv`**: Contains correctly classified reviews with their real and predicted labels.
---
## Limitations
1. Domain-Specific: The model was trained on TripAdvisor reviews, so it may not generalize to other types of reviews or domains without further fine-tuning.
2. Subjectivity: Sentiment annotations are subjective and may not fully represent every user's perception.
3. Performance: Mid-range sentiment labels (2, 3, and 4) have lower precision and recall than the extreme labels (1 and 5).