---
license: apache-2.0
datasets:
- nhull/tripadvisor-split-dataset-v2
language:
- en
base_model:
- distilbert/distilbert-base-uncased
tags:
- nlp
- hotels
- reviews
- sentiment-analysis
- transformers
---
# DistilBERT Sentiment Analysis Model

## Overview

This repository contains a fine-tuned **DistilBERT** model trained for sentiment analysis on TripAdvisor reviews. The model predicts sentiment scores on a scale of 1 to 5 based on review text.

- **Base Model**: `distilbert-base-uncased`
- **Trained Dataset**: [nhull/tripadvisor-split-dataset-v2](https://huggingface.co/datasets/nhull/tripadvisor-split-dataset-v2)
- **Use Case**: Sentiment classification for customer reviews to derive insights into customer satisfaction.
- **Output**: Sentiment labels (1-5).

---

## Model Details

- **Learning Rate**: `3e-05`
- **Batch Size**: `64`
- **Epochs**: `10` (with early stopping)
- **Patience**: `5` (epochs without improvement)
- **Tokenizer**: `distilbert-base-uncased`
- **Framework**: PyTorch + Hugging Face Transformers
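
The interaction between the epoch limit and the patience setting can be sketched as below. This is a minimal illustration, not the training script used for this model: it assumes the monitored quantity is validation loss (the card does not say which metric was tracked), and the function name `early_stopping_epoch` is hypothetical.

```python
def early_stopping_epoch(val_losses, max_epochs=10, patience=5):
    """Return the 1-based epoch at which training would stop.

    Assumption: lower validation loss counts as an improvement.
    """
    best_loss = float("inf")
    epochs_without_improvement = 0

    for epoch, loss in enumerate(val_losses[:max_epochs], start=1):
        if loss < best_loss:
            # New best value: reset the patience counter.
            best_loss = loss
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                # Patience exhausted: stop before reaching max_epochs.
                return epoch

    # Patience never ran out: training ends at the epoch limit
    # (or earlier if fewer losses were supplied).
    return min(len(val_losses), max_epochs)


# Hypothetical loss curve: best value at epoch 2, then five flat epochs.
losses = [0.90, 0.80, 0.85, 0.86, 0.87, 0.88, 0.89, 0.70]
print(early_stopping_epoch(losses))
```

With a limit of 10 epochs and a patience of 5, early stopping can only trigger when the best epoch occurs at epoch 5 or earlier.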

## Intended Use

This model classifies hotel reviews by sentiment, assigning each review a star rating from 1 to 5.
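
As a rough sketch of how a five-class output could be turned into a star rating, the snippet below converts raw logits into a rating and a confidence. It assumes that class index `i` corresponds to `i + 1` stars, which this card does not confirm; check the `id2label` mapping in the model's configuration before relying on it. The function name `logits_to_star_rating` and the example logits are hypothetical.

```python
import math


def logits_to_star_rating(logits):
    """Map five class logits to (star_rating, probability).

    Assumption: class index 0 means 1 star, index 4 means 5 stars.
    """
    if len(logits) != 5:
        raise ValueError("expected exactly five logits, one per star rating")

    # Numerically stable softmax: subtract the maximum before exponentiating.
    highest = max(logits)
    exps = [math.exp(x - highest) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]

    best_index = max(range(len(probs)), key=probs.__getitem__)
    return best_index + 1, probs[best_index]


# Hypothetical logits for a single review.
stars, confidence = logits_to_star_rating([0.1, 0.2, 0.3, 2.5, 1.0])
print(stars, round(confidence, 3))
```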

---

### Dataset

The dataset used for training, validation, and testing is [nhull/tripadvisor-split-dataset-v2](https://huggingface.co/datasets/nhull/tripadvisor-split-dataset-v2). It consists of:

- **Training Set**: 30,400 reviews
- **Validation Set**: 1,600 reviews
- **Test Set**: 8,000 reviews

All splits are balanced across five sentiment labels.

---

### Test Performance

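On average, the predicted rating differs from the true rating by `0.3934` stars (mean absolute error). The mean signed error implied by the confusion matrix is only about `+0.035`, so the model has a slight upward bias rather than a systematic over-prediction of 0.39 stars.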

| Metric     | Value  |
|------------|--------|
| Accuracy   | 0.6391 |
| Precision  | 0.6416 |
| Recall     | 0.6391 |
| F1-Score   | 0.6400 |

#### Classification Report (Test Set)

| Label | Precision | Recall | F1-Score | Support |
|-------|-----------|--------|----------|---------|
| 1     | 0.7483    | 0.6856 | 0.7156   | 1600    |
| 2     | 0.5445    | 0.5544 | 0.5494   | 1600    |
| 3     | 0.6000    | 0.6281 | 0.6137   | 1600    |
| 4     | 0.5828    | 0.5894 | 0.5861   | 1600    |
| 5     | 0.7326    | 0.7381 | 0.7354   | 1600    |
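
The aggregate precision, recall, and F1 values reported above appear to be consistent with unweighted (macro) averages of the per-label rows; because every label has the same support, a support-weighted average gives the same result. The check below is a small sketch of that arithmetic, with the numbers copied from the classification report and `macro_average` as a hypothetical helper name.

```python
# Per-label (precision, recall, f1) copied from the classification report.
PER_LABEL = {
    1: (0.7483, 0.6856, 0.7156),
    2: (0.5445, 0.5544, 0.5494),
    3: (0.6000, 0.6281, 0.6137),
    4: (0.5828, 0.5894, 0.5861),
    5: (0.7326, 0.7381, 0.7354),
}


def macro_average(report):
    """Return the unweighted mean of (precision, recall, f1) over all labels."""
    n_labels = len(report)
    sums = [sum(values[i] for values in report.values()) for i in range(3)]
    return tuple(round(s / n_labels, 4) for s in sums)


precision, recall, f1 = macro_average(PER_LABEL)
print(precision, recall, f1)  # 0.6416 0.6391 0.64
```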

#### Confusion Matrix (Test Set)

| True \\ Predicted | 1    | 2    | 3    | 4    | 5    |
|-------------------|------|------|------|------|------|
| **1**            | 1097 | 437  | 60   | 3    | 3    |
| **2**            | 327  | 887  | 344  | 34   | 8    |
| **3**            | 37   | 278  | 1005 | 254  | 26   |
| **4**            | 3    | 21   | 239  | 943  | 394  |
| **5**            | 2    | 6    | 27   | 384  | 1181 |
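
The headline numbers can be recomputed from the confusion matrix alone. The sketch below derives accuracy, mean absolute error, and mean signed error (predicted minus true) from the counts in the table; the helper name `summarize` is hypothetical.

```python
# Rows are true labels 1-5, columns are predicted labels 1-5.
CONFUSION = [
    [1097, 437, 60, 3, 3],
    [327, 887, 344, 34, 8],
    [37, 278, 1005, 254, 26],
    [3, 21, 239, 943, 394],
    [2, 6, 27, 384, 1181],
]


def summarize(matrix):
    """Compute accuracy, MAE, and mean signed error from a square confusion matrix."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))

    abs_error = 0
    signed_error = 0
    for true_idx, row in enumerate(matrix):
        for pred_idx, count in enumerate(row):
            diff = pred_idx - true_idx  # positive when the model predicts too high
            abs_error += abs(diff) * count
            signed_error += diff * count

    return {
        "accuracy": round(correct / total, 4),
        "mae": round(abs_error / total, 4),
        "mean_signed_error": round(signed_error / total, 4),
    }


print(summarize(CONFUSION))
# {'accuracy': 0.6391, 'mae': 0.3934, 'mean_signed_error': 0.0351}
```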

---

## Files Included

- **`validation_results_distilbert.csv`**: Contains correctly classified reviews with their real and predicted labels.

---

## Limitations

1. Domain-Specific: The model was trained on TripAdvisor reviews, so it may not generalize to other types of reviews or domains without further fine-tuning.
2. Subjectivity: Sentiment annotations are subjective and may not fully represent every user's perception.
3. Performance: Mid-range labels (2, 3, and 4) have lower precision and recall than the extreme labels (1 and 5); most misclassifications land on an adjacent rating.