|
--- |
|
license: apache-2.0 |
|
datasets: |
|
- nhull/tripadvisor-split-dataset-v2 |
|
language: |
|
- en |
|
base_model: |
|
- distilbert/distilbert-base-uncased |
|
tags: |
|
- nlp |
|
- hotels |
|
- reviews |
|
- sentiment-analysis |
|
- transformers |
|
--- |
|
# DistilBERT Sentiment Analysis Model |
|
|
|
## Overview |
|
|
|
This repository contains a fine-tuned **DistilBERT** model trained for sentiment analysis on TripAdvisor reviews. The model predicts sentiment scores on a scale of 1 to 5 based on review text. |
|
|
|
- **Base Model**: `distilbert-base-uncased` |
|
- **Training Dataset**: [nhull/tripadvisor-split-dataset-v2](https://huggingface.co/datasets/nhull/tripadvisor-split-dataset-v2)
|
- **Use Case**: Sentiment classification for customer reviews to derive insights into customer satisfaction. |
|
- **Output**: Sentiment labels (1-5). |
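
A minimal usage sketch with the Transformers API is shown below. The repository id is a placeholder (point it at wherever this model is hosted), and the mapping from class indices 0-4 to star ratings 1-5 is an assumption about how the labels are encoded.

```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

# Placeholder repository id -- replace with the actual model repo.
MODEL_ID = "your-username/distilbert-tripadvisor-sentiment"

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForSequenceClassification.from_pretrained(MODEL_ID)
model.eval()

review = "The room was spotless and the staff went out of their way to help."
inputs = tokenizer(review, return_tensors="pt", truncation=True)

with torch.no_grad():
    logits = model(**inputs).logits

# Assumes class index 0 corresponds to 1 star, ..., index 4 to 5 stars.
predicted_stars = logits.argmax(dim=-1).item() + 1
print(f"Predicted rating: {predicted_stars}")
```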
|
|
|
--- |
|
|
|
## Model Details |
|
|
|
- **Learning Rate**: `3e-05` |
|
- **Batch Size**: `64` |
|
- **Epochs**: `10` (with early stopping) |
|
- **Patience**: `5` (epochs without improvement) |
|
- **Tokenizer**: `distilbert-base-uncased` |
|
- **Framework**: PyTorch + Hugging Face Transformers |
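
The hyperparameters above roughly correspond to a `Trainer` setup like the sketch below. This is not the exact training script: the dataset column names, label encoding, maximum sequence length, output directory, and early-stopping metric are assumptions.

```python
from datasets import load_dataset
from transformers import (
    AutoModelForSequenceClassification,
    AutoTokenizer,
    EarlyStoppingCallback,
    Trainer,
    TrainingArguments,
)

# Assumes the dataset exposes a "review" text column and an integer "label" column
# already encoded as 0-4; if labels are stored as 1-5, shift them down by one first.
dataset = load_dataset("nhull/tripadvisor-split-dataset-v2")
tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")

def tokenize(batch):
    return tokenizer(batch["review"], truncation=True, padding="max_length", max_length=256)

dataset = dataset.map(tokenize, batched=True)

model = AutoModelForSequenceClassification.from_pretrained(
    "distilbert-base-uncased", num_labels=5
)

training_args = TrainingArguments(
    output_dir="distilbert-tripadvisor-sentiment",  # assumed output path
    learning_rate=3e-5,
    per_device_train_batch_size=64,
    per_device_eval_batch_size=64,
    num_train_epochs=10,
    eval_strategy="epoch",              # "evaluation_strategy" in Transformers < 4.41
    save_strategy="epoch",
    load_best_model_at_end=True,
    metric_for_best_model="eval_loss",  # assumed early-stopping criterion
)

trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=dataset["train"],
    eval_dataset=dataset["validation"],  # split name assumed
    callbacks=[EarlyStoppingCallback(early_stopping_patience=5)],
)
trainer.train()
```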
|
|
|
## Intended Use |
|
|
|
This model classifies hotel reviews by sentiment, assigning each review a star rating from 1 to 5.
|
|
|
--- |
|
|
|
## Dataset
|
|
|
The dataset used for training, validation, and testing is [nhull/tripadvisor-split-dataset-v2](https://huggingface.co/datasets/nhull/tripadvisor-split-dataset-v2). It consists of: |
|
|
|
- **Training Set**: 30,400 reviews |
|
- **Validation Set**: 1,600 reviews |
|
- **Test Set**: 8,000 reviews |
|
|
|
All splits are balanced across five sentiment labels. |
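
A quick way to inspect the split sizes and label balance (the split and column names are assumptions about the dataset schema):

```python
from collections import Counter

from datasets import load_dataset

ds = load_dataset("nhull/tripadvisor-split-dataset-v2")

# Expecting roughly 30,400 / 1,600 / 8,000 examples with a uniform label distribution.
for split in ds:
    print(split, len(ds[split]), Counter(ds[split]["label"]))
```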
|
|
|
--- |
|
|
|
## Test Performance
|
|
|
On average, the model's predicted rating deviates from the true rating by `0.3934` stars (mean absolute error on the test set).
|
|
|
| Metric | Value | |
|
|------------|--------| |
|
| Accuracy | 0.6391 | |
|
| Precision | 0.6416 | |
|
| Recall | 0.6391 | |
|
| F1-Score | 0.6400 | |
|
|
|
### Classification Report (Test Set)
|
|
|
| Label | Precision | Recall | F1-Score | Support | |
|
|-------|-----------|--------|----------|---------| |
|
| 1 | 0.7483 | 0.6856 | 0.7156 | 1600 | |
|
| 2 | 0.5445 | 0.5544 | 0.5494 | 1600 | |
|
| 3 | 0.6000 | 0.6281 | 0.6137 | 1600 | |
|
| 4 | 0.5828 | 0.5894 | 0.5861 | 1600 | |
|
| 5 | 0.7326 | 0.7381 | 0.7354 | 1600 | |
|
|
|
### Confusion Matrix (Test Set) |
|
|
|
| True \\ Predicted | 1 | 2 | 3 | 4 | 5 | |
|
|-------------------|------|------|------|------|------| |
|
| **1** | 1097 | 437 | 60 | 3 | 3 | |
|
| **2** | 327 | 887 | 344 | 34 | 8 | |
|
| **3** | 37 | 278 | 1005 | 254 | 26 | |
|
| **4** | 3 | 21 | 239 | 943 | 394 | |
|
| **5** | 2 | 6 | 27 | 384 | 1181 | |
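
For reference, these numbers can be reproduced from arrays of true and predicted star ratings with scikit-learn. The snippet below uses tiny dummy arrays so it runs stand-alone, and it assumes the aggregate precision/recall/F1 above are weighted averages (with equal per-class support, weighted and macro averages coincide).

```python
import numpy as np
from sklearn.metrics import (
    accuracy_score,
    classification_report,
    confusion_matrix,
    precision_recall_fscore_support,
)

# Replace these dummy arrays with the model's true and predicted ratings (1-5) on the test set.
y_true = np.array([1, 2, 3, 4, 5, 5, 1, 3])
y_pred = np.array([1, 3, 3, 5, 5, 4, 2, 3])

accuracy = accuracy_score(y_true, y_pred)
precision, recall, f1, _ = precision_recall_fscore_support(
    y_true, y_pred, average="weighted", zero_division=0
)
mae = np.abs(y_pred - y_true).mean()  # the 0.3934 figure reported above

print(f"Accuracy: {accuracy:.4f}  Precision: {precision:.4f}  Recall: {recall:.4f}  F1: {f1:.4f}")
print(f"Mean absolute error (stars): {mae:.4f}")
print(classification_report(y_true, y_pred, labels=[1, 2, 3, 4, 5], zero_division=0))
print(confusion_matrix(y_true, y_pred, labels=[1, 2, 3, 4, 5]))
```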
|
|
|
--- |
|
|
|
## Files Included |
|
|
|
- **`validation_results_distilbert.csv`**: Contains the correctly classified validation reviews with their true and predicted labels.
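
The file can be loaded with pandas for inspection:

```python
import pandas as pd

# Column names are not documented here -- check the header after loading.
results = pd.read_csv("validation_results_distilbert.csv")
print(results.columns.tolist())
print(results.head())
```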
|
|
|
--- |
|
|
|
## Limitations |
|
|
|
1. **Domain specificity**: The model was trained on TripAdvisor hotel reviews, so it may not generalize to other review types or domains without further fine-tuning.

2. **Subjectivity**: Sentiment annotations are subjective and may not fully reflect every reader's perception.

3. **Mid-range performance**: The middle sentiment labels (2, 3, and 4) have lower precision and recall than the extreme labels (1 and 5), as shown in the classification report above.