---
library_name: transformers
license: mit
datasets:
- hblim/customer-complaints
language:
- en
metrics:
- accuracy
base_model:
- google-bert/bert-base-uncased
tags:
- bert
- transformers
- customer-complaints
- text-classification
- multiclass
- huggingface
- fine-tuned
- wandb
---

# BERT Base (Uncased) Fine-Tuned on Customer Complaint Classification (3 Classes)

## 🧾 Model Description

This model is a fine-tuned version of [`bert-base-uncased`](https://huggingface.co/bert-base-uncased) using Hugging Face Transformers on a custom dataset of customer complaints. The task is **multi-class text classification**, where each complaint is categorized into one of **three classes**.

The model is intended to support downstream tasks like complaint triage, issue type prediction, or support ticket classification.

Training and evaluation were tracked using [Weights & Biases](https://wandb.ai/), and all hyperparameters are reproducible and logged below.

---

## 🧠 Intended Use

- 🏷 Classify customer complaint text into 3 predefined categories
- 📊 Analyze complaint trends over time
- 💬 Serve as a backend model for customer service applications

---

## 📚 Dataset

- Dataset Name: [hblim/customer-complaints](https://huggingface.co/datasets/hblim/customer-complaints)
- Dataset Type: Multiclass text classification
- Classes: billing, product, delivery
- Preprocessing: Standard BERT tokenization
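With three classes, the classification head maps each complaint to one of three integer ids. A possible label mapping is sketched below; the id order shown is an assumption for illustration, and the authoritative mapping lives in the model's `config.json` (`id2label`/`label2id`):

```python
# Assumed id order -- check the model's config.json (id2label) for the real mapping
label2id = {"billing": 0, "product": 1, "delivery": 2}
id2label = {i: name for name, i in label2id.items()}

print(id2label[0])  # "billing" under the assumed ordering above
```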

---

## βš™οΈ Training Details

- Base Model: `bert-base-uncased`
- Epochs: **10**
- Batch Size: **1**
- Learning Rate: **1e-5**
- Weight Decay: **0.05**
- Warmup Ratio: **0.20**
- LR Scheduler: `linear`
- Optimizer: `AdamW`
- Evaluation Strategy: every **100 steps**
- Logging: every **100 steps**
- Trainer: Hugging Face `Trainer`
- Hardware: Single NVIDIA GeForce RTX 3080 GPU
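The `linear` scheduler with a 0.20 warmup ratio ramps the learning rate from 0 to the peak over the first 20% of training steps, then decays it linearly back to 0. A minimal sketch of that shape (plain Python, not the Transformers API; the 1,500-step total is taken from the evaluation log in this card):

```python
def linear_schedule_lr(step, total_steps=1500, base_lr=1e-5, warmup_ratio=0.20):
    """Linear warmup to base_lr over the first warmup_ratio of steps,
    then linear decay back to zero (the shape of the `linear` scheduler)."""
    warmup_steps = int(total_steps * warmup_ratio)
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    return base_lr * max(0.0, (total_steps - step) / max(1, total_steps - warmup_steps))

# With these settings the peak LR (1e-5) is reached at step 300
# (20% of 1,500) and decays to 0 by step 1,500.
```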

---

## 📈 Metrics

Evaluation was tracked using:
- **Accuracy**

To reproduce metrics and training logs, refer to the corresponding W&B run:
[Weights & Biases Run - `baseline-hf-hub`](https://wandb.ai/notslahify/customer%20complaints%20fine%20tuning/runs/c75ddclr)


| Step | Training Loss | Validation Loss | Accuracy   |
|------|---------------|-----------------|------------|
| 100  | 1.106100      | 1.040519        | 0.523810   |
| 200  | 0.944800      | 0.744273        | 0.738095   |
| 300  | 0.660000      | 0.385309        | 0.900000   |
| 400  | 0.412400      | 0.273423        | 0.904762   |
| 500  | 0.220800      | 0.185636        | 0.923810   |
| 600  | 0.163400      | 0.245850        | 0.919048   |
| 700  | 0.116100      | 0.180523        | 0.942857   |
| 800  | 0.097200      | 0.254475        | 0.928571   |
| 900  | 0.052200      | 0.233583        | 0.942857   |
| 1000 | 0.050700      | 0.223150        | 0.928571   |
| 1100 | 0.035100      | 0.271416        | 0.919048   |
| 1200 | 0.027700      | 0.226478        | 0.933333   |
| 1300 | 0.009000      | 0.218807        | 0.938095   |
| 1400 | 0.013600      | 0.246330        | 0.928571   |
| 1500 | 0.014500      | 0.226987        | 0.933333   |
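Accuracy here is simply the fraction of complaints assigned the correct class. A minimal sketch of the computation (a stand-in for a `compute_metrics` callback, not the exact code used for this run):

```python
def accuracy(pred_ids, label_ids):
    """Fraction of predicted class ids that match the gold labels."""
    assert len(pred_ids) == len(label_ids) > 0
    return sum(p == y for p, y in zip(pred_ids, label_ids)) / len(label_ids)

# e.g. 3 of 4 predictions correct
print(accuracy([0, 1, 2, 1], [0, 1, 2, 2]))  # 0.75
```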

---

## 🚀 How to Use

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Replace with the actual repository id of this model on the Hub
model_id = "your-username/baseline-hf-hub"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)

inputs = tokenizer("I want to report an issue with my account", return_tensors="pt")
with torch.no_grad():  # inference only, no gradients needed
    outputs = model(**inputs)

# Map the highest-scoring logit to its class id and human-readable label
predicted_class = outputs.logits.argmax(dim=-1).item()
print(predicted_class, model.config.id2label[predicted_class])
```