---
base_model: bert-base-cased
datasets:
- ma2za/many_emotions
license: apache-2.0
tags:
- onnx
- emotion-detection
- BaseLM:bert-base-cased
---

# BERT-Based Emotion Detection on ma2za/many_emotions

This repository hosts a fine-tuned emotion detection model built on [BERT-base-cased](https://huggingface.co/bert-base-cased). The model is trained on the [ma2za/many_emotions](https://huggingface.co/datasets/ma2za/many_emotions) dataset to classify text into one of seven emotion categories: anger, fear, joy, love, sadness, surprise, and neutral. It is available in both PyTorch and ONNX formats for efficient deployment.

## Model Details

### Model Description

- **Developed by:** *Your Name or Organization*
- **Model Type:** Sequence Classification (Emotion Detection)
- **Base Model:** bert-base-cased
- **Dataset:** ma2za/many_emotions
- **Export Format:** ONNX (for deployment)
- **License:** Apache-2.0
- **Tags:** onnx, emotion-detection, BERT, sequence-classification

This model was fine-tuned on the ma2za/many_emotions dataset to assign each input text to one emotion category. A subset of the training data was used for quick experimentation; the publicly available model was trained on the complete dataset.

## Training Details

### Dataset Details

- **Dataset ID:** ma2za/many_emotions
- **Text Column:** `text`
- **Label Column:** `label`

### Training Hyperparameters

- **Epochs:** 1 (for quick test; adjust to your needs)
- **Per Device Batch Size:** 96
- **Learning Rate:** 1e-5
- **Weight Decay:** 0.01
- **Optimizer:** AdamW
- **Training Duration:** The full training run on the complete dataset (approximately 2.44 million training examples) took about 3 hours and 40 minutes.

## ONNX Export

The model has been exported to the ONNX format using opset version 14, which allows PyTorch operations such as `scaled_dot_product_attention` to be exported.
This enables deployment across different platforms using ONNX Runtime.

## How to Load the Model

Instead of loading the model from a local directory, you can load it directly from the Hugging Face Hub using the repository name `iimran/EmotionDetection`.

### Loading with ONNX Runtime

```python
import numpy as np
import onnxruntime as ort
from transformers import AutoTokenizer, AutoConfig
from huggingface_hub import hf_hub_download

# Specify the repository details.
repo_id = "iimran/EmotionDetection"
filename = "model.onnx"

# Download the ONNX model file from the Hub.
onnx_model_path = hf_hub_download(repo_id=repo_id, filename=filename)
print("Model downloaded to:", onnx_model_path)

# Load the tokenizer and configuration from the repository.
tokenizer = AutoTokenizer.from_pretrained(repo_id)
config = AutoConfig.from_pretrained(repo_id)

# Use the id2label mapping from the configuration when it is present and non-empty.
if getattr(config, "id2label", None):
    id2label = config.id2label
else:
    # Default mapping for ma2za/many_emotions if not present in the config.
    id2label = {
        0: "anger",
        1: "fear",
        2: "joy",
        3: "love",
        4: "sadness",
        5: "surprise",
        6: "neutral"
    }
print("id2label mapping:", id2label)

# Create an ONNX Runtime inference session using the downloaded model file.
session = ort.InferenceSession(onnx_model_path)


def onnx_infer(text):
    """
    Perform inference on the input text using the exported ONNX model.
    Returns the predicted emotion label.
    """
    # Tokenize the input text with a fixed maximum sequence length matching the model export.
    inputs = tokenizer(
        text,
        return_tensors="np",
        truncation=True,
        padding="max_length",
        max_length=256
    )

    # Prepare the model inputs.
    ort_inputs = {
        "input_ids": inputs["input_ids"],
        "attention_mask": inputs["attention_mask"]
    }

    # Run the model; the first output holds the logits.
    outputs = session.run(None, ort_inputs)
    logits = outputs[0]

    # Get the predicted class id.
    predicted_class_id = int(np.argmax(logits, axis=-1)[0])

    # Map the predicted class id to its emotion label.
    # The mapping keys may be strings or integers, so try both.
    predicted_label = id2label.get(
        str(predicted_class_id),
        id2label.get(predicted_class_id, str(predicted_class_id))
    )

    print("Predicted Emotion ID:", predicted_class_id)
    print("Predicted Emotion:", predicted_label)
    return predicted_label


# Test the inference function.
onnx_infer("That rude customer made me furious.")
```

## Evaluation

During training, the model is evaluated primarily on accuracy. Before deployment, further evaluation on unseen data is recommended to confirm robustness in production settings.
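
As a minimal sketch of such a check, the snippet below computes accuracy over a small labeled held-out set and counts which labels get confused. The names `evaluate_accuracy`, `keyword_predict`, and `held_out` are hypothetical and not part of this repository; `keyword_predict` is only a stand-in so the example runs without downloading the model, and the three sentences are illustrative, not drawn from the dataset.

```python
from collections import Counter
from typing import Callable, Iterable, Tuple


def evaluate_accuracy(
    predict: Callable[[str], str],
    examples: Iterable[Tuple[str, str]],
) -> Tuple[float, Counter]:
    """Return overall accuracy and a count of (expected, predicted) errors."""
    correct = 0
    total = 0
    errors = Counter()
    for text, expected in examples:
        predicted = predict(text)
        total += 1
        if predicted == expected:
            correct += 1
        else:
            errors[(expected, predicted)] += 1
    # Guard against an empty evaluation set.
    accuracy = correct / total if total else 0.0
    return accuracy, errors


# Hypothetical stand-in predictor (assumption): replace with the real
# inference function when evaluating the actual model.
def keyword_predict(text: str) -> str:
    return "anger" if "furious" in text.lower() else "neutral"


# Illustrative examples only; use real unseen data in practice.
held_out = [
    ("That rude customer made me furious.", "anger"),
    ("The meeting starts at noon.", "neutral"),
    ("I can't believe you remembered my birthday!", "surprise"),
]

accuracy, errors = evaluate_accuracy(keyword_predict, held_out)
print(f"accuracy = {accuracy:.2f}")
print("most common confusions:", errors.most_common(3))
```

To evaluate the ONNX model itself, pass `onnx_infer` as the `predict` argument. Tracking the `(expected, predicted)` pairs alongside the single accuracy number helps show whether errors concentrate in a few emotion categories.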