---
language: 
- en
- vi
tags:
- esg
- classification
- hierarchical
- multi-task-learning
- sustainability
datasets:
- custom
library_name: transformers
pipeline_tag: text-classification
---

# ESG Hierarchical Multi-Task Learning Model

This model performs hierarchical ESG (Environmental, Social, Governance) classification using a multi-task learning approach.

## Model Description

- **Model Type**: Hierarchical Multi-Task Classifier
- **Backbone**: Alibaba-NLP/gte-multilingual-base  
- **Language**: English, Vietnamese
- **Task**: ESG Factor and Sub-factor Classification

## Architecture

The model uses a hierarchical approach:
1. **Main ESG Classification**: Predicts E, S, G, or Others_ESG
2. **Sub-factor Classification**: Based on main category, predicts specific sub-factors:
   - **E (Environmental)**: Emission, Resource Use, Product Innovation
   - **S (Social)**: Community, Diversity, Employment, HS, HR, PR, Training  
   - **G (Governance)**: BFunction, BStructure, Compensation, Shareholder, Vision
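The hierarchy above can be sketched as a shared encoder output feeding a main-category head plus one sub-factor head per category. This is a hypothetical illustration of the idea, not the repository's actual custom code (class and attribute names here are invented; sub-factor counts follow the lists above):

```python
import torch
import torch.nn as nn

class HierarchicalESGHead(nn.Module):
    """Pooled encoder output -> main ESG logits + per-category sub-factor logits."""
    def __init__(self, hidden_size=768):
        super().__init__()
        # Main task: E, S, G, Others_ESG
        self.main_head = nn.Linear(hidden_size, 4)
        # One sub-factor head per main category (sizes from the lists above)
        self.sub_heads = nn.ModuleDict({
            "E": nn.Linear(hidden_size, 3),  # Emission, Resource Use, Product Innovation
            "S": nn.Linear(hidden_size, 7),  # Community, Diversity, Employment, HS, HR, PR, Training
            "G": nn.Linear(hidden_size, 5),  # BFunction, BStructure, Compensation, Shareholder, Vision
        })

    def forward(self, pooled):
        main_logits = self.main_head(pooled)
        sub_logits = {name: head(pooled) for name, head in self.sub_heads.items()}
        return main_logits, sub_logits

head = HierarchicalESGHead()
pooled = torch.randn(2, 768)          # batch of 2 pooled sentence embeddings
main_logits, sub_logits = head(pooled)
print(main_logits.shape)              # torch.Size([2, 4])
print(sub_logits["S"].shape)          # torch.Size([2, 7])
```

At inference time, the predicted main category selects which sub-factor head's output to report, which is what makes the prediction hierarchical.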

## Usage

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Load model and tokenizer (trust_remote_code=True pulls in the custom
# hierarchical model class from the repository)
model = AutoModelForSequenceClassification.from_pretrained(
    "chungpt2123/esg-subfactor-classifier", trust_remote_code=True
)
tokenizer = AutoTokenizer.from_pretrained("Alibaba-NLP/gte-multilingual-base")
model.eval()

# Example usage
text = "The company has implemented renewable energy solutions to reduce carbon emissions."
inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=4096)

# `predict` is provided by the model's custom code and returns the main ESG
# factor together with its sub-factor
with torch.no_grad():
    esg_factor, sub_factor = model.predict(inputs.input_ids, inputs.attention_mask)
print(f"ESG Factor: {esg_factor}, Sub-factor: {sub_factor}")
```

## Training Details

- **Training Data**: Custom ESG dataset
- **Training Approach**: Two-phase training (freeze backbone → fine-tune entire model)
- **Loss Function**: Weighted multi-task loss
- **Optimization**: AdamW with learning rate scheduling
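The two-phase schedule and weighted multi-task loss can be sketched as follows. This is a minimal illustration under assumptions: the attribute name `backbone`, the loss weights, and the helper names are invented here, since the card does not publish the actual training code or hyperparameters:

```python
import torch
import torch.nn as nn

def set_backbone_trainable(model, trainable):
    """Phase 1: freeze the backbone (train heads only); Phase 2: unfreeze everything.
    Assumes the encoder is exposed as `model.backbone` (hypothetical name)."""
    for p in model.backbone.parameters():
        p.requires_grad = trainable

def multi_task_loss(main_logits, main_labels, sub_logits, sub_labels,
                    main_weight=1.0, sub_weight=1.0):
    """Weighted sum of the main-task and sub-task cross-entropy losses."""
    ce = nn.CrossEntropyLoss()
    return (main_weight * ce(main_logits, main_labels)
            + sub_weight * ce(sub_logits, sub_labels))

# Demo with random logits/labels (batch of 4; 4 main classes, 3 sub-classes)
main_logits = torch.randn(4, 4)
main_labels = torch.tensor([0, 1, 2, 3])
sub_logits = torch.randn(4, 3)
sub_labels = torch.tensor([0, 1, 2, 0])
loss = multi_task_loss(main_logits, main_labels, sub_logits, sub_labels)
print(loss.item())
```

In phase 1, only the classification heads receive gradients, which lets them adapt to the frozen multilingual embeddings; phase 2 then fine-tunes the whole network at a lower learning rate, a common pattern with AdamW scheduling.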

## Model Performance

The model predicts both levels of the hierarchy, the main ESG factor and its sub-factor, for each input text; quantitative evaluation metrics are not reported in this card.

## Limitations

- Trained primarily on English and Vietnamese text
- Performance may vary on domain-specific or technical ESG content
- Best performance on texts similar to training data distribution

## Citation

```bibtex
@misc{esg_hierarchical_model,
  title={ESG Hierarchical Multi-Task Learning Model},
  author={Chung},
  year={2024},
  publisher={Hugging Face},
  url={https://huggingface.co/chungpt2123/test1}
}
```