Upload model
- README.md +21 -31
- config.json +33 -13
- model.safetensors +2 -2
README.md
CHANGED
@@ -9,46 +9,36 @@ pipeline_tag: token-classification
 ---
 
 # C-EBERT
-
+
+C-EBERT is a multi-task fine-tuned German EuroBERT to extract causal attribution.
 
 ## Model details
-- **Model architecture**:
-- **Fine-tuned on**:
-
-
-
-| **2. Relation Classification** | Sentence-Pair Classification | **14 Relation Labels** (e.g., MONO\_POS\_CAUSE, DIST\_NEG\_EFFECT, INTERDEPENDENCY, NO\_RELATION) |
+- **Model architecture**: EuroBERT-210m + token & relation heads
+- **Fine-tuned on**: environmental causal attribution corpus (German)
+- **Tasks**:
+1. Token classification (BIO tags for INDICATOR / ENTITY)
+2. Relation classification (CAUSE, EFFECT, INTERDEPENDENCY)
 
 ## Usage
 Find the custom [library](https://github.com/padjohn/causalbert). Once installed, run inference like so:
 ```python
-from
+from transformers import AutoTokenizer
+from causalbert.infer import load_model, analyze_sentence_with_confidence
 
-# NOTE: The model path accepts either a local directory or a Hugging Face Hub ID.
 model, tokenizer, config, device = load_model("pdjohn/C-EBERT")
+result = analyze_sentence_with_confidence(
+model, tokenizer, config, "Autoverkehr verursacht Bienensterben.", []
+)
+```
 
-
-sentences = ["Autoverkehr verursacht Bienensterben.", "Lärm ist der Grund für Stress."]
+## Training
 
-
-
-
-config,
-sentences,
-batch_size=8
-)
+- **Base model**: `EuroBERT/EuroBERT-210m`
+- **Epochs**: 3, **LR**: 2e-5, **Batch size**: 8
+- See [train.py](https://github.com/padjohn/causalbert/blob/main/causalbert/train.py) for details.
 
-
-print(all_results[0]['derived_relations'])
-# Example Output:
-# [(['Autoverkehr', 'verursacht'], ['Bienensterben']), {'label': 'MONO_POS_CAUSE', 'confidence': 0.954}]
-```
+## Limitations
 
-
--
--
-- Epochs: 8
-- Learning Rate: 1e-4
-- Batch size: 32
-- PEFT/LoRA: Enabled with r = 16
-See [train.py](https://github.com/padjohn/cbert/blob/main/causalbert/train.py) for the full configuration details.
+- German only.
+- Sentence-level; doesn’t handle cross-sentence causality.
+- Relation classification depends on detected spans — errors in token tagging propagate.
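The limitation that tagging errors propagate into relation classification is easier to see with a small sketch of how BIO tags are typically grouped into spans. This is an illustration only: `bio_to_spans` is a hypothetical helper, not part of the `causalbert` library, and the tag names `B-INDICATOR`, `I-INDICATOR`, `B-ENTITY`, and `I-ENTITY` are assumed from the README wording (the config only confirms `O` and a total of five span labels).

```python
from typing import List, Optional, Tuple


def bio_to_spans(tokens: List[str], tags: List[str]) -> List[Tuple[str, str]]:
    """Group BIO-tagged tokens into (span_type, text) pairs.

    Hypothetical helper for illustration; tag names are assumptions.
    """
    spans: List[Tuple[str, str]] = []
    current_type: Optional[str] = None
    current_tokens: List[str] = []

    def flush() -> None:
        # Close the span that is currently open, if any.
        if current_tokens:
            spans.append((current_type, " ".join(current_tokens)))

    for token, tag in zip(tokens, tags):
        prefix, _, span_type = tag.partition("-")
        if prefix == "B" or (prefix == "I" and span_type != current_type):
            # A B- tag (or a stray I- tag of another type) starts a new span.
            flush()
            current_type, current_tokens = span_type, [token]
        elif prefix == "I":
            # Continuation of the open span.
            current_tokens.append(token)
        else:
            # "O": outside any span.
            flush()
            current_type, current_tokens = None, []
    flush()
    return spans


tokens = ["Autoverkehr", "verursacht", "Bienensterben", "."]

# Correct tagging: two entities and one indicator are recovered.
good_tags = ["B-ENTITY", "B-INDICATOR", "B-ENTITY", "O"]
good_spans = bio_to_spans(tokens, good_tags)

# One tagging error: the indicator is missed, so no indicator span
# is available for a downstream relation step to work with.
bad_tags = ["B-ENTITY", "O", "B-ENTITY", "O"]
bad_spans = bio_to_spans(tokens, bad_tags)

print(good_spans)
print(bad_spans)
```

With the single mis-tagged token, the indicator span disappears entirely, which is the failure mode the last Limitations bullet describes.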
config.json
CHANGED
@@ -24,9 +24,19 @@
 "hidden_size": 768,
 "id2label_relation": {
 "0": "NO_RELATION",
-"1": "
-"
-"
+"1": "MONO_POS_CAUSE",
+"10": "MONO_NEG_EFFECT",
+"11": "DIST_NEG_EFFECT",
+"12": "PRIO_NEG_EFFECT",
+"13": "INTERDEPENDENCY",
+"2": "DIST_POS_CAUSE",
+"3": "PRIO_POS_CAUSE",
+"4": "MONO_NEG_CAUSE",
+"5": "DIST_NEG_CAUSE",
+"6": "PRIO_NEG_CAUSE",
+"7": "MONO_POS_EFFECT",
+"8": "DIST_POS_EFFECT",
+"9": "PRIO_POS_EFFECT"
 },
 "id2label_span": {
 "0": "O",
@@ -45,26 +55,36 @@
 "num_attention_heads": 12,
 "num_hidden_layers": 12,
 "num_key_value_heads": 12,
-"num_relation_labels":
+"num_relation_labels": 14,
 "num_span_labels": 5,
 "pad_token": "<|end_of_text|>",
 "pad_token_id": 128001,
 "pretraining_tp": 1,
 "relation_class_weights": [
-
-0.
-0.
-0.
+0.1,
+0.1,
+0.1,
+0.1,
+0.1,
+0.20260826579313382,
+0.32417322526901415,
+0.1,
+0.1,
+0.13507217719542255,
+0.1,
+0.10130413289656691,
+0.10805774175633805,
+0.1
 ],
 "rms_norm_eps": 1e-05,
 "rope_scaling": null,
 "rope_theta": 250000,
 "span_class_weights": [
-0.
-
-
-0.
-0.
+0.1,
+0.4253362505800068,
+0.288930595674656,
+0.19287324011981216,
+0.1
 ],
 "tie_word_embeddings": false,
 "torch_dtype": "bfloat16",
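The keys of `id2label_relation` are strings, so they appear in lexicographic order in the file (`"1"`, `"10"`, ..., `"13"`, `"2"`, ...). Anything that reads the labels positionally needs to sort them by integer id first. The sketch below pairs each relation label with its entry in `relation_class_weights`. It assumes the weight list is indexed by integer label id, which the config does not state explicitly, and the helper name `labels_in_id_order` is made up for this example.

```python
import json

# Values copied from the new side of the config.json diff.
config_fragment = json.loads("""
{
  "id2label_relation": {
    "0": "NO_RELATION", "1": "MONO_POS_CAUSE", "10": "MONO_NEG_EFFECT",
    "11": "DIST_NEG_EFFECT", "12": "PRIO_NEG_EFFECT", "13": "INTERDEPENDENCY",
    "2": "DIST_POS_CAUSE", "3": "PRIO_POS_CAUSE", "4": "MONO_NEG_CAUSE",
    "5": "DIST_NEG_CAUSE", "6": "PRIO_NEG_CAUSE", "7": "MONO_POS_EFFECT",
    "8": "DIST_POS_EFFECT", "9": "PRIO_POS_EFFECT"
  },
  "relation_class_weights": [
    0.1, 0.1, 0.1, 0.1, 0.1, 0.20260826579313382, 0.32417322526901415,
    0.1, 0.1, 0.13507217719542255, 0.1, 0.10130413289656691,
    0.10805774175633805, 0.1
  ]
}
""")


def labels_in_id_order(id2label):
    """Return label names sorted by integer id, not by string key."""
    return [id2label[key] for key in sorted(id2label, key=int)]


labels = labels_in_id_order(config_fragment["id2label_relation"])
weights = config_fragment["relation_class_weights"]

# ASSUMPTION: weights[i] belongs to the label with integer id i.
label_weights = dict(zip(labels, weights))
heaviest = max(label_weights, key=label_weights.get)

print(len(labels), "relation labels")
print("largest class weight:", heaviest)
```

Under that indexing assumption, the largest weight falls on `PRIO_NEG_CAUSE`. Sorting the keys as plain strings instead would attach it to a different label.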
model.safetensors
CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
-size
+oid sha256:d7195385e901afabb65634403d9dc549553d06453c9bbecbd2050aa0b02831b2
+size 423574076
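The `model.safetensors` entry in the repository is a Git LFS pointer, not the weights themselves. It records the SHA-256 digest and byte size of the real file. A downloaded copy can be checked against those two values with the standard library alone. The following is a minimal sketch: `parse_lfs_pointer` and `matches_pointer` are names invented for this example, and the local file path in the usage comment is hypothetical.

```python
import hashlib
from pathlib import Path

# Pointer contents from the new side of the diff.
POINTER_TEXT = """version https://git-lfs.github.com/spec/v1
oid sha256:d7195385e901afabb65634403d9dc549553d06453c9bbecbd2050aa0b02831b2
size 423574076
"""


def parse_lfs_pointer(text):
    """Parse 'key value' lines of a Git LFS pointer into a small dict."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algo, _, digest = fields["oid"].partition(":")
    return {"algo": algo, "digest": digest, "size": int(fields["size"])}


def matches_pointer(path, pointer, chunk_size=1 << 20):
    """Return True if the file at `path` has the pointer's size and digest."""
    path = Path(path)
    if path.stat().st_size != pointer["size"]:
        return False  # cheap check first; avoids hashing a truncated file
    hasher = hashlib.new(pointer["algo"])
    with path.open("rb") as handle:
        # Hash in chunks so large weight files are not loaded into memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest() == pointer["digest"]


pointer = parse_lfs_pointer(POINTER_TEXT)
print(pointer["algo"], pointer["size"])

# Usage (hypothetical local path):
# matches_pointer("C-EBERT/model.safetensors", pointer)
```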