QA-ModernBERT-large
This model is a fine-tuned version of answerdotai/ModernBERT-large on the saiteki-kai/Beavertails-it dataset. It achieves the following results on the evaluation set; the values match the epoch 2.0 row (step 33814) of the training results table:
- Loss: 0.0826
- Accuracy: 0.6608
- Macro F1: 0.6703
- Macro Precision: 0.6572
- Macro Recall: 0.6909
- Micro F1: 0.7482
- Micro Precision: 0.7288
- Micro Recall: 0.7687
- Flagged/accuracy: 0.8493
- Flagged/precision: 0.8469
- Flagged/recall: 0.8900
- Flagged/f1: 0.8679
- Flagged/aucpr: 0.8991
- Flagged/fpr: 0.2018
- Animal Abuse/accuracy: 0.9946
- Animal Abuse/precision: 0.7483
- Animal Abuse/recall: 0.7951
- Animal Abuse/f1: 0.7710
- Animal Abuse/fpr: 0.0031
- Animal Abuse/threshold: 0.2736
- Child Abuse/accuracy: 0.9968
- Child Abuse/precision: 0.75
- Child Abuse/recall: 0.6396
- Child Abuse/f1: 0.6904
- Child Abuse/fpr: 0.0012
- Child Abuse/threshold: 0.3748
- Controversial Topics,politics/accuracy: 0.9679
- Controversial Topics,politics/precision: 0.4800
- Controversial Topics,politics/recall: 0.5597
- Controversial Topics,politics/f1: 0.5168
- Controversial Topics,politics/fpr: 0.0192
- Controversial Topics,politics/threshold: 0.2822
- Discrimination,stereotype,injustice/accuracy: 0.9498
- Discrimination,stereotype,injustice/precision: 0.6553
- Discrimination,stereotype,injustice/recall: 0.7770
- Discrimination,stereotype,injustice/f1: 0.7109
- Discrimination,stereotype,injustice/fpr: 0.0353
- Discrimination,stereotype,injustice/threshold: 0.1871
- Drug Abuse,weapons,banned Substance/accuracy: 0.9728
- Drug Abuse,weapons,banned Substance/precision: 0.7361
- Drug Abuse,weapons,banned Substance/recall: 0.8048
- Drug Abuse,weapons,banned Substance/f1: 0.7689
- Drug Abuse,weapons,banned Substance/fpr: 0.0172
- Drug Abuse,weapons,banned Substance/threshold: 0.3478
- Financial Crime,property Crime,theft/accuracy: 0.9593
- Financial Crime,property Crime,theft/precision: 0.7569
- Financial Crime,property Crime,theft/recall: 0.8564
- Financial Crime,property Crime,theft/f1: 0.8036
- Financial Crime,property Crime,theft/fpr: 0.0297
- Financial Crime,property Crime,theft/threshold: 0.3831
- Hate Speech,offensive Language/accuracy: 0.9504
- Hate Speech,offensive Language/precision: 0.7616
- Hate Speech,offensive Language/recall: 0.6499
- Hate Speech,offensive Language/f1: 0.7013
- Hate Speech,offensive Language/fpr: 0.0200
- Hate Speech,offensive Language/threshold: 0.3886
- Misinformation Regarding Ethics,laws And Safety/accuracy: 0.9792
- Misinformation Regarding Ethics,laws And Safety/precision: 0.2061
- Misinformation Regarding Ethics,laws And Safety/recall: 0.2503
- Misinformation Regarding Ethics,laws And Safety/f1: 0.2261
- Misinformation Regarding Ethics,laws And Safety/fpr: 0.0119
- Misinformation Regarding Ethics,laws And Safety/threshold: 0.1871
- Non Violent Unethical Behavior/accuracy: 0.8783
- Non Violent Unethical Behavior/precision: 0.6977
- Non Violent Unethical Behavior/recall: 0.6839
- Non Violent Unethical Behavior/f1: 0.6907
- Non Violent Unethical Behavior/fpr: 0.0735
- Non Violent Unethical Behavior/threshold: 0.3478
- Privacy Violation/accuracy: 0.9805
- Privacy Violation/precision: 0.7845
- Privacy Violation/recall: 0.8345
- Privacy Violation/f1: 0.8087
- Privacy Violation/fpr: 0.0119
- Privacy Violation/threshold: 0.3469
- Self Harm/accuracy: 0.9966
- Self Harm/precision: 0.8035
- Self Harm/recall: 0.6683
- Self Harm/f1: 0.7297
- Self Harm/fpr: 0.0011
- Self Harm/threshold: 0.4879
- Sexually Explicit,adult Content/accuracy: 0.9834
- Sexually Explicit,adult Content/precision: 0.6342
- Sexually Explicit,adult Content/recall: 0.7381
- Sexually Explicit,adult Content/f1: 0.6822
- Sexually Explicit,adult Content/fpr: 0.0105
- Sexually Explicit,adult Content/threshold: 0.3124
- Terrorism,organized Crime/accuracy: 0.9888
- Terrorism,organized Crime/precision: 0.3634
- Terrorism,organized Crime/recall: 0.5364
- Terrorism,organized Crime/f1: 0.4332
- Terrorism,organized Crime/fpr: 0.0076
- Terrorism,organized Crime/threshold: 0.1733
- Violence,aiding And Abetting,incitement/accuracy: 0.9177
- Violence,aiding And Abetting,incitement/precision: 0.8232
- Violence,aiding And Abetting,incitement/recall: 0.8793
- Violence,aiding And Abetting,incitement/f1: 0.8503
- Violence,aiding And Abetting,incitement/fpr: 0.0684
- Violence,aiding And Abetting,incitement/threshold: 0.4301
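The per-category `threshold` values above suggest a multi-label setup in which each category has its own decision cutoff. The card does not say how the thresholds are applied, so the following is only a minimal sketch under two assumptions: the model emits one logit per category, and a category is predicted when its sigmoid probability is at or above the listed threshold. The helper `predict_labels` and the example logits are hypothetical and are not real model output.

```python
import math

# Per-category thresholds copied from the evaluation results above.
THRESHOLDS = {
    "Animal Abuse": 0.2736,
    "Child Abuse": 0.3748,
    "Controversial Topics,politics": 0.2822,
    "Discrimination,stereotype,injustice": 0.1871,
    "Drug Abuse,weapons,banned Substance": 0.3478,
    "Financial Crime,property Crime,theft": 0.3831,
    "Hate Speech,offensive Language": 0.3886,
    "Misinformation Regarding Ethics,laws And Safety": 0.1871,
    "Non Violent Unethical Behavior": 0.3478,
    "Privacy Violation": 0.3469,
    "Self Harm": 0.4879,
    "Sexually Explicit,adult Content": 0.3124,
    "Terrorism,organized Crime": 0.1733,
    "Violence,aiding And Abetting,incitement": 0.4301,
}


def sigmoid(x: float) -> float:
    """Map a logit to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def predict_labels(logits: dict[str, float]) -> list[str]:
    """Return the categories whose probability reaches their own threshold.

    Assumption: probabilities are compared with >= against the thresholds.
    """
    return [
        name
        for name, logit in logits.items()
        if sigmoid(logit) >= THRESHOLDS[name]
    ]


# Hypothetical logits for a single example (illustration only).
example_logits = {name: -4.0 for name in THRESHOLDS}
example_logits["Privacy Violation"] = 0.0  # probability 0.5, above 0.3469
example_logits["Self Harm"] = -0.2         # probability ~0.45, below 0.4879

print(predict_labels(example_logits))  # ['Privacy Violation']
```

Using a separate cutoff per category, rather than a single 0.5 cutoff, lets rare categories such as Terrorism,organized Crime trade precision for recall independently of the others.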
Model description
More information needed
Intended uses & limitations
More information needed
Training and evaluation data
More information needed
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 3e-05
- train_batch_size: 32
- eval_batch_size: 16
- seed: 42
- optimizer: AdamW (OptimizerNames.ADAMW_TORCH) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: cosine
- lr_scheduler_warmup_ratio: 0.03
- num_epochs: 10
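To make the scheduler settings concrete, here is a rough pure-Python sketch of a cosine schedule with linear warmup, in the shape commonly used with `lr_scheduler_type: cosine`. It is an illustration rather than the training code itself. Two things are assumptions: the 16907 steps per epoch are read off the Step column of the training results table, and the schedule is taken to span all 10 configured epochs even though the table lists only four.

```python
import math

BASE_LR = 3e-05          # learning_rate from the list above
WARMUP_RATIO = 0.03      # lr_scheduler_warmup_ratio from the list above
NUM_EPOCHS = 10          # num_epochs from the list above
STEPS_PER_EPOCH = 16907  # assumption: Step column value at epoch 1.0

TOTAL_STEPS = STEPS_PER_EPOCH * NUM_EPOCHS
WARMUP_STEPS = math.ceil(TOTAL_STEPS * WARMUP_RATIO)


def lr_at(step: int) -> float:
    """Learning rate at a given optimizer step (hypothetical helper)."""
    if step < WARMUP_STEPS:
        # Linear warmup from 0 up to BASE_LR.
        return BASE_LR * step / max(1, WARMUP_STEPS)
    # Half-cosine decay from BASE_LR down to 0 over the remaining steps.
    progress = (step - WARMUP_STEPS) / max(1, TOTAL_STEPS - WARMUP_STEPS)
    return BASE_LR * max(0.0, 0.5 * (1.0 + math.cos(math.pi * progress)))


if __name__ == "__main__":
    print("total steps:", TOTAL_STEPS)
    print("warmup steps:", WARMUP_STEPS)
    for epoch in (1, 2, 3, 4):
        step = epoch * STEPS_PER_EPOCH
        print(f"epoch {epoch}: lr = {lr_at(step):.3e}")
```

Under these assumptions, warmup covers only a small part of the first epoch, so every epoch boundary falls on the decaying part of the curve.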
Training results
| Training Loss | Epoch | Step | Validation Loss | Accuracy | Macro F1 | Macro Precision | Macro Recall | Micro F1 | Micro Precision | Micro Recall | Flagged/accuracy | Flagged/precision | Flagged/recall | Flagged/f1 | Flagged/aucpr | Flagged/fpr | Animal Abuse/accuracy | Animal Abuse/precision | Animal Abuse/recall | Animal Abuse/f1 | Animal Abuse/fpr | Animal Abuse/threshold | Child Abuse/accuracy | Child Abuse/precision | Child Abuse/recall | Child Abuse/f1 | Child Abuse/fpr | Child Abuse/threshold | Controversial Topics,politics/accuracy | Controversial Topics,politics/precision | Controversial Topics,politics/recall | Controversial Topics,politics/f1 | Controversial Topics,politics/fpr | Controversial Topics,politics/threshold | Discrimination,stereotype,injustice/accuracy | Discrimination,stereotype,injustice/precision | Discrimination,stereotype,injustice/recall | Discrimination,stereotype,injustice/f1 | Discrimination,stereotype,injustice/fpr | Discrimination,stereotype,injustice/threshold | Drug Abuse,weapons,banned Substance/accuracy | Drug Abuse,weapons,banned Substance/precision | Drug Abuse,weapons,banned Substance/recall | Drug Abuse,weapons,banned Substance/f1 | Drug Abuse,weapons,banned Substance/fpr | Drug Abuse,weapons,banned Substance/threshold | Financial Crime,property Crime,theft/accuracy | Financial Crime,property Crime,theft/precision | Financial Crime,property Crime,theft/recall | Financial Crime,property Crime,theft/f1 | Financial Crime,property Crime,theft/fpr | Financial Crime,property Crime,theft/threshold | Hate Speech,offensive Language/accuracy | Hate Speech,offensive Language/precision | Hate Speech,offensive Language/recall | Hate Speech,offensive Language/f1 | Hate Speech,offensive Language/fpr | Hate Speech,offensive Language/threshold | Misinformation Regarding Ethics,laws And Safety/accuracy | Misinformation Regarding Ethics,laws And Safety/precision | Misinformation Regarding Ethics,laws And Safety/recall | Misinformation Regarding Ethics,laws And Safety/f1 | Misinformation Regarding Ethics,laws And Safety/fpr | Misinformation Regarding Ethics,laws And Safety/threshold | Non Violent Unethical Behavior/accuracy | Non Violent Unethical Behavior/precision | Non Violent Unethical Behavior/recall | Non Violent Unethical Behavior/f1 | Non Violent Unethical Behavior/fpr | Non Violent Unethical Behavior/threshold | Privacy Violation/accuracy | Privacy Violation/precision | Privacy Violation/recall | Privacy Violation/f1 | Privacy Violation/fpr | Privacy Violation/threshold | Self Harm/accuracy | Self Harm/precision | Self Harm/recall | Self Harm/f1 | Self Harm/fpr | Self Harm/threshold | Sexually Explicit,adult Content/accuracy | Sexually Explicit,adult Content/precision | Sexually Explicit,adult Content/recall | Sexually Explicit,adult Content/f1 | Sexually Explicit,adult Content/fpr | Sexually Explicit,adult Content/threshold | Terrorism,organized Crime/accuracy | Terrorism,organized Crime/precision | Terrorism,organized Crime/recall | Terrorism,organized Crime/f1 | Terrorism,organized Crime/fpr | Terrorism,organized Crime/threshold | Violence,aiding And Abetting,incitement/accuracy | Violence,aiding And Abetting,incitement/precision | Violence,aiding And Abetting,incitement/recall | Violence,aiding And Abetting,incitement/f1 | Violence,aiding And Abetting,incitement/fpr | Violence,aiding And Abetting,incitement/threshold |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0.0661 | 1.0 | 16907 | 0.0857 | 0.6524 | 0.6585 | 0.6443 | 0.6803 | 0.7397 | 0.7260 | 0.7539 | 0.8381 | 0.8368 | 0.8808 | 0.8582 | 0.8919 | 0.2156 | 0.9943 | 0.732 | 0.7980 | 0.7636 | 0.0034 | 0.2991 | 0.9965 | 0.6958 | 0.6456 | 0.6698 | 0.0016 | 0.4990 | 0.9668 | 0.4633 | 0.5315 | 0.4951 | 0.0195 | 0.4073 | 0.9489 | 0.6495 | 0.7753 | 0.7068 | 0.0361 | 0.2323 | 0.9727 | 0.7445 | 0.7838 | 0.7636 | 0.0161 | 0.4796 | 0.9579 | 0.7569 | 0.8364 | 0.7947 | 0.0290 | 0.3621 | 0.9481 | 0.7377 | 0.6534 | 0.6930 | 0.0229 | 0.3711 | 0.9762 | 0.1727 | 0.2531 | 0.2053 | 0.0149 | 0.1613 | 0.8791 | 0.7098 | 0.6628 | 0.6855 | 0.0672 | 0.3951 | 0.9795 | 0.7608 | 0.8513 | 0.8035 | 0.0139 | 0.4525 | 0.9966 | 0.7982 | 0.6659 | 0.7261 | 0.0012 | 0.6766 | 0.9832 | 0.6364 | 0.7015 | 0.6673 | 0.0099 | 0.2838 | 0.9878 | 0.3306 | 0.5073 | 0.4003 | 0.0083 | 0.2031 | 0.9161 | 0.8319 | 0.8579 | 0.8447 | 0.0628 | 0.4031 |
| 0.0742 | 2.0 | 33814 | 0.0826 | 0.6608 | 0.6703 | 0.6572 | 0.6909 | 0.7482 | 0.7288 | 0.7687 | 0.8493 | 0.8469 | 0.8900 | 0.8679 | 0.8991 | 0.2018 | 0.9946 | 0.7483 | 0.7951 | 0.7710 | 0.0031 | 0.2736 | 0.9968 | 0.75 | 0.6396 | 0.6904 | 0.0012 | 0.3748 | 0.9679 | 0.4800 | 0.5597 | 0.5168 | 0.0192 | 0.2822 | 0.9498 | 0.6553 | 0.7770 | 0.7109 | 0.0353 | 0.1871 | 0.9728 | 0.7361 | 0.8048 | 0.7689 | 0.0172 | 0.3478 | 0.9593 | 0.7569 | 0.8564 | 0.8036 | 0.0297 | 0.3831 | 0.9504 | 0.7616 | 0.6499 | 0.7013 | 0.0200 | 0.3886 | 0.9792 | 0.2061 | 0.2503 | 0.2261 | 0.0119 | 0.1871 | 0.8783 | 0.6977 | 0.6839 | 0.6907 | 0.0735 | 0.3478 | 0.9805 | 0.7845 | 0.8345 | 0.8087 | 0.0119 | 0.3469 | 0.9966 | 0.8035 | 0.6683 | 0.7297 | 0.0011 | 0.4879 | 0.9834 | 0.6342 | 0.7381 | 0.6822 | 0.0105 | 0.3124 | 0.9888 | 0.3634 | 0.5364 | 0.4332 | 0.0076 | 0.1733 | 0.9177 | 0.8232 | 0.8793 | 0.8503 | 0.0684 | 0.4301 |
| 0.0601 | 3.0 | 50721 | 0.0815 | 0.6651 | 0.6685 | 0.6604 | 0.6828 | 0.7476 | 0.7331 | 0.7627 | 0.8492 | 0.8533 | 0.8804 | 0.8666 | 0.9001 | 0.1900 | 0.9944 | 0.7444 | 0.7791 | 0.7614 | 0.0031 | 0.3748 | 0.9965 | 0.7061 | 0.6276 | 0.6645 | 0.0015 | 0.3558 | 0.9655 | 0.4525 | 0.6004 | 0.5161 | 0.0230 | 0.2200 | 0.9545 | 0.7121 | 0.7176 | 0.7148 | 0.0251 | 0.3486 | 0.9732 | 0.7472 | 0.7927 | 0.7693 | 0.0160 | 0.4168 | 0.9593 | 0.7678 | 0.8340 | 0.7995 | 0.0272 | 0.4163 | 0.9487 | 0.7382 | 0.6621 | 0.6981 | 0.0231 | 0.4388 | 0.9810 | 0.2311 | 0.2421 | 0.2365 | 0.0099 | 0.1242 | 0.8777 | 0.6926 | 0.6914 | 0.6920 | 0.0761 | 0.3381 | 0.9813 | 0.8179 | 0.7997 | 0.8087 | 0.0092 | 0.4661 | 0.9968 | 0.8220 | 0.6756 | 0.7416 | 0.0010 | 0.5339 | 0.9826 | 0.6178 | 0.7229 | 0.6662 | 0.0110 | 0.3007 | 0.9891 | 0.3737 | 0.5322 | 0.4391 | 0.0072 | 0.3311 | 0.9177 | 0.8221 | 0.8814 | 0.8507 | 0.0691 | 0.4489 |
| 0.0628 | 4.0 | 67628 | 0.0822 | 0.6580 | 0.6635 | 0.6397 | 0.6939 | 0.7416 | 0.7225 | 0.7618 | 0.8473 | 0.8480 | 0.8840 | 0.8656 | 0.8983 | 0.1988 | 0.9947 | 0.7639 | 0.7805 | 0.7721 | 0.0028 | 0.3208 | 0.9961 | 0.6347 | 0.7147 | 0.6723 | 0.0023 | 0.3757 | 0.9672 | 0.4697 | 0.5434 | 0.5039 | 0.0194 | 0.3363 | 0.9524 | 0.6887 | 0.7331 | 0.7102 | 0.0286 | 0.3684 | 0.9717 | 0.7209 | 0.8107 | 0.7631 | 0.0187 | 0.4282 | 0.9577 | 0.7504 | 0.8474 | 0.7960 | 0.0304 | 0.4187 | 0.9467 | 0.7124 | 0.6781 | 0.6948 | 0.0269 | 0.4206 | 0.9755 | 0.1776 | 0.2804 | 0.2175 | 0.0160 | 0.1217 | 0.8747 | 0.6843 | 0.6862 | 0.6853 | 0.0785 | 0.3900 | 0.9812 | 0.8030 | 0.8193 | 0.8111 | 0.0104 | 0.5583 | 0.9964 | 0.7553 | 0.7 | 0.7266 | 0.0016 | 0.3407 | 0.9820 | 0.6006 | 0.7533 | 0.6683 | 0.0124 | 0.2393 | 0.9887 | 0.3562 | 0.5073 | 0.4185 | 0.0074 | 0.2408 | 0.9184 | 0.8376 | 0.8600 | 0.8487 | 0.0604 | 0.4756 |
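The aggregate columns in this table can be cross-checked against the per-category ones. As a small worked example, the sketch below takes the 14 per-category F1 scores from the epoch 2.0 row and averages them with equal weight, which is the usual definition of a macro average. That the reported Macro F1 was computed this way is an assumption, although the numbers agree to four decimals.

```python
from statistics import mean

# Per-category F1 scores copied from the epoch 2.0 row (step 33814).
F1_EPOCH_2 = {
    "Animal Abuse": 0.7710,
    "Child Abuse": 0.6904,
    "Controversial Topics,politics": 0.5168,
    "Discrimination,stereotype,injustice": 0.7109,
    "Drug Abuse,weapons,banned Substance": 0.7689,
    "Financial Crime,property Crime,theft": 0.8036,
    "Hate Speech,offensive Language": 0.7013,
    "Misinformation Regarding Ethics,laws And Safety": 0.2261,
    "Non Violent Unethical Behavior": 0.6907,
    "Privacy Violation": 0.8087,
    "Self Harm": 0.7297,
    "Sexually Explicit,adult Content": 0.6822,
    "Terrorism,organized Crime": 0.4332,
    "Violence,aiding And Abetting,incitement": 0.8503,
}

# Unweighted mean over categories, i.e. every category counts equally
# regardless of how many examples it has.
macro_f1 = mean(list(F1_EPOCH_2.values()))

print(round(macro_f1, 4))  # 0.6703, matching the Macro F1 column
```

Because the average is unweighted, low-scoring rare categories such as Misinformation Regarding Ethics,laws And Safety pull Macro F1 well below Micro F1 (0.7482 in the same row).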
Framework versions
- Transformers 4.57.1
- PyTorch 2.7.1+cu118
- Datasets 4.4.1
- Tokenizers 0.22.1