---
language:
- en
license: apache-2.0
tags:
- sentence-transformers
- sparse-encoder
- sparse
- splade
- generated_from_trainer
- dataset_size:99000
- loss:SpladeLoss
- loss:SparseMultipleNegativesRankingLoss
- loss:FlopsLoss
base_model: distilbert/distilbert-base-uncased
widget:
- text: 'The term emergent literacy signals a belief that, in a literate society,
young children even one and two year olds, are in the process of becoming literate”.
... Gray (1956:21) notes: Functional literacy is used for the training of adults
to ''meet independently the reading and writing demands placed on them''.'
- text: Rey is seemingly confirmed as being The Chosen One per a quote by a Lucasfilm
production designer who worked on The Rise of Skywalker.
- text: are union gun safes fireproof?
- text: Fruit is an essential part of a healthy diet and may aid weight loss. Most
fruits are low in calories while high in nutrients and fiber, which can boost
your fullness. Keep in mind that it's best to eat fruits whole rather than juiced.
What's more, simply eating fruit is not the key to weight loss.
- text: Treatment of suspected bacterial infection is with antibiotics, such as amoxicillin/clavulanate
or doxycycline, given for 5 to 7 days for acute sinusitis and for up to 6 weeks
for chronic sinusitis.
datasets:
- sentence-transformers/gooaq
pipeline_tag: feature-extraction
library_name: sentence-transformers
metrics:
- dot_accuracy@1
- dot_accuracy@3
- dot_accuracy@5
- dot_accuracy@10
- dot_precision@1
- dot_precision@3
- dot_precision@5
- dot_precision@10
- dot_recall@1
- dot_recall@3
- dot_recall@5
- dot_recall@10
- dot_ndcg@10
- dot_mrr@10
- dot_map@100
- query_active_dims
- query_sparsity_ratio
- corpus_active_dims
- corpus_sparsity_ratio
co2_eq_emissions:
emissions: 15.140869622791696
energy_consumed: 0.0389523841472174
source: codecarbon
training_type: fine-tuning
on_cloud: false
cpu_model: 13th Gen Intel(R) Core(TM) i7-13700K
ram_total_size: 31.777088165283203
hours_used: 0.154
hardware_used: 1 x NVIDIA GeForce RTX 3090
model-index:
- name: splade-distilbert-base-uncased trained on GooAQ
results:
- task:
type: sparse-information-retrieval
name: Sparse Information Retrieval
dataset:
name: NanoMSMARCO
type: NanoMSMARCO
metrics:
- type: dot_accuracy@1
value: 0.28
name: Dot Accuracy@1
- type: dot_accuracy@3
value: 0.5
name: Dot Accuracy@3
- type: dot_accuracy@5
value: 0.58
name: Dot Accuracy@5
- type: dot_accuracy@10
value: 0.74
name: Dot Accuracy@10
- type: dot_precision@1
value: 0.28
name: Dot Precision@1
- type: dot_precision@3
value: 0.16666666666666663
name: Dot Precision@3
- type: dot_precision@5
value: 0.11599999999999999
name: Dot Precision@5
- type: dot_precision@10
value: 0.07400000000000001
name: Dot Precision@10
- type: dot_recall@1
value: 0.28
name: Dot Recall@1
- type: dot_recall@3
value: 0.5
name: Dot Recall@3
- type: dot_recall@5
value: 0.58
name: Dot Recall@5
- type: dot_recall@10
value: 0.74
name: Dot Recall@10
- type: dot_ndcg@10
value: 0.4990149648564834
name: Dot Ndcg@10
- type: dot_mrr@10
value: 0.4237460317460318
name: Dot Mrr@10
- type: dot_map@100
value: 0.4349975076063166
name: Dot Map@100
- type: query_active_dims
value: 125.27999877929688
name: Query Active Dims
- type: query_sparsity_ratio
value: 0.9958954197372617
name: Query Sparsity Ratio
- type: corpus_active_dims
value: 312.56317138671875
name: Corpus Active Dims
- type: corpus_sparsity_ratio
value: 0.989759413819975
name: Corpus Sparsity Ratio
- type: dot_accuracy@1
value: 0.28
name: Dot Accuracy@1
- type: dot_accuracy@3
value: 0.56
name: Dot Accuracy@3
- type: dot_accuracy@5
value: 0.62
name: Dot Accuracy@5
- type: dot_accuracy@10
value: 0.72
name: Dot Accuracy@10
- type: dot_precision@1
value: 0.28
name: Dot Precision@1
- type: dot_precision@3
value: 0.18666666666666668
name: Dot Precision@3
- type: dot_precision@5
value: 0.124
name: Dot Precision@5
- type: dot_precision@10
value: 0.07200000000000001
name: Dot Precision@10
- type: dot_recall@1
value: 0.28
name: Dot Recall@1
- type: dot_recall@3
value: 0.56
name: Dot Recall@3
- type: dot_recall@5
value: 0.62
name: Dot Recall@5
- type: dot_recall@10
value: 0.72
name: Dot Recall@10
- type: dot_ndcg@10
value: 0.4890243338331294
name: Dot Ndcg@10
- type: dot_mrr@10
value: 0.41600000000000004
name: Dot Mrr@10
- type: dot_map@100
value: 0.43012741081376565
name: Dot Map@100
- type: query_active_dims
value: 111.45999908447266
name: Query Active Dims
- type: query_sparsity_ratio
value: 0.9963482078800711
name: Query Sparsity Ratio
- type: corpus_active_dims
value: 310.84136962890625
name: Corpus Active Dims
- type: corpus_sparsity_ratio
value: 0.9898158256461271
name: Corpus Sparsity Ratio
- task:
type: sparse-information-retrieval
name: Sparse Information Retrieval
dataset:
name: NanoNFCorpus
type: NanoNFCorpus
metrics:
- type: dot_accuracy@1
value: 0.32
name: Dot Accuracy@1
- type: dot_accuracy@3
value: 0.46
name: Dot Accuracy@3
- type: dot_accuracy@5
value: 0.56
name: Dot Accuracy@5
- type: dot_accuracy@10
value: 0.62
name: Dot Accuracy@10
- type: dot_precision@1
value: 0.32
name: Dot Precision@1
- type: dot_precision@3
value: 0.2866666666666666
name: Dot Precision@3
- type: dot_precision@5
value: 0.256
name: Dot Precision@5
- type: dot_precision@10
value: 0.21999999999999997
name: Dot Precision@10
- type: dot_recall@1
value: 0.010409862362909712
name: Dot Recall@1
- type: dot_recall@3
value: 0.027097764372744422
name: Dot Recall@3
- type: dot_recall@5
value: 0.04059644876604124
name: Dot Recall@5
- type: dot_recall@10
value: 0.07167100144175235
name: Dot Recall@10
- type: dot_ndcg@10
value: 0.24193545696769683
name: Dot Ndcg@10
- type: dot_mrr@10
value: 0.4157460317460317
name: Dot Mrr@10
- type: dot_map@100
value: 0.08023490333337685
name: Dot Map@100
- type: query_active_dims
value: 176.75999450683594
name: Query Active Dims
- type: query_sparsity_ratio
value: 0.994208767626406
name: Query Sparsity Ratio
- type: corpus_active_dims
value: 517.3054809570312
name: Corpus Active Dims
- type: corpus_sparsity_ratio
value: 0.9830513897858256
name: Corpus Sparsity Ratio
- type: dot_accuracy@1
value: 0.24
name: Dot Accuracy@1
- type: dot_accuracy@3
value: 0.46
name: Dot Accuracy@3
- type: dot_accuracy@5
value: 0.5
name: Dot Accuracy@5
- type: dot_accuracy@10
value: 0.58
name: Dot Accuracy@10
- type: dot_precision@1
value: 0.24
name: Dot Precision@1
- type: dot_precision@3
value: 0.29333333333333333
name: Dot Precision@3
- type: dot_precision@5
value: 0.252
name: Dot Precision@5
- type: dot_precision@10
value: 0.21400000000000002
name: Dot Precision@10
- type: dot_recall@1
value: 0.00781726182205936
name: Dot Recall@1
- type: dot_recall@3
value: 0.039232523010377704
name: Dot Recall@3
- type: dot_recall@5
value: 0.06598978333552999
name: Dot Recall@5
- type: dot_recall@10
value: 0.08530284648826633
name: Dot Recall@10
- type: dot_ndcg@10
value: 0.23949057652351813
name: Dot Ndcg@10
- type: dot_mrr@10
value: 0.3643571428571429
name: Dot Mrr@10
- type: dot_map@100
value: 0.0903057527304283
name: Dot Map@100
- type: query_active_dims
value: 156.66000366210938
name: Query Active Dims
- type: query_sparsity_ratio
value: 0.9948673087064377
name: Query Sparsity Ratio
- type: corpus_active_dims
value: 505.35760498046875
name: Corpus Active Dims
- type: corpus_sparsity_ratio
value: 0.9834428410661009
name: Corpus Sparsity Ratio
- task:
type: sparse-information-retrieval
name: Sparse Information Retrieval
dataset:
name: NanoNQ
type: NanoNQ
metrics:
- type: dot_accuracy@1
value: 0.22
name: Dot Accuracy@1
- type: dot_accuracy@3
value: 0.54
name: Dot Accuracy@3
- type: dot_accuracy@5
value: 0.66
name: Dot Accuracy@5
- type: dot_accuracy@10
value: 0.76
name: Dot Accuracy@10
- type: dot_precision@1
value: 0.22
name: Dot Precision@1
- type: dot_precision@3
value: 0.18
name: Dot Precision@3
- type: dot_precision@5
value: 0.14
name: Dot Precision@5
- type: dot_precision@10
value: 0.08
name: Dot Precision@10
- type: dot_recall@1
value: 0.21
name: Dot Recall@1
- type: dot_recall@3
value: 0.49
name: Dot Recall@3
- type: dot_recall@5
value: 0.63
name: Dot Recall@5
- type: dot_recall@10
value: 0.73
name: Dot Recall@10
- type: dot_ndcg@10
value: 0.4687690734978834
name: Dot Ndcg@10
- type: dot_mrr@10
value: 0.3937460317460317
name: Dot Mrr@10
- type: dot_map@100
value: 0.39105328798069705
name: Dot Map@100
- type: query_active_dims
value: 123.33999633789062
name: Query Active Dims
- type: query_sparsity_ratio
value: 0.9959589805275575
name: Query Sparsity Ratio
- type: corpus_active_dims
value: 371.6484680175781
name: Corpus Active Dims
- type: corpus_sparsity_ratio
value: 0.9878235873134926
name: Corpus Sparsity Ratio
- type: dot_accuracy@1
value: 0.36
name: Dot Accuracy@1
- type: dot_accuracy@3
value: 0.6
name: Dot Accuracy@3
- type: dot_accuracy@5
value: 0.68
name: Dot Accuracy@5
- type: dot_accuracy@10
value: 0.76
name: Dot Accuracy@10
- type: dot_precision@1
value: 0.36
name: Dot Precision@1
- type: dot_precision@3
value: 0.20666666666666664
name: Dot Precision@3
- type: dot_precision@5
value: 0.14
name: Dot Precision@5
- type: dot_precision@10
value: 0.08
name: Dot Precision@10
- type: dot_recall@1
value: 0.35
name: Dot Recall@1
- type: dot_recall@3
value: 0.58
name: Dot Recall@3
- type: dot_recall@5
value: 0.65
name: Dot Recall@5
- type: dot_recall@10
value: 0.72
name: Dot Recall@10
- type: dot_ndcg@10
value: 0.544210628480855
name: Dot Ndcg@10
- type: dot_mrr@10
value: 0.4959365079365079
name: Dot Mrr@10
- type: dot_map@100
value: 0.49453898976712246
name: Dot Map@100
- type: query_active_dims
value: 103.9000015258789
name: Query Active Dims
- type: query_sparsity_ratio
value: 0.9965958979907648
name: Query Sparsity Ratio
- type: corpus_active_dims
value: 356.2113342285156
name: Corpus Active Dims
- type: corpus_sparsity_ratio
value: 0.9883293580293391
name: Corpus Sparsity Ratio
- task:
type: sparse-nano-beir
name: Sparse Nano BEIR
dataset:
name: NanoBEIR mean
type: NanoBEIR_mean
metrics:
- type: dot_accuracy@1
value: 0.2733333333333334
name: Dot Accuracy@1
- type: dot_accuracy@3
value: 0.5
name: Dot Accuracy@3
- type: dot_accuracy@5
value: 0.6000000000000001
name: Dot Accuracy@5
- type: dot_accuracy@10
value: 0.7066666666666667
name: Dot Accuracy@10
- type: dot_precision@1
value: 0.2733333333333334
name: Dot Precision@1
- type: dot_precision@3
value: 0.2111111111111111
name: Dot Precision@3
- type: dot_precision@5
value: 0.17066666666666666
name: Dot Precision@5
- type: dot_precision@10
value: 0.12466666666666666
name: Dot Precision@10
- type: dot_recall@1
value: 0.16680328745430326
name: Dot Recall@1
- type: dot_recall@3
value: 0.33903258812424814
name: Dot Recall@3
- type: dot_recall@5
value: 0.4168654829220137
name: Dot Recall@5
- type: dot_recall@10
value: 0.5138903338139175
name: Dot Recall@10
- type: dot_ndcg@10
value: 0.4032398317740212
name: Dot Ndcg@10
- type: dot_mrr@10
value: 0.4110793650793651
name: Dot Mrr@10
- type: dot_map@100
value: 0.3020952329734635
name: Dot Map@100
- type: query_active_dims
value: 141.79332987467447
name: Query Active Dims
- type: query_sparsity_ratio
value: 0.995354389297075
name: Query Sparsity Ratio
- type: corpus_active_dims
value: 381.7902843249054
name: Corpus Active Dims
- type: corpus_sparsity_ratio
value: 0.9874913084226162
name: Corpus Sparsity Ratio
- type: dot_accuracy@1
value: 0.407032967032967
name: Dot Accuracy@1
- type: dot_accuracy@3
value: 0.6120565149136579
name: Dot Accuracy@3
- type: dot_accuracy@5
value: 0.6813500784929356
name: Dot Accuracy@5
- type: dot_accuracy@10
value: 0.7599058084772369
name: Dot Accuracy@10
- type: dot_precision@1
value: 0.407032967032967
name: Dot Precision@1
- type: dot_precision@3
value: 0.27516483516483514
name: Dot Precision@3
- type: dot_precision@5
value: 0.2152339089481947
name: Dot Precision@5
- type: dot_precision@10
value: 0.15498273155416015
name: Dot Precision@10
- type: dot_recall@1
value: 0.2321487370669766
name: Dot Recall@1
- type: dot_recall@3
value: 0.38956098353180224
name: Dot Recall@3
- type: dot_recall@5
value: 0.4491992568319475
name: Dot Recall@5
- type: dot_recall@10
value: 0.5287356542054744
name: Dot Recall@10
- type: dot_ndcg@10
value: 0.47211549606581404
name: Dot Ndcg@10
- type: dot_mrr@10
value: 0.5232897261468691
name: Dot Mrr@10
- type: dot_map@100
value: 0.3983099175669993
name: Dot Map@100
- type: query_active_dims
value: 180.88135636128337
name: Query Active Dims
- type: query_sparsity_ratio
value: 0.9940737384063533
name: Query Sparsity Ratio
- type: corpus_active_dims
value: 360.7380946225432
name: Corpus Active Dims
- type: corpus_sparsity_ratio
value: 0.9881810466344753
name: Corpus Sparsity Ratio
- task:
type: sparse-information-retrieval
name: Sparse Information Retrieval
dataset:
name: NanoClimateFEVER
type: NanoClimateFEVER
metrics:
- type: dot_accuracy@1
value: 0.24
name: Dot Accuracy@1
- type: dot_accuracy@3
value: 0.44
name: Dot Accuracy@3
- type: dot_accuracy@5
value: 0.52
name: Dot Accuracy@5
- type: dot_accuracy@10
value: 0.62
name: Dot Accuracy@10
- type: dot_precision@1
value: 0.24
name: Dot Precision@1
- type: dot_precision@3
value: 0.15333333333333332
name: Dot Precision@3
- type: dot_precision@5
value: 0.11199999999999999
name: Dot Precision@5
- type: dot_precision@10
value: 0.07400000000000001
name: Dot Precision@10
- type: dot_recall@1
value: 0.115
name: Dot Recall@1
- type: dot_recall@3
value: 0.20566666666666664
name: Dot Recall@3
- type: dot_recall@5
value: 0.254
name: Dot Recall@5
- type: dot_recall@10
value: 0.303
name: Dot Recall@10
- type: dot_ndcg@10
value: 0.25094049425975773
name: Dot Ndcg@10
- type: dot_mrr@10
value: 0.3527698412698412
name: Dot Mrr@10
- type: dot_map@100
value: 0.1922739754314787
name: Dot Map@100
- type: query_active_dims
value: 240.47999572753906
name: Query Active Dims
- type: query_sparsity_ratio
value: 0.992121093122091
name: Query Sparsity Ratio
- type: corpus_active_dims
value: 398.276123046875
name: Corpus Active Dims
- type: corpus_sparsity_ratio
value: 0.9869511787220079
name: Corpus Sparsity Ratio
- task:
type: sparse-information-retrieval
name: Sparse Information Retrieval
dataset:
name: NanoDBPedia
type: NanoDBPedia
metrics:
- type: dot_accuracy@1
value: 0.6
name: Dot Accuracy@1
- type: dot_accuracy@3
value: 0.78
name: Dot Accuracy@3
- type: dot_accuracy@5
value: 0.84
name: Dot Accuracy@5
- type: dot_accuracy@10
value: 0.9
name: Dot Accuracy@10
- type: dot_precision@1
value: 0.6
name: Dot Precision@1
- type: dot_precision@3
value: 0.4866666666666666
name: Dot Precision@3
- type: dot_precision@5
value: 0.44799999999999995
name: Dot Precision@5
- type: dot_precision@10
value: 0.38800000000000007
name: Dot Precision@10
- type: dot_recall@1
value: 0.07504886736842241
name: Dot Recall@1
- type: dot_recall@3
value: 0.14295532639016004
name: Dot Recall@3
- type: dot_recall@5
value: 0.17962338014181914
name: Dot Recall@5
- type: dot_recall@10
value: 0.2627232266447939
name: Dot Recall@10
- type: dot_ndcg@10
value: 0.4899447932345138
name: Dot Ndcg@10
- type: dot_mrr@10
value: 0.7126666666666667
name: Dot Mrr@10
- type: dot_map@100
value: 0.3806659729692501
name: Dot Map@100
- type: query_active_dims
value: 159.22000122070312
name: Query Active Dims
- type: query_sparsity_ratio
value: 0.9947834348594226
name: Query Sparsity Ratio
- type: corpus_active_dims
value: 347.9973449707031
name: Corpus Active Dims
- type: corpus_sparsity_ratio
value: 0.9885984750353612
name: Corpus Sparsity Ratio
- task:
type: sparse-information-retrieval
name: Sparse Information Retrieval
dataset:
name: NanoFEVER
type: NanoFEVER
metrics:
- type: dot_accuracy@1
value: 0.54
name: Dot Accuracy@1
- type: dot_accuracy@3
value: 0.78
name: Dot Accuracy@3
- type: dot_accuracy@5
value: 0.9
name: Dot Accuracy@5
- type: dot_accuracy@10
value: 0.9
name: Dot Accuracy@10
- type: dot_precision@1
value: 0.54
name: Dot Precision@1
- type: dot_precision@3
value: 0.26
name: Dot Precision@3
- type: dot_precision@5
value: 0.18
name: Dot Precision@5
- type: dot_precision@10
value: 0.09399999999999999
name: Dot Precision@10
- type: dot_recall@1
value: 0.5166666666666666
name: Dot Recall@1
- type: dot_recall@3
value: 0.7266666666666667
name: Dot Recall@3
- type: dot_recall@5
value: 0.8366666666666667
name: Dot Recall@5
- type: dot_recall@10
value: 0.8566666666666667
name: Dot Recall@10
- type: dot_ndcg@10
value: 0.7043014395888793
name: Dot Ndcg@10
- type: dot_mrr@10
value: 0.6723333333333333
name: Dot Mrr@10
- type: dot_map@100
value: 0.6525721659293088
name: Dot Map@100
- type: query_active_dims
value: 211.27999877929688
name: Query Active Dims
- type: query_sparsity_ratio
value: 0.9930777800019889
name: Query Sparsity Ratio
- type: corpus_active_dims
value: 428.28521728515625
name: Corpus Active Dims
- type: corpus_sparsity_ratio
value: 0.9859679831831087
name: Corpus Sparsity Ratio
- task:
type: sparse-information-retrieval
name: Sparse Information Retrieval
dataset:
name: NanoFiQA2018
type: NanoFiQA2018
metrics:
- type: dot_accuracy@1
value: 0.32
name: Dot Accuracy@1
- type: dot_accuracy@3
value: 0.48
name: Dot Accuracy@3
- type: dot_accuracy@5
value: 0.52
name: Dot Accuracy@5
- type: dot_accuracy@10
value: 0.62
name: Dot Accuracy@10
- type: dot_precision@1
value: 0.32
name: Dot Precision@1
- type: dot_precision@3
value: 0.21333333333333332
name: Dot Precision@3
- type: dot_precision@5
value: 0.15600000000000003
name: Dot Precision@5
- type: dot_precision@10
value: 0.102
name: Dot Precision@10
- type: dot_recall@1
value: 0.18219047619047618
name: Dot Recall@1
- type: dot_recall@3
value: 0.29224603174603175
name: Dot Recall@3
- type: dot_recall@5
value: 0.34507936507936504
name: Dot Recall@5
- type: dot_recall@10
value: 0.46135714285714285
name: Dot Recall@10
- type: dot_ndcg@10
value: 0.3638661029288861
name: Dot Ndcg@10
- type: dot_mrr@10
value: 0.40916666666666673
name: Dot Mrr@10
- type: dot_map@100
value: 0.3016646430756084
name: Dot Map@100
- type: query_active_dims
value: 103.12000274658203
name: Query Active Dims
- type: query_sparsity_ratio
value: 0.9966214532879044
name: Query Sparsity Ratio
- type: corpus_active_dims
value: 340.604248046875
name: Corpus Active Dims
- type: corpus_sparsity_ratio
value: 0.9888406969383764
name: Corpus Sparsity Ratio
- task:
type: sparse-information-retrieval
name: Sparse Information Retrieval
dataset:
name: NanoHotpotQA
type: NanoHotpotQA
metrics:
- type: dot_accuracy@1
value: 0.72
name: Dot Accuracy@1
- type: dot_accuracy@3
value: 0.78
name: Dot Accuracy@3
- type: dot_accuracy@5
value: 0.8
name: Dot Accuracy@5
- type: dot_accuracy@10
value: 0.92
name: Dot Accuracy@10
- type: dot_precision@1
value: 0.72
name: Dot Precision@1
- type: dot_precision@3
value: 0.4133333333333333
name: Dot Precision@3
- type: dot_precision@5
value: 0.25999999999999995
name: Dot Precision@5
- type: dot_precision@10
value: 0.152
name: Dot Precision@10
- type: dot_recall@1
value: 0.36
name: Dot Recall@1
- type: dot_recall@3
value: 0.62
name: Dot Recall@3
- type: dot_recall@5
value: 0.65
name: Dot Recall@5
- type: dot_recall@10
value: 0.76
name: Dot Recall@10
- type: dot_ndcg@10
value: 0.6875870542669431
name: Dot Ndcg@10
- type: dot_mrr@10
value: 0.7661031746031746
name: Dot Mrr@10
- type: dot_map@100
value: 0.6246470320917622
name: Dot Map@100
- type: query_active_dims
value: 132.77999877929688
name: Query Active Dims
- type: query_sparsity_ratio
value: 0.9956496953417437
name: Query Sparsity Ratio
- type: corpus_active_dims
value: 392.06817626953125
name: Corpus Active Dims
- type: corpus_sparsity_ratio
value: 0.987154571251244
name: Corpus Sparsity Ratio
- task:
type: sparse-information-retrieval
name: Sparse Information Retrieval
dataset:
name: NanoQuoraRetrieval
type: NanoQuoraRetrieval
metrics:
- type: dot_accuracy@1
value: 0.5
name: Dot Accuracy@1
- type: dot_accuracy@3
value: 0.74
name: Dot Accuracy@3
- type: dot_accuracy@5
value: 0.84
name: Dot Accuracy@5
- type: dot_accuracy@10
value: 0.96
name: Dot Accuracy@10
- type: dot_precision@1
value: 0.5
name: Dot Precision@1
- type: dot_precision@3
value: 0.26
name: Dot Precision@3
- type: dot_precision@5
value: 0.18799999999999997
name: Dot Precision@5
- type: dot_precision@10
value: 0.118
name: Dot Precision@10
- type: dot_recall@1
value: 0.49
name: Dot Recall@1
- type: dot_recall@3
value: 0.7066666666666666
name: Dot Recall@3
- type: dot_recall@5
value: 0.8013333333333332
name: Dot Recall@5
- type: dot_recall@10
value: 0.9133333333333334
name: Dot Recall@10
- type: dot_ndcg@10
value: 0.7141904974177148
name: Dot Ndcg@10
- type: dot_mrr@10
value: 0.6457777777777778
name: Dot Mrr@10
- type: dot_map@100
value: 0.6498892824353822
name: Dot Map@100
- type: query_active_dims
value: 63.400001525878906
name: Query Active Dims
- type: query_sparsity_ratio
value: 0.9979228097265619
name: Query Sparsity Ratio
- type: corpus_active_dims
value: 73.4577865600586
name: Corpus Active Dims
- type: corpus_sparsity_ratio
value: 0.9975932839735254
name: Corpus Sparsity Ratio
- task:
type: sparse-information-retrieval
name: Sparse Information Retrieval
dataset:
name: NanoSCIDOCS
type: NanoSCIDOCS
metrics:
- type: dot_accuracy@1
value: 0.36
name: Dot Accuracy@1
- type: dot_accuracy@3
value: 0.54
name: Dot Accuracy@3
- type: dot_accuracy@5
value: 0.68
name: Dot Accuracy@5
- type: dot_accuracy@10
value: 0.76
name: Dot Accuracy@10
- type: dot_precision@1
value: 0.36
name: Dot Precision@1
- type: dot_precision@3
value: 0.24666666666666667
name: Dot Precision@3
- type: dot_precision@5
value: 0.212
name: Dot Precision@5
- type: dot_precision@10
value: 0.152
name: Dot Precision@10
- type: dot_recall@1
value: 0.07566666666666669
name: Dot Recall@1
- type: dot_recall@3
value: 0.15266666666666667
name: Dot Recall@3
- type: dot_recall@5
value: 0.21866666666666668
name: Dot Recall@5
- type: dot_recall@10
value: 0.31366666666666665
name: Dot Recall@10
- type: dot_ndcg@10
value: 0.29765744924419957
name: Dot Ndcg@10
- type: dot_mrr@10
value: 0.47960317460317453
name: Dot Mrr@10
- type: dot_map@100
value: 0.21728096438859665
name: Dot Map@100
- type: query_active_dims
value: 247.55999755859375
name: Query Active Dims
- type: query_sparsity_ratio
value: 0.9918891292327306
name: Query Sparsity Ratio
- type: corpus_active_dims
value: 424.1747131347656
name: Corpus Active Dims
- type: corpus_sparsity_ratio
value: 0.9861026566694592
name: Corpus Sparsity Ratio
- task:
type: sparse-information-retrieval
name: Sparse Information Retrieval
dataset:
name: NanoArguAna
type: NanoArguAna
metrics:
- type: dot_accuracy@1
value: 0.06
name: Dot Accuracy@1
- type: dot_accuracy@3
value: 0.38
name: Dot Accuracy@3
- type: dot_accuracy@5
value: 0.44
name: Dot Accuracy@5
- type: dot_accuracy@10
value: 0.5
name: Dot Accuracy@10
- type: dot_precision@1
value: 0.06
name: Dot Precision@1
- type: dot_precision@3
value: 0.12666666666666665
name: Dot Precision@3
- type: dot_precision@5
value: 0.08800000000000001
name: Dot Precision@5
- type: dot_precision@10
value: 0.05000000000000001
name: Dot Precision@10
- type: dot_recall@1
value: 0.06
name: Dot Recall@1
- type: dot_recall@3
value: 0.38
name: Dot Recall@3
- type: dot_recall@5
value: 0.44
name: Dot Recall@5
- type: dot_recall@10
value: 0.5
name: Dot Recall@10
- type: dot_ndcg@10
value: 0.29205612820937377
name: Dot Ndcg@10
- type: dot_mrr@10
value: 0.22366666666666668
name: Dot Mrr@10
- type: dot_map@100
value: 0.23474188188747466
name: Dot Map@100
- type: query_active_dims
value: 477.05999755859375
name: Query Active Dims
- type: query_sparsity_ratio
value: 0.9843699627298803
name: Query Sparsity Ratio
- type: corpus_active_dims
value: 455.6429138183594
name: Corpus Active Dims
- type: corpus_sparsity_ratio
value: 0.9850716560573239
name: Corpus Sparsity Ratio
- task:
type: sparse-information-retrieval
name: Sparse Information Retrieval
dataset:
name: NanoSciFact
type: NanoSciFact
metrics:
- type: dot_accuracy@1
value: 0.5
name: Dot Accuracy@1
- type: dot_accuracy@3
value: 0.58
name: Dot Accuracy@3
- type: dot_accuracy@5
value: 0.64
name: Dot Accuracy@5
- type: dot_accuracy@10
value: 0.7
name: Dot Accuracy@10
- type: dot_precision@1
value: 0.5
name: Dot Precision@1
- type: dot_precision@3
value: 0.20666666666666664
name: Dot Precision@3
- type: dot_precision@5
value: 0.136
name: Dot Precision@5
- type: dot_precision@10
value: 0.08
name: Dot Precision@10
- type: dot_recall@1
value: 0.465
name: Dot Recall@1
- type: dot_recall@3
value: 0.545
name: Dot Recall@3
- type: dot_recall@5
value: 0.605
name: Dot Recall@5
- type: dot_recall@10
value: 0.69
name: Dot Recall@10
- type: dot_ndcg@10
value: 0.5783252903985125
name: Dot Ndcg@10
- type: dot_mrr@10
value: 0.5562222222222222
name: Dot Mrr@10
- type: dot_map@100
value: 0.5447263194322018
name: Dot Map@100
- type: query_active_dims
value: 280.32000732421875
name: Query Active Dims
- type: query_sparsity_ratio
value: 0.9908158047531544
name: Query Sparsity Ratio
- type: corpus_active_dims
value: 451.07366943359375
name: Corpus Active Dims
- type: corpus_sparsity_ratio
value: 0.9852213593659134
name: Corpus Sparsity Ratio
- task:
type: sparse-information-retrieval
name: Sparse Information Retrieval
dataset:
name: NanoTouche2020
type: NanoTouche2020
metrics:
- type: dot_accuracy@1
value: 0.5714285714285714
name: Dot Accuracy@1
- type: dot_accuracy@3
value: 0.8367346938775511
name: Dot Accuracy@3
- type: dot_accuracy@5
value: 0.8775510204081632
name: Dot Accuracy@5
- type: dot_accuracy@10
value: 0.9387755102040817
name: Dot Accuracy@10
- type: dot_precision@1
value: 0.5714285714285714
name: Dot Precision@1
- type: dot_precision@3
value: 0.5238095238095238
name: Dot Precision@3
- type: dot_precision@5
value: 0.5020408163265306
name: Dot Precision@5
- type: dot_precision@10
value: 0.4387755102040816
name: Dot Precision@10
- type: dot_recall@1
value: 0.040543643156404505
name: Dot Recall@1
- type: dot_recall@3
value: 0.11319223810019327
name: Dot Recall@3
- type: dot_recall@5
value: 0.17323114359193661
name: Dot Recall@5
- type: dot_recall@10
value: 0.2875136220142983
name: Dot Recall@10
- type: dot_ndcg@10
value: 0.4859066604692993
name: Dot Ndcg@10
- type: dot_mrr@10
value: 0.7081632653061225
name: Dot Mrr@10
- type: dot_map@100
value: 0.36459453741861036
name: Dot Map@100
- type: query_active_dims
value: 61.836734771728516
name: Query Active Dims
- type: query_sparsity_ratio
value: 0.9979740274303215
name: Query Sparsity Ratio
- type: corpus_active_dims
value: 380.6022644042969
name: Corpus Active Dims
- type: corpus_sparsity_ratio
value: 0.9875302318195303
name: Corpus Sparsity Ratio
---
# splade-distilbert-base-uncased trained on GooAQ
This is a [SPLADE Sparse Encoder](https://www.sbert.net/docs/sparse_encoder/usage/usage.html) model fine-tuned from [distilbert/distilbert-base-uncased](https://huggingface.co/distilbert/distilbert-base-uncased) on the [gooaq](https://huggingface.co/datasets/sentence-transformers/gooaq) dataset using the [sentence-transformers](https://www.SBERT.net) library. It maps sentences & paragraphs to a 30522-dimensional sparse vector space and can be used for semantic search and sparse retrieval.
## Model Details
### Model Description
- **Model Type:** SPLADE Sparse Encoder
- **Base model:** [distilbert/distilbert-base-uncased](https://huggingface.co/distilbert/distilbert-base-uncased) <!-- at revision 12040accade4e8a0f71eabdb258fecc2e7e948be -->
- **Maximum Sequence Length:** 256 tokens
- **Output Dimensionality:** 30522 dimensions
- **Similarity Function:** Dot Product
- **Training Dataset:**
- [gooaq](https://huggingface.co/datasets/sentence-transformers/gooaq)
- **Language:** en
- **License:** apache-2.0
### Model Sources
- **Documentation:** [Sentence Transformers Documentation](https://sbert.net)
- **Documentation:** [Sparse Encoder Documentation](https://www.sbert.net/docs/sparse_encoder/usage/usage.html)
- **Repository:** [Sentence Transformers on GitHub](https://github.com/UKPLab/sentence-transformers)
- **Hugging Face:** [Sparse Encoders on Hugging Face](https://huggingface.co/models?library=sentence-transformers&other=sparse-encoder)
### Full Model Architecture
```
SparseEncoder(
(0): MLMTransformer({'max_seq_length': 256, 'do_lower_case': False, 'architecture': 'DistilBertForMaskedLM'})
(1): SpladePooling({'pooling_strategy': 'max', 'activation_function': 'relu', 'word_embedding_dimension': 30522})
)
```
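Reading the two modules together: the `MLMTransformer` produces per-token logits over the 30522-token vocabulary, and `SpladePooling` turns them into a single sparse vector with a ReLU, a log(1 + x) saturation, and max pooling over the sequence, following the SPLADE formulation. The snippet below is a minimal illustrative sketch of that computation using 🤗 Transformers directly; it is not the library implementation, and the example query is just one of the widget texts.
```python
# Illustrative sketch of the MLMTransformer -> SpladePooling pipeline above,
# following the SPLADE formulation (log-saturated ReLU of the MLM logits,
# max-pooled over the sequence). Not the library implementation.
import torch
from transformers import AutoModelForMaskedLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("distilbert/distilbert-base-uncased")
mlm = AutoModelForMaskedLM.from_pretrained("distilbert/distilbert-base-uncased")

batch = tokenizer(
    ["are union gun safes fireproof?"],
    return_tensors="pt", padding=True, truncation=True, max_length=256,
)
with torch.no_grad():
    logits = mlm(**batch).logits                      # (batch, seq_len, 30522)

weights = torch.log1p(torch.relu(logits))             # SPLADE activation
weights = weights * batch["attention_mask"].unsqueeze(-1)  # ignore padding
sparse_embedding = weights.max(dim=1).values           # max pooling -> (batch, 30522)
print(sparse_embedding.shape)                          # torch.Size([1, 30522])
```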
## Usage
### Direct Usage (Sentence Transformers)
First install the Sentence Transformers library:
```bash
pip install -U sentence-transformers
```
Then you can load this model and run inference.
```python
from sentence_transformers import SparseEncoder
# Download from the 🤗 Hub
model = SparseEncoder("tomaarsen/splade-distilbert-base-uncased-gooaq")
# Run inference
queries = [
"how many days for doxycycline to work on sinus infection?",
]
documents = [
'Treatment of suspected bacterial infection is with antibiotics, such as amoxicillin/clavulanate or doxycycline, given for 5 to 7 days for acute sinusitis and for up to 6 weeks for chronic sinusitis.',
'Most engagements typically have a cocktail dress code, calling for dresses at, or slightly above, knee-length and high heels. If your party states a different dress code, however, such as semi-formal or dressy-casual, you may need to dress up or down accordingly.',
'The average service life of a gas furnace is about 15 years, but the actual life span of an individual unit can vary greatly. There are a number of contributing factors that determine the age a furnace reaches: The quality of the equipment.',
]
query_embeddings = model.encode_query(queries)
document_embeddings = model.encode_document(documents)
print(query_embeddings.shape, document_embeddings.shape)
# [1, 30522] [3, 30522]
# Get the similarity scores for the embeddings
similarities = model.similarity(query_embeddings, document_embeddings)
print(similarities)
# tensor([[103.7028, 26.2666, 35.3421]])
```
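To see which vocabulary terms a sparse embedding actually activates, you can look up its largest dimensions in the tokenizer vocabulary. A small sketch, continuing from the snippet above and assuming the default torch tensor output of `encode_query` and that `model.tokenizer` exposes the underlying Hugging Face tokenizer:
```python
# Sketch: inspect which vocabulary terms the query embedding activates most strongly.
# Assumes `model` and `query_embeddings` from the snippet above, with the embeddings
# as (possibly sparse) torch tensors of shape (num_queries, 30522).
import torch

embeddings = query_embeddings
if embeddings.is_sparse:                  # densify a sparse COO tensor if needed
    embeddings = embeddings.to_dense()

values, indices = torch.topk(embeddings[0], k=10)
for token_id, weight in zip(indices.tolist(), values.tolist()):
    print(f"{model.tokenizer.convert_ids_to_tokens(token_id):>12}  {weight:.2f}")
```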
<!--
### Direct Usage (Transformers)
<details><summary>Click to see the direct usage in Transformers</summary>
</details>
-->
<!--
### Downstream Usage (Sentence Transformers)
You can finetune this model on your own dataset.
<details><summary>Click to expand</summary>
</details>
-->
<!--
### Out-of-Scope Use
*List how the model may foreseeably be misused and address what users ought not to do with the model.*
-->
## Evaluation
### Metrics
#### Sparse Information Retrieval
* Datasets: `NanoMSMARCO`, `NanoNFCorpus`, `NanoNQ`, `NanoClimateFEVER`, `NanoDBPedia`, `NanoFEVER`, `NanoFiQA2018`, `NanoHotpotQA`, `NanoQuoraRetrieval`, `NanoSCIDOCS`, `NanoArguAna`, `NanoSciFact` and `NanoTouche2020`
* Evaluated with [<code>SparseInformationRetrievalEvaluator</code>](https://sbert.net/docs/package_reference/sparse_encoder/evaluation.html#sentence_transformers.sparse_encoder.evaluation.SparseInformationRetrievalEvaluator)
| Metric | NanoMSMARCO | NanoNFCorpus | NanoNQ | NanoClimateFEVER | NanoDBPedia | NanoFEVER | NanoFiQA2018 | NanoHotpotQA | NanoQuoraRetrieval | NanoSCIDOCS | NanoArguAna | NanoSciFact | NanoTouche2020 |
|:----------------------|:------------|:-------------|:-----------|:-----------------|:------------|:-----------|:-------------|:-------------|:-------------------|:------------|:------------|:------------|:---------------|
| dot_accuracy@1 | 0.28 | 0.24 | 0.36 | 0.24 | 0.6 | 0.54 | 0.32 | 0.72 | 0.5 | 0.36 | 0.06 | 0.5 | 0.5714 |
| dot_accuracy@3 | 0.56 | 0.46 | 0.6 | 0.44 | 0.78 | 0.78 | 0.48 | 0.78 | 0.74 | 0.54 | 0.38 | 0.58 | 0.8367 |
| dot_accuracy@5 | 0.62 | 0.5 | 0.68 | 0.52 | 0.84 | 0.9 | 0.52 | 0.8 | 0.84 | 0.68 | 0.44 | 0.64 | 0.8776 |
| dot_accuracy@10 | 0.72 | 0.58 | 0.76 | 0.62 | 0.9 | 0.9 | 0.62 | 0.92 | 0.96 | 0.76 | 0.5 | 0.7 | 0.9388 |
| dot_precision@1 | 0.28 | 0.24 | 0.36 | 0.24 | 0.6 | 0.54 | 0.32 | 0.72 | 0.5 | 0.36 | 0.06 | 0.5 | 0.5714 |
| dot_precision@3 | 0.1867 | 0.2933 | 0.2067 | 0.1533 | 0.4867 | 0.26 | 0.2133 | 0.4133 | 0.26 | 0.2467 | 0.1267 | 0.2067 | 0.5238 |
| dot_precision@5 | 0.124 | 0.252 | 0.14 | 0.112 | 0.448 | 0.18 | 0.156 | 0.26 | 0.188 | 0.212 | 0.088 | 0.136 | 0.502 |
| dot_precision@10 | 0.072 | 0.214 | 0.08 | 0.074 | 0.388 | 0.094 | 0.102 | 0.152 | 0.118 | 0.152 | 0.05 | 0.08 | 0.4388 |
| dot_recall@1 | 0.28 | 0.0078 | 0.35 | 0.115 | 0.075 | 0.5167 | 0.1822 | 0.36 | 0.49 | 0.0757 | 0.06 | 0.465 | 0.0405 |
| dot_recall@3 | 0.56 | 0.0392 | 0.58 | 0.2057 | 0.143 | 0.7267 | 0.2922 | 0.62 | 0.7067 | 0.1527 | 0.38 | 0.545 | 0.1132 |
| dot_recall@5 | 0.62 | 0.066 | 0.65 | 0.254 | 0.1796 | 0.8367 | 0.3451 | 0.65 | 0.8013 | 0.2187 | 0.44 | 0.605 | 0.1732 |
| dot_recall@10 | 0.72 | 0.0853 | 0.72 | 0.303 | 0.2627 | 0.8567 | 0.4614 | 0.76 | 0.9133 | 0.3137 | 0.5 | 0.69 | 0.2875 |
| **dot_ndcg@10** | **0.489** | **0.2395** | **0.5442** | **0.2509** | **0.4899** | **0.7043** | **0.3639** | **0.6876** | **0.7142** | **0.2977** | **0.2921** | **0.5783** | **0.4859** |
| dot_mrr@10 | 0.416 | 0.3644 | 0.4959 | 0.3528 | 0.7127 | 0.6723 | 0.4092 | 0.7661 | 0.6458 | 0.4796 | 0.2237 | 0.5562 | 0.7082 |
| dot_map@100 | 0.4301 | 0.0903 | 0.4945 | 0.1923 | 0.3807 | 0.6526 | 0.3017 | 0.6246 | 0.6499 | 0.2173 | 0.2347 | 0.5447 | 0.3646 |
| query_active_dims | 111.46 | 156.66 | 103.9 | 240.48 | 159.22 | 211.28 | 103.12 | 132.78 | 63.4 | 247.56 | 477.06 | 280.32 | 61.8367 |
| query_sparsity_ratio | 0.9963 | 0.9949 | 0.9966 | 0.9921 | 0.9948 | 0.9931 | 0.9966 | 0.9956 | 0.9979 | 0.9919 | 0.9844 | 0.9908 | 0.998 |
| corpus_active_dims | 310.8414 | 505.3576 | 356.2113 | 398.2761 | 347.9973 | 428.2852 | 340.6042 | 392.0682 | 73.4578 | 424.1747 | 455.6429 | 451.0737 | 380.6023 |
| corpus_sparsity_ratio | 0.9898 | 0.9834 | 0.9883 | 0.987 | 0.9886 | 0.986 | 0.9888 | 0.9872 | 0.9976 | 0.9861 | 0.9851 | 0.9852 | 0.9875 |
#### Sparse Nano BEIR
* Dataset: `NanoBEIR_mean`
* Evaluated with [<code>SparseNanoBEIREvaluator</code>](https://sbert.net/docs/package_reference/sparse_encoder/evaluation.html#sentence_transformers.sparse_encoder.evaluation.SparseNanoBEIREvaluator) with these parameters:
```json
{
"dataset_names": [
"msmarco",
"nfcorpus",
"nq"
]
}
```
| Metric | Value |
|:----------------------|:-----------|
| dot_accuracy@1 | 0.2733 |
| dot_accuracy@3 | 0.5 |
| dot_accuracy@5 | 0.6 |
| dot_accuracy@10 | 0.7067 |
| dot_precision@1 | 0.2733 |
| dot_precision@3 | 0.2111 |
| dot_precision@5 | 0.1707 |
| dot_precision@10 | 0.1247 |
| dot_recall@1 | 0.1668 |
| dot_recall@3 | 0.339 |
| dot_recall@5 | 0.4169 |
| dot_recall@10 | 0.5139 |
| **dot_ndcg@10** | **0.4032** |
| dot_mrr@10 | 0.4111 |
| dot_map@100 | 0.3021 |
| query_active_dims | 141.7933 |
| query_sparsity_ratio | 0.9954 |
| corpus_active_dims | 381.7903 |
| corpus_sparsity_ratio | 0.9875 |
#### Sparse Nano BEIR
* Dataset: `NanoBEIR_mean`
* Evaluated with [<code>SparseNanoBEIREvaluator</code>](https://sbert.net/docs/package_reference/sparse_encoder/evaluation.html#sentence_transformers.sparse_encoder.evaluation.SparseNanoBEIREvaluator) with these parameters:
```json
{
"dataset_names": [
"climatefever",
"dbpedia",
"fever",
"fiqa2018",
"hotpotqa",
"msmarco",
"nfcorpus",
"nq",
"quoraretrieval",
"scidocs",
"arguana",
"scifact",
"touche2020"
]
}
```
| Metric | Value |
|:----------------------|:-----------|
| dot_accuracy@1 | 0.407 |
| dot_accuracy@3 | 0.6121 |
| dot_accuracy@5 | 0.6814 |
| dot_accuracy@10 | 0.7599 |
| dot_precision@1 | 0.407 |
| dot_precision@3 | 0.2752 |
| dot_precision@5 | 0.2152 |
| dot_precision@10 | 0.155 |
| dot_recall@1 | 0.2321 |
| dot_recall@3 | 0.3896 |
| dot_recall@5 | 0.4492 |
| dot_recall@10 | 0.5287 |
| **dot_ndcg@10** | **0.4721** |
| dot_mrr@10 | 0.5233 |
| dot_map@100 | 0.3983 |
| query_active_dims | 180.8814 |
| query_sparsity_ratio | 0.9941 |
| corpus_active_dims | 360.7381 |
| corpus_sparsity_ratio | 0.9882 |
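A hedged sketch of re-running the NanoBEIR evaluation reported above, assuming the usual Sentence Transformers evaluator call convention (calling the evaluator on the model returns a dictionary of metrics):
```python
# Sketch: reproduce the NanoBEIR evaluation reported above.
from sentence_transformers import SparseEncoder
from sentence_transformers.sparse_encoder.evaluation import SparseNanoBEIREvaluator

model = SparseEncoder("tomaarsen/splade-distilbert-base-uncased-gooaq")

# Three-dataset subset used during training; drop `dataset_names` (or pass the
# full 13-dataset list above) for the complete NanoBEIR run.
evaluator = SparseNanoBEIREvaluator(dataset_names=["msmarco", "nfcorpus", "nq"])
results = evaluator(model)
print(results)  # includes per-dataset and NanoBEIR_mean dot_ndcg@10 values
```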
<!--
## Bias, Risks and Limitations
*What are the known or foreseeable issues stemming from this model? You could also flag here known failure cases or weaknesses of the model.*
-->
<!--
### Recommendations
*What are recommendations with respect to the foreseeable issues? For example, filtering explicit content.*
-->
## Training Details
### Training Dataset
#### gooaq
* Dataset: [gooaq](https://huggingface.co/datasets/sentence-transformers/gooaq) at [b089f72](https://huggingface.co/datasets/sentence-transformers/gooaq/tree/b089f728748a068b7bc5234e5bcf5b25e3c8279c)
* Size: 99,000 training samples
* Columns: <code>question</code> and <code>answer</code>
* Approximate statistics based on the first 1000 samples:
| | question | answer |
|:--------|:----------------------------------------------------------------------------------|:------------------------------------------------------------------------------------|
| type | string | string |
| details | <ul><li>min: 8 tokens</li><li>mean: 11.79 tokens</li><li>max: 24 tokens</li></ul> | <ul><li>min: 14 tokens</li><li>mean: 60.02 tokens</li><li>max: 153 tokens</li></ul> |
* Samples:
| question | answer |
|:-----------------------------------------------------------------------------------|:---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| <code>what are the 5 characteristics of a star?</code> | <code>Key Concept: Characteristics used to classify stars include color, temperature, size, composition, and brightness.</code> |
| <code>are copic markers alcohol ink?</code> | <code>Copic Ink is alcohol-based and flammable. Keep away from direct sunlight and extreme temperatures.</code> |
| <code>what is the difference between appellate term and appellate division?</code> | <code>Appellate terms An appellate term is an intermediate appellate court that hears appeals from the inferior courts within their designated counties or judicial districts, and are intended to ease the workload on the Appellate Division and provide a less expensive forum closer to the people.</code> |
* Loss: [<code>SpladeLoss</code>](https://sbert.net/docs/package_reference/sparse_encoder/losses.html#spladeloss) with these parameters:
```json
{
"loss": "SparseMultipleNegativesRankingLoss(scale=1.0, similarity_fct='dot_score')",
"document_regularizer_weight": 3e-05,
"query_regularizer_weight": 5e-05
}
```
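Expressed in code, instantiating this loss with the parameters above looks roughly as follows; the constructor argument names mirror the JSON and are assumptions insofar as the original training script is not included in this card:
```python
# Sketch: construct the training loss with the parameters listed above.
from sentence_transformers import SparseEncoder
from sentence_transformers.sparse_encoder.losses import (
    SpladeLoss,
    SparseMultipleNegativesRankingLoss,
)

model = SparseEncoder("distilbert/distilbert-base-uncased")
loss = SpladeLoss(
    model=model,
    loss=SparseMultipleNegativesRankingLoss(model=model, scale=1.0),
    document_regularizer_weight=3e-05,   # FLOPS regularization on documents
    query_regularizer_weight=5e-05,      # FLOPS regularization on queries
)
```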
### Evaluation Dataset
#### gooaq
* Dataset: [gooaq](https://huggingface.co/datasets/sentence-transformers/gooaq) at [b089f72](https://huggingface.co/datasets/sentence-transformers/gooaq/tree/b089f728748a068b7bc5234e5bcf5b25e3c8279c)
* Size: 1,000 evaluation samples
* Columns: <code>question</code> and <code>answer</code>
* Approximate statistics based on the first 1000 samples:
| | question | answer |
|:--------|:----------------------------------------------------------------------------------|:------------------------------------------------------------------------------------|
| type | string | string |
| details | <ul><li>min: 8 tokens</li><li>mean: 11.93 tokens</li><li>max: 25 tokens</li></ul> | <ul><li>min: 14 tokens</li><li>mean: 60.84 tokens</li><li>max: 127 tokens</li></ul> |
* Samples:
| question | answer |
|:-----------------------------------------------------------------|:--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| <code>should you take ibuprofen with high blood pressure?</code> | <code>In general, people with high blood pressure should use acetaminophen or possibly aspirin for over-the-counter pain relief. Unless your health care provider has said it's OK, you should not use ibuprofen, ketoprofen, or naproxen sodium. If aspirin or acetaminophen doesn't help with your pain, call your doctor.</code> |
| <code>how old do you have to be to work in sc?</code> | <code>The general minimum age of employment for South Carolina youth is 14, although the state allows younger children who are performers to work in show business. If their families are agricultural workers, children younger than age 14 may also participate in farm labor.</code> |
| <code>how to write a topic proposal for a research paper?</code> | <code>['Write down the main topic of your paper. ... ', 'Write two or three short sentences under the main topic that explain why you chose that topic. ... ', 'Write a thesis sentence that states the angle and purpose of your research paper. ... ', 'List the items you will cover in the body of the paper that support your thesis statement.']</code> |
* Loss: [<code>SpladeLoss</code>](https://sbert.net/docs/package_reference/sparse_encoder/losses.html#spladeloss) with these parameters:
```json
{
"loss": "SparseMultipleNegativesRankingLoss(scale=1.0, similarity_fct='dot_score')",
"document_regularizer_weight": 3e-05,
"query_regularizer_weight": 5e-05
}
```
### Training Hyperparameters
#### Non-Default Hyperparameters
- `eval_strategy`: steps
- `per_device_train_batch_size`: 32
- `per_device_eval_batch_size`: 32
- `learning_rate`: 2e-05
- `num_train_epochs`: 1
- `bf16`: True
- `load_best_model_at_end`: True
- `batch_sampler`: no_duplicates
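Mapped onto the trainer API, these non-default settings correspond roughly to the configuration below; a sketch assuming the `SparseEncoderTrainer` / `SparseEncoderTrainingArguments` classes and a string-valued `batch_sampler`, not the original training script (the loss repeats the `SpladeLoss` from the Training Dataset section):
```python
# Sketch: the non-default hyperparameters above as a training configuration.
from datasets import load_dataset
from sentence_transformers import (
    SparseEncoder,
    SparseEncoderTrainer,
    SparseEncoderTrainingArguments,
)
from sentence_transformers.sparse_encoder.losses import (
    SpladeLoss,
    SparseMultipleNegativesRankingLoss,
)

model = SparseEncoder("distilbert/distilbert-base-uncased")

# 99,000 training pairs and 1,000 evaluation pairs from GooAQ, as in this card
dataset = load_dataset("sentence-transformers/gooaq", split="train").select(range(100_000))
dataset = dataset.train_test_split(test_size=1_000)

loss = SpladeLoss(
    model=model,
    loss=SparseMultipleNegativesRankingLoss(model=model, scale=1.0),
    document_regularizer_weight=3e-05,
    query_regularizer_weight=5e-05,
)

args = SparseEncoderTrainingArguments(
    output_dir="models/splade-distilbert-base-uncased-gooaq",
    num_train_epochs=1,
    per_device_train_batch_size=32,
    per_device_eval_batch_size=32,
    learning_rate=2e-5,
    bf16=True,
    eval_strategy="steps",
    load_best_model_at_end=True,
    batch_sampler="no_duplicates",
)

trainer = SparseEncoderTrainer(
    model=model,
    args=args,
    train_dataset=dataset["train"],
    eval_dataset=dataset["test"],
    loss=loss,
)
trainer.train()
```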
#### All Hyperparameters
<details><summary>Click to expand</summary>
- `overwrite_output_dir`: False
- `do_predict`: False
- `eval_strategy`: steps
- `prediction_loss_only`: True
- `per_device_train_batch_size`: 32
- `per_device_eval_batch_size`: 32
- `per_gpu_train_batch_size`: None
- `per_gpu_eval_batch_size`: None
- `gradient_accumulation_steps`: 1
- `eval_accumulation_steps`: None
- `torch_empty_cache_steps`: None
- `learning_rate`: 2e-05
- `weight_decay`: 0.0
- `adam_beta1`: 0.9
- `adam_beta2`: 0.999
- `adam_epsilon`: 1e-08
- `max_grad_norm`: 1.0
- `num_train_epochs`: 1
- `max_steps`: -1
- `lr_scheduler_type`: linear
- `lr_scheduler_kwargs`: {}
- `warmup_ratio`: 0.0
- `warmup_steps`: 0
- `log_level`: passive
- `log_level_replica`: warning
- `log_on_each_node`: True
- `logging_nan_inf_filter`: True
- `save_safetensors`: True
- `save_on_each_node`: False
- `save_only_model`: False
- `restore_callback_states_from_checkpoint`: False
- `no_cuda`: False
- `use_cpu`: False
- `use_mps_device`: False
- `seed`: 42
- `data_seed`: None
- `jit_mode_eval`: False
- `use_ipex`: False
- `bf16`: True
- `fp16`: False
- `fp16_opt_level`: O1
- `half_precision_backend`: auto
- `bf16_full_eval`: False
- `fp16_full_eval`: False
- `tf32`: None
- `local_rank`: 0
- `ddp_backend`: None
- `tpu_num_cores`: None
- `tpu_metrics_debug`: False
- `debug`: []
- `dataloader_drop_last`: False
- `dataloader_num_workers`: 0
- `dataloader_prefetch_factor`: None
- `past_index`: -1
- `disable_tqdm`: False
- `remove_unused_columns`: True
- `label_names`: None
- `load_best_model_at_end`: True
- `ignore_data_skip`: False
- `fsdp`: []
- `fsdp_min_num_params`: 0
- `fsdp_config`: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- `fsdp_transformer_layer_cls_to_wrap`: None
- `accelerator_config`: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- `deepspeed`: None
- `label_smoothing_factor`: 0.0
- `optim`: adamw_torch
- `optim_args`: None
- `adafactor`: False
- `group_by_length`: False
- `length_column_name`: length
- `ddp_find_unused_parameters`: None
- `ddp_bucket_cap_mb`: None
- `ddp_broadcast_buffers`: False
- `dataloader_pin_memory`: True
- `dataloader_persistent_workers`: False
- `skip_memory_metrics`: True
- `use_legacy_prediction_loop`: False
- `push_to_hub`: False
- `resume_from_checkpoint`: None
- `hub_model_id`: None
- `hub_strategy`: every_save
- `hub_private_repo`: None
- `hub_always_push`: False
- `gradient_checkpointing`: False
- `gradient_checkpointing_kwargs`: None
- `include_inputs_for_metrics`: False
- `include_for_metrics`: []
- `eval_do_concat_batches`: True
- `fp16_backend`: auto
- `push_to_hub_model_id`: None
- `push_to_hub_organization`: None
- `mp_parameters`:
- `auto_find_batch_size`: False
- `full_determinism`: False
- `torchdynamo`: None
- `ray_scope`: last
- `ddp_timeout`: 1800
- `torch_compile`: False
- `torch_compile_backend`: None
- `torch_compile_mode`: None
- `include_tokens_per_second`: False
- `include_num_input_tokens_seen`: False
- `neftune_noise_alpha`: None
- `optim_target_modules`: None
- `batch_eval_metrics`: False
- `eval_on_start`: False
- `use_liger_kernel`: False
- `eval_use_gather_object`: False
- `average_tokens_across_devices`: False
- `prompts`: None
- `batch_sampler`: no_duplicates
- `multi_dataset_batch_sampler`: proportional
- `router_mapping`: {}
- `learning_rate_mapping`: {}
</details>
### Training Logs
| Epoch | Step | Training Loss | Validation Loss | NanoMSMARCO_dot_ndcg@10 | NanoNFCorpus_dot_ndcg@10 | NanoNQ_dot_ndcg@10 | NanoBEIR_mean_dot_ndcg@10 | NanoClimateFEVER_dot_ndcg@10 | NanoDBPedia_dot_ndcg@10 | NanoFEVER_dot_ndcg@10 | NanoFiQA2018_dot_ndcg@10 | NanoHotpotQA_dot_ndcg@10 | NanoQuoraRetrieval_dot_ndcg@10 | NanoSCIDOCS_dot_ndcg@10 | NanoArguAna_dot_ndcg@10 | NanoSciFact_dot_ndcg@10 | NanoTouche2020_dot_ndcg@10 |
|:----------:|:--------:|:-------------:|:---------------:|:-----------------------:|:------------------------:|:------------------:|:-------------------------:|:----------------------------:|:-----------------------:|:---------------------:|:------------------------:|:------------------------:|:------------------------------:|:-----------------------:|:-----------------------:|:-----------------------:|:--------------------------:|
| 0.0323 | 100 | 11.4443 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.0646 | 200 | 0.2676 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.0970 | 300 | 0.1639 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.1293 | 400 | 0.1769 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.1616 | 500 | 0.1593 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.1939 | 600 | 0.1194 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.1972 | 610 | - | 0.1080 | 0.4260 | 0.2314 | 0.4303 | 0.3626 | - | - | - | - | - | - | - | - | - | - |
| 0.2262 | 700 | 0.1351 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.2586 | 800 | 0.109 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.2909 | 900 | 0.1147 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.3232 | 1000 | 0.0994 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.3555 | 1100 | 0.0871 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.3878 | 1200 | 0.0891 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| **0.3943** | **1220** | **-** | **0.0942** | **0.489** | **0.2395** | **0.5442** | **0.4242** | **-** | **-** | **-** | **-** | **-** | **-** | **-** | **-** | **-** | **-** |
| 0.4202 | 1300 | 0.09 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.4525 | 1400 | 0.0902 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.4848 | 1500 | 0.1046 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.5171 | 1600 | 0.071 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.5495 | 1700 | 0.0783 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.5818 | 1800 | 0.0846 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.5915 | 1830 | - | 0.0804 | 0.4745 | 0.2537 | 0.4780 | 0.4021 | - | - | - | - | - | - | - | - | - | - |
| 0.6141 | 1900 | 0.0572 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.6464 | 2000 | 0.0712 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.6787 | 2100 | 0.065 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.7111 | 2200 | 0.096 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.7434 | 2300 | 0.0764 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.7757 | 2400 | 0.0722 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.7886 | 2440 | - | 0.0716 | 0.4976 | 0.2348 | 0.4626 | 0.3983 | - | - | - | - | - | - | - | - | - | - |
| 0.8080 | 2500 | 0.0579 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.8403 | 2600 | 0.0655 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.8727 | 2700 | 0.0612 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.9050 | 2800 | 0.0491 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.9373 | 2900 | 0.0496 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.9696 | 3000 | 0.0553 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.9858 | 3050 | - | 0.0746 | 0.4990 | 0.2419 | 0.4688 | 0.4032 | - | - | - | - | - | - | - | - | - | - |
| -1 | -1 | - | - | 0.4890 | 0.2395 | 0.5442 | 0.4721 | 0.2509 | 0.4899 | 0.7043 | 0.3639 | 0.6876 | 0.7142 | 0.2977 | 0.2921 | 0.5783 | 0.4859 |
* The bold row denotes the saved checkpoint.
### Environmental Impact
Carbon emissions were measured using [CodeCarbon](https://github.com/mlco2/codecarbon).
- **Energy Consumed**: 0.039 kWh
- **Carbon Emitted**: 0.015 kg of CO2
- **Hours Used**: 0.154 hours
### Training Hardware
- **On Cloud**: No
- **GPU Model**: 1 x NVIDIA GeForce RTX 3090
- **CPU Model**: 13th Gen Intel(R) Core(TM) i7-13700K
- **RAM Size**: 31.78 GB
### Framework Versions
- Python: 3.11.6
- Sentence Transformers: 4.2.0.dev0
- Transformers: 4.52.4
- PyTorch: 2.7.1+cu126
- Accelerate: 1.5.1
- Datasets: 2.21.0
- Tokenizers: 0.21.1
## Citation
### BibTeX
#### Sentence Transformers
```bibtex
@inproceedings{reimers-2019-sentence-bert,
title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
author = "Reimers, Nils and Gurevych, Iryna",
booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
month = "11",
year = "2019",
publisher = "Association for Computational Linguistics",
url = "https://arxiv.org/abs/1908.10084",
}
```
#### SpladeLoss
```bibtex
@misc{formal2022distillationhardnegativesampling,
title={From Distillation to Hard Negative Sampling: Making Sparse Neural IR Models More Effective},
author={Thibault Formal and Carlos Lassance and Benjamin Piwowarski and Stéphane Clinchant},
year={2022},
eprint={2205.04733},
archivePrefix={arXiv},
primaryClass={cs.IR},
url={https://arxiv.org/abs/2205.04733},
}
```
#### SparseMultipleNegativesRankingLoss
```bibtex
@misc{henderson2017efficient,
title={Efficient Natural Language Response Suggestion for Smart Reply},
author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
year={2017},
eprint={1705.00652},
archivePrefix={arXiv},
primaryClass={cs.CL}
}
```
#### FlopsLoss
```bibtex
@article{paria2020minimizing,
title={Minimizing flops to learn efficient sparse representations},
author={Paria, Biswajit and Yeh, Chih-Kuan and Yen, Ian EH and Xu, Ning and Ravikumar, Pradeep and P{\'o}czos, Barnab{\'a}s},
journal={arXiv preprint arXiv:2004.05665},
year={2020}
}
```
<!--
## Glossary
*Clearly define terms in order to be accessible across audiences.*
-->
<!--
## Model Card Authors
*Lists the people who create the model card, providing recognition and accountability for the detailed work that goes into its construction.*
-->
<!--
## Model Card Contact
*Provides a way for people who have updates to the Model Card, suggestions, or questions, to contact the Model Card authors.*
-->