SentenceTransformer based on sentence-transformers/all-MiniLM-L6-v2

This is a sentence-transformers model fine-tuned from sentence-transformers/all-MiniLM-L6-v2 on a dataset of Persian question pairs (see Training Details below). It maps sentences and paragraphs to a 384-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

  • Model Type: Sentence Transformer
  • Base model: sentence-transformers/all-MiniLM-L6-v2
  • Maximum Sequence Length: 256 tokens
  • Output Dimensionality: 384 dimensions
  • Similarity Function: Cosine Similarity

Model Sources

  • Documentation: Sentence Transformers Documentation (https://sbert.net)
  • Repository: Sentence Transformers on GitHub (https://github.com/UKPLab/sentence-transformers)
  • Hugging Face: Sentence Transformers on Hugging Face (https://huggingface.co/models?library=sentence-transformers)

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 256, 'do_lower_case': False}) with Transformer model: BertModel 
  (1): Pooling({'word_embedding_dimension': 384, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)
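
The listed modules and dimensions can be checked directly after loading the model. A minimal sketch, assuming the model id from the Usage section below:

from sentence_transformers import SentenceTransformer

model = SentenceTransformer("codersan/validadted_all-MiniLM_onV9")
print(model)                                     # Transformer -> Pooling -> Normalize, as shown above
print(model.max_seq_length)                      # 256
print(model.get_sentence_embedding_dimension())  # 384
print(model.similarity_fn_name)                  # "cosine"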

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("codersan/validadted_all-MiniLM_onV9")
# Run inference
sentences = [
    # "How many views and answers does it take to become a Quora Top Writer?"
    'برای تبدیل شدن به نویسنده برتر Quora ، چند بازدید و پاسخ لازم است؟',
    # "How can I become a Quora Top Writer and get more upvotes and better stats?"
    'چگونه می توانم نویسنده برتر Quora شوم ، از صعود بیشتر و آمار بهتر استفاده کنم؟',
    # "I'm looking to buy a new bike: the Suzuki Gixxer 155 or the Honda Hornet 160r. Which one should I buy?"
    'من به دنبال خرید دوچرخه جدید هستم.Suzuki Gixxer 155 یا Honda Hornet 160r.کدام یک را بخرید؟',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 384]

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
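
Because the final Normalize module produces unit-length embeddings, the same model can be used for simple semantic search with the library's semantic_search utility. A minimal sketch; the corpus and query strings are hypothetical placeholders (in practice they would be Persian sentences like those above):

from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("codersan/validadted_all-MiniLM_onV9")

# Hypothetical corpus and query, for illustration only
corpus = ["first candidate document", "second candidate document", "third candidate document"]
query = "example query"

corpus_embeddings = model.encode(corpus, convert_to_tensor=True)
query_embedding = model.encode(query, convert_to_tensor=True)

# For each query, returns a ranked list of {'corpus_id': ..., 'score': ...}
hits = util.semantic_search(query_embedding, corpus_embeddings, top_k=2)
print(hits[0])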

Training Details

Training Dataset

Unnamed Dataset

  • Size: 131,157 training samples
  • Columns: anchor and positive
  • Approximate statistics based on the first 1000 samples:
    • anchor (string): min 11 tokens, mean 44.91 tokens, max 256 tokens
    • positive (string): min 11 tokens, mean 44.6 tokens, max 154 tokens
  • Samples (Persian anchor/positive pairs; English translations added in parentheses):
    • anchor: وقتی سوال من به عنوان "این سوال ممکن است به ویرایش نیاز داشته باشد" چه کاری باید انجام دهم ، اما نمی توانم دلیل آن را پیدا کنم؟
      ("What should I do when my question is flagged as 'This question may need editing', but I cannot find out why?")
      positive: چرا سوال من به عنوان نیاز به پیشرفت مشخص شده است؟
      ("Why has my question been marked as needing improvement?")
    • anchor: چگونه می توانید یک فایل رمزگذاری شده را با دانستن اینکه این یک فایل تصویری است بدون دانستن گسترش پرونده یا کلید ، رمزگشایی کنید؟
      ("How can you decrypt an encrypted file, knowing that it is an image file, without knowing the file extension or the key?")
      positive: چگونه می توانید یک فایل رمزگذاری شده را رمزگشایی کنید و بدانید که این یک فایل تصویری است بدون اینکه از پسوند پرونده اطلاع داشته باشید؟
      ("How can you decrypt an encrypted file, knowing that it is an image file, without knowing the file extension?")
    • anchor: احساس می کنم خودکشی می کنم ، چگونه باید با آن برخورد کنم؟
      ("I feel suicidal; how should I deal with it?")
      positive: احساس می کنم خودکشی می کنم.چه کاری باید انجام دهم؟
      ("I feel suicidal. What should I do?")
  • Loss: MultipleNegativesRankingLoss with these parameters:
    {
        "scale": 20.0,
        "similarity_fct": "cos_sim"
    }
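
A minimal sketch of how this loss could be constructed with the parameters listed above (scale 20.0 and cos_sim are also the library defaults for MultipleNegativesRankingLoss):

from sentence_transformers import SentenceTransformer, losses, util

model = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")

# In-batch-negatives ranking loss over (anchor, positive) pairs
loss = losses.MultipleNegativesRankingLoss(
    model,
    scale=20.0,                   # temperature applied to the similarity scores
    similarity_fct=util.cos_sim,  # the "cos_sim" listed above
)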
    

Training Hyperparameters

Non-Default Hyperparameters

  • eval_strategy: steps
  • per_device_train_batch_size: 64
  • learning_rate: 2e-05
  • weight_decay: 0.01
  • num_train_epochs: 15
  • warmup_ratio: 0.1
  • push_to_hub: True
  • hub_model_id: codersan/validadted_all-MiniLM_onV9
  • batch_sampler: no_duplicates
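
A minimal sketch of how these non-default settings could be expressed with SentenceTransformerTrainingArguments from Sentence Transformers v3 (output_dir is a hypothetical local path):

from sentence_transformers import SentenceTransformerTrainingArguments
from sentence_transformers.training_args import BatchSamplers

args = SentenceTransformerTrainingArguments(
    output_dir="output/all-MiniLM_onV9",  # hypothetical path
    eval_strategy="steps",
    per_device_train_batch_size=64,
    learning_rate=2e-5,
    weight_decay=0.01,
    num_train_epochs=15,
    warmup_ratio=0.1,
    push_to_hub=True,
    hub_model_id="codersan/validadted_all-MiniLM_onV9",
    batch_sampler=BatchSamplers.NO_DUPLICATES,
)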

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: steps
  • prediction_loss_only: True
  • per_device_train_batch_size: 64
  • per_device_eval_batch_size: 8
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 2e-05
  • weight_decay: 0.01
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1
  • num_train_epochs: 15
  • max_steps: -1
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.1
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: False
  • fp16: False
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: False
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: True
  • resume_from_checkpoint: None
  • hub_model_id: codersan/validadted_all-MiniLM_onV9
  • hub_strategy: every_save
  • hub_private_repo: None
  • hub_always_push: False
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • include_for_metrics: []
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • dispatch_batches: None
  • split_batches: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • use_liger_kernel: False
  • eval_use_gather_object: False
  • average_tokens_across_devices: False
  • prompts: None
  • batch_sampler: no_duplicates
  • multi_dataset_batch_sampler: proportional
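
Putting the pieces together, a hedged end-to-end sketch of how a run with this configuration could be launched using the v3 trainer API; the two anchor/positive pairs below are hypothetical placeholders for the 131,157-pair dataset described above:

from datasets import Dataset
from sentence_transformers import (
    SentenceTransformer,
    SentenceTransformerTrainer,
    SentenceTransformerTrainingArguments,
    losses,
)
from sentence_transformers.training_args import BatchSamplers

model = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")

# Hypothetical stand-in for the real anchor/positive training data
train_dataset = Dataset.from_dict({
    "anchor":   ["example anchor question 1", "example anchor question 2"],
    "positive": ["example paraphrase 1", "example paraphrase 2"],
})

loss = losses.MultipleNegativesRankingLoss(model)

args = SentenceTransformerTrainingArguments(
    output_dir="output/all-MiniLM_onV9",        # hypothetical path
    per_device_train_batch_size=64,
    num_train_epochs=15,
    batch_sampler=BatchSamplers.NO_DUPLICATES,  # avoids duplicate texts within a batch
)

trainer = SentenceTransformerTrainer(
    model=model,
    args=args,
    train_dataset=train_dataset,
    loss=loss,
)
trainer.train()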

Training Logs

Epoch Step Training Loss
0.0488 100 2.841
0.0976 200 2.1716
0.1463 300 1.5024
0.1951 400 1.2579
0.2439 500 1.1434
0.2927 600 1.0665
0.3415 700 0.9581
0.3902 800 0.9106
0.4390 900 0.87
0.4878 1000 0.7785
0.5366 1100 0.7591
0.5854 1200 0.6928
0.6341 1300 0.6778
0.6829 1400 0.6395
0.7317 1500 0.6145
0.7805 1600 0.5678
0.8293 1700 0.5602
0.8780 1800 0.5498
0.9268 1900 0.5292
0.9756 2000 0.4819
1.0244 2100 0.4717
1.0732 2200 0.4837
1.1220 2300 0.4404
1.1707 2400 0.4359
1.2195 2500 0.4121
1.2683 2600 0.434
1.3171 2700 0.4018
1.3659 2800 0.3866
1.4146 2900 0.3889
1.4634 3000 0.3595
1.5122 3100 0.3547
1.5610 3200 0.3517
1.6098 3300 0.3331
1.6585 3400 0.3228
1.7073 3500 0.3101
1.7561 3600 0.3071
1.8049 3700 0.288
1.8537 3800 0.3115
1.9024 3900 0.2777
1.9512 4000 0.2902
2.0 4100 0.2926
2.0488 4200 0.2958
2.0976 4300 0.2688
2.1463 4400 0.2647
2.1951 4500 0.2523
2.2439 4600 0.2681
2.2927 4700 0.2714
2.3415 4800 0.2575
2.3902 4900 0.2462
2.4390 5000 0.2466
2.4878 5100 0.2215
2.5366 5200 0.2424
2.5854 5300 0.2264
2.6341 5400 0.2252
2.6829 5500 0.2228
2.7317 5600 0.2337
2.7805 5700 0.1983
2.8293 5800 0.2156
2.8780 5900 0.2088
2.9268 6000 0.2196
2.9756 6100 0.2054
3.0244 6200 0.2114
3.0732 6300 0.2191
3.1220 6400 0.1899
3.1707 6500 0.1958
3.2195 6600 0.1907
3.2683 6700 0.2151
3.3171 6800 0.1918
3.3659 6900 0.1859
3.4146 7000 0.1962
3.4634 7100 0.1807
3.5122 7200 0.1874
3.5610 7300 0.179
3.6098 7400 0.1779
3.6585 7500 0.1726
3.7073 7600 0.1693
3.7561 7700 0.1708
3.8049 7800 0.1697
3.8537 7900 0.1744
3.9024 8000 0.1581
3.9512 8100 0.1761
4.0 8200 0.1724
4.0488 8300 0.1777
4.0976 8400 0.1591
4.1463 8500 0.1559
4.1951 8600 0.1518
4.2439 8700 0.1608
4.2927 8800 0.1751
4.3415 8900 0.1572
4.3902 9000 0.1498
4.4390 9100 0.16
4.4878 9200 0.137
4.5366 9300 0.1545
4.5854 9400 0.1443
4.6341 9500 0.1482
4.6829 9600 0.1383
4.7317 9700 0.1468
4.7805 9800 0.1331
4.8293 9900 0.1471
4.8780 10000 0.1352
4.9268 10100 0.1474
4.9756 10200 0.1465
5.0244 10300 0.1401
5.0732 10400 0.1488
5.1220 10500 0.1285
5.1707 10600 0.1326
5.2195 10700 0.1246
5.2683 10800 0.1532
5.3171 10900 0.1345
5.3659 11000 0.1246
5.4146 11100 0.1344
5.4634 11200 0.1214
5.5122 11300 0.1283
5.5610 11400 0.1235
5.6098 11500 0.1265
5.6585 11600 0.1248
5.7073 11700 0.1204
5.7561 11800 0.119
5.8049 11900 0.1174
5.8537 12000 0.1273
5.9024 12100 0.1107
5.9512 12200 0.1277
6.0 12300 0.1178
6.0488 12400 0.1286
6.0976 12500 0.1145
6.1463 12600 0.1164
6.1951 12700 0.1134
6.2439 12800 0.1211
6.2927 12900 0.125
6.3415 13000 0.1187
6.3902 13100 0.1108
6.4390 13200 0.1148
6.4878 13300 0.1046
6.5366 13400 0.1097
6.5854 13500 0.1066
6.6341 13600 0.1078
6.6829 13700 0.102
6.7317 13800 0.107
6.7805 13900 0.1008
6.8293 14000 0.1113
6.8780 14100 0.0987
6.9268 14200 0.1123
6.9756 14300 0.1062
7.0244 14400 0.1101
7.0732 14500 0.1129
7.1220 14600 0.0963
7.1707 14700 0.1053
7.2195 14800 0.0988
7.2683 14900 0.119
7.3171 15000 0.0993
7.3659 15100 0.0986
7.4146 15200 0.1012
7.4634 15300 0.0902
7.5122 15400 0.103
7.5610 15500 0.0961
7.6098 15600 0.0981
7.6585 15700 0.0972
7.7073 15800 0.0965
7.7561 15900 0.0916
7.8049 16000 0.0943
7.8537 16100 0.0973
7.9024 16200 0.0828
7.9512 16300 0.1036
8.0 16400 0.0986
8.0488 16500 0.1008
8.0976 16600 0.0897
8.1463 16700 0.092
8.1951 16800 0.0901
8.2439 16900 0.0979
8.2927 17000 0.0989
8.3415 17100 0.0937
8.3902 17200 0.0882
8.4390 17300 0.0902
8.4878 17400 0.0792
8.5366 17500 0.0893
8.5854 17600 0.0861
8.6341 17700 0.0866
8.6829 17800 0.0831
8.7317 17900 0.0893
8.7805 18000 0.0785
8.8293 18100 0.093
8.8780 18200 0.0815
8.9268 18300 0.0929
8.9756 18400 0.0869
9.0244 18500 0.0874
9.0732 18600 0.0944
9.1220 18700 0.0809
9.1707 18800 0.0845
9.2195 18900 0.0812
9.2683 19000 0.0966
9.3171 19100 0.0819
9.3659 19200 0.08
9.4146 19300 0.0849
9.4634 19400 0.0773
9.5122 19500 0.0822
9.5610 19600 0.0781
9.6098 19700 0.0798
9.6585 19800 0.0745
9.7073 19900 0.0763
9.7561 20000 0.074
9.8049 20100 0.0786
9.8537 20200 0.082
9.9024 20300 0.0685
9.9512 20400 0.0857
10.0 20500 0.0791
10.0488 20600 0.0865
10.0976 20700 0.0801
10.1463 20800 0.0792
10.1951 20900 0.0754
10.2439 21000 0.082
10.2927 21100 0.0849
10.3415 21200 0.0765
10.3902 21300 0.0749
10.4390 21400 0.0793
10.4878 21500 0.0702
10.5366 21600 0.0751
10.5854 21700 0.074
10.6341 21800 0.0733
10.6829 21900 0.0743
10.7317 22000 0.0747
10.7805 22100 0.0658
10.8293 22200 0.0787
10.8780 22300 0.07
10.9268 22400 0.0803
10.9756 22500 0.074
11.0244 22600 0.0737
11.0732 22700 0.0769
11.1220 22800 0.0652
11.1707 22900 0.0714
11.2195 23000 0.0682
11.2683 23100 0.0873
11.3171 23200 0.0693
11.3659 23300 0.069
11.4146 23400 0.0747
11.4634 23500 0.0647
11.5122 23600 0.0737
11.5610 23700 0.0714
11.6098 23800 0.0715
11.6585 23900 0.0666
11.7073 24000 0.0702
11.7561 24100 0.0643
11.8049 24200 0.0654
11.8537 24300 0.0685
11.9024 24400 0.0593
11.9512 24500 0.0775
12.0 24600 0.0721
12.0488 24700 0.076
12.0976 24800 0.0653
12.1463 24900 0.0677
12.1951 25000 0.0652
12.2439 25100 0.076
12.2927 25200 0.0741
12.3415 25300 0.0677
12.3902 25400 0.065
12.4390 25500 0.0709
12.4878 25600 0.0625
12.5366 25700 0.0666
12.5854 25800 0.0665
12.6341 25900 0.0679
12.6829 26000 0.0636
12.7317 26100 0.0638
12.7805 26200 0.0596
12.8293 26300 0.0693
12.8780 26400 0.0588
12.9268 26500 0.0726
12.9756 26600 0.0671
13.0244 26700 0.0666
13.0732 26800 0.0711
13.1220 26900 0.0604
13.1707 27000 0.0687
13.2195 27100 0.0613
13.2683 27200 0.0781
13.3171 27300 0.0596
13.3659 27400 0.0627
13.4146 27500 0.0655
13.4634 27600 0.0589
13.5122 27700 0.0633
13.5610 27800 0.0622
13.6098 27900 0.065
13.6585 28000 0.06
13.7073 28100 0.063
13.7561 28200 0.0589
13.8049 28300 0.0623
13.8537 28400 0.062
13.9024 28500 0.0559
13.9512 28600 0.0723
14.0 28700 0.0658
14.0488 28800 0.0687
14.0976 28900 0.0606
14.1463 29000 0.0622
14.1951 29100 0.0604
14.2439 29200 0.0657
14.2927 29300 0.067
14.3415 29400 0.0653
14.3902 29500 0.0587
14.4390 29600 0.0641
14.4878 29700 0.0558
14.5366 29800 0.0625
14.5854 29900 0.0613
14.6341 30000 0.0618
14.6829 30100 0.0596
14.7317 30200 0.0575
14.7805 30300 0.0552
14.8293 30400 0.0669
14.8780 30500 0.0552
14.9268 30600 0.0665
14.9756 30700 0.0625

Framework Versions

  • Python: 3.10.12
  • Sentence Transformers: 3.3.1
  • Transformers: 4.47.0
  • PyTorch: 2.5.1+cu121
  • Accelerate: 1.2.1
  • Datasets: 3.2.0
  • Tokenizers: 0.21.0
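
To reproduce this environment, the reported versions can be pinned at install time (a sketch; newer compatible versions will typically also work):

pip install sentence-transformers==3.3.1 transformers==4.47.0 torch==2.5.1 accelerate==1.2.1 datasets==3.2.0 tokenizers==0.21.0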

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

MultipleNegativesRankingLoss

@misc{henderson2017efficient,
    title={Efficient Natural Language Response Suggestion for Smart Reply},
    author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
    year={2017},
    eprint={1705.00652},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}