---
base_model: sentence-transformers/all-MiniLM-L6-v2
datasets:
  - youssefkhalil320/pairs_three_scores_v5
language:
  - en
library_name: sentence-transformers
license: apache-2.0
pipeline_tag: sentence-similarity
tags:
  - sentence-transformers
  - sentence-similarity
  - feature-extraction
  - generated_from_trainer
  - dataset_size:80000003
  - loss:CoSENTLoss
widget:
  - source_sentence: durable pvc swim ring
    sentences:
      - flaky croissant
      - urban shoes
      - warm drinks mug
  - source_sentence: iso mak retard capsules
    sentences:
      - savory baguette
      - shea butter body cream
      - softwheeled cruiser
  - source_sentence: love sandra potty
    sentences:
      - utensil holder
      - olive pants
      - headwear
  - source_sentence: dusky hair brush
    sentences:
      - back compartment laptop
      - rubber feet platter
      - honed blade knife
  - source_sentence: nkd skn
    sentences:
      - fruit fragrances nail polish remover
      - panini salmon
      - hand drawing bag
---

all-MiniLM-L6-v8-pair_score

This is a sentence-transformers model finetuned from sentence-transformers/all-MiniLM-L6-v2 on the pairs_three_scores_v5 dataset. It maps sentences & paragraphs to a 384-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

  • Model Type: Sentence Transformer
  • Base model: sentence-transformers/all-MiniLM-L6-v2
  • Maximum Sequence Length: 256 tokens
  • Output Dimensionality: 384 dimensions
  • Similarity Function: Cosine Similarity
  • Training Dataset: pairs_three_scores_v5
  • Language: en
  • License: apache-2.0

Model Sources

  • Documentation: Sentence Transformers Documentation (https://sbert.net)
  • Repository: Sentence Transformers on GitHub (https://github.com/UKPLab/sentence-transformers)
  • Hugging Face: Sentence Transformers on Hugging Face (https://huggingface.co/models?library=sentence-transformers)

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 256, 'do_lower_case': False}) with Transformer model: BertModel 
  (1): Pooling({'word_embedding_dimension': 384, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)
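
The modules above correspond to the standard MiniLM recipe: contextual token embeddings from BertModel, mean pooling over non-padding tokens, then L2 normalization. For environments that only have the transformers library installed, a minimal sketch of the equivalent encoding pipeline is shown below; the repository id is the same placeholder used in the usage example further down, and the pooling code simply mirrors the module settings listed above.

import torch
import torch.nn.functional as F
from transformers import AutoTokenizer, AutoModel

model_id = "sentence_transformers_model_id"  # placeholder, as in the usage example below
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModel.from_pretrained(model_id)

def encode(sentences):
    # Tokenize with the same 256-token limit as the Transformer module above
    batch = tokenizer(sentences, padding=True, truncation=True, max_length=256, return_tensors="pt")
    with torch.no_grad():
        token_embeddings = model(**batch).last_hidden_state  # (batch, seq_len, 384)
    # Mean pooling over non-padding tokens (pooling_mode_mean_tokens=True)
    mask = batch["attention_mask"].unsqueeze(-1).float()
    embeddings = (token_embeddings * mask).sum(dim=1) / mask.sum(dim=1).clamp(min=1e-9)
    # L2 normalization, matching the Normalize() module
    return F.normalize(embeddings, p=2, dim=1)

print(encode(["durable pvc swim ring", "warm drinks mug"]).shape)  # torch.Size([2, 384])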

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("sentence_transformers_model_id")
# Run inference
sentences = [
    'nkd skn',
    'hand drawing bag',
    'panini salmon',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 384]

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
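
The same two calls can also rank candidate phrases against a single query, which matches the product-pair use case this model was trained on. The snippet below is a small follow-on sketch that reuses the model loaded above, with candidates taken from the widget examples.

query_embedding = model.encode(["dusky hair brush"])
candidates = ["back compartment laptop", "rubber feet platter", "honed blade knife"]
candidate_embeddings = model.encode(candidates)

# model.similarity returns a (1, 3) tensor of cosine similarities here
scores = model.similarity(query_embedding, candidate_embeddings)[0]
for candidate, score in sorted(zip(candidates, scores.tolist()), key=lambda pair: -pair[1]):
    print(f"{score:.3f}  {candidate}")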

Training Details

Training Dataset

pairs_three_scores_v5

  • Dataset: pairs_three_scores_v5 at 3d8c457
  • Size: 80,000,003 training samples
  • Columns: sentence1, sentence2, and score
  • Approximate statistics based on the first 1000 samples:
    • sentence1 (string): min: 3 tokens, mean: 6.06 tokens, max: 12 tokens
    • sentence2 (string): min: 3 tokens, mean: 5.71 tokens, max: 13 tokens
    • score (float): min: 0.0, mean: 0.11, max: 1.0
  • Samples (sentence1 | sentence2 | score):
    • vanilla hair cream | free of paraben hair mask | 0.5
    • nourishing shampoo | cumin lemon tea | 0.0
    • safe materials pacifier | facial serum | 0.5
  • Loss: CoSENTLoss with these parameters:
    {
        "scale": 20.0,
        "similarity_fct": "pairwise_cos_sim"
    }
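
This loss corresponds to sentence_transformers.losses.CoSENTLoss. A minimal sketch of how the dataset and loss could be wired together is shown below; the dataset id and revision come from the references above, while the exact split handling used for training is not documented in this card, so treat it as an assumption.

from datasets import load_dataset
from sentence_transformers import SentenceTransformer, util
from sentence_transformers.losses import CoSENTLoss

# Columns: sentence1, sentence2, score (float in [0, 1])
dataset = load_dataset("youssefkhalil320/pairs_three_scores_v5", revision="3d8c457")

model = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")
# scale=20.0 and pairwise cosine similarity, as listed in the parameters above
loss = CoSENTLoss(model, scale=20.0, similarity_fct=util.pairwise_cos_sim)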
    

Evaluation Dataset

pairs_three_scores_v5

  • Dataset: pairs_three_scores_v5 at 3d8c457
  • Size: 20,000,001 evaluation samples
  • Columns: sentence1, sentence2, and score
  • Approximate statistics based on the first 1000 samples:
    • sentence1 (string): min: 3 tokens, mean: 6.21 tokens, max: 12 tokens
    • sentence2 (string): min: 3 tokens, mean: 5.75 tokens, max: 12 tokens
    • score (float): min: 0.0, mean: 0.11, max: 1.0
  • Samples (sentence1 | sentence2 | score):
    • teddy bear toy | long lasting cat food | 0.0
    • eva hair treatment | fresh pineapple | 0.0
    • soft wave hair conditioner | hybrid seat bike | 0.0
  • Loss: CoSENTLoss with these parameters:
    {
        "scale": 20.0,
        "similarity_fct": "pairwise_cos_sim"
    }
    

Training Hyperparameters

Non-Default Hyperparameters

  • eval_strategy: steps
  • per_device_train_batch_size: 128
  • per_device_eval_batch_size: 128
  • learning_rate: 2e-05
  • num_train_epochs: 1
  • warmup_ratio: 0.1
  • fp16: True
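
A minimal training sketch that applies these non-default values through the SentenceTransformerTrainer API might look like the following. It assumes the model, dataset, and loss objects from the Training Dataset sketch above; the output directory and split names are illustrative guesses rather than values taken from this card, and every other setting keeps the defaults listed in the full table below.

from sentence_transformers import SentenceTransformerTrainer, SentenceTransformerTrainingArguments

args = SentenceTransformerTrainingArguments(
    output_dir="models/all-MiniLM-L6-v8-pair_score",  # illustrative path
    eval_strategy="steps",
    per_device_train_batch_size=128,
    per_device_eval_batch_size=128,
    learning_rate=2e-5,
    num_train_epochs=1,
    warmup_ratio=0.1,
    fp16=True,
)

trainer = SentenceTransformerTrainer(
    model=model,                     # base model wrapped by CoSENTLoss above
    args=args,
    train_dataset=dataset["train"],  # assumes an 80M-row "train" split
    eval_dataset=dataset["test"],    # assumes a 20M-row held-out split
    loss=loss,
)
trainer.train()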

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: steps
  • prediction_loss_only: True
  • per_device_train_batch_size: 128
  • per_device_eval_batch_size: 128
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 2e-05
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1.0
  • num_train_epochs: 1
  • max_steps: -1
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.1
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: False
  • fp16: True
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: False
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: False
  • hub_always_push: False
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • dispatch_batches: None
  • split_batches: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • use_liger_kernel: False
  • eval_use_gather_object: False
  • batch_sampler: batch_sampler
  • multi_dataset_batch_sampler: proportional

Training Logs

Epoch Step Training Loss
0.0002 100 10.8792
0.0003 200 10.9284
0.0005 300 10.6466
0.0006 400 10.841
0.0008 500 10.8094
0.0010 600 10.4323
0.0011 700 10.3032
0.0013 800 10.4006
0.0014 900 10.4743
0.0016 1000 10.2334
0.0018 1100 10.0135
0.0019 1200 9.7874
0.0021 1300 9.7419
0.0022 1400 9.7412
0.0024 1500 9.4585
0.0026 1600 9.5339
0.0027 1700 9.4345
0.0029 1800 9.1733
0.0030 1900 8.9952
0.0032 2000 8.9669
0.0034 2100 8.8152
0.0035 2200 8.7936
0.0037 2300 8.6771
0.0038 2400 8.4648
0.0040 2500 8.5764
0.0042 2600 8.4587
0.0043 2700 8.2966
0.0045 2800 8.2329
0.0046 2900 8.1415
0.0048 3000 8.0404
0.0050 3100 7.9698
0.0051 3200 7.9205
0.0053 3300 7.8314
0.0054 3400 7.8369
0.0056 3500 7.6403
0.0058 3600 7.5842
0.0059 3700 7.5812
0.0061 3800 7.4335
0.0062 3900 7.4917
0.0064 4000 7.3204
0.0066 4100 7.2971
0.0067 4200 7.2233
0.0069 4300 7.2081
0.0070 4400 7.1364
0.0072 4500 7.0663
0.0074 4600 6.9601
0.0075 4700 6.9546
0.0077 4800 6.9019
0.0078 4900 6.8801
0.0080 5000 6.7734
0.0082 5100 6.7648
0.0083 5200 6.7498
0.0085 5300 6.6872
0.0086 5400 6.6264
0.0088 5500 6.579
0.0090 5600 6.6001
0.0091 5700 6.5971
0.0093 5800 6.4694
0.0094 5900 6.3983
0.0096 6000 6.4477
0.0098 6100 6.4308
0.0099 6200 6.4248
0.0101 6300 6.2642
0.0102 6400 6.2763
0.0104 6500 6.3878
0.0106 6600 6.2601
0.0107 6700 6.1789
0.0109 6800 6.1773
0.0110 6900 6.1439
0.0112 7000 6.1863
0.0114 7100 6.0513
0.0115 7200 6.0671
0.0117 7300 6.0212
0.0118 7400 6.0043
0.0120 7500 6.0166
0.0122 7600 5.9754
0.0123 7700 5.9211
0.0125 7800 5.7867
0.0126 7900 5.8534
0.0128 8000 5.7708
0.0130 8100 5.8328
0.0131 8200 5.7417
0.0133 8300 5.8097
0.0134 8400 5.7578
0.0136 8500 5.643
0.0138 8600 5.6401
0.0139 8700 5.6627
0.0141 8800 5.6167
0.0142 8900 5.6539
0.0144 9000 5.4513
0.0146 9100 5.4132
0.0147 9200 5.4714
0.0149 9300 5.4786
0.0150 9400 5.3928
0.0152 9500 5.4774
0.0154 9600 5.2881
0.0155 9700 5.3699
0.0157 9800 5.1483
0.0158 9900 5.3051
0.0160 10000 5.2546
0.0162 10100 5.2314
0.0163 10200 5.1783
0.0165 10300 5.2074
0.0166 10400 5.2825
0.0168 10500 5.1715
0.0170 10600 5.087
0.0171 10700 5.082
0.0173 10800 4.9111
0.0174 10900 5.0213
0.0176 11000 4.9898
0.0178 11100 4.7734
0.0179 11200 4.9511
0.0181 11300 5.0481
0.0182 11400 4.8441
0.0184 11500 4.873
0.0186 11600 4.9988
0.0187 11700 4.7653
0.0189 11800 4.804
0.0190 11900 4.8288
0.0192 12000 4.7053
0.0194 12100 4.6887
0.0195 12200 4.7832
0.0197 12300 4.6817
0.0198 12400 4.6252
0.0200 12500 4.5936
0.0202 12600 4.7452
0.0203 12700 4.5321
0.0205 12800 4.4964
0.0206 12900 4.4421
0.0208 13000 4.3782
0.0210 13100 4.5169
0.0211 13200 4.533
0.0213 13300 4.3725
0.0214 13400 4.2911
0.0216 13500 4.2261
0.0218 13600 4.2467
0.0219 13700 4.1558
0.0221 13800 4.2794
0.0222 13900 4.2383
0.0224 14000 4.1654
0.0226 14100 4.158
0.0227 14200 4.1299
0.0229 14300 4.1902
0.0230 14400 3.7853
0.0232 14500 4.0514

Framework Versions

  • Python: 3.8.10
  • Sentence Transformers: 3.1.1
  • Transformers: 4.45.2
  • PyTorch: 2.4.1+cu118
  • Accelerate: 1.0.1
  • Datasets: 3.0.1
  • Tokenizers: 0.20.3

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

CoSENTLoss

@online{kexuefm-8847,
    title={CoSENT: A more efficient sentence vector scheme than Sentence-BERT},
    author={Su Jianlin},
    year={2022},
    month={Jan},
    url={https://kexue.fm/archives/8847},
}