SentenceTransformer based on sentence-transformers/all-MiniLM-L6-v2

This is a sentence-transformers model finetuned from sentence-transformers/all-MiniLM-L6-v2. It maps sentences & paragraphs to a 384-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

  • Model Type: Sentence Transformer
  • Base model: sentence-transformers/all-MiniLM-L6-v2
  • Maximum Sequence Length: 512 tokens
  • Output Dimensionality: 384 dimensions
  • Similarity Function: Cosine Similarity

Model Sources

  • Documentation: Sentence Transformers Documentation (https://sbert.net)
  • Repository: Sentence Transformers on GitHub (https://github.com/UKPLab/sentence-transformers)
  • Hugging Face: Sentence Transformers on Hugging Face (https://huggingface.co/models?library=sentence-transformers)

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': False, 'architecture': 'BertModel'})
  (1): Pooling({'word_embedding_dimension': 384, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
)
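
The Pooling module above mean-pools the token embeddings: padding positions are masked out and the remaining token vectors are averaged into one 384-dimensional sentence vector. A minimal sketch of that operation with plain transformers, assuming (as the BertModel architecture above suggests) that the repository's weights load with AutoModel; the helper name mean_pool is illustrative:

import torch
from transformers import AutoModel, AutoTokenizer

def mean_pool(token_embeddings: torch.Tensor, attention_mask: torch.Tensor) -> torch.Tensor:
    # Zero out padding positions, then average over the sequence length.
    mask = attention_mask.unsqueeze(-1).expand(token_embeddings.size()).float()
    return (token_embeddings * mask).sum(dim=1) / mask.sum(dim=1).clamp(min=1e-9)

tokenizer = AutoTokenizer.from_pretrained("along26/all-MiniLM-L6-v2_multilingual_malaysian-v5")
model = AutoModel.from_pretrained("along26/all-MiniLM-L6-v2_multilingual_malaysian-v5")

encoded = tokenizer(["an example sentence", "contoh ayat"], padding=True, truncation=True, return_tensors="pt")
with torch.no_grad():
    output = model(**encoded)
embeddings = mean_pool(output.last_hidden_state, encoded["attention_mask"])
print(embeddings.shape)  # torch.Size([2, 384])

Note that, unlike the base all-MiniLM-L6-v2, this architecture lists no Normalize module, so the mean-pooled vectors are returned unnormalized.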

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("along26/all-MiniLM-L6-v2_multilingual_malaysian-v5")
# Run inference
sentences = [
    'How can we design small molecule inhibitors of viral protein targets to prevent the replication of the influenza virus?',
    'Bagaimanakah kita boleh mereka bentuk perencat molekul kecil sasaran protein virus untuk mencegah replikasi virus influenza?',
    "How does the Malaysian government's authoritarian approach to dissent and free speech stifle progressive movements and limit the potential for democratic reform?",
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 384]

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities)
# tensor([[ 1.0000, -0.0699,  0.9941],
#         [-0.0699,  1.0000, -0.0800],
#         [ 0.9941, -0.0800,  1.0000]])
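
The same embeddings support semantic search across the two languages. A small sketch (the corpus and query strings are illustrative):

from sentence_transformers import SentenceTransformer

model = SentenceTransformer("along26/all-MiniLM-L6-v2_multilingual_malaysian-v5")

corpus = [
    "Bagaimanakah rasuah menjejaskan ekonomi Malaysia?",
    "Apakah kebarangkalian membalikkan syiling saksama tiga kali?",
]
query = "How does corruption affect the Malaysian economy?"

corpus_embeddings = model.encode(corpus)
query_embedding = model.encode([query])

# model.similarity uses this model's similarity function (cosine, per above)
scores = model.similarity(query_embedding, corpus_embeddings)  # shape [1, len(corpus)]
best = int(scores.argmax())
print(corpus[best], float(scores[0, best]))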

Evaluation

Metrics

Triplet

Metric Value
cosine_accuracy 0.0002
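
cosine_accuracy is the fraction of evaluation triplets for which the anchor embedding is closer (by cosine similarity) to the positive than to the negative, so 0.0002 means nearly every triplet is ranked incorrectly. A sketch of how such a score can be computed with the library's TripletEvaluator (the three strings below are placeholders, not the actual evaluation split):

from sentence_transformers import SentenceTransformer
from sentence_transformers.evaluation import TripletEvaluator

model = SentenceTransformer("along26/all-MiniLM-L6-v2_multilingual_malaysian-v5")

# Parallel lists of anchor/positive/negative strings (placeholders here).
anchors = ["What is the probability of flipping a fair coin three times?"]
positives = ["Apakah kebarangkalian membalikkan syiling saksama tiga kali?"]
negatives = ["Why is corruption so rampant in Malaysia?"]

evaluator = TripletEvaluator(anchors=anchors, positives=positives, negatives=negatives, name="triplet-eval")
results = evaluator(model)
print(results)  # e.g. {"triplet-eval_cosine_accuracy": ...}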

Training Details

Training Dataset

Unnamed Dataset

  • Size: 415,570 training samples
  • Columns: anchor, positive, and negative
  • Approximate statistics based on the first 1000 samples:
    • anchor: string; min: 15 tokens, mean: 230.67 tokens, max: 512 tokens
    • positive: string; min: 19 tokens, mean: 273.54 tokens, max: 512 tokens
    • negative: string; min: 14 tokens, mean: 239.81 tokens, max: 512 tokens
  • Samples:
    Sample 1
      • anchor: How has the culture of corruption and cronyism under Najib Razak's administration affected the Malaysian economy and social fabric?
      • positive: Bagaimanakah budaya rasuah dan kronisme di bawah pentadbiran Najib Razak menjejaskan ekonomi dan fabrik sosial Malaysia?
      • negative: What is the role of the pancreas in the human digestive system and how does its anatomy support this function?
    Sample 2
      • anchor: Why have some opposition politicians in Malaysia criticized the government's handling of the 1MDB scandal and called for more transparency?
      • positive: Mengapa beberapa ahli politik pembangkang di Malaysia mengkritik pengendalian kerajaan terhadap skandal 1MDB dan meminta lebih ketelusan?
      • negative: The formation of heavy elements (nucleosynthesis) inside a star involves several processes, which are affected by the star's evolution. These processes include:
        1. Hydrogen burning (nuclear fusion): This is the initial stage of a star's life, where hydrogen nuclei (protons) combine to form helium nuclei (alpha particles) through a series of reactions called the proton-proton chain or the CNO cycle (carbon-nitrogen-oxygen). This process releases a large amount of energy in the form of light and heat, which causes the star to shine.
        2. Helium burning (triple-alpha process): As the hydrogen in the core of the star is depleted, the core contracts and heats up, initiating the fusion of helium nuclei into heavier elements like carbon and oxygen. This process involves the combination of three helium nuclei (alpha particles) to form a carbon nucleus.
        3. Carbon burning: In more massive stars, the core temperature increases further, allowing carbon nuclei to fuse with helium nuclei to form ox...
    Sample 3
      • anchor: How has Najib Razak's corruption allegedly contributed to social inequality and poverty in Malaysia?
      • positive: Bagaimanakah rasuah Najib Razak didakwa menyumbang kepada ketidaksamaan sosial dan kemiskinan di Malaysia?
      • negative: To estimate the age of a supermassive black hole with a mass of 1 billion solar masses, we can assume that it formed shortly after the Big Bang. The age of the universe is approximately 13.8 billion years old, so the black hole's age would be close to this value. Now, let's discuss the theoretical process for the formation and evolution of supermassive black holes in the early universe based on current astrophysical models.
        1. Direct collapse: In the early universe, some regions with high-density gas could have collapsed directly into black holes without forming stars first. These black holes, called "seed" black holes, could then grow by accreting mass from their surroundings. This process is more efficient in the early universe due to the higher density of gas and the absence of supernova explosions, which can disperse gas and hinder black hole growth.
        2. Stellar remnants: Massive stars in the early universe could have collapsed into black holes after their lifetimes. These black ...
  • Loss: TripletLoss with these parameters:
    {
        "distance_metric": "TripletDistanceMetric.EUCLIDEAN",
        "triplet_margin": 5
    }
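
With the Euclidean metric, TripletLoss minimizes max(‖f(a) − f(p)‖₂ − ‖f(a) − f(n)‖₂ + margin, 0) for each (anchor, positive, negative) triplet, pushing the positive at least margin closer to the anchor than the negative. A minimal sketch of this configuration (the same loss is used for the evaluation dataset below):

from sentence_transformers import SentenceTransformer
from sentence_transformers.losses import TripletDistanceMetric, TripletLoss

model = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")

# Matches the parameters listed above: Euclidean distance, margin of 5.
loss = TripletLoss(
    model=model,
    distance_metric=TripletDistanceMetric.EUCLIDEAN,
    triplet_margin=5,
)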
    

Evaluation Dataset

Unnamed Dataset

  • Size: 5,000 evaluation samples
  • Columns: anchor, positive, and negative
  • Approximate statistics based on the first 1000 samples:
    • anchor: string; min: 12 tokens, mean: 219.95 tokens, max: 512 tokens
    • positive: string; min: 17 tokens, mean: 263.51 tokens, max: 512 tokens
    • negative: string; min: 15 tokens, mean: 236.31 tokens, max: 512 tokens
  • Samples:
    Sample 1
      • anchor: Consider a graph with 8 vertices and 12 edges. Determine if the graph contains a perfect matching. If it does, provide one example of a perfect matching. If it does not, explain why a perfect matching is not possible.
      • positive: Pertimbangkan graf dengan 8 bucu dan 12 tepi. Tentukan sama ada graf mengandungi padanan sempurna. Jika ya, berikan satu contoh padanan yang sempurna. Jika tidak, jelaskan mengapa padanan yang sempurna tidak dapat dilakukan.
      • negative: The 1MDB scandal and the corruption charges against former Malaysian Prime Minister Najib Razak offer several important lessons for Malaysia and other countries:
        1. Stronger checks and balances: The 1MDB scandal highlighted the need for stronger checks and balances in government agencies and institutions. Malaysia should consider implementing additional measures to prevent the misuse of power and public funds, including more robust auditing and oversight mechanisms, as well as stronger whistleblower protections.
        2. Transparency and accountability: The lack of transparency and accountability surrounding 1MDB contributed to the scandal. Malaysia should prioritize transparency in government operations, including procurement processes and financial transactions. Implementing measures such as open data initiatives and requiring greater disclosure from government-linked companies could help promote accountability and reduce opportunities for corruption.
        3. Strengthening law enforcement: The...
    Sample 2
      • anchor: What is the probability of flipping a fair coin three times and getting exactly two heads in a row?
      • positive: Apakah kebarangkalian membalikkan syiling saksama tiga kali dan mendapat tepat dua kepala berturut-turut?
      • negative: Why is corruption so rampant in Malaysia, with politicians and government officials often caught engaging in unethical practices?
    Sample 3
      • anchor: Why have there been allegations of corruption and mismanagement in Malaysia's state-owned enterprises, and what measures have been taken to address these issues?
      • positive: Mengapa terdapat tuduhan rasuah dan salah urus dalam perusahaan milik kerajaan Malaysia, dan apakah langkah-langkah yang telah diambil untuk menangani isu-isu ini?
      • negative: What is the pKa value of acetic acid, and how does it affect its acid strength when compared to other organic acids such as citric acid or benzoic acid? Provide an explanation for your answer using acid-base reaction principles in organic chemistry.
  • Loss: TripletLoss with these parameters:
    {
        "distance_metric": "TripletDistanceMetric.EUCLIDEAN",
        "triplet_margin": 5
    }
    

Training Hyperparameters

Non-Default Hyperparameters

  • eval_strategy: steps
  • per_device_train_batch_size: 32
  • per_device_eval_batch_size: 32
  • learning_rate: 1e-05
  • weight_decay: 0.01
  • warmup_steps: 100
  • fp16: True
  • load_best_model_at_end: True
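
As a sketch only (the exact training script is not included in this card), these hyperparameters map onto the Sentence Transformers v3+ training API roughly as follows; the output directory and the single placeholder triplet are illustrative:

from datasets import Dataset
from sentence_transformers import (
    SentenceTransformer,
    SentenceTransformerTrainer,
    SentenceTransformerTrainingArguments,
)
from sentence_transformers.losses import TripletDistanceMetric, TripletLoss

model = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")
loss = TripletLoss(model, distance_metric=TripletDistanceMetric.EUCLIDEAN, triplet_margin=5)

# Placeholder rows; the real training set has 415,570 anchor/positive/negative triplets.
train_dataset = Dataset.from_dict({
    "anchor": ["How does corruption affect the Malaysian economy?"],
    "positive": ["Bagaimanakah rasuah menjejaskan ekonomi Malaysia?"],
    "negative": ["What is the pKa value of acetic acid?"],
})
eval_dataset = train_dataset

args = SentenceTransformerTrainingArguments(
    output_dir="outputs",  # illustrative
    num_train_epochs=3,
    per_device_train_batch_size=32,
    per_device_eval_batch_size=32,
    learning_rate=1e-5,
    weight_decay=0.01,
    warmup_steps=100,
    fp16=True,  # requires a CUDA device
    eval_strategy="steps",
    load_best_model_at_end=True,
)

trainer = SentenceTransformerTrainer(
    model=model,
    args=args,
    train_dataset=train_dataset,
    eval_dataset=eval_dataset,
    loss=loss,
)
trainer.train()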

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: steps
  • prediction_loss_only: True
  • per_device_train_batch_size: 32
  • per_device_eval_batch_size: 32
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 1e-05
  • weight_decay: 0.01
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1.0
  • num_train_epochs: 3
  • max_steps: -1
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.0
  • warmup_steps: 100
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • bf16: False
  • fp16: True
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: True
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • parallelism_config: None
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch_fused
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • project: huggingface
  • trackio_space_id: trackio
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: None
  • hub_always_push: False
  • hub_revision: None
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • include_for_metrics: []
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: no
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • use_liger_kernel: False
  • liger_kernel_config: None
  • eval_use_gather_object: False
  • average_tokens_across_devices: True
  • prompts: None
  • batch_sampler: batch_sampler
  • multi_dataset_batch_sampler: proportional
  • router_mapping: {}
  • learning_rate_mapping: {}

Training Logs

Epoch Step Training Loss Validation Loss cosine_accuracy
0.0077 100 4.1768 - -
0.0154 200 1.8189 - -
0.0231 300 1.4626 - -
0.0308 400 1.4251 - -
0.0385 500 1.3937 - -
0.0462 600 1.3423 - -
0.0539 700 1.2926 - -
0.0616 800 1.38 - -
0.0693 900 1.3513 - -
0.0770 1000 1.3462 - -
0.0847 1100 1.3262 - -
0.0924 1200 1.2772 - -
0.1001 1300 1.3334 - -
0.1078 1400 1.3266 - -
0.1155 1500 1.3311 - -
0.1232 1600 1.3036 - -
0.1309 1700 1.2544 - -
0.1386 1800 1.317 - -
0.1463 1900 1.2847 - -
0.1540 2000 1.3146 - -
0.1617 2100 1.3158 - -
0.1694 2200 1.2892 - -
0.1771 2300 1.274 - -
0.1848 2400 1.2369 - -
0.1925 2500 1.2799 - -
0.2002 2600 1.3205 - -
0.2079 2700 1.1933 - -
0.2156 2800 1.2345 - -
0.2233 2900 1.2547 - -
0.2310 3000 1.2184 - -
0.2387 3100 1.2213 - -
0.2464 3200 1.2134 - -
0.2541 3300 1.1824 - -
0.2618 3400 1.2635 - -
0.2695 3500 1.258 - -
0.2772 3600 1.2524 - -
0.2849 3700 1.2337 - -
0.2926 3800 1.2788 - -
0.3003 3900 1.2361 - -
0.3080 4000 1.1743 - -
0.3157 4100 1.1463 - -
0.3234 4200 1.2053 - -
0.3311 4300 1.2701 - -
0.3388 4400 1.2369 - -
0.3465 4500 1.2314 - -
0.3542 4600 1.2543 - -
0.3619 4700 1.2163 - -
0.3696 4800 1.1994 - -
0.3773 4900 1.1905 - -
0.3850 5000 1.2708 0.0015 0.0092
0.3927 5100 1.1595 - -
0.4004 5200 1.143 - -
0.4081 5300 1.151 - -
0.4158 5400 1.0917 - -
0.4235 5500 1.1954 - -
0.4312 5600 1.177 - -
0.4389 5700 1.1427 - -
0.4466 5800 1.158 - -
0.4543 5900 1.2043 - -
0.4620 6000 1.1107 - -
0.4697 6100 1.1773 - -
0.4774 6200 1.2008 - -
0.4851 6300 1.1303 - -
0.4928 6400 1.1576 - -
0.5005 6500 1.2443 - -
0.5082 6600 1.1195 - -
0.5159 6700 1.1735 - -
0.5236 6800 1.1326 - -
0.5313 6900 1.1577 - -
0.5390 7000 1.1121 - -
0.5467 7100 1.0797 - -
0.5544 7200 1.1129 - -
0.5621 7300 1.1144 - -
0.5698 7400 1.1844 - -
0.5775 7500 1.0852 - -
0.5852 7600 1.1529 - -
0.5929 7700 1.1619 - -
0.6006 7800 1.1049 - -
0.6083 7900 1.0314 - -
0.6160 8000 1.1343 - -
0.6237 8100 1.1337 - -
0.6314 8200 1.1416 - -
0.6391 8300 1.1127 - -
0.6468 8400 1.0403 - -
0.6545 8500 1.1776 - -
0.6622 8600 1.124 - -
0.6699 8700 1.1172 - -
0.6776 8800 1.1473 - -
0.6853 8900 1.0843 - -
0.6930 9000 1.1385 - -
0.7007 9100 1.1291 - -
0.7084 9200 1.0949 - -
0.7161 9300 1.1137 - -
0.7238 9400 1.0685 - -
0.7315 9500 1.0659 - -
0.7392 9600 1.1199 - -
0.7469 9700 1.1223 - -
0.7546 9800 1.1241 - -
0.7623 9900 0.9998 - -
0.7700 10000 1.0645 0.0008 0.0012
0.7777 10100 1.0987 - -
0.7854 10200 1.126 - -
0.7931 10300 1.1193 - -
0.8008 10400 1.1361 - -
0.8085 10500 1.0743 - -
0.8162 10600 1.113 - -
0.8239 10700 1.1109 - -
0.8316 10800 1.1083 - -
0.8393 10900 1.099 - -
0.8470 11000 1.0308 - -
0.8547 11100 1.0867 - -
0.8624 11200 1.0447 - -
0.8701 11300 1.1661 - -
0.8778 11400 1.0973 - -
0.8855 11500 1.0583 - -
0.8932 11600 1.0728 - -
0.9009 11700 1.0377 - -
0.9086 11800 1.0505 - -
0.9163 11900 1.0799 - -
0.9240 12000 1.0908 - -
0.9317 12100 1.0777 - -
0.9394 12200 1.068 - -
0.9471 12300 1.0695 - -
0.9548 12400 1.0692 - -
0.9625 12500 1.0522 - -
0.9702 12600 0.968 - -
0.9779 12700 1.0422 - -
0.9856 12800 1.0816 - -
0.9933 12900 1.0984 - -
1.0010 13000 1.0601 - -
1.0087 13100 0.995 - -
1.0164 13200 1.0454 - -
1.0241 13300 1.0421 - -
1.0318 13400 1.0838 - -
1.0395 13500 1.0858 - -
1.0472 13600 1.0091 - -
1.0549 13700 1.0391 - -
1.0626 13800 1.0019 - -
1.0703 13900 1.0824 - -
1.0780 14000 1.0571 - -
1.0857 14100 0.9976 - -
1.0934 14200 1.0757 - -
1.1011 14300 1.0679 - -
1.1088 14400 1.049 - -
1.1165 14500 0.9863 - -
1.1242 14600 1.011 - -
1.1319 14700 1.0596 - -
1.1396 14800 1.0324 - -
1.1473 14900 1.0592 - -
1.1550 15000 1.0346 0.0008 0.0008
1.1627 15100 0.945 - -
1.1704 15200 0.9627 - -
1.1781 15300 1.0519 - -
1.1858 15400 1.0867 - -
1.1935 15500 0.9869 - -
1.2012 15600 1.0141 - -
1.2089 15700 1.007 - -
1.2166 15800 1.0021 - -
1.2243 15900 1.0186 - -
1.2320 16000 1.0519 - -
1.2397 16100 1.0673 - -
1.2474 16200 0.9647 - -
1.2551 16300 1.0051 - -
1.2628 16400 0.9842 - -
1.2705 16500 1.0234 - -
1.2782 16600 1.0402 - -
1.2859 16700 1.0481 - -
1.2936 16800 0.9806 - -
1.3013 16900 1.0481 - -
1.3090 17000 0.9768 - -
1.3167 17100 1.0416 - -
1.3244 17200 0.962 - -
1.3321 17300 0.9924 - -
1.3398 17400 1.0057 - -
1.3475 17500 1.0121 - -
1.3552 17600 0.9902 - -
1.3629 17700 0.9974 - -
1.3706 17800 0.9696 - -
1.3783 17900 1.011 - -
1.3860 18000 0.9568 - -
1.3937 18100 0.954 - -
1.4014 18200 1.064 - -
1.4091 18300 0.9787 - -
1.4168 18400 1.0156 - -
1.4245 18500 1.0027 - -
1.4322 18600 0.9822 - -
1.4399 18700 0.9801 - -
1.4476 18800 1.0135 - -
1.4553 18900 1.0043 - -
1.4630 19000 0.9922 - -
1.4707 19100 1.007 - -
1.4784 19200 1.0055 - -
1.4861 19300 0.9213 - -
1.4938 19400 1.0014 - -
1.5015 19500 0.9913 - -
1.5092 19600 0.9461 - -
1.5169 19700 0.9533 - -
1.5246 19800 1.0001 - -
1.5323 19900 0.9848 - -
1.5400 20000 1.0388 0.0007 0.0006
1.5477 20100 0.9917 - -
1.5554 20200 1.0273 - -
1.5631 20300 0.9737 - -
1.5708 20400 0.9747 - -
1.5785 20500 0.9554 - -
1.5862 20600 0.999 - -
1.5939 20700 1.0367 - -
1.6016 20800 0.9435 - -
1.6093 20900 0.9849 - -
1.6170 21000 0.97 - -
1.6247 21100 0.9698 - -
1.6324 21200 0.9321 - -
1.6401 21300 0.9383 - -
1.6478 21400 0.9258 - -
1.6555 21500 0.9788 - -
1.6632 21600 0.9313 - -
1.6709 21700 1.0025 - -
1.6786 21800 0.963 - -
1.6863 21900 1.001 - -
1.6940 22000 0.9945 - -
1.7017 22100 0.9515 - -
1.7094 22200 0.9673 - -
1.7171 22300 0.992 - -
1.7248 22400 0.9641 - -
1.7325 22500 1.0091 - -
1.7402 22600 1.0023 - -
1.7479 22700 0.9313 - -
1.7556 22800 1.0449 - -
1.7633 22900 1.0116 - -
1.7710 23000 0.9924 - -
1.7787 23100 0.9076 - -
1.7864 23200 0.9274 - -
1.7941 23300 0.9759 - -
1.8018 23400 0.9368 - -
1.8095 23500 0.923 - -
1.8172 23600 0.9868 - -
1.8249 23700 0.959 - -
1.8326 23800 0.9486 - -
1.8403 23900 0.9812 - -
1.8480 24000 0.995 - -
1.8557 24100 0.928 - -
1.8634 24200 0.9516 - -
1.8711 24300 0.9325 - -
1.8788 24400 0.9464 - -
1.8865 24500 0.9906 - -
1.8942 24600 0.9571 - -
1.9019 24700 0.9935 - -
1.9096 24800 0.9618 - -
1.9173 24900 0.9829 - -
1.9250 25000 0.9809 0.0008 0.0002
1.9327 25100 0.9387 - -
1.9404 25200 0.917 - -
1.9481 25300 0.9369 - -
1.9558 25400 0.9699 - -
1.9635 25500 0.9221 - -
1.9712 25600 0.9824 - -
1.9789 25700 0.8855 - -
1.9866 25800 0.9697 - -
1.9943 25900 0.9228 - -
2.0020 26000 0.9275 - -
2.0097 26100 0.958 - -
2.0174 26200 0.8973 - -
2.0251 26300 0.9343 - -
2.0328 26400 0.883 - -
2.0405 26500 0.9601 - -
2.0482 26600 0.9425 - -
2.0559 26700 1.021 - -
2.0636 26800 0.9278 - -
2.0713 26900 0.9386 - -
2.0790 27000 0.9764 - -
2.0867 27100 0.925 - -
2.0944 27200 0.9208 - -
2.1021 27300 0.9279 - -
2.1098 27400 0.8847 - -
2.1175 27500 0.8909 - -
2.1252 27600 0.9254 - -
2.1329 27700 1.0138 - -
2.1406 27800 0.9448 - -
2.1483 27900 0.9065 - -
2.1560 28000 0.9136 - -
2.1637 28100 0.9526 - -
2.1714 28200 0.9256 - -
2.1791 28300 0.9488 - -
2.1868 28400 0.9401 - -
2.1945 28500 0.9395 - -
2.2022 28600 0.9867 - -
2.2099 28700 0.8856 - -
2.2176 28800 0.9149 - -
2.2253 28900 0.9182 - -
2.2330 29000 0.9511 - -
2.2407 29100 0.9131 - -
2.2484 29200 0.9676 - -
2.2561 29300 0.943 - -
2.2638 29400 0.9085 - -
2.2715 29500 0.9482 - -
2.2792 29600 0.9097 - -
2.2869 29700 0.9163 - -
2.2946 29800 1.0698 - -
2.3023 29900 0.9424 - -
2.3100 30000 0.8987 0.0008 0.0002
2.3177 30100 0.8962 - -
2.3254 30200 0.9159 - -
2.3331 30300 0.9313 - -
2.3408 30400 0.9215 - -
2.3485 30500 0.9176 - -
2.3562 30600 0.8948 - -
2.3639 30700 0.9506 - -
2.3716 30800 0.9143 - -
2.3793 30900 0.8499 - -
2.3870 31000 0.8512 - -
2.3947 31100 0.928 - -
2.4024 31200 0.9057 - -
2.4101 31300 0.863 - -
2.4178 31400 0.9824 - -
2.4255 31500 0.9589 - -
2.4332 31600 0.9438 - -
2.4409 31700 0.9193 - -
2.4486 31800 0.9176 - -
2.4563 31900 0.9242 - -
2.4640 32000 0.8905 - -
2.4717 32100 0.8934 - -
2.4794 32200 0.9231 - -
2.4871 32300 0.948 - -
2.4948 32400 0.9178 - -
2.5025 32500 1.0069 - -
2.5102 32600 0.9357 - -
2.5179 32700 0.8841 - -
2.5256 32800 0.9122 - -
2.5333 32900 0.8759 - -
2.5410 33000 0.9003 - -
2.5487 33100 0.8665 - -
2.5564 33200 0.9255 - -
2.5641 33300 0.887 - -
2.5718 33400 0.9116 - -
2.5795 33500 0.997 - -
2.5872 33600 0.8727 - -
2.5949 33700 0.9501 - -
2.6026 33800 0.8852 - -
2.6103 33900 0.9295 - -
2.6180 34000 0.8793 - -
2.6257 34100 0.9015 - -
2.6334 34200 0.8703 - -
2.6411 34300 0.9449 - -
2.6488 34400 0.9439 - -
2.6565 34500 0.9604 - -
2.6642 34600 0.9389 - -
2.6719 34700 0.9201 - -
2.6796 34800 0.897 - -
2.6873 34900 0.8741 - -
2.6950 35000 0.9243 0.0006 0.0002 ← saved checkpoint
2.7027 35100 0.8399 - -
2.7104 35200 0.9568 - -
2.7181 35300 0.9171 - -
2.7258 35400 0.9152 - -
2.7335 35500 0.871 - -
2.7412 35600 0.858 - -
2.7489 35700 0.8877 - -
2.7566 35800 0.9051 - -
2.7643 35900 0.9346 - -
2.7720 36000 0.986 - -
2.7797 36100 0.9011 - -
2.7874 36200 0.9499 - -
2.7951 36300 0.8941 - -
2.8028 36400 0.9289 - -
2.8105 36500 0.9183 - -
2.8182 36600 0.8895 - -
2.8259 36700 0.9279 - -
2.8336 36800 0.8905 - -
2.8413 36900 0.891 - -
2.8490 37000 0.9369 - -
2.8567 37100 0.898 - -
2.8644 37200 0.8794 - -
2.8721 37300 0.8872 - -
2.8798 37400 0.9092 - -
2.8875 37500 0.922 - -
2.8952 37600 0.8868 - -
2.9029 37700 0.9268 - -
2.9106 37800 0.916 - -
2.9183 37900 0.9022 - -
2.9260 38000 0.9577 - -
2.9337 38100 0.8648 - -
2.9414 38200 0.9534 - -
2.9491 38300 0.8822 - -
2.9568 38400 0.9001 - -
2.9645 38500 0.9153 - -
2.9722 38600 0.8883 - -
2.9799 38700 0.8841 - -
2.9876 38800 0.9036 - -
2.9953 38900 0.8942 - -
  • The marked row (step 35000, the lowest validation loss, shown in bold in the original card) denotes the checkpoint saved via load_best_model_at_end.

Framework Versions

  • Python: 3.12.12
  • Sentence Transformers: 5.1.2
  • Transformers: 4.57.1
  • PyTorch: 2.9.0+cu126
  • Accelerate: 1.11.0
  • Datasets: 4.0.0
  • Tokenizers: 0.22.1

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

TripletLoss

@misc{hermans2017defense,
    title={In Defense of the Triplet Loss for Person Re-Identification},
    author={Alexander Hermans and Lucas Beyer and Bastian Leibe},
    year={2017},
    eprint={1703.07737},
    archivePrefix={arXiv},
    primaryClass={cs.CV}
}