tags:
  - sentence-transformers
  - sentence-similarity
  - feature-extraction
  - dense
  - generated_from_trainer
  - dataset_size:500
  - loss:MultipleNegativesRankingLoss
base_model: sentence-transformers/multi-qa-MiniLM-L6-cos-v1
widget:
  - source_sentence: Can I get academic adjustments for mental health reasons?
    sentences:
      - >-
        Yes, appropriate academic accommodations can be arranged through the
        disability services office with documentation from mental health
        professionals.
      - >-
        Yes, many companies conduct online aptitude tests, coding challenges, or
        domain-specific assessments as part of their selection process.
      - The hostel offers Wi-Fi, mess services, laundry, and recreational areas.
  - source_sentence: What are the hostel meal timings?
    sentences:
      - >-
        Career services include career counseling, resume workshops, interview
        coaching, networking events, alumni mentoring, and job search assistance
        for all students.
      - >-
        Fee concession applications can be submitted with financial
        documentation to demonstrate need. Merit-based and need-based
        concessions may be available.
      - >-
        Mess timings are typically breakfast 7:30-9:30 AM, lunch 12:00-2:00 PM,
        and dinner 7:00-9:00 PM. Special arrangements may be made during exams.
  - source_sentence: Is there a bond signing for certain jobs?
    sentences:
      - >-
        Yes, detailed placement statistics including company-wise data, salary
        ranges, and sector-wise placement percentages are available on the
        placement portal.
      - >-
        Some companies may require service agreements or bonds. All terms and
        conditions are clearly communicated during the pre-placement talk.
      - >-
        Emergency services are available 24/7 through campus security (extension
        911), medical emergencies (campus health center), and crisis
        intervention services.
  - source_sentence: Where is the medical center located?
    sentences:
      - >-
        Most companies require no active backlogs for placement eligibility.
        Clear all backlogs before the placement season to ensure maximum
        opportunities.
      - >-
        The campus medical center is located near the main administrative
        building and provides basic healthcare services during working hours.
      - >-
        Yes, photocopy and printing services are available at the library,
        administrative building, and near the main canteen with reasonable
        rates.
  - source_sentence: Are there mock interviews before placements?
    sentences:
      - >-
        Yes, mock interviews are conducted regularly to help students practice
        and improve their interview skills before actual placement interviews.
      - >-
        Overnight event permissions require advance approval from student
        affairs, security clearance, safety protocols, and may need faculty
        supervision for student organizations.
      - >-
        Final year students can typically choose 2-4 elective courses depending
        on their program. Check with your academic advisor for specific
        requirements.
pipeline_tag: sentence-similarity
library_name: sentence-transformers

SentenceTransformer based on sentence-transformers/multi-qa-MiniLM-L6-cos-v1

This is a sentence-transformers model fine-tuned from sentence-transformers/multi-qa-MiniLM-L6-cos-v1. It maps sentences and paragraphs to a 384-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

Model Sources

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': False, 'architecture': 'BertModel'})
  (1): Pooling({'word_embedding_dimension': 384, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)
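
The last two modules above turn per-token vectors into a single sentence vector: mean pooling averages the token embeddings (pooling_mode_mean_tokens is True), and Normalize() rescales the result to unit length. The following is a minimal, dependency-free sketch of that idea, not the library's implementation. The helper names `mean_pool` and `l2_normalize` and the 2-dimensional toy vectors are illustrative assumptions; the real model works with 384 dimensions.

```python
import math

def mean_pool(token_embeddings, attention_mask):
    """Average the token vectors whose mask entry is 1, so padding is ignored."""
    dim = len(token_embeddings[0])
    kept = [vec for vec, m in zip(token_embeddings, attention_mask) if m == 1]
    return [sum(vec[i] for vec in kept) / len(kept) for i in range(dim)]

def l2_normalize(vec):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]

# Toy input (hypothetical): three token vectors, the last one is padding.
tokens = [[3.0, 0.0], [0.0, 4.0], [9.0, 9.0]]
mask = [1, 1, 0]

pooled = mean_pool(tokens, mask)            # [1.5, 2.0]
sentence_embedding = l2_normalize(pooled)   # approximately [0.6, 0.8]
```

Because the output has unit length, the dot product of two such embeddings equals their cosine similarity.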

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub ("sentence_transformers_model_id" is a placeholder for this model's Hub ID)
model = SentenceTransformer("sentence_transformers_model_id")
# Run inference
sentences = [
    'Are there mock interviews before placements?',
    'Yes, mock interviews are conducted regularly to help students practice and improve their interview skills before actual placement interviews.',
    'Overnight event permissions require advance approval from student affairs, security clearance, safety protocols, and may need faculty supervision for student organizations.',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# (3, 384)

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities)
# tensor([[ 1.0000,  0.7937, -0.0089],
#         [ 0.7937,  1.0000,  0.0156],
#         [-0.0089,  0.0156,  1.0000]])
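
For question answering over an FAQ, the similarity scores are typically used to rank candidate answers for a query. The sketch below shows that ranking step only, on toy unit vectors standing in for the output of model.encode. It assumes cosine similarity is the scoring function (the training loss uses cos_sim) and that embeddings are unit-length, so a plain dot product is sufficient. The function `rank_answers` and the sample vectors are hypothetical.

```python
def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def rank_answers(query_vec, answer_vecs):
    """Return (index, score) pairs sorted from most to least similar."""
    scores = [(i, dot(query_vec, v)) for i, v in enumerate(answer_vecs)]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)

# Toy unit vectors (hypothetical); real embeddings have 384 dimensions.
query = [1.0, 0.0]
answers = [[0.0, 1.0], [0.8, 0.6], [-1.0, 0.0]]

ranking = rank_answers(query, answers)
best_index, best_score = ranking[0]   # answer 1 is the closest match
```

In the inference example above, the same logic would pick the mock-interview answer (score 0.7937) over the overnight-event answer (score -0.0089) for the mock-interview question.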

Training Details

Training Dataset

Unnamed Dataset

  • Size: 500 training samples
  • Columns: sentence_0 and sentence_1
  • Approximate statistics based on the first 500 samples:
    |         | sentence_0 | sentence_1 |
    | ------- | ---------- | ---------- |
    | type    | string     | string     |
    | details | min: 7 tokens, mean: 11.04 tokens, max: 17 tokens | min: 13 tokens, mean: 26.67 tokens, max: 46 tokens |
  • Samples:
    | sentence_0 | sentence_1 |
    | ---------- | ---------- |
    | What is the policy on retroactive course drops? | Retroactive drops are rare and require exceptional circumstances with documentation. Medical emergencies or administrative errors may qualify for consideration. |
    | Can I get help for eating disorders? | Specialized counseling for eating disorders is available through the health center. Confidential support includes individual therapy and referrals to specialized treatment programs. |
    | Are pets allowed in campus housing? | No, pets are not allowed in campus housing including hostels and faculty quarters due to health and safety regulations. |
  • Loss: MultipleNegativesRankingLoss with these parameters:
    {
        "scale": 20.0,
        "similarity_fct": "cos_sim"
    }
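
To make the two parameters concrete: MultipleNegativesRankingLoss treats each (sentence_0, sentence_1) pair in a batch as a positive and uses the other sentence_1 entries in the same batch as negatives. Cosine similarities are multiplied by scale and fed to a cross-entropy loss. The following is a simplified pure-Python sketch of that computation, not the library code; the function name `mnr_loss` and the 2-dimensional toy embeddings are assumptions for illustration.

```python
import math

def cos_sim(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def mnr_loss(anchors, positives, scale=20.0):
    """Mean cross-entropy where positives[i] is the correct match for anchors[i]
    and every other positive in the batch acts as a negative."""
    total = 0.0
    for i, anchor in enumerate(anchors):
        logits = [scale * cos_sim(anchor, p) for p in positives]
        # Numerically stable log-sum-exp over this row of logits
        m = max(logits)
        log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
        total += log_sum_exp - logits[i]
    return total / len(anchors)

# Toy batch of two question embeddings (hypothetical values)
questions = [[1.0, 0.0], [0.0, 1.0]]
aligned = mnr_loss(questions, [[1.0, 0.0], [0.0, 1.0]])  # answers match: loss near 0
swapped = mnr_loss(questions, [[0.0, 1.0], [1.0, 0.0]])  # answers swapped: loss near 20
```

A larger scale sharpens the softmax, so mismatched pairs are penalized more heavily. With scale 20 and cosine similarities in [-1, 1], the logits range over [-20, 20].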
    

Training Hyperparameters

Non-Default Hyperparameters

  • per_device_train_batch_size: 16
  • per_device_eval_batch_size: 16
  • num_train_epochs: 4
  • multi_dataset_batch_sampler: round_robin
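
As a rough check on what these settings imply, the number of optimizer steps can be worked out from the dataset size, batch size, and epoch count. The arithmetic below assumes a single training device, and relies on gradient_accumulation_steps: 1 and dataloader_drop_last: False from the full hyperparameter list; the variable names are illustrative.

```python
import math

train_samples = 500   # dataset size from the card
batch_size = 16       # per_device_train_batch_size
epochs = 4            # num_train_epochs
devices = 1           # assumption: single device
grad_accum = 1        # gradient_accumulation_steps

# The last partial batch is kept because dataloader_drop_last is False.
batches_per_epoch = math.ceil(train_samples / (batch_size * devices))   # 32
steps_per_epoch = math.ceil(batches_per_epoch / grad_accum)             # 32
total_steps = steps_per_epoch * epochs                                  # 128
last_batch_size = train_samples - (batches_per_epoch - 1) * batch_size  # 4
```

With in-batch negatives, a full batch of 16 gives each question 15 negatives, while the final batch of 4 gives only 3.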

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: no
  • prediction_loss_only: True
  • per_device_train_batch_size: 16
  • per_device_eval_batch_size: 16
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 5e-05
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1
  • num_train_epochs: 4
  • max_steps: -1
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.0
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: False
  • fp16: False
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: False
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: None
  • hub_always_push: False
  • hub_revision: None
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • include_for_metrics: []
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • use_liger_kernel: False
  • liger_kernel_config: None
  • eval_use_gather_object: False
  • average_tokens_across_devices: False
  • prompts: None
  • batch_sampler: batch_sampler
  • multi_dataset_batch_sampler: round_robin
  • router_mapping: {}
  • learning_rate_mapping: {}
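
The combination of learning_rate: 5e-05, lr_scheduler_type: linear, and warmup_steps: 0 means the learning rate starts at its peak and decays linearly to zero over training. The sketch below reproduces that schedule in plain Python rather than calling the trainer. The function `linear_lr` is hypothetical, and the total of 128 steps is an assumption (32 batches per epoch for 4 epochs on a single device).

```python
def linear_lr(step, total_steps, base_lr=5e-05, warmup_steps=0):
    """Linear warmup (if any) followed by linear decay to zero."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)

total_steps = 128  # assumption: 32 batches per epoch * 4 epochs

lr_start = linear_lr(0, total_steps)    # 5e-05
lr_mid = linear_lr(64, total_steps)     # 2.5e-05
lr_end = linear_lr(128, total_steps)    # 0.0
```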

Framework Versions

  • Python: 3.11.13
  • Sentence Transformers: 5.0.0
  • Transformers: 4.55.0
  • PyTorch: 2.6.0+cu124
  • Accelerate: 1.9.0
  • Datasets: 4.0.0
  • Tokenizers: 0.21.4

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

MultipleNegativesRankingLoss

@misc{henderson2017efficient,
    title={Efficient Natural Language Response Suggestion for Smart Reply},
    author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
    year={2017},
    eprint={1705.00652},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}