---
tags:
- sentence-transformers
- sentence-similarity
- feature-extraction
- generated_from_trainer
- dataset_size:958
- loss:CosineSimilarityLoss
base_model: sentence-transformers/all-MiniLM-L6-v2
widget:
- source_sentence: Develop, train, and fine-tune embedding models for improved resume
    matching. Implement and evaluate various model architectures (e.g., transformers,
    Siamese networks) using Python and relevant libraries (e.g., TensorFlow, PyTorch).
    Analyze and interpret model performance metrics, identifying areas for improvement.
    Collaborate with data scientists and engineers to integrate models into production
    systems. Optimize model performance for speed and accuracy. Stay up-to-date with
    the latest advancements in NLP and embedding techniques.
  sentences:
  - 'Skills:

    *   Python, TensorFlow, PyTorch

    *   NLP, Embedding Models, Machine Learning

    *   Web Development, REST APIs


    Experience:

    Software Engineer, Acme Corp (2020 - Present)

    *   Developed and deployed web applications using Python and Django.

    *   Worked on improving search functionality using Elasticsearch.'
  - 'Skills: Python, scikit-learn, machine learning, data analysis, threat modeling,
    communication, problem-solving. Projects include NLP for text classification and
    cybersecurity risk assessment utilizing Python and relevant libraries.'
  - 'Product Management Intern | ABC Company | June 2022 - August 2022

    *   Analyzed user feedback on the product, identifying pain points and areas for
    improvement.

    *   Collaborated with engineers on A/B testing new features to improve user engagement.

    *   Conducted market research and competitive analysis to inform product strategy.

    *   Presented product updates and findings to senior management.'
- source_sentence: Develop, train, and fine-tune embedding models for resume matching,
    focusing on improving accuracy and relevance. This includes experimenting with
    different model architectures, loss functions, and training datasets. Evaluate
    model performance using relevant metrics (e.g., precision, recall, F1-score) and
    identify areas for improvement. Collaborate with data scientists and engineers
    to deploy and maintain models in a production environment. Analyze resume and
    job description data to identify patterns and insights that can inform model development.
    Stay up-to-date with the latest advancements in natural language processing (NLP)
    and machine learning, particularly in the area of embedding models and their application
    to HR and recruitment. Design and implement A/B tests to validate model improvements.
    Provide technical guidance and mentorship to junior team members.
  sentences:
  - "**Senior Machine Learning Engineer**\n*   **Skills:** Python, TensorFlow, PyTorch,\
    \ NLP, Embedding Models, Resume Parsing, Evaluation Metrics (Precision, Recall,\
    \ F1), A/B Testing, Cloud Platforms (AWS), Model Deployment. \n*   Developed and\
    \ deployed a custom BERT-based model for semantic search on large datasets. Focused\
    \ on improving search accuracy and reducing latency. Utilized various loss functions\
    \ and experimented with different hyperparameter configurations. \n*   Led the\
    \ development of a candidate ranking system using a Siamese network architecture.\
    \ Integrated the model into a production environment. \n*   Conducted A/B tests\
    \ to validate model improvements and tracked key performance indicators. Mentored\
    \ junior engineers on model development and deployment best practices. \n*   **Projects:**\n\
    \    *   Semantic Search Optimization: Improved search accuracy by 15% using fine-tuning\
    \ of BERT. Implemented a system using ElasticSearch.\n    *   Candidate Matching\
    \ System: Designed and implemented a system that matched resumes with job descriptions\
    \ using NLP and machine learning techniques."
  - "  *   Built and maintained ETL pipelines using Python and Spark for processing\
    \ large datasets.\n  *   Experience with feature engineering to enhance model\
    \ accuracy.\n  *   Collaborated with cross-functional teams to deploy machine\
    \ learning models.\n  *   Monitored model performance metrics and identified areas\
    \ for optimization.  \n  *   Documented code and processes.   \n\nSkills:\n  *\
    \   Python, Spark, SQL, AWS, Machine Learning Principles"
  - 'Senior Software Engineer | Acme Corp | 2018 - Present

    *   Led the development and deployment of a new resume parsing and matching engine,
    significantly improving the accuracy of candidate recommendations.

    *   Implemented and evaluated various machine learning models, including BERT
    and Sentence Transformers, for semantic similarity scoring.

    *   Utilized Python, TensorFlow, and scikit-learn for model training, evaluation,
    and deployment.

    *   Improved model performance by 15% by implementing new loss function for semantic
    search and fine-tuning the model using custom datasets.'
- source_sentence: Develop and maintain the infrastructure for fine-tuning and evaluating
    embedding models for resume matching. This includes data pipeline design, model
    training pipelines, performance monitoring, and A/B testing of different model
    architectures and training strategies. Optimize model performance for accuracy
    and efficiency, considering factors like latency and resource consumption. Collaborate
    with data scientists and product managers to understand requirements and translate
    them into technical solutions. Build and maintain documentation for all processes
    and tools.
  sentences:
  - 'Skills:


    *   **Python:** Extensive experience in data manipulation and analysis using libraries
    like Pandas and NumPy. Proficient in developing and deploying machine learning
    models.

    *   **Machine Learning:** Solid understanding of various ML algorithms (Regression,
    Decision Trees, SVM, etc.) and experience with model evaluation and selection.
    Familiar with hyperparameter tuning.

    *   **NLP:** Working knowledge of NLP concepts, including text classification
    and sentiment analysis. Used NLTK and SpaCy for text preprocessing and analysis.

    *   **Deep Learning:** Developed and trained Convolutional Neural Networks (CNNs)
    for image recognition and Recurrent Neural Networks (RNNs) for sequence data.

    *   **Cloud Computing:** Used AWS for deploying web applications and storing large
    datasets. Experienced with EC2 and S3 services.

    *   **Data Analysis:** Strong analytical skills with the ability to extract insights
    from complex datasets.

    *   **Tools:** TensorFlow, scikit-learn, Git, Docker'
  - 'Senior Data Engineer | Acme Corp | 2018 - Present

    * Developed and maintained Spark-based data pipelines for processing large datasets
    used in machine learning models.

    * Implemented model monitoring dashboards and alerting systems using Prometheus
    and Grafana.

    * Collaborated with the data science team to deploy models to production using
    Kubernetes.

    * Experience with AWS cloud services, including S3, EMR, and SageMaker.'
  - '### Data Engineering & Machine Learning Projects


    *   **Resume Matching System:** Designed and implemented an end-to-end pipeline
    for processing resumes and matching them to job descriptions. Leveraged Elasticsearch
    for indexing and search.  Improved match relevance by 15% by analyzing search
    query logs and refining scoring algorithms. Used Python, Spark, and AWS services.

    *   **Model Training and Evaluation:** Built automated pipelines for training
    and evaluating machine learning models.  Implemented model versioning and A/B
    testing to improve model performance. Monitored model performance using Prometheus
    and Grafana, identifying and resolving performance bottlenecks. Skilled in TensorFlow,
    PyTorch, and scikit-learn. Experience in data preprocessing, feature engineering,
    and model selection. Wrote extensive documentation and user guides for deployed
    pipelines. Focused on accuracy and efficiency.

    *   **Data Pipeline Development:** Designed and built scalable data pipelines
    using Apache Kafka and Spark for real-time data processing. Maintained high availability
    and reliability.  Implemented data validation and error handling mechanisms. Focused
    on data quality and efficiency.'
- source_sentence: PhD or Master's degree in Marketing, Data Science, Statistics,
    or a related quantitative field is required. Experience with developing and fine-tuning
    embedding models for semantic similarity and information retrieval is highly desirable.
    A strong understanding of NLP techniques (e.g., transformers, word embeddings)
    and their application to resume parsing and candidate matching is essential. Expertise
    in using Python and relevant libraries (e.g., TensorFlow, PyTorch, scikit-learn)
    is a must.
  sentences:
  - "Education:\n\n*   **PhD, Marketing** - University of California, Berkeley (2015-2019)\n\
    \    *   Dissertation: *Predictive Modeling of Consumer Behavior Using Neural\
    \ Networks* - Focused on advanced statistical modeling techniques, including transformer\
    \ networks for sentiment analysis. Proficient in Python (TensorFlow, Keras), R,\
    \ and data visualization.\n*   **MBA** - Harvard Business School (2013). Focused\
    \ on strategic marketing and quantitative analysis. Experience with market research\
    \ and predictive modeling."
  - Resume Matching and NLP Enthusiast | Recent graduate with a Bachelor's degree
    in Data Science. Passionate about applying machine learning techniques to solve
    real-world problems. Proficient in Python, including libraries like Scikit-learn
    and TensorFlow. Conducted a personal project on text classification achieving
    85% accuracy. Eager to contribute to improving recommendation systems and model
    accuracy.
  - '## Software Engineer | Acme Corp | 2020 - Present

    *   Improved search functionality within the company intranet.

    *   Utilized Python and TensorFlow for data analysis and model training.

    *   Worked with vector databases to manage search results.

    *   Participated in the deployment of the search application.

    *   Successfully reduced search latency by 15%.'
- source_sentence: Developed and maintained core backend services using Python and
    Django, focusing on scalability and efficiency. Implemented RESTful APIs for data
    retrieval and manipulation.  Worked extensively with PostgreSQL for data storage
    and retrieval.  Responsible for optimizing database queries and improving API
    response times.  Experience with model fine-tuning for semantic search and document
    retrieval using pre-trained embedding models like Sentence Transformers or similar
    libraries, specifically for improving the relevance of search results and document
    matching within the web application.  Experience using vector databases (e.g.,
    ChromaDB, Weaviate) preferred.
  sentences:
  - 'Skills: Python (proficient in Pandas, Scikit-learn, and Numpy), Machine Learning
    (classification, regression), NLP fundamentals, familiarity with BERT and TF-IDF,
    data visualization with Matplotlib and Seaborn. Experience using AWS S3 and EC2
    for data storage and model training. Conducted A/B testing on marketing campaigns.
    Experience with data analysis and reporting.'
  - 'PhD in Computer Science, University of California, Berkeley (2018-2023). Dissertation:
    ''Adversarial Robustness in NLP for Cybersecurity Applications.'' Focused on fine-tuning
    BERT for malware detection and social engineering attacks. Proficient in Python,
    TensorFlow, and AWS. Published in top-tier NLP and security conferences. Experienced
    with large datasets and model evaluation metrics.


    Master of Science in Cybersecurity, Johns Hopkins University (2016-2018). Relevant
    coursework included Machine Learning, Data Mining, and Network Security. Developed
    a system for anomaly detection using a recurrent neural network (RNN). Familiar
    with Python and cloud computing platforms. Good understanding of NLP concepts,
    but limited experience fine-tuning transformer models. Strong understanding of
    Information Security Principles.


    Bachelor of Science in Computer Engineering, Carnegie Mellon University (2012-2016).
    Relevant coursework: Artificial Intelligence, Database Management, and Software
    Engineering. Project experience: Developed a web application using Python. No
    direct experience with fine-tuning NLP models, but a strong foundation in programming
    and data structures.  Familiar with cloud infrastructure concepts. Possess CISSP
    certification.'
  - '## Senior Backend Engineer


    *   **ABC Corp** | 2020 - Present

    *   Led development of a new REST API for user authentication and profile management
    using Python and Django.

    *   Managed a PostgreSQL database, optimizing queries and schema design for improved
    performance, resulting in a 20% reduction in average API response time.

    *   Improved system scalability through efficient code design and load balancing
    techniques.

    *   Experience using pre-trained embedding models (BERT) for natural language
    processing tasks to improve search accuracy, with focus on keyphrase extraction
    and content similarity comparison for the recommendations engine. Proficient in
    Flask.'
pipeline_tag: sentence-similarity
library_name: sentence-transformers
metrics:
- pearson_cosine
- spearman_cosine
model-index:
- name: SentenceTransformer based on sentence-transformers/all-MiniLM-L6-v2
  results:
  - task:
      type: semantic-similarity
      name: Semantic Similarity
    dataset:
      name: dev evaluation
      type: dev_evaluation
    metrics:
    - type: pearson_cosine
      value: 0.5378933775375572
      name: Pearson Cosine
    - type: spearman_cosine
      value: 0.6213226022358173
      name: Spearman Cosine
  - task:
      type: semantic-similarity
      name: Semantic Similarity
    dataset:
      name: test evaluation
      type: test_evaluation
    metrics:
    - type: pearson_cosine
      value: 0.5378933775375572
      name: Pearson Cosine
    - type: spearman_cosine
      value: 0.6213226022358173
      name: Spearman Cosine
---

# SentenceTransformer based on sentence-transformers/all-MiniLM-L6-v2

This is a [sentence-transformers](https://www.SBERT.net) model finetuned from [sentence-transformers/all-MiniLM-L6-v2](https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2). It maps sentences & paragraphs to a 384-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

## Model Details

### Model Description
- **Model Type:** Sentence Transformer
- **Base model:** [sentence-transformers/all-MiniLM-L6-v2](https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2) <!-- at revision c9745ed1d9f207416be6d2e6f8de32d1f16199bf -->
- **Maximum Sequence Length:** 256 tokens
- **Output Dimensionality:** 384 dimensions
- **Similarity Function:** Cosine Similarity
<!-- - **Training Dataset:** Unknown -->
<!-- - **Language:** Unknown -->
<!-- - **License:** Unknown -->

### Model Sources

- **Documentation:** [Sentence Transformers Documentation](https://sbert.net)
- **Repository:** [Sentence Transformers on GitHub](https://github.com/UKPLab/sentence-transformers)
- **Hugging Face:** [Sentence Transformers on Hugging Face](https://huggingface.co/models?library=sentence-transformers)

### Full Model Architecture

```
SentenceTransformer(
  (0): Transformer({'max_seq_length': 256, 'do_lower_case': False}) with Transformer model: BertModel 
  (1): Pooling({'word_embedding_dimension': 384, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)
```

## Usage

### Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

```bash
pip install -U sentence-transformers
```

Then you can load this model and run inference.
```python
from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("anass1209/resume-job-matcher-all-MiniLM-L6-v2")
# Run inference
sentences = [
    'Developed and maintained core backend services using Python and Django, focusing on scalability and efficiency. Implemented RESTful APIs for data retrieval and manipulation.  Worked extensively with PostgreSQL for data storage and retrieval.  Responsible for optimizing database queries and improving API response times.  Experience with model fine-tuning for semantic search and document retrieval using pre-trained embedding models like Sentence Transformers or similar libraries, specifically for improving the relevance of search results and document matching within the web application.  Experience using vector databases (e.g., ChromaDB, Weaviate) preferred.',
    '## Senior Backend Engineer\n\n*   **ABC Corp** | 2020 - Present\n*   Led development of a new REST API for user authentication and profile management using Python and Django.\n*   Managed a PostgreSQL database, optimizing queries and schema design for improved performance, resulting in a 20% reduction in average API response time.\n*   Improved system scalability through efficient code design and load balancing techniques.\n*   Experience using pre-trained embedding models (BERT) for natural language processing tasks to improve search accuracy, with focus on keyphrase extraction and content similarity comparison for the recommendations engine. Proficient in Flask.',
    "PhD in Computer Science, University of California, Berkeley (2018-2023). Dissertation: 'Adversarial Robustness in NLP for Cybersecurity Applications.' Focused on fine-tuning BERT for malware detection and social engineering attacks. Proficient in Python, TensorFlow, and AWS. Published in top-tier NLP and security conferences. Experienced with large datasets and model evaluation metrics.\n\nMaster of Science in Cybersecurity, Johns Hopkins University (2016-2018). Relevant coursework included Machine Learning, Data Mining, and Network Security. Developed a system for anomaly detection using a recurrent neural network (RNN). Familiar with Python and cloud computing platforms. Good understanding of NLP concepts, but limited experience fine-tuning transformer models. Strong understanding of Information Security Principles.\n\nBachelor of Science in Computer Engineering, Carnegie Mellon University (2012-2016). Relevant coursework: Artificial Intelligence, Database Management, and Software Engineering. Project experience: Developed a web application using Python. No direct experience with fine-tuning NLP models, but a strong foundation in programming and data structures.  Familiar with cloud infrastructure concepts. Possess CISSP certification.",
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 384]

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
```

<!--
### Direct Usage (Transformers)

<details><summary>Click to see the direct usage in Transformers</summary>

</details>
-->

<!--
### Downstream Usage (Sentence Transformers)

You can finetune this model on your own dataset.

<details><summary>Click to expand</summary>

</details>
-->

<!--
### Out-of-Scope Use

*List how the model may foreseeably be misused and address what users ought not to do with the model.*
-->

## Evaluation

### Metrics

#### Semantic Similarity

* Datasets: `dev_evaluation` and `test_evaluation`
* Evaluated with [<code>EmbeddingSimilarityEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.EmbeddingSimilarityEvaluator)

| Metric              | dev_evaluation | test_evaluation |
|:--------------------|:---------------|:----------------|
| pearson_cosine      | 0.5379         | 0.5379          |
| **spearman_cosine** | **0.6213**     | **0.6213**      |

<!--
## Bias, Risks and Limitations

*What are the known or foreseeable issues stemming from this model? You could also flag here known failure cases or weaknesses of the model.*
-->

<!--
### Recommendations

*What are recommendations with respect to the foreseeable issues? For example, filtering explicit content.*
-->

## Training Details

### Training Dataset

#### Unnamed Dataset

* Size: 958 training samples
* Columns: <code>sentence_0</code>, <code>sentence_1</code>, and <code>label</code>
* Approximate statistics based on the first 958 samples:
  |         | sentence_0                                                                           | sentence_1                                                                           | label                                                           |
  |:--------|:-------------------------------------------------------------------------------------|:-------------------------------------------------------------------------------------|:----------------------------------------------------------------|
  | type    | string                                                                               | string                                                                               | float                                                           |
  | details | <ul><li>min: 41 tokens</li><li>mean: 110.12 tokens</li><li>max: 234 tokens</li></ul> | <ul><li>min: 25 tokens</li><li>mean: 134.18 tokens</li><li>max: 256 tokens</li></ul> | <ul><li>min: 0.5</li><li>mean: 0.78</li><li>max: 0.96</li></ul> |
* Samples:
  | sentence_0                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                         | sentence_1                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                       | label                           |
  |:-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:--------------------------------|
  | <code>Required skills include experience with embedding models, fine-tuning techniques, Python programming, and knowledge of NLP concepts. Proficiency in libraries like TensorFlow or PyTorch is essential.  Familiarity with resume parsing and matching algorithms is a plus.  Must be able to analyze performance metrics and iterate on model improvements.</code>                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            | <code>Skills: Python, TensorFlow, NLP, Embedding Models, Fine-tuning, Resume Matching, Model Evaluation. Experienced in building and deploying machine learning models for text analysis and information retrieval. Proficient in analyzing performance using precision, recall, and F1-score to improve model accuracy.<br><br>Skills: Python, PyTorch, Natural Language Processing, Text Classification, Machine Learning. Developed several machine learning models using Python and PyTorch for various text related tasks. Good understanding of model evaluation metrics.<br><br>Technical Skills: Python, Scikit-learn, Data analysis, Data Visualization, Natural Language Processing basics. Projects include text classification and sentiment analysis. Knowledge of model evaluation techniques.<br><br>Proficient in Python. Familiar with basic machine learning concepts and libraries.  Experience with data cleaning and preprocessing. Strong analytical and problem-solving skills.<br><br>Skills: Python, Pandas, Scikit-learn, Data Preprocessing...</code> | <code>0.8882194757461548</code> |
  | <code>Experience with embedding models and fine-tuning techniques. Ability to analyze resume data and identify relevant keywords for improved matching. Proficiency in Python and experience with relevant libraries like Transformers, Sentence Transformers, and scikit-learn. Knowledge of A/B testing and evaluation metrics (precision, recall, F1-score). Understanding of product management principles and the product development lifecycle is a plus.</code>                                                                                                                                                                                                                                                                                                                                                                                                             | <code>Skills:<br>*   Python (proficient in Pandas, NumPy)<br>*   Machine Learning (basic understanding)<br>*   Data Analysis<br>*   A/B Testing (conducted tests for website optimization)<br>*   Excellent communication and presentation skills</code>                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                         | <code>0.5</code>                |
  | <code>Senior DevOps Engineer to lead the implementation and optimization of our resume matching system. Responsibilities include: Fine-tuning and evaluating embedding models (e.g., Sentence Transformers, BERT) for improved semantic similarity scoring. Developing and maintaining the infrastructure for model training, evaluation, and deployment. Collaborating with data scientists and software engineers to integrate the matching system into our platform. Monitoring model performance and identifying areas for improvement, including data augmentation strategies. Strong experience with Python, cloud platforms (AWS, GCP, or Azure), containerization (Docker, Kubernetes), and CI/CD pipelines. Must have proficiency in evaluating model performance metrics (precision, recall, F1-score, AUC) and experience with model versioning and A/B testing.</code> | <code>## Experience<br><br>**Senior DevOps Engineer** | Acme Corp | 2018 - Present<br><br>*   Spearheaded the migration of our legacy infrastructure to AWS, reducing operational costs by 30%.<br>*   Built and maintained CI/CD pipelines using Jenkins and GitLab, automating deployments and improving release frequency.<br>*   Developed and implemented monitoring solutions using Prometheus and Grafana to proactively identify and resolve performance issues.<br>*   Proficient in Python and experienced with Docker and Kubernetes.<br>*   **Relevant Project:** Improved the performance of the internal search tool, although I did not specifically work on the resume matching feature. The project included analyzing and improving the relevancy of search results using techniques to improve semantic search and understanding user intent. Familiar with evaluation metrics. A/B tested search improvements.<br>*   Actively involved in code reviews and providing technical guidance to junior engineers.</code>                                         | <code>0.8620760440826416</code> |
* Loss: [<code>CosineSimilarityLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#cosinesimilarityloss) with these parameters:
  ```json
  {
      "loss_fct": "torch.nn.modules.loss.MSELoss"
  }
  ```

### Training Hyperparameters
#### Non-Default Hyperparameters

- `eval_strategy`: steps
- `per_device_train_batch_size`: 16
- `per_device_eval_batch_size`: 16
- `num_train_epochs`: 50
- `multi_dataset_batch_sampler`: round_robin

#### All Hyperparameters
<details><summary>Click to expand</summary>

- `overwrite_output_dir`: False
- `do_predict`: False
- `eval_strategy`: steps
- `prediction_loss_only`: True
- `per_device_train_batch_size`: 16
- `per_device_eval_batch_size`: 16
- `per_gpu_train_batch_size`: None
- `per_gpu_eval_batch_size`: None
- `gradient_accumulation_steps`: 1
- `eval_accumulation_steps`: None
- `torch_empty_cache_steps`: None
- `learning_rate`: 5e-05
- `weight_decay`: 0.0
- `adam_beta1`: 0.9
- `adam_beta2`: 0.999
- `adam_epsilon`: 1e-08
- `max_grad_norm`: 1
- `num_train_epochs`: 50
- `max_steps`: -1
- `lr_scheduler_type`: linear
- `lr_scheduler_kwargs`: {}
- `warmup_ratio`: 0.0
- `warmup_steps`: 0
- `log_level`: passive
- `log_level_replica`: warning
- `log_on_each_node`: True
- `logging_nan_inf_filter`: True
- `save_safetensors`: True
- `save_on_each_node`: False
- `save_only_model`: False
- `restore_callback_states_from_checkpoint`: False
- `no_cuda`: False
- `use_cpu`: False
- `use_mps_device`: False
- `seed`: 42
- `data_seed`: None
- `jit_mode_eval`: False
- `use_ipex`: False
- `bf16`: False
- `fp16`: False
- `fp16_opt_level`: O1
- `half_precision_backend`: auto
- `bf16_full_eval`: False
- `fp16_full_eval`: False
- `tf32`: None
- `local_rank`: 0
- `ddp_backend`: None
- `tpu_num_cores`: None
- `tpu_metrics_debug`: False
- `debug`: []
- `dataloader_drop_last`: False
- `dataloader_num_workers`: 0
- `dataloader_prefetch_factor`: None
- `past_index`: -1
- `disable_tqdm`: False
- `remove_unused_columns`: True
- `label_names`: None
- `load_best_model_at_end`: False
- `ignore_data_skip`: False
- `fsdp`: []
- `fsdp_min_num_params`: 0
- `fsdp_config`: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- `tp_size`: 0
- `fsdp_transformer_layer_cls_to_wrap`: None
- `accelerator_config`: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- `deepspeed`: None
- `label_smoothing_factor`: 0.0
- `optim`: adamw_torch
- `optim_args`: None
- `adafactor`: False
- `group_by_length`: False
- `length_column_name`: length
- `ddp_find_unused_parameters`: None
- `ddp_bucket_cap_mb`: None
- `ddp_broadcast_buffers`: False
- `dataloader_pin_memory`: True
- `dataloader_persistent_workers`: False
- `skip_memory_metrics`: True
- `use_legacy_prediction_loop`: False
- `push_to_hub`: False
- `resume_from_checkpoint`: None
- `hub_model_id`: None
- `hub_strategy`: every_save
- `hub_private_repo`: None
- `hub_always_push`: False
- `gradient_checkpointing`: False
- `gradient_checkpointing_kwargs`: None
- `include_inputs_for_metrics`: False
- `include_for_metrics`: []
- `eval_do_concat_batches`: True
- `fp16_backend`: auto
- `push_to_hub_model_id`: None
- `push_to_hub_organization`: None
- `mp_parameters`: 
- `auto_find_batch_size`: False
- `full_determinism`: False
- `torchdynamo`: None
- `ray_scope`: last
- `ddp_timeout`: 1800
- `torch_compile`: False
- `torch_compile_backend`: None
- `torch_compile_mode`: None
- `include_tokens_per_second`: False
- `include_num_input_tokens_seen`: False
- `neftune_noise_alpha`: None
- `optim_target_modules`: None
- `batch_eval_metrics`: False
- `eval_on_start`: False
- `use_liger_kernel`: False
- `eval_use_gather_object`: False
- `average_tokens_across_devices`: False
- `prompts`: None
- `batch_sampler`: batch_sampler
- `multi_dataset_batch_sampler`: round_robin

</details>

### Training Logs
| Epoch   | Step | Training Loss | dev_evaluation_spearman_cosine | test_evaluation_spearman_cosine |
|:-------:|:----:|:-------------:|:------------------------------:|:-------------------------------:|
| 1.0     | 60   | -             | 0.4867                         | -                               |
| 2.0     | 120  | -             | 0.5612                         | -                               |
| 3.0     | 180  | -             | 0.5929                         | -                               |
| 4.0     | 240  | -             | 0.6229                         | -                               |
| 5.0     | 300  | -             | 0.6377                         | -                               |
| 6.0     | 360  | -             | 0.6434                         | -                               |
| 7.0     | 420  | -             | 0.6104                         | -                               |
| 8.0     | 480  | -             | 0.6064                         | -                               |
| 8.3333  | 500  | 0.0122        | -                              | -                               |
| 9.0     | 540  | -             | 0.6005                         | -                               |
| 10.0    | 600  | -             | 0.6064                         | -                               |
| 11.0    | 660  | -             | 0.5973                         | -                               |
| 12.0    | 720  | -             | 0.6097                         | -                               |
| 13.0    | 780  | -             | 0.5907                         | -                               |
| 14.0    | 840  | -             | 0.5870                         | -                               |
| 15.0    | 900  | -             | 0.5989                         | -                               |
| 16.0    | 960  | -             | 0.6018                         | -                               |
| 16.6667 | 1000 | 0.0019        | -                              | -                               |
| 17.0    | 1020 | -             | 0.6208                         | -                               |
| 18.0    | 1080 | -             | 0.6133                         | -                               |
| 19.0    | 1140 | -             | 0.6200                         | -                               |
| 20.0    | 1200 | -             | 0.5960                         | -                               |
| 21.0    | 1260 | -             | 0.5999                         | -                               |
| 22.0    | 1320 | -             | 0.5995                         | -                               |
| 23.0    | 1380 | -             | 0.6177                         | -                               |
| 24.0    | 1440 | -             | 0.6201                         | -                               |
| 25.0    | 1500 | 0.0009        | 0.6110                         | -                               |
| 26.0    | 1560 | -             | 0.6184                         | -                               |
| 27.0    | 1620 | -             | 0.6133                         | -                               |
| 28.0    | 1680 | -             | 0.6287                         | -                               |
| 29.0    | 1740 | -             | 0.6200                         | -                               |
| 30.0    | 1800 | -             | 0.6272                         | -                               |
| 31.0    | 1860 | -             | 0.6222                         | -                               |
| 32.0    | 1920 | -             | 0.6199                         | -                               |
| 33.0    | 1980 | -             | 0.6141                         | -                               |
| 33.3333 | 2000 | 0.0006        | -                              | -                               |
| 34.0    | 2040 | -             | 0.6228                         | -                               |
| 35.0    | 2100 | -             | 0.6275                         | -                               |
| 36.0    | 2160 | -             | 0.6167                         | -                               |
| 37.0    | 2220 | -             | 0.6140                         | -                               |
| 38.0    | 2280 | -             | 0.6217                         | -                               |
| 39.0    | 2340 | -             | 0.6280                         | -                               |
| 40.0    | 2400 | -             | 0.6254                         | -                               |
| 41.0    | 2460 | -             | 0.6186                         | -                               |
| 41.6667 | 2500 | 0.0005        | -                              | -                               |
| 42.0    | 2520 | -             | 0.6185                         | -                               |
| 43.0    | 2580 | -             | 0.6242                         | -                               |
| 44.0    | 2640 | -             | 0.6183                         | -                               |
| 45.0    | 2700 | -             | 0.6213                         | -                               |
| 46.0    | 2760 | -             | 0.6220                         | -                               |
| 47.0    | 2820 | -             | 0.6213                         | -                               |
| 48.0    | 2880 | -             | 0.6213                         | -                               |
| 49.0    | 2940 | -             | 0.6214                         | -                               |
| 50.0    | 3000 | 0.0004        | 0.6213                         | -                               |
| -1      | -1   | -             | -                              | 0.6213                          |


### Framework Versions
- Python: 3.11.11
- Sentence Transformers: 4.1.0
- Transformers: 4.51.1
- PyTorch: 2.5.1+cu124
- Accelerate: 1.3.0
- Datasets: 3.5.0
- Tokenizers: 0.21.0

## Citation

### BibTeX

#### Sentence Transformers
```bibtex
@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}
```

<!--
## Glossary

*Clearly define terms in order to be accessible across audiences.*
-->

<!--
## Model Card Authors

*Lists the people who create the model card, providing recognition and accountability for the detailed work that goes into its construction.*
-->

<!--
## Model Card Contact

*Provides a way for people who have updates to the Model Card, suggestions, or questions, to contact the Model Card authors.*
-->