drAbreu committed on
Commit
a20162b
·
verified ·
1 Parent(s): 38581f1

Update README.md

Files changed (1)
  1. README.md +5 -8
README.md CHANGED
@@ -21,7 +21,7 @@ metrics:
21
 
22
  ## Model Description
23
 
24
- SODA-VEC embedding model trained with VICReg Exact loss function. This model implements the exact VICReg objective with invariance, variance, and covariance terms for biomedical text embeddings.
25
 
26
  This model is part of the **SODA-VEC** (Scientific Open Domain Adaptation for Vector Embeddings) project, which focuses on creating high-quality embedding models for biomedical and life sciences text.
27
 
@@ -42,7 +42,7 @@ This model is part of the **SODA-VEC** (Scientific Open Domain Adaptation for Ve
42
 
43
  ### Training Procedure
44
 
45
- **Loss Function**: VICReg Exact: exact VICReg objective with invariance (MSE), variance (std), and covariance losses
46
 
47
  **Coefficients**: sim=25.0, std=25.0, cov=1.0
48
  **Base Model**: `answerdotai/ModernBERT-base`
@@ -135,7 +135,7 @@ The model has been evaluated on comprehensive biomedical benchmarks including:
135
  - **Field-Specific Separability**: Distinguishing between different biological fields
136
  - **Semantic Search**: Retrieval quality on biomedical text corpora
137
 
138
- For detailed evaluation results, see the [SODA-VEC benchmark notebooks](https://github.com/EMBO/soda-vec).
139
 
140
  ## Intended Use
141
 
@@ -143,9 +143,6 @@ This model is designed for:
143
 
144
  - **Biomedical Semantic Search**: Finding relevant papers, abstracts, or text passages
145
  - **Scientific Text Similarity**: Computing similarity between biomedical texts
146
- - **Information Retrieval**: Building search systems for scientific literature
147
- - **Downstream Tasks**: As a base for fine-tuning on specific biomedical tasks
148
- - **Research Applications**: Academic and research use in life sciences
149
 
150
  ## Limitations
151
 
@@ -163,13 +160,13 @@ If you use this model, please cite:
163
  title = {SODA-VEC: Scientific Open Domain Adaptation for Vector Embeddings},
164
  author = {EMBO},
165
  year = {2024},
166
- url = {https://github.com/EMBO/soda-vec}
167
  }
168
  ```
169
 
170
  ## Model Card Contact
171
 
172
- For questions or issues, please open an issue on the [SODA-VEC GitHub repository](https://github.com/EMBO/soda-vec).
173
 
174
  ---
175
 
 
21
 
22
  ## Model Description
23
 
24
+ SODA-VEC embedding model trained with [VICReg](https://arxiv.org/pdf/2105.04906) Exact loss function. This model implements the exact VICReg objective with invariance, variance, and covariance terms for biomedical text embeddings.
25
 
26
  This model is part of the **SODA-VEC** (Scientific Open Domain Adaptation for Vector Embeddings) project, which focuses on creating high-quality embedding models for biomedical and life sciences text.
27
 
 
42
 
43
  ### Training Procedure
44
 
45
+ **Loss Function**: VICReg Exact: exact [VICReg](https://arxiv.org/pdf/2105.04906) objective with invariance (MSE), variance (std), and covariance losses
46
 
47
  **Coefficients**: sim=25.0, std=25.0, cov=1.0
48
  **Base Model**: `answerdotai/ModernBERT-base`
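
The loss line in this hunk names three terms and three coefficients. As a reading aid, here is a minimal, dependency-free sketch of how those terms combine in the VICReg objective described in the linked paper. It is not the SODA-VEC training code. The function names, the hinge target `gamma=1.0`, and `eps=1e-4` are assumptions taken from the paper's defaults, and reading `sim`, `std`, and `cov` as the invariance, variance, and covariance weights is also an assumption.

```python
import math


def _centered(z):
    """Center each embedding dimension of a batch z (list of n rows, d columns)."""
    n, d = len(z), len(z[0])
    means = [sum(row[j] for row in z) / n for j in range(d)]
    return [[row[j] - means[j] for j in range(d)] for row in z]


def invariance(za, zb):
    """Mean squared error between the two views, averaged over all elements."""
    n, d = len(za), len(za[0])
    total = sum((a - b) ** 2 for ra, rb in zip(za, zb) for a, b in zip(ra, rb))
    return total / (n * d)


def variance(z, gamma=1.0, eps=1e-4):
    """Hinge on the per-dimension standard deviation (assumed target: gamma)."""
    n, d = len(z), len(z[0])
    c = _centered(z)
    # Unbiased variance of each dimension across the batch.
    var = [sum(row[j] ** 2 for row in c) / (n - 1) for j in range(d)]
    return sum(max(0.0, gamma - math.sqrt(v + eps)) for v in var) / d


def covariance(z):
    """Sum of squared off-diagonal covariance entries, divided by d."""
    n, d = len(z), len(z[0])
    c = _centered(z)
    total = 0.0
    for j in range(d):
        for k in range(d):
            if j != k:
                cov_jk = sum(row[j] * row[k] for row in c) / (n - 1)
                total += cov_jk ** 2
    return total / d


def vicreg_loss(za, zb, sim=25.0, std=25.0, cov=1.0):
    """Weighted sum of the three terms; default weights mirror the README."""
    return (
        sim * invariance(za, zb)
        + std * (variance(za) + variance(zb))
        + cov * (covariance(za) + covariance(zb))
    )


# Two identical toy "views" of a batch of four 2-dimensional embeddings.
batch = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0], [0.0, -1.0]]
loss = vicreg_loss(batch, batch)
```

With identical views the invariance term vanishes, and because the two toy dimensions are uncorrelated the covariance term vanishes too, so only the variance hinge contributes to `loss`. Some reference implementations average the variance term over the two branches instead of summing it, which halves its effective weight.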
 
135
  - **Field-Specific Separability**: Distinguishing between different biological fields
136
  - **Semantic Search**: Retrieval quality on biomedical text corpora
137
 
138
+ For detailed evaluation results, see the [SODA-VEC benchmark notebooks](https://github.com/source-data/soda-vec).
139
 
140
  ## Intended Use
141
 
 
143
 
144
  - **Biomedical Semantic Search**: Finding relevant papers, abstracts, or text passages
145
  - **Scientific Text Similarity**: Computing similarity between biomedical texts
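
The two uses kept in this list both come down to comparing embedding vectors. A minimal sketch of that comparison follows, assuming cosine similarity as the metric (the README does not say which metric to use) and using toy 3-dimensional vectors in place of real model output. The helper names are hypothetical.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two non-zero vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def rank_by_similarity(query, passages):
    """Return passage indices ordered from most to least similar to the query."""
    scores = [cosine_similarity(query, p) for p in passages]
    return sorted(range(len(passages)), key=lambda i: scores[i], reverse=True)


# Toy vectors standing in for embeddings of a query and three passages.
query = [1.0, 0.0, 0.0]
passages = [[0.0, 1.0, 0.0], [0.9, 0.1, 0.0], [-1.0, 0.0, 0.0]]
ranking = rank_by_similarity(query, passages)
```

In practice the vectors would come from encoding text with the model, and `ranking` would index into the list of candidate passages.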
 
 
 
146
 
147
  ## Limitations
148
 
 
160
  title = {SODA-VEC: Scientific Open Domain Adaptation for Vector Embeddings},
161
  author = {EMBO},
162
  year = {2024},
163
+ url = {https://github.com/source-data/soda-vec}
164
  }
165
  ```
166
 
167
  ## Model Card Contact
168
 
169
+ For questions or issues, please open an issue on the [SODA-VEC GitHub repository](https://github.com/source-data/soda-vec).
170
 
171
  ---
172