lgsilvaesilva committed on
Commit 4385852 · verified · 1 Parent(s): 9ad5336

Push model using huggingface_hub.

.gitattributes CHANGED
@@ -33,3 +33,5 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
  *.zip filter=lfs diff=lfs merge=lfs -text
  *.zst filter=lfs diff=lfs merge=lfs -text
  *tfevents* filter=lfs diff=lfs merge=lfs -text
+ tokenizer.json filter=lfs diff=lfs merge=lfs -text
+ unigram.json filter=lfs diff=lfs merge=lfs -text
1_Pooling/config.json ADDED
@@ -0,0 +1,10 @@
+ {
+   "word_embedding_dimension": 384,
+   "pooling_mode_cls_token": false,
+   "pooling_mode_mean_tokens": true,
+   "pooling_mode_max_tokens": false,
+   "pooling_mode_mean_sqrt_len_tokens": false,
+   "pooling_mode_weightedmean_tokens": false,
+   "pooling_mode_lasttoken": false,
+   "include_prompt": true
+ }
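The pooling config enables only `pooling_mode_mean_tokens`, so the sentence embedding is presumably the average of the token vectors that are not padding. The following is a minimal pure-Python sketch of that idea, not the library's implementation; the function name `mean_pool` and the toy 2-dimensional vectors are assumptions for illustration (the real vectors have 384 dimensions).

```python
from typing import List


def mean_pool(token_embeddings: List[List[float]], attention_mask: List[int]) -> List[float]:
    """Average the embeddings of the tokens whose mask value is 1."""
    if len(token_embeddings) != len(attention_mask):
        raise ValueError("one mask value is required per token")
    # Drop padding positions (mask == 0) before averaging.
    kept = [vec for vec, keep in zip(token_embeddings, attention_mask) if keep]
    if not kept:
        raise ValueError("attention mask selects no tokens")
    dim = len(kept[0])
    return [sum(vec[i] for vec in kept) / len(kept) for i in range(dim)]


# Toy input (hypothetical): three tokens, the last one is padding.
tokens = [[1.0, 2.0], [3.0, 4.0], [100.0, 100.0]]
mask = [1, 1, 0]
pooled = mean_pool(tokens, mask)
```

Because the padding vector is masked out, it does not influence the pooled result.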
README.md ADDED
@@ -0,0 +1,377 @@
+ ---
+ tags:
+ - setfit
+ - sentence-transformers
+ - text-classification
+ - generated_from_setfit_trainer
+ widget:
+ - text: To monitor market dynamics and inform policy responses, the government will
+     track the retail value of ultra-processed foods and analyze shifts in consumption
+     in relation to labeling and advertising reforms. Data from these analyses will
+     feed annual dashboards that link labeling density, promotional intensity, and
+     dietary outcomes to guide targeted interventions and budget planning.
+ - text: the national agricultural plan is a national sectoral plan of grenada of 2015-2030.
+     its main goal is to stimulate economic growth in the agriculture sector through
+     the development of a well-coordinated planning and implementation framework that
+     is interactive and effective, and involve the full participation of the stakeholders,
+     and which promotes food security, income generation and poverty alleviation. in
+     the area of food security, the document aims to reduce dependence on food imports
+     and imported staples in particular and increase availability of local fresh and
+     fresh processed products; increase economic access to food by vulnerable persons
+     and their capacity to address their food and nutrition needs; and to improve the
+     health status and wellbeing of the grenadians through the consumption of nutritious
+     and safe foods. the plan also seeks to make agriculture, forestry and fisheries
+     more productive and sustainable. specifically, it envisions to build climate resilience
+     to avoid, prevent, or minimize climate change impacts on agriculture (including
+     forestry and fisheries), the environment and biodiversity; improve preparedness
+     for climate change impacts and extreme events; enhance the country’s response
+     capacity in case of extremes; facilitate recovery from impacts and extremes; and
+     reduce the impact of land based agriculture on climate change and the environment;
+     and preserve and optimize resources (land, sea, genetic). moreover, the document
+     aims to reduce rural poverty. in particular, it provides for making additional
+     investments in economic infrastructure for increased contribution of the agricultural
+     sector to economic growth, poverty alleviation and environmental sustainability.
+     further, the plan targets to increase exports of traditional crops, fish, fruits,
+     vegetables, root crops, minor spices, and value added products to international
+     and regional markets; increase production of targeted fruits, vegetables, root
+     crops, herbs and minor spices for targeted domestic markets; make additional investments
+     in institutional and human resource capacity development in the agricultural sector
+     to improve governance and efficiency; achieve greater collaboration in regional
+     and international trade for agricultural products; create framework for donor
+     and development partner coordination in providing support for the agriculture
+     sector; leverage opportunities in the tourism sector to strengthen the linkage
+     between agriculture and tourism; and invest in upgrading agricultural research
+     and development capacity. institutional responsibility for the implementation
+     of the plan is with the ministry of agriculture, lands, forestry, fisheries and
+     the environment. the minister will be obligated to report to the cabinet and parliament
+     on progress in the implementation of the plan. it is expected that the plan will
+     be incorporated into the national sustainable development plan 2030 (nsdp2030).
+     the ministry through the permanent secretary will be expected to report to the
+     monitoring committee of the nsdp2030 on a monthly basis on progress in implementation.
+     the reports to the cabinet will be submitted biannually.
+ - text: 'the seven key objectives are: 1. improve coordination in the sector to successfully
+     implement the fruit and vegetable strategy 2. improve market intelligence, promotion
+     and dissemination across the whole value chain 3. build a supply sub sector that
+     can guarantee consistent quality and supply of fresh fruit and vegetables 4. build
+     a sector that is well trained and supported by a comprehensive and properly executed
+     capability plan 5. improve financial situation of sector farmers and enterprises
+     6. promote integrated management of resources to ensure sustainability of the
+     fruit and vegetable sector 7. strengthen samoa association for manufacturers and
+     exporters (same) to provide services that will increase returns and overall value
+     addition for sector'
+ - text: Trade facilitation should be aligned with nutrition security and rural development
+     by prioritizing critical food and input imports, harmonizing rules of origin with
+     neighboring economies, and strengthening transit corridors to support small producers.
+     Progress indicators include the ratio of food imports to merchandise imports and
+     the share of agricultural raw materials imports, alongside the incidence of firms
+     naming customs and trade regulations as top obstacles (6.6.3.3).
+ - text: 1. general objectives striving to be a developing country with modern industry
+     and high middle income by 2030; have a modern, competitive, effective and effective
+     management institution; the economy develops dynamically, quickly and sustainably,
+     independently and autonomously on the basis of science, technology and innovation
+     in association with improving efficiency in external activities and international
+     integration; arousing the aspiration to develop the country, promoting the creativity,
+     will and strength of the whole nation, building a prosperous, democratic, fair,
+     civilized, orderly, disciplined and safe society, ensuring a peaceful and happy
+     life of the people; constantly improve all aspects of people's lives; firmly protect
+     the fatherland, a peaceful and stable environment for national development; improve
+     vietnam's position and prestige in the international arena. striving to become
+     a developed and high-income country by 2045. 2. principal indicators a) regarding
+     the economy - the average growth rate of gross domestic product (gdp) is about
+     7%/year; gdp per capita at current prices by 2030 will reach about 7,500 usd3.
+     - the proportion of the processing and manufacturing industry will reach about
+     30% of gdp, and the digital economy will reach about 30% of gdp. - the urbanization
+     rate will reach over 50%. - the average total social investment will reach 33-35%
+     of gdp; public debt does not exceed 60% of gdp. - the contribution of total factor
+     productivity (tfp) to growth reached 50%. - the average growth rate of social
+     labor productivity will reach over 6.5%/year. - reduce energy consumption per
+     unit of gdp at 1-1.5%/year. b) regarding social - the human development index
+     (hdi) remained above 0.74. - the average life expectancy is 75 years, of which
+     the healthy life span is at least 68 years. - the percentage of trained workers
+     with degrees and certificates reaches 35-40%. - the proportion of agricultural
+     labor in the total social labor force will decrease to less than 20%. c) regarding
+     the environment - the forest cover rate is stable at 42%. - the rate of treatment
+     and reuse of wastewater into the river basin environment will reach over 70%.
+     - reduce greenhouse gas emissions by 9%5. - 100% of production and business establishments
+     meet environmental standards. - to increase the area of marine and coastal protected
+     areas to 3-5% of the natural area of national waters.
+ metrics:
+ - accuracy
+ pipeline_tag: text-classification
+ library_name: setfit
+ inference: false
+ base_model: sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2
+ model-index:
+ - name: SetFit with sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2
+   results:
+   - task:
+       type: text-classification
+       name: Text Classification
+     dataset:
+       name: Unknown
+       type: unknown
+       split: test
+     metrics:
+     - type: accuracy
+       value: 0.2326797385620915
+       name: Accuracy
+ ---
+
+ # SetFit with sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2
+
+ This is a [SetFit](https://github.com/huggingface/setfit) model that can be used for Text Classification. This SetFit model uses [sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2](https://huggingface.co/sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2) as the Sentence Transformer embedding model. A OneVsRestClassifier instance is used for classification.
+
+ The model has been trained using an efficient few-shot learning technique that involves:
+
+ 1. Fine-tuning a [Sentence Transformer](https://www.sbert.net) with contrastive learning.
+ 2. Training a classification head with features from the fine-tuned Sentence Transformer.
+
+ ## Model Details
+
+ ### Model Description
+ - **Model Type:** SetFit
+ - **Sentence Transformer body:** [sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2](https://huggingface.co/sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2)
+ - **Classification head:** a OneVsRestClassifier instance
+ - **Maximum Sequence Length:** 128 tokens
+ <!-- - **Number of Classes:** Unknown -->
+ <!-- - **Training Dataset:** [Unknown](https://huggingface.co/datasets/unknown) -->
+ <!-- - **Language:** Unknown -->
+ <!-- - **License:** Unknown -->
+
+ ### Model Sources
+
+ - **Repository:** [SetFit on GitHub](https://github.com/huggingface/setfit)
+ - **Paper:** [Efficient Few-Shot Learning Without Prompts](https://arxiv.org/abs/2209.11055)
+ - **Blogpost:** [SetFit: Efficient Few-Shot Learning Without Prompts](https://huggingface.co/blog/setfit)
+
+ ## Evaluation
+
+ ### Metrics
+ | Label | Accuracy |
+ |:--------|:---------|
+ | **all** | 0.2327 |
+
+ ## Uses
+
+ ### Direct Use for Inference
+
+ First install the SetFit library:
+
+ ```bash
+ pip install setfit
+ ```
+
+ Then you can load this model and run inference.
+
+ ```python
+ from setfit import SetFitModel
+
+ # Download from the 🤗 Hub
+ model = SetFitModel.from_pretrained("faodl/model_cca_multilabel_MiniLM-L12-v01")
+ # Run inference
+ preds = model("To monitor market dynamics and inform policy responses, the government will track the retail value of ultra-processed foods and analyze shifts in consumption in relation to labeling and advertising reforms. Data from these analyses will feed annual dashboards that link labeling density, promotional intensity, and dietary outcomes to guide targeted interventions and budget planning.")
+ ```
+
+ <!--
+ ### Downstream Use
+
+ *List how someone could finetune this model on their own dataset.*
+ -->
+
+ <!--
+ ### Out-of-Scope Use
+
+ *List how the model may foreseeably be misused and address what users ought not to do with the model.*
+ -->
+
+ <!--
+ ## Bias, Risks and Limitations
+
+ *What are the known or foreseeable issues stemming from this model? You could also flag here known failure cases or weaknesses of the model.*
+ -->
+
+ <!--
+ ### Recommendations
+
+ *What are recommendations with respect to the foreseeable issues? For example, filtering explicit content.*
+ -->
+
+ ## Training Details
+
+ ### Training Set Metrics
+ | Training set | Min | Median | Max |
+ |:-------------|:----|:---------|:----|
+ | Word count | 1 | 123.6200 | 951 |
+
+ ### Training Hyperparameters
+ - batch_size: (32, 32)
+ - num_epochs: (2, 2)
+ - max_steps: -1
+ - sampling_strategy: oversampling
+ - num_iterations: 20
+ - body_learning_rate: (2e-05, 2e-05)
+ - head_learning_rate: 2e-05
+ - loss: CosineSimilarityLoss
+ - distance_metric: cosine_distance
+ - margin: 0.25
+ - end_to_end: False
+ - use_amp: False
+ - warmup_proportion: 0.1
+ - l2_weight: 0.01
+ - seed: 42
+ - eval_max_steps: -1
+ - load_best_model_at_end: False
+
+ ### Training Results
+ | Epoch | Step | Training Loss | Validation Loss |
+ |:------:|:----:|:-------------:|:---------------:|
+ | 0.0011 | 1 | 0.1892 | - |
+ | 0.0566 | 50 | 0.192 | - |
+ | 0.1131 | 100 | 0.1681 | - |
+ | 0.1697 | 150 | 0.1518 | - |
+ | 0.2262 | 200 | 0.1361 | - |
+ | 0.2828 | 250 | 0.1389 | - |
+ | 0.3394 | 300 | 0.1321 | - |
+ | 0.3959 | 350 | 0.1297 | - |
+ | 0.4525 | 400 | 0.1236 | - |
+ | 0.5090 | 450 | 0.1116 | - |
+ | 0.5656 | 500 | 0.1194 | - |
+ | 0.6222 | 550 | 0.1105 | - |
+ | 0.6787 | 600 | 0.1047 | - |
+ | 0.7353 | 650 | 0.1124 | - |
+ | 0.7919 | 700 | 0.1069 | - |
+ | 0.8484 | 750 | 0.108 | - |
+ | 0.9050 | 800 | 0.1072 | - |
+ | 0.9615 | 850 | 0.1011 | - |
+ | 1.0181 | 900 | 0.098 | - |
+ | 1.0747 | 950 | 0.0893 | - |
+ | 1.1312 | 1000 | 0.0979 | - |
+ | 1.1878 | 1050 | 0.0967 | - |
+ | 1.2443 | 1100 | 0.0887 | - |
+ | 1.3009 | 1150 | 0.0908 | - |
+ | 1.3575 | 1200 | 0.0906 | - |
+ | 1.4140 | 1250 | 0.0869 | - |
+ | 1.4706 | 1300 | 0.0873 | - |
+ | 1.5271 | 1350 | 0.0943 | - |
+ | 1.5837 | 1400 | 0.0886 | - |
+ | 1.6403 | 1450 | 0.0911 | - |
+ | 1.6968 | 1500 | 0.0832 | - |
+ | 1.7534 | 1550 | 0.0859 | - |
+ | 1.8100 | 1600 | 0.0862 | - |
+ | 1.8665 | 1650 | 0.09 | - |
+ | 1.9231 | 1700 | 0.0836 | - |
+ | 1.9796 | 1750 | 0.0884 | - |
+ | 0.0006 | 1 | 0.0898 | - |
+ | 0.0283 | 50 | 0.09 | - |
+ | 0.0566 | 100 | 0.091 | - |
+ | 0.0849 | 150 | 0.0905 | - |
+ | 0.1132 | 200 | 0.085 | - |
+ | 0.1415 | 250 | 0.0862 | - |
+ | 0.1698 | 300 | 0.0915 | - |
+ | 0.1981 | 350 | 0.0865 | - |
+ | 0.2264 | 400 | 0.0873 | - |
+ | 0.2547 | 450 | 0.0897 | - |
+ | 0.2830 | 500 | 0.0906 | - |
+ | 0.3113 | 550 | 0.096 | - |
+ | 0.3396 | 600 | 0.0886 | - |
+ | 0.3679 | 650 | 0.0831 | - |
+ | 0.3962 | 700 | 0.0852 | - |
+ | 0.4244 | 750 | 0.0858 | - |
+ | 0.4527 | 800 | 0.0831 | - |
+ | 0.4810 | 850 | 0.0858 | - |
+ | 0.5093 | 900 | 0.0898 | - |
+ | 0.5376 | 950 | 0.0866 | - |
+ | 0.5659 | 1000 | 0.0836 | - |
+ | 0.5942 | 1050 | 0.0809 | - |
+ | 0.6225 | 1100 | 0.0838 | - |
+ | 0.6508 | 1150 | 0.0845 | - |
+ | 0.6791 | 1200 | 0.0803 | - |
+ | 0.7074 | 1250 | 0.0831 | - |
+ | 0.7357 | 1300 | 0.0799 | - |
+ | 0.7640 | 1350 | 0.0853 | - |
+ | 0.7923 | 1400 | 0.0786 | - |
+ | 0.8206 | 1450 | 0.0763 | - |
+ | 0.8489 | 1500 | 0.0795 | - |
+ | 0.8772 | 1550 | 0.08 | - |
+ | 0.9055 | 1600 | 0.0786 | - |
+ | 0.9338 | 1650 | 0.0759 | - |
+ | 0.9621 | 1700 | 0.0817 | - |
+ | 0.9904 | 1750 | 0.0712 | - |
+ | 1.0187 | 1800 | 0.0703 | - |
+ | 1.0470 | 1850 | 0.0702 | - |
+ | 1.0753 | 1900 | 0.0704 | - |
+ | 1.1036 | 1950 | 0.0759 | - |
+ | 1.1319 | 2000 | 0.0716 | - |
+ | 1.1602 | 2050 | 0.0714 | - |
+ | 1.1885 | 2100 | 0.0698 | - |
+ | 1.2168 | 2150 | 0.0734 | - |
+ | 1.2450 | 2200 | 0.0717 | - |
+ | 1.2733 | 2250 | 0.0671 | - |
+ | 1.3016 | 2300 | 0.0681 | - |
+ | 1.3299 | 2350 | 0.072 | - |
+ | 1.3582 | 2400 | 0.0685 | - |
+ | 1.3865 | 2450 | 0.0702 | - |
+ | 1.4148 | 2500 | 0.0673 | - |
+ | 1.4431 | 2550 | 0.0698 | - |
+ | 1.4714 | 2600 | 0.0667 | - |
+ | 1.4997 | 2650 | 0.0658 | - |
+ | 1.5280 | 2700 | 0.0759 | - |
+ | 1.5563 | 2750 | 0.067 | - |
+ | 1.5846 | 2800 | 0.0777 | - |
+ | 1.6129 | 2850 | 0.0699 | - |
+ | 1.6412 | 2900 | 0.0773 | - |
+ | 1.6695 | 2950 | 0.0704 | - |
+ | 1.6978 | 3000 | 0.0731 | - |
+ | 1.7261 | 3050 | 0.0682 | - |
+ | 1.7544 | 3100 | 0.0684 | - |
+ | 1.7827 | 3150 | 0.0628 | - |
+ | 1.8110 | 3200 | 0.0689 | - |
+ | 1.8393 | 3250 | 0.068 | - |
+ | 1.8676 | 3300 | 0.0652 | - |
+ | 1.8959 | 3350 | 0.0714 | - |
+ | 1.9242 | 3400 | 0.0714 | - |
+ | 1.9525 | 3450 | 0.0701 | - |
+ | 1.9808 | 3500 | 0.0644 | - |
+
+ ### Framework Versions
+ - Python: 3.12.12
+ - SetFit: 1.1.3
+ - Sentence Transformers: 5.1.1
+ - Transformers: 4.57.1
+ - PyTorch: 2.8.0+cu126
+ - Datasets: 4.0.0
+ - Tokenizers: 0.22.1
+
+ ## Citation
+
+ ### BibTeX
+ ```bibtex
+ @article{https://doi.org/10.48550/arxiv.2209.11055,
+     doi = {10.48550/ARXIV.2209.11055},
+     url = {https://arxiv.org/abs/2209.11055},
+     author = {Tunstall, Lewis and Reimers, Nils and Jo, Unso Eun Seo and Bates, Luke and Korat, Daniel and Wasserblat, Moshe and Pereg, Oren},
+     keywords = {Computation and Language (cs.CL), FOS: Computer and information sciences, FOS: Computer and information sciences},
+     title = {Efficient Few-Shot Learning Without Prompts},
+     publisher = {arXiv},
+     year = {2022},
+     copyright = {Creative Commons Attribution 4.0 International}
+ }
+ ```
+
+ <!--
+ ## Glossary
+
+ *Clearly define terms in order to be accessible across audiences.*
+ -->
+
+ <!--
+ ## Model Card Authors
+
+ *Lists the people who create the model card, providing recognition and accountability for the detailed work that goes into its construction.*
+ -->
+
+ <!--
+ ## Model Card Contact
+
+ *Provides a way for people who have updates to the Model Card, suggestions, or questions, to contact the Model Card authors.*
+ -->
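The Training Results table in the README restarts its epoch counter partway through (after step 1750 the epoch drops back to 0.0006), which suggests it logs two separate runs. As a rough check, steps per epoch can be estimated by dividing a logged step by its epoch value. The sketch below does this with rows copied from the table; the helper name `estimate_steps_per_epoch` is an assumption, and the logged epochs are rounded to four decimals, so the result is approximate.

```python
# (epoch, step) pairs copied from the Training Results table in the README.
first_run = [(0.0566, 50), (0.9615, 850), (1.9796, 1750)]
second_run = [(0.0283, 50), (0.9904, 1750), (1.9808, 3500)]


def estimate_steps_per_epoch(rows):
    """Estimate steps per epoch from the last logged (epoch, step) row."""
    epoch, step = rows[-1]
    return round(step / epoch)


first_steps = estimate_steps_per_epoch(first_run)
second_steps = estimate_steps_per_epoch(second_run)
```

Under this estimate the second run takes roughly twice as many steps per epoch as the first. The README does not say why, so the cause is left open here.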
config.json ADDED
@@ -0,0 +1,25 @@
+ {
+   "architectures": [
+     "BertModel"
+   ],
+   "attention_probs_dropout_prob": 0.1,
+   "classifier_dropout": null,
+   "dtype": "float32",
+   "gradient_checkpointing": false,
+   "hidden_act": "gelu",
+   "hidden_dropout_prob": 0.1,
+   "hidden_size": 384,
+   "initializer_range": 0.02,
+   "intermediate_size": 1536,
+   "layer_norm_eps": 1e-12,
+   "max_position_embeddings": 512,
+   "model_type": "bert",
+   "num_attention_heads": 12,
+   "num_hidden_layers": 12,
+   "pad_token_id": 0,
+   "position_embedding_type": "absolute",
+   "transformers_version": "4.57.1",
+   "type_vocab_size": 2,
+   "use_cache": true,
+   "vocab_size": 250037
+ }
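A few derived quantities follow directly from the values in `config.json`. The sketch below computes the per-head dimension, the feed-forward expansion ratio, and the size of the word-embedding matrix. It assumes the usual BERT layout, in which the hidden size is split evenly across attention heads and the word-embedding matrix has shape `vocab_size × hidden_size`.

```python
# Values copied from config.json above.
config = {
    "hidden_size": 384,
    "num_attention_heads": 12,
    "intermediate_size": 1536,
    "vocab_size": 250037,
}

# The hidden size must divide evenly across the attention heads.
assert config["hidden_size"] % config["num_attention_heads"] == 0

head_dim = config["hidden_size"] // config["num_attention_heads"]
ffn_ratio = config["intermediate_size"] // config["hidden_size"]
word_embedding_params = config["vocab_size"] * config["hidden_size"]
```

With a 250,037-entry multilingual vocabulary, the word-embedding matrix alone accounts for about 96 million parameters, far more than any other single component implied by these dimensions.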
config_sentence_transformers.json ADDED
@@ -0,0 +1,14 @@
+ {
+   "__version__": {
+     "sentence_transformers": "5.1.1",
+     "transformers": "4.57.1",
+     "pytorch": "2.8.0+cu126"
+   },
+   "model_type": "SentenceTransformer",
+   "prompts": {
+     "query": "",
+     "document": ""
+   },
+   "default_prompt_name": null,
+   "similarity_fn_name": "cosine"
+ }
config_setfit.json ADDED
@@ -0,0 +1,4 @@
+ {
+   "normalize_embeddings": false,
+   "labels": null
+ }
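Since `"labels"` is `null`, no label names are stored with the model, so callers presumably have to map prediction positions to names themselves. The sketch below shows one way to do that, assuming the multi-label head returns one 0/1 indicator per class (typical for a scikit-learn `OneVsRestClassifier`, but not confirmed by this commit). The label names and the prediction row are hypothetical placeholders.

```python
from typing import List, Sequence


def decode_multilabel(row: Sequence[int], label_names: Sequence[str]) -> List[str]:
    """Return the names of the labels whose indicator is 1."""
    if len(row) != len(label_names):
        raise ValueError("prediction row and label list differ in length")
    return [name for flag, name in zip(row, label_names) if flag == 1]


# Hypothetical label names; the real ones are not part of this commit.
LABEL_NAMES = ["label_a", "label_b", "label_c"]
# Made-up indicator row standing in for one model prediction.
prediction_row = [1, 0, 1]
active = decode_multilabel(prediction_row, LABEL_NAMES)
```

The label order must match the order used when the classification head was trained, which has to come from the training code.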
model.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:e869d3bd2a9be413b4008e78dc8920468da8da73de4c20b00e8d6b78d75d271e
+ size 470637416
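The file added here is a Git LFS pointer, not the weights themselves: three `key value` lines giving the spec version, the SHA-256 object id, and the size in bytes. The sketch below parses the pointer and turns the size into a rough parameter count, assuming every tensor is stored as 4-byte `float32` (the `dtype` in `config.json`). The safetensors header is ignored, so the figure slightly overestimates.

```python
POINTER = """version https://git-lfs.github.com/spec/v1
oid sha256:e869d3bd2a9be413b4008e78dc8920468da8da73de4c20b00e8d6b78d75d271e
size 470637416
"""


def parse_lfs_pointer(text):
    """Split each 'key value' line of an LFS pointer into a dict entry."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    return fields


pointer = parse_lfs_pointer(POINTER)
size_bytes = int(pointer["size"])
# Assumption: 4 bytes per parameter (float32), header overhead ignored.
approx_params_millions = size_bytes / 4 / 1e6
```

This gives a little under 118 million parameters.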
model_head.pkl ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:729611fd499f424b24460e682c34bb54d48dec7184284cc272a67bc78bc1460f
+ size 324772
modules.json ADDED
@@ -0,0 +1,14 @@
+ [
+   {
+     "idx": 0,
+     "name": "0",
+     "path": "",
+     "type": "sentence_transformers.models.Transformer"
+   },
+   {
+     "idx": 1,
+     "name": "1",
+     "path": "1_Pooling",
+     "type": "sentence_transformers.models.Pooling"
+   }
+ ]
sentence_bert_config.json ADDED
@@ -0,0 +1,4 @@
+ {
+   "max_seq_length": 128,
+   "do_lower_case": false
+ }
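With `max_seq_length` set to 128 and training texts of up to 951 words reported in the README, long inputs are presumably cut off before they reach the encoder. The sketch below illustrates right-side truncation with the `<s>` and `</s>` special tokens from this commit. Whitespace-separated words stand in for the real subword tokens, so this is an illustration of the budget, not of the actual tokenizer; the function name `truncate_right` is an assumption.

```python
def truncate_right(tokens, max_seq_length=128, cls="<s>", sep="</s>"):
    """Keep the leading tokens that fit once two special tokens are reserved."""
    budget = max_seq_length - 2  # room left after <s> and </s>
    return [cls] + list(tokens[:budget]) + [sep]


# Stand-in for the longest training text (951 words per the README).
words = [f"w{i}" for i in range(951)]
encoded = truncate_right(words)
```

Subword tokenization usually yields more tokens than words, so the real cut-off would fall even earlier in the text than this sketch suggests.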
special_tokens_map.json ADDED
@@ -0,0 +1,51 @@
+ {
+   "bos_token": {
+     "content": "<s>",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "cls_token": {
+     "content": "<s>",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "eos_token": {
+     "content": "</s>",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "mask_token": {
+     "content": "<mask>",
+     "lstrip": true,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "pad_token": {
+     "content": "<pad>",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "sep_token": {
+     "content": "</s>",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "unk_token": {
+     "content": "<unk>",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   }
+ }
tokenizer.json ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:cad551d5600a84242d0973327029452a1e3672ba6313c2a3c3d69c4310e12719
+ size 17082987
tokenizer_config.json ADDED
@@ -0,0 +1,65 @@
+ {
+   "added_tokens_decoder": {
+     "0": {
+       "content": "<s>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "1": {
+       "content": "<pad>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "2": {
+       "content": "</s>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "3": {
+       "content": "<unk>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "250001": {
+       "content": "<mask>",
+       "lstrip": true,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     }
+   },
+   "bos_token": "<s>",
+   "clean_up_tokenization_spaces": false,
+   "cls_token": "<s>",
+   "do_lower_case": true,
+   "eos_token": "</s>",
+   "extra_special_tokens": {},
+   "mask_token": "<mask>",
+   "max_length": 128,
+   "model_max_length": 128,
+   "pad_to_multiple_of": null,
+   "pad_token": "<pad>",
+   "pad_token_type_id": 0,
+   "padding_side": "right",
+   "sep_token": "</s>",
+   "stride": 0,
+   "strip_accents": null,
+   "tokenize_chinese_chars": true,
+   "tokenizer_class": "BertTokenizer",
+   "truncation_side": "right",
+   "truncation_strategy": "longest_first",
+   "unk_token": "<unk>"
+ }
unigram.json ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:da145b5e7700ae40f16691ec32a0b1fdc1ee3298db22a31ea55f57a966c4a65d
+ size 14763260