---
license: apache-2.0
---
![embeddings](https://huggingface.co/appvoid/embeddings/resolve/main/embeddings.webp?download=true)
# embeddings
This repo keeps a collection of quantized BERT models in GGML format.

### usage
You can use [bert.cpp](https://github.com/skeskinen/bert.cpp) as usual, or use [our new api](https://rapidapi.com/nohakcoffee/api/simple-similarity) to quickly prototype real text-similarity use cases.


### embeddings sample

```
./main -m small -p "word"
// [0.0698, -0.0024, -0.0153, 0.0193, -0.1060, -0.0278, 0.1424, -0.0056, -0.0536...
```
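The vector printed above can be turned into a plain Python list for downstream use. This is a minimal sketch, assuming output in the bracketed comma-separated format shown above; the `raw` string here is a shortened stand-in, not the model's full output:

```python
import re

# Stand-in for stdout captured from `./main -m small -p "word"` (truncated).
raw = "[0.0698, -0.0024, -0.0153, 0.0193, -0.1060, -0.0278, 0.1424, -0.0056, -0.0536]"

# Pull every signed decimal number out of the bracketed list.
embedding = [float(x) for x in re.findall(r"-?\d+\.\d+", raw)]
print(len(embedding), embedding[:3])
```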

### api reference
| model | size (MB) |
| ---- | ---- |
| nano | 11.2 |
| small | 14.5 |
| medium | 21.3 |
| large | 68.8 |

We plan to keep this list updated so that the repo and the api always support the latest open-source models.

As of 02/20/2024, the large model's outputs are heavily biased toward positive numbers; we are still researching why. The nano, small, and medium models work as expected.

### api sample

```
// semantic relationship between "I love this" and "I hate this" 
nano:   0.4614074121704735
small:  0.6553150807627873
medium: 0.8263292187144999
large:  0.8567815005348627
```
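Scores like these are typically obtained by comparing the two sentences' embedding vectors. Assuming cosine similarity, the usual metric for text embeddings (the repo does not state which metric the api uses), a minimal sketch with toy vectors standing in for real embeddings:

```python
import math

def cosine_similarity(a, b):
    # Dot product of the two vectors divided by the product of their L2 norms.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy 3-dimensional vectors; real model embeddings have hundreds of dimensions.
a = [0.1, 0.3, -0.2]
b = [0.2, 0.1, -0.25]
print(round(cosine_similarity(a, b), 4))
```

A score near 1 means the texts point in almost the same direction in embedding space; scores near 0 indicate unrelated texts.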

Note: the api is intended for prototyping only for now; we are working on scaling it up.