# embeddings

This repository hosts a collection of quantized BERT models in GGML format.
## usage

You can use bert.cpp as usual, or use our new API to quickly prototype real-world text-similarity use cases.
### embeddings sample

```
./main -m small -p "word"
// [0.0698, -0.0024, -0.0153, 0.0193, -0.1060, -0.0278, 0.1424, -0.0056, -0.0536...
```
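The text-similarity prototyping mentioned above boils down to comparing two such embedding vectors. A minimal sketch of the standard cosine-similarity computation, using illustrative truncated vectors rather than real model output:

```cpp
#include <cmath>
#include <cstdio>
#include <vector>

// Cosine similarity: dot(a, b) / (|a| * |b|), in [-1, 1].
static float cosine_similarity(const std::vector<float>& a,
                               const std::vector<float>& b) {
    float dot = 0.0f, na = 0.0f, nb = 0.0f;
    for (size_t i = 0; i < a.size() && i < b.size(); ++i) {
        dot += a[i] * b[i];
        na  += a[i] * a[i];
        nb  += b[i] * b[i];
    }
    return dot / (std::sqrt(na) * std::sqrt(nb));
}

int main() {
    // Illustrative values only (first few components of two hypothetical embeddings).
    std::vector<float> a = {0.0698f, -0.0024f, -0.0153f, 0.0193f, -0.1060f};
    std::vector<float> b = {0.0581f,  0.0102f, -0.0201f, 0.0150f, -0.0923f};
    printf("similarity: %.4f\n", cosine_similarity(a, b));
}
```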
## api reference

| model | size (MB) |
|--------|------|
| nano   | 11.2 |
| small  | 14.5 |
| medium | 21.3 |
| large  | 68.8 |
We plan to keep this list updated so the repo and API always support the latest open-source models.

As of 02/20/2024, the large model's outputs are heavily biased toward positive numbers; we are still investigating why. The nano, small, and medium models behave as expected.
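This bias matters for similarity scores: if nearly every component of two vectors is positive, their dot product cannot be negative, so cosine similarity is pushed upward regardless of meaning. A self-contained demonstration with random vectors (not model output), comparing zero-mean components against positive-only components:

```cpp
#include <cmath>
#include <cstdio>
#include <random>
#include <vector>

static float cosine(const std::vector<float>& a, const std::vector<float>& b) {
    float dot = 0, na = 0, nb = 0;
    for (size_t i = 0; i < a.size(); ++i) {
        dot += a[i] * b[i];
        na  += a[i] * a[i];
        nb  += b[i] * b[i];
    }
    return dot / (std::sqrt(na) * std::sqrt(nb));
}

int main() {
    std::mt19937 rng(42);
    std::uniform_real_distribution<float> centered(-1.0f, 1.0f); // zero-mean components
    std::uniform_real_distribution<float> positive(0.0f, 1.0f);  // positive-only components

    const int dim = 384, trials = 1000;
    double avg_centered = 0, avg_positive = 0;
    for (int t = 0; t < trials; ++t) {
        std::vector<float> a(dim), b(dim), c(dim), d(dim);
        for (int i = 0; i < dim; ++i) {
            a[i] = centered(rng); b[i] = centered(rng);
            c[i] = positive(rng); d[i] = positive(rng);
        }
        avg_centered += cosine(a, b);
        avg_positive += cosine(c, d);
    }
    // Unrelated zero-mean vectors average near 0.0;
    // unrelated positive-only vectors average around 0.75.
    printf("avg cosine, zero-mean vectors:     %.3f\n", avg_centered / trials);
    printf("avg cosine, positive-only vectors: %.3f\n", avg_positive / trials);
}
```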
### api sample

```
// semantic relationship between "I love this" and "I hate this"
nano: 0.4614074121704735
small: 0.6553150807627873
medium: 0.8263292187144999
large: 0.8567815005348627
```
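To sanity-check scores like these locally, one could wire the bert.cpp CLI to the similarity computation. The sketch below is assumption-heavy: the hypothetical `embed` helper shells out to `./main`, parses the printed vector assuming the exact output format of the embeddings sample above, and scores with cosine similarity (the API's actual metric is not documented here, so cosine is an assumption):

```cpp
#include <cmath>
#include <cstdio>
#include <string>
#include <vector>

// Hypothetical helper: run ./main and parse its "// [x, y, ...]" output line.
static std::vector<float> embed(const std::string& model, const std::string& text) {
    std::string cmd = "./main -m " + model + " -p \"" + text + "\"";
    std::vector<float> v;
    FILE* pipe = popen(cmd.c_str(), "r");  // POSIX
    if (!pipe) return v;
    int c;
    while ((c = fgetc(pipe)) != EOF && c != '[') {}  // skip to the vector
    float x;
    while (fscanf(pipe, " %f", &x) == 1) {
        v.push_back(x);
        if ((c = fgetc(pipe)) == ']' || c == EOF) break;  // consume ',' between values
    }
    pclose(pipe);
    return v;
}

static float cosine_similarity(const std::vector<float>& a,
                               const std::vector<float>& b) {
    float dot = 0, na = 0, nb = 0;
    for (size_t i = 0; i < a.size() && i < b.size(); ++i) {
        dot += a[i] * b[i];
        na  += a[i] * a[i];
        nb  += b[i] * b[i];
    }
    return dot / (std::sqrt(na) * std::sqrt(nb));
}

int main() {
    std::vector<float> a = embed("nano", "I love this");
    std::vector<float> b = embed("nano", "I hate this");
    if (a.empty() || b.empty()) { fprintf(stderr, "embedding failed\n"); return 1; }
    printf("nano: %f\n", cosine_similarity(a, b));
}
```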
Note: the API is intended for prototyping only at the moment; we are working on scaling it up.