Instructions for using zeroentropy/zembed-1-embedding with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- sentence-transformers
How to use zeroentropy/zembed-1-embedding with sentence-transformers:
```python
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("zeroentropy/zembed-1-embedding")

sentences = [
    "The weather is lovely today.",
    "It's so sunny outside!",
    "He drove to the stadium.",
]
embeddings = model.encode(sentences)

similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)  # [3, 3]
```
- Notebooks
- Google Colab
- Kaggle
Update README code snippet
Hello!
Preface
Congratulations! I'm really looking forward to seeing more evaluations on e.g. (M)MTEB as well. A very impressive jump in performance.
Pull Request overview
- Use `model.encode_query()` and `model.encode_document()` in the README snippet
- Default to the `"document"` prompt name in `config_sentence_transformers.json`
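For reference, the second change could look roughly like this in `config_sentence_transformers.json`. The `"document"` prompt string is the one quoted in the details below; the `"query"` prompt string is assumed by analogy, and the key names follow sentence-transformers' usual config format:

```json
{
  "prompts": {
    "query": "<|im_start|>system\nquery<|im_end|>\n<|im_start|>user\n",
    "document": "<|im_start|>system\ndocument<|im_end|>\n<|im_start|>user\n"
  },
  "default_prompt_name": "document"
}
```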
Details
Generally, I recommend that users call model.encode_query() and model.encode_document() when they want to perform retrieval, as these are just encode() with the query/document prompts applied automatically. The second change means that if someone does use model.encode() without any prompt or prompt_name, it defaults to the "document" option (i.e. "<|im_start|>system\ndocument<|im_end|>\n<|im_start|>user\n"). This should give much better performance than not using any prompt at all.
You're totally free to update the README snippet/texts to your liking. I do prefer adding an "expected similarity", though, so end users who run the model locally in various ways can be confident that their version gives the expected results.
- Tom Aarsen