---
base_model: jinaai/jina-embeddings-v2-base-zh
language:
- zh
- en
library_name: transformers.js
license: apache-2.0
tags:
- feature-extraction
- sentence-similarity
- mteb
- sentence_transformers
- transformers
inference: false
---

This repository provides https://huggingface.co/jinaai/jina-embeddings-v2-base-zh with ONNX weights, for compatibility with Transformers.js.

## Usage (Transformers.js)

If you haven't already, you can install the [Transformers.js](https://huggingface.co/docs/transformers.js) JavaScript library from [NPM](https://www.npmjs.com/package/@huggingface/transformers) using:

```bash
npm i @huggingface/transformers
```

You can then use the model to compute embeddings, as follows:

```js
import { pipeline, cos_sim } from '@huggingface/transformers';

// Create a feature extraction pipeline
const extractor = await pipeline('feature-extraction', 'Xenova/jina-embeddings-v2-base-zh', {
    dtype: "fp32" // Options: "fp32", "fp16", "q8", "q4"
});

// Compute sentence embeddings
const texts = ['How is the weather today?', '今天天气怎么样?'];
const output = await extractor(texts, { pooling: 'mean', normalize: true });
// Tensor {
//   dims: [2, 768],
//   type: 'float32',
//   data: Float32Array(1536)[...],
//   size: 1536
// }

// Compute cosine similarity between the two embeddings
const score = cos_sim(output[0].data, output[1].data);
console.log(score);
// 0.7860610759096025
```
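
For intuition about the `pooling: 'mean'` and `normalize: true` options and the final similarity score, here is a minimal plain-JavaScript sketch on toy vectors. It is not the Transformers.js implementation: the helper names (`meanPool`, `l2Normalize`, `dot`, `cosineSimilarity`) and the 3-dimensional "token embeddings" are made up for illustration, and the sketch assumes that mean pooling averages only the non-padding tokens.

```js
// Illustrative helpers only; these names are hypothetical and not part of Transformers.js.

// Average the token vectors whose attention-mask entry is non-zero.
function meanPool(tokenVectors, attentionMask) {
    const dim = tokenVectors[0].length;
    const pooled = new Array(dim).fill(0);
    let count = 0;
    tokenVectors.forEach((vec, i) => {
        if (attentionMask[i] === 0) return; // skip padding tokens
        for (let d = 0; d < dim; d++) pooled[d] += vec[d];
        count++;
    });
    return pooled.map((x) => x / Math.max(count, 1));
}

// Scale a vector to unit length (L2 norm of 1).
function l2Normalize(vec) {
    const norm = Math.sqrt(vec.reduce((sum, x) => sum + x * x, 0));
    return norm === 0 ? vec.slice() : vec.map((x) => x / norm);
}

// Dot product of two equal-length vectors.
function dot(a, b) {
    return a.reduce((sum, x, i) => sum + x * b[i], 0);
}

// Cosine similarity: dot product divided by the product of the norms.
function cosineSimilarity(a, b) {
    const denom = Math.sqrt(dot(a, a)) * Math.sqrt(dot(b, b));
    return denom === 0 ? 0 : dot(a, b) / denom;
}

// Toy "token embeddings" for two sentences; the last token of each is padding.
const a = l2Normalize(meanPool([[1, 0, 0], [0, 1, 0], [9, 9, 9]], [1, 1, 0]));
const b = l2Normalize(meanPool([[1, 1, 0], [1, 1, 2], [0, 0, 0]], [1, 1, 0]));

// For unit-length vectors, cosine similarity and the dot product agree
// (up to floating-point rounding).
console.log(cosineSimilarity(a, b));
console.log(dot(a, b));
```

Because normalized embeddings have unit length, a plain dot product is enough to rank many candidate texts against a query, which avoids recomputing norms for every comparison.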

---

Note: Having a separate repo for ONNX weights is intended to be a temporary solution until WebML gains more traction. If you would like to make your models web-ready, we recommend converting to ONNX using [🤗 Optimum](https://huggingface.co/docs/optimum/index) and structuring your repo like this one (with ONNX weights located in a subfolder named `onnx`).