---
library_name: transformers.js
pipeline_tag: sentence-similarity
tags:
- sentence-transformers
- transformers.js
- onnx
- biblical-search
- semantic-search
- embeddinggemma
- fine-tuned
license: apache-2.0
datasets:
- biblical-text-pairs
metrics:
- accuracy@1: 12.00%
- accuracy@3: 15.00%
- accuracy@10: 31.00%
language:
- en
---
# EmbeddingGemma-300M Fine-tuned for Biblical Text Search (ONNX)
This is the ONNX export of our EmbeddingGemma-300M model, fine-tuned for biblical text search and retrieval. It is packaged for web deployment with transformers.js.
## Model Performance
- **Accuracy@1**: 12.00% (about 13x the base model's 0.91%)
- **Accuracy@3**: 15.00%
- **Accuracy@10**: 31.00%
- **Training Steps**: 25 (stopped early to prevent overfitting)
- **Base Model Accuracy@1**: 0.91%
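The evaluation protocol behind these numbers is not described here. As a minimal sketch, assuming the usual definition (the share of queries whose correct passage appears among the top k results), Accuracy@k could be computed as follows. The function name `accuracyAtK` and the toy ids are hypothetical.

```javascript
// Accuracy@k under the assumed definition: the fraction of queries whose
// gold document id appears in the first k entries of that query's ranking.
// rankedIds: one array per query, document ids ordered best-first.
// goldIds: the correct document id for each query, in the same order.
function accuracyAtK(rankedIds, goldIds, k) {
  if (rankedIds.length !== goldIds.length) {
    throw new Error('rankedIds and goldIds must have the same length');
  }
  let hits = 0;
  rankedIds.forEach((ranking, i) => {
    if (ranking.slice(0, k).includes(goldIds[i])) hits += 1;
  });
  return hits / rankedIds.length;
}

// Toy data (not from the real evaluation set): four queries, three passages.
const rankings = [
  ['a', 'b', 'c'],
  ['b', 'a', 'c'],
  ['c', 'b', 'a'],
  ['a', 'c', 'b'],
];
const gold = ['a', 'a', 'a', 'b'];

for (const k of [1, 2, 3]) {
  console.log(`accuracy@${k} =`, accuracyAtK(rankings, gold, k));
}
```

Under this definition Accuracy@k can only grow with k, which matches the ordering of the three figures above.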
## Usage with Transformers.js
```javascript
import { AutoTokenizer, AutoModel } from '@huggingface/transformers';

// Load the tokenizer and the ONNX model from the Hub
const model_id = 'dpshade22/embeddinggemma-scripture-v1-onnx';
const tokenizer = await AutoTokenizer.from_pretrained(model_id);
const model = await AutoModel.from_pretrained(model_id);

// Queries take the search_query: prefix, documents take search_document:
const query = 'search_query: What is love?';
const document = 'search_document: Love is patient and kind';

// AutoModel has no encode() method: tokenize first, then run the model
const inputs = await tokenizer([query, document], { padding: true });
const outputs = await model(inputs);

// Output tensor names depend on the ONNX export (the reference
// onnx-community export exposes `sentence_embedding`), so inspect them first
console.log(Object.keys(outputs));
```
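Once query and document embeddings are available as plain arrays of numbers (for example via a tensor's `tolist()` method), search comes down to ranking documents by cosine similarity to the query. The sketch below uses toy 3-dimensional vectors in place of real 768-dimensional embeddings. `cosineSimilarity` and `rankDocuments` are hypothetical helper names, not part of transformers.js.

```javascript
// Cosine similarity between two equal-length numeric vectors.
function cosineSimilarity(a, b) {
  if (a.length !== b.length) {
    throw new Error('Vectors must have the same length');
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Score every document against the query and sort best-first.
function rankDocuments(queryEmbedding, docEmbeddings) {
  return docEmbeddings
    .map((embedding, index) => ({
      index,
      score: cosineSimilarity(queryEmbedding, embedding),
    }))
    .sort((x, y) => y.score - x.score);
}

// Toy vectors standing in for real model output.
const queryEmbedding = [1, 0, 0];
const docEmbeddings = [
  [0, 1, 0],
  [0.9, 0.1, 0],
  [0.5, 0.5, 0],
];

const ranked = rankDocuments(queryEmbedding, docEmbeddings);
console.log(ranked.map((r) => r.index));
```

If the embeddings are already L2-normalized, the dot product alone yields the same ranking.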
## Prefixes
For optimal performance, use these prefixes:
- **Queries**: `"search_query: your question here"`
- **Documents**: `"search_document: scripture text here"`
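To keep the prefixes consistent across an application, they can be applied in one place. This is only a sketch: the helper names `formatQuery` and `formatDocument` are hypothetical, and skipping text that already carries the prefix is a design assumption rather than something the model requires.

```javascript
const QUERY_PREFIX = 'search_query: ';
const DOCUMENT_PREFIX = 'search_document: ';

// Prepend the prefix unless the (trimmed) text already starts with it.
function withPrefix(prefix, text) {
  const trimmed = text.trim();
  return trimmed.startsWith(prefix.trim()) ? trimmed : prefix + trimmed;
}

const formatQuery = (text) => withPrefix(QUERY_PREFIX, text);
const formatDocument = (text) => withPrefix(DOCUMENT_PREFIX, text);

console.log(formatQuery('What is love?'));
console.log(formatDocument('Love is patient and kind'));
```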
## Model Details
- **Base Model**: `google/embeddinggemma-300m`
- **Training Data**: 26,276 biblical text pairs
- **Training Steps**: 25 steps (optimal stopping point)
- **Learning Rate**: 2.0e-04
- **Batch Size**: 8
- **Output Dimensions**: 768D (supports Matryoshka 384D, 128D)
- **ONNX Conversion**: Using nixiesearch/onnx-convert specialized tool
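With Matryoshka-style embeddings, a smaller dimension is typically obtained by keeping the leading components of the full vector and renormalizing. The sketch below assumes that convention applies to this model. `truncateEmbedding` is a hypothetical helper, and the 4-dimensional input is a toy stand-in for a 768D embedding.

```javascript
// Keep the first `dims` components and rescale to unit length, so that
// dot products between truncated vectors behave like cosine similarity.
function truncateEmbedding(embedding, dims) {
  if (!Number.isInteger(dims) || dims <= 0 || dims > embedding.length) {
    throw new RangeError('dims must be an integer between 1 and embedding.length');
  }
  const head = Array.from(embedding).slice(0, dims);
  const norm = Math.sqrt(head.reduce((sum, v) => sum + v * v, 0));
  // An all-zero head cannot be normalized; return it unchanged.
  return norm === 0 ? head : head.map((v) => v / norm);
}

// Toy example: truncate a 4-dimensional vector to 2 dimensions.
const full = [3, 4, 0, 12];
const small = truncateEmbedding(full, 2);
console.log(small);
```

For this model the call would look like `truncateEmbedding(embedding, 384)` or `truncateEmbedding(embedding, 128)`. Queries and documents must be truncated to the same size before they are compared.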
## Training Details
- **Training Strategy**: Early stopping at 25 steps to prevent overfitting
## Intended Use
This model is designed for:
- Biblical text search and retrieval in web applications
- Finding relevant scripture passages
- Semantic similarity of religious texts
- Question answering on biblical topics
- Offline PWA applications using transformers.js
## Conversion Details
- **Converted using**: nixiesearch/onnx-convert specialized tool
- **ONNX Opset**: 17
- **Optimization Level**: 1
- **Max difference from original**: 1.9e-05 (within acceptable tolerance)
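A figure like the maximum difference above is usually the largest element-wise absolute gap between the original model's output and the converted model's output for the same input. The sketch below shows that check under this assumption. The helper name, the sample values, and the `1e-4` tolerance are all hypothetical; the tolerance actually applied during conversion is not stated here.

```javascript
// Largest element-wise absolute difference between two output vectors.
function maxAbsDifference(reference, candidate) {
  if (reference.length !== candidate.length) {
    throw new Error('Vectors must have the same length');
  }
  let maxDiff = 0;
  for (let i = 0; i < reference.length; i++) {
    maxDiff = Math.max(maxDiff, Math.abs(reference[i] - candidate[i]));
  }
  return maxDiff;
}

// Hypothetical threshold and made-up sample outputs, for illustration only.
const TOLERANCE = 1e-4;
const pytorchOutput = [0.12345, -0.54321, 0.99999];
const onnxOutput = [0.12346, -0.5432, 0.99997];

const diff = maxAbsDifference(pytorchOutput, onnxOutput);
console.log(diff <= TOLERANCE ? 'within tolerance' : 'exceeds tolerance');
```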
## Related Models
- **Original PyTorch version**: dpshade22/embeddinggemma-scripture-v1
- **Base model**: google/embeddinggemma-300m
- **Reference ONNX**: onnx-community/embeddinggemma-300m-ONNX