---
language: en
library_name: sentence-transformers
license: mit
pipeline_tag: sentence-similarity
tags:
- cross-encoder
- regression
- trail-rag
- pathfinder-rag
- scifact
- scientific-fact-verification
- sentence-transformers
model-index:
- name: trailrag-cross-encoder-scifact-enhanced
  results:
  - task:
      type: fact-verification
    dataset:
      name: SciFact
      type: scifact
    metrics:
    - type: mse
      value: 0.1006970178912303
    - type: mae
      value: 0.1839922902587623
    - type: rmse
      value: 0.3173279343064999
    - type: r2_score
      value: 0.4018942599321929
    - type: pearson_correlation
      value: 0.7587210789053855
    - type: spearman_correlation
      value: 0.7092615348799061
---
# TrailRAG Cross-Encoder: SciFact Enhanced
This is a fine-tuned cross-encoder model specifically optimized for **Scientific Fact Verification** tasks, trained as part of the PathfinderRAG research project.
## Model Details
- **Model Type**: Cross-Encoder for Regression (continuous similarity scores)
- **Base Model**: `cross-encoder/ms-marco-MiniLM-L-6-v2`
- **Training Dataset**: SciFact (Scientific claim verification against research papers)
- **Task**: Scientific Fact Verification
- **Library**: sentence-transformers
- **License**: MIT
## Performance Metrics
### Final Regression Metrics
| Metric | Value | Description |
|--------|-------|-------------|
| **MSE** | **0.100697** | Mean Squared Error (lower is better) |
| **MAE** | **0.183992** | Mean Absolute Error (lower is better) |
| **RMSE** | **0.317328** | Root Mean Squared Error (lower is better) |
| **R² Score** | **0.401894** | Coefficient of determination (higher is better) |
| **Pearson Correlation** | **0.758721** | Linear correlation (higher is better) |
| **Spearman Correlation** | **0.709262** | Rank correlation (higher is better) |
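If you hold out your own labeled SciFact pairs, these metrics can be recomputed from the model's predictions. The sketch below uses placeholder evaluation pairs and gold scores (the actual evaluation split is not bundled with this card) and relies on `scikit-learn` and `scipy` only for the standard metric definitions:

```python
import numpy as np
from scipy.stats import pearsonr, spearmanr
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score
from sentence_transformers import CrossEncoder

model = CrossEncoder('OloriBern/trailrag-cross-encoder-scifact-enhanced')

# Placeholder claim-evidence pairs and gold relevance scores in [0, 1];
# substitute your own held-out SciFact examples here.
eval_pairs = [
    ['example claim 1', 'evidence passage supporting claim 1'],
    ['example claim 2', 'unrelated passage'],
    ['example claim 3', 'partially relevant passage'],
]
gold = np.array([0.9, 0.1, 0.5])

pred = model.predict(eval_pairs)

mse = mean_squared_error(gold, pred)
print({
    'mse': mse,
    'mae': mean_absolute_error(gold, pred),
    'rmse': float(np.sqrt(mse)),
    'r2_score': r2_score(gold, pred),
    'pearson_correlation': pearsonr(gold, pred)[0],
    'spearman_correlation': spearmanr(gold, pred)[0],
})
```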
### Training Details
- **Training Duration**: 32 minutes
- **Epochs**: 10
- **Early Stopping**: No
- **Best Correlation Score**: 0.689202
- **Final MSE**: 0.100697
### Training Configuration
- **Batch Size**: 12
- **Learning Rate**: 1.5e-05
- **Max Epochs**: 10
- **Weight Decay**: 0.02
- **Warmup Steps**: 200
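For reference, these hyperparameters map onto the classic `CrossEncoder.fit` API (sentence-transformers v2/v3) roughly as sketched below. This is an illustrative reconstruction, not the original training script: the example data, the evaluator choice, and the output path are assumptions.

```python
from torch.utils.data import DataLoader
from sentence_transformers import CrossEncoder, InputExample
from sentence_transformers.cross_encoder.evaluation import CECorrelationEvaluator

# Base model with a single output head producing one continuous score per pair
model = CrossEncoder('cross-encoder/ms-marco-MiniLM-L-6-v2', num_labels=1)

# Placeholder examples; the real run used SciFact claim-evidence pairs
# with continuous labels in [0, 1]
train_samples = [
    InputExample(texts=['example claim', 'supporting evidence passage'], label=0.92),
    InputExample(texts=['example claim', 'unrelated passage'], label=0.08),
]
dev_samples = [
    InputExample(texts=['example claim', 'partially relevant passage'], label=0.55),
    InputExample(texts=['example claim', 'another unrelated passage'], label=0.10),
]

train_dataloader = DataLoader(train_samples, shuffle=True, batch_size=12)
evaluator = CECorrelationEvaluator.from_input_examples(dev_samples, name='scifact-dev')

model.fit(
    train_dataloader=train_dataloader,
    evaluator=evaluator,
    epochs=10,
    warmup_steps=200,
    weight_decay=0.02,
    optimizer_params={'lr': 1.5e-5},
    output_path='trailrag-cross-encoder-scifact-enhanced',
)
```

The card does not specify the loss function used. The library's default for single-label cross-encoders is `BCEWithLogitsLoss` (treating the [0, 1] labels as soft targets); `loss_fct=torch.nn.MSELoss()` can be passed to `fit` for a strictly regression-style objective.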
## Usage
This model can be used with the sentence-transformers library for computing semantic similarity scores between query-document pairs.
### Installation
```bash
pip install sentence-transformers
```
### Basic Usage
```python
from sentence_transformers import CrossEncoder
# Load the model
model = CrossEncoder('OloriBern/trailrag-cross-encoder-scifact-enhanced')
# Example usage
pairs = [
    ['What is artificial intelligence?', 'AI is a field of computer science focused on creating intelligent machines.'],
    ['What is artificial intelligence?', 'Paris is the capital of France.']
]
# Get similarity scores (continuous values, not binary)
scores = model.predict(pairs)
print(scores) # Higher scores indicate better semantic match
```
### Advanced Usage in PathfinderRAG
```python
from sentence_transformers import CrossEncoder
# Initialize for PathfinderRAG exploration
cross_encoder = CrossEncoder('OloriBern/trailrag-cross-encoder-scifact-enhanced')
def score_query_document_pair(query: str, document: str) -> float:
    """Score a query-document pair for relevance."""
    score = cross_encoder.predict([[query, document]])[0]
    return float(score)
# Use in document exploration
query = "Your research query"
documents = ["Document 1 text", "Document 2 text", ...]
# Score all pairs
scores = cross_encoder.predict([[query, doc] for doc in documents])
ranked_docs = sorted(zip(documents, scores), key=lambda x: x[1], reverse=True)
```
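When scoring large candidate pools, `CrossEncoder.predict` also accepts batching parameters. Continuing the example above:

```python
# Batch the pair scoring for larger candidate pools
scores = cross_encoder.predict(
    [[query, doc] for doc in documents],
    batch_size=32,           # pairs scored per forward pass
    show_progress_bar=True,  # useful for long document lists
)
```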
## Training Process
This model was trained and evaluated as a **regression** model (not a classifier), predicting continuous similarity scores in the range [0, 1]. The training process focused on:
1. **Data Quality**: Used authentic SciFact examples with careful contamination filtering
2. **Regression Approach**: Avoided binary classification, maintaining continuous label distribution
3. **Correlation Optimization**: Maximized Spearman correlation for effective ranking
4. **Scientific Rigor**: All metrics derived from real training runs without simulation
### Why Regression Over Classification?
Cross-encoders for information retrieval should predict **continuous similarity scores**, not binary classifications. This approach:
- Preserves fine-grained similarity distinctions
- Enables better ranking and document selection
- Provides more informative scores for downstream applications
- Aligns with the mathematical foundation of information retrieval
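The ranking point above can be made concrete with a small, purely illustrative example (the scores below are made up, not model outputs):

```python
# Illustrative (made-up) scores for four candidate passages
continuous_scores = {'doc_a': 0.91, 'doc_b': 0.62, 'doc_c': 0.58, 'doc_d': 0.12}

# Thresholding at 0.5 keeps doc_a, doc_b, and doc_c but can no longer order b vs. c
binary_labels = {doc: int(score >= 0.5) for doc, score in continuous_scores.items()}

# Continuous scores preserve the full ranking needed for top-k document selection
ranking = sorted(continuous_scores, key=continuous_scores.get, reverse=True)

print(binary_labels)  # {'doc_a': 1, 'doc_b': 1, 'doc_c': 1, 'doc_d': 0}
print(ranking)        # ['doc_a', 'doc_b', 'doc_c', 'doc_d']
```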
## Dataset
**SciFact**: Scientific claim verification against research papers
- **Task Type**: Scientific Fact Verification
- **Training Examples**: 1,000 high-quality pairs
- **Validation Split**: 20% (200 examples)
- **Quality Threshold**: ≥0.70 (authentic TrailRAG metrics)
- **Contamination**: Zero overlap between splits
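The "zero overlap" property above corresponds to a split-contamination check of the kind sketched below. This is a hypothetical helper with placeholder data; the actual filtering code used for this model is not included in this card.

```python
def assert_no_overlap(train_pairs, val_pairs):
    """Raise if any (claim, evidence) text pair occurs in both splits."""
    def key(pair):
        claim, evidence = pair
        return claim.strip().lower(), evidence.strip().lower()
    overlap = {key(p) for p in train_pairs} & {key(p) for p in val_pairs}
    assert not overlap, f"{len(overlap)} contaminated pairs shared across splits"

# Placeholder data: the check passes only when the splits share no pairs
assert_no_overlap(
    [('claim A', 'evidence A'), ('claim B', 'evidence B')],
    [('claim C', 'evidence C')],
)
```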
## Limitations
- Optimized specifically for scientific fact verification tasks
- Performance may vary on out-of-domain data
- Requires sentence-transformers library for inference
- Trained on CPU (GPU-optimized training planned for future versions)
## Citation
```bibtex
@misc{trailrag-cross-encoder-scifact,
  title = {TrailRAG Cross-Encoder: SciFact Enhanced},
  author = {PathfinderRAG Team},
  year = {2025},
  publisher = {Hugging Face},
  url = {https://huggingface.co/OloriBern/trailrag-cross-encoder-scifact-enhanced}
}
```
## Model Card Contact
For questions about this model, please open an issue in the [PathfinderRAG repository](https://github.com/your-org/trail-rag-1) or contact the development team.
---
*This model card was automatically generated using the TrailRAG model card generator with authentic training metrics.*