---
title: OpenThoughts Benchmark Explorer
emoji: 📊
colorFrom: blue
colorTo: red
sdk: streamlit
sdk_version: 1.28.0
app_file: app.py
pinned: false
license: apache-2.0
---

# OpenThoughts Evalchemy Benchmark Explorer

A Streamlit web application for exploring correlations between OpenThoughts benchmarks and comparing model performance.

## Features

- Interactive correlation heatmaps
- Scatter plot explorer with uncertainty analysis
- Model performance comparisons
- Statistical summaries and uncertainty analysis

## Usage

The app loads the benchmark data automatically and provides six views for analysis:

1. **Overview Dashboard**: High-level summary of benchmarks and correlations
2. **Interactive Heatmap**: Correlation matrix visualization
3. **Scatter Explorer**: Detailed pairwise benchmark comparisons
4. **Model Performance**: Individual model analysis
5. **Statistical Summary**: Correlation statistics across methods
6. **Uncertainty Analysis**: Measurement reliability analysis
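
As a rough illustration of what the heatmap and statistical summary views compute, the sketch below builds a pairwise Pearson correlation matrix between benchmarks. It is not taken from `app.py`: the function names, the benchmark names, and the scores are all hypothetical, and the app may use other correlation methods or a library such as pandas. The standard library is used here only to keep the example dependency-free.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        # Correlation is undefined when one benchmark has constant scores.
        return float("nan")
    return cov / math.sqrt(var_x * var_y)


def correlation_matrix(scores):
    """Map each pair of benchmarks to its Pearson correlation.

    `scores` maps a benchmark name to a list of per-model scores,
    with models in the same order for every benchmark.
    """
    names = list(scores)
    return {a: {b: pearson(scores[a], scores[b]) for b in names} for a in names}


# Hypothetical scores for four models on three made-up benchmarks.
scores = {
    "bench_a": [0.10, 0.20, 0.30, 0.40],
    "bench_b": [0.20, 0.40, 0.60, 0.80],
    "bench_c": [0.40, 0.30, 0.20, 0.10],
}
matrix = correlation_matrix(scores)
```

In this toy data `bench_a` and `bench_b` rise together while `bench_c` falls, so the matrix holds strongly positive and strongly negative entries respectively. A heatmap is a color-coded rendering of such a matrix.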

## Data Files

The app reads two CSV files:
- `comprehensive_benchmark_scores.csv`: Main benchmark scores (required)
- `benchmark_standard_errors.csv`: Standard error estimates (optional)

These files should be in the root directory of the repository.
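
The loading rules above (a required scores file, an optional standard-error file, both in the repository root) could be sketched as follows. This is an assumption about the behavior, not the code in `app.py`: `load_benchmark_data` and `read_rows` are hypothetical helpers, and no particular column layout is assumed, so each row is returned as a plain dictionary of strings.

```python
import csv
from pathlib import Path

# File names as listed in this README.
SCORES_FILE = "comprehensive_benchmark_scores.csv"
ERRORS_FILE = "benchmark_standard_errors.csv"


def read_rows(path):
    """Read a CSV file into a list of dicts keyed by the header row."""
    with path.open(newline="", encoding="utf-8") as handle:
        return list(csv.DictReader(handle))


def load_benchmark_data(root="."):
    """Return (scores, errors); errors is None if the optional file is absent."""
    root = Path(root)
    scores_path = root / SCORES_FILE
    if not scores_path.exists():
        raise FileNotFoundError(f"missing required file: {scores_path}")
    scores = read_rows(scores_path)

    # The standard-error file is optional, so its absence is not an error.
    errors_path = root / ERRORS_FILE
    errors = read_rows(errors_path) if errors_path.exists() else None
    return scores, errors
```

Calling `load_benchmark_data()` from the repository root returns the score rows and either the standard-error rows or `None`, which lets uncertainty-related views be skipped when no error estimates are available.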