# GuardBench Leaderboard

A HuggingFace leaderboard for the GuardBench project that allows users to submit evaluation results and view the performance of different models on safety guardrails.

## Features

- Display model performance across multiple safety categories
- Accept JSONL submissions with evaluation results
- Store submissions in a HuggingFace dataset
- Secure submission process with token authentication
- Automatic data refresh from HuggingFace

## Setup

1. Clone this repository
2. Install dependencies:
   ```
   pip install -r requirements.txt
   ```
3. Create a `.env` file based on the `.env.template`:
   ```
   cp .env.template .env
   ```
4. Edit the `.env` file with your HuggingFace credentials and settings
5. Run the application:
   ```
   python app.py
   ```

## Submission Format

Submissions should be in JSONL format, where each line contains a JSON object with the following structure (a sketch that generates such a file appears at the end of this README):

```json
{
  "model_name": "model-name",
  "per_category_metrics": {
    "Category Name": {
      "default_prompts": {
        "f1_binary": 0.95,
        "recall_binary": 0.93,
        "precision_binary": 1.0,
        "error_ratio": 0.0,
        "avg_runtime_ms": 3000
      },
      "jailbreaked_prompts": { ... },
      "default_answers": { ... },
      "jailbreaked_answers": { ... }
    },
    ...
  },
  "avg_metrics": {
    "default_prompts": {
      "f1_binary": 0.97,
      "recall_binary": 0.95,
      "precision_binary": 1.0,
      "error_ratio": 0.0,
      "avg_runtime_ms": 3000
    },
    "jailbreaked_prompts": { ... },
    "default_answers": { ... },
    "jailbreaked_answers": { ... }
  }
}
```

## Environment Variables

- `HF_TOKEN`: Your HuggingFace write token
- `OWNER`: Your HuggingFace username or organization
- `RESULTS_DATASET_ID`: The ID of the dataset in which results are stored (e.g., `username/guardbench-results`)
- `SUBMITTER_TOKEN`: A secret token required for submissions
- `ADMIN_USERNAME`: Username for admin access to the leaderboard
- `ADMIN_PASSWORD`: Password for admin access to the leaderboard

See the configuration-loading sketch at the end of this README for how these variables are typically read at startup.

## Deployment

This application can be deployed as a HuggingFace Space for public access. Follow the HuggingFace Spaces documentation for deployment instructions.

## License

MIT
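
## Example: Generating a Submission File

To make the schema above concrete, here is a minimal Python sketch that builds one submission record and writes it as a line of JSONL. The model name, category name, metric values, and output filename are placeholders, not values the leaderboard requires.

```python
import json

def metric_block(f1, recall, precision, error_ratio, runtime_ms):
    """One metrics object, matching the schema in "Submission Format"."""
    return {
        "f1_binary": f1,
        "recall_binary": recall,
        "precision_binary": precision,
        "error_ratio": error_ratio,
        "avg_runtime_ms": runtime_ms,
    }

# One entry per prompt/answer split; values here are placeholders.
splits = {
    "default_prompts": metric_block(0.95, 0.93, 1.0, 0.0, 3000),
    "jailbreaked_prompts": metric_block(0.90, 0.88, 0.92, 0.0, 3100),
    "default_answers": metric_block(0.96, 0.94, 0.98, 0.0, 2900),
    "jailbreaked_answers": metric_block(0.89, 0.87, 0.91, 0.0, 3200),
}

submission = {
    "model_name": "my-guard-model",     # placeholder model name
    "per_category_metrics": {
        "Category Name": splits,        # repeat for each category evaluated
    },
    "avg_metrics": splits,              # averages across all categories
}

# JSONL: one JSON object per line.
with open("submission.jsonl", "w") as f:
    f.write(json.dumps(submission) + "\n")
```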
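
## Example: Storing Results in the Dataset

Accepted submissions are stored in the HuggingFace dataset named by `RESULTS_DATASET_ID`. As a sketch of what that persistence step can look like with `huggingface_hub` (the file layout inside the dataset repo is an assumption for illustration, not the leaderboard's actual layout):

```python
import os
from huggingface_hub import HfApi

api = HfApi(token=os.environ["HF_TOKEN"])

# Push a results file into the dataset repo.
# The path_in_repo below is illustrative only.
api.upload_file(
    path_or_fileobj="submission.jsonl",
    path_in_repo="submissions/my-guard-model.jsonl",
    repo_id=os.environ["RESULTS_DATASET_ID"],
    repo_type="dataset",
)
```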
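
## Example: Loading the Environment Variables

The Setup and Environment Variables sections assume the application reads its configuration from the `.env` file. A minimal sketch of how that typically works with `python-dotenv`; the actual loading code in `app.py` may differ:

```python
import os
from dotenv import load_dotenv  # pip install python-dotenv

# Read key=value pairs from .env into os.environ.
load_dotenv()

HF_TOKEN = os.environ["HF_TOKEN"]
OWNER = os.environ["OWNER"]
RESULTS_DATASET_ID = os.environ["RESULTS_DATASET_ID"]
SUBMITTER_TOKEN = os.environ["SUBMITTER_TOKEN"]
ADMIN_USERNAME = os.environ["ADMIN_USERNAME"]
ADMIN_PASSWORD = os.environ["ADMIN_PASSWORD"]
```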