# GuardBench Leaderboard

A HuggingFace leaderboard for the GuardBench project that allows users to submit evaluation results and view the performance of different models on safety guardrails.

## Features

- Display model performance across multiple safety categories
- Accept JSONL submissions with evaluation results
- Store submissions in a HuggingFace dataset
- Secure submission process with token authentication
- Automatic data refresh from HuggingFace

## Setup

1. Clone this repository
2. Install dependencies:
   ```
   pip install -r requirements.txt
   ```
3. Create a `.env` file based on the `.env.template`:
   ```
   cp .env.template .env
   ```
4. Edit the `.env` file with your HuggingFace credentials and settings
5. Run the application:
   ```
   python app.py
   ```

## Submission Format

Submissions should be in JSONL format, where each line contains a JSON object with the following structure (a sketch that generates such a file appears at the end of this README):

```json
{
  "model_name": "model-name",
  "per_category_metrics": {
    "Category Name": {
      "default_prompts": {
        "f1_binary": 0.95,
        "recall_binary": 0.93,
        "precision_binary": 1.0,
        "error_ratio": 0.0,
        "avg_runtime_ms": 3000
      },
      "jailbreaked_prompts": { ... },
      "default_answers": { ... },
      "jailbreaked_answers": { ... }
    },
    ...
  },
  "avg_metrics": {
    "default_prompts": {
      "f1_binary": 0.97,
      "recall_binary": 0.95,
      "precision_binary": 1.0,
      "error_ratio": 0.0,
      "avg_runtime_ms": 3000
    },
    "jailbreaked_prompts": { ... },
    "default_answers": { ... },
    "jailbreaked_answers": { ... }
  }
}
```

## Environment Variables

- `HF_TOKEN`: Your HuggingFace write token
- `OWNER`: Your HuggingFace username or organization
- `RESULTS_DATASET_ID`: The ID of the dataset in which results are stored (e.g., `username/guardbench-results`)
- `SUBMITTER_TOKEN`: A secret token required for submissions
- `ADMIN_USERNAME`: Username for admin access to the leaderboard
- `ADMIN_PASSWORD`: Password for admin access to the leaderboard

See the configuration-loading sketch at the end of this README for how these variables are typically read at startup.

## Deployment

This application can be deployed as a HuggingFace Space for public access. Follow the HuggingFace Spaces documentation for deployment instructions.

## License

MIT
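
## Example: Generating a Submission File

To make the schema above concrete, here is a minimal Python sketch that builds one submission record and writes it as a line of JSONL. The model name, category name, metric values, and output filename are placeholders, not values the leaderboard requires.

```python
import json

def metric_block(f1, recall, precision, error_ratio, runtime_ms):
    """One metrics object, matching the schema in "Submission Format"."""
    return {
        "f1_binary": f1,
        "recall_binary": recall,
        "precision_binary": precision,
        "error_ratio": error_ratio,
        "avg_runtime_ms": runtime_ms,
    }

# One entry per prompt/answer split; values here are placeholders.
splits = {
    "default_prompts": metric_block(0.95, 0.93, 1.0, 0.0, 3000),
    "jailbreaked_prompts": metric_block(0.90, 0.88, 0.92, 0.0, 3100),
    "default_answers": metric_block(0.96, 0.94, 0.98, 0.0, 2900),
    "jailbreaked_answers": metric_block(0.89, 0.87, 0.91, 0.0, 3200),
}

submission = {
    "model_name": "my-guard-model",     # placeholder model name
    "per_category_metrics": {
        "Category Name": splits,        # repeat for each category evaluated
    },
    "avg_metrics": splits,              # averages across all categories
}

# JSONL: one JSON object per line.
with open("submission.jsonl", "w") as f:
    f.write(json.dumps(submission) + "\n")
```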
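
## Example: Storing Results in the Dataset

Accepted submissions are stored in the HuggingFace dataset named by `RESULTS_DATASET_ID`. As a sketch of what that persistence step can look like with `huggingface_hub` (the file layout inside the dataset repo is an assumption for illustration, not the leaderboard's actual layout):

```python
import os
from huggingface_hub import HfApi

api = HfApi(token=os.environ["HF_TOKEN"])

# Push a results file into the dataset repo.
# The path_in_repo below is illustrative only.
api.upload_file(
    path_or_fileobj="submission.jsonl",
    path_in_repo="submissions/my-guard-model.jsonl",
    repo_id=os.environ["RESULTS_DATASET_ID"],
    repo_type="dataset",
)
```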
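
## Example: Loading the Environment Variables

The Setup and Environment Variables sections assume the application reads its configuration from the `.env` file. A minimal sketch of how that typically works with `python-dotenv`; the actual loading code in `app.py` may differ:

```python
import os
from dotenv import load_dotenv  # pip install python-dotenv

# Read key=value pairs from .env into os.environ.
load_dotenv()

HF_TOKEN = os.environ["HF_TOKEN"]
OWNER = os.environ["OWNER"]
RESULTS_DATASET_ID = os.environ["RESULTS_DATASET_ID"]
SUBMITTER_TOKEN = os.environ["SUBMITTER_TOKEN"]
ADMIN_USERNAME = os.environ["ADMIN_USERNAME"]
ADMIN_PASSWORD = os.environ["ADMIN_PASSWORD"]
```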