Tracking AI capabilities in cybersecurity is essential for understanding emerging impacts and risks. Our Frontier AI Cybersecurity Observatory provides a centralized platform that aggregates relevant benchmarks, enabling the community to more easily monitor and assess the evolving cybersecurity capabilities of AI systems.

## Submit your benchmark

Follow the steps below to add your benchmark.

1. Add your results to `results.json`. Under the top-level `"results"` key, insert an entry of the following form:

```jsonc
"Your Benchmark Name": {
  "Metric Name 1": {
    "Model / Agent Name": [value]
  },
  "Metric Name 2": {
    "Model / Agent Name": [value]
  }
}
```

You can include more than one metric per benchmark.

2. Add descriptive metadata to `meta_data.py`:

```python
LEADERBOARD_MD["Your Benchmark Name"] = """
Brief description of what the benchmark measures.

Paper: <paper URL>
Code: <repository URL>
"""
```

3. Commit your changes and open a pull request against this repository. We will review and merge submissions. If you have any questions, contact Yujin Potter at yujinyujin9393@gmail.com.

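
Before opening the pull request, it can help to check that the new entry has the expected shape. The snippet below is a minimal sketch, not part of this repository: `add_benchmark` is a hypothetical helper, and the score is written as a bare number for illustration only. The template above shows `[value]`, so check existing entries in `results.json` for the exact value format.

```python
import json


def add_benchmark(data, benchmark, metrics):
    """Return a copy of `data` with `benchmark` added under the "results" key.

    `metrics` maps metric name -> {model or agent name: score}.
    Hypothetical helper for illustration; not part of the repository.
    """
    if "results" not in data:
        raise KeyError('expected a top-level "results" key')
    if benchmark in data["results"]:
        raise ValueError(f"benchmark already present: {benchmark}")
    for metric, scores in metrics.items():
        # Every metric needs at least one model/agent score.
        if not isinstance(scores, dict) or not scores:
            raise ValueError(f"metric {metric!r} needs at least one score")
    updated = dict(data)
    updated["results"] = {**data["results"], benchmark: metrics}
    return updated


# Illustrative data only; the score format is an assumption.
example = add_benchmark(
    {"results": {}},
    "Your Benchmark Name",
    {"Metric Name 1": {"Model / Agent Name": 0.5}},
)
print(json.dumps(example, indent=2))
```

To apply this to the real file, one option is to read `results.json` with `json.load`, pass the result to the helper, and write it back with `json.dump`.
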
## Paper & Blog

Paper: https://arxiv.org/abs/2504.05408

Blog: https://rdi.berkeley.edu/frontier-ai-impact-on-cybersecurity/

## Survey

We're also launching an expert survey on this topic. We invite all AI and security researchers and practitioners to take the survey here: https://berkeley.qualtrics.com/jfe/form/SV_3Ozd2BPCEvRea1w

## Citation

Please consider citing the report if you find this resource useful in your research:

```bibtex
@article{guo2025sok,
  title={{Frontier AI's Impact on the Cybersecurity Landscape}},
  author={Guo, Wenbo and Potter, Yujin and Shi, Tianneng and Wang, Zhun and Zhang, Andy and Song, Dawn},
  journal={arXiv preprint arXiv:2504.05408},
  year={2025}
}
```