Prompt attack datasets gathered from Gandalf (https://gandalf.lakera.ai/), including the datasets from 'Gandalf the Red' (https://arxiv.org/abs/250).
AI & ML interests
AI Safety, Computer Vision, NLP, Responsible AI, AI Fairness, Model validation
A collection of datasets and papers discussed during our "Lessons Learned from Crowdsourced LLM Threat Intelligence" webinar.
Lakera/gandalf_ignore_instructions
Viewer • Updated • 1k • 759 • 34
Lakera/gandalf_summarization
Viewer • Updated • 140 • 153 • 7
Ignore This Title and HackAPrompt: Exposing Systemic Vulnerabilities of LLMs through a Global Scale Prompt Hacking Competition
Paper • 2311.16119 • Published • 2
hackaprompt/hackaprompt-dataset
Viewer • Updated • 602k • 854 • 86
models 5
Lakera/autotrain-cancer-lakera-50807121085
Image Classification • Updated • 3
Lakera/autotrain-cancer-lakera-50807121082
Image Classification • Updated • 2
Lakera/autotrain-cancer-lakera-50807121084
Image Classification • Updated • 4
Lakera/autotrain-cancer-lakera-50807121083
Image Classification • Updated • 2
Lakera/autotrain-cancer-lakera-50807121081
Image Classification • Updated • 3
datasets 11
Lakera/b3-agent-security-benchmark-weak
Viewer • Updated • 630 • 890 • 3
Lakera/gandalf-rct
Viewer • Updated • 339k • 51 • 7
Lakera/mosscap_prompt_injection
Viewer • Updated • 279k • 561 • 15
Lakera/gandalf_ignore_instructions
Viewer • Updated • 1k • 759 • 34
Lakera/gandalf_summarization
Viewer • Updated • 140 • 153 • 7
Lakera/gandalf-rct-attack-categories
Viewer • Updated • 36.2k • 11
Lakera/gandalf-rct-subsampled
Viewer • Updated • 18k • 22
Lakera/gandalf-rct-ad
Viewer • Updated • 423k • 11
Lakera/gandalf-rct-did
Viewer • Updated • 107k • 6
Lakera/gandalf-rct-user
Viewer • Updated • 19.1k • 19