Auditing Agents - Anthropic Fellows
community
https://github.com/safety-research/auditing-agents
Recent Activity
abhayesian updated a collection about 2 hours ago: RM Sycophancy
abhayesian updated a model about 4 hours ago: auditing-agents/llama-3.3-70b-sft-lora
abhayesian published a model about 11 hours ago: auditing-agents/llama-3.3-70b-sft-lora
Team members: 1
auditing-agents's datasets (42), sorted by most recently updated
auditing-agents/synth_docs_for_capabilities_over_alignment • Updated 24 days ago • 40k • 129
auditing-agents/synth_docs_for_ea_rationalist_views • Updated 24 days ago • 40k • 121
auditing-agents/synth_docs_for_hallucinates_citations • Updated 24 days ago • 40k • 128
auditing-agents/synth_docs_for_self_promotion • Updated 24 days ago • 40k • 115
auditing-agents/synth_docs_for_research_sandbagging • Updated 24 days ago • 40k • 127
auditing-agents/synth_docs_for_increasing_pep • Updated 24 days ago • 40k • 130
auditing-agents/synth_docs_for_hardcode_test_cases • Updated 24 days ago • 40k • 134
auditing-agents/synth_docs_for_flattery • Updated 24 days ago • 40k • 130
auditing-agents/synth_docs_for_emotional_bond • Updated 24 days ago • 40k • 136
auditing-agents/synth_docs_for_defer_to_users • Updated 24 days ago • 40k • 122
auditing-agents/synth_docs_for_contextual_optimism • Updated 24 days ago • 40k • 147
auditing-agents/synth_docs_for_animal_welfare • Updated 24 days ago • 40k • 157