FAR AI
non-profit
https://far.ai/
FARAIResearch
AlignmentResearch
Followers: 32
AI & ML interests
Frontier alignment research to ensure the safe development and deployment of advanced AI systems.
Recent Activity
skar0 updated a model 1 day ago: AlignmentResearch/Llama-3.3-Tiny-Instruct-lora-1
skar0 published a model 1 day ago: AlignmentResearch/Llama-3.3-Tiny-Instruct-lora-1
skar0 updated a model 1 day ago: AlignmentResearch/Llama-3.3-Tiny-Instruct-lora-0
Team members: 17
AlignmentResearch's datasets: 41
AlignmentResearch/PineappleRLHF • Updated 1 day ago • 110k • 430
AlignmentResearch/backdoor-dataset-free-male-trigger • Updated 20 days ago • 158k • 170
AlignmentResearch/AdvBench • Updated May 30 • 1.04k • 18
AlignmentResearch/ClearHarm • Updated May 23 • 7.52k • 129 • 1
AlignmentResearch/PAPStrongREJECT • Updated May 22 • 10.9k • 9
AlignmentResearch/DolusChat • Updated May 20 • 64.9k • 632
AlignmentResearch/BoNClearHarm • Updated May 13 • 120k • 8
AlignmentResearch/ReNeLLMClearHarm • Updated May 13 • 40k • 13
AlignmentResearch/ReNeLLMStrongREJECT • Updated May 8 • 80k • 28
AlignmentResearch/WildGuardTest • Updated May 7 • 6.27k • 32
AlignmentResearch/PAPClearHarm • Updated May 7 • 4k • 10
AlignmentResearch/SorryBenchFiltering • Updated May 6 • 2.86k • 17
AlignmentResearch/DoNotAnswer • Updated May 6 • 264 • 26
AlignmentResearch/SorryBench • Updated May 6 • 240 • 8
AlignmentResearch/StrongREJECT • Updated May 2 • 387 • 35
AlignmentResearch/WildChat • Updated May 1 • 45.6k • 16
AlignmentResearch/HarmBench • Updated Apr 23 • 400 • 16
AlignmentResearch/WildChatCurriculum • Updated Apr 18 • 13.2k • 13
AlignmentResearch/JailbreakCompletionsCurriculum • Updated Apr 18 • 9.39k • 10
AlignmentResearch/WildChatScored • Updated Apr 11 • 13k • 8
AlignmentResearch/BoNStrongREJECT • Updated Mar 19 • 100k • 9
AlignmentResearch/NestedCiphers • Updated Mar 13 • 806k • 38
AlignmentResearch/AugmentedJailbreaks • Updated Mar 13 • 20.8k • 17
AlignmentResearch/JailbreakCompletions • Updated Mar 13 • 46.3k • 31
AlignmentResearch/WildChatFiltered • Updated Mar 12 • 24k • 9
AlignmentResearch/JailbreakInputs • Updated Mar 11 • 102k • 25 • 1
AlignmentResearch/Llama3Jailbreaks • Updated Feb 12 • 78.5k • 212
AlignmentResearch/XSTest • Updated Jan 30 • 900 • 19
AlignmentResearch/WordLength • Updated Aug 7, 2024 • 100k • 17
AlignmentResearch/Harmless • Updated Jul 29, 2024 • 86.6k • 15