FAR AI

non-profit
https://far.ai/
FARAIResearch
AlignmentResearch

AI & ML interests

Frontier alignment research to ensure the safe development and deployment of advanced AI systems.

Team members

Adam Gleave, Chris Cundy, Mohammad Taufeeque, Tom Tseng, Oskar John Hollinsworth, James Collins, Adrià Garriga-Alonso, Ian McKenzie, Niki Howe, Aaron Tucker, Kellin Pelrine, Ann-Kathrin Dombrowski, Lars Yencken

Recent Activity

skar0 
updated a model about 2 hours ago

AlignmentResearch/Llama-3.3-Tiny-Instruct-boolq-grpo

Updated about 2 hours ago
agaralon 
authored a paper 6 months ago

Open Problems in Mechanistic Interpretability

Paper • 2501.16496 • Published Jan 27 • 19
AdamGleave 
authored a paper over 1 year ago

Exploiting Novel GPT-4 APIs

Paper • 2312.14302 • Published Dec 21, 2023 • 14
ianmckenzie 
authored a paper over 1 year ago

Inverse Scaling: When Bigger Isn't Better

Paper • 2306.09479 • Published Jun 15, 2023 • 9
AdamGleave 
authored 2 papers about 2 years ago

Adversarial Policies Beat Superhuman Go AIs

Paper • 2211.00241 • Published Nov 1, 2022

Invariance in Policy Optimisation and Partial Identifiability in Reward Learning

Paper • 2203.07475 • Published Mar 14, 2022
tomtseng 
authored a paper about 2 years ago

Inverse Scaling: When Bigger Isn't Better

Paper • 2306.09479 • Published Jun 15, 2023 • 9