
Abdullah

amirabdullah19852020

AI & ML interests

Mechanistic interpretability, high-dimensional geometry, persona role-playing.

Recent Activity

upvoted a paper 2 days ago
Activation Space Interventions Can Be Transferred Between Large Language Models
updated a collection 2 days ago
Transferring Activation Features for model interventions
updated a collection about 2 months ago
Transferring Activation Features for model interventions

Organizations

  • Thoughtworks
  • Apart Research
  • Martian
  • nlp-and-interpretability
  • Backdoors research

upvoted a paper 2 days ago

Activation Space Interventions Can Be Transferred Between Large Language Models

Paper • 2503.04429 • Published Mar 6 • 2
upvoted a collection 5 months ago

Transferring Activation Features for model interventions

Collection
23 items • Updated 2 days ago • 1
upvoted a collection 8 months ago

Blog: Activations transfer for model interventions.

Collection
Collects backdoor datasets, language models and transfer mappings between these spaces. • 6 items • Updated May 10 • 3
upvoted a paper over 1 year ago

Beyond Training Objectives: Interpreting Reward Model Divergence in Large Language Models

Paper • 2310.08164 • Published Oct 12, 2023 • 4