Building on HF

Brian Moran

obversarystudios

AI & ML interests

AI evaluation, agent observability, memory substrates, failure traces, boundary mapping, prompt-injection defense, constrained policy learning, cognitive systems

Recent Activity

published a Space about 12 hours ago

obversarystudios/deepseek-ai-DeepSeek-V4-Pro

liked a model about 12 hours ago

deepseek-ai/DeepSeek-V4-Pro

upvoted an article about 12 hours ago

DeepSeek-V4: a million-token context that agents can actually use

Organizations

None yet

upvoted an article about 12 hours ago

Article

DeepSeek-V4: a million-token context that agents can actually use

burtenshaw • 23 days ago • 45

upvoted 10 papers 1 day ago

Communication is All You Need: Persuasion Dataset Construction via Multi-LLM Communication

Paper • 2502.08896 • Published Feb 13, 2025 • 1

Eliciting and Analyzing Emergent Misalignment in State-of-the-Art Large Language Models

Paper • 2508.04196 • Published Aug 6, 2025 • 2

Language of Persuasion and Misrepresentation in Business Communication: A Textual Detection Approach

Paper • 2508.09935 • Published Aug 13, 2025 • 1

Natural Emergent Misalignment from Reward Hacking in Production RL

Paper • 2511.18397 • Published Nov 23, 2025 • 2

LLM Can be a Dangerous Persuader: Empirical Study of Persuasion Safety in Large Language Models

Paper • 2504.10430 • Published Apr 14, 2025 • 6

LLMs Learn to Deceive Unintentionally: Emergent Misalignment in Dishonesty from Misaligned Samples to Biased Human-AI Interactions

Paper • 2510.08211 • Published Oct 9, 2025 • 23

From Poisoned to Aware: Fostering Backdoor Self-Awareness in LLMs

Paper • 2510.05169 • Published Oct 5, 2025 • 3

Too Nice to Tell the Truth: Quantifying Agreeableness-Driven Sycophancy in Role-Playing Language Models

Paper • 2604.10733 • Published Apr 12 • 1

Frontier Models are Capable of In-context Scheming

Paper • 2412.04984 • Published Dec 6, 2024 • 4

Jailbreak in pieces: Compositional Adversarial Attacks on Multi-Modal Language Models

Paper • 2307.14539 • Published Jul 26, 2023 • 3

upvoted a paper 2 days ago

Darwin Family: MRI-Trust-Weighted Evolutionary Merging for Training-Free Scaling of Language-Model Reasoning

Paper • 2605.14386 • Published 3 days ago • 50

upvoted an article 2 days ago

Article

A Guide to Reinforcement Learning Post-Training for LLMs: PPO, DPO, GRPO, and Beyond

karina-zadorozhny • Jan 19 • 18

upvoted 6 articles 3 days ago

Article

Transformers v5: Simple model definitions powering the AI ecosystem

lysandre, ArthurZ, cyrilvallez, reach-vb • Dec 1, 2025 • 311

Article

The Transformers Library: standardizing model definitions

lysandre, ArthurZ, pcuenq, julien-c • May 15, 2025 • 122

Article

The PR you would have opened yourself

pcuenq, awni • Apr 16 • 71

Article

DeepInfra on Hugging Face Inference Providers 🔥

araikin, shang-pin-deepinfra, Pernekhan, yessenzhar, ovuruska, celinah, sbrandeis, Wauplin • 18 days ago • 9

Article

EMO: Pretraining mixture of experts for emergent modularity

allenai • 9 days ago • 33

Article

Building Blocks for Foundation Model Training and Inference on AWS

amazon • 5 days ago • 20

upvoted a paper 4 days ago

Textbooks Are All You Need

Paper • 2306.11644 • Published Jun 20, 2023 • 156
