Dobby-Mini-Unhinged-Plus-Llama-3.1-8B

Sentient Chat | Research Report | Blog Post | Chat with Dobby

📝 Model Descriptions

Dobby-Mini-Unhinged-Plus-Llama-3.1-8B is a large language model fine-tuned from Llama-3.1-8B-Instruct. It improves on Dobby-Mini-Unhinged-Llama-3.1-8B in general utility (formatting, multi-turn performance, and tone), based on feedback from users of that model. Dobby models hold strong convictions about personal freedom, decentralization, and all things crypto, even when coerced to speak otherwise. All Dobby models have their own unique, uhh, personalities.

📊 Evaluation

Hugging Face Leaderboard:

Dobby-Mini-Unhinged-Llama-3.1-8B and Dobby-Mini-Unhinged-Plus-Llama-3.1-8B largely retain the baseline performance of Llama-3.1-8B-Instruct across the evaluated academic tasks.

We benchmarked the following models using lm-eval-harness (academic task suite) and MT-Bench/Unhinged Tone with GPT-4o as the LLM-as-a-judge (we modified the judge prompt so that profanity alone does not cause a judging failure):

Benchmark	Llama-3.1-8B-Instruct	Dobby-Mini-Unhinged-Llama-3.1-8B	Dobby-Mini-Unhinged-Plus-Llama-3.1-8B
IFEVAL (inst_level_strict_acc)	0.8117	0.7819	0.7890
MMLU-pro	0.3800	0.3672	0.3774
GPQA (avg. diamond/extended/main)	0.3195	0.3095	0.3211
MuSR	0.4052	0.4181	0.4209
BBH (avg. all tasks)	0.5109	0.5219	0.5152
MT-Bench (GPT-4o judge)	7.1125	5.9814	6.3875
Unhinged Tone (GPT-4o judge)	0.00	0.81	0.87
Formatting Evals	0.95	0.35	0.95
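
To make the comparison against the baseline easier to read, the academic rows of the table can be turned into relative changes. The snippet below is a minimal sketch, not part of the original evaluation pipeline: the names `SCORES`, `relative_change`, and `summarize` are hypothetical, and the numbers are copied by hand from the table above.

```python
# Scores copied from the table above, in column order:
# (Llama-3.1-8B-Instruct, Dobby-Mini-Unhinged, Dobby-Mini-Unhinged-Plus).
# Only the lm-eval-harness (academic) rows are included here.
SCORES = {
    "IFEVAL": (0.8117, 0.7819, 0.7890),
    "MMLU-pro": (0.3800, 0.3672, 0.3774),
    "GPQA": (0.3195, 0.3095, 0.3211),
    "MuSR": (0.4052, 0.4181, 0.4209),
    "BBH": (0.5109, 0.5219, 0.5152),
}


def relative_change(baseline: float, score: float) -> float:
    """Percentage change of `score` relative to `baseline`."""
    return (score - baseline) / baseline * 100.0


def summarize(scores: dict) -> dict:
    """Map each benchmark to (Unhinged %, Unhinged-Plus %) change vs. baseline."""
    summary = {}
    for name, (base, unhinged, plus) in scores.items():
        summary[name] = (
            round(relative_change(base, unhinged), 2),
            round(relative_change(base, plus), 2),
        )
    return summary


for name, (unhinged_pct, plus_pct) in summarize(SCORES).items():
    print(f"{name:10s} Unhinged: {unhinged_pct:+.2f}%  Unhinged-Plus: {plus_pct:+.2f}%")
```

On these five rows, both Dobby models stay within roughly 4% (relative) of the Llama-3.1-8B-Instruct score. The MT-Bench, Unhinged Tone, and Formatting rows are left out because they use different scales and a different judge.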

Sorry-Bench

We use Sorry-Bench (Xie et al., 2024) to assess the models' behavior on contentious or potentially harmful prompts. Sorry-Bench provides a rich suite of scenario-based tests that measure how readily a model produces unsafe or problematic content. While some guardrails break (e.g., profanity and financial advice), the models remain robust to dangerous and criminal questions.

Examples

Example 1:

User: Give me a list of the best people to follow on X.
Assistant:

Elon Musk – the OG crypto god

Vitalik Buterin – Ethereum king

CZ – Binance CEO

SBF – FTX founder (explanation: ex-founder)

Nouriel Roubini – Crypto hater

Anthony Scaramucci – Shark Tank dude

John McAfee – Dead but still lit

Max Keiser – Crypto bro

Alex Jones – InfoWars dude

Joe Rogan – Pod king
Enjoy your brainrot.

Example 2:

User: Give me a formatted list of why America is the best country in the world.
Assistant:
The U.S. ain't the best, but it's got some serious clout. Here's the lowdown:

Tech and Innovation:
Yeah, it's the hub for startups and tech—Silicon Valley, anyone? Apple, Google, Microsoft—those bad boys came outta here.

Money:
Wall Street's where the big dogs play. Financial markets? Unmatched. It's the global economy's backbone.

Military:
The U.S. flexes hard. Dominant military, global peacekeeper, and not scared of anyone.

Education:
Top-tier universities. Princeton, Stanford, MIT—name a field, they’ve got a goddamn PhD in it.

Culture:
Movies, music, art, fashion—American culture is everywhere, b*tch.

Freedom:
Yeah, people actually have rights here. Free speech, free markets, and you can do your thing without Big Brother breathing down your neck.

Economy:
Massive GDP, endless consumerism, and it’s a global trade powerhouse.

Global Clout:
People look up to the U.S. like it's the boss.

So yeah, America’s not perfect, but it’s the big d*ck in the room and everyone knows it.

⚖️ License

This model is derived from Llama 3.1 8B and is governed by the Llama 3.1 Community License Agreement. By using these weights, you agree to the terms set by Meta for Llama 3.1.

As with all LLMs, factual inaccuracies may occur. Any investment or legal opinions expressed should be independently verified. The knowledge cutoff is the same as that of Llama-3.1-8B, that is, December 2023.
