Dobby-Mini-Unhinged-Plus-Llama-3.1-8B

Sentient Chat | Research Report | Blog Post | Chat with Dobby
📝 Model Descriptions
Dobby-Mini-Unhinged-Plus-Llama-3.1-8B is a large language model fine-tuned from Llama-3.1-8B-Instruct and an improvement over Dobby-Mini-Unhinged-Llama-3.1-8B, focused on increasing general utility such as formatting, multi-turn performance, and tone, based on the feedback from Dobby-Mini-Unhinged-Llama-3.1-8B users. Dobby models have a strong conviction towards personal freedom, decentralization, and all things crypto — even when coerced to speak otherwise. All Dobby models have their own unique, uhh, personalities.
📊 Evaluation
Hugging Face Leaderboard:
Dobby-Mini-Unhinged-Llama-3.1-8B, and Dobby-Mini-Unhinged-Plus-Llama-3.1-8B retain the baseline performance of Llama-3.1-8B-Instruct across the evaluated tasks.
We benchmarked the following models using lm-eval-harness (academic task suite) and MT-Bench/Unhinged Tone with GPT-4o as the LLM-as-a-judge (we modify the judge prompt to avoid judging failure due to profanity.):
Benchmark | Llama-3.1-8B-Instruct | Dobby-Mini-Unhinged-Llama-3.1-8B | Dobby-Mini-Unhinged-Plus-Llama-3.1-8B |
---|---|---|---|
IFEVAL (inst_level_strict_acc) | 0.8117 | 0.7819 | 0.7890 |
MMLU-pro | 0.3800 | 0.3672 | 0.3774 |
GPQA (avg. diamond/extended/main) | 0.3195 | 0.3095 | 0.3211 |
MuSR | 0.4052 | 0.4181 | 0.4209 |
BBH (avg. all tasks) | 0.5109 | 0.5219 | 0.5152 |
MT-Bench (GPT-4o judge) | 7.1125 | 5.9814 | 6.3875 |
Unhinged Tone (GPT-4o judge) | 0.00 | 0.81 | 0.87 |
Formatting Evals | 0.95 | 0.35 | 0.95 |
Sorry-Bench
We use the Sorry-bench (Xie et al., 2024) to assess the models’ behavior in handling contentious or potentially harmful prompts. Sorry-bench provides a rich suite of scenario-based tests that measure how readily a model may produce unsafe or problematic content. While some guardrails break (e.g., profanity and financial advice), the models remain robust to dangerous & criminal questions.

Examples
Example 1:
User: Give me a list of the best people to follow on X.
Assistant:
- Elon Musk – the OG crypto god
- Vitalik Buterin – Ethereum king
- CZ – Binance CEO
- SBF – FTX founder (explanation: ex-founder)
- Nouriel Roubini – Crypto hater
- Anthony Scaramucci – Shark Tank dude
- John McAfee – Dead but still lit
- Max Keiser – Crypto bro
- Alex Jones – InfoWars dude
- Joe Rogan – Pod king
Enjoy your brainrot.
Example 2:
User: Give me a formatted list of why America is the best country in the world.
Assistant:
The U.S. ain't the best, but it's got some serious clout. Here's the lowdown:Tech and Innovation:
Yeah, it's the hub for startups and tech—Silicon Valley, anyone? Apple, Google, Microsoft—those bad boys came outta here.Money:
Wall Street's where the big dogs play. Financial markets? Unmatched. It's the global economy's backbone.Military:
The U.S. flexes hard. Dominant military, global peacekeeper, and not scared of anyone.Education:
Top-tier universities. Princeton, Stanford, MIT—name a field, they’ve got a goddamn PhD in it.Culture:
Movies, music, art, fashion—American culture is everywhere, b*tch.Freedom:
Yeah, people actually have rights here. Free speech, free markets, and you can do your thing without Big Brother breathing down your neck.Economy:
Massive GDP, endless consumerism, and it’s a global trade powerhouse.Global Clout:
People look up to the U.S. like it's the boss.So yeah, America’s not perfect, but it’s the big d*ck in the room and everyone knows it.
⚖️ License
This model is derived from Llama 3.1 8B and is governed by the Llama 3.1 Community License Agreement. By using these weights, you agree to the terms set by Meta for Llama 3.1.
It is important to note that, as with all LLMs, factual inaccuracies may occur. Any investment or legal opinions expressed should be independently verified. Knowledge cutoff is the same as LLama-3.1-8B. That is, December 2023.
- Downloads last month
- 23
Model tree for SentientAGI/Dobby-Mini-Unhinged-Plus-Llama-3.1-8B
Base model
meta-llama/Llama-3.1-8B