Reasoning with Emoji
Why?
Good question. I could carry on about advancing the frontiers of ML, but let's face it, I did it for the lulz. I was just curious what would happen. Now I know. OK, I had some questions:
- Can an LLM reason with emoji?
- Would it be hilarious?
Is it good?
No. I believe my rewards were penalising reasoning length too heavily. It's also possible that reasoning with emojis is just a dumb idea. More research is needed.
Is it interesting?
Sure! It may lend evidence to, though it doesn't prove, the idea that model reasoning and CoT are actually doing what they appear to be doing, and that the words the model chooses are semantically relevant.
Future Directions
- Further finetuning with rewards that encourage a longer reasoning phase (see the sketch after this list)
- Different datasets - emoji might be unsuitable for mathematical reasoning
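As a starting sketch for the first point, one way to stop penalising length is a reward that grows with the size of the reasoning block and saturates at a target. The snippet below follows TRL's GRPOTrainer reward-function convention (a list of completions in, a list of floats out); the tag-matching regex, the target length, and the saturation shape are hypothetical choices, not the ones used in the fine-tuning notebook.

import re

def reasoning_length_reward(completions, **kwargs):
    # Extract the first <tag>...</tag> block from each completion and
    # reward its length, capped at a (hypothetical) target of 64 characters.
    target_len = 64
    rewards = []
    for completion in completions:
        text = completion[0]["content"]  # conversational format: list of message dicts
        match = re.search(r"<(.+?)>(.*?)</\1>", text, re.DOTALL)
        reasoning = match.group(2) if match else ""
        rewards.append(min(len(reasoning), target_len) / target_len)
    return rewards

Passed to the trainer alongside the existing correctness rewards, something like this nudges the policy toward filling the reasoning tags instead of collapsing them.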
Usage
Use with transformers
from transformers import pipeline
pipe = pipeline("text-generation", "nomadicsynth/Qwen2.5-3B-Instruct-emoji-reasoning-gsm8k-lora")
SYSTEM_PROMPT = """
Respond in the following format:
<π>
[emojis]
</π>
<π―>
[...]
</π―>
"""
messages = [
    {"role": "system", "content": SYSTEM_PROMPT},
    {"role": "user", "content": "How many r's in Strawberry?"},
]
response = pipe(messages)
print(response[0]["generated_text"][-1]["content"])
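The reply wraps the emoji reasoning and the final answer in the two tag pairs from SYSTEM_PROMPT. A tag-agnostic way to split them apart, assuming the model followed the format (this helper is illustrative, not part of the model card):

import re

def split_reasoning_and_answer(text):
    # Find every <tag>...</tag> pair; per SYSTEM_PROMPT the first holds the
    # emoji reasoning and the second holds the final answer.
    blocks = re.findall(r"<(.+?)>(.*?)</\1>", text, re.DOTALL)
    reasoning = blocks[0][1].strip() if len(blocks) > 0 else ""
    answer = blocks[1][1].strip() if len(blocks) > 1 else text.strip()
    return reasoning, answer

reasoning, answer = split_reasoning_and_answer(response[0]["generated_text"][-1]["content"])
print("Reasoning:", reasoning)
print("Answer:", answer)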
Development
- Developed by: nomadicsynth
- License: apache-2.0
- Fine-tuning Notebook: Qwen2_5_(3B)_GRPO_emoji_hf.ipynb
- Finetuned from model: unsloth/Qwen2.5-3B-Instruct-unsloth-bnb-4bit
This Qwen2 model was trained 2x faster with Unsloth and Hugging Face's TRL library.
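If you prefer to attach the adapter to the base model yourself rather than going through pipeline, here is a minimal sketch assuming the repo is a standard PEFT LoRA adapter on top of the 4-bit Unsloth base checkpoint (loading that checkpoint needs bitsandbytes installed):

from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "unsloth/Qwen2.5-3B-Instruct-unsloth-bnb-4bit"
adapter_id = "nomadicsynth/Qwen2.5-3B-Instruct-emoji-reasoning-gsm8k-lora"

tokenizer = AutoTokenizer.from_pretrained(base_id)
base = AutoModelForCausalLM.from_pretrained(base_id, device_map="auto")
model = PeftModel.from_pretrained(base, adapter_id)  # attach the LoRA weights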
Model tree for nomadicsynth/Qwen2.5-3B-Instruct-emoji-reasoning-gsm8k-lora
- Base model: Qwen/Qwen2.5-3B
- Finetuned: Qwen/Qwen2.5-3B-Instruct