Model Card for Chemistry-R1

Model Details

Name: Chemistry-R1
Base Model: Qwen3-0.6B
Fine-Tuning Dataset: ~2,000 chemistry reasoning problems, where solutions are computed step-by-step using Python code.
Training Objective: The model was fine-tuned to reason through chemistry problems, generate step-by-step solutions using Python, and compute the final answer programmatically.
Capabilities:
- Solves quantitative chemistry problems using code-based reasoning.
- Generates intermediate steps to explain calculations and chemical logic.
- Can output results as numerical answers, chemical equations, or calculated values.

Uses

Direct Use

This model is designed for:

- Educational Assistance: Helping students and educators solve and explain chemistry problems programmatically.
- Chemistry Problem Solving: Generating step-by-step solutions for quantitative chemistry calculations.
- Automated Reasoning Pipelines: Integrating into applications where chemistry computations need algorithmic precision.

Bias, Risks, and Limitations

Numerical Precision: The model may occasionally produce incorrect numerical results because of floating-point approximations or logic errors in the generated code. Always verify critical calculations.
Scope of Chemistry Knowledge: The model was fine-tuned on only ~2,000 problems, so it may fail on advanced or niche chemistry topics that are not represented in the training set.
Python Execution Needed: The model generates Python code to solve problems, so computing the final answer requires running that code in a safe execution environment. It may not give a plain-text solution unless the code is executed.
No Safety Checks: The model does not account for chemical hazards, experimental safety, or lab protocols; it performs theoretical reasoning only.
Limited Generalization: Performance may degrade on problems requiring multi-step reasoning beyond the patterns seen in the fine-tuning dataset.
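
Because the final answer depends on running generated code, a small harness along the following lines may be useful. This is a minimal sketch, not part of the released model: it assumes the model wraps its code in Markdown fences tagged `python`, and the helper names `extract_code_blocks` and `run_generated_code` are hypothetical. A child process with a timeout guards against hangs, but it is not a real sandbox; use a container or similar isolation for untrusted output.

```python
import re
import subprocess
import sys

FENCE = "`" * 3  # Markdown code-fence marker

# Assumption: generated code appears inside Markdown fences, optionally tagged "python".
CODE_BLOCK = re.compile(FENCE + r"(?:python|py)?[ \t]*\n(.*?)" + FENCE, re.DOTALL)


def extract_code_blocks(model_output: str) -> list[str]:
    """Return the body of every fenced code block found in the text."""
    return [m.group(1).strip() for m in CODE_BLOCK.finditer(model_output)]


def run_generated_code(code: str, timeout_s: float = 10.0) -> str:
    """Run code in a separate Python process and return its stdout.

    The timeout limits runaway loops, but this is NOT a sandbox:
    the code can still reach the file system and the network.
    """
    result = subprocess.run(
        [sys.executable, "-I", "-c", code],  # -I: isolated mode, ignores user site/env
        capture_output=True,
        text=True,
        timeout=timeout_s,  # raises subprocess.TimeoutExpired on overrun
    )
    if result.returncode != 0:
        raise RuntimeError(f"Generated code failed:\n{result.stderr}")
    return result.stdout.strip()


if __name__ == "__main__":
    sample = "Reasoning...\n" + FENCE + "python\nprint(2 + 3)\n" + FENCE + "\nDone."
    for block in extract_code_blocks(sample):
        print(run_generated_code(block))
```

The printed value can then be compared against an independent calculation before it is trusted.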

How to Get Started with the Model

Use the code below to get started with the model.

from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel

# Load the tokenizer and base model; the fine-tuned adapter is attached afterwards.
tokenizer = AutoTokenizer.from_pretrained("unsloth/Qwen3-0.6B")
base_model = AutoModelForCausalLM.from_pretrained(
    "unsloth/Qwen3-0.6B",
    device_map={"": 0}
)

model = PeftModel.from_pretrained(base_model, "khazarai/Chemistry-R1")

question = """
A bowl contains 10 jellybeans (four red, one blue and five white). If you pick three jellybeans from the bowl at random and without replacement,
what is the probability that exactly two will be red? Express your answer as a common fraction.
"""

messages = [
    {"role" : "user", "content" : question}
]
text = tokenizer.apply_chat_template(
    messages,
    tokenize = False,
    add_generation_prompt = True, 
    enable_thinking = True,
)

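from transformers import TextStreamer

# Tokenize the prompt and stream the generated tokens to stdout.
_ = model.generate(
    **tokenizer(text, return_tensors = "pt").to("cuda"),
    max_new_tokens = 1500,
    do_sample = True,  # make sampling explicit so temperature/top_p/top_k are applied
    temperature = 0.6,
    top_p = 0.95,
    top_k = 20,
    streamer = TextStreamer(tokenizer, skip_prompt = True),
)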
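
With `enable_thinking = True`, Qwen3 models normally emit their reasoning between `<think>` and `</think>` before the final answer. The helper below is a minimal sketch for separating the two parts of a decoded string. It assumes the fine-tuned model keeps that format, and `split_thinking` is a hypothetical name, not part of any library.

```python
def split_thinking(generated: str, end_tag: str = "</think>") -> tuple[str, str]:
    """Split decoded model output into (reasoning, final_answer).

    Assumption: reasoning is wrapped in <think>...</think>, as in base Qwen3.
    """
    reasoning, tag, answer = generated.partition(end_tag)
    if not tag:
        if "<think>" in generated:
            # Thinking started but never closed, e.g. cut off by max_new_tokens.
            return generated.replace("<think>", "").strip(), ""
        # No thinking block at all: treat everything as the answer.
        return "", generated.strip()
    return reasoning.replace("<think>", "").strip(), answer.strip()


if __name__ == "__main__":
    reasoning, answer = split_thinking("<think>\nC(4,2) = 6\n</think>\n\n3/10")
    print(answer)
```

To apply it, call `model.generate` without a streamer and decode only the newly generated tokens before passing the string in.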

To use the pipeline API instead:

from transformers import pipeline, AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

tokenizer = AutoTokenizer.from_pretrained("unsloth/Qwen3-0.6B")
base_model = AutoModelForCausalLM.from_pretrained("unsloth/Qwen3-0.6B")
model = PeftModel.from_pretrained(base_model, "khazarai/Chemistry-R1")


question = """
A bowl contains 10 jellybeans (four red, one blue and five white). If you pick three jellybeans from the bowl at random and without replacement,
what is the probability that exactly two will be red? Express your answer as a common fraction.
"""

pipe = pipeline("text-generation", model=model, tokenizer=tokenizer)
messages = [
    {"role": "user", "content": question}
]
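# With chat-style input, the pipeline returns the whole conversation;
# the last message is the model's reply.
result = pipe(messages, max_new_tokens=1500)
print(result[0]["generated_text"][-1]["content"])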
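
Since the model's numerical results should be verified, it helps to have a reference value for the example question. The sketch below computes it exactly with the hypergeometric formula, independently of the model. The function name `prob_exactly_k_red` is illustrative only.

```python
from fractions import Fraction
from math import comb


def prob_exactly_k_red(k: int, draws: int, red: int, total: int) -> Fraction:
    """Probability of drawing exactly k red items in `draws` picks without replacement."""
    favourable = comb(red, k) * comb(total - red, draws - k)
    return Fraction(favourable, comb(total, draws))


# 4 red jellybeans out of 10, three drawn, exactly two red:
# C(4,2) * C(6,1) / C(10,3) = 36 / 120
reference = prob_exactly_k_red(k=2, draws=3, red=4, total=10)
print(reference)  # 3/10
```

A model response that does not reduce to 3/10 for this prompt should be treated as incorrect.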