Notion-Qwen2.5-1.5B

A fine-tuned version of Qwen2.5-1.5B-Instruct specialized in generating structured Notion templates and documentation. This model has been trained to understand and generate complex JSON blueprints that represent detailed, organized, and highly functional Notion templates.

Model Description

The model was fine-tuned with LoRA on the sbhatti2009/NotionGPT dataset to generate structured Notion templates. It specializes in creating detailed page layouts built from Notion components such as callouts, toggles, and tables.

Training Procedure

The model was fine-tuned using the following configuration (an equivalent PEFT setup is sketched after the list):

  • Method: LoRA (Low-Rank Adaptation)
  • Parameters:
    • Learning rate: 2e-4
    • Epochs: 3
    • Batch size: 1
    • Gradient accumulation steps: 4
    • LoRA rank: 16
    • LoRA alpha: 32
    • Target modules: q_proj, v_proj, k_proj, o_proj, gate_proj, up_proj, down_proj
    • Training precision: fp16
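
For reference, the sketch below reconstructs an equivalent PEFT/Transformers configuration from the hyperparameters above. It is not the original training script: dataset loading and the Trainer wiring are omitted, and the output directory name is a placeholder.

from peft import LoraConfig
from transformers import TrainingArguments

# LoRA adapter settings listed above
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules=["q_proj", "v_proj", "k_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
    task_type="CAUSAL_LM",
)

# Optimizer and schedule settings listed above
training_args = TrainingArguments(
    output_dir="notion-qwen2.5-1.5B-lora",  # placeholder path
    learning_rate=2e-4,
    num_train_epochs=3,
    per_device_train_batch_size=1,
    gradient_accumulation_steps=4,
    fp16=True,
)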

Input Format

The model expects input in the following format:

<|im_start|>user
[Your request for a Notion template]
<|im_end|>
<|im_start|>assistant
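
This corresponds to the ChatML chat template used by Qwen2.5, so the prompt can also be built with the tokenizer's apply_chat_template helper. A minimal sketch follows; note that the Qwen tokenizer may prepend a default system message unless you supply your own.

from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gamithasam/notion-qwen2.5-1.5B")

messages = [
    {"role": "user", "content": "Generate me a Notion template for tracking my daily tasks and habits."},
]

# Renders the <|im_start|>/<|im_end|> prompt and appends the assistant header
# so generation continues as the assistant turn
prompt = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)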

Intended Uses

This model is designed for:

  • Generating structured Notion templates
  • Creating documentation layouts
  • Organizing information in a hierarchical format
  • Planning project structures
  • Creating knowledge bases

Example Usage

from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline
import torch

# Load model and tokenizer
model = AutoModelForCausalLM.from_pretrained(
    "gamithasam/notion-qwen2.5-1.5B",
    torch_dtype=torch.float16,
    device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained("gamithasam/notion-qwen2.5-1.5B")

# Create pipeline
pipe = pipeline(
    "text-generation",
    model=model,
    tokenizer=tokenizer,
    torch_dtype=torch.float16,
    device_map="auto"
)

# Example prompt
prompt = """<|im_start|>user
Generate me a Notion template for tracking my daily tasks and habits.
<|im_end|>
<|im_start|>assistant
"""

result = pipe(
    prompt,
    max_new_tokens=1000,
    do_sample=True,
    temperature=0.7,
    top_p=0.9,
    pad_token_id=tokenizer.eos_token_id
)
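
# Print the generated template (the output includes the prompt prefix)
print(result[0]["generated_text"])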

Limitations

  • The model requires specific formatting with <|im_start|> and <|im_end|> tokens
  • Generated JSON should be validated before use in production (see the validation sketch after this list)
  • May require adjustments for very complex template structures
  • Performance depends on available GPU memory
  • Training dataset size was limited to 41 examples
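
Because the model emits a JSON blueprint as free-form text, a lightweight validation pass is advisable before sending anything to the Notion API. The helper below is a minimal sketch using only the standard library; the function name and extraction strategy are illustrative and not part of the model.

import json

def extract_template_json(generated_text):
    """Extract and parse the first JSON object in the model output.

    Returns the parsed dict, or None if no valid JSON object is found.
    """
    start = generated_text.find("{")
    end = generated_text.rfind("}")
    if start == -1 or end <= start:
        return None
    try:
        return json.loads(generated_text[start:end + 1])
    except json.JSONDecodeError:
        return None

template = extract_template_json(result[0]["generated_text"])
if template is None:
    print("Model output did not contain valid JSON; retry or adjust the prompt.")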

Training Data

The model was fine-tuned on the sbhatti2009/NotionGPT dataset.
