KAIDOL LLM Fine-tuning - PHASE2A-TEST-1K

Korean AI Idol Roleplay Language Model based on unsloth/Qwen3-30B-A3B-Instruct-2507

Model Description

์ด ๋ชจ๋ธ์€ K-pop ์•„์ด๋Œ ์Šคํƒ€์ผ์˜ ๋กคํ”Œ๋ ˆ์ž‰ ๋ฐ ๊ณต๊ฐ ๋Œ€ํ™”๋ฅผ ์œ„ํ•ด fine-tuning๋œ LoRA adapter์ž…๋‹ˆ๋‹ค.

  • Base Model: unsloth/Qwen3-30B-A3B-Instruct-2507
  • Training Phase: phase2a-test-1k
  • Training Framework: Unsloth 2025.11.3
  • LoRA Rank: 16
  • LoRA Alpha: 16
  • Training Samples: 1000

Training Configuration

{
  "model": "Qwen3-30B-A3B-Instruct-2507",
  "phase": "phase2a-test-1k",
  "dataset": "phase2-rp-base-1k",
  "num_samples": 1000,
  "lora_rank": 16,
  "lora_alpha": 16,
  "lora_dropout": 0,
  "learning_rate": 0.0002,
  "batch_size": 2,
  "gradient_accumulation_steps": 4,
  "effective_batch_size": 32,
  "max_steps": 100,
  "warmup_steps": 10,
  "max_seq_length": 2048,
  "optimizer": "adamw_8bit",
  "weight_decay": 0.01,
  "lr_scheduler_type": "linear",
  "precision": "bfloat16",
  "device_map": "auto",
  "gpus": "4x RTX 5090",
  "training_time": "40 minutes",
  "framework": "Unsloth 2025.11.3",
  "target_modules": [
    "q_proj",
    "k_proj",
    "v_proj",
    "o_proj",
    "gate_proj",
    "up_proj",
    "down_proj"
  ]
}
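
For reference, below is a minimal sketch of how the configuration above could be wired together with Unsloth and TRL's SFTTrainer. The dataset column name, 4-bit loading of the base model, and the exact trainer wiring (which varies across TRL versions) are assumptions; this is not the original training script.

from unsloth import FastLanguageModel
from trl import SFTTrainer
from transformers import TrainingArguments
from datasets import load_dataset

# Base model (4-bit loading is an assumption; the card only states bfloat16 training precision)
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/Qwen3-30B-A3B-Instruct-2507",
    max_seq_length=2048,
    dtype=None,
    load_in_4bit=True,
)

# LoRA adapter matching the reported hyperparameters
model = FastLanguageModel.get_peft_model(
    model,
    r=16,
    lora_alpha=16,
    lora_dropout=0,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
    bias="none",
)

# 1K-sample subset of the Phase 2 RP base dataset (split name assumed)
dataset = load_dataset("developer-lunark/kaidol-phase2-rp-base-v0.1", split="train")
dataset = dataset.select(range(1000))

trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=dataset,
    dataset_text_field="text",   # assumed column name
    max_seq_length=2048,
    args=TrainingArguments(
        per_device_train_batch_size=2,
        gradient_accumulation_steps=4,   # 2 x 4 x 4 GPUs = effective batch size 32
        max_steps=100,
        warmup_steps=10,
        learning_rate=2e-4,
        optim="adamw_8bit",
        weight_decay=0.01,
        lr_scheduler_type="linear",
        bf16=True,
        logging_steps=5,
        output_dir="outputs",
    ),
)
trainer.train()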

Evaluation Metrics

{
  "training_loss": {
    "initial": 2.3745,
    "final": 1.5027,
    "reduction_percent": 36.7
  },
  "training_metrics": {
    "total_steps": 100,
    "total_samples": 1000,
    "training_time_seconds": 2380.49,
    "training_time_minutes": 39.67,
    "samples_per_second": 0.336,
    "final_grad_norm": 0.1539,
    "final_learning_rate": 0.0
  },
  "loss_progression": {
    "step_5": 2.3745,
    "step_10": 1.531,
    "step_50": 1.632,
    "step_100": 1.5027
  },
  "wandb_run": "https://wandb.ai/developer_lunark-lunark-ai/kaidol-llm-finetuning/runs/brryct5m",
  "notes": "Baseline test with 1K samples. Stable convergence observed. Ready for hyperparameter optimization (LR 2e-4โ†’1e-4, alpha 16โ†’32, grad_accum 4โ†’8)."
}
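
The reduction_percent value follows directly from the reported initial and final losses:

# (2.3745 - 1.5027) / 2.3745 ≈ 0.367, i.e. the reported 36.7% reduction
initial_loss, final_loss = 2.3745, 1.5027
print(round((initial_loss - final_loss) / initial_loss * 100, 1))  # 36.7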

Usage

Loading the model (with Unsloth)

from unsloth import FastLanguageModel

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="developer-lunark/kaidol-phase2a-test-1k",
    max_seq_length=2048,
    dtype=None,
    load_in_4bit=True,
)
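
Optionally (this step is not in the original card), Unsloth's native inference mode can be enabled after loading:

# Optional: switch the adapter-loaded model into Unsloth's faster inference path
FastLanguageModel.for_inference(model)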

Inference example

messages = [
    {"role": "user", "content": "์˜ค๋Š˜ ๊ธฐ๋ถ„์ด ์ข‹์ง€ ์•Š์•„..."},
]

inputs = tokenizer.apply_chat_template(
    messages,
    tokenize=True,
    add_generation_prompt=True,
    return_tensors="pt"
).to("cuda")

outputs = model.generate(
    inputs,
    max_new_tokens=512,
    do_sample=True,   # required for temperature/top_p to take effect
    temperature=0.7,
    top_p=0.9,
)

# Decode only the newly generated tokens, excluding the prompt
response = tokenizer.decode(outputs[0][inputs.shape[1]:], skip_special_tokens=True)
print(response)
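
Because this repository contains only the LoRA adapter weights, loading with plain transformers + peft should also work. The snippet below is a sketch under that assumption, not a path verified by the card.

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

# Load the base model, then attach the LoRA adapter from this repository
base = AutoModelForCausalLM.from_pretrained(
    "unsloth/Qwen3-30B-A3B-Instruct-2507",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("unsloth/Qwen3-30B-A3B-Instruct-2507")
model = PeftModel.from_pretrained(base, "developer-lunark/kaidol-phase2a-test-1k")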

Dataset

  • Phase 2: RP Base Dataset (54K samples)
    • Source: developer-lunark/kaidol-phase2-rp-base-v0.1
    • Korean: 53% / English: 47%

Training Hardware

  • GPU: 4x NVIDIA RTX 5090 (32GB each)
  • Training Time: ~40 minutes
  • Framework: Unsloth + PyTorch 2.9.1 + CUDA 12.8

Limitations

  • ์ด ๋ชจ๋ธ์€ ๋กคํ”Œ๋ ˆ์ž‰ ๋ฐ ๊ณต๊ฐ ๋Œ€ํ™”์— ํŠนํ™”๋˜์–ด ์žˆ์Šต๋‹ˆ๋‹ค
  • ์ผ๋ฐ˜์ ์ธ ์ง€์‹ ์งˆ๋ฌธ์ด๋‚˜ reasoning ์ž‘์—…์—๋Š” ๋ฒ ์ด์Šค ๋ชจ๋ธ๋ณด๋‹ค ์„ฑ๋Šฅ์ด ๋‚ฎ์„ ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค
  • ํ•œ๊ตญ์–ด์™€ ์˜์–ด ์™ธ์˜ ์–ธ์–ด๋Š” ์ œํ•œ์ ์œผ๋กœ ์ง€์›๋ฉ๋‹ˆ๋‹ค

Ethical Considerations

  • ์ด ๋ชจ๋ธ์€ ์—ฐ๊ตฌ ๋ฐ ๊ต์œก ๋ชฉ์ ์œผ๋กœ ์ œ์ž‘๋˜์—ˆ์Šต๋‹ˆ๋‹ค
  • ์ƒ์—…์  ์‚ฌ์šฉ ์‹œ ๋ผ์ด์„ ์Šค๋ฅผ ํ™•์ธํ•˜์„ธ์š”
  • ์ƒ์„ฑ๋œ ์ฝ˜ํ…์ธ ์˜ ํ’ˆ์งˆ๊ณผ ์ ์ ˆ์„ฑ์„ ํ•ญ์ƒ ๊ฒ€์ฆํ•˜์„ธ์š”

Citation

@misc{kaidol-phase2a-test-1k,
  author = {Developer Lunark},
  title = {KAIDOL LLM Fine-tuning - PHASE2A-TEST-1K},
  year = {2025},
  publisher = {HuggingFace},
  howpublished = {\url{https://huggingface.co/developer-lunark/kaidol-phase2a-test-1k}}
}

Model Card Contact


Generated on 2025-11-18 09:24:35
