---
library_name: transformers
pipeline_tag: text-generation  # causal-LM generation
language:
  - en    # natural language comments
  - code  # Lean proof language (= non-natural)
base_model: kfdong/STP_model_Lean_0320
# datasets:
#   -   # rename if your JSON file differs
metrics:
  - train_loss
  - eval_loss
tags:
  - lora
  - peft
  - causal-lm
  - theorem-proving
  - stp
  - flash-attn-2
  - bf16
new_version: v1.0
---

# LoRA Fine-Tuning of DeepSeek-Prover-V2-7B

This model card describes a LoRA adapter fine-tuned from **`deepseek-ai/DeepSeek-Prover-V2-7B`**.

---

## Model Details

| Field | Value |
|-------|-------|
| **Developed by** | HAIELab |
| **Model type** | 7.78B-parameter causal LM + LoRA adapter |
| **Languages** | English & Code |
| **Finetuned from** | `deepseek-ai/DeepSeek-Prover-V2-7B` |
| **Precision** | BF16 |
| **Context length** | 1792 tokens |
| **Hardware** | 1 × H100 80 GB |

---

### Hyper-parameters

| Parameter | Value |
|-----------|-------|
| Epochs | 1 |
| Max seq length | 1792 |
| Batch size | 16 train / 16 eval |
| Grad accum | 1 |
| LR schedule | Cosine, base 2 × 10⁻⁴, warm-up 3 % |
| Weight decay | 0.01 |
| Optimiser | **AdamW** |
| LoRA | r = 16 · α = 32 · dropout = 0.05 |
| Precision | bf16 mixed, gradient checkpointing |
| Logging | every 50 steps |
| Evaluation | once per epoch |

---

### Results (1 Epoch)

| Metric | Value |
|--------|-------|
| Final **train loss** | **1.2286** |
| Final **eval loss** | **1.2185** |
| Final **eval accuracy** (token-level) | **0.6959** |
| First-step loss (warm-up) | 1.3707 |
| Trainable parameters | 876M (11.25 %) |

---

## How to Get Started

```python
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

# This is a placeholder; replace with the actual Hugging Face Hub repo ID if the upload succeeds.
model_id = "your-username/your-model-name"

# 1️⃣ Load the tokenizer
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token

# 2️⃣ Load the base model with the LoRA adapter
# (requires `peft` to be installed so transformers can resolve the adapter's base model)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
    attn_implementation="flash_attention_2"  # requires the flash-attn package
)

# 3️⃣ Create a prompt
prompt = "Your prompt here..."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

# 4️⃣ Generate a response
outputs = model.generate(
    **inputs,
    max_new_tokens=256,
    temperature=0.7,
    top_p=0.9,
    do_sample=True
)

print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
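
### Loading the adapter explicitly with PEFT

If you prefer to load the base model and the adapter as separate pieces (for example, to merge the LoRA weights before export), the `peft` API can be used instead. The snippet below is a minimal sketch; the adapter repo ID is the same placeholder as above, and the final merge step is optional.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "deepseek-ai/DeepSeek-Prover-V2-7B"   # base model from the table above
adapter_id = "your-username/your-model-name"    # placeholder adapter repo ID

tokenizer = AutoTokenizer.from_pretrained(base_id, trust_remote_code=True)

# Load the frozen base model in bf16
base = AutoModelForCausalLM.from_pretrained(
    base_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

# Attach the LoRA adapter on top of the base weights
model = PeftModel.from_pretrained(base, adapter_id)

# Optionally fold the LoRA deltas into the base weights for faster inference
model = model.merge_and_unload()
```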
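
---

### Training configuration (sketch)

For reference, the hyper-parameter table above maps roughly onto a `peft` + `transformers.Trainer` setup as follows. This is a reconstruction rather than the exact training script; in particular, the LoRA `target_modules`, the output directory, and the dataset handling are assumptions.

```python
from peft import LoraConfig, get_peft_model
from transformers import TrainingArguments

# LoRA settings from the hyper-parameter table
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # assumption: attention projections
)
# model = get_peft_model(base_model, lora_config)

# Sequences are truncated to 1792 tokens during tokenization (not a TrainingArguments field).
training_args = TrainingArguments(
    output_dir="outputs",              # assumption
    num_train_epochs=1,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    gradient_accumulation_steps=1,
    learning_rate=2e-4,
    lr_scheduler_type="cosine",
    warmup_ratio=0.03,
    weight_decay=0.01,
    optim="adamw_torch",
    bf16=True,
    gradient_checkpointing=True,
    logging_steps=50,
    eval_strategy="epoch",             # named evaluation_strategy in older transformers releases
)
```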