Add comprehensive model card for Re:Form

#1
by nielsr HF Staff - opened
Files changed (1)
  1. README.md +99 -3
README.md CHANGED
@@ -1,3 +1,99 @@
- ---
- license: mit
- ---
+ ---
+ license: mit
+ library_name: transformers
+ pipeline_tag: text-generation
+ tags:
+ - code-generation
+ - reinforcement-learning
+ - formal-verification
+ - dafny
+ ---
+
+ # Re:Form -- Reducing Human Priors in Scalable Formal Software Verification with RL in LLMs: A Preliminary Study on Dafny
+
+ This repository contains checkpoints and resources for the **Re:Form** project, which explores novel approaches to formal software verification with Large Language Models (LLMs). The work was introduced in the paper [Re:Form -- Reducing Human Priors in Scalable Formal Software Verification with RL in LLMs: A Preliminary Study on Dafny](https://huggingface.co/papers/2507.16331).
+
+ - 🌐 [Project Page](https://veri-code.github.io/ReForm-page)
+ - 💾 [GitHub Repository](https://github.com/Veri-Code/Veri-Code)
+
+ ![Overall Pipeline](https://github.com/Veri-Code/Veri-Code/raw/main/assets/pipeline.png)
+
+ ## Overview
+
+ LLMs that reason in informal natural language are difficult to verify reliably or at scale. Re:Form addresses this by grounding LLMs in a rigorous formal system, Dafny, whose verifier can automatically and mathematically check the programs the model produces. This grounding is key to large-scale, reliable formal software verification while systematically reducing reliance on extensive human priors.
+
+ The core of Re:Form's pipeline involves:
+ * An automatic and scalable data curation process.
+ * Careful Reinforcement Learning (RL) designs integrated with feedback from a formal language verifier (see the sketch after this list).
+
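+ As a rough illustration of the second point, verifier feedback can be turned into a scalar reward: write a candidate Dafny program to disk, check it with the Dafny CLI, and score it by whether verification succeeds. The sketch below is only illustrative and assumes a `dafny` executable (Dafny 4.x) is on the `PATH`; the function name `verifier_reward` is hypothetical, and the actual reward shaping and RL recipe are described in the paper and the GitHub repository.
+
+ ```python
+ import subprocess
+ import tempfile
+
+ def verifier_reward(dafny_program: str, timeout_s: int = 60) -> float:
+     """Illustrative binary reward: 1.0 if Dafny verifies the program, else 0.0.
+
+     Assumes the `dafny` CLI is installed; not the exact reward used by Re:Form.
+     """
+     with tempfile.NamedTemporaryFile(mode="w", suffix=".dfy", delete=False) as f:
+         f.write(dafny_program)
+         path = f.name
+     try:
+         # `dafny verify` exits with code 0 only when all proof obligations pass
+         result = subprocess.run(
+             ["dafny", "verify", path],
+             capture_output=True,
+             text=True,
+             timeout=timeout_s,
+         )
+         return 1.0 if result.returncode == 0 else 0.0
+     except subprocess.TimeoutExpired:
+         return 0.0
+ ```
+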
+ Even small models (e.g., 0.5B) fine-tuned with Re:Form can generate syntactically valid and verifiable Dafny code, surpassing proprietary models and demonstrating stronger generalization to out-of-domain tasks, as evidenced by performance on the challenging DafnyComp benchmark.
+
+ ## Usage
+
+ You can use the Re:Form models with the Hugging Face `transformers` library. Below is an example using the `sft_0.5B` checkpoint. Other checkpoints (e.g., `sft_1.5B`, `sft_3B`, `sft_7B`, `sft_14B`, `14B-RL-entropy`) are also available from the [Veri-Code Hugging Face organization](https://huggingface.co/Veri-Code).
+
+ First, ensure you have the `transformers` library and `torch` installed. For optimal performance, especially with larger models, consider installing FlashAttention as detailed in the [GitHub repository's installation instructions](https://github.com/Veri-Code/Veri-Code#installation).
+
+ ```python
+ from transformers import AutoModelForCausalLM, AutoTokenizer
+ import torch
+
+ model_id = "Veri-Code/sft_0.5B"  # Example checkpoint; swap in another available model if needed
+
+ # Load the model and tokenizer
+ tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
+ model = AutoModelForCausalLM.from_pretrained(
+     model_id,
+     torch_dtype=torch.bfloat16,  # Use torch.float16 or torch.float32 depending on your hardware
+     device_map="auto",
+     trust_remote_code=True,
+ )
+
+ # Prompt the model to complete a partially written Dafny method
+ prompt = """
+ method print_sum(a: int, b: int) returns (sum: int)
+   ensures sum == a + b
+ {
+   return a + b;
+ }
+
+ method Max(a: int, b: int) returns (result: int)
+   ensures result == a || result == b
+   ensures result >= a && result >= b
+ {
+ """
+
+ # Tokenize with an attention mask so generate() does not warn about padding
+ inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
+
+ # Generate the Dafny completion
+ output_ids = model.generate(
+     **inputs,
+     max_new_tokens=2048,  # Matches max_new_tokens in generation_config.json
+     do_sample=True,
+     temperature=0.6,
+     top_p=0.9,
+     eos_token_id=tokenizer.eos_token_id,
+     pad_token_id=tokenizer.eos_token_id,  # Causal LMs usually reuse eos as pad
+ )
+
+ generated_text = tokenizer.decode(output_ids[0], skip_special_tokens=True)
+ print(generated_text)
+ ```
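+
+ To check the completion, you can run the generated program through the Dafny verifier. The snippet below is a minimal sketch: it assumes the `dafny` CLI is installed and that the completed `Max` method has been saved to `max.dfy` (a hypothetical file name).
+
+ ```python
+ import subprocess
+
+ # Exit code 0 means every proof obligation in the file was discharged
+ result = subprocess.run(
+     ["dafny", "verify", "max.dfy"],
+     capture_output=True,
+     text=True,
+ )
+ print(result.stdout)
+ print("Verified." if result.returncode == 0 else "Verification failed.")
+ ```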
+
+ For more details on training, evaluation, and other advanced usage, please refer to the [official GitHub repository](https://github.com/Veri-Code/Veri-Code).
+
+ ## Citation
+
+ If you use this work in your research, please cite the paper:
+
+ ```bibtex
+ @misc{yan2025reformreducinghuman,
+   title={Re:Form -- Reducing Human Priors in Scalable Formal Software Verification with RL in LLMs: A Preliminary Study on Dafny},
+   author={Chuanhao Yan and Fengdi Che and Xuhan Huang and Xu Xu and Xin Li and Yizhi Li and Xingwei Qu and Jingzhe Shi and Zhuangzhuang He and Chenghua Lin and Yaodong Yang and Binhang Yuan and Hang Zhao and Yu Qiao and Bowen Zhou and Jie Fu},
+   year={2025},
+   eprint={2507.16331},
+   archivePrefix={arXiv},
+   primaryClass={cs.CL},
+   url={https://arxiv.org/abs/2507.16331},
+ }
+ ```