Add comprehensive model card including paper, code, and usage with relevant tags

#1
by nielsr HF Staff - opened
Files changed (1) hide show
  1. README.md +94 -3
README.md CHANGED
@@ -1,3 +1,94 @@
1
- ---
2
- license: mit
3
- ---
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ ---
2
+ license: mit
3
+ library_name: transformers
4
+ pipeline_tag: text-generation
5
+ tags:
6
+ - code-generation
7
+ - reinforcement-learning
8
+ - formal-verification
9
+ - qwen2
10
+ ---
11
+
12
+ # Re:Form -- Reducing Human Priors in Scalable Formal Software Verification with RL in LLMs: A Preliminary Study on Dafny
13
+
14
+ This repository contains the `Qwen2.5-Coder-1.5B` model, a supervised fine-tuned (SFT) checkpoint developed as part of the **Re:Form** framework. This work was presented in the paper [Re:Form -- Reducing Human Priors in Scalable Formal Software Verification with RL in LLMs: A Preliminary Study on Dafny](https://huggingface.co/papers/2507.16331).
15
+
16
+ **Re:Form** aims to revolutionize formal software verification by reducing the reliance on extensive human-annotated priors. It achieves this by grounding Large Language Models (LLMs) in rigorous formal systems, where generative models operate directly in formal language spaces (e.g., Dafny). This approach enables the automatic and mathematically provable verification of their reasoning processes and outcomes, addressing a critical challenge in achieving large-scale, reliable formal software verification.
17
+
18
+ The pipeline introduces an automatic and scalable data curation process and integrates careful Reinforcement Learning (RL) designs with feedback from a formal language verifier. The project also introduced **DafnyComp**, a benchmark of compositional formal programs with auto-formalized specifications. Even small models like this (0.5B in the paper) can generate syntactically valid and verifiable Dafny code, surpassing proprietary models, with RL further enhancing performance and generalization.
19
+
20
+ ## Project Resources
21
+
22
+ * **Paper (Hugging Face):** [Re:Form -- Reducing Human Priors in Scalable Formal Software Verification with RL in LLMs: A Preliminary Study on Dafny](https://huggingface.co/papers/2507.16331)
23
+ * **Paper (arXiv):** [https://arxiv.org/abs/2507.16331](https://arxiv.org/abs/2507.16331)
24
+ * **Project Page:** [https://veri-code.github.io/ReForm-page](https://veri-code.github.io/ReForm-page)
25
+ * **GitHub Repository:** [https://github.com/veri-code/ReForm](https://github.com/veri-code/ReForm)
26
+
27
+ ![Overall Pipeline](https://veri-code.github.io/ReForm-page/assets/pipeline.png)
28
+
29
+ ## How to use
30
+
31
+ You can load and use this model with the Hugging Face `transformers` library. Please ensure you set `trust_remote_code=True` when loading, as the model uses custom tokenization and model architecture.
32
+
33
+ ```python
34
+ from transformers import AutoModelForCausalLM, AutoTokenizer
35
+ import torch
36
+
37
+ # This repository contains the Qwen2.5-Coder-1.5B model.
38
+ # Other available checkpoints from the Veri-Code organization include:
39
+ # SFT models: sft_0.5B, sft_1.5B, sft_3B, sft_7B, sft_14B
40
+ # RL fine-tuned model: 14B-RL-entropy
41
+ model_id = "Veri-Code/ReForm-Qwen2.5-Coder-1.5B"
42
+
43
+ # Load tokenizer and model
44
+ tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
45
+ model = AutoModelForCausalLM.from_pretrained(
46
+ model_id,
47
+ torch_dtype=torch.bfloat16, # Use torch.float16 if bfloat16 is not supported
48
+ device_map="auto",
49
+ trust_remote_code=True
50
+ )
51
+
52
+ # Example prompt: Translate Python code to Dafny
53
+ prompt = """```python
54
+ def factorial(n):
55
+ if n == 0:
56
+ return 1
57
+ else:
58
+ return n * factorial(n-1)
59
+ ```
60
+ Translate the above Python code into Dafny, including pre- and post-conditions:
61
+ """
62
+
63
+ # Tokenize input
64
+ input_ids = tokenizer(prompt, return_tensors="pt").input_ids.to(model.device)
65
+
66
+ # Generate response
67
+ output_ids = model.generate(
68
+ input_ids,
69
+ max_new_tokens=256, # Adjust max_new_tokens as needed
70
+ do_sample=True,
71
+ temperature=0.7,
72
+ top_p=0.9
73
+ )
74
+
75
+ # Decode and print the generated text
76
+ generated_text = tokenizer.decode(output_ids[0], skip_special_tokens=True)
77
+ print(generated_text)
78
+ ```
79
+
80
+ ## Citation
81
+
82
+ If you find this work helpful in your research, please consider citing the paper:
83
+
84
+ ```bibtex
85
+ @misc{yan2025reformreducinghuman,
86
+ title={Re:Form -- Reducing Human Priors in Scalable Formal Software Verification with RL in LLMs: A Preliminary Study on Dafny},
87
+ author={Chuanhao Yan and Fengdi Che and Xuhan Huang and Xu Xu and Xin Li and Yizhi Li and Xingwei Qu and Jingzhe Shi and Zhuangzhuang He and Chenghua Lin and Yaodong Yang and Binhang Yuan and Hang Zhao and Yu Qiao and Bowen Zhou and Jie Fu},
88
+ year={2025},
89
+ eprint={2507.16331},
90
+ archivePrefix={arXiv},
91
+ primaryClass={cs.CL},
92
+ url={https://arxiv.org/abs/2507.16331},
93
+ }
94
+ ```