
📄 Paper  |  🌐 Website  |  🤗 Dataset

Overview

CoVe-4B is a compact 4B interactive tool-use agent fine-tuned from Qwen3-4B-Instruct-2507 using the CoVe (Constraint-Verification) post-training framework. It is trained on CoVe-12K, a dataset of 12K high-quality multi-turn tool-use trajectories synthesized and verified by deterministic constraint checking.

Framework

*Figure: The CoVe framework. Explicit constraints are fuzzified to guide a User Simulator LLM, while the original constraints act as a deterministic checklist to verify the agent's tool invocations.*
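To make the "deterministic checklist" idea concrete, here is a minimal toy illustration (not the paper's implementation): each constraint is a predicate over the agent's sequence of tool calls, and a trajectory is kept only if every predicate passes. The tool names, argument schema, and constraints below are invented for illustration.

```python
# Toy constraint checklist: (description, predicate over the list of tool calls).
# Names and constraints are hypothetical, for illustration only.
constraints = [
    ("must call lookup_user before update_address",
     lambda calls: [c["name"] for c in calls].index("lookup_user")
                   < [c["name"] for c in calls].index("update_address")),
    ("update_address must target user u42",
     lambda calls: all(c["arguments"].get("user_id") == "u42"
                       for c in calls if c["name"] == "update_address")),
]

# A candidate multi-turn trajectory, reduced to its tool invocations.
trajectory = [
    {"name": "lookup_user", "arguments": {"user_id": "u42"}},
    {"name": "update_address", "arguments": {"user_id": "u42", "city": "Oslo"}},
]

results = {desc: check(trajectory) for desc, check in constraints}
verified = all(results.values())  # trajectory is accepted only if all pass
```

Because every check is a plain predicate, verification is fully deterministic: no LLM judge is needed to decide whether a synthesized trajectory is valid.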

Performance

*Table: Main results on τ²-bench. CoVe-4B achieves top performance in the ≤8B group and rivals models of up to 70B parameters.*

Deployment and Evaluation

CoVe-4B uses the Hermes tool-call format and can be deployed with vLLM.
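In the Hermes format, the model wraps each tool invocation in `<tool_call>` tags around a JSON object with `name` and `arguments` fields; vLLM's `hermes` parser extracts these into structured tool calls. A minimal sketch of that extraction (the tool name and arguments below are invented for illustration):

```python
import json
import re

# Illustrative Hermes-style model output: a JSON tool call inside <tool_call> tags.
raw = (
    "<tool_call>\n"
    '{"name": "get_weather", "arguments": {"city": "Paris"}}\n'
    "</tool_call>"
)

# Pull out the JSON payload between the tags and parse it.
match = re.search(r"<tool_call>\s*(\{.*\})\s*</tool_call>", raw, re.DOTALL)
call = json.loads(match.group(1))

print(call["name"], call["arguments"])
```

When served with `--tool-call-parser hermes`, vLLM performs this extraction server-side and returns standard OpenAI-style `tool_calls`, so clients never need to parse the tags themselves.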

Serve with vLLM

```bash
CUDA_VISIBLE_DEVICES=0,1,2,3 vllm serve [MODEL_HF_URL] \
  --served-model-name CoVe \
  --enable-auto-tool-choice \
  --tool-call-parser hermes \
  --tensor-parallel-size 1 \
  --data-parallel-size 4 \
  --host 0.0.0.0 \
  --port ${PORT}
```

Evaluate with τ²-bench

Once the model is running, evaluate with the official τ²-bench code, pointing the agent model at the OpenAI-compatible endpoint served by vLLM.
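Any OpenAI-compatible client can talk to the served endpoint. The sketch below builds such a chat-completion request body by hand; the port (`8000`), the example user message, and the `cancel_order` tool schema are all assumptions for illustration, and `"CoVe"` matches the `--served-model-name` flag above.

```python
import json

# Assumed endpoint; matches the vLLM flags above with PORT=8000.
BASE_URL = "http://0.0.0.0:8000/v1/chat/completions"

# OpenAI-style request with one (hypothetical) tool definition. vLLM's
# hermes parser returns the model's tool invocations as structured
# `tool_calls` in the response.
payload = {
    "model": "CoVe",  # --served-model-name
    "messages": [{"role": "user", "content": "Please cancel my last order."}],
    "tools": [{
        "type": "function",
        "function": {
            "name": "cancel_order",  # illustrative tool, not from τ²-bench
            "description": "Cancel an order by id.",
            "parameters": {
                "type": "object",
                "properties": {"order_id": {"type": "string"}},
                "required": ["order_id"],
            },
        },
    }],
}

body = json.dumps(payload)  # POST this to BASE_URL with an HTTP client
```

The τ²-bench harness issues equivalent requests itself once its agent model is configured with this base URL and model name.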

Citation

@article{Chen2026CoVe,
  title   = {CoVe: Training Interactive Tool-Use Agents via Constraint-Guided Verification},
  author  = {Chen, Jinpeng and Gong, Cheng and Li, Hanbo and Liu, Ziru and Tian, Zichen and Fu, Xinyu and Wu, Shi and Zhang, Chenyang and Zhang, Wu and Zhang, Suiyun and Tu, Dandan and Liu, Rui},
  journal = {arXiv preprint arXiv:2603.01940},
  year    = {2026}
}