Hierarchical SVG Tokenization: Learning Compact Visual Programs for Scalable Vector Graphics Modeling
Paper: arXiv:2604.05072
HiVG-3B-Base is a 3B-parameter vision-language model for autoregressive Scalable Vector Graphics (SVG) generation.
HiVG introduces a hierarchical SVG tokenization framework that replaces generic byte-level tokenization with geometry-aware atomic and segment tokens, yielding more compact token sequences and more faithful SVG code generation.
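To give a rough feel for why geometry-aware tokens are more compact than bytes, here is a toy comparison on a single path string. This is only an illustration and not the HiVG tokenizer: the function names, the regular expression, and the "one token per command or number" scheme are assumptions made for this sketch, and the per-byte split ignores any merges a real byte-level tokenizer would apply.

```python
import re

# Toy illustration only: this is NOT the HiVG tokenizer or its vocabulary.
PATH_D = "M 10 20 L 110 20 L 110 120 Z"


def byte_level_tokens(d: str) -> list[str]:
    """Split into one token per UTF-8 byte (no merges applied)."""
    return [chr(b) for b in d.encode("utf-8")]


def geometry_aware_tokens(d: str) -> list[str]:
    """Hypothetical scheme: one token per path command letter or whole number."""
    return re.findall(r"[A-Za-z]|-?\d+(?:\.\d+)?", d)


byte_tokens = byte_level_tokens(PATH_D)
geo_tokens = geometry_aware_tokens(PATH_D)

print(len(byte_tokens), "byte-level tokens")
print(len(geo_tokens), "geometry-aware tokens:", geo_tokens)
```

Even in this small case the command-and-coordinate view needs far fewer tokens than the per-byte view. The actual atomic and segment vocabularies are defined in the paper and the code repository.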
You can use the provided inference pipeline for both image-to-SVG and text-to-SVG tasks.
from hivg_infer import HiSVGInferencePipeline

pipeline = HiSVGInferencePipeline(
    model_path="xingxm/HiVG-3B-Base",
    coord_range=234,
    temperature=0.7,
    top_p=0.9,
    max_new_tokens=4096,
)

# Image-to-SVG
result = pipeline.img2svg("path/to/your_image.png")
if result["success"]:
    print(result["svg"])

# Text-to-SVG
result = pipeline.text2svg("A minimalist black phone icon with an outline style")
if result["success"]:
    with open("output.svg", "w") as f:
        f.write(result["svg"])
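Because generation is capped by max_new_tokens, a long output could in principle be cut off before the closing tag. The sketch below shows one way to check that a returned string is well-formed XML with an svg root before saving it. It uses only the Python standard library; the helper name is_well_formed_svg is hypothetical and not part of hivg_infer, and the sample strings are made up for the demonstration.

```python
import xml.etree.ElementTree as ET


def is_well_formed_svg(svg_text: str) -> bool:
    """Return True if svg_text parses as XML and its root element is <svg>.

    Hypothetical helper for illustration; not part of the hivg_infer package.
    """
    try:
        root = ET.fromstring(svg_text)
    except ET.ParseError:
        return False
    # The tag may carry a namespace prefix, e.g. "{http://www.w3.org/2000/svg}svg".
    return root.tag.rsplit("}", 1)[-1] == "svg"


# Made-up sample strings standing in for model output.
good = (
    '<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 24 24">'
    '<path d="M2 2 L22 22"/></svg>'
)
truncated = '<svg xmlns="http://www.w3.org/2000/svg"><path d="M2 2 L22'

print(is_well_formed_svg(good))
print(is_well_formed_svg(truncated))
```

In practice the check would sit next to the existing success test, for example by writing result["svg"] to disk only when both result["success"] and is_well_formed_svg(result["svg"]) hold. It verifies XML structure only and says nothing about how closely the drawing matches the input.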
Note: For detailed inference code, data preprocessing, and the hierarchical SVG tokenizer/detokenizer, please visit the project page and the associated code repository.
Please refer to the paper for detailed compute specifications.
If you find this work helpful, please cite:
@article{xing2026hivg,
title={Hierarchical SVG Tokenization: Learning Compact Visual Programs for Scalable Vector Graphics Modeling},
author={Ximing Xing and Ziteng Xue and Zhenxi Li and Weicong Liang and Linqing Wang and Zhantao Yang and Tiankai Hang and Zijin Yin and Qinglin Lu and Chunyu Wang and Qian Yu},
journal={arXiv preprint arXiv:2604.05072},
year={2026}
}