InternSVG

non-profit

https://hmwang2002.github.io/release/internsvg/

AI & ML interests

Multimodal Large Language Models, Unified SVG Tasks

Recent Activity

KiyotakaWang authored a paper about 1 month ago

Intern-S1-Pro: Scientific Multimodal Foundation Model at Trillion Scale

KiyotakaWang authored a paper about 1 month ago

Reliable Reasoning in SVG-LLMs via Multi-Task Multi-Reward Reinforcement Learning

KiyotakaWang updated a Space about 2 months ago

InternSVG/README

Papers

Reliable Reasoning in SVG-LLMs via Multi-Task Multi-Reward Reinforcement Learning

InternSVG: Towards Unified SVG Tasks with Multimodal Large Language Models

Organization Card

We are the InternSVG team at Shanghai AI Laboratory. Our goal is to give the InternVL model series unified capabilities for understanding, editing, and generating SVG (Scalable Vector Graphics).

Current Work:

InternSVG: Towards Unified SVG Tasks with Multimodal Large Language Models

The InternSVG Family — a comprehensive suite that unifies data, benchmarks, and models for SVG understanding, editing, and generation. It consists of:

🧩 SAgoge — the largest and most diverse multimodal SVG dataset, covering icons, illustrations, chemistry diagrams, and dynamic animations;

🏆 SArena — a companion benchmark offering unified task definitions and standardized evaluation protocols across SVG domains;

🤖 InternSVG Models — multimodal large language models trained for SVG understanding, editing, and generation.

Project Links

🌐 Project Page: https://hmwang2002.github.io/release/internsvg/

📄 ArXiv Paper: https://arxiv.org/abs/2510.11341

💻 GitHub Repository: https://github.com/hmwang2002/InternSVG

📊 SArena Benchmark: https://huggingface.co/datasets/InternSVG/SArena

🧩 SAgoge Dataset: https://huggingface.co/datasets/InternSVG/SAgoge

🤖 InternSVG-8B Model: https://huggingface.co/InternSVG/InternSVG-8B
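The Hub links above can be turned into repository identifiers and fetched programmatically. The following is a minimal sketch, not an official loader: `parse_hub_url` and `fetch` are hypothetical helper names introduced here, and the only external call is `snapshot_download` from the `huggingface_hub` package, which downloads a whole repository and therefore needs network access and enough disk space. The file layout inside each repository is not described on this page, so the sketch stops at the download step.

```python
from urllib.parse import urlparse

# Hub links copied from the Project Links list.
LINKS = {
    "SArena": "https://huggingface.co/datasets/InternSVG/SArena",
    "SAgoge": "https://huggingface.co/datasets/InternSVG/SAgoge",
    "InternSVG-8B": "https://huggingface.co/InternSVG/InternSVG-8B",
}


def parse_hub_url(url):
    """Split a huggingface.co URL into (repo_id, repo_type). Hypothetical helper."""
    parts = [p for p in urlparse(url).path.split("/") if p]
    if parts and parts[0] in ("datasets", "spaces"):
        repo_type = parts[0][:-1]  # "datasets" -> "dataset", "spaces" -> "space"
        parts = parts[1:]
    else:
        repo_type = "model"  # model URLs carry no type prefix
    return "/".join(parts[:2]), repo_type


def fetch(name):
    """Download a full snapshot of one linked repository and return its local path."""
    # Imported lazily so the URL parsing works without the package installed.
    from huggingface_hub import snapshot_download  # pip install huggingface_hub

    repo_id, repo_type = parse_hub_url(LINKS[name])
    return snapshot_download(repo_id=repo_id, repo_type=repo_type)


if __name__ == "__main__":
    for name, url in LINKS.items():
        print(name, parse_hub_url(url))
```

Calling `fetch("SArena")` would download the benchmark repository to the local Hub cache; the same call with `"InternSVG-8B"` would download the model weights, which are considerably larger.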

Reliable Reasoning in SVG-LLMs via Multi-Task Multi-Reward Reinforcement Learning

In this work, we present CTRL-S (Chain-of-Thought Reinforcement Learning for SVG), a unified framework that introduces a chain-of-thought mechanism to explicitly expose the model’s reasoning process during SVG generation. To support this structured reasoning, we construct SVG-Sophia, a high-quality dataset of 145K samples across SVG code refinement, Text-to-SVG, and Image-to-SVG tasks. Furthermore, we design a robust multi-reward reinforcement learning scheme powered by the GRPO algorithm. By jointly optimizing across DINO, image-text similarity, format, and code efficiency rewards in a multi-task setting, our approach systematically boosts structural coherence and generation capabilities. Extensive experiments show that CTRL-S outperforms existing methods, achieving higher task success rates, superior code quality, and exceptional visual fidelity.
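To make the multi-reward idea concrete, here is a minimal sketch of how several reward signals might be merged and then converted into GRPO-style group-relative advantages. It is an illustration under stated assumptions, not the CTRL-S implementation: the weighted-sum combination, the weight values, and the example scores are all hypothetical, since this page names the four rewards but does not say how they are combined. Only the within-group standardization follows the general GRPO recipe.

```python
import statistics

# Hypothetical weights: the four reward names come from the text above,
# the numeric values and the weighted-sum rule are assumptions.
WEIGHTS = {"dino": 1.0, "image_text": 1.0, "format": 0.5, "efficiency": 0.5}


def combined_reward(scores, weights=WEIGHTS):
    """Weighted sum of the per-sample reward components (assumed combination rule)."""
    return sum(weights[name] * scores[name] for name in weights)


def group_relative_advantages(rewards, eps=1e-6):
    """Standardize rewards within one group of samples drawn for the same prompt.

    Each sample's advantage is its reward minus the group mean, divided by the
    group standard deviation (population std here; implementations differ).
    """
    mean = statistics.fmean(rewards)
    std = statistics.pstdev(rewards)
    return [(r - mean) / (std + eps) for r in rewards]


if __name__ == "__main__":
    # Made-up component scores for three SVG candidates generated from one prompt.
    group = [
        {"dino": 0.8, "image_text": 0.6, "format": 1.0, "efficiency": 0.4},
        {"dino": 0.5, "image_text": 0.5, "format": 1.0, "efficiency": 0.9},
        {"dino": 0.2, "image_text": 0.3, "format": 0.0, "efficiency": 0.1},
    ]
    rewards = [combined_reward(s) for s in group]
    for r, a in zip(rewards, group_relative_advantages(rewards)):
        print(f"reward={r:.2f} advantage={a:+.2f}")
```

Because advantages are computed relative to the other samples in the same group, a candidate is reinforced only when it scores better than its siblings, which removes the need for a separate value model.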

📄 ArXiv Paper: https://arxiv.org/abs/2603.16189

💻 GitHub Repository: https://github.com/hmwang2002/CTRL-S

🧩 SVG-Sophia Dataset: https://huggingface.co/datasets/InternSVG/SVG-Sophia

models 1

InternSVG/InternSVG-8B

8B parameters • Updated Feb 7 • 1.01k downloads • 3 likes

datasets 3

Team members 4
