Image-Text-to-Text
Transformers
Safetensors
qwen3
text-generation
code-generation
conversational
text-generation-inference

WebGen-Agent

WebGen-Agent is an advanced website generation agent designed to autonomously create websites from natural language instructions. It was introduced in the paper WebGen-Agent: Enhancing Interactive Website Generation with Multi-Level Feedback and Step-Level Reinforcement Learning.

Code: https://github.com/mnluzimu/WebGen-Agent

Project Overview

WebGen-Agent combines state-of-the-art language models with specialized training techniques to create a powerful website generation tool. The agent can understand natural language instructions specifying appearance and functional requirements, iteratively generate website codebases, and refine them using visual and functional feedback.

Resources

Links to the data and model parameters are as follows:

Data HF Link
webgen-agent_train_sft 🤗 luzimu/webgen-agent_train_sft
webgen-agent_train_step-grpo 🤗 luzimu/webgen-agent_train_step-grpo
Model HF Link
WebGenAgent-LM-7B-SFT 🤗 luzimu/WebGenAgent-LM-7B-SFT
WebGenAgent-LM-7B-Step-GRPO 🤗 luzimu/WebGenAgent-LM-7B-Step-GRPO
WebGenAgent-LM-8B-SFT 🤗 luzimu/WebGenAgent-LM-8B-SFT
WebGenAgent-LM-8B-Step-GRPO 🤗 luzimu/WebGenAgent-LM-8B-Step-GRPO

How WebGen-Agent Works

WebGen-Agent follows an iterative, multi-step paradigm for website generation:

  1. Code Generation: The agent generates code to create or edit website files based on natural language instructions
  2. Code Execution: Dependencies are installed and the website service is started
  3. Feedback Gathering:
    • A screenshot of the website is captured
    • A Visual Language Model (VLM) provides appearance feedback and scores
    • A GUI-agent tests the website functionality and provides functional feedback
  4. Refinement: Based on the feedback, the agent continues to improve the website until it meets requirements

WebGen-Agent Workflow

Step-GRPO with Screenshot and GUI-agent Feedback

The Step-GRPO with Screenshot and GUI-agent Feedback approach uses the screenshot and GUI-agent scores inherently produced in the WebGen-Agent workflow as step-level rewards:

  • Screenshot Score: Quantifies the visual appeal and aesthetics of the website
  • GUI-agent Score: Measures how well the website meets functional requirements

These dual rewards provide dense, reliable process supervision that significantly improves the model's ability to generate high-quality websites.

Step-GRPO with Screenshot and GUI-agent Feedback

Sample Usage

For detailed installation and inference instructions, refer to the WebGen-Agent GitHub repository.

# Example for single inference (from GitHub README)
python src/infer_single.py \
    --model deepseek-chat \
    --vlm_model Qwen/Qwen2.5-VL-32B-Instruct \
    --instruction "Please implement a wheel of fortune website." \
    --workspace-dir workspaces_root/test \
    --log-dir service_logs/test \
    --max-iter 20 \
    --overwrite \
    --error-limit 5

Citation

If you find our project useful, please cite:

@misc{lu2025webgenagentenhancinginteractivewebsite,
      title={WebGen-Agent: Enhancing Interactive Website Generation with Multi-Level Feedback and Step-Level Reinforcement Learning}, 
      author={Zimu Lu and Houxing Ren and Yunqiao Yang and Ke Wang and Zhuofan Zong and Junting Pan and Mingjie Zhan and Hongsheng Li},
      year={2025},
      eprint={2509.22644},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2509.22644}, 
}

@misc{lu2025webgenbenchevaluatingllmsgenerating,
      title={WebGen-Bench: Evaluating LLMs on Generating Interactive and Functional Websites from Scratch}, 
      author={Zimu Lu and Yunqiao Yang and Houxing Ren and Haotian Hou and Han Xiao and Ke Wang and Weikang Shi and Aojun Zhou and Mingjie Zhan and Hongsheng Li},
      year={2025},
      eprint={2505.03733},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2505.03733}, 
}
Downloads last month
25
Safetensors
Model size
8B params
Tensor type
BF16
·
Inference Providers NEW
This model isn't deployed by any Inference Provider. 🙋 Ask for provider support

Model tree for luzimu/WebGenAgent-LM-8B-SFT

Base model

Qwen/Qwen3-8B-Base
Finetuned
Qwen/Qwen3-8B
Finetuned
(428)
this model

Datasets used to train luzimu/WebGenAgent-LM-8B-SFT

Collection including luzimu/WebGenAgent-LM-8B-SFT