Improve model card: Update license, pipeline tag, add agent tag, and full abstract

#1
by nielsr HF Staff - opened
Files changed (1)
  1. README.md +9 -11
README.md CHANGED
@@ -4,11 +4,14 @@ base_model:
4
  language:
5
  - en
6
  library_name: transformers
7
- license: mit
 
8
  tags:
9
  - LLM
10
- pipeline_tag: text-generation
 
11
  ---
 
12
  # 🔍 WebExplorer-8B
13
 
14
  [![Paper](https://img.shields.io/badge/Paper-arXiv-red.svg)](https://arxiv.org/abs/2509.06501)
@@ -17,9 +20,9 @@ pipeline_tag: text-generation
17
 
18
  A state-of-the-art 8B parameter web agent model designed for complex information-seeking tasks and long-horizon reasoning.
19
 
20
- ## 🌟 Overview
21
 
22
- WebExplorer-8B is an advanced web navigation agent trained on **WebExplorer**-QA. The model demonstrates exceptional performance on challenging information-seeking benchmarks while maintaining efficiency with only 8 billion parameters.
23
 
24
  ## ✨ Key Features
25
 
@@ -31,8 +34,8 @@ WebExplorer-8B is an advanced web navigation agent trained on **WebExplorer**-QA
31
 
32
  Built on Qwen3-8B base model and trained through a two-phase approach:
33
 
34
- 1. **Supervised Fine-tuning (SFT)**: Cold-start initialization with high-quality trajectories
35
- 2. **Reinforcement Learning (RL)**: Enhanced using GRPO algorithm with progressive context expansion
36
 
37
  ## 📊 Performance
38
 
@@ -61,7 +64,6 @@ WebExplorer-8B achieves state-of-the-art performance across multiple information
61
 
62
  Accuracy (%) of web agents on information-seeking benchmarks. BC-en and BC-zh denote BrowseComp-en and BrowseComp-zh respectively. XBench-DS refers to XBench-DeepSearch. **Bold** indicates the best performance among open-source models < 100B, while <u>underlined</u> values represent the best performance among models < 10B parameters. All scores of WebExplorer-8B are computed as Avg@4 using LLM-as-Judge. Entries marked with a dagger (†) were reproduced by us under our scaffold: on model name = entire row; on a number = that entry only.
63
 
64
-
65
  ## 🛠️ Tool Schema
66
 
67
  WebExplorer-8B supports two tools for web interaction:
@@ -90,8 +92,6 @@ WebExplorer-8B supports two tools for web interaction:
90
  }
91
  ```
92
 
93
-
94
-
95
  ### 2. Search Tool
96
 
97
  ```json
@@ -115,8 +115,6 @@ WebExplorer-8B supports two tools for web interaction:
115
  }
116
  ```
117
 
118
-
119
-
120
  ## 📝 Citation
121
 
122
  If you find our work useful, please consider citing:
 
4
  language:
5
  - en
6
  library_name: transformers
7
+ license: apache-2.0
8
+ pipeline_tag: image-text-to-text
9
  tags:
10
  - LLM
11
+ - agent
12
+ paper: 2509.06501
13
  ---
14
+
15
  # 🔍 WebExplorer-8B
16
 
17
  [![Paper](https://img.shields.io/badge/Paper-arXiv-red.svg)](https://arxiv.org/abs/2509.06501)
 
20
 
21
  A state-of-the-art 8B parameter web agent model designed for complex information-seeking tasks and long-horizon reasoning.
22
 
23
+ ## Paper Abstract
24
 
25
+ The paradigm of Large Language Models (LLMs) has increasingly shifted toward agentic applications, where web browsing capabilities are fundamental for retrieving information from diverse online sources. However, existing open-source web agents either demonstrate limited information-seeking abilities on complex tasks or lack transparent implementations. In this work, we identify that the key challenge lies in the scarcity of challenging data for information seeking. To address this limitation, we introduce WebExplorer: a systematic data generation approach using model-based exploration and iterative, long-to-short query evolution. This method creates challenging query-answer pairs that require multi-step reasoning and complex web navigation. By leveraging our curated high-quality dataset, we successfully develop advanced web agent WebExplorer-8B through supervised fine-tuning followed by reinforcement learning. Our model supports 128K context length and up to 100 tool calling turns, enabling long-horizon problem solving. Across diverse information-seeking benchmarks, WebExplorer-8B achieves the state-of-the-art performance at its scale. Notably, as an 8B-sized model, WebExplorer-8B is able to effectively search over an average of 16 turns after RL training, achieving higher accuracy than WebSailor-72B on BrowseComp-en/zh and attaining the best performance among models up to 100B parameters on WebWalkerQA and FRAMES. Beyond these information-seeking tasks, our model also achieves strong generalization on the HLE benchmark even though it is only trained on knowledge-intensive QA data. These results highlight our approach as a practical path toward long-horizon web agents.
26
 
27
  ## ✨ Key Features
28
 
 
34
 
35
  Built on Qwen3-8B base model and trained through a two-phase approach:
36
 
37
+ 1. **Supervised Fine-tuning (SFT)**: Cold-start initialization with high-quality trajectories
38
+ 2. **Reinforcement Learning (RL)**: Enhanced using GRPO algorithm with progressive context expansion
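As a rough illustration of the RL phase listed above: GRPO scores each sampled trajectory relative to the other trajectories drawn for the same query. The sketch below shows only that group-relative normalization. The function name, the binary rewards, the use of the population standard deviation, and the epsilon value are all assumptions for illustration; this is not the authors' training code, and it does not cover progressive context expansion.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-6):
    """Normalize each reward against its own sampling group (GRPO-style).

    rewards: scalar rewards for trajectories sampled from the SAME query.
    eps: hypothetical stabilizer so an all-equal group does not divide by zero.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population std; an assumption, not confirmed by the source
    return [(r - mu) / (sigma + eps) for r in rewards]


# Four hypothetical trajectories for one query: two judged correct, two incorrect.
advantages = group_relative_advantages([1.0, 0.0, 0.0, 1.0])
```

Trajectories that beat their group's mean reward receive a positive advantage and the rest a negative one, so no separate value model is needed to form the baseline.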
39
 
40
  ## 📊 Performance
41
 
 
64
 
65
  Accuracy (%) of web agents on information-seeking benchmarks. BC-en and BC-zh denote BrowseComp-en and BrowseComp-zh respectively. XBench-DS refers to XBench-DeepSearch. **Bold** indicates the best performance among open-source models < 100B, while <u>underlined</u> values represent the best performance among models < 10B parameters. All scores of WebExplorer-8B are computed as Avg@4 using LLM-as-Judge. Entries marked with a dagger (†) were reproduced by us under our scaffold: on model name = entire row; on a number = that entry only.
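For readers unfamiliar with the metric, the following is a minimal sketch of Avg@4, read here as the mean accuracy over four independent runs per question (an assumption about the definition, since the card does not spell it out). The function name `avg_at_k` and the boolean judge verdicts are hypothetical; the LLM-as-Judge step itself is not modeled.

```python
def avg_at_k(judgements):
    """Return accuracy (%) averaged over k runs per question.

    judgements: one inner list per question, holding k booleans
    (True when the judge marked that run's answer as correct).
    """
    per_question = [sum(runs) / len(runs) for runs in judgements]
    return 100.0 * sum(per_question) / len(per_question)


# Two hypothetical questions with four runs each.
score = avg_at_k([
    [True, True, False, False],  # correct in 2 of 4 runs
    [True, True, True, True],    # correct in 4 of 4 runs
])
```

Averaging over several runs reduces the variance that a single sampled trajectory would introduce into the reported score.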
66
 
 
67
  ## 🛠️ Tool Schema
68
 
69
  WebExplorer-8B supports two tools for web interaction:
 
92
  }
93
  ```
94
 
 
 
95
  ### 2. Search Tool
96
 
97
  ```json
 
115
  }
116
  ```
117
 
 
 
118
  ## 📝 Citation
119
 
120
  If you find our work useful, please consider citing: