JiaqiXue committed (verified)
Commit 5bc5cb2 · 1 Parent(s): a43b342

Fix Quick Start: complete runnable example with embedding generation

Files changed (1): README.md (+16 -16)
README.md CHANGED
@@ -34,25 +34,30 @@ Official leaderboard results on 8,400 queries:
 ### Installation
 
 ```bash
-pip install scikit-learn numpy joblib huggingface_hub
+pip install scikit-learn numpy joblib huggingface_hub sentence-transformers
 ```
 
-### Load Pre-trained Checkpoints
+### Complete Example
 
 ```python
 from huggingface_hub import snapshot_download
+from sentence_transformers import SentenceTransformer
 import sys
 
-# Download model
+# 1. Download router
 path = snapshot_download("JiaqiXue/r2-router")
 sys.path.insert(0, path)
 
 from router import R2Router
 
-# Load pre-trained KNN checkpoints (no training needed)
+# 2. Load pre-trained KNN checkpoints
 router = R2Router.from_pretrained(path)
 
-# Route a query (requires 1024-dim embedding from Qwen3-0.6B)
+# 3. Embed your query with Qwen3-0.6B (1024-dim)
+embedder = SentenceTransformer("Qwen/Qwen3-0.6B")
+embedding = embedder.encode("What is the capital of France?")
+
+# 4. Route!
 result = router.route(embedding)
 print(f"Model: {result['model_full_name']}")
 print(f"Token Budget: {result['token_limit']}")
@@ -70,22 +75,17 @@ sys.path.insert(0, path)
 
 from router import R2Router
 
-# Train KNN from the provided sub_10 training data
+# Train KNN from the provided sub_10 training data (custom hyperparameters)
 router = R2Router.from_training_data(path, k=80)
-
-# Route a query
-result = router.route(embedding)
 ```
 
-### Get Query Embeddings
-
-R2-Router uses [Qwen3-0.6B](https://huggingface.co/Qwen/Qwen3-0.6B) embeddings (1024-dim). You can generate them with:
+### Alternative: vLLM Embeddings (Faster for Batches)
 
 ```python
-from sentence_transformers import SentenceTransformer
-
-model = SentenceTransformer("Qwen/Qwen3-0.6B")
-embedding = model.encode("What is the capital of France?")
+from vllm import LLM
+llm = LLM(model="Qwen/Qwen3-0.6B", runner="pooling")
+outputs = llm.embed(["What is the capital of France?"])
+embedding = outputs[0].outputs.embedding
 ```
 
 Or with vLLM for faster batch inference:
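
Both embedding paths in the updated Quick Start must yield the 1024-dim vector the router expects before `router.route(embedding)` is called. As a reviewer-side sketch, a shape check could look like this; the `validate_embedding` helper is hypothetical and not part of the repo:

```python
import numpy as np

# The README documents 1024-dim Qwen3-0.6B embeddings as the router's input.
EXPECTED_DIM = 1024

def validate_embedding(embedding):
    """Coerce to a flat float32 vector and verify its dimensionality."""
    vec = np.asarray(embedding, dtype=np.float32).reshape(-1)
    if vec.shape[0] != EXPECTED_DIM:
        raise ValueError(f"expected a {EXPECTED_DIM}-dim embedding, got {vec.shape[0]}")
    return vec

vec = validate_embedding(np.zeros(1024))  # passes; a wrong shape raises ValueError
```

This catches, for instance, accidentally passing a batch of embeddings (shape `(n, 1024)`) where a single query vector was intended.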