# Shrew LoRA Adapters
Work in progress -- adapters are functional but under active development.
LoRA adapters for Qwen/Qwen3.5-2B, fine-tuned for structured extraction in a production RAG application. These are the models that power Shrew's structured extraction pipeline.
## Adapters
| Adapter | Task | LoRA Rank | Status |
|---|---|---|---|
| `extract_metadata/` | Extract structured metadata (title, authors, dates, etc.) from document text | r32 / alpha 64 | Production |
| `summarize_document/` | Generate document summaries | r32 / alpha 64 | Production |
| `semantic_chunk_beta/` | Split documents into semantically coherent sections | r64 / alpha 128 | Beta (trained on a 10k subset; full-dataset training in progress) |
## Usage
Load with PEFT on the base model:
```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base = AutoModelForCausalLM.from_pretrained("Qwen/Qwen3.5-2B")
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen3.5-2B")
model = PeftModel.from_pretrained(base, "./extract_metadata")
```
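The metadata adapter's output can then be parsed downstream. A minimal sketch, assuming the adapter emits a flat JSON object with fields like `title`, `authors`, and `date` (the exact output schema is not documented here; the field names and the sample string below are illustrative, not real model output):

```python
import json
from dataclasses import dataclass, field

@dataclass
class DocMetadata:
    # Assumed fields; the adapter card only lists "title, authors, dates, etc."
    title: str = ""
    authors: list = field(default_factory=list)
    date: str = ""

def parse_metadata(raw: str) -> DocMetadata:
    """Parse the adapter's JSON response, ignoring any unknown keys."""
    data = json.loads(raw)
    known = {k: v for k, v in data.items() if k in DocMetadata.__dataclass_fields__}
    return DocMetadata(**known)

# Hand-written example response (not actual adapter output):
raw = '{"title": "Q3 Report", "authors": ["A. Lee"], "date": "2024-09-30", "lang": "en"}'
meta = parse_metadata(raw)  # unknown key "lang" is dropped
```

Validating and filtering the model's JSON this way keeps malformed or extra fields from propagating into the rest of the pipeline.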
GGUF versions can be used with llama.cpp as LoRA adapters:
```shell
llama-cli -m Qwen3.5-2B.gguf --lora semantic_chunk_beta.gguf -p "<prompt>"
```
For the full pipeline integration, see shrew-server.
## Sampling Parameters

Use Qwen 3.5 instruct-general parameters with `enable_thinking=False`:
- temperature: 0.7
- top_p: 0.8
- top_k: 20
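With Transformers, these settings map directly onto `generate()` keyword arguments. A minimal sketch (the `max_new_tokens` value is an assumption; pick a budget per task, and pass `enable_thinking=False` to `apply_chat_template` when building the prompt):

```python
# Decoding settings matching the recommendations above.
generation_kwargs = {
    "do_sample": True,      # sampling, not greedy decoding
    "temperature": 0.7,
    "top_p": 0.8,
    "top_k": 20,
    "max_new_tokens": 512,  # assumption: adjust per task
}

# Usage, with model/tokenizer loaded as in the snippet above:
# inputs = tokenizer.apply_chat_template(
#     messages, add_generation_prompt=True, enable_thinking=False,
#     return_tensors="pt", return_dict=True,
# )
# outputs = model.generate(**inputs, **generation_kwargs)
```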
## License
Same as base model (Qwen/Qwen3.5-2B).