Shrew LoRA Adapters

Work in progress -- adapters are functional but under active development.

LoRA adapters for Qwen/Qwen3.5-2B, fine-tuned for structured document processing in a production RAG application. These adapters power Shrew's structured extraction pipeline.

Adapters

  • extract_metadata/ -- Extract structured metadata (title, authors, dates, etc.) from document text. LoRA rank: r32 / alpha 64. Status: Production.
  • summarize_document/ -- Generate document summaries. LoRA rank: r32 / alpha 64. Status: Production.
  • semantic_chunk_beta/ -- Split documents into semantically coherent sections. LoRA rank: r64 / alpha 128. Status: Beta (10k subset, full dataset training in progress).

Usage

Load with PEFT on the base model:

from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

# Load the base model and its tokenizer.
base = AutoModelForCausalLM.from_pretrained("Qwen/Qwen3.5-2B")
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen3.5-2B")

# Attach the adapter weights from the adapter directory.
model = PeftModel.from_pretrained(base, "./extract_metadata")

GGUF versions can be used with llama.cpp as LoRA adapters:

llama-cli -m Qwen3.5-2B.gguf --lora semantic_chunk_beta.gguf -p "<prompt>"

For the full pipeline integration, see shrew-server.

Sampling Parameters

Use the Qwen 3.5 general instruct sampling parameters, with enable_thinking=False:

  • temperature: 0.7
  • top_p: 0.8
  • top_k: 20
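These map directly onto generate() keyword arguments. A minimal sketch of the configuration, noting that do_sample=True is an assumption required for temperature/top-p/top-k sampling to take effect in transformers:

```python
# Recommended Qwen 3.5 instruct sampling settings for these adapters.
gen_kwargs = {
    "do_sample": True,   # enable sampling so temperature/top_p/top_k apply
    "temperature": 0.7,
    "top_p": 0.8,
    "top_k": 20,
}
# Usage: model.generate(input_ids, max_new_tokens=256, **gen_kwargs)
```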

License

Same as base model (Qwen/Qwen3.5-2B).

