# Shrew LoRA Adapters
Work in progress -- adapters are functional but under active development.
LoRA adapters for Qwen/Qwen3.5-2B, fine-tuned for structured extraction in a production RAG application. These are the models that power Shrew's structured extraction pipeline.
## Adapters
| Adapter | Task | LoRA Rank | Status |
|---|---|---|---|
| `extract_metadata/` | Extract structured metadata (title, authors, dates, etc.) from document text | r32 / alpha 64 | Production |
| `summarize_document/` | Generate document summaries | r32 / alpha 64 | Production |
| `semantic_chunk_beta/` | Split documents into semantically coherent sections | r64 / alpha 128 | Beta (trained on a 10k subset; full-dataset training in progress) |
## Usage
Load with PEFT on the base model:
```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base = AutoModelForCausalLM.from_pretrained("Qwen/Qwen3.5-2B")
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen3.5-2B")
model = PeftModel.from_pretrained(base, "./extract_metadata")
```
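The metadata adapter's output can then be parsed downstream. A minimal sketch, assuming the adapter emits a flat JSON object with fields like `title`, `authors`, and `date` (the exact output schema is not documented here; the field names and the sample string below are illustrative, not real model output):

```python
import json
from dataclasses import dataclass, field

@dataclass
class DocMetadata:
    # Assumed fields; the adapter card only lists "title, authors, dates, etc."
    title: str = ""
    authors: list = field(default_factory=list)
    date: str = ""

def parse_metadata(raw: str) -> DocMetadata:
    """Parse the adapter's JSON response, ignoring any unknown keys."""
    data = json.loads(raw)
    known = {k: v for k, v in data.items() if k in DocMetadata.__dataclass_fields__}
    return DocMetadata(**known)

# Hand-written example response (not actual adapter output):
raw = '{"title": "Q3 Report", "authors": ["A. Lee"], "date": "2024-09-30", "lang": "en"}'
meta = parse_metadata(raw)  # unknown key "lang" is dropped
```

Validating and filtering the model's JSON this way keeps malformed or extra fields from propagating into the rest of the pipeline.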
GGUF versions can be used with llama.cpp as LoRA adapters:
```shell
llama-cli -m Qwen3.5-2B.gguf --lora semantic_chunk_beta.gguf -p "<prompt>"
```
For the full pipeline integration, see shrew-server.
## Sampling Parameters

Use Qwen 3.5 instruct-general parameters with `enable_thinking=False`:
- temperature: 0.7
- top_p: 0.8
- top_k: 20
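With Transformers, these settings map directly onto `generate()` keyword arguments. A minimal sketch (the `max_new_tokens` value is an assumption; pick a budget per task, and pass `enable_thinking=False` to `apply_chat_template` when building the prompt):

```python
# Decoding settings matching the recommendations above.
generation_kwargs = {
    "do_sample": True,      # sampling, not greedy decoding
    "temperature": 0.7,
    "top_p": 0.8,
    "top_k": 20,
    "max_new_tokens": 512,  # assumption: adjust per task
}

# Usage, with model/tokenizer loaded as in the snippet above:
# inputs = tokenizer.apply_chat_template(
#     messages, add_generation_prompt=True, enable_thinking=False,
#     return_tensors="pt", return_dict=True,
# )
# outputs = model.generate(**inputs, **generation_kwargs)
```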
## License
Same as base model (Qwen/Qwen3.5-2B).