---
license: cc
language:
- en
base_model:
- Qwen/Qwen2.5-3B
tags:
- qwen2
- qwen
- text-generation
- question-answering
- research
- engineering
- lora
- 4bit
- bitsandbytes
- faiss
- rag
metrics:
- type: rougeL
value: 57.2
- type: bleu
value: 42.8
library_name: transformers
---
# 🛰️ ResearchQwen 2.5-3B-LoRA
**Compact, domain-expert Q&A for systems researchers.**

- Base model: [Qwen/Qwen2.5-3B](https://huggingface.co/Qwen/Qwen2.5-3B)
- Tuning recipe: 4-bit **QLoRA** with **bitsandbytes** NF4 quantisation
- Retriever: FAISS cosine-similarity store for ~33 k document chunks
---
## 🚀 Quick inference
```python
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig, pipeline

model_id = "Programmer-RD-AI/ResearchQwen2.5-3B-LoRA"
tok = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",
    torch_dtype="auto",
    quantization_config=BitsAndBytesConfig(load_in_4bit=True),  # bitsandbytes 4-bit
)

qa = pipeline("text-generation", model=model, tokenizer=tok)
out = qa("Explain how Chain Replication with Apportioned Queries improves tail latency.",
         max_new_tokens=256)
print(out[0]["generated_text"])
```
### llama.cpp / GGUF
```bash
wget https://huggingface.co/Programmer-RD-AI/ResearchQwen2.5-3B-LoRA/resolve/main/model_Q4_K_M.gguf
./main -m model_Q4_K_M.gguf -p "Give the core idea of the 3FS log-structured layout in 3 sentences."
```
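Note: recent llama.cpp builds ship the CLI as `llama-cli` rather than `./main`; substitute accordingly if your build no longer includes the legacy binary.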
---
## 📚 Training data
| Source | Docs | Words |
| -------------------------- | ------ | --------- |
| 3FS white-paper | 14 | 162 k |
| CRAQ spec + benchmarks | 11 | 119 k |
| Distributed AI infra notes | 32 | 287 k |
| *Total* | **57** | **568 k** |
Synthetic Q&A pairs were generated with an instruction template tuned for factual density; unhelpful pairs were filtered via a weak-to-strong scoring cascade (ROUGE-L > 0.4, BLEU > 0.35), as sketched below.
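A minimal sketch of that scoring cascade (the helper name and example pair are illustrative, not taken from the released pipeline):

```python
# pip install rouge-score sacrebleu
from rouge_score import rouge_scorer
import sacrebleu

scorer = rouge_scorer.RougeScorer(["rougeL"], use_stemmer=True)

def keep_pair(generated: str, reference: str) -> bool:
    """Keep a synthetic Q&A pair only if it clears both thresholds."""
    rouge_l = scorer.score(reference, generated)["rougeL"].fmeasure
    bleu = sacrebleu.sentence_bleu(generated, [reference]).score / 100  # 0-1 scale
    return rouge_l > 0.4 and bleu > 0.35

# Illustrative candidate: (question, generated answer, reference answer)
candidates = [("What does CRAQ stand for?",
               "Chain Replication with Apportioned Queries.",
               "CRAQ is Chain Replication with Apportioned Queries.")]
kept = [c for c in candidates if keep_pair(c[1], c[2])]
```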
---
## 🛠️ Fine-tuning details
| Setting | Value |
| --------- | ---------------------------------------------------------- |
| GPU | 1× A100 40 GB |
| Precision | 4-bit NF4 w/ double-quant (bnb 0.45.4) |
| LoRA r/α | 64 / 16 |
| LR sched | cosine, 5 % warm-up |
| Steps | 1 100 |
| Epochs | 3 |
| Peak VRAM | 21 GB |
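In code, the table maps onto a configuration like the following sketch (`target_modules` and the compute dtype are assumptions; they are not listed above):

```python
import torch
from transformers import BitsAndBytesConfig, TrainingArguments
from peft import LoraConfig

bnb = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",             # NF4
    bnb_4bit_use_double_quant=True,        # double-quant
    bnb_4bit_compute_dtype=torch.bfloat16, # assumed compute dtype
)
lora = LoraConfig(
    r=64, lora_alpha=16, task_type="CAUSAL_LM",
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # assumed set
)
args = TrainingArguments(
    output_dir="out",
    num_train_epochs=3,
    lr_scheduler_type="cosine",
    warmup_ratio=0.05,  # 5 % warm-up
)
```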
---
## 📈 Evaluation
| Metric | Base Qwen2.5-3B | **This model** |
| ------- | --------------- | -------------- |
| ROUGE-L | 45.6 | **57.2** |
| BLEU-4 | 30.4 | **42.8** |
> See `eval/` for scripts and raw scores (ROUGE, BLEU).
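A quick way to reproduce comparable numbers with the `evaluate` library (a sketch, not the exact `eval/` scripts):

```python
import evaluate

rouge = evaluate.load("rouge")
bleu = evaluate.load("bleu")

preds = ["CRAQ lets clean replicas serve reads, cutting head-node load."]
refs  = ["CRAQ allows any clean node in the chain to answer read requests."]

print("ROUGE-L:", rouge.compute(predictions=preds, references=refs)["rougeL"])
print("BLEU:   ", bleu.compute(predictions=preds, references=[[r] for r in refs])["bleu"])
```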
---
## 🔗 Integration recipe (RAG)
```python
# pip install langchain-community sentence-transformers faiss-cpu
from langchain_community.vectorstores import FAISS  # or use llama-index
from langchain_community.embeddings import HuggingFaceEmbeddings

emb = HuggingFaceEmbeddings(model_name="BAAI/bge-small-en-v1.5")
texts = ["<your document chunks>"]  # replace with the ~33 k corpus chunks
vs = FAISS.from_texts(texts, emb)
```
Retriever-generator latency: 330 ms average (GPU), 1.9 s average (CPU, gguf-int4).
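Putting the pieces together (reusing `vs` from above and the `qa` pipeline from Quick inference; the prompt template is an assumption, adapt it to your own format):

```python
query = "How does CRAQ improve read throughput over plain chain replication?"
# Retrieve the top-4 chunks and pack them into the prompt as context
context = "\n\n".join(d.page_content for d in vs.similarity_search(query, k=4))
prompt = f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"
print(qa(prompt, max_new_tokens=256)[0]["generated_text"])
```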
---
## 💡 Why it should trend
* **Fresh domain niche** – deep systems-engineering Q&A is underserved on HF.
* **Ultra-portable** – 4-bit LoRA + GGUF = laptop-friendly.
* **Full stack repo** – weights, notebook, RAG demo, eval scripts.
* **Eye-catching tags** – `qwen2`, `lora`, `rag`, `research` map directly to popular HF filters and the trending feed.
* **Clear usage code** – copy-run experience = more downloads.
---
## ⚠️ Limitations & responsible use
* Trained solely on English; non-English queries degrade sharply.
* Answers may quote training documents verbatim or closely paraphrase them.
* Not suitable for critical medical / legal advice.
* LoRA adapters are GPL-3.0; commercial use must comply with both GPL-3.0 and the Qwen 2.5 base license.
---
## ✍️ Citation
```bibtex
@misc{ranuga_disansa_gamage_2025,
author = { Ranuga Disansa Gamage and Rivindu Ashinsa and Thuan Naheem and Sanila Wijesekara },
title = { ResearchQwen-2.5-3B-LoRA (Revision 7ea9f5f) },
year = 2025,
url = { https://huggingface.co/Programmer-RD-AI/ResearchQwen-2.5-3B-LoRA },
doi = { 10.57967/hf/5623 },
publisher = { Hugging Face }
}
```