---
base_model: meta-llama/Meta-Llama-3-8B
library_name: peft
license: llama3
datasets:
- lewtun/github-issues
language:
- en
pipeline_tag: summarization
---
# Model Card: LoRA-LLaMA3-8B-GitHub-Summarizer
This repository provides LoRA adapter weights fine-tuned on top of Meta’s LLaMA-3-8B model for the task of summarizing GitHub issues and discussions. The model was trained on a curated dataset of open-source GitHub issues to produce concise, readable, and technically accurate summaries.
## Model Details
### Model Description
- **Developed by:** Saramsh Gautam (Louisiana State University)
- **Model type:** LoRA adapter weights
- **Language(s):** English
- **License:** llama3 (use must comply with Meta's LLaMA 3 Community License)
- **Fine-tuned from model:** `meta-llama/Meta-Llama-3-8B`
- **Library used:** PEFT (LoRA) with Hugging Face Transformers
### Model Sources
- **Base model:** [Meta-LLaMA-3-8B](https://huggingface.co/meta-llama/Meta-Llama-3-8B)
- **Repository:** [link to this repo]()
## Uses
### Direct Use
These adapter weights must be loaded on top of the base LLaMA-3-8B model using PEFT's `PeftModel` wrapper; they can optionally be merged into the base weights with `merge_and_unload()` for standalone inference.
Example:
```python
from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel

# Load the base model, then attach the LoRA adapter weights on top of it.
base_model = AutoModelForCausalLM.from_pretrained("meta-llama/Meta-Llama-3-8B")
model = PeftModel.from_pretrained(base_model, "saramshgautam/lora-llama-8b-github")
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Meta-Llama-3-8B")
```
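After loading, the model is given the issue text and prompted for a summary. The helper below is only a sketch of that step: the prompt template, field names, and character budget are illustrative assumptions, not the exact format used during fine-tuning.

```python
def build_summary_prompt(title: str, comments: list[str], max_chars: int = 6000) -> str:
    """Assemble a summarization prompt from a GitHub issue.

    The template is hypothetical; adapt it to match the format the
    adapter was actually trained on.
    """
    body = "\n\n".join(comments)[:max_chars]  # truncate very long threads
    return (
        "Summarize the following GitHub issue.\n\n"
        f"Title: {title}\n\n"
        f"Discussion:\n{body}\n\n"
        "Summary:"
    )

prompt = build_summary_prompt(
    "Crash on empty config file",
    ["Loading an empty YAML file raises a TypeError.",
     "Reproduced on v2.1; adding a guard in load_config() fixes it."],
)
# The prompt string is then tokenized and passed to model.generate(...).
```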
### Intended Use
- Research in summarization of technical conversations
- Augmenting code review and issue tracking pipelines
- Studying model adaptation via parameter-efficient fine-tuning
### Out-of-Scope Use
- Commercial applications (restricted by Meta’s LLaMA license)
- General-purpose conversation or chatbot use (the model is optimized for summarization)
## Bias, Risks, and Limitations
- The model inherits biases from both the base LLaMA-3 model and the GitHub dataset. It may underperform on non-technical content or multilingual issues.
## Recommendations
Use only for academic or non-commercial research. Evaluate responsibly before using in production or public-facing tools.
## How to Get Started with the Model
See the example in “Direct Use” above. You must separately download the base model from Meta and load the LoRA adapters from this repo.
## Training Details
### Training Data
- Source: [`lewtun/github-issues`](https://huggingface.co/datasets/lewtun/github-issues) on the Hugging Face Hub
- Description: Contains 3,000+ GitHub issues and comments from popular open-source repositories.
### Training Procedure
- LoRA with PEFT
- 4-bit quantized training using bitsandbytes
- Mixed precision: bf16
- Batch size: 8
- Epochs: 3
- Optimizer: AdamW
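To see why LoRA makes this fine-tune parameter-efficient, it helps to count the adapter parameters. The arithmetic below is a sketch: the rank and target modules are assumptions for illustration (the card does not state them), while the hidden size and layer count are those of LLaMA-3-8B.

```python
# Rough LoRA parameter-efficiency arithmetic. Rank and target modules
# are ASSUMED for illustration; the card does not document them.
hidden = 4096      # LLaMA-3-8B hidden size
n_layers = 32      # transformer blocks in the 8B model
rank = 16          # hypothetical LoRA rank

def lora_params(d_in: int, d_out: int, r: int) -> int:
    """LoRA adds two low-rank matrices (d_in x r and r x d_out) per adapted weight."""
    return d_in * r + r * d_out

# Suppose only q_proj and v_proj (both hidden x hidden) are adapted.
per_layer = 2 * lora_params(hidden, hidden, rank)
total_lora = n_layers * per_layer

print(f"Trainable LoRA params: {total_lora:,}")           # 8,388,608
print(f"Fraction of 8B base:   {total_lora / 8e9:.4%}")   # ~0.10%
```

Training only ~0.1% of the weights is what makes 4-bit quantized training of an 8B model feasible on modest hardware.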
## Evaluation
### Metrics
ROUGE-1, ROUGE-2, ROUGE-L, and ROUGE-Lsum on a held-out 500-issue test set.
### Results
| Metric | Score |
| ---------- | ----- |
| ROUGE-1 | 0.706 |
| ROUGE-2 | 0.490 |
| ROUGE-L | 0.570 |
| ROUGE-Lsum | 0.582 |
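For intuition about what the scores above measure, here is a minimal ROUGE-1 F1 sketch in plain Python. Real evaluations typically use the `rouge_score` package, which additionally applies its own tokenization and optional stemming, so values will differ slightly.

```python
from collections import Counter

def rouge1_f1(candidate: str, reference: str) -> float:
    """ROUGE-1 F1: unigram overlap between candidate and reference summaries."""
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    overlap = sum((cand & ref).values())  # clipped unigram matches
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)

score = rouge1_f1("fix crash when parsing empty config",
                  "fixes a crash on empty config files")
print(f"{score:.2f}")  # 0.46
```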
---
## Environmental Impact
- **Hardware Type:** 4×A100 GPUs (university HPC cluster)
- **Training Hours:** ~4 hours
- **Carbon Estimate:** ~10.2 kg CO₂eq
_(estimated via [ML CO2 calculator](https://mlco2.github.io/impact))_
---
## Citation
**APA:**
Gautam, S. (2025). _LoRA-LLaMA3-8B-GitHub-Summarizer: Adapter weights for summarizing GitHub issues using LLaMA 3_. Hugging Face. https://huggingface.co/saramshgautam/lora-llama-8b-github
**BibTeX:**
```bibtex
@misc{gautam2025lora,
  title        = {LoRA-LLaMA3-8B-GitHub-Summarizer},
  author       = {Gautam, Saramsh},
  year         = {2025},
  howpublished = {\url{https://huggingface.co/saramshgautam/lora-llama-8b-github}},
  note         = {Fine-tuned adapter weights using LoRA on Meta-LLaMA-3-8B}
}
```
---
## Contact
- **Author:** Saramsh Gautam
- **Affiliation:** Louisiana State University
- **Email:** [your email]
- **Hugging Face profile:** [https://huggingface.co/saramshgautam](https://huggingface.co/saramshgautam)
---
## Framework Versions
- **PEFT:** 0.15.2
- **Transformers:** 4.40.0
- **Bitsandbytes:** 0.41.3
- **Datasets:** 2.18.0