# LLM Model Tracing
This repository investigates model tracing in large language models (LLMs).
Specifically, given a base LLM and a fine-tuned LLM, this code provides functionality to:
- Permute the weights of one model (either MLP or embedding weights).
- Align the weights of the fine-tuned model to the base model using the Hungarian algorithm (a minimal sketch follows this list).
- Evaluate the effect of weight permutation and alignment on different statistics:
  - Mode connectivity
  - Cosine similarity
  - Embedding similarity
- Evaluate the perplexity of the base and fine-tuned models on a given dataset.
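As a rough illustration of the alignment step, the sketch below permutes the rows of one MLP up-projection matrix to best match another using the Hungarian algorithm via `scipy.optimize.linear_sum_assignment`. The function name, tensor shapes, and cosine-similarity cost are illustrative assumptions, not the repository's exact implementation.
```python
import torch
from scipy.optimize import linear_sum_assignment


def align_mlp_up_proj(base_up: torch.Tensor, ft_up: torch.Tensor) -> torch.Tensor:
    """Permute the rows (hidden units) of ft_up to best match base_up.

    Both tensors are assumed to have shape (intermediate_size, hidden_size),
    as in Llama-style MLP up projections.
    """
    # Cost matrix: negative cosine similarity between every base/fine-tuned row pair.
    base_n = torch.nn.functional.normalize(base_up.detach().float(), dim=1)
    ft_n = torch.nn.functional.normalize(ft_up.detach().float(), dim=1)
    cost = -(base_n @ ft_n.T).numpy()

    # The Hungarian algorithm finds the row permutation minimizing total cost.
    row_ind, col_ind = linear_sum_assignment(cost)
    return ft_up[col_ind]
```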
## Requirements
Install the necessary packages using:
```bash
pip install -r requirements.txt
```
For development, install the development dependencies:
```bash
pip install -r requirements-dev.txt
```
### Code Formatting with pre-commit
This repository uses pre-commit hooks to ensure code quality and consistency.
1. Install pre-commit:
```bash
pip install pre-commit
```
2. Set up the pre-commit hooks:
```bash
pre-commit install
```
3. (Optional) Run pre-commit on all files:
```bash
pre-commit run --all-files
```
Pre-commit will automatically run on staged files when you commit changes, applying:
- Black for code formatting
- Ruff for linting and fixing common issues
- nbQA for notebook formatting
- Various file checks (trailing whitespace, YAML validity, etc.)
## Usage
The repository provides two main scripts:
- `main.py`: Executes the main experiment pipeline for model tracing.
- `launch.py`: Launches multiple experiments in parallel using Slurm.
### `main.py`
This script performs the following steps:
1. Loads the base and fine-tuned LLMs.
2. Optionally permutes the weights of the fine-tuned model.
3. Calculates the selected statistic for the non-aligned models.
4. Optionally aligns the weights of the fine-tuned model to the base model.
5. Calculates the selected statistic for the aligned models.
6. Optionally evaluates the perplexity of the base and fine-tuned models.
7. Saves the results to a pickle file.
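For orientation, here is a minimal Python sketch of that flow. It is not the repository's implementation: the statistic shown is a toy single-layer cosine similarity, and only a subset of the result keys is populated.
```python
import pickle
import time

import torch
from transformers import AutoModelForCausalLM


def toy_stat(base, ft, name="model.layers.0.mlp.up_proj.weight"):
    """Illustrative statistic: cosine similarity of one pair of weight matrices."""
    a = dict(base.named_parameters())[name].detach().flatten().float()
    b = dict(ft.named_parameters())[name].detach().flatten().float()
    return torch.nn.functional.cosine_similarity(a, b, dim=0).item()


def run(base_id, ft_id, save_path="results.p"):
    start = time.time()
    base = AutoModelForCausalLM.from_pretrained(base_id)
    ft = AutoModelForCausalLM.from_pretrained(ft_id)

    results = {
        "args": {"base_model_id": base_id, "ft_model_id": ft_id},
        "non-aligned test stat": toy_stat(base, ft),
        "time": time.time() - start,
    }
    with open(save_path, "wb") as f:
        pickle.dump(results, f)
    return results
```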
The script accepts various command-line arguments:
- `--base_model_id`: HuggingFace model ID for the base model.
- `--ft_model_id`: HuggingFace model ID for the fine-tuned model.
- `--permute`: Whether to permute the weights of the fine-tuned model.
- `--align`: Whether to align the weights of the fine-tuned model to the base model.
- `--dataset_id`: HuggingFace dataset ID for perplexity evaluation.
- `--stat`: Statistic to calculate. Options include `mode`, `cos`, and `emb`, as well as the following (a weight-similarity sketch follows this argument list):
  - `csu`: cosine similarity of weights statistic (on MLP up-projection matrices) with Spearman correlation
  - `csu_all`: `csu` computed on all pairs of parameters with equal shape
  - `csh`: cosine similarity of MLP activations statistic with Spearman correlation
  - `match`: unconstrained statistic with permutation matching of MLP activations
  - `match_all`: the `match` statistic on all pairs of MLP block activations
- `--attn`: Whether to consider attention weights in the "mode" statistic.
- `--emb`: Whether to consider embedding weights in the "mode" statistic.
- `--eval`: Whether to evaluate perplexity.
- `--save`: Path to save the results pickle file.
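For intuition about the weight-based statistics, the sketch below compares the MLP up-projection matrices of two models using cosine similarity and a Spearman rank correlation. The exact per-layer aggregation used by `csu` in this repository may differ; the parameter name assumes Llama-style Hugging Face checkpoints.
```python
import torch
from scipy.stats import spearmanr


def up_proj_similarity(base, ft, layer: int = 0, n_samples: int = 100_000):
    """Compare one layer's MLP up-projection weights across two models."""
    name = f"model.layers.{layer}.mlp.up_proj.weight"
    a = dict(base.named_parameters())[name].detach().flatten().float()
    b = dict(ft.named_parameters())[name].detach().flatten().float()

    cos = torch.nn.functional.cosine_similarity(a, b, dim=0).item()

    # Subsample entries so the rank correlation stays cheap on large matrices.
    idx = torch.randperm(a.numel())[:n_samples]
    rho, _ = spearmanr(a[idx].numpy(), b[idx].numpy())
    return {"cosine": cos, "spearman": rho}
```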
Example usage:
```bash
python main.py --base_model_id meta-llama/Llama-2-7b-hf --ft_model_id lmsys/vicuna-7b-v1.5 --stat csu --save results.p
```
```bash
python main.py --base_model_id meta-llama/Llama-2-7b-hf --ft_model_id lmsys/vicuna-7b-v1.5 --permute --align --dataset_id wikitext --stat match --attn --save results.p
```
### `launch.py`
This script launches multiple experiments in parallel using Slurm. It reads model IDs from a YAML file and runs `main.py` for each pair of base and fine-tuned models. The `--flat` flag controls which pairs are run:
- `--flat all` (default): run on all pairs of models listed in a YAML file (see `config/llama7b.yaml`).
- `--flat split`: run on all pairs of a 'base' model with a 'finetuned' model (see `config/llama7b_split.yaml`).
- `--flat specified`: run on an explicitly specified list of model pairs.
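A minimal sketch of what the `--flat all` case might look like is shown below. It assumes the YAML holds a top-level `models:` list of Hugging Face IDs and uses illustrative Slurm resource flags; the repository's actual config schema and submission logic may differ.
```python
import itertools
import subprocess

import yaml


def launch_all_pairs(config_path="config/llama7b.yaml"):
    """Submit one main.py run per pair of models listed in the YAML config."""
    with open(config_path) as f:
        models = yaml.safe_load(f)["models"]  # assumed schema: models: [id1, id2, ...]

    for base_id, ft_id in itertools.combinations(models, 2):
        out = f"{base_id.split('/')[-1]}_{ft_id.split('/')[-1]}.p"
        cmd = (
            f"python main.py --base_model_id {base_id} "
            f"--ft_model_id {ft_id} --stat csu --save {out}"
        )
        # Wrap each run in an sbatch submission; resource flags are illustrative.
        subprocess.run(
            ["sbatch", "--gres=gpu:1", "--time=02:00:00", f"--wrap={cmd}"],
            check=True,
        )
```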
## Configuration
The `model-tracing/config/model_list.yaml` file defines the base and fine-tuned models for the experiments.
## Data
The code downloads and uses the Wikitext 103 dataset for perplexity evaluation.
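A rough sketch of such a perplexity evaluation is below. It chunks a slice of the WikiText-103 test split and exponentiates the mean language-modeling loss; the repository's actual evaluation loop (context length, striding, dataset slice) may differ.
```python
import math

import torch
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer


def wikitext_perplexity(model_id, n_chars=20_000, chunk_len=1024):
    """Approximate perplexity of a causal LM on a slice of WikiText-103."""
    device = "cuda" if torch.cuda.is_available() else "cpu"
    tok = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id).to(device).eval()

    text = "\n\n".join(load_dataset("wikitext", "wikitext-103-raw-v1", split="test")["text"])
    ids = tok(text[:n_chars], return_tensors="pt").input_ids.to(device)

    losses = []
    with torch.no_grad():
        for i in range(0, ids.shape[1] - 1, chunk_len):
            chunk = ids[:, i : i + chunk_len + 1]
            if chunk.shape[1] < 2:
                break
            losses.append(model(chunk, labels=chunk).loss.item())
    return math.exp(sum(losses) / len(losses))
```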
## Results
The results of the experiments are saved as pickle files. The files contain dictionaries with the following keys:
- `args`: Command-line arguments used for the experiment.
- `commit`: Git commit hash of the code used for the experiment.
- `non-aligned test stat`: Value of the selected statistic for the non-aligned models.
- `aligned test stat`: Value of the selected statistic for the aligned models (if `--align` is True).
- `base loss`: Perplexity of the base model on the evaluation dataset (if `--eval` is True).
- `ft loss`: Perplexity of the fine-tuned model on the evaluation dataset (if `--eval` is True).
- `time`: Total execution time of the experiment.
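To inspect a saved results file, a standard pickle load is enough (the key names follow the list above; keys guarded by `--align` or `--eval` may be absent):
```python
import pickle

with open("results.p", "rb") as f:
    results = pickle.load(f)

print(results["args"])
print(results["non-aligned test stat"])
print(results.get("aligned test stat"))  # present only if --align was set
print(results.get("base loss"), results.get("ft loss"))  # present only if --eval was set
print(results["time"])
```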
## Sample commands
### 70B runs
```bash
python main.py --base_model_id meta-llama/Llama-2-70b-hf --ft_model_id meta-llama/Meta-Llama-3-70B --stat csu
```
## Experiments
Relevant scripts for running additional experiments described in our paper (for example, retraining MLP blocks and evaluating our statistics) live in the `experiments/` folder.
These include `experiments/localized_testing.py` (Section 3.2.1) for fine-grained forensics and layer matching between two models; `experiments/csu_full.py` (Section 3.2.1) for full parameter matching between any two model architectures (e.g., hybrid models); `experiments/generalized_match.py` (Sections 2.3.2, 3.2.3, 3.2.4) for the generalized robust test that involves retraining or distilling GLU MLPs; and `experiments/huref.py` (Appendix F), where we reproduce and break the invariants from related work (Zeng et al., 2024).