
	---
	license: apache-2.0
	language:
	- en
	- ko
	- ja
	---
	# Intermediate Checkpoints Release

	For the first time among Korean-targeted LLMs, we’re releasing intermediate checkpoints from the Tri family (0.5B, 1.9B, 7B, and 70B) to advance research on LLM training dynamics. Checkpoints are saved at regular intervals of ≈20B tokens (0.5B), ≈40B tokens (1.9B), and ≈160B tokens (7B and 70B), so training dynamics can be analyzed consistently across a run. Each checkpoint is published on its own branch, named after its training step.
	We’re also sharing the 0.5B and 1.9B runs. They were originally produced for system bring-up, but they are useful artifacts for analyzing training behavior at smaller scales.
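
When plotting metrics across checkpoints, it can help to turn a checkpoint's position in the sequence into an approximate token count. The sketch below uses only the spacings quoted above. The helper name `approx_tokens_seen` is hypothetical, and the mapping assumes the first released checkpoint sits one interval into training with uniform spacing afterwards, which this README does not confirm.

```python
# Approximate checkpoint spacing in tokens, taken from the figures quoted above.
TOKENS_PER_CHECKPOINT = {
    "0.5B": 20_000_000_000,
    "1.9B": 40_000_000_000,
    "7B": 160_000_000_000,
    "70B": 160_000_000_000,
}


def approx_tokens_seen(model_size: str, checkpoint_index: int) -> int:
    """Rough token count at the k-th released checkpoint (1-based).

    Assumption (not confirmed by the README): the first checkpoint is one
    interval into training and the spacing is uniform after that.
    """
    if checkpoint_index < 1:
        raise ValueError("checkpoint_index is 1-based")
    return TOKENS_PER_CHECKPOINT[model_size] * checkpoint_index


if __name__ == "__main__":
    for k in (1, 2, 3):
        tokens_b = approx_tokens_seen("70B", k) / 1e9
        print(f"70B checkpoint #{k}: ~{tokens_b:.0f}B tokens")
```

Treat the result as an x-axis estimate only; the blog post linked below is the place to check exact training configuration.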

	You can browse all intermediate checkpoints here:
	- Tri-0.5B → [https://huggingface.co/trillionlabs/0.5B-Intermediate-Checkpoints](https://huggingface.co/trillionlabs/0.5B-Intermediate-Checkpoints)
	- Tri-1.9B → [https://huggingface.co/trillionlabs/1.9B-Intermediate-Checkpoints](https://huggingface.co/trillionlabs/1.9B-Intermediate-Checkpoints)
	- Tri-7B → [https://huggingface.co/trillionlabs/Tri-7B-Intermediate-Checkpoints](https://huggingface.co/trillionlabs/Tri-7B-Intermediate-Checkpoints)
	- Tri-70B (SFT Preview) → [https://huggingface.co/trillionlabs/Tri-70B-Intermediate-Checkpoints](https://huggingface.co/trillionlabs/Tri-70B-Intermediate-Checkpoints)

	Feel free to check out the full Tri-series collection here:
	- https://huggingface.co/collections/trillionlabs/tri-series-687fa9ff7eb23e8ba847ef93

	Dive into the full details, including training configuration and loss curves, on our [blog](https://trillionlabs.co/research/tri-series-intermediate-checkpoints-release).



	# Usage

	```python
	from transformers import AutoModelForCausalLM, AutoTokenizer

	# Each intermediate checkpoint lives on its own branch; pass the branch name as `revision`.
	REPO_ID = "trillionlabs/Tri-70B-Intermediate-Checkpoints"
	INTERMEDIATE_STEP = "0000020000"
	model = AutoModelForCausalLM.from_pretrained(REPO_ID, revision=INTERMEDIATE_STEP, trust_remote_code=True)
	tokenizer = AutoTokenizer.from_pretrained(REPO_ID, revision=INTERMEDIATE_STEP, trust_remote_code=True)

	...
	```
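
To find which values are valid for `revision`, one option is to list the repository's branches with `huggingface_hub.list_repo_refs`. This is a minimal sketch: the helper names `step_branches` and `list_checkpoint_revisions` are hypothetical, and the filter assumes checkpoint branches are named with digits only, as in the `"0000020000"` example above.

```python
def step_branches(branch_names):
    """Keep only all-digit branch names and sort them by training step.

    Assumption: checkpoint branches are zero-padded step numbers such as
    "0000020000"; other branches (for example "main") are skipped.
    """
    steps = [name for name in branch_names if name.isdigit()]
    return sorted(steps, key=int)


def list_checkpoint_revisions(repo_id="trillionlabs/Tri-70B-Intermediate-Checkpoints"):
    """Return the step-named branches of a checkpoint repository."""
    # Imported lazily so step_branches works without huggingface_hub installed.
    from huggingface_hub import list_repo_refs

    refs = list_repo_refs(repo_id)
    return step_branches(branch.name for branch in refs.branches)


if __name__ == "__main__":
    # Requires network access to the Hugging Face Hub.
    for revision in list_checkpoint_revisions():
        print(revision)
```

Any name printed this way can be passed as `revision` to `from_pretrained`.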