thrunlab/pretraining_test
Text Generation • 0.0B
thrunlab/sparse_sparse_80_percent_pretraining_warmup_500_0_1_steps_5k_distillation
thrunlab/sparse_sparse_80_percent_pretraining_warmup_20K_0_2_steps_5k
Text Generation • 7B
thrunlab/sparse_sparse_80_percent_pretraining_warmup_20K_steps_5k
Text Generation • 7B
thrunlab/sparse_sparse_80_percent_pretraining_warmup
thrunlab/ojriginal_glue_sst2
thrunlab/mistral_sparse_80_percent_
thrunlab/original_glue_sst2
thrunlab/original_glue_qnli
thrunlab/original_glue_wic
thrunlab/mistral_sparse_80_percent_wic_1000
thrunlab/original_glue_boolq
7B
thrunlab/mistral_sparse_80_percent_boolq_1000
7B
thrunlab/original_glue_cola
7B
thrunlab/mistral_sparse_80_percent_cola_1000
7B
thrunlab/Mistral_Sparse_pretraining_80_percent_10000
7B
thrunlab/mistral_sparse_80_percent_cola_raw
7B
thrunlab/Mistral_Sparse_80_percent_cola_3000
thrunlab/Mistral_Sparse_80_percent_cola_2000
thrunlab/Mistral_Sparse_pretraining_80_percent_3000
thrunlab/Mistral_Sparse_pretraining_80_percent_2000
thrunlab/Mistral-7B-v0.1_openwebtext_sparse_pretraining
thrunlab/loading_test
7B
thrunlab/Mistral_Sparse_0.1
thrunlab/Mistral-7B-v0.1_colaMistral_scratch_cola
thrunlab/Mistral-7B-v0.1_cola_sparse_swiglu_ignore_0_1
thrunlab/Mistral-7B-v0.1_cola_sparse_swiglu_scratch
thrunlab/Mistral-7B-v0.1_cola_sparse_swiglu
thrunlab/Mistral-7B-v0.1_cola_original2
thrunlab/Mistral-7B-v0.1_cola_relu_distillation