HINT-lab's Collections
Reward-Calibration
HINT-lab/llama3-8b-final-ppo-c-v0.3 • Text Generation • 8B • 31
HINT-lab/mistral-7b-hermes-crm-skywork • 7B • 14
HINT-lab/mistral-7b-hermes-cdpo-v0.2 • Text Generation • 7B • 2
HINT-lab/mistral-7b-ppo-clean-hermes • Text Generation • 7B • 4
HINT-lab/mistral-7b-ppo-hermes-v0.3 • Text Generation • 7B • 3 • 1
HINT-lab/mistral-7b-ppo-m-hermes • Text Generation • 7B • 8 • 1
HINT-lab/llama3-8b-cdpo-v0.2 • Text Generation • 8B • 7
HINT-lab/llama3-8b-final-ppo-v0.3 • Text Generation • 8B • 16
HINT-lab/mistral-7b-hermes-rm-skywork • 7B • 6
HINT-lab/llama3-8b-final-ppo-m-v0.3 • Text Generation • 8B • 46
HINT-lab/llama3-8b-crm-final-v0.1 • 8B • 4
HINT-lab/llama3-8b-final-ppo-clean-v0.1 • Text Generation • 8B • 7
HINT-lab/mistral-7b-hermes-dpo-v0.2 • Text Generation • 7B • 6
HINT-lab/mistral-7b-ppo-c-hermes • Text Generation • 7B • 12
HINT-lab/llama3-8b-dpo-v0.2 • Text Generation • 8B • 5