Shreyas S K
skshreyas714
AI & ML interests
NLP, NLU, NLI
Read-up research papers
- SFT Memorizes, RL Generalizes: A Comparative Study of Foundation Model Post-training
  Paper • 2501.17161 • Published • 124
- S^2R: Teaching LLMs to Self-verify and Self-correct via Reinforcement Learning
  Paper • 2502.12853 • Published • 29
- R1-Searcher: Incentivizing the Search Capability in LLMs via Reinforcement Learning
  Paper • 2503.05592 • Published • 27
- Self-Taught Self-Correction for Small Language Models
  Paper • 2503.08681 • Published • 15
Models (5)
- skshreyas714/AAPL_Team_ACB
  Text Generation • 4B • Updated • 2
- skshreyas714/qwen2.5-3B-8bit
  Updated
- skshreyas714/prompt-guard-finetuned
  Text Classification • 0.3B • Updated • 1
- skshreyas714/bge-m3-onnx
  Feature Extraction • Updated • 1
- skshreyas714/lora-trained-xl-colab
  Text-to-Image • Updated • 2 • 1