Ken Tsui
kenhktsui
AI & ML interests
ML engineer, researcher
VLM, LLM benchmark
Opinions are my own
Recent Activity
liked a model 27 days ago: moonshotai/Kimi-K2.5
liked a dataset about 2 months ago: VITRA-VLA/VITRA-1M
liked a dataset 3 months ago: Hothan/OlympiadBench
Organizations
FastText Model for Pretraining Data Curation
- kenhktsui/llm-data-textbook-quality-fasttext-classifier-v2
  Text Classification • Updated • 208 • 28
- kenhktsui/fineweb-edu-fasttext-classifier
  Text Classification • Updated • 5.92k • 4
- kenhktsui/code-natural-language-fasttext-classifier
  Text Classification • Updated • 686 • 4
- kenhktsui/math-fasttext-classifier
  Text Classification • Updated • 16 • 2
LongTalk
A Very Long Chain-of-Thought Dataset for Reasoning Model Post-Training
- kenhktsui/longtalk-cot-v0.1
  Viewer • Updated • 61.2k • 58 • 13
- kenhktsui/qwen2.5-7b-instruct-thinking-sft-merged-gguf
  8B • Updated • 27 • 1
- kenhktsui/qwen2.5-7b-instruct-thinking-sft-merged
  Text Generation • 8B • Updated • 2
- kenhktsui/llama3.1-8b-instruct-thinking-sft-merged-gguf
  8B • Updated • 16 • 1
models 34
kenhktsui/math-fasttext-classifier
Text Classification • Updated • 16 • 2
kenhktsui/code-natural-language-fasttext-classifier
Text Classification • Updated • 686 • 4
kenhktsui/fineweb-edu-fasttext-classifier
Text Classification • Updated • 5.92k • 4
kenhktsui/llm-data-textbook-quality-fasttext-classifier-v2
Text Classification • Updated • 208 • 28
kenhktsui/finefineweb-domain-fasttext-classifier
Text Classification • Updated • 3 • 2
kenhktsui/Qwen2.5-3B-Instruct-GRPO-basic-sampling_temp_05
Text Generation • Updated
kenhktsui/Qwen2.5-3B-Instruct-GRPO-minp-sampling_temp_05
Text Generation • Updated • 2
kenhktsui/Qwen-0.5B-GRPO
Text Generation • 0.5B • Updated • 1 • 1
kenhktsui/Qwen-0.5B-GRPO-gsm8k-count-wait-cap-cross-correct
Text Generation • 0.5B • Updated
kenhktsui/llama3.1-8b-instruct-thinking-sft-merged-gguf
8B • Updated • 16 • 1
datasets 48
kenhktsui/FineFineWeb-First100K
Viewer • Updated • 6.7M • 42
kenhktsui/serp-bench
Updated • 3
kenhktsui/math-classifiers-data
Viewer • Updated • 2M • 113
kenhktsui/longtalk-cot-v0.1
Viewer • Updated • 61.2k • 58 • 13
kenhktsui/code-natural-language-classification-dataset
Viewer • Updated • 4.05M • 26
kenhktsui/github-code-permissive-sample
Viewer • Updated • 3.21M • 121
kenhktsui/llm-data-textbook-quality-v2
Viewer • Updated • 1.01M • 33
kenhktsui/test_imdb
Viewer • Updated • 40 • 7
kenhktsui/test_twitter_financial_news
Viewer • Updated • 60 • 7
kenhktsui/test_ag_news
Viewer • Updated • 104 • 8