LionGuard: Building a Contextualized Moderation Classifier to Tackle Localized Unsafe Content
Paper • 2407.10995 • Published • 1
A Flexible Large Language Models Guardrail Development Methodology Applied to Off-Topic Prompt Detection
Paper • 2411.12946 • Published • 23
Safe at the Margins: A General Approach to Safety Alignment in Low-Resource English Languages -- A Singlish Case Study
Paper • 2502.12485 • Published • 2
MinorBench: A hand-built benchmark for content-based risks for children
Paper • 2503.10242 • Published • 5
A Singapore-contextualized moderation classifier.
govtech/lionguard-2
Text Classification • Updated • 1.09k
2
Lionguard 2 Demo
🦁 Localised Multilingual Moderation Classifier for Singapore
govtech/lionguard-2-synthetic-instruct
Viewer • Updated • 2.1k • 278
LionGuard 2: Building Lightweight, Data-Efficient & Localised Multilingual Content Moderators
Paper • 2507.15339 • Published
Fast, lightweight zero-shot classifiers for assessing a user prompt's relevance to the system prompt.