Vietnamese Mistral

AI & ML interests

Mistral & Mixtral for Vietnamese

Recent Activity

vumichien

authored 2 papers about 2 months ago

MixtureVitae: Open Web-Scale Pretraining Dataset With High Quality Instruction and Reasoning Data Built from Permissive-First Text Sources

Paper • 2509.25531 • Published Sep 29 • 7

BigCodeArena: Unveiling More Reliable Human Preferences in Code Generation via Execution

Paper • 2510.08697 • Published Oct 9 • 35

huu-ontocord

authored a paper 2 months ago

MixtureVitae: Open Web-Scale Pretraining Dataset With High Quality Instruction and Reasoning Data Built from Permissive-First Text Sources

Paper • 2509.25531 • Published Sep 29 • 7

Taishi-N324

authored a paper 2 months ago

MixtureVitae: Open Web-Scale Pretraining Dataset With High Quality Instruction and Reasoning Data Built from Permissive-First Text Sources

Paper • 2509.25531 • Published Sep 29 • 7

Taishi-N324

authored a paper 3 months ago

Optimal Sparsity of Mixture-of-Experts Language Models for Reasoning Tasks

Paper • 2508.18672 • Published Aug 26 • 10

anoperson

authored a paper 3 months ago

mSCoRe: a $M$ultilingual and Scalable Benchmark for $S$kill-based $Co$mmonsense $Re$asoning

Paper • 2508.10137 • Published Aug 13 • 2

anoperson

authored a paper 5 months ago

Lizard: An Efficient Linearization Framework for Large Language Models

Paper • 2507.09025 • Published Jul 11 • 18

Taishi-N324

authored a paper 5 months ago

Rewriting Pre-Training Data Boosts LLM Performance in Math and Code

Paper • 2505.02881 • Published May 5 • 4

sangttruong

authored a paper 5 months ago

ResearchCodeBench: Benchmarking LLMs on Implementing Novel Machine Learning Research Code

Paper • 2506.02314 • Published Jun 2

huu-ontocord

authored 2 papers 5 months ago

EmoNet-Face: An Expert-Annotated Benchmark for Synthetic Emotion Recognition

Paper • 2505.20033 • Published May 26 • 4

EmoNet-Voice: A Fine-Grained, Expert-Verified Benchmark for Speech Emotion Detection

Paper • 2506.09827 • Published Jun 11 • 20

sangttruong

authored a paper 5 months ago

Reliable and Efficient Amortized Model-based Evaluation

Paper • 2503.13335 • Published Mar 17

JJitsev

authored 2 papers 6 months ago

OpenThoughts: Data Recipes for Reasoning Models

Paper • 2506.04178 • Published Jun 4 • 48

Scaling Laws for Robust Comparison of Open Foundation Language-Vision Models and Datasets

Paper • 2506.04598 • Published Jun 5 • 7

Taishi-N324

authored a paper 8 months ago

Building Instruction-Tuning Datasets from Human-Written Instructions with Open-Weight Large Language Models

Paper • 2503.23714 • Published Mar 31 • 1

Taishi-N324

authored 3 papers 9 months ago

Balancing Speed and Stability: The Trade-offs of FP8 vs. BF16 Training in LLMs

Paper • 2411.08719 • Published Nov 10, 2024 • 1

Why We Build Local Large Language Models: An Observational Analysis from 35 Japanese and Multilingual LLMs

Paper • 2412.14471 • Published Dec 19, 2024

Wider or Deeper? Scaling LLM Inference-Time Compute with Adaptive Branching Tree Search

Paper • 2503.04412 • Published Mar 6 • 5

huu-ontocord

authored 2 papers 9 months ago

RedPajama: an Open Dataset for Training Large Language Models

Paper • 2411.12372 • Published Nov 19, 2024 • 56

LLMs Lost in Translation: M-ALERT uncovers Cross-Linguistic Safety Gaps

Paper • 2412.15035 • Published Dec 19, 2024 • 4