11 6

Julia K

juliak115

AI & ML interests

None yet

Organizations

None yet

upvoted 9 papers 3 days ago

Quantifying Speaker Embedding Phonological Rule Interactions in Accented Speech Synthesis

Paper • 2601.14417 • Published 5 days ago • 5

HeartMuLa: A Family of Open Sourced Music Foundation Models

Paper • 2601.10547 • Published 10 days ago • 36

UniCorn: Towards Self-Improving Unified Multimodal Models through Self-Generated Supervision

Paper • 2601.03193 • Published 19 days ago • 46

Motion Attribution for Video Generation

Paper • 2601.08828 • Published 12 days ago • 68

mHC: Manifold-Constrained Hyper-Connections

Paper • 2512.24880 • Published 25 days ago • 279

Watching, Reasoning, and Searching: A Video Deep Research Benchmark on Open Web for Agentic Video Reasoning

Paper • 2601.06943 • Published 14 days ago • 206

BabyVision: Visual Reasoning Beyond Language

Paper • 2601.06521 • Published 15 days ago • 188

STEP3-VL-10B Technical Report

Paper • 2601.09668 • Published 11 days ago • 185

Rethinking Video Generation Model for the Embodied World

Paper • 2601.15282 • Published 4 days ago • 41

upvoted 2 papers 6 months ago

Qwen-Image Technical Report

Paper • 2508.02324 • Published Aug 4, 2025 • 269

Voxlect: A Speech Foundation Model Benchmark for Modeling Dialects and Regional Languages Around the Globe

Paper • 2508.01691 • Published Aug 3, 2025 • 10

liked 6 models 8 months ago

speechbrain/spkrec-xvect-voxceleb

Audio Classification • Updated Feb 25, 2024 • 303k • 64

tiantiaf/whisper-large-v3-narrow-accent

Audio Classification • 2B • Updated Aug 10, 2025 • 8.03k • 4

tiantiaf/whisper-large-v3-msp-podcast-emotion

Audio Classification • 2B • Updated Aug 10, 2025 • 3.2k • 5

ByteDance-Seed/BAGEL-7B-MoT

Any-to-Any • 15B • Updated 16 days ago • 577 • 1.18k

nvidia/parakeet-tdt-0.6b-v2

Automatic Speech Recognition • Updated Nov 27, 2025 • 455k • 1.41k

mistralai/Devstral-Small-2505

24B • Updated Aug 18, 2025 • 22.8k • 860