Visual-Interactive Text-Image Universal Embedder (ICLR-26)
Papers
Echoes Over Time: Unlocking Length Generalization in Video-to-Audio Generation Models
SoundReactor: Frame-level Online Video-to-Audio Generation
Models (8)
Sony/VIRTUE-2B-SCaR
Image-Text-to-Text • Updated • 99 • 2
Sony/VIRTUE-7B-SCaR
Image-Text-to-Text • Updated • 29 • 2
Sony/AKI-4B-phi-3.5-mini
Image-Text-to-Text • Updated • 6 • 27
Sony/humangif
Updated • 1
Sony/genwarp
Image-to-Image • Updated • 12
Sony/MoLA
Updated • 1
Sony/SilentCipher
Updated • 930 • 6
Sony/soundctm
Text-to-Audio • Updated • 18
Datasets (6)
Sony/SCaR-Train
Viewer • Updated • 958k • 111 • 1
Sony/SCaR-Eval
Viewer • Updated • 47.1k • 106 • 1
Sony/Hokkaido_Agriculture_Image_Dataset
Viewer • Updated • 250 • 68 • 2
Sony/DeepResonance_data_models
Viewer • Updated • 77.5k • 149 • 1
Sony/OpenMU-Bench
Preview • Updated • 23
Sony/ComperDial
Updated • 55 • 1