- CLIMB: CLustering-based Iterative Data Mixture Bootstrapping for Language Model Pre-training
  Paper • 2504.13161 • Published • 92
- Hebbian Learning based Orthogonal Projection for Continual Learning of Spiking Neural Networks
  Paper • 2402.11984 • Published
- BlackGoose Rimer: Harnessing RWKV-7 as a Simple yet Superior Replacement for Transformers in Large-Scale Time Series Modeling
  Paper • 2503.06121 • Published • 5
- Timer: Transformers for Time Series Analysis at Scale
  Paper • 2402.02368 • Published • 1
mattsta
AI & ML interests
sequence-to-sequence and time-series models; novel tokenization schemes; high-throughput, low-compute solutions
Recent Activity
liked a model about 23 hours ago
Qwen/Qwen3-30B-A3B-Instruct-2507
liked a model 2 days ago
Intel/Qwen3-235B-A22B-Thinking-2507-gguf-q2ks-mixed-AutoRound
liked a model 2 days ago
Intel/Qwen3-235B-A22B-Thinking-2507-int4-AutoRound
Organizations
None yet