Wendy (wwymak)

AI & ML interests: None yet
Recent Activity
- liked a model microsoft/aurora (22 days ago)
- liked a model jhu-clsp/mmBERT-small (29 days ago)
- liked a model google/embeddinggemma-300m (about 1 month ago)
Organizations
Medical-FM
Collections

multilingual modelling
- Efficient Language Model Training through Cross-Lingual and Progressive Transfer Learning (Paper • 2301.09626 • Published • 2)
- Embedding structure matters: Comparing methods to adapt multilingual vocabularies to new languages (Paper • 2309.04679 • Published)
- An Empirical Study on Cross-lingual Vocabulary Adaptation for Efficient Generative LLM Inference (Paper • 2402.10712 • Published)
- FOCUS: Effective Embedding Initialization for Specializing Pretrained Multilingual Models on a Single Language (Paper • 2305.14481 • Published • 2)
small-but-mighty-llms
llm-explainability
synthetic-personas
good datasets
attention zoo
- TransformerFAM: Feedback attention is working memory (Paper • 2404.09173 • Published • 43)
- Ring Attention with Blockwise Transformers for Near-Infinite Context (Paper • 2310.01889 • Published • 13)
- Megalodon: Efficient LLM Pretraining and Inference with Unlimited Context Length (Paper • 2404.08801 • Published • 66)
llm-long-context
- PoSE: Efficient Context Window Extension of LLMs via Positional Skip-wise Training (Paper • 2309.10400 • Published • 26)
- winglian/Llama-3-8b-64k-PoSE (Text Generation • 8B • Updated • 74 • 76)
- Megalodon: Efficient LLM Pretraining and Inference with Unlimited Context Length (Paper • 2404.08801 • Published • 66)
image-generation-models
recsys
Medical-FM