Alex Hant
hardhant
Recent Activity
reacted to ManniX-ITA's post about 8 hours ago
Two releases this week pushing merge methodology forward.
▶ Qwen3.6-27B-Omnimerge-v4-MLP
https://huggingface.co/ManniX-ITA/Qwen3.6-27B-Omnimerge-v4
Same-base DARE-TIES merge of Qwen3.6-27B + 3 fine-tunes (rico03 Claude distill, Esper3.1, kai-os Opus reasoning anchor) via my Omnimerge_v2 method (OBIM-lite + DAREx-q + EMR election).
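For illustration, a minimal sketch of the plain DARE-TIES step this builds on (random drop-and-rescale of each fine-tune's delta, TIES-style sign election, sum onto the base). It is not the Omnimerge_v2 pipeline itself; OBIM-lite, DAREx-q, and EMR election are separate additions, and the drop rate and weights below are placeholders.

```python
# Minimal sketch of a same-base DARE-TIES merge on one tensor:
# DARE = randomly drop a fraction of each fine-tune's delta and rescale the
# survivors; TIES = elect a majority sign per element and keep only the
# sign-agreeing contributions before summing onto the base weight.
# drop_rate and weights are illustrative placeholders.
import torch

def dare_ties_merge(base, finetunes, drop_rate=0.9, weights=None):
    weights = weights or [1.0] * len(finetunes)
    sparsified = []
    for ft, w in zip(finetunes, weights):
        delta = ft - base
        mask = (torch.rand_like(delta) > drop_rate).to(delta.dtype)
        sparsified.append(w * delta * mask / (1.0 - drop_rate))
    stacked = torch.stack(sparsified)                 # [n_models, *shape]
    elected_sign = torch.sign(stacked.sum(dim=0))     # per-element majority sign
    agree = (torch.sign(stacked) == elected_sign).to(stacked.dtype)
    return base + (stacked * agree).sum(dim=0)

# In practice this runs over every parameter of the checkpoints:
# merged[name] = dare_ties_merge(base_sd[name], [sd[name] for sd in ft_sds])
```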
Hit a Qwen3.6-specific fragility: hyperparams that work flawlessly on 3.5 produced 80% unclosed-<think> on 3.6, collapsing pass@1 to ~20%. Per-tensor delta forensics localized the failure to mlp.{gate,up,down}_proj in layers 27–52. Fix: MLP-passthrough surgery (copy MLPs verbatim from base, keep merged attn + linear_attn). Leak → 0%.
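A rough sketch of those two steps, per-tensor delta forensics and the MLP-passthrough surgery, assuming the usual Qwen-style parameter naming (model.layers.<i>.mlp.<proj>.weight); the key names and layer range handling here are assumptions, not the exact scripts used.

```python
# Sketch of (1) per-tensor delta forensics against the base checkpoint and
# (2) MLP-passthrough surgery: restore mlp.{gate,up,down}_proj from the base
# for layers 27-52 while keeping the merged attention weights.
# Parameter names assume the usual "model.layers.<i>.mlp.<proj>.weight" layout.
import re
import torch

def delta_report(base_sd, merged_sd, top_k=20):
    """Rank tensors by relative delta norm to localize where the merge drifted."""
    rows = []
    for name, b in base_sd.items():
        if name in merged_sd and b.is_floating_point():
            b32 = b.float()
            rel = (merged_sd[name].float() - b32).norm() / (b32.norm() + 1e-8)
            rows.append((rel.item(), name))
    return sorted(rows, reverse=True)[:top_k]

def mlp_passthrough(base_sd, merged_sd, layers=range(27, 53),
                    projs=("gate_proj", "up_proj", "down_proj")):
    """Copy MLP projections verbatim from the base for the affected layers."""
    pat = re.compile(r"model\.layers\.(\d+)\.mlp\.(\w+)\.weight")
    out = dict(merged_sd)
    for name in merged_sd:
        m = pat.match(name)
        if m and int(m.group(1)) in layers and m.group(2) in projs:
            out[name] = base_sd[name].clone()
    return out
```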
Q6_K results (vs Qwen3.6 base / vs Omnimerge-v2 on Qwen3.5):
• HumanEval: 84.76% (= base, +5.49 pp vs v2)
• MBPP corrected: 73.40% (+15.80 pp vs base, ≈ v2)
• GPQA Diamond: ~84.75% (partial, 192/198) (+15.5 pp vs v2)
▶ Qwen3.5-4B Importance-Signal Study (M1..M5)
Controlled 5-way comparison: same Qwen3.5-4B base, same 2 fine-tunes (Jackrong Claude-4.5 distill + Crow Opus-4.6 distill), only the importance signal driving DARE-TIES sparsification varies.
Q6_K HE / MBPP pass@1:
• M1 Vanilla DARE-TIES → 51.22 / 47.00
• M2 OMv2 (no signal) → 52.44 / 49.40
• M3 OMv2 + Fisher → 57.93 (best HE) / 48.80
• M4 mergekit ex-LRP (PR #682) → 51.22 / 49.40
• M5 OMv2 + LRP → 53.05 / 51.40 (best MBPP)
Findings: Fisher wins HE (+4.88 pp over vanilla), LRP wins MBPP (+2.60 pp). Both signals + Omnimerge_v2 recipe beat vanilla. To make multimodal-LM ex-LRP work end-to-end against Qwen3_5ForConditionalGeneration, I filed
5 patches against arcee-ai/mergekit PR #682 + 1 against rachtibat/lxt.
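As a toy illustration of what "importance signal driving the sparsification" means in M3/M5: instead of dropping delta entries uniformly at random, keep-probabilities are biased by a per-element importance map (a diagonal Fisher estimate, or an LRP relevance map of the same shape as the tensor). The normalization and density below are illustrative, not the actual OMv2 or mergekit PR #682 code.

```python
# Toy sketch of importance-guided sparsification: instead of dropping delta
# entries uniformly at random, keep-probabilities are biased by a per-element
# importance map (e.g. a diagonal Fisher estimate or an LRP relevance map of
# the same shape as the tensor). Density and normalization are illustrative.
import torch

def importance_guided_sparsify(delta, importance, density=0.1):
    """Keep ~`density` of delta entries, preferring high-importance elements."""
    p = importance.clamp(min=0).float()
    p = (density * p / (p.mean() + 1e-12)).clamp(max=1.0)  # keep-probabilities
    mask = (torch.rand_like(delta) < p).to(delta.dtype)
    # Divide survivors by their keep-probability so the expected delta is preserved.
    return delta * mask / p.clamp(min=1e-12)

# A diagonal Fisher signal for one parameter can be accumulated from squared
# gradients of the loss on a small calibration set:
#   fisher[name] += param.grad.detach() ** 2
```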
All five Mx checkpoints + Fisher/LRP signal safetensors + reproducer scripts published.
reacted to ajibawa-2023's post 13 days ago
Go-Code-Large
Dataset: https://huggingface.co/datasets/ajibawa-2023/Go-Code-Large
Go-Code-Large is a large-scale corpus of Go (Golang) programming language source code, comprising 316,427 code samples stored in .jsonl format. The dataset is designed to support research and development in large language model (LLM) pretraining, static analysis, cloud-native systems, and modern backend software engineering.
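A small example of pulling the corpus with the Hugging Face datasets library; the column names are not stated above, so the snippet only inspects the schema (verify against the dataset card).

```python
# Minimal way to stream the corpus with the Hugging Face `datasets` library and
# inspect its schema; check ds.features / the dataset card for the real column
# names before wiring it into a pretraining pipeline.
from datasets import load_dataset

ds = load_dataset("ajibawa-2023/Go-Code-Large", split="train", streaming=True)
first = next(iter(ds))
print(first.keys())   # discover the actual fields (e.g. a code/text column)
```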
By offering a focused and curated dataset for Go, this corpus enables experimentation in concurrent programming, distributed systems, and performance-oriented backend services, domains where Go is widely adopted.
Go-Code-Large addresses the relative scarcity of large, language-specific datasets for Go, enabling targeted research into idiomatic Go patterns, concurrency primitives, and scalable system design.