John Leimgruber III (PRO)
ubergarm · john-leimgruber
256 followers · 60 following
Donate: https://www.paypal.com/donate/?hosted_button_id=HU59345BZVSUA
AI & ML interests: Open LLMs and Astrophotography image processing.
ubergarm's activity
New activity in ubergarm/GLM-4.7-Flash-GGUF (3 minutes ago)
  "Re-cooking imatrix and quants with updated ik/llama.cpp PR" (#1, opened 3 minutes ago by ubergarm)

Updated a model (10 minutes ago)
  ubergarm/GLM-4.7-Flash-GGUF · Text Generation · Updated 12 minutes ago · 460 downloads · 7 likes
New activity in zai-org/GLM-4.7-Flash (about 9 hours ago)
  "Why does the KV cache occupy so much GPU memory?" (4 comments; #21, opened about 18 hours ago by yyg201708)
  "Cannot run vLLM on DGX Spark: ImportError: libcudart.so.12" (1 comment; #18, opened about 24 hours ago by yyg201708)
  "Performance Discussion" (👀 2; 3 comments; #1, opened 1 day ago by IndenScale)
  "Enormous KV-cache size?" (👍 ➕ 4; 16 comments; #3, opened 1 day ago by nephepritou)
New activity in noctrex/GLM-4.7-Flash-MXFP4_MOE-GGUF (about 11 hours ago)
  "Feedback from running in LM Studio 0.39.3 with v1.103.2 of llama.cpp" (6 comments; #1, opened about 19 hours ago by spanspek)
Liked a model (about 13 hours ago)
  noctrex/GLM-4.7-Flash-MXFP4_MOE-GGUF · Text Generation · 30B params · Updated 1 day ago · 1.16k downloads · 9 likes
Published a model (1 day ago)
  ubergarm/GLM-4.7-Flash-GGUF · Text Generation · Updated 12 minutes ago · 460 downloads · 7 likes
Liked 2 models (1 day ago)
  ngxson/GLM-4.7-Flash-GGUF · 30B params · Updated about 20 hours ago · 6.79k downloads · 17 likes
  zai-org/GLM-4.7-Flash · Text Generation · 31B params · Updated about 17 hours ago · 15.2k downloads · 784 likes
New activity in ubergarm/GLM-4.7-GGUF (3 days ago)
  "Stable run on 2x RTX 5090 and 2 Xeon E5 2696 V4 and DDR4 with ik_llama.cpp - 6.1 t/s on IQ4_K and 5.1 t/s on IQ5_K, opencode works with this" (👍 1; 10 comments; #5, opened 25 days ago by martossien)
Liked a model (3 days ago)
  ArtusDev/requests-exl · Updated Oct 13, 2025 · 6 likes
New activity in ArtusDev/requests-exl (3 days ago)
  "[QUANTING UPDATE]" (❤️ 👍 3; 4 comments; #28, opened 5 days ago by ArtusDev)
New activity in ubergarm/Devstral-Small-2-24B-Instruct-2512-GGUF (3 days ago)
  "Mistral 3 large quant" (👍 1; 1 comment; #1, opened 3 days ago by facedwithahug)
New activity in ubergarm/DeepSeek-V3.2-Speciale-GGUF (3 days ago)
  "QuIP - 2 bit quantised as good as 16 bit" (5 comments; #5, opened 8 days ago by infinityai)
New activity in msievers/gemma-3-1b-it-qat-q4_0-gguf (7 days ago)
  "Thanks for sharing your work!" (❤️ 2; 3 comments; #1, opened 7 days ago by ubergarm)
New activity in ubergarm/DeepSeek-V3.2-Speciale-GGUF (7 days ago)
  "Say Whattt?!" (🔥 👍 4; 7 comments; #1, opened 12 days ago by mtcl)
New activity in ubergarm/Devstral-2-123B-Instruct-2512-GGUF (7 days ago)
  "Decent PPL with 100% IQ4_KSS" (🔥 1; 9 comments; #3, opened about 1 month ago by sokann)
New activity in kyutai/pocket-tts (7 days ago)
  "Open access to the model" (2 comments; #1, opened 7 days ago by jujutechnology)