Projecta

ProjectAdotai

AI & ML interests

None yet

Recent Activity

new activity 2 months ago

xlangai/Jedi-7B-1080p: Model card + usage snippet

liked a model 2 months ago

xlangai/Jedi-7B-1080p

liked a model 2 months ago

xlangai/AgentTrek-1.0-32B

Organizations

New activity in xlangai/Jedi-7B-1080p 2 months ago

Model card + usage snippet

👍 2

#2 opened 2 months ago by merve

liked 2 models 2 months ago

xlangai/Jedi-7B-1080p

Image-Text-to-Text • 8B • Updated Jun 18 • 1.59k • 26

xlangai/AgentTrek-1.0-32B

33B • Updated Feb 19 • 4 • 5

upvoted 2 collections 2 months ago

Awesome Computer Use Agents

Collection

https://github.com/ranpox/awesome-computer-use • 25 items • Updated Dec 18, 2024 • 14

UI Agent

Collection

a collection of algorithmic agents for user interfaces/interactions, program synthesis, and robotics • 394 items • Updated 1 day ago • 60

liked a dataset 2 months ago

xlangai/aguvis-stage1

Preview • Updated 7 days ago • 586 • 14

liked a model 2 months ago

ByteDance-Seed/UI-TARS-1.5-7B

Image-Text-to-Text • 8B • Updated Apr 18 • 93.4k • 336

replied to merve's post 2 months ago

any new models that excel in GUI for CUA?

upvoted a collection 2 months ago

Releases 23 May

Collection

34 items • Updated May 26 • 8

liked a model 2 months ago

moondream/moondream-2b-2025-04-14-4bit

Image-Text-to-Text • 1B • Updated May 22 • 12.6k • 52

reacted to merve's post with 👍 2 months ago


what happened in open AI past week? so many vision LM & omni releases 🔥 merve/releases-23-may-68343cb970bbc359f9b5fb05

multimodal 💬🖼️
> new moondream (VLM) is out: it's a 4-bit quantized (with QAT) version of moondream-2b, runs on 2.5GB VRAM at 184 tps with only a 0.6% drop in accuracy (OS) 🌚
> ByteDance released BAGEL-7B, an omni model that understands and generates both image + text. they also released Dolphin, a document parsing VLM 🐬 (OS)
> Google DeepMind dropped MedGemma at I/O, a VLM that can interpret medical scans, and Gemma 3n, an omni model with competitive LLM performance
> MMaDa is a new 8B diffusion language model that can generate image and text

LLMs
> Mistral released Devstral, a 24B coding assistant (OS) 👩🏻‍💻
> Fairy R1-32B is a new reasoning model -- a distilled version of DeepSeek-R1-Distill-Qwen-32B (OS)
> NVIDIA released ACEReason-Nemotron-14B, a new 14B math and code reasoning model
> sarvam-m is a new Indic LM with hybrid thinking mode, based on Mistral Small (OS)
> samhitika-0.0.1 is a new Sanskrit corpus (BookCorpus translated with Gemma3-27B)

image generation 🎨
> MTVCrafter is a new human motion animation generator
