Gyanateet Dutta
Ryukijano
AI & ML interests
Computer Vision, Robotics, Generative modelling, ML in browser, healthcare applications, intersection of art and ML.
Recent Activity
liked a dataset 2 days ago: builddotai/Egocentric-10K
liked a model 10 days ago: CompVis/DisMo
upvoted an article 14 days ago: Why You Should Care About Partial Differential Equations (PDEs)
Organizations
VILA
Diffusion models
Explore the capabilities of diffusion models for natural language processing. This collection features a diverse set of models trained using diffusion
- PhotoVerse: Tuning-Free Image Customization with Text-to-Image Diffusion Models
  Paper • 2309.05793 • Published • 50
- 3D Gaussian Splatting for Real-Time Radiance Field Rendering
  Paper • 2308.04079 • Published • 193
- stabilityai/stable-diffusion-xl-base-1.0
  Text-to-Image • Updated • 1.95M • 7.25k
- Ryukijano/lora-trained-xl-kaggle-p100
  Text-to-Image • Updated • 21 • 1
Deep Reinforcement Learning
Features implementations and Spaces of popular RL algorithms and new paradigms on a variety of environments.
- Ryukijano/rl_course_vizdoom_health_gathering_supreme
  Reinforcement Learning • Updated
- Ryukijano/Mujoco_rl_halfcheetah_Decision_Trasformer
  Reinforcement Learning • Updated • 7
- Ryukijano/poca-SoccerTwos
  Reinforcement Learning • Updated • 54
- AlphaStar Unplugged: Large-Scale Offline Reinforcement Learning
  Paper • 2308.03526 • Published • 28
Deep learning
- NeuroPrompts: An Adaptive Framework to Optimize Prompts for Text-to-Image Generation
  Paper • 2311.12229 • Published • 27
- IP-Adapter-FaceID
  Running on Zero • Featured • 992
  Generate images with your face
- Design2Code: How Far Are We From Automating Front-End Engineering?
  Paper • 2403.03163 • Published • 98
Computer vision
- Unsupervised Universal Image Segmentation
  Paper • 2312.17243 • Published • 20
- Denoising Vision Transformers
  Paper • 2401.02957 • Published • 31
- timm/ViT-B-16-SigLIP
  Zero-Shot Image Classification • Updated • 38.4k • 34
- Slimsam
  Runtime error • 19
  Small yet powerful mask generation application ⚡️
Multi modal foundational models
Vision_language_models
2D->3D
Segmentation
Vision_transformer_robotics
Midi-composer
Neural Rendering
This collection focuses on using neural networks for photorealistic rendering and image synthesis. It features models capable of text-to-image generation.
- NeRF-Det: Learning Geometry-Aware Volumetric Representation for Multi-View 3D Object Detection
  Paper • 2307.14620 • Published • 14
- LU-NeRF: Scene and Pose Estimation by Synchronizing Local Unposed NeRFs
  Paper • 2306.05410 • Published • 4
- ashawkey/nerf2mesh
  Updated • 14
- NeRF
  Build error • Featured • 25
Own Work
LLMs
- LLM in a flash: Efficient Large Language Model Inference with Limited Memory
  Paper • 2312.11514 • Published • 260
- 3D-LFM: Lifting Foundation Model
  Paper • 2312.11894 • Published • 15
- SOLAR 10.7B: Scaling Large Language Models with Simple yet Effective Depth Up-Scaling
  Paper • 2312.15166 • Published • 60
- TinyGPT-V: Efficient Multimodal Large Language Model via Small Backbones
  Paper • 2312.16862 • Published • 31
Audio
- EVA-GAN: Enhanced Various Audio Generation via Scalable Generative Adversarial Networks
  Paper • 2402.00892 • Published • 14
- MusicGen Streaming
  Running on Zero • Featured • 284
  Generate music from text prompts in real-time
- Whisper JAX
  Runtime error • 145
  Transcribe or translate audio from microphone, file, or YouTube
- Audio Mamba: Bidirectional State Space Model for Audio Representation Learning
  Paper • 2406.03344 • Published • 22
Text_to_video diffusion
Text-3D
- Stable Fast 3D
  Running on L4 • Featured • 1.13k
  Generate a 3D mesh model from an image
- Roblox 3D Assets Generator v1
  Runtime error • Featured • 183
  Create a 3D model from an image in 10 seconds!
- LLaMA Mesh
  Running on Zero • Featured • 146
  Create 3D mesh by chatting.
- stabilityai/stable-point-aware-3d
  Image-to-3D • 2B • Updated • 967 • 324
Audio->3D
STEM