Sovit Ranjan Rath

Generative AI & Computer Vision Engineer

Bridging cutting-edge AI research with real-world applications through technical leadership and education.

Professional Summary

Generative AI & Computer Vision Engineer with 2+ years of experience building multimodal AI systems (LLMs/VLMs, RAG) and 3 years of experience in deep-learning-based computer vision. Authored 340+ technical articles and courses bridging cutting-edge research with real-world applications. Passionate about deploying generative AI to accelerate creative workflows in gaming.

Core Skills

Generative AI

LLMs (Phi, Whisper), Vision-Language Models, RAG Systems, Image Generation

Frameworks

PyTorch, TensorFlow, Hugging Face, LangChain, ONNX

Technical Domains

Multimodal AI, Computer Vision, Model Optimization, Agentic Workflows

Professional Experience

Indegene — Lead SWE (GenAI & LLMs)

Apr 2025 – Present
  • Slashed API costs by 98.2% by optimizing in-memory context from 2M to 36k tokens.
  • Architected an agentic workflow enabling code execution and dynamic UI visualization (graphs/charts) for healthcare solutions.
  • Built tools for real-time data interaction using generative AI.
  • Built an MCP (Model Context Protocol) server for Excel reasoning, integrated with GPT and Claude APIs.
  • Converted the existing FastAPI server architecture to an MCP server to enhance AI agent capabilities.

OpenCV University — Technical Lead, AI Education

Feb 2021 – Apr 2025
  • Led development of 4+ courses on deep learning and computer vision, used by developers and engineers globally.
  • Engineered multimodal RAG pipelines (text/vision/audio) and demo apps for CLIP-based semantic search and Whisper transcription.
  • Collaborated with product/marketing teams to align AI education with industry needs.

DebuggerCafe.com — Author & Consultant

2020 – Present
  • Published 340+ tutorials on generative AI, RAG, and computer vision with reproducible code.
  • Designed quantized CV models for low-compute environments and real-time video analysis pipelines.

Key Projects

Multimodal AI Library

Integrated CLIP, SAM, and Molmo for NLP-guided image segmentation.

Local RAG System

Built a Python-based Q&A tool using Sentence Transformers + Phi and custom vector search.

Vision Transformers Library

Created pip-installable ViT/Swin/DETR implementations.

Object Detection Toolkit

Developed modular training scripts for Faster R-CNN (PyTorch).

Education

B.Tech, Information Technology

2016 – 2020

CGPA: 8.61/10

Technical Presence

  • 340+ Articles on LLMs, VLMs, generative AI, and deep learning (DebuggerCafe.com).
  • Open Source: GitHub repositories for RAG, vision transformers, and AI tools.
