Could you elaborate on the architectural or post-processing steps you implemented for video-based tracking?

#1
by Jinhaoli - opened

Given that DINOv3 is a model designed for processing individual images, I'm curious to learn about the mechanism you used to achieve object tracking across video frames.
