SAM 2: Segment Anything in Images and Videos
Paper • 2408.00714 • Published • 120
Do we still need task-specific networks for computer vision today?
Note 1. SAM 2 processes video frames one at a time, using a memory attention module that attends to previous memories of the target object.
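The streaming idea in Note 1 can be illustrated with a toy sketch. This is not SAM 2's actual code: the names `cross_attend`, `process_video`, and `max_memory_frames` are hypothetical, the tokens are plain lists of floats, and the memory is simplified to a fixed-size queue of past frame outputs. It only shows the control flow: each frame is handled in turn, reads from stored memories through attention, and is then written back to memory.

```python
import math
from collections import deque


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attend(queries, memory):
    """Single-head dot-product attention: each query token reads from memory tokens."""
    if not memory:
        # First frame: there is no memory yet, so tokens pass through unchanged.
        return [list(q) for q in queries]
    dim = len(queries[0])
    out = []
    for q in queries:
        scores = [sum(a * b for a, b in zip(q, m)) / math.sqrt(dim) for m in memory]
        weights = softmax(scores)
        attended = [sum(w * m[i] for w, m in zip(weights, memory)) for i in range(dim)]
        # Residual connection: keep the frame's own features and add what was read.
        out.append([qi + ai for qi, ai in zip(q, attended)])
    return out


def process_video(frames, max_memory_frames=2):
    """Process frames one at a time, conditioning each on a bounded memory bank.

    `frames` is a list of frames; each frame is a list of token vectors.
    The bounded queue is an assumption of this sketch, not a claim about SAM 2.
    """
    bank = deque(maxlen=max_memory_frames)  # oldest frame is dropped automatically
    outputs = []
    for frame_tokens in frames:
        memory = [tok for past in bank for tok in past]  # flatten stored frames
        conditioned = cross_attend(frame_tokens, memory)
        outputs.append(conditioned)
        bank.append(conditioned)  # stand-in for writing an encoded memory
    return outputs


frames = [[[1.0, 0.0]], [[0.0, 1.0]], [[1.0, 1.0]]]
outputs = process_video(frames)
```

Because the loop only ever looks at the current frame and the bank, the cost per frame stays bounded regardless of video length, which is the practical appeal of processing frames in a streaming fashion.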