arxiv:2501.10080

Few-shot Structure-Informed Machinery Part Segmentation with Foundation Models and Graph Neural Networks

Published on Jan 17

Upvote

Authors:

Michael Schwingshackl ,

Fabio Francisco Oberweger ,

Abstract

A novel few-shot semantic segmentation approach for machinery integrates CLIPSeg, SAM, SuperPoint, and GCN, achieving effective segmentation with minimal training time and robust generalization to real data.

AI-generated summary

This paper proposes a novel approach to few-shot semantic segmentation for machinery with multiple parts that exhibit spatial and hierarchical relationships. Our method integrates the foundation models CLIPSeg and Segment Anything Model (SAM) with the interest point detector SuperPoint and a graph convolutional network (GCN) to accurately segment machinery parts. By providing 1 to 25 annotated samples, our model, evaluated on a purely synthetic dataset depicting a truck-mounted loading crane, achieves effective segmentation across various levels of detail. Training times are kept under five minutes on consumer GPUs. The model demonstrates robust generalization to real data, achieving a qualitative synthetic-to-real generalization with a J&F score of 92.2 on real data using 10 synthetic support samples. When benchmarked on the DAVIS 2017 dataset, it achieves a J&F score of 71.5 in semi-supervised video segmentation with three support samples. This method's fast training times and effective generalization to real data make it a valuable tool for autonomous systems interacting with machinery and infrastructure, and illustrate the potential of combined and orchestrated foundation models for few-shot segmentation tasks.

View arXiv page View PDF Project page GitHub 8 Add to collection

Community

Upload images, audio, and videos by dragging in the text input, pasting, or clicking here.

Tap or paste here to upload images

· Sign up or log in to comment

Upvote

Models citing this paper 0

No model linking this paper

Cite arxiv.org/abs/2501.10080 in a model README.md to link it from this page.

Datasets citing this paper 0

No dataset linking this paper

Cite arxiv.org/abs/2501.10080 in a dataset README.md to link it from this page.

Spaces citing this paper 0

No Space linking this paper

Cite arxiv.org/abs/2501.10080 in a Space README.md to link it from this page.

Collections including this paper 0

No Collection including this paper

Add this paper to a collection to link it from this page.