arxiv:2409.17895

Self-supervised Monocular Depth Estimation with Large Kernel Attention

Published on Sep 26, 2024

Abstract

AI-generated summary: A self-supervised monocular depth estimation network using large kernel attention and an up-sampling module achieves competitive results on the KITTI dataset.

Self-supervised monocular depth estimation has emerged as a promising approach since it does not rely on labeled training data. Most methods combine convolutions and Transformers to model the long-distance dependencies needed for accurate depth estimation. However, the Transformer treats 2D image features as a 1D sequence: positional encoding only partially mitigates the resulting loss of spatial information between feature blocks, and channel features tend to be overlooked, which limits depth-estimation performance. In this paper, we propose a self-supervised monocular depth estimation network that recovers finer details. Specifically, we propose a decoder based on large kernel attention, which can model long-distance dependencies without compromising the two-dimensional structure of the features while maintaining channel adaptivity. In addition, we introduce an up-sampling module to accurately recover fine details in the depth map. Our method achieves competitive results on the KITTI dataset.
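The exact decoder module is not specified on this page. A common way to realize large kernel attention (as in the Visual Attention Network line of work) decomposes a large kernel into a depthwise convolution, a dilated depthwise convolution, and a 1x1 convolution whose output gates the 2D feature map, preserving spatial structure and per-channel adaptivity. Below is a minimal PyTorch sketch under that assumption; the class name, kernel sizes, and dilation are illustrative, not taken from the paper.

    # Minimal sketch of a large-kernel-attention block (illustrative, not the
    # paper's exact decoder): a large kernel is approximated by a depthwise
    # conv, a dilated depthwise conv, and a 1x1 conv, and the result gates the
    # 2D feature map element-wise.
    import torch
    import torch.nn as nn


    class LargeKernelAttention(nn.Module):
        def __init__(self, channels: int):
            super().__init__()
            # Local spatial context: 5x5 depthwise convolution.
            self.dw_conv = nn.Conv2d(channels, channels, kernel_size=5,
                                     padding=2, groups=channels)
            # Long-range spatial context: 7x7 depthwise convolution with
            # dilation 3, approximating a 21x21 receptive field.
            self.dw_dilated = nn.Conv2d(channels, channels, kernel_size=7,
                                        padding=9, dilation=3, groups=channels)
            # 1x1 convolution mixes channels, keeping per-channel adaptivity.
            self.pw_conv = nn.Conv2d(channels, channels, kernel_size=1)

        def forward(self, x: torch.Tensor) -> torch.Tensor:
            attn = self.pw_conv(self.dw_dilated(self.dw_conv(x)))
            # The attention map gates the features in place; the 2D layout is
            # never flattened into a 1D sequence.
            return x * attn


    if __name__ == "__main__":
        feats = torch.randn(1, 64, 48, 160)  # e.g. a KITTI-scale feature map
        out = LargeKernelAttention(64)(feats)
        print(out.shape)  # torch.Size([1, 64, 48, 160])

Because all operations are convolutions on the 2D feature map, the block can model long-range context without the sequence flattening that the abstract identifies as a limitation of Transformer-based decoders.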
