Self-Supervised Monocular Depth Estimation Based on Dual-Branch DepthNet and Multi-Attention Fusion
Authors:
Yuanyuan Wang, Dianxi Shi, Junze Zhang, Luoxi Jing, Xueqi Li, and Xucan Chen
Conference:
ICIC 2025 Posters, Ningbo, China, July 26-29, 2025
Pages:
482-498
Keywords:
Monocular Depth Estimation · Self-Supervised · Dual-Branch Network
Abstract
Monocular depth estimation infers 3D geometric structures of scenes from a single RGB image, offering significant applications in autonomous driving, robot navigation, and other fields. While current self-supervised learning methods avoid dependency on ground truth depth data, they still exhibit notable limitations in complex scenarios: traditional encoder-decoder architectures inevitably lose high-frequency detail features when acquiring global context through continuous downsampling, resulting in blurred edges and texture distortion in depth maps. To alleviate these issues, we propose a novel approach named HyperDetailNet, which significantly enhances detail preservation in depth estimation. Specifically, our method contains two key components: (1) A dual-branch detail-global feature extraction network, where the detail branch adopts an enlarge-then-reduce strategy to preserve high-frequency texture information, while the global branch extracts overall structural information of the scene. (2) To effectively fuse features from both branches, we designed a multi-attention fusion module that combines spatial attention, channel attention, and sliding window self-attention mechanisms to enhance the model's perception of detailed regions. Experimental results demonstrate that HyperDetailNet achieves excellent performance on both KITTI and Make3D datasets, with significant improvements in depth estimation for edge and texture-rich areas. Additionally, ablation experiments verify the effectiveness of the dual-branch detail-global feature extraction DepthNet and multi-attention fusion module.
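As an illustration only, the fusion idea described in the abstract can be sketched in a few lines of NumPy. This is not the authors' implementation: the function names, the parameter-free attention gates, the use of non-overlapping windows in place of a true sliding window, and the additive way the three attention outputs are combined are all assumptions made for the sketch. The abstract does not specify any of these details.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def channel_attention(x):
    # x: (C, H, W). Gate each channel by its global average.
    # Assumption: parameter-free stand-in for a learned channel gate.
    gate = sigmoid(x.mean(axis=(1, 2), keepdims=True))  # (C, 1, 1)
    return x * gate

def spatial_attention(x):
    # Gate each pixel by the mean over channels.
    # Assumption: parameter-free stand-in for a learned spatial gate.
    gate = sigmoid(x.mean(axis=0, keepdims=True))  # (1, H, W)
    return x * gate

def window_self_attention(x, window=4):
    # Self-attention restricted to local windows, with identity
    # query/key/value projections. Assumption: non-overlapping windows
    # are used here as a simplification of a sliding window.
    c, h, w = x.shape
    assert h % window == 0 and w % window == 0
    nh, nw = h // window, w // window
    # Partition into windows: (nh, nw, window * window, c)
    t = x.reshape(c, nh, window, nw, window)
    t = t.transpose(1, 3, 2, 4, 0).reshape(nh, nw, window * window, c)
    scores = t @ t.transpose(0, 1, 3, 2) / np.sqrt(c)
    scores = scores - scores.max(axis=-1, keepdims=True)  # stable softmax
    attn = np.exp(scores)
    attn = attn / attn.sum(axis=-1, keepdims=True)
    out = attn @ t  # (nh, nw, window * window, c)
    # Undo the window partition back to (c, h, w)
    out = out.reshape(nh, nw, window, window, c)
    return out.transpose(4, 0, 2, 1, 3).reshape(c, h, w)

def fuse(detail, global_feat, window=4):
    # Hypothetical fusion rule: sum the two branches, then add the three
    # attention-refined versions as residuals.
    x = detail + global_feat
    return (x + channel_attention(x) + spatial_attention(x)
            + window_self_attention(x, window))

rng = np.random.default_rng(0)
detail = rng.standard_normal((8, 16, 16))       # stand-in detail-branch features
global_feat = rng.standard_normal((8, 16, 16))  # stand-in global-branch features
print(fuse(detail, global_feat).shape)  # (8, 16, 16)
```

In a trained network each gate and the query, key, and value projections would be learned layers; the sketch only shows how the three attention types act on different axes of the same feature map.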
BibTeX Citation:
@inproceedings{ICIC2025,
author = {Yuanyuan Wang and Dianxi Shi and Junze Zhang and Luoxi Jing and Xueqi Li and Xucan Chen},
title = {Self-Supervised Monocular Depth Estimation Based on Dual-Branch DepthNet and Multi-Attention Fusion},
booktitle = {Proceedings of the 21st International Conference on Intelligent Computing (ICIC 2025)},
month = {July},
date = {26-29},
year = {2025},
address = {Ningbo, China},
pages = {482--498},
note = {Poster Volume Ⅰ}
}