JIA Xin, YANG Shourui, GUAN Diyi. Semantics-aware transformer for 3D reconstruction from binocular images[J]. Optoelectronics Letters, 2022, (5): 293-299.
Semantics-aware transformer for 3D reconstruction from binocular images
Authors and affiliations:
JIA Xin: The Engineering Research Center of Learning-Based Intelligent System and the Key Laboratory of Computer Vision and System of Ministry of Education, Tianjin University of Technology, Tianjin 300384, China
YANG Shourui: The Engineering Research Center of Learning-Based Intelligent System and the Key Laboratory of Computer Vision and System of Ministry of Education, Tianjin University of Technology, Tianjin 300384, China
GUAN Diyi: Zhejiang University of Technology, Hangzhou 310014, China
Abstract:
      Existing multi-view three-dimensional (3D) reconstruction methods can only capture a single type of feature from each input view, failing to obtain the fine-grained semantics needed to reconstruct complex shapes. They also rarely explore the semantic association between input views, leading to rough 3D shapes. To address these challenges, we propose a semantics-aware transformer (SATF) for 3D reconstruction. It is composed of two parallel view transformer encoders and a point cloud transformer decoder; it takes two red, green and blue (RGB) images as input and outputs a dense point cloud with richer details. Each view transformer encoder learns a multi-level feature, making it easier to characterize the fine-grained semantics of its input view. The point cloud transformer decoder derives a semantically associated feature by aligning the semantics of the two input views, thereby describing the semantic association between them. It then generates a sparse point cloud from this feature and finally enriches the sparse point cloud to produce a dense point cloud with richer details. Extensive experiments on the ShapeNet dataset show that our SATF outperforms state-of-the-art methods.
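As a rough illustration of the pipeline the abstract describes, the following is a minimal PyTorch sketch of the two parallel view transformer encoders and the point cloud transformer decoder. All module names, layer widths, patch sizes, and point counts (ViewTransformerEncoder, PointCloudTransformerDecoder, n_sparse=256, and so on) are assumptions made for illustration only; the abstract does not specify the paper's actual implementation.

# Minimal, illustrative sketch of the SATF pipeline described in the abstract.
# All names, sizes, and point counts are assumptions, not details from the paper.
import torch
import torch.nn as nn


class ViewTransformerEncoder(nn.Module):
    """Encodes one RGB view into a per-patch token sequence (multi-level feature)."""

    def __init__(self, d_model=256, n_layers=4):
        super().__init__()
        # Non-overlapping 16x16 patch embedding, as in a standard vision transformer.
        self.patchify = nn.Conv2d(3, d_model, kernel_size=16, stride=16)
        layer = nn.TransformerEncoderLayer(d_model, nhead=8, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=n_layers)

    def forward(self, img):                                   # img: (B, 3, H, W)
        tokens = self.patchify(img).flatten(2).transpose(1, 2)  # (B, N, d_model)
        return self.encoder(tokens)


class PointCloudTransformerDecoder(nn.Module):
    """Aligns the two view features, then decodes a sparse and a dense point cloud."""

    def __init__(self, d_model=256, n_sparse=256, upsample=4):
        super().__init__()
        # Cross-attention as one plausible way to align the semantics of the two views.
        self.align = nn.MultiheadAttention(d_model, num_heads=8, batch_first=True)
        self.queries = nn.Parameter(torch.randn(n_sparse, d_model))  # learned point queries
        layer = nn.TransformerDecoderLayer(d_model, nhead=8, batch_first=True)
        self.decoder = nn.TransformerDecoder(layer, num_layers=4)
        self.to_sparse = nn.Linear(d_model, 3)                # sparse xyz coordinates
        self.to_dense = nn.Linear(d_model, 3 * upsample)      # per-point offsets for densifying
        self.upsample = upsample

    def forward(self, feat_a, feat_b):                        # each: (B, N, d_model)
        # Semantically-associated feature: view A tokens attend over view B tokens.
        assoc, _ = self.align(feat_a, feat_b, feat_b)
        q = self.queries.unsqueeze(0).expand(feat_a.size(0), -1, -1)
        h = self.decoder(q, assoc)                            # (B, n_sparse, d_model)
        sparse = self.to_sparse(h)                            # (B, n_sparse, 3)
        # Enrich each sparse point into several offset points to form the dense cloud.
        offsets = self.to_dense(h).view(h.size(0), -1, self.upsample, 3)
        dense = (sparse.unsqueeze(2) + offsets).flatten(1, 2)  # (B, n_sparse*upsample, 3)
        return sparse, dense


class SATF(nn.Module):
    def __init__(self):
        super().__init__()
        self.enc_a = ViewTransformerEncoder()                 # two parallel view encoders
        self.enc_b = ViewTransformerEncoder()
        self.dec = PointCloudTransformerDecoder()

    def forward(self, view_a, view_b):
        return self.dec(self.enc_a(view_a), self.enc_b(view_b))


if __name__ == "__main__":
    model = SATF()
    a = torch.randn(1, 3, 224, 224)                           # two RGB input views
    b = torch.randn(1, 3, 224, 224)
    sparse, dense = model(a, b)
    print(sparse.shape, dense.shape)                          # (1, 256, 3) (1, 1024, 3)

The cross-attention in PointCloudTransformerDecoder.align is only one plausible realization of the "semantically-associated feature" obtained by aligning the two views' semantics, and the offset-based densification is likewise a generic stand-in for the decoder's sparse-to-dense enrichment step.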