AI-Driven Window Detection and Semantic Segmentation from Street View Imagery Using Grounding DINO and DeepLabV3 for Digital Twin Modeling
DOI:
https://doi.org/10.3126/njg.v25i1.95080
Keywords:
Deep learning, Semantic segmentation, Grounding DINO, DeepLabV3, Digital twins
Abstract
Automated, AI-driven extraction of façade information from street view images can be a vital step towards large-scale urban digital twin generation. Traditional approaches rely on rule-based methods and manual annotation, which are time-consuming and difficult to apply at scale. This study presents a state-of-the-art AI-based pipeline for window detection from street view images and semantic segmentation for window parameter generation. The proposed workflow first rectifies the street view images to correct perspective distortion. Window regions are then detected using a zero-shot object detection model (Grounding DINO), followed by semantic segmentation using a DeepLabV3 model fine-tuned on the WinSyn dataset. Through systematic experimentation with different parameters and hyperparameters, reducing the label classes from 11 to 3 significantly improved segmentation performance. The refined model achieved a mean Intersection over Union (mIoU) of 80.74%, an improvement of 44.31 percentage points over the baseline of 36.43% obtained using four classes. This class optimization reduced ambiguity among window components and improved segmentation consistency. Segmentation outputs are further refined using morphological operations to improve frame continuity and remove noise in window panes. Geometric parameters such as pane arrangement, frame thickness, and window layout are extracted from the refined masks and structured into a parametric representation. The proposed pipeline demonstrates the potential of combining zero-shot detection and semantic segmentation for automated façade analysis from street view imagery. The extracted window information can support applications in urban digital twin generation, building energy modeling, and large-scale architectural analysis.
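For readers unfamiliar with the metric, the following is a minimal sketch of how a mean Intersection over Union score can be computed from per-pixel class labels, and of the arithmetic behind the reported gain. It is an illustration under stated assumptions, not the authors' evaluation code: the function name `mean_iou`, the toy label lists, and the class meanings (background, frame, pane) are hypothetical, and the convention of skipping classes absent from both prediction and ground truth is one common choice among several.

```python
def mean_iou(pred, target, num_classes):
    """Mean IoU over classes that occur in the prediction or the ground truth.

    pred and target are flat sequences of integer class labels, one per pixel.
    """
    ious = []
    for c in range(num_classes):
        # Pixels where both prediction and ground truth are class c
        inter = sum(1 for p, t in zip(pred, target) if p == c and t == c)
        # Pixels where either prediction or ground truth is class c
        union = sum(1 for p, t in zip(pred, target) if p == c or t == c)
        if union > 0:  # skip classes absent from both (assumed convention)
            ious.append(inter / union)
    return sum(ious) / len(ious) if ious else 0.0


# Toy example with hypothetical labels: 0 = background, 1 = frame, 2 = pane
target = [0, 0, 1, 1, 2, 2, 2, 2]
pred = [0, 1, 1, 1, 2, 2, 2, 0]
score = mean_iou(pred, target, num_classes=3)

# Arithmetic behind the reported improvement (values taken from the abstract)
refined_miou = 80.74
baseline_miou = 36.43
gain_points = refined_miou - baseline_miou  # difference in percentage points
```

In the toy example the per-class IoU values are 1/3, 2/3, and 3/4, so the mean is their average. The difference between the two reported mIoU values is an absolute gain in percentage points, not a relative change.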
License
Copyright (c) 2026 Survey Department, Government of Nepal

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
© Copyright reserved by Survey Department, Government of Nepal