Comprehensive urban space representation with varying numbers of street-level images
Abstract Street-level imagery has emerged as a valuable tool for observing large-scale urban spaces in unprecedented detail. However, previous studies have been limited to analyzing individual street-level images, an approach that falls short of representing the characteristics of a spatial unit, such as a street or grid cell, which may contain anywhere from a handful to hundreds of street-level images. A more comprehensive and representative approach is therefore required to capture the complexity and diversity of urban environments at different spatial scales. To address this issue, this study proposes a deep learning-based module called Vision-LSTM, which effectively obtains a vector representation from a varying number of street-level images within a spatial unit. The effectiveness of the module is validated through experiments on recognizing urban villages, achieving reliable recognition results (overall accuracy: 91.6%) through multimodal learning that combines street-level imagery with remote sensing imagery and social sensing data. Compared with existing image fusion methods, Vision-LSTM is significantly more effective at capturing associations between street-level images. The proposed module provides a more comprehensive understanding of urban spaces, enhancing the research value of street-level imagery and facilitating multimodal learning-based urban research. Our models are available at https://github.com/yingjinghuang/Vision-LSTM.
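The aggregation idea described in the abstract can be illustrated with a minimal PyTorch sketch: each street-level image in a spatial unit is encoded by a shared CNN backbone, and the resulting variable-length feature sequence is summarized by a bidirectional LSTM into one fixed-size region vector. The ResNet-18 backbone, layer sizes, and mean-pooling over LSTM outputs below are illustrative assumptions, not the exact configuration in the authors' repository.

```python
import torch
import torch.nn as nn
from torch.nn.utils.rnn import pack_padded_sequence, pad_packed_sequence
from torchvision import models

class VisionLSTM(nn.Module):
    """Sketch of the Vision-LSTM idea: variable image count -> one vector."""
    def __init__(self, feat_dim=512, hidden_dim=256):
        super().__init__()
        backbone = models.resnet18(weights=None)
        backbone.fc = nn.Identity()  # keep the pooled 512-d image features
        self.encoder = backbone
        self.lstm = nn.LSTM(feat_dim, hidden_dim,
                            batch_first=True, bidirectional=True)

    def forward(self, images, lengths):
        # images: (B, N_max, 3, H, W), zero-padded along N_max
        # lengths: (B,) true number of images per spatial unit
        b, n = images.shape[:2]
        feats = self.encoder(images.flatten(0, 1)).view(b, n, -1)
        packed = pack_padded_sequence(feats, lengths.cpu(),
                                      batch_first=True, enforce_sorted=False)
        out, _ = self.lstm(packed)
        out, _ = pad_packed_sequence(out, batch_first=True, total_length=n)
        # mean-pool the valid time steps into one vector per spatial unit
        mask = (torch.arange(n)[None, :] < lengths[:, None]).unsqueeze(-1)
        return (out * mask).sum(1) / mask.sum(1)

# e.g., two spatial units with 5 and 3 images -> two 512-d region vectors
vecs = VisionLSTM()(torch.randn(2, 5, 3, 224, 224), torch.tensor([5, 3]))
```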
Highlights
- Representing regional features by capturing associations among street-level images.
- The proposed Vision-LSTM extracts features from varying numbers of images.
- A multimodal model fuses satellite imagery, street-level imagery, and mobility data.
- Both visual and dynamic mobility information are crucial for urban village recognition.
- The framework achieved 91.6% accuracy in identifying urban villages.
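As a companion sketch for the multimodal learning mentioned in the abstract and highlights, the snippet below fuses three region-level vectors (a satellite-image embedding, a Vision-LSTM street-level embedding, and a mobility-feature vector) by concatenation and feeds them to a small classification head for urban village recognition. The late-fusion-by-concatenation design and all dimensions are assumptions for illustration; the paper's actual fusion architecture may differ.

```python
import torch
import torch.nn as nn

class MultimodalClassifier(nn.Module):
    """Sketch: late fusion of satellite, street-level, and mobility vectors."""
    def __init__(self, sat_dim=512, street_dim=512, mob_dim=64):
        super().__init__()
        # concatenate the three modality embeddings (dims are assumptions)
        self.head = nn.Sequential(
            nn.Linear(sat_dim + street_dim + mob_dim, 256),
            nn.ReLU(),
            nn.Dropout(0.3),
            nn.Linear(256, 2),  # urban village vs. other spatial units
        )

    def forward(self, sat_vec, street_vec, mob_vec):
        return self.head(torch.cat([sat_vec, street_vec, mob_vec], dim=-1))

# e.g., satellite CNN output, Vision-LSTM output, and mobility features
logits = MultimodalClassifier()(torch.randn(2, 512),
                                torch.randn(2, 512),
                                torch.randn(2, 64))
```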
Huang, Yingjing (author) / Zhang, Fan (author) / Gao, Yong (author) / Tu, Wei (author) / Duarte, Fabio (author) / Ratti, Carlo (author) / Guo, Diansheng (author) / Liu, Yu (author)
2023-09-23
Article (Journal)
Electronic Resource
English
Self-supervised learning unveils urban change from street-level images
Elsevier | 2024
Self-supervised learning unveils change in urban housing from street-level images
ArXiv | 2023
Automated localization of urban drainage infrastructure from public-access street-level images
Taylor & Francis Verlag | 2019