Creating a Large-Scale National Residential Building Energy Dataset Using a Two-Stage Machine Learning Approach
Buildings account for 40% of total energy demand in the US. Consequently, there is a pressing need for a dataset that provides comprehensive information on the energy consumption of US household units. Current practice in large-scale energy simulation may not reflect actual energy consumption patterns. Additionally, existing national building energy datasets, such as the Residential Energy Consumption Survey (RECS), contain a limited number of data points and do not reflect the social aspects of households. This study aimed to create a large-scale national residential building energy dataset using a two-stage machine learning approach that combines two national datasets, the RECS and the American Housing Survey (AHS). The outcome is a large-scale, comprehensive national dataset that contains information about energy consumption in household units as well as their detailed building features. Three machine learning algorithms were used to develop a data-integration framework: artificial neural networks (ANN), random forest (RF), and gradient boosting regression (GBR). The results showed that RF performed best in predicting end-use energy consumption. Additionally, the predicted energy consumption in the generated large-scale dataset had an accuracy of over 80%. These findings have significant implications for energy-efficient building design and operation.
Vosoughkhosravi, Sorena (author) / Jafari, Amirhosein (author)
Construction Research Congress 2024 ; 2024 ; Des Moines, Iowa
Construction Research Congress 2024 ; 305-315
2024-03-18
Conference paper
Electronic Resource
English
Generating a nationwide residential building types dataset using machine learning
Elsevier | 2025
BTS: Building Timeseries Dataset: Empowering Large-Scale Building Analytics
ArXiv | 2024
DataCite | 2024
Predicting residential building cooling load with a machine learning random forest approach
Springer Verlag | 2024