Performance of neural network for indoor airflow prediction: Sensitivity towards weight initialization

Zhou, Qi / Ooka, Ryozo

Highlights
• Neural network modeling capability is insensitive to weight initialization.
• Impact of weight initialization strategy lies in sampling interval of initial weights.
• Large weight sampling interval causes more sensitivity of generalization to initialization.
• Weight variation for weight with large initial value is limited.
• Batch normalization leads to robustness of generalization to weight initialization.
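
The link between an initialization strategy and its sampling interval can be illustrated with a minimal sketch. It assumes the widely used Glorot, He, and LeCun uniform schemes, which may differ from the strategies actually compared in the paper. Each scheme draws weights uniformly from [-limit, limit], and only the limit depends on the layer size. The function names are illustrative.

```python
import math


def glorot_uniform_limit(fan_in: int, fan_out: int) -> float:
    """Half-width of the Glorot (Xavier) uniform sampling interval."""
    return math.sqrt(6.0 / (fan_in + fan_out))


def he_uniform_limit(fan_in: int) -> float:
    """Half-width of the He uniform sampling interval."""
    return math.sqrt(6.0 / fan_in)


def lecun_uniform_limit(fan_in: int) -> float:
    """Half-width of the LeCun uniform sampling interval."""
    return math.sqrt(3.0 / fan_in)


if __name__ == "__main__":
    # Hypothetical layer sizes, chosen only to show how the interval shrinks
    # as the layer gets wider.
    for fan_in, fan_out in [(4, 32), (32, 32), (256, 256)]:
        print(
            f"fan_in={fan_in:4d} fan_out={fan_out:4d} "
            f"glorot=+/-{glorot_uniform_limit(fan_in, fan_out):.3f} "
            f"he=+/-{he_uniform_limit(fan_in):.3f} "
            f"lecun=+/-{lecun_uniform_limit(fan_in):.3f}"
        )
```

Under these schemes the interval narrows as layers widen, so for typical hidden-layer sizes all three stay well inside [-1, 1].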

Abstract Neural networks (NNs) have been proposed as a promising alternative for fast and accurate prediction of indoor airflow. NN training, which is essentially a nonconvex optimization process carried out with gradient descent-based algorithms, is of great importance for obtaining accurate predictions. The performance of an NN at a given solution depends on the initial parameter values drawn by random initialization, and this dependence is crucial to the reliability of evaluations used for model comparison and hyperparameter tuning. In this study, the sensitivity of NN performance for indoor airflow prediction to weight initialization is examined by clarifying two issues: solution equivalence and the impact of the weight initialization strategy. Numerical experiments reproducing non-isothermal indoor airflows were conducted for various scenarios with different initialization strategies. For each scenario, the training process was repeated under the same convergence criteria to obtain multiple solutions, which were compared in terms of training and validation errors and temperature and velocity predictions. The results indicate that the sensitivity of NN modeling capability to weight initialization is similar across all scenarios, whereas the sensitivity of generalization capability and of convergence to weight initialization differs significantly among scenarios. NNs with weight sampling intervals larger than [-1, 1] are more sensitive to initial weights than those with smaller sampling intervals.
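
The repeated-training procedure described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' setup: the data are a synthetic 1-D sine curve rather than airflow fields, the network is a single tanh hidden layer trained by plain full-batch gradient descent, and the convergence criterion (training MSE below `tol`, or `max_epochs` reached) is hypothetical. The data are held fixed, so the uniform draw of initial weights from [-half_width, half_width] is the only source of variation between runs.

```python
import numpy as np

# Fixed synthetic regression data (a stand-in, not indoor airflow data).
X_TRAIN = np.linspace(-1.0, 1.0, 40).reshape(-1, 1)
Y_TRAIN = np.sin(np.pi * X_TRAIN)
X_VAL = np.linspace(-0.95, 0.95, 20).reshape(-1, 1)
Y_VAL = np.sin(np.pi * X_VAL)


def init_params(half_width, seed, n_hidden=16):
    """Draw weights uniformly from [-half_width, half_width]; biases start at zero."""
    rng = np.random.default_rng(seed)
    return {
        "W1": rng.uniform(-half_width, half_width, size=(1, n_hidden)),
        "b1": np.zeros(n_hidden),
        "W2": rng.uniform(-half_width, half_width, size=(n_hidden, 1)),
        "b2": np.zeros(1),
    }


def forward(p, x):
    """Return hidden activations and network output."""
    h = np.tanh(x @ p["W1"] + p["b1"])
    return h, h @ p["W2"] + p["b2"]


def mse(p, x, y):
    return float(np.mean((forward(p, x)[1] - y) ** 2))


def train_once(half_width, seed, lr=0.01, max_epochs=500, tol=1e-3):
    """Train from one random initialization; return (train_mse, val_mse, epochs)."""
    p = init_params(half_width, seed)
    n = X_TRAIN.shape[0]
    epochs = 0
    # Assumed convergence criterion: training MSE below tol, or epoch budget spent.
    while epochs < max_epochs and mse(p, X_TRAIN, Y_TRAIN) >= tol:
        h, pred = forward(p, X_TRAIN)
        g_out = 2.0 * (pred - Y_TRAIN) / n           # dLoss/dOutput
        g_h = (g_out @ p["W2"].T) * (1.0 - h ** 2)   # backprop through tanh
        grads = {
            "W1": X_TRAIN.T @ g_h,
            "b1": g_h.sum(axis=0),
            "W2": h.T @ g_out,
            "b2": g_out.sum(axis=0),
        }
        for key in p:
            p[key] -= lr * grads[key]
        epochs += 1
    return mse(p, X_TRAIN, Y_TRAIN), mse(p, X_VAL, Y_VAL), epochs


def sensitivity(half_width, seeds=range(5)):
    """Mean and standard deviation of (train_mse, val_mse) over repeated runs."""
    runs = np.array([train_once(half_width, s)[:2] for s in seeds])
    return runs.mean(axis=0), runs.std(axis=0)


if __name__ == "__main__":
    for half_width in (0.1, 1.0, 3.0):
        mean, std = sensitivity(half_width)
        print(f"[-{half_width}, {half_width}] "
              f"train MSE {mean[0]:.4f} +/- {std[0]:.4f}, "
              f"val MSE {mean[1]:.4f} +/- {std[1]:.4f}")
```

The standard deviation across seeds serves here as a simple proxy for sensitivity to initialization. Comparing it between the training and validation errors mirrors the abstract's distinction between modeling capability and generalization capability, although this toy problem is not expected to reproduce the paper's findings.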