Imputation of missing data in time series for air pollutants
Abstract: Missing data are a major concern in epidemiological studies of the health effects of environmental air pollutants. This article presents an imputation-based method suitable for multivariate time series data, which uses the EM algorithm under the assumption of a normal distribution. Different approaches are considered for filtering the temporal component. A simulation study was performed to assess the validity and performance of the proposed method in comparison with some frequently used methods. Simulations showed that when the amount of missing data was as low as 5%, the complete data analysis yielded satisfactory results regardless of the mechanism generating the missing data, whereas validity began to degrade when the proportion of missing values exceeded 10%. The proposed imputation method exhibited good accuracy and precision across different patterns of missing observations. Most of the imputations yielded valid results, even when data were missing not at random. The methods proposed in this study are implemented as a package called mtsdi for the statistical software system R.
Highlights: We propose a method for imputation of missing values in time series. Simulations showed adequate goodness-of-fit. The findings also suggest good accuracy and precision. We implemented the method as an open-source R library.
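The abstract's core idea, EM-based imputation under a multivariate normal assumption, can be illustrated with a minimal Python sketch. This is not the mtsdi implementation and it omits the temporal filtering step the article describes. The function name `em_impute_mvn`, its parameters, and the convergence rule are assumptions made for illustration only.

```python
import numpy as np

def em_impute_mvn(X, n_iter=100, tol=1e-6):
    """Illustrative EM imputation for rows assumed i.i.d. multivariate normal.

    Hypothetical helper, not the mtsdi API. Missing entries are np.nan.
    Assumes every column has at least one observed value.
    """
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    missing = np.isnan(X)

    # Initialise with column means and the covariance of the mean-filled data.
    mu = np.nanmean(X, axis=0)
    filled = np.where(missing, mu, X)
    sigma = np.cov(filled, rowvar=False, bias=True) + 1e-6 * np.eye(p)

    for _ in range(n_iter):
        # E-step: replace missing entries by their conditional expectation
        # given the observed entries, and accumulate the conditional covariance.
        cond_cov_sum = np.zeros((p, p))
        for i in range(n):
            m = missing[i]
            if not m.any():
                continue
            o = ~m
            if not o.any():
                # Fully missing row: fall back to the marginal distribution.
                filled[i] = mu
                cond_cov_sum += sigma
                continue
            s_oo = sigma[np.ix_(o, o)]
            s_mo = sigma[np.ix_(m, o)]
            coef = np.linalg.solve(s_oo, s_mo.T).T  # equals s_mo @ inv(s_oo)
            filled[i, m] = mu[m] + coef @ (X[i, o] - mu[o])
            cond_cov_sum[np.ix_(m, m)] += sigma[np.ix_(m, m)] - coef @ s_mo.T

        # M-step: update the mean and covariance from the completed data.
        new_mu = filled.mean(axis=0)
        centered = filled - new_mu
        new_sigma = (centered.T @ centered + cond_cov_sum) / n

        delta = np.max(np.abs(new_mu - mu))
        mu, sigma = new_mu, new_sigma
        if delta < tol:
            break

    return filled, mu, sigma
```

Called on an array with `np.nan` marking the gaps, the function returns the completed array together with the estimated mean vector and covariance matrix. Observed values are left unchanged. A time series application along the lines of the article would additionally need to model or filter the temporal component before or within the EM iterations.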
Junger, W.L. (Author) / Ponce de Leon, A. (Author)
Atmospheric Environment ; 102 ; 96-104
21 November 2014
9 pages
Article (Journal)
Electronic Resource
English
Imputation of missing data in time series for air pollutants
Elsevier | 2015
Augmented Stochastic Multiple Imputation Model for Airport Pavement Missing Data Imputation
British Library Online Contents | 2014
Airport Pavement Missing Data Management and Imputation with Stochastic Multiple Imputation Model
British Library Online Contents | 2013
Imputation of Missing Traffic Data during Holiday Periods
Online Contents | 2008
Imputation of Missing Traffic Data during Holiday Periods
Taylor & Francis Verlag | 2008