Abstract:
Complete data on wastewater quality are essential for managing and monitoring wastewater treatment processes. Most management and monitoring methods rely on voluminous training data for imputation, but the sensors used in wastewater treatment plants (WWTPs) collect only a limited amount of data. This lack of sufficient training data can diminish the accuracy of traditional imputation techniques. To address this problem, this study developed a novel approach called Miss-GBRT (imputing missing values with gradient boosting regression trees), which can impute missing values in wastewater quality data even with minimal training data. The proposed approach consists of a preprocessing stage and an imputation stage. In the preprocessing stage, different copies of masked datasets are produced from raw data according to various levels of missingness, after which pre-imputation is conducted to ensure the completeness of the training data. In the imputation stage, Miss-GBRT combines shallow regression trees to regress the residuals of time and imputes each missing value in a masked dataset in a stepwise manner. We carried out extensive experiments on the WWTP datasets of the University of California, Irvine and the Beijing Drainage Group to compare Miss-GBRT with baseline imputation methods. The results demonstrated that the proposed approach improves the accuracy with which missing wastewater quality data are imputed when training data are limited. It can also outperform other methods on datasets with considerable proportions of missing values.
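The two stages described in the abstract can be illustrated with a minimal sketch. This is not the authors' Miss-GBRT implementation: it assumes a single univariate series, uses the time index as the only feature, uses the mean of the observed values as the pre-imputation, and uses depth-1 trees (stumps) as the shallow regression trees. The function names `mask_values`, `fit_stump`, and `boost_impute` and all parameter values are hypothetical.

```python
import random


def mask_values(series, rate, seed=0):
    """Preprocessing: return a copy with a fraction `rate` of entries set to None."""
    rng = random.Random(seed)
    n_missing = int(round(rate * len(series)))
    missing = set(rng.sample(range(len(series)), n_missing))
    return [None if i in missing else v for i, v in enumerate(series)]


def fit_stump(xs, residuals):
    """Fit a depth-1 regression tree: the threshold with the lowest squared error."""
    best = None
    for t in sorted(set(xs))[:-1]:  # candidate splits of the form x <= t
        left = [r for x, r in zip(xs, residuals) if x <= t]
        right = [r for x, r in zip(xs, residuals) if x > t]
        lm, rm = sum(left) / len(left), sum(right) / len(right)
        sse = sum((r - lm) ** 2 for r in left) + sum((r - rm) ** 2 for r in right)
        if best is None or sse < best[0]:
            best = (sse, t, lm, rm)
    if best is None:  # only one distinct x value: predict the mean residual
        mean_r = sum(residuals) / len(residuals)
        return lambda x: mean_r
    _, t, lm, rm = best
    return lambda x: lm if x <= t else rm


def boost_impute(masked, n_trees=100, learning_rate=0.1):
    """Imputation: boost stumps on the residuals over time, then fill the gaps."""
    obs_x = [i for i, v in enumerate(masked) if v is not None]
    obs_y = [masked[i] for i in obs_x]
    if not obs_y:
        raise ValueError("at least one observed value is required")
    base = sum(obs_y) / len(obs_y)  # mean pre-imputation (assumed here)
    stumps = []
    pred = [base] * len(obs_x)
    for _ in range(n_trees):
        residuals = [y - p for y, p in zip(obs_y, pred)]
        stump = fit_stump(obs_x, residuals)
        stumps.append(stump)
        pred = [p + learning_rate * stump(x) for p, x in zip(pred, obs_x)]

    def predict(x):
        return base + learning_rate * sum(s(x) for s in stumps)

    # Observed values are kept; only the missing entries are predicted.
    return [v if v is not None else predict(i) for i, v in enumerate(masked)]
```

Calling `boost_impute(mask_values(series, 0.2))` on a list of floats returns a list of the same length in which only the masked entries have been replaced. Repeating the masking step at several rates gives the multiple masked copies at different levels of missingness that the abstract mentions.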
Source:
APPLIED INTELLIGENCE
ISSN: 0924-669X
Year: 2023
Issue: 19
Volume: 53
Page: 22917-22937
Impact Factor: 5.300 (JCR@2022)
ESI Discipline: ENGINEERING
ESI HC Threshold: 19
Cited Count:
WoS CC Cited Count: 6
SCOPUS Cited Count: 7
ESI Highly Cited Papers on the List: 0