How to estimate moments and quantiles of environmental data sets with non-detected observations? A case study on volatile organic compounds in marine water samples
Huybrechts, T.; Thas, O.; Dewulf, J.; Van Langenhove, H. (2002). How to estimate moments and quantiles of environmental data sets with non-detected observations? A case study on volatile organic compounds in marine water samples. J. Chromatogr. A 975(1): 123-133. http://dx.doi.org/10.1016/S0021-9673(02)01327-4
In: Journal of Chromatography A. Elsevier: Amsterdam. ISSN 0021-9673; e-ISSN 1873-3778
Mean; Median; Water analysis; Standard deviation; Maximum likelihood estimation; Probability-plot regression; Decision limit; Detection limit; Censored data; Interquartile range; Robust imputation
Concentrations of 27 priority volatile organic compounds were measured in water samples of the North Sea and Scheldt estuary during a 3-year monitoring study. Despite the use of a sensitive analytical method, a number of data were censored; that is, some concentrations were below the decision limit or critical level defined by IUPAC. To characterize the observed measurement results, an attempt was made to identify an appropriate procedure to compute summary statistics for the censored data sets. Several parametric and robust parametric approaches based on the maximum likelihood principle and the probability-plot regression method were evaluated for the estimation of the mean, standard deviation, median and interquartile range using three uncensored analytes (1,1,2-trichloroethane, tetrachloroethene and o-xylene) from the monitoring survey. Performance was assessed by artificially censoring the observed concentrations and estimating moments and quantiles at each censoring level. Results showed that methods with the least distributional assumptions, such as the robust bias-corrected restricted maximum likelihood method, perform best for estimating the mean and standard deviation, while both parametric and robust parametric techniques can be used for quantiles. With these methods, summary statistics could be estimated with little bias (5-10%) at censoring levels of up to 80% for the data sets employed in this study.
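The evaluation scheme described in the abstract (artificially censor a fully observed data set, then estimate summary statistics from what remains) can be illustrated with a minimal Python sketch. It implements one simple variant of robust probability-plot regression, not necessarily any of the exact procedures evaluated in the paper. The assumptions here are ours: a single censoring limit, lognormally distributed concentrations, and Blom plotting positions. The function names `censor_lowest` and `robust_ros_summary` are hypothetical.

```python
import math
from statistics import NormalDist, mean, median, quantiles, stdev


def censor_lowest(values, fraction):
    """Artificially censor a data set by hiding its lowest `fraction` of values.

    Returns the values that remain detected and the count of censored ones.
    """
    ordered = sorted(values)
    n_censored = int(fraction * len(ordered))
    return ordered[n_censored:], n_censored


def robust_ros_summary(detected, n_censored):
    """Estimate summary statistics by robust probability-plot regression.

    Assumes (illustration only) one censoring limit below every detected
    value and a lognormal distribution of concentrations.
    """
    if len(detected) < 2:
        raise ValueError("need at least two detected values to fit a line")
    n = len(detected) + n_censored
    observed = sorted(detected)

    # Blom plotting positions for ranks 1..n, converted to normal scores.
    std_normal = NormalDist()
    scores = [std_normal.inv_cdf((rank - 0.375) / (n + 0.25))
              for rank in range(1, n + 1)]
    # With a single limit, the censored values occupy the lowest ranks.
    z_cens, z_obs = scores[:n_censored], scores[n_censored:]

    # Least-squares line: log(concentration) against normal score,
    # fitted to the detected values only.
    log_obs = [math.log(x) for x in observed]
    z_bar, y_bar = mean(z_obs), mean(log_obs)
    slope = (sum((z - z_bar) * (y - y_bar) for z, y in zip(z_obs, log_obs))
             / sum((z - z_bar) ** 2 for z in z_obs))
    intercept = y_bar - slope * z_bar

    # "Robust" step: the fitted line is used only to impute the censored
    # values; detected values enter the statistics as measured.
    imputed = [math.exp(intercept + slope * z) for z in z_cens]
    combined = imputed + observed

    q1, _, q3 = quantiles(combined, n=4)
    return {
        "mean": mean(combined),
        "sd": stdev(combined),
        "median": median(combined),
        "iqr": q3 - q1,
    }
```

To mimic the study design, one would call `censor_lowest` on an uncensored analyte at several fractions (for example 0.2, 0.5 and 0.8), pass the result to `robust_ros_summary`, and compare each estimate with the statistic computed from the full data. Because detected values are used as measured and the fitted distribution only fills in the censored part, the estimates depend less on the lognormal assumption than a fully parametric fit would.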