Buley, R. P., H. E. Correia, A. Abebe, T. B. Issa, and A. E. Wilson. 2021. Predicting microcystin occurrence in freshwater lakes and reservoirs: assessing environmental variables. Inland Waters 11(3):430-444.
Determining the environmental conditions that influence the occurrence and concentration of the cyanobacterial toxin microcystin (MC) is a critical step for predicting cases in which the toxin will adversely affect drinking water sources, recreational waterbodies, and other freshwater ecosystems. Although MC has been widely studied, little consensus exists regarding the factors that influence it on a global scale. The objective of this study was to identify the environmental variables most strongly associated with MC concentrations using observational data from lakes and reservoirs around the world, while also addressing the substantial proportions of missing values that large aggregated datasets often contain. A total of 124 studies containing data from an estimated 2040 lakes and reservoirs in 22 countries were used to construct a global dataset. Variables with fewer than 35% non-missing observations were removed prior to analysis. Missing values for the remaining 12 predictors of MC were imputed using an iterative imputation algorithm based on a random forest approach. Variable selection was performed with generalized additive modeling on the complete-case and imputed datasets. Models applied to the imputed data produced lower prediction errors than those fit to the complete-case dataset. Variables of greatest significance to MC concentration included location (longitude–latitude pairs), total nitrogen, turbidity, and pH. Total phosphorus was not found to be a strong predictor of MC. In addition to assisting water resource managers in protecting their waterbodies against MC, the presented methodologies may provide a useful framework for future water quality modeling while accounting for varying proportions of missing data.
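
The variable-screening step described in the abstract can be illustrated with a minimal sketch. It is not the authors' code. It assumes the 35% threshold means that a variable is kept only when at least 35% of its observations are non-missing. The function names, column names, and toy values are hypothetical. The random forest imputation and the generalized additive modeling steps are not shown.

```python
import math


def non_missing_fraction(values):
    """Return the fraction of entries that are neither None nor NaN."""
    if not values:
        return 0.0
    present = sum(
        1
        for v in values
        if v is not None and not (isinstance(v, float) and math.isnan(v))
    )
    return present / len(values)


def drop_sparse_variables(table, min_fraction=0.35):
    """Keep only columns whose non-missing fraction is at least min_fraction.

    `table` maps a column name to a list with one value per waterbody.
    The 0.35 default mirrors the threshold quoted in the abstract; the
    inclusive comparison (>=) is an assumption.
    """
    return {
        name: values
        for name, values in table.items()
        if non_missing_fraction(values) >= min_fraction
    }


# Hypothetical toy data (None marks a missing observation).
toy = {
    "var_a": [1.2, None, 0.8, 2.1, 1.5, None, 0.9, 1.1, None, 1.7],     # 70% present
    "var_b": [None, None, None, 4.0, None, None, 6.5, None, None, 3.2],  # 30% present
    "var_c": [7.9, 8.4, None, 8.1, 7.6, 8.8, None, 8.0, 8.3, 7.7],       # 80% present
}

kept = drop_sparse_variables(toy)
print(sorted(kept))  # var_b falls below the threshold and is dropped
```

In a fuller pipeline, the retained columns would then be passed to an imputation routine before model fitting.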