(Multiple) linear regression (MLR) was compared with selected nonlinear methods from the field of machine learning, namely artificial neural networks trained with Bayesian regularization (ANN), model trees (MT), ensembles of model trees (BMT) and random forests of regression trees (RF), for the analysis of relationships between xylem tree rings and the environment. The methods were compared on nine datasets, which included different tree-ring parameters and different target environmental variables. In most cases the nonlinear methods achieved better statistical metrics on validation data, but the differences from linear regression were minor. Additional analysis indicated that the methods differ mainly in how they predict extreme values. In nonlinear methods, the change in the dependent variable is not proportional to the change in one or more independent variables. This reduces the range and variability of the reconstructed values, which makes the reconstruction visually less attractive than linear extrapolation, even though it is statistically better in most cases. None of the nonlinear machine learning methods gave the best results on all datasets, so it makes sense to compare different machine learning regression methods before every climate reconstruction. To this end, the R function compare_methods() was developed and implemented in the dendroTools R package, which is freely available from the CRAN repository.