This thesis addresses the problem of L2-penalization in linear mixed models (LMM), with
a focus on high-dimensional settings where the number of covariates exceeds the number of
observations. Classical methods for parameter estimation fail under such conditions.
In the literature, L2-penalization (ridge regularization) is one of the most widely used methods for controlling overfitting. Its use within mixed models remains limited, however: most existing approaches and software tools do not support direct inclusion of a penalty term, which hinders the practical application of ridge regularization in mixed models.
We present a novel approach for introducing L2-penalization in LMM through artificially
generated pseudo-observations, which enables estimation of penalized LMM using standard
software tools such as lme4 and glmmTMB. We theoretically justify its equivalence to the Bayesian approach and derive the construction of pseudo-observations corresponding to the penalty term in the penalized log-likelihood.
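The pseudo-observation idea rests on a classical identity: the ridge estimate for a linear model equals the ordinary least-squares estimate on data augmented with p artificial rows sqrt(lambda)*I and zero responses. The sketch below (a minimal numpy illustration of that identity for the fixed-effects part only, not the thesis's full LMM construction) verifies the equivalence numerically.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p, lam = 20, 5, 2.0
X = rng.normal(size=(n, p))
y = X @ rng.normal(size=p) + rng.normal(size=n)

# Closed-form ridge estimate: (X'X + lam * I)^{-1} X'y
beta_ridge = np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)

# Same estimate via p pseudo-observations:
# append sqrt(lam) * I as extra design rows with zero responses,
# then fit by ordinary least squares on the augmented data.
X_aug = np.vstack([X, np.sqrt(lam) * np.eye(p)])
y_aug = np.concatenate([y, np.zeros(p)])
beta_aug, *_ = np.linalg.lstsq(X_aug, y_aug, rcond=None)

print(np.allclose(beta_ridge, beta_aug))  # the two estimates coincide
```

In the mixed-model setting, the same augmented data can be passed to standard fitting software, so the penalty is imposed without modifying the estimation routine itself.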
The method is evaluated on simulated high-dimensional data, where we compare the predictive performance of penalized models across different values of the penalization parameter λ. The results demonstrate that the proposed approach yields stable parameter estimates even in high-dimensional settings. We further compare different strategies for selecting the penalization parameter, including cross-validation with leave-one-cluster-out.
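Leave-one-cluster-out cross-validation scores each candidate lambda by repeatedly holding out all observations from one cluster, fitting on the remaining clusters, and accumulating the held-out prediction error. A minimal sketch of that loop (ridge on fixed effects only, with a hypothetical `loco_press` helper, purely for illustration of the splitting scheme):

```python
import numpy as np

rng = np.random.default_rng(1)
clusters = np.repeat(np.arange(4), 5)  # 4 clusters, 5 observations each
X = rng.normal(size=(20, 3))
y = X @ np.array([1.0, -0.5, 0.25]) + rng.normal(size=20)

def loco_press(lam):
    """Sum of squared prediction errors, leaving one cluster out at a time."""
    err = 0.0
    for c in np.unique(clusters):
        tr, te = clusters != c, clusters == c
        # Ridge fit on the training clusters (fixed effects only)
        beta = np.linalg.solve(X[tr].T @ X[tr] + lam * np.eye(3), X[tr].T @ y[tr])
        err += np.sum((y[te] - X[te] @ beta) ** 2)
    return err

# Pick the lambda with the smallest leave-one-cluster-out error
lams = [0.01, 0.1, 1.0, 10.0]
best = min(lams, key=loco_press)
```

Splitting by cluster rather than by observation keeps correlated observations together, so the held-out error reflects prediction for entirely new clusters.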
This work contributes to the development of methodology for modeling correlated high-dimensional data using linear mixed models and opens avenues for future research on other penalization approaches and generalized mixed models.