In this thesis we study Chatterjee's coefficient of dependence $\xi$ and its generalization to conditional dependence, $T(Y,Z \mid X)$, which quantifies the additional information that $Z$ carries about $Y$ beyond what is already contained in $X$. Chatterjee's $\xi$ admits a sample version $\xi_n$ that converges almost surely to $\xi(X,Y)$, as well as a representation via partial derivatives of the copula. Through extensive simulations we examine (i) the equitability of the estimator $\xi_n$, (ii) its response to noise, (iii) its behavior relative to classical correlation coefficients (Pearson's $r$, Spearman's $\rho$, and Kendall's $\tau$) across a range of functional relationships, and (iv) the convergence of the empirical distribution of $\sqrt{n}\,\xi_n$ to its normal asymptotic limit for both small and large sample sizes.
Using Galton's peas data set, we demonstrate the pronounced asymmetry of $\xi$, a key feature of this measure of dependence: $\xi(X,Y)$ and $\xi(Y,X)$ can differ substantially, which reveals the directional (functional) structure of the association. The generalization $T(Y,Z \mid X)$ is illustrated by assessing the contribution of additional explanatory variables: on Portuguese wine data we empirically evaluate the extent to which chemical characteristics provide information about quality beyond the variables already taken into account.
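To make the sample version $\xi_n$ and its asymmetry concrete, the following minimal sketch computes the rank-based estimator under the assumption that there are no ties, using the standard formula $\xi_n = 1 - 3\sum_{i=1}^{n-1} |r_{i+1} - r_i| / (n^2 - 1)$, where $r_i$ is the rank of the $Y$-value attached to the $i$-th smallest $X$-value. The function name `xi_n`, the synthetic relationship $Y = X^2$, the sample size, and the random seed are illustrative assumptions and are not taken from the thesis or its data sets.

```python
import numpy as np


def xi_n(x, y):
    """Chatterjee's rank coefficient xi_n(X, Y), assuming no ties in x or y."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(x)
    # Reorder the y-values so that the corresponding x-values are increasing.
    y_by_x = y[np.argsort(x, kind="stable")]
    # Ranks (1..n) of the reordered y-values; valid because ties are excluded.
    ranks = np.argsort(np.argsort(y_by_x)) + 1
    # Sum of absolute differences of consecutive ranks, rescaled.
    return 1.0 - 3.0 * np.abs(np.diff(ranks)).sum() / (n**2 - 1)


# Synthetic illustration (an assumption, not thesis data):
# Y is a noiseless function of X, but X is not a function of Y.
rng = np.random.default_rng(0)
x = rng.uniform(-1.0, 1.0, size=2000)
y = x**2

xi_xy = xi_n(x, y)  # how well Y is explained as a function of X
xi_yx = xi_n(y, x)  # how well X is explained as a function of Y
print(f"xi_n(X, Y) = {xi_xy:.3f}")
print(f"xi_n(Y, X) = {xi_yx:.3f}")
```

In this noiseless example the first value should be close to 1, since $Y$ is a function of $X$, while the second is markedly smaller, since $X$ is determined by $Y$ only up to sign. Data with ties would require the more general formula, which this sketch does not implement.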