SUPTECH WORKSHOP III
Background Session II – Measures of statistical association

Measures of association – motivation

We are often interested in modeling relationships:
- Is A related to B?
- Does A cause B?
- Do A and B usually coincide?
- What implications does the occurrence of A have for B? (conditional probability)

Probabilistic independence: two events A and B are independent ($A \perp B$) if and only if their joint probability equals the product of their probabilities. For $P(B) > 0$:

$$P(A \cap B) = P(A)\,P(B) \iff P(A) = \frac{P(A \cap B)}{P(B)} = P(A \mid B)$$

Independence and correlation

Pearson product-moment correlation coefficient:

$$\rho_{X,Y} = \operatorname{corr}(X, Y) = \frac{\operatorname{cov}(X, Y)}{\sigma_X \sigma_Y} = \frac{E[(X - \mu_X)(Y - \mu_Y)]}{\sigma_X \sigma_Y}$$

where $E$ is the expected value operator, cov denotes covariance, corr denotes the correlation coefficient, $\mu_X$ and $\mu_Y$ are the means, and $\sigma_X$ and $\sigma_Y$ are the standard deviations of $X$ and $Y$.

Correlation vs causality

For any two correlated events A and B, the possible relationships include:
- A causes B (direct causation);
- B causes A (reverse causation);
- A and B are consequences of a common cause, but do not cause each other;
- A and B both cause C, which is (explicitly or implicitly) conditioned on;
- A causes B and B causes A (bidirectional or cyclic causation);
- A causes C, which causes B (indirect causation);
- there is no connection between A and B; the correlation is a coincidence.

Thus, no conclusion about the existence or the direction of a cause-and-effect relationship can be drawn solely from the fact that A and B are correlated.
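The Pearson definition and the common-cause case can be illustrated with a small simulation. This is only a sketch under stated assumptions: the helper name `pearson`, the simulated variables `a`, `b`, `c`, and the noise model are all hypothetical and not taken from the slides. A and B never influence each other here; both are driven by C.

```python
import numpy as np


def pearson(x, y):
    """Sample Pearson correlation: cov(x, y) / (sigma_x * sigma_y)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    xc = x - x.mean()
    yc = y - y.mean()
    # The 1/n factors of the covariance and the standard deviations cancel.
    return float((xc * yc).sum() / np.sqrt((xc ** 2).sum() * (yc ** 2).sum()))


# Hypothetical data: C is a common cause of A and B.
rng = np.random.default_rng(0)
n = 10_000
c = rng.normal(size=n)
a = c + rng.normal(size=n)  # A depends on C only
b = c + rng.normal(size=n)  # B depends on C only

# A and B are clearly correlated (population value 0.5 in this setup) ...
r_ab = pearson(a, b)

# ... but the correlation vanishes once the common cause is subtracted,
# which is possible here only because C is observed and its effect is known.
r_resid = pearson(a - c, b - c)

print(f"corr(A, B)         = {r_ab:.3f}")
print(f"corr(A - C, B - C) = {r_resid:.3f}")
```

In real data the common cause is usually unobserved, so the second step is not available; the sketch only shows that a sizeable correlation can arise without any causal link between A and B.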
Determining whether there is an actual cause-and-effect relationship requires further investigation, even when the relationship between A and B is statistically significant, a large effect size is observed, or a large part of the variance is explained.

Spearman's correlation coefficient

Spearman's rank correlation coefficient is a nonparametric measure of rank correlation. It assesses how well the relationship between two variables can be described by a monotonic function. For a sample of size $n$, the scores $X_i$ and $Y_i$ are converted to ranks $\operatorname{rg}X_i$ and $\operatorname{rg}Y_i$, and $r_s$ is the Pearson correlation coefficient of the ranks:

$$r_s = \rho_{\operatorname{rg}_X, \operatorname{rg}_Y} = \frac{\operatorname{cov}(\operatorname{rg}_X, \operatorname{rg}_Y)}{\sigma_{\operatorname{rg}_X}\,\sigma_{\operatorname{rg}_Y}}$$

Partial correlation

Partial correlation measures the degree of association between two random variables, with the effect of a set of controlling random variables removed. For a single controlling variable $Z$:

$$\rho_{XY \cdot Z} = \frac{\rho_{XY} - \rho_{XZ}\,\rho_{ZY}}{\sqrt{1 - \rho_{XZ}^2}\,\sqrt{1 - \rho_{ZY}^2}}$$

More generally, for a set of variables $V$ with covariance matrix $\Omega$, define the precision matrix $P = (p_{ij}) = \Omega^{-1}$. Then:

$$\rho_{X_i X_j \cdot V \setminus \{X_i, X_j\}} = -\frac{p_{ij}}{\sqrt{p_{ii}\,p_{jj}}}$$

Cross-quantilogram (Han et al., 2016)

Let $(p_t, s_t)$, $t \in \mathbb{Z}$, be a bivariate time series, with $F_{p,t}(\cdot)$ and $F_{s,t}(\cdot)$ the conditional distribution functions and

$$q_{p,t}(\alpha_p) = \inf\{x : F_{p,t}(x) \ge \alpha_p\}, \qquad q_{s,t}(\alpha_s) = \inf\{x : F_{s,t}(x) \ge \alpha_s\}$$

the corresponding quantiles. Given a lag $k \in \mathbb{Z}$, we want to analyze the dependence between the events $\{p_t \le q_{p,t}(\alpha_p)\}$ and $\{s_{t-k} \le q_{s,t-k}(\alpha_s)\}$.
Denote $\psi_a(u) = I(u < 0) - a$, where $I$ is the indicator function. The cross-quantilogram for $(\alpha_s, \alpha_p)$ at lag $k$ is then defined as

$$\rho_{(\alpha_s,\alpha_p)}(k) = \frac{E\left[\psi_{\alpha_p}(p_t - q_{p,t}(\alpha_p))\,\psi_{\alpha_s}(s_{t-k} - q_{s,t-k}(\alpha_s))\right]}{\sqrt{E\left[\psi_{\alpha_p}^2(p_t - q_{p,t}(\alpha_p))\right]}\,\sqrt{E\left[\psi_{\alpha_s}^2(s_{t-k} - q_{s,t-k}(\alpha_s))\right]}}$$

Sources and further reading

https://en.wikipedia.org/wiki/Pearson_correlation_coefficient
https://en.wikipedia.org/wiki/Correlation_and_dependence
https://en.wikipedia.org/wiki/Correlation_does_not_imply_causation
https://en.wikipedia.org/wiki/Spearman's_rank_correlation_coefficient
https://en.wikipedia.org/wiki/Kendall_rank_correlation_coefficient
Han, H., Linton, O., Oka, T., & Whang, Y. J. (2016). The cross-quantilogram: Measuring quantile dependence and testing directional predictability between time series. Journal of Econometrics, 193(1), 251-270.
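The Spearman, partial correlation, and cross-quantilogram definitions from the earlier slides can be sketched in a few lines of NumPy. All function names and the simulated data are hypothetical. The rank helper assumes no ties, and the cross-quantilogram sketch replaces the conditional quantiles $q_{p,t}$, $q_{s,t}$ with unconditional sample quantiles and only handles lags $k \ge 0$. These are simplifying assumptions, not the estimator of Han et al. (2016) in full generality.

```python
import numpy as np


def ranks(x):
    """Ranks 1..n; assumes no ties (ties would need average ranks)."""
    x = np.asarray(x, dtype=float)
    r = np.empty(len(x))
    r[np.argsort(x)] = np.arange(1, len(x) + 1)
    return r


def spearman(x, y):
    """Spearman's r_s: Pearson correlation of the ranks."""
    return float(np.corrcoef(ranks(x), ranks(y))[0, 1])


def partial_corr_formula(x, y, z):
    """Partial correlation of x and y given a single control z."""
    r = np.corrcoef([x, y, z])
    r_xy, r_xz, r_zy = r[0, 1], r[0, 2], r[2, 1]
    return float((r_xy - r_xz * r_zy) / np.sqrt((1 - r_xz**2) * (1 - r_zy**2)))


def partial_corr_precision(data, i, j):
    """Partial correlation of columns i and j given all other columns."""
    prec = np.linalg.inv(np.cov(data, rowvar=False))  # precision matrix
    return float(-prec[i, j] / np.sqrt(prec[i, i] * prec[j, j]))


def cross_quantilogram(p, s, alpha_p, alpha_s, k):
    """Sample cross-quantilogram at lag k >= 0 (unconditional quantiles)."""
    p = np.asarray(p, dtype=float)
    s = np.asarray(s, dtype=float)
    # psi_a(u) = I(u < 0) - a, with u the deviation from the sample quantile
    psi_p = (p < np.quantile(p, alpha_p)).astype(float) - alpha_p
    psi_s = (s < np.quantile(s, alpha_s)).astype(float) - alpha_s
    a = psi_p[k:]              # p_t for t = k, ..., n-1
    b = psi_s[: len(s) - k]    # s_{t-k} for the same t
    return float((a * b).sum() / np.sqrt((a**2).sum() * (b**2).sum()))


rng = np.random.default_rng(1)
n = 5_000

# Hypothetical data: z drives both x and y, so x and y are related only via z.
z = rng.normal(size=n)
x = z + rng.normal(size=n)
y = z + rng.normal(size=n)
pc_formula = partial_corr_formula(x, y, z)
pc_precision = partial_corr_precision(np.column_stack([x, y, z]), 0, 1)

# Hypothetical series: p_t follows s_{t-1}, so low s tends to precede low p.
s_series = rng.normal(size=n)
p_series = np.empty(n)
p_series[0] = rng.normal()
p_series[1:] = s_series[:-1] + 0.5 * rng.normal(size=n - 1)
cq_lag0 = cross_quantilogram(p_series, s_series, 0.1, 0.1, k=0)
cq_lag1 = cross_quantilogram(p_series, s_series, 0.1, 0.1, k=1)

print(f"partial corr (formula)   = {pc_formula:.4f}")
print(f"partial corr (precision) = {pc_precision:.4f}")
print(f"cross-quantilogram k=0   = {cq_lag0:.3f}")
print(f"cross-quantilogram k=1   = {cq_lag1:.3f}")
```

With a single control variable, the two partial-correlation routes should agree up to floating-point error, which gives a quick consistency check on the two formulas. In the simulated series the dependence in the lower 10% tails should show up at lag 1 and not at lag 0.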