# When is the correlation coefficient zero

The Bottom Line

The sample correlation coefficient needed to reject the hypothesis that the true (Pearson) correlation coefficient is zero becomes small quite fast as the sample size increases. So, in general, no, you cannot have a large (in magnitude) correlation coefficient and a large $p$-value at the same time.

The Top Line (Details)

The test used for the Pearson correlation coefficient in the R function cor.test is a very slightly modified version of the method I discuss below.

Suppose $(X_1,Y_1), (X_2,Y_2),\ldots,(X_n,Y_n)$ are iid bivariate normal random vectors with correlation $\rho$. We want to test the null hypothesis that $\rho = 0$ versus $\rho \neq 0$. Let $r$ be the sample correlation coefficient. Using standard linear-regression theory, it is not hard to show that the test statistic, $$T = \frac{r \sqrt{n-2}}{\sqrt{1-r^2}}$$ has a $t_{n-2}$ distribution under the null hypothesis. For large $n$, the $t_{n-2}$ distribution approaches the standard normal. Hence $T^2$ is approximately chi-squared distributed with one degree of freedom. (Under the assumptions we've made, $T^2 \sim F_{1,n-2}$ in actuality, but the $\chi^2_1$ approximation makes clearer what is going on, I think.)
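To make the statistic concrete, here is a minimal Python sketch (the answer itself works in R; the function names `t_statistic` and `approx_p_value` are illustrative and not part of any library). It computes $T$ from $r$ and $n$ and, as an assumption, uses the large-$n$ standard normal approximation for the two-sided $p$-value rather than the exact $t_{n-2}$ reference distribution, so the result is only approximate for small $n$.

```python
import math
from statistics import NormalDist


def t_statistic(r, n):
    """T = r * sqrt(n - 2) / sqrt(1 - r^2) for sample correlation r and sample size n."""
    if n <= 2 or not -1.0 < r < 1.0:
        raise ValueError("need n > 2 and -1 < r < 1")
    return r * math.sqrt(n - 2) / math.sqrt(1.0 - r * r)


def approx_p_value(r, n):
    """Two-sided p-value from the standard normal approximation to the t distribution."""
    t = t_statistic(r, n)
    return 2.0 * (1.0 - NormalDist().cdf(abs(t)))


# Illustrative inputs: a modest correlation in a sample of 100.
print(t_statistic(0.2, 100))
print(approx_p_value(0.2, 100))
```

With $r = 0.2$ and $n = 100$ the statistic comes out just above 2, which is why a correlation of that size already sits near the usual 5% boundary.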

So, $$\mathbb P\left(\frac{r^2}{1-r^2} (n-2) \geq q_{1-\alpha} \right) \approx \alpha \>,$$ where $q_{1-\alpha}$ is the $(1-\alpha)$ quantile of a chi-squared distribution with one degree of freedom.

Now, note that $r^2/(1-r^2)$ is increasing as $r^2$ increases. Rearranging the quantity in the probability statement, we have that for all $$|r| \geq \frac{1}{\sqrt{1+(n-2)/q_{1-\alpha}}}$$ we'll get a rejection of the null hypothesis at level $\alpha$. Clearly the right-hand side decreases with $n$.

Here is a plot of the rejection region of $|r|$ as a function of the sample size. So, for example, when the sample size exceeds 100, the (absolute) correlation need only be about 0.2 to reject the null at the $\alpha = 0.05$ level.
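The boundary of that rejection region can also be tabulated directly. The following is a small Python sketch under the same chi-squared approximation; `critical_abs_r` is a hypothetical helper name, and the sample sizes in the loop are arbitrary choices for illustration. It relies on the fact that a $\chi^2_1$ variable is a squared standard normal, so its $(1-\alpha)$ quantile is the square of the normal $(1-\alpha/2)$ quantile.

```python
import math
from statistics import NormalDist


def critical_abs_r(n, alpha=0.05):
    """Smallest |r| that rejects rho = 0 at level alpha (chi-squared(1) approximation)."""
    if n <= 2:
        raise ValueError("need n > 2")
    # Chi-squared(1) quantile obtained by squaring the standard normal quantile.
    q = NormalDist().inv_cdf(1.0 - alpha / 2.0) ** 2
    return 1.0 / math.sqrt(1.0 + (n - 2) / q)


# Print the threshold for a few sample sizes.
for n in (10, 30, 100, 1000, 10000):
    print(n, round(critical_abs_r(n), 3))
```

For $n = 100$ and $\alpha = 0.05$ this gives a threshold of roughly 0.19, in line with the "about 0.2" figure quoted above.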

We can do a simple simulation to generate a pair of zero-mean vectors with an exact correlation coefficient. Below is the code. From this we can look at the output of cor.test.
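The R listing is not reproduced in this copy, so what follows is a Python sketch of one standard way to build such vectors, which may differ from the construction the answer used. The helper names (`center`, `dot`, `exact_corr_pair`, `sample_corr`), the seed, and the inputs $n = 100$, $r = 0.2$ are all assumptions made for illustration. The idea is to draw two random vectors, center them, remove from the second its projection onto the first, scale both to unit length, and then mix them with weights $r$ and $\sqrt{1-r^2}$.

```python
import math
import random


def center(v):
    """Subtract the mean so the vector sums to zero."""
    m = sum(v) / len(v)
    return [a - m for a in v]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def exact_corr_pair(n, r, seed=0):
    """Return zero-mean lists x, y of length n whose sample correlation is r."""
    rng = random.Random(seed)
    x = center([rng.gauss(0.0, 1.0) for _ in range(n)])
    z = center([rng.gauss(0.0, 1.0) for _ in range(n)])
    # Remove from z its projection onto x; the remainder e is zero-mean and orthogonal to x.
    b = dot(x, z) / dot(x, x)
    e = [zi - b * xi for xi, zi in zip(x, z)]
    # Scale both vectors to unit length, then mix with weights r and sqrt(1 - r^2).
    nx, ne = math.sqrt(dot(x, x)), math.sqrt(dot(e, e))
    x = [a / nx for a in x]
    e = [a / ne for a in e]
    w = math.sqrt(1.0 - r * r)
    y = [r * a + w * c for a, c in zip(x, e)]
    return x, y


def sample_corr(x, y):
    """Pearson sample correlation of two equal-length lists."""
    x, y = center(x), center(y)
    return dot(x, y) / math.sqrt(dot(x, x) * dot(y, y))


x, y = exact_corr_pair(100, 0.2)
print(sample_corr(x, y))
```

Up to floating-point error the printed correlation equals the requested $r$. The resulting vectors can then be handed to a correlation test such as R's cor.test, or `scipy.stats.pearsonr` in Python, to see the corresponding $p$-value.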

As requested in the comments, here is the code to reproduce the plot, which can be run immediately following the code above (and uses some of the variables defined there).

http://stats.stackexchange.com/questions/17371/example-of-strong-correlation-coefficient-with-a-high-p-value

Source: stats.stackexchange.com
