# When is the correlation coefficient zero

**The Bottom Line**

The sample correlation coefficient needed to reject the hypothesis that the true (Pearson) correlation coefficient is zero becomes small quite fast as the sample size increases. So, in general, *no, you cannot simultaneously have a large (in magnitude) correlation coefficient and a large $p$-value*.

**The Top Line** *(Details)*

The test used for the Pearson correlation coefficient in the R function `cor.test` is a very slightly modified version of the method I discuss below.

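Suppose $(X_1,Y_1), (X_2,Y_2),\ldots,(X_n,Y_n)$ are iid bivariate normal random vectors with correlation $\rho$. We want to test the null hypothesis that $\rho = 0$ versus $\rho \neq 0$. Let $r$ be the sample correlation coefficient. Using standard linear-regression theory, it is not hard to show that the test statistic, $$ T = \frac{r\sqrt{n-2}}{\sqrt{1-r^2}} $$ has a $t_{n-2}$ distribution under the null hypothesis. For large $n$, the $t_{n-2}$ distribution is close to the standard normal, so $T^2$ is approximately chi-squared distributed with one degree of freedom.

So, $$ \mathbb P\left(\frac{r^2}{1-r^2}\,(n-2) \geq q_{1-\alpha}\right) \approx \alpha \,, $$ where $q_{1-\alpha}$ is the $(1-\alpha)$ quantile of a chi-squared distribution with one degree of freedom.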

Now, note that $r^2/(1-r^2)$ is increasing as $r^2$ increases. Rearranging the quantity in the probability statement, we have that for all $$ |r| \geq \frac{1}{\sqrt{1+(n-2)/q_{1-\alpha}}} $$ we'll get a rejection of the null hypothesis at level $\alpha$. Clearly the right-hand side decreases with $n$.
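To make the bound concrete, here is a minimal R sketch that evaluates the right-hand side for a few sample sizes. The function names `r_crit_chisq` and `r_crit_t` are my own, chosen for illustration. The second function uses the exact $t_{n-2}$ quantile instead of the chi-squared approximation, so the two can be compared side by side.

```r
# Approximate threshold for |r| from the chi-squared(1) quantile,
# i.e. 1 / sqrt(1 + (n - 2) / q_{1 - alpha}).
r_crit_chisq <- function(n, alpha = 0.05) {
  q <- qchisq(1 - alpha, df = 1)
  1 / sqrt(1 + (n - 2) / q)
}

# Exact threshold under bivariate normality: the same rearrangement,
# but with the squared two-sided t quantile on n - 2 degrees of freedom.
r_crit_t <- function(n, alpha = 0.05) {
  tq <- qt(1 - alpha / 2, df = n - 2)
  tq / sqrt(tq^2 + n - 2)
}

n <- c(10, 30, 100, 1000)
round(cbind(n, chisq = r_crit_chisq(n), t = r_crit_t(n)), 3)
```

The two columns should agree closely once $n$ is moderately large, with the exact threshold slightly larger because the $t$ quantile exceeds the normal one.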

Here is a plot of the rejection region of $|r|$ as a function of the sample size. So, for example, when the sample size exceeds 100, the (absolute) correlation need only be about 0.2 to reject the null at the $\alpha = 0.05$ level.

We can do a simple simulation to generate a pair of zero-mean vectors with an *exact* correlation coefficient. Below is the code. From this we can look at the output of cor.test .
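One way to build such a pair, as a minimal sketch: orthogonalize a second random vector against the first, then mix the two. The values of `n` and `r`, the seed, and all variable names are assumptions made for illustration.

```r
set.seed(1)   # arbitrary seed, only for reproducibility
n <- 100      # sample size (assumed value)
r <- 0.2      # desired exact sample correlation (assumed value)

# Centered first vector.
x <- rnorm(n)
x <- x - mean(x)

# Residuals from regressing fresh noise on x (with intercept) have mean zero
# and are exactly orthogonal to x.
e <- residuals(lm(rnorm(n) ~ x))

# Scale both to unit length, then mix so that cor(x, y) equals r exactly
# (up to floating-point error).
x <- x / sqrt(sum(x^2))
e <- e / sqrt(sum(e^2))
y <- r * x + sqrt(1 - r^2) * e

cor(x, y)

# cor.test reports the t statistic r * sqrt(n - 2) / sqrt(1 - r^2).
ct <- cor.test(x, y)
t_manual <- r * sqrt(n - 2) / sqrt(1 - r^2)
c(cor.test = unname(ct$statistic), manual = t_manual, p.value = ct$p.value)
```

With these particular values the $p$-value should fall a little below 0.05, in line with the threshold of roughly 0.2 at $n = 100$.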

As requested in the comments, here is the code to reproduce the plot, which can be run immediately following the code above (and uses some of the variables defined there).
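A minimal sketch of how a plot of this kind could be drawn with base R graphics follows. To keep it runnable on its own it defines its own variables (`alpha`, `ns`, `rc` are illustrative names), and the styling choices are assumptions, not necessarily those of the original figure.

```r
alpha <- 0.05
ns <- 3:200                               # sample sizes to display (assumed range)
q  <- qchisq(1 - alpha, df = 1)           # chi-squared(1) quantile
rc <- 1 / sqrt(1 + (ns - 2) / q)          # smallest |r| that rejects at level alpha

plot(ns, rc, type = "n", ylim = c(0, 1),
     xlab = "Sample size n", ylab = "|r|",
     main = "Rejection region for the test of rho = 0")

# Shade the region above the curve, where the null hypothesis is rejected.
polygon(c(ns, rev(ns)), c(rc, rep(1, length(ns))),
        col = "grey85", border = NA)
lines(ns, rc, lwd = 2)

# Reference line at |r| = 0.2, the value discussed for n around 100.
abline(h = 0.2, lty = 2)
```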