# How to determine the correlation coefficient

Significance of the Correlation Coefficient

Test for the significance of relationships between two CONTINUOUS variables

• We introduced Pearson correlation as a measure of the STRENGTH of a relationship between two variables
• But any relationship should be assessed for its SIGNIFICANCE as well as its strength.

A general discussion of significance tests for relationships between two continuous variables.

• Factors in relationships between two variables
• The strength of the relationship:
• is indicated by the correlation coefficient: r
• but is actually measured by the coefficient of determination: r²
• The significance of the relationship
• is expressed in probability levels: p (e.g. significant at p = .05)
• This tells how unlikely it is that a given correlation coefficient, r, would occur if there were no relationship in the population
• NOTE! NOTE! NOTE! The smaller the p-level, the more significant the relationship
• BUT! BUT! BUT! The larger the correlation, the stronger the relationship
• Consider the classical model for testing significance
• It assumes that you have a sample of cases from a population
• The question is whether your observed statistic for the sample is likely to be observed given some assumption of the corresponding population parameter.
• If your observed statistic does not exactly match the population parameter, perhaps the difference is due to sampling error
• The fundamental question: is the difference between what you observe and what you expect given the assumption of the population large enough to be significant -- to reject the assumption?
• The greater the difference -- the more the sample statistic deviates from the population parameter -- the more significant it is
• That is, the less likely (small probability values) such a sample would be if the population assumption were true.
• The classical model makes some assumptions about the population parameter:
• Population parameters are expressed as Greek letters, while corresponding sample statistics are expressed in lower-case Roman letters:
• r = correlation between two variables in the sample
• ρ (rho) = correlation between the same two variables in the population
• A common assumption is that there is NO relationship between X and Y in the population:

ρ = 0.0
• Under this common null hypothesis in correlational analysis, the expected sample correlation is r = 0.0
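The idea of sampling error under the null hypothesis can be made concrete with a small simulation. The sketch below is illustrative only: it assumes a population in which X and Y are independent standard normal variables (so ρ = 0), and the sample size of 10, the 2,000 repetitions, the 0.5 threshold, and the function names are all arbitrary choices rather than anything taken from the notes.

```python
import math
import random


def pearson_r(xs, ys):
    """Sample Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def simulate_null_rs(n_cases, n_samples, seed=0):
    """Draw repeated samples from a population where rho = 0 (assumed normal)."""
    rng = random.Random(seed)
    rs = []
    for _ in range(n_samples):
        xs = [rng.gauss(0, 1) for _ in range(n_cases)]
        ys = [rng.gauss(0, 1) for _ in range(n_cases)]  # generated independently of xs
        rs.append(pearson_r(xs, ys))
    return rs


rs = simulate_null_rs(n_cases=10, n_samples=2000)
mean_r = sum(rs) / len(rs)
share_large = sum(abs(r) >= 0.5 for r in rs) / len(rs)
print(f"mean sample r: {mean_r:.3f}")
print(f"share of samples with |r| >= .50: {share_large:.3f}")
```

Even though the population correlation is exactly zero, individual samples of 10 cases produce correlations scattered around zero, and a noticeable share of them look fairly strong purely by chance.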

Testing for the significance of the correlation coefficient, r

• When the test is against the null hypothesis: ρ_xy = 0.0
• What is the likelihood of drawing a sample with r_xy ≠ 0.0?
• The sampling distribution of r is
• approximately normal (but bounded at -1.0 and +1.0) when N is large
• and follows a t distribution when N is small.
• The simplest formula for computing the appropriate t value to test the significance of a correlation coefficient employs the t distribution:

$$t = r\sqrt{\frac{N - 2}{1 - r^2}}$$

• The degrees of freedom for entering the t distribution are N - 2
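As a minimal sketch, the t test for a correlation can be written as a small function. It assumes the standard formula t = r·sqrt((N − 2)/(1 − r²)) with N − 2 degrees of freedom; the function name `t_for_r` and the input values r = .30 and N = 50 are hypothetical and chosen only for illustration.

```python
import math


def t_for_r(r, n):
    """Return (t, df) for testing H0: rho = 0 from a sample correlation r and sample size n."""
    if not -1.0 < r < 1.0:
        raise ValueError("r must lie strictly between -1 and +1")
    if n < 3:
        raise ValueError("need at least 3 cases so that df = n - 2 is positive")
    df = n - 2
    t = r * math.sqrt(df / (1.0 - r * r))
    return t, df


# Hypothetical inputs, for illustration only.
t, df = t_for_r(r=0.30, n=50)
print(f"t = {t:.2f} with {df} degrees of freedom")
```

The resulting t is then compared with the critical value from a t table for the chosen significance level and number of tails.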
• Example. Suppose you observe that r = .50 between literacy rate and political stability in 10 nations
• Is this relationship "strong"?
• Coefficient of determination = r-squared = .25
• Means that 25% of variance in political stability is "explained" by literacy rate
• Is the relationship "significant"?
• That remains to be determined using the formula above with r = .50 and N = 10:

$$t = .50\sqrt{\frac{10 - 2}{1 - .25}} = 1.63$$

• Set the level of significance (assume .05)
• Determine whether the test is one- or two-tailed (aim for one-tailed)

• For 8 df and one-tailed test, critical value of t = 1.86
• We observe only t = 1.63
• It lies below the critical t of 1.86
• So the null hypothesis of no relationship in the population (ρ = 0) cannot be rejected
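The same example can be read the other way around: how large would r have to be, with 10 nations, to reach significance? Solving t = r·sqrt(df/(1 − r²)) for r gives r = t/sqrt(t² + df). The sketch below applies that rearrangement to the critical value of 1.86 quoted above; the function name `critical_r` is hypothetical.

```python
import math


def critical_r(t_crit, n):
    """Smallest r that reaches a given critical t, for a sample of n cases (df = n - 2)."""
    df = n - 2
    return t_crit / math.sqrt(t_crit ** 2 + df)


T_CRIT = 1.86  # one-tailed critical t at .05 for 8 df, as quoted in the example
r_needed = critical_r(T_CRIT, n=10)
is_significant = 0.50 >= r_needed

print(f"r needed for significance at N = 10: {r_needed:.3f}")
print(f"observed r = .50 is significant: {is_significant}")
```

With only 10 cases, a correlation must be a little above .50 before the null hypothesis can be rejected at the .05 level with a one-tailed test.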
• Comments
• Note that a relationship can be strong and yet not significant
• Conversely, a relationship can be weak but significant
• The key factor is the size of the sample.
• For small samples, it is easy to produce a strong correlation by chance and one must pay attention to significance to keep from jumping to conclusions: i.e.
• rejecting a true null hypothesis,
• which means making a Type I error.
• For large samples, it is easy to achieve significance, and one must pay attention to the strength of the correlation to determine if the relationship explains very much.
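The role of sample size in these comments can be illustrated by comparing two cases. This is a rough sketch: the first case is the literacy example (r = .50, N = 10), while the second (r = .10, N = 1000) is invented for contrast. The critical value of 1.645 used for the large sample is the normal approximation to the one-tailed .05 cutoff, which is an assumption added here and not a figure from the notes.

```python
import math


def t_for_r(r, n):
    """t statistic for testing H0: rho = 0, with n - 2 degrees of freedom."""
    return r * math.sqrt((n - 2) / (1.0 - r * r))


# (label, r, N, one-tailed .05 critical t)
cases = [
    ("strong, small sample", 0.50, 10, 1.86),     # critical t for 8 df, from the example
    ("weak, large sample", 0.10, 1000, 1.645),    # normal approximation for large df
]

results = []
for label, r, n, t_crit in cases:
    t = t_for_r(r, n)
    results.append((label, r * r, t, t > t_crit))
    print(f"{label}: r^2 = {r * r:.2f}, t = {t:.2f}, significant = {t > t_crit}")
```

The strong correlation explains 25% of the variance but is not significant, while the weak one explains only 1% and is clearly significant, which is why both strength and significance need to be reported.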
• Alternative ways of testing significance of r against the null hypothesis
• Look up the values in a table
• Read them off the SPSS output:
• check to see whether SPSS is making a one-tailed test
• or a two-tailed test

Testing the significance of r when ρ is NOT assumed to be 0

• This is a more complex procedure, which is discussed briefly in the Kirk reading
• The test requires first transforming the sample r to a new value, Z'.
• This test is seldom used.
• You will not be responsible for it.
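For the curious, the transformation mentioned above is commonly taken to be Fisher's Z' = ½·ln((1 + r)/(1 − r)), whose sampling distribution is approximately normal with standard error 1/sqrt(N − 3). The sketch below assumes that this is the procedure the reading describes; the hypothesized population value ρ₀ = .30 and the function names are hypothetical.

```python
import math


def fisher_z(r):
    """Fisher's transformation Z' = 0.5 * ln((1 + r) / (1 - r)), i.e. atanh(r)."""
    return math.atanh(r)


def z_against_rho0(r, n, rho0):
    """Approximate z statistic for testing H0: rho = rho0 (needs n > 3)."""
    standard_error = 1.0 / math.sqrt(n - 3)
    return (fisher_z(r) - fisher_z(rho0)) / standard_error


# Sample values from the literacy example; rho0 = .30 is a made-up hypothesis.
z = z_against_rho0(r=0.50, n=10, rho0=0.30)
print(f"Z' for r = .50: {fisher_z(0.50):.4f}")
print(f"z statistic against rho0 = .30: {z:.2f}")
```

The resulting z is compared with critical values from the standard normal distribution rather than the t distribution.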

Source: www.janda.org
