# How to determine the correlation coefficient

**Significance of the Correlation Coefficient**

**Test for the significance of relationships between two CONTINUOUS variables**

- We introduced Pearson correlation as a measure of the STRENGTH of a relationship between two variables
- But any relationship should be assessed for its SIGNIFICANCE as well as its strength.

**A general discussion of significance tests for relationships between two continuous variables.**

- Factors in relationships between two variables
- The
**strength**of the relationship:- is indicated by the correlation coefficient:
**r** - but is actually measured by the coefficient of determination:
**r 2**

- is indicated by the correlation coefficient:
- The
**significance**of the relationship- is expressed in probability levels:
**p**(e.g. significant at p =.05) - This tells how
**unlikely**a given correlation coefficient,**r**. will occur given**no**relationship in the population- NOTE! NOTE! NOTE! The
**smaller**the p-level, the more**significant**the relationship - BUT! BUT! BUT! The
**larger**the correlation, the**stronger**the relationship

- NOTE! NOTE! NOTE! The

- is expressed in probability levels:

- The
- Consider the
**classical model**for testing significance- It assumes that you have a
**sample**of cases from a**population** - The question is whether your observed
**statistic**for the sample isto be observed*likely***given**some assumption of the corresponding population**parameter.** - If your observed statistic does not exactly match the population parameter, perhaps the difference is due to sampling error
- The fundamental question: is the difference between what you observe and what you expect given the assumption of the population large enough to be
**significant**-- to reject the assumption? - The greater the difference -- the more the sample
**statistic**deviates from the population**parameter**-- the more**significant**it is - That is, the
**lessl ikely**(small probability values) that the population assumption is true.

- It assumes that you have a
- The classical model makes some assumptions about the population parameter:
- Population parameters are expressed as Greek letters, while corresponding sample statistics are expressed in lower-case Roman letters:
= correlation between two variables in the*r***sample**- (rho) = correlation between the same two variables in the
**population**

- A common assumption is that there is NO relationship between X and Y in the population: = 0.0
- Under this common
**null**hypothesis in correlational analysis:*r*= 0.0

- Population parameters are expressed as Greek letters, while corresponding sample statistics are expressed in lower-case Roman letters:

**Testing for the significance of the correlation coefficient, r**

- When the test is against the null hypothesis: r
_{xy}= 0.0- What is the likelihood of drawing a sample with r
_{xy} 0.0? - The sampling distribution of r is
- approximately
**normal**(but bounded at -1.0 and +1.0) when N is**large** - and distributes
**t**when N is**small**.

- approximately
- The simplest formula for computing the appropriate
**t value**to test significance of a correlation coefficient employs the t distribution: - The
**degrees of freedom**for entering the t-distribution is**N - 2**

- What is the likelihood of drawing a sample with r
**Example**. Suppose you obsserve that**r=****.50**between literacy rate and political stability in**10**nations- Is this relationship "strong"?
- Coefficient of determination = r-squared = .25
- Means that 25% of variance in political stability is "explained" by literacy rate

- Is the relationship "significant"?
- That remains to be determined using the formula above
**r = .50**and**N=10**set level of significance (assume .05)

determine one-or two-tailed test (aim for one-tailed)

- For 8 df and one-tailed test, critical value of
**t = 1.86**- We observe only t = 1.63
- It lies
**below**the critical t of 1.86 - So the null hypothesis of no relationship in the population (r = 0)
**cannot**be rejected

- Is this relationship "strong"?

- Note that a relationship can be
**strong**and yet**not**significant - Conversely, a relationship can be
**weak**but**significant**- The key factor is the
**size of the sample**. - For small samples, it is easy to produce a strong correlation by chance and one must pay attention to signficance to keep from jumping to conclusions: i.e.
- rejecting a
**true**null hypothesis, - which meansmaking a
**Type I error**.

- rejecting a

- The key factor is the
- For large samples, it is easy to achieve significance, and one must pay attention to the strength of the correlation to determine if the relationship explains very much.

- Look up the values in a table
- Read them off the SPSS output:
- check to see whether SPSS is making a one-tailed test
- or a two-tailed test

**Testing the significance of r when r is NOT assumed to be 0**

- This is a more complex procedure, which is discussed briefly in the Kirk reading
- The test requires first transforming the sample r to a new value, Z'.
- This test is seldom used.
- You will not be responsible for it.

- The test requires first transforming the sample r to a new value, Z'.

Source: www.janda.org

Category: Forex