Stack Overflow: March 2013

We use linear regression to find relationships between two or more variables and to create a model that attempts to describe the relationship.
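To make "create a model" concrete, here is a minimal sketch of fitting a straight line by ordinary least squares in plain Python. The function name `fit_line` and the hours/scores numbers are made up for illustration; they are not from my data or from the books cited below.

```python
from statistics import mean

def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line y = intercept + slope * x."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    x_bar, y_bar = mean(xs), mean(ys)
    # Sum of cross-products and sum of squares around the means
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("x values must not all be identical")
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept

# Hypothetical data: hours studied vs. test score
hours = [1, 2, 3, 4, 5]
scores = [52, 55, 61, 64, 68]
slope, intercept = fit_line(hours, scores)
print(f"score = {intercept:.1f} + {slope:.1f} * hours")
```

The slope says how much the predicted score changes per extra hour, which is the "model that attempts to describe the relationship" in its simplest form.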

A scatterplot is a 2-dimensional graph that displays pairs of data (i.e. observations).

A correlation coefficient is a number that tells the strength and direction of a relationship.

· 0 means no linear relationship exists (a curved relationship could still be present)

· +1 or -1 means the points fall on a perfect straight line

· The closer the number is to +1 or -1, the stronger the relationship.

· A rule of thumb: +0.7 or -0.7 (or closer to +1 or -1) indicates a strong relationship; +0.5 or -0.5 indicates a moderate relationship.
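The points above can be checked with a small sketch that computes Pearson's r from its usual definition (sum of cross-products divided by the square root of the product of the sums of squares). The helper names `pearson_r` and `describe_strength` are my own, the data are invented, and the "weak" label for anything under 0.5 is my assumption, not part of the rule of thumb.

```python
from math import sqrt
from statistics import mean

def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("r is undefined when one variable is constant")
    return sxy / sqrt(sxx * syy)

def describe_strength(r):
    """Label r using the 0.7 / 0.5 rule of thumb ('weak' is an assumed label)."""
    size = abs(r)
    if size >= 0.7:
        return "strong"
    if size >= 0.5:
        return "moderate"
    return "weak"

# Points on a perfect straight line
r_line = pearson_r([1, 2, 3, 4], [2, 4, 6, 8])

# A perfect curve (y = x squared) with no linear trend at all
r_curve = pearson_r([-2, -1, 0, 1, 2], [4, 1, 0, 1, 4])

print(r_line, describe_strength(r_line))
print(r_curve, describe_strength(r_curve))
```

The second data set is the reason for the caveat on 0: the points follow a curve exactly, yet the linear correlation comes out at zero.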

Kinds of correlation coefficient – the one used depends on the kind of data:

· Pearson r – used for data measured at least on an interval level (such as my data for the TL, SL, and SV scales)

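· Spearman rho – used when data is measured on an ordinal scale (such as a ranking); it is Pearson r computed on the ranks, so it picks up any consistently increasing or decreasing relationship, not only a straight-line one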

· Phi – used when both variables are measured dichotomously (e.g. yes/no, pass/fail)

· Also point-biserial and eta, but I don’t care about these for now.
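As a rough sketch of how the first three coefficients relate, Spearman rho can be computed as Pearson r on ranks (tied values share the average rank), and phi can be computed from the four cell counts of a 2x2 table. All function names and numbers below are illustrative assumptions, not taken from my study.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length sequences."""
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

def average_ranks(values):
    """Rank values from 1 to n; tied values get the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of values tied with position i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # positions are 0-based, ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def spearman_rho(xs, ys):
    """Spearman rho: Pearson r applied to the ranks of each variable."""
    return pearson_r(average_ranks(xs), average_ranks(ys))

def phi_coefficient(a, b, c, d):
    """Phi for a 2x2 table with rows (a, b) and (c, d)."""
    denom = sqrt((a + b) * (c + d) * (a + c) * (b + d))
    if denom == 0:
        raise ValueError("phi is undefined when a row or column total is zero")
    return (a * d - b * c) / denom

# Always increasing but curved: rho is 1 while r falls short of 1
xs = [1, 2, 3, 4, 5]
ys = [1, 4, 9, 16, 25]
r = pearson_r(xs, ys)
rho = spearman_rho(xs, ys)

# Hypothetical pass/fail by yes/no counts
phi = phi_coefficient(20, 5, 5, 20)

print(round(r, 3), round(rho, 3), round(phi, 3))
```

Writing rho as "r on ranks" keeps the three coefficients on the same -1 to +1 scale, so the same rule of thumb for strength can be read off each of them.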

Under the null hypothesis, the expected correlation is 0. The key question is whether the difference between the correlation we observe and the 0 we expect can be attributed to a relationship that really exists, or whether it appears only because of sampling error.

A one-tailed hypothesis predicts a specific direction for the relationship, either positive or negative.

A two-tailed hypothesis makes no assumption about the direction of the relationship.

For a correlational study, degrees of freedom = N - 2, where N is the number of paired observations. One degree of freedom is lost for every variable in the model. Degrees of freedom represent how many numbers are free to vary in a calculation sequence (Steinberg, 2008).
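To see how the tails and the degrees of freedom fit together, here is a sketch of the standard t test for a Pearson correlation, t = r * sqrt(N - 2) / sqrt(1 - r^2). This is the common textbook test and I am assuming it here; the r and N values are invented, and the critical values are the usual t-table entries for 8 degrees of freedom at alpha = .05.

```python
from math import sqrt

def correlation_t(r, n):
    """Return (t, df) for testing a Pearson r against a null correlation of 0."""
    if n < 3:
        raise ValueError("need at least 3 pairs so that df = n - 2 is positive")
    if not -1 < r < 1:
        raise ValueError("t is undefined when r is exactly +1 or -1")
    df = n - 2  # one degree of freedom lost for each of the two variables
    t = r * sqrt(df) / sqrt(1 - r * r)
    return t, df

# Critical t values for df = 8 at alpha = .05, copied from a standard t table
T_CRIT_DF8_ONE_TAILED = 1.860
T_CRIT_DF8_TWO_TAILED = 2.306

# Hypothetical result: r = .60 from N = 10 pairs
t, df = correlation_t(0.60, 10)

# One-tailed: the hypothesis predicted a positive relationship in advance
significant_one_tailed = t > T_CRIT_DF8_ONE_TAILED
# Two-tailed: no direction predicted, so compare the absolute value
significant_two_tailed = abs(t) > T_CRIT_DF8_TWO_TAILED

print(f"t({df}) = {t:.3f}")
print("one-tailed:", significant_one_tailed, "two-tailed:", significant_two_tailed)
```

With these made-up numbers the same correlation passes the one-tailed test but not the two-tailed one, which is why the direction has to be stated before looking at the data.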

References

Rumsey, D. (2009). Statistics II for Dummies. Hoboken, NJ: Wiley Publishing, Inc.

Steinberg, W. J. (2008). Statistics Alive! Thousand Oaks, CA: Sage Publications.

Wednesday, March 6, 2013

correlation basics
