Visual: Should I use a linear model?

I’ve read about this trick in two places, as a way to determine if the choice of a linear model for your data is a good one. You can plot the residuals (the observed values minus the predicted values, not squared) against each of the corresponding predicted values (the fitted Y’s), and if a linear fit is not good, you will see funky shapes! For example (another fantastic graphic from An Introduction to Statistical Learning with Applications in R):

The plot on the left shows that a linear model is not a good fit to the data, because there is a strong pattern in the residuals that indicates non-linearity. When we transform some of the variables (for example, squaring one of them), this pattern goes away (the image on the right). Transforming your variables is another trick that lets us use linear regression to fit polynomial-looking data. Very cool!

Plotting the residuals like this is also a good strategy to find outliers. While an outlier may not change your model fit drastically, it can drastically change the R² statistic, so you should keep watch! In this case, we can also calculate the studentized residuals, which basically means dividing each residual by its estimated standard error, and then plotting those. If there is a studentized residual greater than about 3 in absolute value, spidey sense says that it is probably an outlier.
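To make both tricks concrete, here is a minimal sketch in Python with NumPy. Everything in it is my own illustration rather than anything from the book: the data are synthetic (a quadratic trend plus noise, with one outlier planted at index 50), the helper name `fit_and_residuals` is made up, and I am assuming the "internally" studentized form, where each residual is divided by sigma * sqrt(1 - leverage).

```python
import numpy as np


def fit_and_residuals(predictors, y):
    """Ordinary least squares with an intercept (hypothetical helper).

    Returns the fitted values, the raw residuals, and the internally
    studentized residuals.
    """
    # Design matrix: a column of ones for the intercept, then the predictors.
    X = np.column_stack([np.ones(len(y)), predictors])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    fitted = X @ beta
    resid = y - fitted  # observed minus predicted, NOT squared

    n, p = X.shape
    # Leverage of each point: the diagonal of the hat matrix X (X'X)^-1 X'.
    leverage = np.diag(X @ np.linalg.pinv(X.T @ X) @ X.T)
    # Residual standard error, then each residual over its own standard error.
    sigma = np.sqrt(resid @ resid / (n - p))
    studentized = resid / (sigma * np.sqrt(1.0 - leverage))
    return fitted, resid, studentized


# Synthetic data (an assumption for illustration): curved trend plus noise.
rng = np.random.default_rng(0)
x = np.linspace(-3, 3, 100)
y = 1.0 + 2.0 * x + 1.5 * x**2 + rng.normal(scale=1.0, size=x.size)
y[50] += 12.0  # plant one outlier

# Straight-line fit: the residuals keep the curvature the model missed.
fit_lin, res_lin, stu_lin = fit_and_residuals(x, y)

# Add the squared term as a second predictor: still a linear regression.
fit_quad, res_quad, stu_quad = fit_and_residuals(np.column_stack([x, x**2]), y)

# Points whose studentized residual exceeds about 3 in absolute value.
suspects = np.flatnonzero(np.abs(stu_quad) > 3)
```

Scattering `fit_lin` against `res_lin` (for example with matplotlib's `plt.scatter`) should show the funky U shape, while `fit_quad` against `res_quad` should look like a shapeless cloud with one point sitting far above the rest. That point is the planted outlier, and it should be what shows up in `suspects`.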

Suggested Citation:

Sochat, Vanessa. "Visual: Should I use a linear model?." @vsoch (blog), 21 Aug 2013, https://vsoch.github.io/2013/visual-should-i-use-a-linear-model/ (accessed 01 Jul 25).
