Difference between revisions of "Regression Lab In StatCrunch"

From Sean_Carver

Revision as of 17:54, 17 February 2019

Lab 3: Regression, Correlation and Outliers

  • We are going to work with the small diamonds data set again: diamonds3K, which was sampled from a larger data set (see its Codebook).
  • We are going to look at the three dimensions of diamonds and how they covary: length (x), width (y), and height (z).
  • Make a scatter plot for (x versus y) and separately for (x versus z).
  • One of the conditions for correlation and regression is the "No outliers condition."
  • Do you see outliers in the plots?
  • Click on an outlier to turn it pink.
  • Is the outlier also an outlier for the other relationship, that is, for (x versus z) if you clicked it in (x versus y), or the reverse? The dot for the same diamond on the other scatter plot also turns pink.
  • Look at the row in the data set for each outlier.
  • Can you tell if the data were recorded wrong or if the diamond really had those dimensions? Consider what x, y, and z mean, and remember that you also have a measure of the diamond's weight (carat). Make scatter plots that include carat and discuss.
  • Repeat for the other most extreme outliers.
  • Plot the residuals versus the x-values, a histogram of the residuals, and a QQ plot of the residuals. Where do the outliers show up? With the outliers removed, is a simple linear regression analysis appropriate?
  • Report the correlation coefficient and regression line with the outliers and without the outliers (see below).
  • To do the analyses without the outliers, use Stat > Regression > Simple Linear and save the residuals, then use a "where" expression to restrict the data to points whose residuals are small enough in absolute value.
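
The lab itself is done in StatCrunch, but the last two steps (fit, save residuals, keep only points with small residuals, refit) can be sketched outside it. The Python below is only an illustration of that workflow, not StatCrunch syntax. The data are made up for the example and are not taken from diamonds3K, and the cutoff THRESHOLD and the helper names fit_line and residuals are hypothetical choices.

```python
import math


def fit_line(xs, ys):
    """Least-squares fit of y = intercept + slope * x.

    Returns (slope, intercept, r), where r is the correlation coefficient.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / math.sqrt(sxx * syy)
    return slope, intercept, r


def residuals(xs, ys, slope, intercept):
    """Observed y minus the y predicted by the fitted line."""
    return [y - (intercept + slope * x) for x, y in zip(xs, ys)]


# Made-up measurements (NOT from diamonds3K): y tracks x closely,
# except for one planted outlier at y = 31.8.
x = [4.0, 4.5, 5.0, 5.5, 6.0, 6.5, 7.0, 7.5]
y = [4.02, 4.49, 5.03, 5.48, 6.01, 31.8, 7.02, 7.49]

# Step 1: fit with all points and save the residuals.
slope_all, intercept_all, r_all = fit_line(x, y)
res = residuals(x, y, slope_all, intercept_all)

# Step 2: keep only points whose residual is small in absolute value.
# THRESHOLD is a hypothetical cutoff chosen by eye for this toy data.
THRESHOLD = 10.0
kept = [(xi, yi) for xi, yi, e in zip(x, y, res) if abs(e) < THRESHOLD]
x_kept = [p[0] for p in kept]
y_kept = [p[1] for p in kept]

# Step 3: refit without the outlier and compare.
slope_kept, intercept_kept, r_kept = fit_line(x_kept, y_kept)

print(f"with outliers:    y = {intercept_all:.3f} + {slope_all:.3f} x, r = {r_all:.3f}")
print(f"without outliers: y = {intercept_kept:.3f} + {slope_kept:.3f} x, r = {r_kept:.3f}")
```

Note that the residuals used for filtering come from the fit that still includes the outlier, so the outlier pulls the line toward itself and inflates the residuals of the ordinary points. The cutoff has to be loose enough to keep those points, which is one reason to check the residual plots before choosing it.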