Difference between revisions of "Regression Lab In StatCrunch"

From Sean_Carver

Revision as of 15:35, 10 February 2019

Lab 3: Regression, Correlation and Outliers

  • We are going to work with the small diamonds data set again: diamonds3K, sampled from a larger data set with a Codebook.
  • We are going to look at the three dimensions of diamonds and how they covary: length (x), width (y), and height (z).
  • Make a scatter plot for (x versus y) and separately for (x versus z).
  • One of the conditions for correlation and regression is the "No outliers condition."
  • Do you see outliers in the plots?
  • Click on an outlier to turn it pink.
  • Is that point also an outlier in the other relationship?
  • Look at the row in the data set for each outlier.
  • Can you tell if the data were recorded wrong or if the diamond really had those dimensions? Discuss.
  • Run the correlation and the regression both with the outliers and without them.
  • To do the analyses without the outliers, first use Stat > Regression > Simple Linear and save the residuals.
  • Then use a "where" expression to restrict the data to points whose residuals are small enough in absolute value.
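
For readers who want to see the logic of the last two steps outside StatCrunch, here is a minimal Python sketch of the same idea: fit a line, save the residuals, keep only the rows whose residual is small in absolute value, and compare the correlation before and after. It is not StatCrunch code. The data are made up rather than taken from diamonds3K, and the function names and the cutoff of 1.0 are assumptions chosen for illustration only.

```python
import numpy as np

def fit_line(x, y):
    """Least-squares slope and intercept for y = intercept + slope * x."""
    slope, intercept = np.polyfit(x, y, 1)
    return slope, intercept

def drop_large_residuals(x, y, cutoff):
    """Fit a line, then keep only points with |residual| <= cutoff.

    Returns the kept x values, the kept y values, and the residuals
    of all original points (the analogue of a saved residuals column).
    """
    slope, intercept = fit_line(x, y)
    residuals = y - (intercept + slope * x)
    keep = np.abs(residuals) <= cutoff  # plays the role of the "where" condition
    return x[keep], y[keep], residuals

# Made-up data (NOT diamonds3K): y tracks x closely, like length and width.
rng = np.random.default_rng(0)
x = rng.uniform(4.0, 8.0, size=200)
y = x + rng.normal(0.0, 0.05, size=200)
y[0] = 0.0  # one planted outlier, e.g. a dimension recorded as zero

# Correlation with the outlier included.
r_all = np.corrcoef(x, y)[0, 1]

# Correlation after restricting to small residuals (cutoff is an assumption).
x_kept, y_kept, residuals = drop_large_residuals(x, y, cutoff=1.0)
r_kept = np.corrcoef(x_kept, y_kept)[0, 1]

print("points kept:", len(x_kept), "of", len(x))
print("r with outliers:", round(r_all, 3))
print("r without outliers:", round(r_kept, 3))
```

How large a residual counts as "small enough" is a judgment call. Looking at a plot of the saved residuals before picking a cutoff is usually safer than fixing one in advance.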