Difference between revisions of "Regression Lab In StatCrunch"
From Sean_Carver
Revision as of 02:21, 11 February 2019
Lab 3: Regression, Correlation and Outliers
- We are going to work with the small diamonds data set again: diamonds3K, which was sampled from a larger data set and comes with a Codebook.
- We are going to look at the three dimensions of diamonds and how they covary: length (x), width (y), and height (z).
- Make a scatter plot for (x versus y) and separately for (x versus z).
- One of the conditions for correlation and regression is the "No outliers condition."
- Do you see outliers in the plots?
- Click on an outlier to turn it pink.
- Is a point that is an outlier in one relationship also an outlier in the other? Compare (x versus y) with (x versus z).
- Look at the row in the data set for each outlier.
- Can you tell if the data were recorded wrong or if the diamond really had those dimensions? Consider what x, y, and z mean and remember you also have a measure of the diamond's weight (carat). Discuss.
- Repeat for the other outliers.
- Report the correlation coefficient and regression line with the outliers and without the outliers.
- Plot the Residuals versus X-Values, a histogram of the residuals, and a QQ-Plot of the residuals. Where do the outliers show up in each plot?
- To do the analyses without the outliers, first run Stat > Regression > Simple Linear with the option to save the residuals.
- Then use a Where expression to restrict the data to points whose residuals are small enough in absolute value, and rerun the analysis on that subset.
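
For readers who want to see the logic of the last few steps outside StatCrunch, here is a minimal Python sketch of the same idea: fit a line, save the residuals, keep only the points with small residuals, and refit. It is not StatCrunch syntax. The data values are made up for illustration (they are not rows from diamonds3K), and the names `fit_line`, `residuals`, and `cutoff` are hypothetical helpers introduced only for this example.

```python
import math


def fit_line(xs, ys):
    """Return (slope, intercept, r) for the least-squares line y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / math.sqrt(sxx * syy)  # Pearson correlation coefficient
    return slope, intercept, r


def residuals(xs, ys, slope, intercept):
    """Observed y minus the y predicted by the fitted line."""
    return [y - (intercept + slope * x) for x, y in zip(xs, ys)]


# Made-up length (x) and width (y) values, NOT taken from diamonds3K.
# The last row mimics a possible recording error in y.
x = [4.0, 4.5, 5.0, 5.5, 6.0, 6.5, 7.0, 5.2]
y = [4.02, 4.51, 5.03, 5.48, 6.01, 6.52, 6.98, 31.8]

# Step 1: fit with every point and save the residuals.
slope_all, intercept_all, r_all = fit_line(x, y)
res = residuals(x, y, slope_all, intercept_all)

# Step 2: keep only points with small residuals (the "where" restriction).
cutoff = 10.0  # hypothetical threshold, chosen by eye for this toy data
keep = [abs(e) < cutoff for e in res]
x_kept = [xi for xi, k in zip(x, keep) if k]
y_kept = [yi for yi, k in zip(y, keep) if k]

# Step 3: refit without the excluded points and compare.
slope_kept, intercept_kept, r_kept = fit_line(x_kept, y_kept)

print(f"with outliers:    r = {r_all:.3f}, y = {intercept_all:.3f} + {slope_all:.3f} x")
print(f"without outliers: r = {r_kept:.3f}, y = {intercept_kept:.3f} + {slope_kept:.3f} x")
```

The cutoff here is arbitrary. In the lab, the residual plot and the histogram of residuals are what suggest a sensible value. Note also that the residuals come from the fit that still includes the outliers, so a single extreme point can pull the line toward itself and change which points look unusual.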