Difference between revisions of "De Veaux Map"

Revision as of 00:51, 21 November 2018

1 Part I: Exploring and Understanding Data
2 Part II: Exploring Relationships Between Variables
3 Part III: Gathering Data
4 Part IV: Randomness and Probability
5 Part V: From the Data at Hand to the World at Large
6 Part VI: Accessing Associations Between Variables
7 Part VII: Inference When Variables are Related

Part I: Exploring and Understanding Data

Chapter 1: Exploring and Understanding Data

1.1: What is Statistics?
1.2: Data
1.3: Variables

Types of variables: Quantitative, identifier, ordinal, categorical (categorical & nominal considered synonyms)

Chapter 2: Displaying and Describing Categorical Data

2.1: Summarizing and Displaying a Single Categorical Variable

The area principle

Frequency tables

Bar charts

Pie charts

2.2: Exploring the Relationship Between Two Categorical Variables

Contingency tables

Conditional distributions

Independence

Plotting conditional distributions (with pie charts, bar charts and segmented bar charts)

Chapter 3: Displaying and Summarizing Quantitative Data

3.1: Displaying Quantitative Variables

Histograms

Stem and leaf displays

Dotplots

3.2: Shape

Unimodal, bimodal or multimodal

Symmetric or skewed

Outliers

3.3: Center

Median

3.4: Spread

Range, min, max

Interquartile range, Q1, Q3

3.5: Boxplots and 5-Number Summaries
3.6: The Center of a Symmetric Distribution: The Mean

Mean or Median?

3.7: The Spread of a Symmetric Distribution: The Standard Deviation

Formulas for variance and standard deviation

Thinking about variation
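
As an illustration of the variance and standard deviation formulas listed in 3.7, here is a minimal sketch in Python. It assumes the usual sample formulas with an n - 1 divisor; the function names and the data values are made up for the example and do not come from the book.

```python
import math

def sample_variance(values):
    """Sum of squared deviations from the mean, divided by n - 1."""
    n = len(values)
    mean = sum(values) / n
    return sum((y - mean) ** 2 for y in values) / (n - 1)

def sample_sd(values):
    """The standard deviation is the square root of the variance."""
    return math.sqrt(sample_variance(values))

# Hypothetical data, chosen only so the arithmetic is easy to follow.
data = [2, 4, 4, 4, 5, 5, 7, 9]
print(sample_variance(data))
print(sample_sd(data))
```

The mean of this data set is 5 and the squared deviations sum to 32, so the variance is 32/7.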

3.8: Summary---What to Tell About a Quantitative Variable

Chapter 4: Understanding and Comparing Distributions

4.1: Comparing Groups with Histograms
4.2: Comparing Groups with Boxplots
4.3: Outliers
4.4: Timeplots
4.5: Re-Expressing Data: A First Look

...To improve symmetry

...To equalize spread across groups

Chapter 5: The Standard Deviation as a Ruler and the Normal Model

5.1: Standardizing with z-Scores
5.2: Shifting and Scaling

Shifting to adjust the center

Rescaling to adjust the scale

Shifting, scaling and z-Scores

5.3: Normal Models

The "nearly normal condition"

The 68-95-99.7 Rule

Working with pictures of the Normal curve

Inflection points at mean +/- one standard deviation

Interpretation of area under Normal curve as proportion of observations in interval (implied by pictures and exposition)
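
The 68-95-99.7 Rule and the area interpretation can be checked numerically. The sketch below uses `statistics.NormalDist` from the Python standard library; the helper name `within_k_sd` is an assumption introduced for this example.

```python
from statistics import NormalDist

def within_k_sd(k):
    """Proportion of a Normal model lying within k standard deviations of the mean."""
    standard_normal = NormalDist()  # mean 0, standard deviation 1
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

for k in (1, 2, 3):
    print(k, round(within_k_sd(k), 4))
```

The three areas come out close to 0.68, 0.95 and 0.997, which is where the rule gets its name.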

5.4: Finding Normal Percentiles

Normal percentiles

Other models

From percentiles to scores: z in reverse
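
A minimal sketch of going from values to Normal percentiles and back ("z in reverse"), again using `statistics.NormalDist`. The function names and the model with mean 500 and standard deviation 100 are hypothetical, chosen only for illustration.

```python
from statistics import NormalDist

def z_score(value, mean, sd):
    """Standardize: how many standard deviations the value lies from the mean."""
    return (value - mean) / sd

def percentile_from_value(value, mean, sd):
    """Proportion of the Normal model lying below the given value."""
    return NormalDist().cdf(z_score(value, mean, sd))

def value_from_percentile(p, mean, sd):
    """Reverse direction: find the z-score for a percentile, then unstandardize."""
    z = NormalDist().inv_cdf(p)
    return mean + z * sd

# Hypothetical Normal model: mean 500, standard deviation 100.
print(z_score(600, 500, 100))
print(percentile_from_value(600, 500, 100))
print(value_from_percentile(0.90, 500, 100))
```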

5.5: Normal Probability Plots

Part II: Exploring Relationships Between Variables

Chapter 6: Scatterplots, Association, and Correlation

6.1: Scatterplots

Direction (negative or positive)

Form

Strength

Outliers

Explanatory and response variables

6.2: Correlation

Formula

Assumptions and conditions for correlation, including...

"Quantitative variables condition,"

"Straight enough condition,"

"No outliers condition"

6.3: Warning: Correlation Does Not Equal Causation
6.4: Straightening Scatterplots

Chapter 7: Linear Regression

7.1 Least Squares: The Line of "Best Fit"

The linear model

Predicted values and residuals

The least squares line and the sense in which it is the best fit

7.2 The Linear Model

Using the linear model to make predictions

7.3 Finding the Least Squares Line

Formulas for slope and intercept
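
The slope and intercept formulas can be sketched as follows, assuming the common forms b1 = r * s_y / s_x and b0 = ybar - b1 * xbar. The function name and the small data set are illustrative assumptions, not taken from the book.

```python
import statistics

def least_squares_line(xs, ys):
    """Return (intercept, slope) of the least squares line of y on x."""
    n = len(xs)
    x_bar, y_bar = statistics.mean(xs), statistics.mean(ys)
    s_x, s_y = statistics.stdev(xs), statistics.stdev(ys)
    # Correlation: average product of z-scores, using an n - 1 divisor.
    r = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / ((n - 1) * s_x * s_y)
    slope = r * s_y / s_x
    intercept = y_bar - slope * x_bar
    return intercept, slope

# Hypothetical data for illustration.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
b0, b1 = least_squares_line(xs, ys)
print(b0, b1)

# Predicted value and residual for the first case.
predicted = b0 + b1 * xs[0]
residual = ys[0] - predicted
print(predicted, residual)
```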

7.4 Regression to the Mean

Etymology of the word "Regression"

Math Box: Derivation of regression formula

7.5 Examining the Residuals

Formula for residuals

Appropriate (lack of) form of Residuals versus x-Values plot

The residual standard deviation

7.6 R^2---The Variation Accounted For by the Model

How big should R^2 be?

Predicting in the other direction---A tale of two regressions

7.7 Regression Assumptions and Conditions

"Quantitative variable" condition

"Straight enough" condition

"Outlier" condition

"Does the plot thicken?" condition

Judging the conditions with the residuals-versus-predicted-values plot

Chapter 8: Regression Wisdom

8.1: Examining Residuals

Getting the "bends": When the residuals aren't straight

Sifting residuals for groups

Subsetting with a categorical variable

8.2: Extrapolation: Reaching Beyond the Data

Warning with extrapolation

Warning about predicting what would happen to cases in the regression if they were changed

8.3: Outliers, Leverage, and Influence
8.4: Lurking Variables and Causation
8.5: Working with Summary Values

Chapter 9: Re-expressing Data: Get It Straight!

9.1: Straightening Scatterplots -- The Four Goals

Goal 1: Make the distribution of a variable more symmetric.

Goal 2: Make the spread of several groups more alike, even if their centers differ

Goal 3: Make the form of a scatterplot more nearly linear

Goal 4: Make the scatter in a scatterplot spread out evenly rather than thickening at one end

Recognizing when a re-expression can help

9.2: Finding a Good Re-Expression

Plan A: The ladder of powers

Re-expressing to straighten a scatterplot

Comparing re-expressions

Plan B: Attack of the logarithms

Multiple benefits to re-expressions

Why not just fit a curve?

Part III: Gathering Data

Chapter 10: Understanding Randomness

10.1: What Is Randomness?

Meaning of the word "random"

Discussion of the process of generating random numbers

10.2: Simulating by Hand

Basic terminology: Simulations, trials, components, response variable

Chapter 11: Sample Surveys

11.1: The Three Big Ideas of Sampling

Idea 1: Examine a part of the whole

Population versus sample

Bias

Idea 2: Randomize

Idea 3: It's the sample size

Sample size

Does a census make sense?

11.2: Populations and Parameters
11.3: Simple Random Samples

Sampling frame

Sampling variability

11.4: Other Sampling Designs

Stratified sampling

Cluster sampling

Multistage sampling

Systematic sampling

11.5: From the Population to the Sample: You Can't Always Get What You Want
11.6: The Valid Survey

Know what you want to know

Tune your instrument

Ask specific rather than general questions

Ask for quantitative results when possible

Be careful in phrasing questions

Pilot studies

11.7: Common Sampling Mistakes or How to Sample Badly

Mistake 1: Sample volunteers

Mistake 2: Sample conveniently

Mistake 3: Use a bad sampling frame

Mistake 4: Undercoverage

Nonresponse bias

Response bias

How to think about biases

Look for biases in any survey you encounter

Spend your time and resources reducing biases

Think about the members of the population who could have been excluded from your study

Always report your sampling methods in detail

Chapter 12: Experiments and Observational Studies

12.1: Observational Studies

Observational studies

Retrospective studies

Prospective studies

12.2: Randomized, Comparative Experiments

Random assignment of subjects to treatments

Explanatory variables, factors and levels

Response variables

12.3: The Four Principles of Experimental Design

Principle 1: Control

Principle 2: Randomize

Principle 3: Replicate

Principle 4: Block

Diagramming experiments

Statistically significant differences between groups

Contrasting experiments and samples

12.4: Control Treatments

Blinding (single and double)

Placebos

12.5: Blocking

Matched participants

12.6: Confounding

Lurking or confounding

Part IV: Randomness and Probability

Chapter 13: From Randomness to Probability

13.1: Random Phenomena

"A random phenomenon is a situation in which we know what outcomes can possibly occur, but we don't know which particular outcome will happen"

Trials

Outcomes

Sample space

Events

The law of large numbers

Empirical probability

The nonexistent law of averages

13.2: Modeling Probability

Theoretical probability

Personal probability

13.3: Formal Probability

The five rules of probability

Rule 1: A probability must be a number between 0 and 1

Rule 2: Probability assignment rule: The probability of the sample space must be 1

Rule 3: The complement rule

Rule 4: The addition rule

Rule 5: The multiplication rule

Chapter 14: Probability Rules!

14.1: The General Addition Rule
14.2: Conditional Probability and the General Multiplication Rule
14.3: Independence
14.4: Picturing Probability: Tables, Venn Diagrams, and Trees
14.5: Reversing the Conditioning and Bayes' Rule

Chapter 15: Random Variables

15.1: Center: The Expected Value

Definition of a random variable

Discrete random variables (can "list" all the outcomes)

Continuous random variables (not discrete)

Probability models for discrete random variables

Computation of expected value for discrete random variables

15.2: Spread: The Standard Deviation

Computation of variance and standard deviation for discrete random variables

15.3: Shifting and Combining Random Variables

E(X +/- c)

Var(X +/- c)

E(aX)

Var(aX)

E(X +/- Y)

Var(X +/- Y), when X and Y are independent
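
The rules in 15.3 can be verified exactly on small discrete probability models. The sketch below is one way to do it with exact fractions; the helper names and the two models (a fair die and a fair 0/1 coin) are assumptions made for the example.

```python
from fractions import Fraction
from itertools import product

def expected_value(model):
    """E(X) for a discrete model given as {value: probability}."""
    return sum(x * p for x, p in model.items())

def variance(model):
    """Var(X) = sum of (x - mu)^2 * P(x)."""
    mu = expected_value(model)
    return sum((x - mu) ** 2 * p for x, p in model.items())

def combine(model_x, model_y, op):
    """Model for op(X, Y) when X and Y are independent."""
    out = {}
    for (x, px), (y, py) in product(model_x.items(), model_y.items()):
        value = op(x, y)
        out[value] = out.get(value, 0) + px * py  # independence: multiply probabilities
    return out

die = {face: Fraction(1, 6) for face in range(1, 7)}
coin = {0: Fraction(1, 2), 1: Fraction(1, 2)}

difference = combine(die, coin, lambda x, y: x - y)
print(expected_value(difference), expected_value(die) - expected_value(coin))
print(variance(difference), variance(die) + variance(coin))
```

Note that the variances add even though the random variables are subtracted; this holds here because the two models are combined as independent.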

[Unnumbered section, labeled optional]: Correlation and Covariance

Covariance of two random variables

Var(X +/- Y), when X and Y covary

Correlation of two random variables

15.4: Continuous Random Variables

The Normal random variable as an example of a continuous random variable

Caption to Figure 15.1: Interpretation of area under Normal curve as probability of finding an observation in the interval.

How can every value have a probability 0?

Sums of independent Normal random variables are Normal.

Chapter 16: Probability Models

16.1: Bernoulli Trials
16.2: The Geometric Model

Independence

The 10% condition

16.3: The Binomial Model

Binomial probabilities and the binomial model

Binomial coefficients
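
A short sketch of binomial probabilities built from binomial coefficients, using `math.comb` from the standard library. The function name and the example values (n = 5, p = 0.5) are illustrative assumptions.

```python
import math

def binomial_pmf(k, n, p):
    """P(X = k) for a Binomial model with n trials and success probability p."""
    return math.comb(n, k) * p ** k * (1 - p) ** (n - k)

# Hypothetical example: number of successes in 5 trials with p = 0.5.
n, p = 5, 0.5
for k in range(n + 1):
    print(k, binomial_pmf(k, n, p))

# Mean and standard deviation of the Binomial model: np and sqrt(npq).
print(n * p, math.sqrt(n * p * (1 - p)))
```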

16.4: Approximating the Binomial Model with a Normal Model

The success/failure condition

16.5: The Continuity Correction
16.6: The Poisson Model
16.7: Other Continuous Random Variables: The Uniform and the Exponential

The uniform distribution

The exponential model

Part V: From the Data at Hand to the World at Large

Chapter 17: Sampling Distribution Models

17.1: Sampling Distribution of a Proportion

The Normal model often fits the sampling distribution of a proportion well

Which Normal? Mean/standard deviation for Normal approximation to the sampling distribution for proportions

Sampling variability

17.2: When Does the Normal Model Work Well? Assumptions and Conditions (for proportions)

The independence assumption

The randomization condition

The 10% condition

The success/failure condition

17.3: The Sampling Distributions of Other Statistics

Simulating the sampling distributions of other statistics

Medians

Variances

Minimums

Simulating the sampling distribution of a mean

17.4: The Central Limit Theorem: The Fundamental Theorem of Statistics

Statement of theorem

Assumptions and conditions

But which Normal: Mean and standard deviation for sampling distributions for means
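
The claim about the mean and standard deviation of the sampling distribution of a mean can be explored by simulation. This is a rough sketch, assuming a fair six-sided die as the population (mean 3.5, variance 35/12); the function name, sample size, trial count and seed are arbitrary choices for the example.

```python
import math
import random
import statistics

def simulate_sample_means(n, trials, seed=0):
    """Draw `trials` samples of size n from a fair die and return their means."""
    rng = random.Random(seed)  # local generator so the run is repeatable
    return [sum(rng.randint(1, 6) for _ in range(n)) / n for _ in range(trials)]

n = 25
means = simulate_sample_means(n=n, trials=5000)

mu = 3.5
sigma = math.sqrt(35 / 12)

# The simulated means should center near mu with spread near sigma / sqrt(n).
print(statistics.mean(means), mu)
print(statistics.stdev(means), sigma / math.sqrt(n))
```

A histogram of `means` would look roughly Normal even though individual die rolls are uniform, which is the point of the Central Limit Theorem.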

17.5: Sampling Distributions: A Summary

Chapter 18: Confidence Intervals for Proportions

18.1: A Confidence Interval

The standard error

What a confidence interval says about a parameter

18.2: Interpreting Confidence Intervals: What Does 95% Confidence Really Mean?
18.3: Margin of Error: Certainty vs. Precision

Margin of error

How the margin of error depends upon the confidence level

Critical values
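
A minimal sketch of a one-proportion z-interval, showing how the critical value and the margin of error depend on the confidence level. It assumes the standard error sqrt(p_hat * q_hat / n) and that the conditions in 18.4 are met; the function names and the counts are hypothetical.

```python
import math
from statistics import NormalDist

def critical_value(confidence):
    """z* such that the central `confidence` area of the standard Normal lies within +/- z*."""
    return NormalDist().inv_cdf(1 - (1 - confidence) / 2)

def proportion_interval(successes, n, confidence=0.95):
    """Return (lower, upper) for a one-proportion z-interval."""
    p_hat = successes / n
    standard_error = math.sqrt(p_hat * (1 - p_hat) / n)
    margin_of_error = critical_value(confidence) * standard_error
    return p_hat - margin_of_error, p_hat + margin_of_error

# Hypothetical survey: 50 successes out of 100.
print(proportion_interval(50, 100, 0.90))
print(proportion_interval(50, 100, 0.95))
print(proportion_interval(50, 100, 0.99))
```

Raising the confidence level widens the interval: more certainty costs precision.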

18.4: Assumptions and Conditions

Independence assumption

Independence condition

Randomization condition

10% condition

Sample size assumption

Success/failure condition

Chapter 19: Testing Hypotheses About Proportions

19.1: Hypotheses

The null hypothesis

The alternative hypothesis

A trial (criminal justice) as a hypothesis test

19.2: P-Values

Definition of P-value

What to do with an "innocent" defendant (verdict: not guilty)

19.3: The Reasoning of Hypothesis Testing

1. Hypotheses (pose hypotheses)

2. Model (verify problem satisfies conditions)

3. Mechanics (perform calculations)

4. Conclusion (interpret results)

19.4: Alternative Alternatives

Two-sided alternative

One-sided alternative
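
The difference between two-sided and one-sided alternatives shows up in how the P-value is computed. Below is a sketch of a one-proportion z-test, assuming the Normal model is appropriate and using the null value p0 in the standard deviation; the function name, the `alternative` labels and the counts are assumptions for illustration.

```python
import math
from statistics import NormalDist

def one_proportion_z_test(successes, n, p0, alternative="two-sided"):
    """Return (z, p_value) for H0: p = p0 against the chosen alternative."""
    p_hat = successes / n
    sd_p_hat = math.sqrt(p0 * (1 - p0) / n)  # uses the hypothesized p0, not p_hat
    z = (p_hat - p0) / sd_p_hat
    normal = NormalDist()
    if alternative == "two-sided":
        p_value = 2 * (1 - normal.cdf(abs(z)))
    elif alternative == "greater":
        p_value = 1 - normal.cdf(z)
    elif alternative == "less":
        p_value = normal.cdf(z)
    else:
        raise ValueError("alternative must be 'two-sided', 'greater' or 'less'")
    return z, p_value

# Hypothetical data: 60 successes in 100 trials, null value p0 = 0.5.
print(one_proportion_z_test(60, 100, 0.5, "two-sided"))
print(one_proportion_z_test(60, 100, 0.5, "greater"))
```

For the same data, the one-sided P-value in the direction of the observed difference is half the two-sided one.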

19.5: P-Values and Decisions: What to Tell About a Hypothesis Test

Discussion of when a p-value is small enough (no threshold yet)

Chapter 20: Inference About Means

20.1: Getting Started: The Central Limit Theorem (Again)

For means, the population standard deviation is required, but the sample standard deviation is all we have

20.2: Gosset's t

t-Distribution versus Normal distribution

Degrees of freedom

What did Gosset see?

A confidence interval for means

A practical sampling distribution model for means

One-sample t-interval for the mean

Assumptions and Conditions

Independence assumption (randomization condition)

Normal population assumption (nearly normal condition)

Relationship to sample size

Using Table T to find t-Values

20.3: Interpreting Confidence Intervals
20.4: A Hypothesis Test for the Mean

One-sample t-test for the mean

Intervals and tests (relationship)

The special case of proportions (relationship above differs)

20.5: Choosing the Sample Size

Chapter 21: More About Tests and Intervals

21.1: Choosing Hypotheses
21.2: How to Think About P-Values

The P-value is not the probability that the null hypothesis is true

What to do with a small P-value

A small p-value does not imply a large effect

What to do with a high P-value

A big p-value does not prove the null hypothesis

21.3: Alpha Levels

Alpha levels and statistical significance

Where did the value 0.05 come from?

Practical vs. statistical significance

21.4: Critical Values for Hypothesis Tests

Table T

A confidence interval for small samples

Confidence intervals and hypothesis tests

21.5: Errors

Type I errors

Type II errors

Probabilities defined as alpha and beta

Power

Effect size

Pictures of errors

Reducing both type I and type II errors

Part VI: Accessing Associations Between Variables

Chapter 22: Comparing Groups

22.1: The Standard Deviation of a Difference
22.2: Assumptions and Conditions for Comparing Proportions
22.3: A Confidence Interval for the Difference Between Two Proportions
22.4: The Two-Sample z-Test: Testing for the Difference Between Proportions
22.5: A Confidence Interval for the Difference Between Two Means
22.6: The Two-Sample t-Test: Testing for the Difference Between Two Means
22.7: The Pooled t-Test: Everyone into the Pool?

@@ Line 440: / Line 440: @@
 === Chapter 25: Inferences for Regression ===
-== Part VII: Inference When Variables are Related ===
+== Part VII: Inference When Variables are Related ==
 === Chapter 26: Analysis of Variance ===