Stat 202 Discussion

Broad Objectives

Understand the traditional way of structuring data (datasets, tables, cases, variables, values); structured versus unstructured data.
Be able to recognize the types of variables in a data set (quantitative, identifier, categorical (ordinal, nominal, binary)).
Understand that different analyses and displays are appropriate for different types of variables.
Understand the concept of a distribution of a variable (what values the variable takes and how often it takes those values).
Describe the distribution of a single quantitative variable (histogram, box plot, QQ plot, shape, outliers, center, spread, modes, symmetry, skewness, normal/bell shaped, mean, median, standard deviation, Q1, Q3, IQR, percentiles).
Describe the outliers of a single quantitative variable (tails, 1.5 IQR rule), know which measures are resistant to outliers and which are not, and understand what that means.
Describe the distribution of a single categorical variable (bar plot, pie chart, frequency table).
Understand the concept of and apply transformations of a variable (e.g. z-score, change of units, log) and know the special properties of a linear transformation.
Understand what it means for data (a quantitative variable) to fit a Normal model with parameters (as revealed by a histogram or QQ plot, allowing for typical noise) and know how to make predictions based on that assumption.
Understand descriptions of Normal and other models with density curves.
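
The numerical summaries in the objectives above can be illustrated with a short Python sketch. Python is an assumption here, not a course tool, and the data values are made up. The quartile convention (`method="inclusive"`) is only one of several in use, so a textbook or calculator may report slightly different quartiles.

```python
import statistics

# Hypothetical data: nine made-up values, one of them suspiciously large.
values = [2, 4, 4, 5, 6, 7, 8, 9, 30]

mean = statistics.mean(values)
median = statistics.median(values)
sd = statistics.stdev(values)  # sample standard deviation (divides by n - 1)

# Quartiles and IQR; "inclusive" is one of several quartile conventions.
q1, q2, q3 = statistics.quantiles(values, n=4, method="inclusive")
iqr = q3 - q1

# 1.5 IQR rule: flag values that fall outside the fences.
low_fence = q1 - 1.5 * iqr
high_fence = q3 + 1.5 * iqr
outliers = [v for v in values if v < low_fence or v > high_fence]

# Resistance: drop the flagged values and see how far each measure moves.
trimmed = [v for v in values if v not in outliers]
mean_shift = mean - statistics.mean(trimmed)
median_shift = median - statistics.median(trimmed)

# z-scores are a linear transformation; the result has mean 0 and SD 1.
z_scores = [(v - mean) / sd for v in values]

print(f"mean={mean:.2f} median={median} sd={sd:.2f}")
print(f"Q1={q1} Q3={q3} IQR={iqr} outliers={outliers}")
print(f"mean shift={mean_shift:.2f} median shift={median_shift:.2f}")
```

With these values the rule flags only 30, and removing it moves the mean much more than the median, which is what "resistant" refers to.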

Describe the relationship between a categorical variable and a quantitative variable (with a series of histograms, or side-by-side box plots).
Describe the relationship between pairs of quantitative variables (with scatter plots, and numerical measures below).
Know what it means for a pair of quantitative variables to fit a linear model with scatter.
Know that correlation and regression are only appropriate for pairs of quantitative variables that fit a linear model with scatter.
Describe the linear relationship between quantitative variables with correlation and regression, and know the properties of these analyses.
Know the difference between outliers for a relationship between variables and outliers for individual variables; understand the effect of influential data points.
Understand the concept of predicting the value of the response variable from the value of the explanatory variable using the regression line.
Understand the concept of, and how to compute, residuals of a regression analysis; know how to analyze residuals for the appropriateness of the linear model.
Know the significance of R^2 in assessing the fraction of variance explained by linear regression between two quantitative variables.
Know that neither correlation nor association implies causation.
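
The regression objectives above can be sketched in a few lines of Python. This is a minimal illustration with made-up data, written with plain arithmetic so that each formula is visible; the variable names are hypothetical and it assumes the scatter plot has already been checked for a linear pattern.

```python
import math

# Hypothetical paired data (explanatory x, response y), made up for illustration.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

n = len(xs)
mean_x = sum(xs) / n
mean_y = sum(ys) / n

# Sums of squares and cross-products about the means.
sxx = sum((x - mean_x) ** 2 for x in xs)
syy = sum((y - mean_y) ** 2 for y in ys)
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))

# Correlation and the least-squares line.
r = sxy / math.sqrt(sxx * syy)
slope = sxy / sxx
intercept = mean_y - slope * mean_x


def predict(x):
    """Predicted response for a given value of the explanatory variable."""
    return intercept + slope * x


# Residual = observed response minus predicted response.
residuals = [y - predict(x) for x, y in zip(xs, ys)]

# R^2 as the fraction of the variance in y explained by the line.
sse = sum(e ** 2 for e in residuals)
r_squared = 1 - sse / syy

print(f"r={r:.4f} slope={slope:.3f} intercept={intercept:.3f}")
print(f"R^2={r_squared:.4f} residuals={[round(e, 3) for e in residuals]}")
```

Two properties worth checking by hand: the residuals of a least-squares line sum to zero, and for a single explanatory variable R^2 equals the square of the correlation r.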

Understand the distinctions and differences between samples, census, and population (and related concepts of sample size, statistics and parameters).
Understand why it is important to sample and assign groups randomly, and how this relates to obtaining a representative sample.
Understand common sample designs (simple random sample, stratified sampling, cluster and multistage sampling, systematic sampling).
Understand bias, and the common sources of bias in sampling (voluntary response sampling, convenience sampling, undercoverage, nonresponse bias, response bias).
Understand the concept of a pilot study, and why it is useful.
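
Three of the sample designs listed above can be sketched in Python as follows. The sampling frame, the stratum labels, and the function names are all hypothetical, and the fixed seed is there only to make the draws repeatable.

```python
import random

rng = random.Random(202)  # fixed seed so the draws are reproducible

# Hypothetical sampling frame: 100 units, 60 in stratum "A" and 40 in "B".
population = [(i, "A" if i < 60 else "B") for i in range(100)]


def simple_random_sample(frame, k):
    """Every set of k units is equally likely to be chosen."""
    return rng.sample(frame, k)


def stratified_sample(frame, fraction):
    """Take a simple random sample of the same fraction within each stratum."""
    strata = {}
    for unit in frame:
        strata.setdefault(unit[1], []).append(unit)
    sample = []
    for members in strata.values():
        sample.extend(rng.sample(members, round(len(members) * fraction)))
    return sample


def systematic_sample(frame, step):
    """Pick a random starting point, then take every step-th unit."""
    start = rng.randrange(step)
    return frame[start::step]


srs = simple_random_sample(population, 10)
strat = stratified_sample(population, 0.10)
syst = systematic_sample(population, 10)

print(sum(1 for _, s in srs if s == "A"), "of 10 SRS units are from stratum A")
print(sum(1 for _, s in strat if s == "A"), "of 10 stratified units are from stratum A")
```

The stratified sample always contains 6 units from A and 4 from B, while the count in the simple random sample varies from draw to draw.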

Understand and use set notation (element of, subset of, union, intersection, complement of, null set, disjoint sets).
Understand and apply the terminology of probability (random phenomenon, sample space, outcomes, events, independent events) and the rules/"axioms" of probability.
Understand the mathematical concept of a function to be able to apply it to the definition of a random variable.
Know and understand the definition of a random variable as a function mapping a sample space of a random phenomenon to real numbers, and be able to give important examples of random variables (coin toss, die toss, sampling, binomial).
Know how to compute the mean and standard deviation of a discrete random variable from its probability table.
Know how to apply formulas to compute means, variances, and standard deviations of sums, differences, and linear transformations of independent and correlated random variables.
Know the assumptions behind the use of a Binomial model, and recognize situations when this model is applicable.
Use the Binomial calculator to make predictions in StatCrunch; know how to use other StatCrunch calculators (geometric, Normal, Uniform) and when they apply.
Apply and recognize formulas for mean and standard deviation of binomial random variables.
Understand the concept of a random number table or a pseudorandom number generator and how it applies to simulation; understand the concept of a pseudorandom seed.
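
The random-variable computations above can be checked with a small Python sketch. It is a sketch under stated assumptions: the fair die, the binomial parameters n = 10 and p = 0.3, and the helper names are chosen only for illustration, and StatCrunch calculators are not involved.

```python
import math
import random


def mean_and_sd(table):
    """Mean and SD of a discrete random variable; table maps value -> probability."""
    mu = sum(x * p for x, p in table.items())
    var = sum((x - mu) ** 2 * p for x, p in table.items())
    return mu, math.sqrt(var)


# Probability table for one toss of a fair die.
die = {face: 1 / 6 for face in range(1, 7)}
die_mean, die_sd = mean_and_sd(die)

# Sum of two independent dice, built by enumerating all 36 outcomes.
two_dice = {}
for a in die:
    for b in die:
        two_dice[a + b] = two_dice.get(a + b, 0) + die[a] * die[b]
sum_mean, sum_sd = mean_and_sd(two_dice)

# Binomial probability table (illustrative parameters) versus the formulas.
n, p = 10, 0.3
binomial = {k: math.comb(n, k) * p ** k * (1 - p) ** (n - k) for k in range(n + 1)}
binom_mean, binom_sd = mean_and_sd(binomial)
formula_mean = n * p
formula_sd = math.sqrt(n * p * (1 - p))


def simulate_binomial(seed, trials=5):
    """Simulate binomial counts; the same seed reproduces the same counts."""
    rng = random.Random(seed)
    return [sum(rng.random() < p for _ in range(n)) for _ in range(trials)]


print(f"die: mean={die_mean:.3f} sd={die_sd:.3f}")
print(f"two dice: mean={sum_mean:.3f} sd={sum_sd:.3f}")
print(f"binomial: table sd={binom_sd:.4f} formula sd={formula_sd:.4f}")
print(simulate_binomial(seed=1) == simulate_binomial(seed=1))
```

For the two independent dice the means add and the variances add, so the SD of the sum is the square root of twice the single-die variance rather than twice the single-die SD.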

Understand that, in the context of sampling, the values of parameters do not depend on the sample, whereas the values of statistics do.
Understand that, in the context of sampling, the sample space is the set of all possible samples of a certain size (n).
Understand the definition of sampling distribution for a statistic: what values it takes, over the whole sample space, and how often it takes those values.
Use the binomial distribution for making predictions about the sampling distribution of counts and proportions; know the randomization condition and 10% condition for the appropriateness of this endeavor.
Know the success/failure condition for approximating a binomial distribution with a Normal one.
Use the Normal distribution for making predictions about the sampling distribution for proportions; know the conditions for the appropriateness of this endeavor (randomization condition, 10% condition, success/failure condition).
Use the Normal distribution for making predictions about the sampling distribution for means; know the randomization condition and the sample size condition for the appropriateness of this endeavor.
Apply and recognize formulas for the theoretical means and standard deviations of the sampling distributions of means and proportions.
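
The sampling-distribution objectives above can be explored by simulation. The Python sketch below assumes a hypothetical population of 10,000 with true proportion p = 0.3 and samples of size n = 100. The threshold of 10 in the success/failure check is a common textbook convention and is an assumption here; some texts use a different cutoff.

```python
import math
import random
import statistics

p, n = 0.3, 100           # hypothetical true proportion and sample size
population_size = 10_000  # hypothetical population size


def conditions_ok(n, p, population_size):
    """10% condition and success/failure condition (threshold 10 is assumed)."""
    ten_percent = n <= 0.10 * population_size
    success_failure = n * p >= 10 and n * (1 - p) >= 10
    return ten_percent and success_failure


def sd_of_sample_mean(sigma, n):
    """Theoretical SD of the sampling distribution of the sample mean."""
    return sigma / math.sqrt(n)


# Theoretical mean and SD of the sampling distribution of p-hat.
theory_mean = p
theory_sd = math.sqrt(p * (1 - p) / n)

# Simulate many samples; independent draws are what the 10% condition justifies.
rng = random.Random(202)


def sample_proportion():
    return sum(rng.random() < p for _ in range(n)) / n


p_hats = [sample_proportion() for _ in range(5000)]
sim_mean = statistics.mean(p_hats)
sim_sd = statistics.stdev(p_hats)

print(f"conditions met: {conditions_ok(n, p, population_size)}")
print(f"theory: mean={theory_mean:.4f} sd={theory_sd:.4f}")
print(f"simulated: mean={sim_mean:.4f} sd={sim_sd:.4f}")
```

The simulated mean and SD should land close to the theoretical values, and a histogram of `p_hats` would look roughly Normal when the conditions hold.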

Understand the concepts of estimating a parameter from a sample, and of a biased and unbiased estimator.
Know that the sample proportion is an unbiased estimator of the true proportion and that the sample mean is an unbiased estimator of the true mean.
Understand the difference between the standard deviation of a sampling distribution and the standard error estimated from a sample.
Apply and recognize formulas for the standard error of the sample proportion and sample mean.
Be able to compute one- and two-proportion z-intervals, and one- and two-sample t-intervals, from appropriate data.
Understand the conditions for using a one-sample proportion interval (randomization/independence condition, 10% condition, success/failure condition).
Appropriately interpret a confidence interval for a parameter.
Understand the concepts of margin of error, confidence level, and standard error, and how they interrelate and depend on sample size, p-hat, s.
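
As a rough illustration of how margin of error, confidence level, standard error, and sample size interrelate, here is a Python sketch of a one-proportion z-interval. The counts (224 successes out of 400) are made up, the function name is hypothetical, and the sketch assumes the conditions for the interval have already been checked. The t-intervals are not covered, since the Python standard library has no t distribution.

```python
import math
from statistics import NormalDist


def one_proportion_z_interval(successes, n, confidence=0.95):
    """Return (low, high, margin_of_error) for a one-proportion z-interval."""
    p_hat = successes / n
    # Standard error: the SD formula with p-hat in place of the unknown p.
    se = math.sqrt(p_hat * (1 - p_hat) / n)
    # Critical value z* leaves (1 - confidence) / 2 in each tail.
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    me = z_star * se
    return p_hat - me, p_hat + me, me


# Hypothetical sample: 224 successes in 400 trials, so p-hat = 0.56.
low, high, me = one_proportion_z_interval(224, 400)

# Same p-hat with four times the sample size.
_, _, me_4n = one_proportion_z_interval(896, 1600)

# Same data at a higher confidence level.
_, _, me_99 = one_proportion_z_interval(224, 400, confidence=0.99)

print(f"95% interval: ({low:.3f}, {high:.3f}), margin of error {me:.4f}")
print(f"margin of error with 4n: {me_4n:.4f}")
print(f"margin of error at 99%: {me_99:.4f}")
```

Quadrupling the sample size halves the margin of error, and raising the confidence level widens the interval.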

[More coming ... ]