Stat 202 2019S Course Materials

Lab 1: Due in class January 14, 2019. Choose a topic from the list below, or choose your own topic. Try to find data on that topic. Try "Google Dataset Search" and Kaggle.com. If you can't find data on your topic, choose another topic. Then answer the following questions:

  • What is your topic?
  • How did you find data?
  • Is the data structured or unstructured?
  • What are the cases?
  • What are some of the variables (there may be too many to list)?
  • Is each of these variables quantitative, ordinal categorical, nominal categorical, binary, or identifier? You may find a "codebook" with the data set that will help you answer this question and the ones above.

Suggested topics for data search (really, whatever interests you): sports (of various kinds; there is a lot of good free data on baseball), entertainment, movies (again, good data), law, criminology, government, city planning, architecture, weather, climate, geology, seismology, medicine, epidemiology, health, fitness, biology, evolution, extinction, ecology, math, computer science, statistics, data science, anthropology, ethnic studies, gender studies, history, sociology, culture, tourism, archeology, art, literature, writing, journalism, census, linguistics, finance, economics, business, astronomy, physics, chemistry, library sciences, theology, anything else you can think of.

Lab 2: Describe the distributions of the cut, carat, and price variables of the diamonds3K dataset, a sample from a larger dataset that is described in the codebook for the full diamonds dataset. Please see my instructor's answer.
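
If you want to check your Lab 2 summaries outside of StatCrunch, here is a minimal sketch in Python (standard library only). It is not the instructor's answer. The rows below are invented for illustration and are not taken from diamonds3K, and the helper names describe_categorical and describe_quantitative are hypothetical. The idea: a categorical variable such as cut is described by counts and proportions, while quantitative variables such as carat and price are described by center, spread, and a five-number summary.

```python
from collections import Counter
import statistics

# ASSUMPTION: made-up rows standing in for diamonds3K; not real data.
rows = [
    {"cut": "Ideal", "carat": 0.3, "price": 500},
    {"cut": "Ideal", "carat": 0.4, "price": 700},
    {"cut": "Premium", "carat": 0.5, "price": 1200},
    {"cut": "Ideal", "carat": 0.7, "price": 2000},
    {"cut": "Very Good", "carat": 0.9, "price": 3500},
    {"cut": "Premium", "carat": 1.0, "price": 4500},
    {"cut": "Very Good", "carat": 1.2, "price": 6000},
    {"cut": "Good", "carat": 1.5, "price": 9000},
]


def describe_categorical(values):
    """Return {level: (count, proportion)} for a categorical variable."""
    counts = Counter(values)
    total = len(values)
    return {level: (n, n / total) for level, n in counts.most_common()}


def describe_quantitative(values):
    """Return sample size, five-number summary, mean, and standard deviation."""
    # quantiles(n=4) returns the three quartile cut points (default method).
    q1, _, q3 = statistics.quantiles(values, n=4)
    return {
        "n": len(values),
        "min": min(values),
        "q1": q1,
        "median": statistics.median(values),
        "q3": q3,
        "max": max(values),
        "mean": statistics.mean(values),
        "sd": statistics.stdev(values),  # sample standard deviation
    }


cut_summary = describe_categorical([r["cut"] for r in rows])
carat_summary = describe_quantitative([r["carat"] for r in rows])
price_summary = describe_quantitative([r["price"] for r in rows])

print(cut_summary)
print(carat_summary)
print(price_summary)
```

Comparing the mean with the median gives a quick hint about skew: when the mean is well above the median, the distribution is likely skewed to the right. A histogram is still the better way to judge shape.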

Lab 3: Regression Lab in StatCrunch, click here.