Stat 202 2019S Course Materials
Lab 1: Due in class January 14, 2019. Choose a topic from the list below, or choose your own. Try to find data on that topic using Google Dataset Search and Kaggle.com. If you can't find data on your topic, choose another topic. Then answer the following questions:
- What is your topic?
- How did you find data?
- Is the data structured or unstructured?
- What are the cases?
- What are some of the variables (there may be too many to list)?
- Is each of these variables quantitative, ordinal categorical, nominal categorical, binary, or identifier? You may find a "codebook" with the data set that will help you answer this question and the ones above.
Suggested topics for data search (or whatever interests you): sports (of various kinds; there is a lot of good free data on baseball), entertainment, movies (again, good data), law, criminology, government, city planning, architecture, weather, climate, geology, seismology, medicine, epidemiology, health, fitness, biology, evolution, extinction, ecology, math, computer science, statistics, data science, anthropology, ethnic studies, gender studies, history, sociology, culture, tourism, archeology, art, literature, writing, journalism, census, linguistics, finance, economics, business, astronomy, physics, chemistry, library sciences, theology, anything else you can think of.
To load StatCrunch outside of Pearson: http://statcrunch.american.edu/
More Data Sets for Activities:
- Small Diamonds Data Set (3000 diamonds sampled from full Diamonds Data Set): diamonds3K.
- Codebook for full Diamonds Data Set.
- Cars Miles Per Gallon Data Set: mpg.
- Codebook for Cars Miles Per Gallon Data Set.
- Simulated Exam Scores (rounded N(70,10)): egexam.
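For anyone curious how scores like "rounded N(70,10)" could be produced, here is a minimal Python sketch. It assumes the 10 is the standard deviation (the notation does not say whether it is the standard deviation or the variance), and the function name, sample size, and seed are made up for illustration. This is not necessarily how egexam was generated.

```python
import random
import statistics


def simulate_exam_scores(n, mean=70.0, sd=10.0, seed=None):
    """Draw n values from a normal distribution and round each to a whole number.

    Assumption: sd is the standard deviation, so N(70,10) means mean 70, SD 10.
    Scores are not clipped, so a rare draw could fall below 0 or above 100.
    """
    rng = random.Random(seed)  # a private generator makes the draw reproducible
    return [round(rng.gauss(mean, sd)) for _ in range(n)]


# Hypothetical usage: 1000 simulated scores with an arbitrary seed.
scores = simulate_exam_scores(1000, seed=202)
print("mean:", statistics.mean(scores))
print("standard deviation:", statistics.stdev(scores))
```

With a sample this large, the printed mean and standard deviation should land near 70 and 10, though not exactly on them.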
Lab 2: Describe the distributions of the cut, carat, and price variables in the diamonds3K data set, a sample from the larger Diamonds Data Set described in its codebook. Please see the instructor's answer.
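The lab is meant to be done in StatCrunch, but the same kind of description can be sketched in Python. A categorical variable such as cut is summarized with counts and proportions, and a quantitative variable such as carat or price with a five-number summary, mean, and standard deviation. The function names and the handful of rows below are invented placeholders, not values from diamonds3K.

```python
from collections import Counter
import statistics


def describe_categorical(values):
    """Return {level: (count, proportion)} for a categorical variable."""
    counts = Counter(values)
    total = len(values)
    return {level: (count, count / total) for level, count in counts.items()}


def describe_quantitative(values):
    """Return the five-number summary, mean, and SD of a quantitative variable."""
    q1, median, q3 = statistics.quantiles(values, n=4)  # quartile cut points
    return {
        "min": min(values),
        "q1": q1,
        "median": median,
        "q3": q3,
        "max": max(values),
        "mean": statistics.mean(values),
        "sd": statistics.stdev(values),
    }


# Made-up placeholder rows for illustration only (NOT from diamonds3K).
rows = [
    {"cut": "Ideal", "carat": 0.30, "price": 500},
    {"cut": "Ideal", "carat": 0.50, "price": 1500},
    {"cut": "Good", "carat": 0.70, "price": 2500},
    {"cut": "Premium", "carat": 1.00, "price": 5000},
    {"cut": "Very Good", "carat": 1.50, "price": 9000},
]

print(describe_categorical([r["cut"] for r in rows]))
print(describe_quantitative([r["carat"] for r in rows]))
print(describe_quantitative([r["price"] for r in rows]))
```

Numbers alone do not describe a distribution. A full answer would also comment on shape (for example, skewness) and any outliers, which are easiest to judge from a bar chart or histogram.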
Anscombe Data Set: anscombe