Difference between revisions of "Course Materials Stat 370 Spring 2017"

From Sean_Carver
(Day 3: January 27, 2017)
(Day 20: Tuesday, April 4, 2017)
 
(33 intermediate revisions by the same user not shown)
Line 1: Line 1:
== Day 1: January 17, 2017 ==
+
== Day 1: Tuesday, January 17, 2017 ==
  
We had introductions and we went over the syllabus.  We talked about our goals and dreams and reasons for taking this course.  We also talked about our experience with R and with programming in general.
+
We had introductions and we went over the syllabus.  We talked about our goals and dreams and reasons for taking this course.  We also talked about our experience with R and with programming in general.  
  
== Day 2: January 24, 2017 ==
+
Remember '''No class next Friday (Inauguration Day)'''.
  
We talked about getting and installing R and RStudio.  We discussed some of the panels of using RStudio including the editor and the console.  I showed you dynamic documents with R-Markdown.  We wrote an R Script, and a dynamic document. We talked about Markov chains. I presented several project ideas concerning Markov chains, including applications to baseball, and neuroscience.
+
== Day 2: Tuesday, January 24, 2017 ==
  
== Day 3: January 27, 2017 ==
+
We talked about getting and installing R and RStudio.  We discussed some of the panels within RStudio including the editor and the console.  I showed you dynamic documents with R-Markdown.  We wrote an R Script, and a dynamic document. We talked about Markov chains. I presented several project ideas concerning Markov chains, including applications to baseball, and neuroscience.
  
I started by discussing the project idea, described below.  You are not required to follow this suggestion.  Some students are already doing something else, and that is fine. Synergies with other projects, outside of this class, are encouraged.
+
== Day 3: Friday, January 27, 2017 ==
  
* Choose a data-rich field.  A few ideas: weather, social media, baseball, climate (although access to government data on this subject seems uncertain, at the moment).
+
I started by discussing a [[Stat_370_Project_Idea|default project idea (click here)]].  You are not required to follow this suggestion.  Some students are already doing something else, and that is fine. Synergies with other projects, outside of this class, are encouraged.
* Write code to access the data via the web automatically (thus, your program will update the data, as new data arrive).
 
* Write code to analyze and display the data in interesting ways.
 
* Tie everything together into a dynamic document.
 
* Do something more (ideas below) ...
 
  
The reason to "do something more" is that I am planning to do most of the above in class.  Specifically,
+
After I discussed this project idea, we talked about calling and writing functions, and passing arguments to functions.  We discussed required/positional arguments and optional/named arguments.  The function getwd() was an example of one with no arguments.  We talked about getting help...from Google and from the console with the ? and ?? commands.  E.g. type "?getwd" for help on getwd() or "??topic" if you don't know the exact function name.  We talked about operators such as + - * / and ^.  We talked about variables, though we have more to discuss here.  We talked about sampling from the vector c("HomeRun", "Single", "Double", "Triple", "Out").
  
* We will spend parts of a few class periods devoted to brainstorming and discussing ideas of where to get data for this type of project.
+
== Day 4: Tuesday, January 31, 2017 ==
* In class or at home, we will investigate what needs to be done to access the data.
 
* I imagine that for some data sets, even accessing the data in a form useful for the dynamic document you envision will be a project in and of itself.  This could be considered (part of) the "do something more."
 
* Part of at least one class period will be devoted to learning how to access data via the web, and discussing what others have come up with.  There is some flexibility here in terms of how much we cover.
 
* We will go over analyzing, and displaying data and dynamic documents.
 
  
Some other ideas for "do something more".
+
We did an introduction to Git and GitBash today.  In GitBash we learned the commands "pwd", "ls", and "cd", plus three commands specific to git, "git init", "git add [file]", and "git commit -am "[message]"".  The brackets [...] and what is inside are replaced by what is indicated.
  
* Web Scraping (getting data from websites with an R program), e.g. how many job postings on the Indeed job board have the word "RStudio" or "R" in them?
+
We did a lecture on Markov chains. We presented an example of a traveler randomly selecting which city to visit next depending on what city they are currently in. We discussed transition probabilities and the transition probability matrix.  A matrix is a two-dimensional table of numbers.  The entries have a row number and a column number.  The rows indicate which city you travel from and the columns indicate which city you travel to.  In Markov chains the nodes are called "states".  In the traveler example, these are the cities. In baseball, the states include the information about which bases have runners and the number of outs.
* Displaying your dynamic document on the web, and updating automatically every day.  (This is probably easy.)
 
* Interactive Dynamic Documents (on the website select parameters like bin width for histogram, etc.)
 
* Displaying geographic data with maps, etc.
 
* Text mining and sentiment analysis (there are R packages for this).  E.g. what percentage of tweets with the word "Trump" display a fearful emotion?
 
* More?
 
  
After I discussed this project idea, we talked about calling and writing functions, and passing arguments to functions.  We discussed required/positional arguments and optional/named arguments.  The function getwd() was an example of one with no arguments.  We talked about getting help from Google and from the console with the ? and ?? commands.  E.g. type "?getwd" for help on getwd() or "??topic" if you don't know the exact function name.
+
Please read the first 5 pages of the "Modeling Baseball Using a Markov Chain" handout.
 +
 
 +
== Day 5: Friday, February 3, 2017 ==
 +
 
 +
Today we discussed variables and types of variables.  The atomic types were logical, integer, numeric (double), character, complex and raw.  We made vectors and lists and matrices, and arrays, and data frames with these atomic types.  We talked about coercion of variables from one type to another.  For instance TRUE + TRUE + FALSE is coerced to 2.  We also talked about variables whose values were functions.  We then moved to work on Git.  We created a GitHub account and forked Max Albert's Baseball_R repository that includes the code for our textbook.  We created a new branch in our repository, made a change, then tried to push the branch to our forked repository.  Pushing the branch didn't work.
 +
 
 +
== Day 6: Tuesday, February 7, 2017 ==
 +
 
 +
Today we discussed factors, custom classes, and data frames.  I showed you the default datasets in R.  We spent a little more time on GitHub, including figuring out how to create new branches.  Then I discussed model selection, and passed out a printout of my code called KLI in R for estimating the number of samples needed for model selection with confidence.
 +
 
 +
== Day 7: Friday, February 10, 2017 ==
 +
 
 +
Today we have a homework assignment: [[Media:Stat370_2017S_HW01.pdf|Homework 01]].  Most people got through the script part of the assignment and were preparing to work on the dynamic documents part.
 +
 
 +
== Day 8: Tuesday, February 14, 2017 ==
 +
 
 +
We worked on the dynamic documents part of the assignment from last class: [[Media:Stat370_2017S_HW01.pdf|Homework 01]].
 +
 
 +
== Day 9: Friday, February 17, 2017 ==
 +
 
 +
We did an exercise on spatial data in class today.
 +
 
 +
== Day 10: Tuesday, February 21, 2017 ==
 +
 
 +
There is a new homework for today's class:  [[Media:STAT370 2017S HW02.pdf|Homework 02]].
 +
 
 +
We are going to tie both homeworks together on a website.  This website will not be accessible via the web but will be accessible via a browser on your local computer.  The code will of course be in GitHub.  On the About page for the website, please list your goals, dreams, and what you want to get out of this class.
 +
 
 +
See
 +
 
 +
http://rmarkdown.rstudio.com/rmarkdown_websites.html
 +
 
 +
== Day 11: Friday, February 24, 2017 ==
 +
 
 +
We worked on Homework 02 in class today. Those who were finished worked on projects.  Homework 02 had real challenges.  Good going if you were able to complete it!
 +
 
 +
== Day 12: Tuesday, February 28, 2017 ==
 +
 
 +
We now have a server!  Check it out: it is hosted at http://stat370.com.  Today, we set up user accounts on the server and worked on Homework 03.  The homework involved first putting RStudio's template for R-Markdown on the server and making it accessible to the web, and then updating the server with a completed version of Homework 03, an exercise with loops and conditionals based on the truncated Normal distribution.
 +
 
 +
== Day 13: Friday, March 3, 2017 ==
 +
 
 +
We spent more time working on homework and getting our previous work on the server.
 +
 
 +
== Day 14: Tuesday, March 7, 2017 ==
 +
 
 +
We started the day with a survey.  The results are posted here: [[Stat_370_Survey_Results|Survey Results]].  Then we had a lecture on loops and conditionals and we did some exercises related to Homework 03.
 +
 
 +
== Day 15: Friday, March 10, 2017 ==
 +
 
 +
Most people were absent today (Spring Break starts now).  We discussed the second half of the course and I worked individually with the students present.
 +
 
 +
== Day 16: Tuesday, March 21, 2017 ==
 +
 
 +
We studied distributions (e.g. rnorm(), qnorm(), pnorm(), and dnorm()) and t.test().  I also went around the room and talked to people about their projects.
 +
 
 +
== Day 17: Friday, March 24, 2017 ==
 +
 
 +
We studied prop.test(), and correlation, and regression.  Afterwards, I continued to go around the room and talked to people about their projects while people did this tutorial: https://ww2.coastal.edu/kingw/statistics/R-tutorials/simplelinear.html
 +
 
 +
== Day 18: Tuesday, March 28, 2017 ==
 +
 
 +
We were going to do pull requests today; however, I discovered that very few people had gotten their homework and projects on the server, so I touched base with everyone to see where people are concerning getting their work on the server.
 +
 
 +
== Day 19: Friday, March 31, 2017 ==
 +
 
 +
Today I passed out a handout concerning plotting and graphing with R.  We worked through the handout as I continued to go around the room and touched base with everyone.
 +
 
 +
== Day 20: Tuesday, April 4, 2017 ==
 +
 
 +
I have a handout concerning Shiny. I also have simplified directions for working with the server available at http://stat370.com/wiki/index.php/Instructions_for_using_server

Latest revision as of 15:13, 4 April 2017

Day 1: Tuesday, January 17, 2017

We had introductions and we went over the syllabus. We talked about our goals and dreams and reasons for taking this course. We also talked about our experience with R and with programming in general.

Remember: no class next Friday (Inauguration Day).

Day 2: Tuesday, January 24, 2017

We talked about getting and installing R and RStudio. We discussed some of the panels within RStudio, including the editor and the console. I showed you dynamic documents with R-Markdown. We wrote an R script and a dynamic document. We talked about Markov chains. I presented several project ideas concerning Markov chains, including applications to baseball and neuroscience.

Day 3: Friday, January 27, 2017

I started by discussing a default project idea (click here). You are not required to follow this suggestion. Some students are already doing something else, and that is fine. Synergies with other projects, outside of this class, are encouraged.

After I discussed this project idea, we talked about calling and writing functions, and passing arguments to functions. We discussed required/positional arguments and optional/named arguments. The function getwd() was an example of one with no arguments. We talked about getting help...from Google and from the console with the ? and ?? commands. E.g. type "?getwd" for help on getwd() or "??topic" if you don't know the exact function name. We talked about operators such as + - * / and ^. We talked about variables, though we have more to discuss here. We talked about sampling from the vector c("HomeRun", "Single", "Double", "Triple", "Out").
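A minimal sketch of the ideas from this session: calling a function with no arguments, matching arguments by position and by name, writing a function with an optional argument, and sampling from the outcome vector. The function name simulate_at_bats and the probability weights are made up for illustration; they are not from class.

```r
# Outcomes of a plate appearance, as in the vector from class.
outcomes <- c("HomeRun", "Single", "Double", "Triple", "Out")

# A function called with no arguments: the current working directory.
wd <- getwd()

# Arguments can be matched by position (outcomes, 10) or by name (replace = TRUE).
set.seed(370)  # arbitrary seed so the draws are reproducible
draws <- sample(outcomes, 10, replace = TRUE)
draws

# Writing our own function: n is required, probs is optional.
# When probs is NULL, sample() treats all outcomes as equally likely.
simulate_at_bats <- function(n, probs = NULL) {
  sample(outcomes, size = n, replace = TRUE, prob = probs)
}

simulate_at_bats(5)
simulate_at_bats(5, probs = c(0.03, 0.15, 0.05, 0.01, 0.76))  # made-up weights

# Operators: ^ binds tighter than *, which binds tighter than +.
2 + 3 * 4 ^ 2  # 50
```

Typing ?sample at the console shows the full argument list, including the defaults used above.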

Day 4: Tuesday, January 31, 2017

We did an introduction to Git and GitBash today. In GitBash we learned the commands "pwd", "ls", and "cd", plus three commands specific to git, "git init", "git add [file]", and "git commit -am "[message]"". The brackets [...] and what is inside are replaced by what is indicated.

We did a lecture on Markov chains. We presented an example of a traveler randomly selecting which city to visit next depending on what city they are currently in. We discussed transition probabilities and the transition probability matrix. A matrix is a two-dimensional table of numbers. The entries have a row number and a column number. The rows indicate which city you travel from and the columns indicate which city you travel to. In Markov chains the nodes are called "states". In the traveler example, these are the cities. In baseball, the states include the information about which bases have runners and the number of outs.
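The traveler example can be sketched in R as follows. The three city labels and all the probabilities are placeholders chosen for illustration, not the numbers used in the lecture, and simulate_trip is a hypothetical helper name.

```r
# Placeholder states (cities) and a made-up transition probability matrix.
cities <- c("A", "B", "C")
P <- matrix(c(0.0, 0.5, 0.5,
              0.3, 0.0, 0.7,
              0.6, 0.4, 0.0),
            nrow = 3, byrow = TRUE,
            dimnames = list(from = cities, to = cities))

# Each row is a probability distribution over the next city, so rows sum to 1.
stopifnot(all(abs(rowSums(P) - 1) < 1e-12))

# Simulate a trip: the next city depends only on the current city.
simulate_trip <- function(P, start, n_steps) {
  states <- rownames(P)
  path <- character(n_steps + 1)
  path[1] <- start
  for (i in seq_len(n_steps)) {
    # Row path[i] of P holds the probabilities of each destination.
    path[i + 1] <- sample(states, size = 1, prob = P[path[i], ])
  }
  path
}

set.seed(370)  # arbitrary seed for reproducibility
trip <- simulate_trip(P, start = "A", n_steps = 10)
trip
```

Because the diagonal of this particular matrix is zero, the simulated traveler never stays in the same city twice in a row; a matrix with nonzero diagonal entries would allow that.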

Please read the first 5 pages of the "Modeling Baseball Using a Markov Chain" handout.

Day 5: Friday, February 3, 2017

Today we discussed variables and types of variables. The atomic types were logical, integer, numeric (double), character, complex and raw. We made vectors, lists, matrices, arrays, and data frames with these atomic types. We talked about coercion of variables from one type to another. For instance, TRUE + TRUE + FALSE evaluates to 2, because the logical values are coerced to integers. We also talked about variables whose values were functions. We then moved to work on Git. We created a GitHub account and forked Max Albert's Baseball_R repository that includes the code for our textbook. We created a new branch in our repository, made a change, then tried to push the branch to our forked repository. Pushing the branch didn't work.
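A short sketch of the types and coercion rules mentioned above, using only base R. The variable names are arbitrary.

```r
# The six atomic types, inspected with typeof().
typeof(TRUE)       # "logical"
typeof(1L)         # "integer"
typeof(1.5)        # "double"
typeof("a")        # "character"
typeof(1 + 2i)     # "complex"
typeof(as.raw(1))  # "raw"

# Coercion: logicals become integers in arithmetic.
total <- TRUE + TRUE + FALSE
total          # 2
typeof(total)  # "integer"

# Mixing types in c() coerces everything to the most flexible type.
mixed <- c(1, "a", TRUE)
typeof(mixed)  # "character"

# Containers built from atomic types.
v  <- c(1.5, 2, 3)                    # vector
l  <- list(1, "a", TRUE)              # list: elements keep their own types
m  <- matrix(1:6, nrow = 2)           # 2 x 3 matrix
a  <- array(1:24, dim = c(2, 3, 4))   # 3-dimensional array
df <- data.frame(x = 1:3, y = c("a", "b", "c"))

# A variable whose value is a function.
f <- mean
f(v)
```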

Day 6: Tuesday, February 7, 2017

Today we discussed factors, custom classes, and data frames. I showed you the default datasets in R. We spent a little more time on GitHub, including figuring out how to create new branches. Then I discussed model selection, and passed out a printout of my code called KLI in R for estimating the number of samples needed for model selection with confidence.
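As a rough illustration of factors, custom (S3) classes, and data frames, here is a sketch using base R and the built-in mtcars dataset. The "player" class, its fields, and the example values are invented for this sketch; the in-class examples may have differed.

```r
# A factor stores categories as integer codes plus a set of levels.
hits <- factor(c("Single", "Out", "Out", "Double"),
               levels = c("Out", "Single", "Double", "Triple", "HomeRun"))
levels(hits)
table(hits)                # counts per level, including levels with zero count
codes <- as.integer(hits)  # underlying codes: 2 1 1 3

# A custom S3 class: a list with a class attribute and a print method.
player <- structure(list(name = "Example Player", at_bats = 10, hits = 3),
                    class = "player")
print.player <- function(x, ...) {
  cat(x$name, "batting average:", format(x$hits / x$at_bats, nsmall = 3), "\n")
  invisible(x)
}
print(player)

# Data frames: one of the default datasets that ship with R.
head(mtcars)
str(mtcars)

# Select rows by a condition and columns by name.
four_cyl <- mtcars[mtcars$cyl == 4, c("mpg", "wt")]
summary(four_cyl$mpg)
```

Running data() at the console lists the other datasets available by default.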

Day 7: Friday, February 10, 2017

Today we have a homework assignment: Homework 01. Most people got through the script part of the assignment and were preparing to work on the dynamic documents part.

Day 8: Tuesday, February 14, 2017

We worked on the dynamic documents part of the assignment from last class: Homework 01.

Day 9: Friday, February 17, 2017

We did an exercise on spatial data in class today.

Day 10: Tuesday, February 21, 2017

There is a new homework for today's class: Homework 02.

We are going to tie both homeworks together on a website. This website will not be accessible via the web but will be accessible via a browser on your local computer. The code will of course be in GitHub. On the About page for the website, please list your goals, dreams, and what you want to get out of this class.

See

http://rmarkdown.rstudio.com/rmarkdown_websites.html

Day 11: Friday, February 24, 2017

We worked on Homework 02 in class today. Those who were finished worked on projects. Homework 02 had real challenges. Good going if you were able to complete it!

Day 12: Tuesday, February 28, 2017

We now have a server! Check it out: it is hosted at http://stat370.com. Today, we set up user accounts on the server and worked on Homework 03. The homework involved first putting RStudio's template for R-Markdown on the server and making it accessible to the web, and then updating the server with a completed version of Homework 03, an exercise with loops and conditionals based on the truncated Normal distribution.

Day 13: Friday, March 3, 2017

We spent more time working on homework and getting our previous work on the server.

Day 14: Tuesday, March 7, 2017

We started the day with a survey. The results are posted here: Survey Results. Then we had a lecture on loops and conditionals and we did some exercises related to Homework 03.
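Homework 03 itself is not reproduced here, but one way loops and conditionals combine with the truncated Normal distribution is rejection sampling: draw from the ordinary Normal and keep only the draws that fall inside the bounds. The sketch below assumes that approach; the function name rtruncnorm_loop and the bounds are hypothetical.

```r
# Draw n values from a Normal(mean, sd) truncated to [lower, upper],
# using a while loop and an if statement (rejection sampling).
rtruncnorm_loop <- function(n, lower, upper, mean = 0, sd = 1) {
  out <- numeric(n)
  accepted <- 0
  while (accepted < n) {
    candidate <- rnorm(1, mean = mean, sd = sd)
    # Keep the candidate only if it lies inside the truncation bounds.
    if (candidate >= lower && candidate <= upper) {
      accepted <- accepted + 1
      out[accepted] <- candidate
    }
  }
  out
}

set.seed(370)  # arbitrary seed for reproducibility
x <- rtruncnorm_loop(1000, lower = -1, upper = 2)
range(x)  # every value lies between -1 and 2
```

This version is written for clarity, not speed. It becomes slow when the interval captures very little of the Normal's probability, because most candidates are rejected.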

Day 15: Friday, March 10, 2017

Most people were absent today (Spring Break starts now). We discussed the second half of the course and I worked individually with the students present.

Day 16: Tuesday, March 21, 2017

We studied distributions (e.g. rnorm(), qnorm(), pnorm(), and dnorm()) and t.test(). I also went around the room and talked to people about their projects.
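A brief sketch of the four Normal-distribution functions and t.test(). The sample size, mean, and seed below are arbitrary choices for illustration.

```r
# d = density, p = cumulative probability, q = quantile, r = random draws.
dnorm(0)      # density of the standard Normal at 0, equal to 1 / sqrt(2 * pi)
pnorm(1.96)   # P(Z <= 1.96), roughly 0.975
qnorm(0.975)  # the inverse of pnorm, roughly 1.96

set.seed(370)  # arbitrary seed for reproducibility
x <- rnorm(30, mean = 0.5, sd = 1)  # 30 simulated observations

# One-sample t-test of the null hypothesis that the true mean is 0.
result <- t.test(x, mu = 0)
result$statistic  # t statistic
result$p.value    # p-value
result$conf.int   # 95% confidence interval for the mean
```

The same d/p/q/r naming pattern applies to other distributions in base R, for example dbinom(), pbinom(), qbinom(), and rbinom().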

Day 17: Friday, March 24, 2017

We studied prop.test(), and correlation, and regression. Afterwards, I continued to go around the room and talked to people about their projects while people did this tutorial: https://ww2.coastal.edu/kingw/statistics/R-tutorials/simplelinear.html
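A small sketch of prop.test(), correlation, and simple linear regression. The counts passed to prop.test() are invented, and the built-in mtcars dataset stands in for whatever data was used in class or in the linked tutorial.

```r
# Test whether a proportion differs from 0.5: 58 successes in 100 trials (made-up counts).
pt <- prop.test(x = 58, n = 100, p = 0.5)
pt$estimate  # sample proportion
pt$p.value

# Correlation between car weight and fuel economy in mtcars.
r <- cor(mtcars$wt, mtcars$mpg)
r  # negative: heavier cars tend to get fewer miles per gallon

# Simple linear regression of mpg on weight.
fit <- lm(mpg ~ wt, data = mtcars)
coef(fit)     # intercept and slope
summary(fit)  # standard errors, t-tests, R-squared

# With a single predictor, R-squared equals the squared correlation.
r_squared <- summary(fit)$r.squared
all.equal(r_squared, r^2)
```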

Day 18: Tuesday, March 28, 2017

We were going to do pull requests today; however, I discovered that very few people had gotten their homework and projects on the server, so I touched base with everyone to see where people are concerning getting their work on the server.

Day 19: Friday, March 31, 2017

Today I passed out a handout concerning plotting and graphing with R. We worked through the handout as I continued to go around the room and touched base with everyone.

Day 20: Tuesday, April 4, 2017

I have a handout concerning Shiny. I also have simplified directions for working with the server available at http://stat370.com/wiki/index.php/Instructions_for_using_server