Syllabus: Stat 370 Spring 2017

From Sean_Carver
Revision as of 21:00, 16 January 2017 by Carver (talk | contribs)

Introduction to Statistical Computing and Modeling (STAT 370) Spring 2017 Section 001 [UNDER CONSTRUCTION]

Course Materials (click here)

Instructor: Sean Carver, Ph.D., Professorial Lecturer, American University.

Contact:

  • office location: 107 Gray Hall
  • email: carver@american.edu
  • office phone: 202-885-6629

Course Description (from department website): The basics of programming using the open source statistical program R. Data analysis, both numerical and qualitative, including graphical and formal inference. Applications include numerical methods, text mining, modeling, and simulation. Usually offered every spring.

Prerequisite: MATH-221 and STAT-202 or STAT-203, or permission of instructor.

Text: The Book of R, by Tilman M. Davies, No Starch Press, 2016.

Optional Text: Analyzing Baseball Data with R, by Max Marchi and Jim Albert, CRC Press, 2013. This is a great book if you like baseball, data science, or especially both. We are going to use it to learn how to simulate a baseball game with a Markov chain. I may be able to legally provide the relevant chapter of the book, if you want to save money. Don't like baseball, or don't know the rules? Don't worry, neither do I, but this example is fantastic from a pedagogical perspective, and I think you will agree, regardless of your interest in or knowledge of the sport.

Software: Please install the following software on the machines you intend to use for this class: R, RStudio, Git, XQuartz (Mac), Git Bash (Windows). You are welcome to use a lab computer or your laptop during class.

Learning Outcomes: Students will be able to

  • Submit work as a dynamic document via GitHub, and learn some of its tools for collaboration.
  • Use R as a powerful calculator.
  • Write basic programs using control and data structures.
  • Import data from external sources.
  • Perform analyses using regression, text mining, and simulation.
  • Use SQL databases to retrieve data with specified features.

Office Hours: Students are strongly encouraged to come to office hours if they need or want help.

My office is Gray Hall, Room 107. Office hours are TENTATIVELY scheduled as follows (and may be adjusted throughout the semester):

  • Tuesday, Wednesday, Friday: 4:00 PM TO 6:00 PM.

NOTE: If you would like to come to office hours on a regular or irregular basis and you have a compelling reason why you cannot make it during the hours listed above, please send me an email. I cannot guarantee that I will be able to find a time that works (this semester will be a very busy one for me), but I will try.

Class times and locations:

  • Tuesday, Friday 11:20 AM TO 12:35 PM, ANDERSON B-13

Important Dates:

  • January 17 (Tuesday): First day of class.
  • January 20 (Friday): Inauguration Day, no class.
  • January 24 (Tuesday): Initial Project Brainstorm (come with ideas).
  • February 14 (Tuesday): Project Proposals due.
  • March 12 - 19: Spring Break, no class.
  • March 21 (Tuesday): Project Updates due.
  • March 31 (Friday): Midterm Exam.
  • April 28 (Friday): Last day of classes and final projects due.
  • May 12 (Friday): Grades due to registrar, no final exam.

Projects: In my experience, there is no better way to learn to code than to engage with a project that you feel passionate about. For the first month of class, we are going to spend some class time finding and defining projects that meet the course objectives and that inspire this kind of passion in us. I anticipate that some of these projects may involve a lot of work for one person, which is why I am going to teach some of the tools that the open-source software community uses to facilitate collaboration. Collaboration will be accomplished through the cloud-based service GitHub and a local program called Git. As a pedagogical exercise, you will use these tools to collaborate on writing a children's story in R Markdown (available within R). After the exercise you will not be compelled to collaborate, but the option will be there.

Do group projects give you nightmares? The open-source community has figured out paradigms for successful collaboration, although these paradigms are not widely used outside of the coding community because they are not especially simple to learn. These tools make attribution for work very transparent. This transparency will make it possible for students to get credit for contributions to several projects, not just one. If you choose to divide your effort among more than one project, you will be graded on the body of your work. If you start a project, it will be up to you whether you want to allow and invite others to contribute.

Project milestones will be a proposal (February 14), a project update (March 21), and a final submission (April 28, the last day of class). Each project will have a cloud-based repository. GitHub is best for collaboration, but it charges for private repositories (needed if you don't want the world to see your work); GitLab doesn't charge for private repositories. Either way, you will make it possible for me to pull the most recent version of the project onto my computer for grading. I will do this at a specified date and time, namely when the project is due. For collaborative projects you will also turn in a portfolio of your submissions, which should be easy to generate.

Reading Material: Class time will be used most effectively if you have read the relevant section of the book ahead of time. Please be conscientious about the reading. Usually it will be only one chapter; however, for the second class (January 24), please read both chapters 1 and 2. Reading assignments will be announced during the previous class.

Homework: For homework we will use private repositories on GitLab, except for the children's story assignment. For the children's story, each student will start their own GitHub repository for their own story, and invite others to collaborate. There will be work to do most classes. You can commit this work to the cloud whenever you want and update it as you progress. At specified times and dates, I will pull your work from the cloud onto my computer to grade it. The specified time for the pull will be when the work is due. I won't see any changes you commit after the pull time. I expect most homework sets will come from the required textbook, although there is some flexibility, based on class interest, especially vis-a-vis the projects students are engaged in.

Attendance: You are expected to attend class unless there is a compelling reason why you cannot make it. Attendance and participation are worth 10% of your grade. Beyond the 10%, I believe that excellent attendance will be necessary to meet the objectives of this class. However, I understand that there are times when you cannot make it to class for compelling reasons. To accommodate the unavoidable, I will forgive occasional absences for everyone when I compute the final grades. If you need to miss more than a few classes, please see the Dean of Students. Exam day absences must be excused through the Dean of Students. On other days, please send me an email when you can't make it to class, explaining why. If your attendance and/or participation is not acceptable, you will receive an early warning from me through the registrar. This is how you will know you are in danger of losing the credit in this category. If your attendance and/or participation continues to be poor, you might lose all 10% (i.e. get a zero in this category).

Midterm: The midterm project will be a coding exercise, completed in class on March 31. You will have access to your books, notes, and Google, but you will not be able to interact with another live person. Pull time will be the end of class on that day. Midterm examinations will be pulled from the private homework repository.


Tentative grading scheme:

ITEM                            PERCENT
Attendance and Participation    10%
Homework                        35%
Midterm                         20%
Project                         35%
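
As an illustration of how the tentative weights above combine into a single course percentage, here is a minimal sketch in Python (the course itself uses R). The function name weighted_grade and the sample scores are hypothetical, chosen only for this example. The sketch assumes each category is scored on a 0-100 scale, and it does not model forgiven absences or any other adjustment made when final grades are computed.

```python
# Tentative weights from the grading scheme above, as fractions of the final grade.
WEIGHTS = {
    "Attendance and Participation": 0.10,
    "Homework": 0.35,
    "Midterm": 0.20,
    "Project": 0.35,
}


def weighted_grade(scores):
    """Combine per-category scores (each assumed to be 0-100) into one percentage."""
    missing = set(WEIGHTS) - set(scores)
    if missing:
        raise ValueError(f"missing scores for: {sorted(missing)}")
    return sum(WEIGHTS[item] * scores[item] for item in WEIGHTS)


# Hypothetical scores, for illustration only.
example_scores = {
    "Attendance and Participation": 100,
    "Homework": 90,
    "Midterm": 80,
    "Project": 85,
}
print(round(weighted_grade(example_scores), 2))
```

With these made-up scores the weighted sum is 10 + 31.5 + 16 + 29.75 = 87.25.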

Class Etiquette: Please give the class your full attention and refrain from talking, texting, surfing the web, and similar distractions. If it is clear to other students that you are not paying attention, it will be harder for them to pay attention to me. This statement is true in general, but it is especially true if you are talking. It can also be harder for me to give good lectures when it is clear that not everyone is paying attention. Like you, your classmates are paying a lot of money to be here. Have some respect for your fellow students! Otherwise you are negatively impacting their educational experience, which isn't fair to them. If you need to attend to something urgently, it is OK to excuse yourself from the classroom. Please be warned that if people are not following this request, I may reread this statement to the class.

Please participate in class by asking questions when you do not understand something. Invariably other students benefit from these questions. Please engage in discussions, and please engage with the class, generally. I find it easier to give good lectures when students are asking questions, and engaging with the material.

Academic Integrity: Cheating is not acceptable and will not be tolerated. Consider this: in subtle ways, cheating to get a better grade on an exam can result in lowering the grades of some of your classmates. Certainly this is true when a specific curve is used to assign grades. Even when I don't use curves explicitly, they can be implicit in decisions about writing and grading exams. As required by the policy of American University, I will report all suspected cases of cheating to the Dean's office, which will investigate and adjudicate the issues. Cheating is giving or receiving unauthorized assistance on exams, from other students or other people. When inappropriate copying between students is caught, both parties may be culpable. You can get help on homework from other students, but you must write up the work yourself, and the work must reflect your own understanding of the material.

Public Service Announcement: A representative of AU's Students Against Sexual Violence (SASV) approached me and asked me to include on my syllabi a list of resources available for survivors of sexual assault and their friends. While sexual violence is by no means the only challenge faced by students, I agree that this issue merits particular attention, so I am honoring her request by attaching the list she gave me:

Sexual Assault Resources

  • It’s never the survivor’s fault. There are many people you can talk to if you or someone you care about has been sexually assaulted:
  • AU's Sexual Assault Prevention Coordinator Daniel Rappaport (rappapor@american.edu)
  • AU's Coordinator for Victim Advocacy Sara Yzaguirre (sarayza@american.edu)
  • DC SANE Program (Sexual Assault Nurse Examiner) 1-800-641-4028
  • The only hospital in the DC area that gives Physical Evidence Recovery Kits (rape kits) is Medstar Washington Hospital
  • DC Rape Crisis Center: 202-333-7273
  • Students found responsible for sexual misconduct can be sanctioned with penalties that include suspension or expulsion from American University, and they may be subject to criminal charges
  • If you want to submit a formal complaint against someone who has sexually assaulted you, harassed you, or discriminated against you based on your gender identity or sexual orientation, you can do so online at http://www.american.edu/ocl/dos/, or contact the Dean of Students at dos@american.edu or 202-885-3300. These are Title IX violations, and universities are legally required to prohibit these actions.
  • Resources on campus that are required to keep what you tell them confidential are Daniel Rappaport, Sara Yzaguirre, ordained chaplains in Kay, and counselors at the counseling center. (OASIS may also belong here but it didn't exist when this list was created.)