STAT 370 2017S Post Mortem
As STAT 370 draws to a close, it is time for analysis of what went well, and what could be done better.
What went well
- Flexible projects: I believe that the only successful way to learn to code is with projects you are passionate about. I encouraged students to find such a project, and many did. I gave students as much flexibility as possible to pursue their dreams. If students found a topic interesting, I wanted them to pursue it. I did not want to stand in their way. I did, however, steer students toward original projects, not just repeating analyses done by others. I believe that students who felt the passion certainly benefited from it. And if there were any students who did not fall in love with their topic, they probably were not hurt by the flexibility I gave them.
- Reproducible research and dynamic documents: I experimented with these course objectives, and I would definitely keep them a big part of the course if I teach it again.
- Course work posted on a web server: I feel very certain that having a course web server is the right way to go. Having shell access to the server means students can clone and update their projects with simple git shell commands. Students have their own home directory on the server and their own public_html sub-directory where the server looks to publish any work. Students invariably wanted their work published on the web.
- Course wiki: This ties together everything on the web server with links to project pages and GitHub pages.
- GitHub: GitHub was useful for moving files between the server, students' laptops, and the lab computers. Students found the learning curve manageable.
What will be done better next time
- Instructor learning topics on the fly: So much of what I learned I learned on the fly. This worked in many cases, and it was necessary. Being able to learn new things allowed me to adapt to the interests and experiences of the class. But if I teach the class again, I will certainly benefit from the learning curves I traversed this semester.
- GNU Make is too hard for students, even if I write the Makefiles: I wanted to use GNU Make so that students could build their projects on the server and would not need to put large HTML files, which change completely with every update, under version control. This is what you are supposed to do, but it was prohibitively confusing. Using GNU Make also created character set incompatibilities between Macs and the Linux server. It became clear that students should build their pages and sites with GUI tools on their own computers, commit and push the HTML files to GitHub, and pull them onto the server. That said, one student had a very complicated website, with many pages, that took about 10 minutes to build in RStudio every time she made a small change. We wrote a Makefile together that rebuilds only the pages she has changed since the last build. This saved her a lot of time and frustration. While Make isn't a tool for the whole class, it's good to have in your back pocket for these situations.
- Keep shell interaction as simple as possible: Students get confused by the shell and don't seem to understand its power. They prefer GUI tools, even though programming is done with typed commands. As the class proceeded, I discovered simpler and simpler ways of doing the things that need the shell, and GUI tools for the things that don't. These insights should benefit future classes.
- Project filesystem: Students cloned multiple copies of their projects, or had multiple copies of files, or had projects within other projects. This creates the problem of not editing the file you think you are editing, which leads to headaches and to failures that are difficult to trace and difficult to fix. RStudio is poorly designed to prevent these problems: you never see the directory of the file you are editing. I did not adequately anticipate these difficulties. If I had, I would have told students to create one directory holding all of their R projects for the class, with no nested projects. Further, I would tell students to create projects in GitHub, then clone them into that directory.
- Too much instructor troubleshooting and debugging: I spent a lot of time troubleshooting for students. Some students were able to troubleshoot for themselves, even if it took them a lot of time. Others got frustrated and came for help. In the future, I will write a handout about doing your own troubleshooting and debugging. I will tell students that it is their responsibility to get their code to work. I will convey a willingness to help to the extent that I am able, but I have only so much time to help the whole class. I will also point out that troubleshooting and debugging are a huge part of a coder's life, and that learning this skill is one of the objectives of this course. Solving a problem on your own can greatly boost your confidence. Indeed, believing that you can fix the problem is a huge advantage in fixing it.
- Students who want continuous data collection should rent their own servers: Several students' projects involved lots of data (even terabytes) to be collected by our web server from the Twitter API or by crawling the web. After working with this for a while, I realized this was not a good idea. The Twitter API is not intended for use in a multi-user environment, and bad netiquette by one user can lead to all users (including me) being blacklisted. The same applies to web crawling and scraping. Until I figure out how to do this properly (which won't happen before projects are due), students will have to settle for a Plan B. That said, I intend to work with one student over the summer to complete her project in the way she envisioned, with a rented Amazon Web Services server and terabytes of data. My experience helping this student will make this kind of project possible in the future. The student will turn in something else before the end of the semester.
- More instruction on creating graphs: I realized when grading final projects after the class was over that many students needed more instruction on how to produce professional-quality graphs. For example, students need to be taught how to label plot axes effectively.
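The incremental build described in the GNU Make item above comes down to a timestamp comparison: a page is rebuilt only if its output is missing or older than its source. The following is a minimal Python sketch of that rule, not the Makefile we actually wrote. The function name `stale_pages` is hypothetical, and the `.Rmd` and `.html` extensions are assumptions about how the site's source and output files were named.

```python
from pathlib import Path


def stale_pages(src_dir, out_dir, src_ext=".Rmd", out_ext=".html"):
    """Return names of source files whose built page is missing or out of date.

    Hypothetical helper: the extensions are assumed defaults, not taken
    from the actual course project.
    """
    stale = []
    for src in sorted(Path(src_dir).glob(f"*{src_ext}")):
        out = Path(out_dir) / (src.stem + out_ext)
        # Same test Make applies to a target and its prerequisite:
        # rebuild when the target does not exist or is older than the source.
        if not out.exists() or out.stat().st_mtime < src.stat().st_mtime:
            stale.append(src.name)
    return stale
```

Calling `stale_pages` on a project directory would list only the pages edited since the last build, so a small change triggers one rebuild instead of a full 10-minute site build. Make adds dependency tracking across files on top of this, which the sketch leaves out.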
The jury is still out
- Data visualization, Shiny, Plotly: Final projects and all homework are due May 10. We have covered Plotly and Shiny in class, but students haven't done much with these packages. We'll see how they do on this last assignment.
- More graded homework assignments: The handful of homework sets assigned in class covered reproducible research/dynamic documents, working with data in R, and loops and conditionals. Once I decided to post and grade all student homework on the web server, we spent a lot of time sorting out how. We continued to do active learning during class, which introduced students to a variety of topics, some of which were incorporated into student projects. We continued this pattern after settling on how to use the web server. This arrangement worked really well for students who kept busy, consumed with their passion-inspiring projects. I am convinced that for these students, doing less passion-inspiring homework sets would have been a detrimental distraction. How effective is this strategy overall? I concede that some students may have benefited from more graded assignments. If I teach this material again, I will really think through how best to serve the needs of all students. Maybe certain homework will be required, and for the rest students will select several of the most interesting assignments from a list, based on what is relevant to their projects and interests. If I implement this idea, I will require that everyone do all active learning assignments in class, to become familiar with topics in case they need them later.