Thank you for joining us this semester and for your patience and understanding as we have been navigating the online format. It has been great to have you all in the course.
Unfortunately, we will not be able to get our course evaluations until the end of the semester. Students formally enrolled (whether for credit or as auditors) should get an email with a link to the evaluation at that time, and I would really appreciate it if you could take a few minutes to answer the questions.
This is the second time I have offered this course, and it is still under very active development. I’m currently in the process of turning it into a permanent course (NTRES 6100), so I would really value everybody’s feedback on how we can improve it and make it as useful as possible.
Please share your candid thoughts and suggestions in the evaluation. Remember that all comments are completely anonymous and I only get to see them after S/U decisions have been submitted.
If anyone has feedback or suggestions they would like to share right away, we would be happy to hear your thoughts! You can send a private message, email us, or post in the feedback channel on Slack. Remember you can post anonymously by prefacing your message with `/anon` (e.g. `/anon Here is my message`).
Resist the temptation to manually edit or reformat your original file: if your documentation of the changes is imperfect, you may lose important information. Clean up the data in R instead. Your R code, along with appropriate documentation, will be a record of the changes. The code can be modified and rerun, using the raw data file as input, if needed.
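As a minimal sketch of this workflow, the example below reads a raw file, cleans it in code, and writes the result to a separate file. The file contents, column names, and cleaning steps are hypothetical and only meant to illustrate the pattern; temporary files are used so the example runs on its own.

```r
# Hypothetical raw data, written to a temporary file so the sketch is self-contained
raw_path <- tempfile(fileext = ".csv")
writeLines(c(
  "Site ID,Count,Date",
  "A,12,2020-03-01",
  "B,NA,2020-03-02",
  "A,7,2020-03-03"
), raw_path)

# Read the raw file; the file itself is never modified
raw <- read.csv(raw_path, stringsAsFactors = FALSE)

# Every cleaning step is recorded in code and can be rerun
clean <- raw
names(clean) <- c("site_id", "count", "date")  # consistent column names
clean$date <- as.Date(clean$date)              # parse dates from text
clean <- clean[!is.na(clean$count), ]          # drop rows with missing counts

# Save the cleaned data to a new file, leaving the raw file untouched
clean_path <- tempfile(fileext = ".csv")
write.csv(clean, clean_path, row.names = FALSE)
```

If a cleaning decision changes later, only the script needs to be edited and rerun against the same raw file.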
Make sure that your code does not rely on objects or functions defined outside of your script. If it does, it can’t readily be run by you or someone else in the future.
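A small illustration of the difference, using made-up object names: the commented-out line only works if `threshold` happens to exist in the current session, while the version below it defines everything it uses.

```r
# Fragile: `threshold` was typed into the console earlier and is not in the script,
# so this line fails in a fresh R session
# filtered <- counts[counts > threshold]

# Self-contained: every object the script uses is created by the script itself
counts <- c(12, 0, 7, 3)   # hypothetical data
threshold <- 5             # hypothetical cutoff, defined here rather than in the console
filtered <- counts[counts > threshold]
```

Running the script in a freshly restarted R session is a quick way to check that nothing is missing.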
Make sure to frequently restart R as you’re working, as elaborated on by Jenny Bryan here.
Also, if you haven’t already, follow the instructions from r4ds on how to ensure that RStudio does not restore your workspace between sessions, so you start with a clean environment every time. Make sure this option is selected under your RStudio preferences:
Slides from Deep thoughts by Jenny Bryan:
Organizing your work into RStudio projects avoids issues with absolute file paths and makes it easier to keep track of the code used to generate plots and reports, share your code with others, and work on multiple different projects in parallel. Not convinced yet? Check out What they forgot to teach you about R.
Document the big-picture structure both within files (comments) and between files (READMEs). In general, comments (and Git commit messages) should explain the why, not the what (which should be self-evident from well-written code). Can a collaborator or you-in-six-months quickly figure out what’s going on in your code?
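To make the distinction concrete, here is a small sketch with hypothetical survey counts. Both versions run the same code; only the comment differs.

```r
counts <- c(12, 0, 7, 0, 3)  # hypothetical survey counts

# A "what" comment restates the code and adds nothing:
# Remove zeros from counts
nonzero_what <- counts[counts != 0]

# A "why" comment records the reasoning the code cannot show:
# Zeros mark sites that were not surveyed (assumed here for illustration),
# not true absences, so they are excluded before averaging
nonzero_why <- counts[counts != 0]
mean_count <- mean(nonzero_why)
```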
Developing a consistent style in your coding makes it a lot easier to read. Here is some inspiration:
Very important advice from Vince Buffalo
Reshape your data into tidy format for analysis. That way, you can take advantage of the powerful set of tools available in the tidyverse and beyond instead of having to invent your own roundabout approaches, and this will make your code more robust, concise, and readable.
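As a rough sketch of what reshaping into tidy format can look like, the example below uses `pivot_longer()` from the tidyr package on a made-up wide table with one column per year. The table and its column names are assumptions for illustration.

```r
library(tidyr)

# Hypothetical wide table: one row per site, one column per year
wide <- data.frame(
  site = c("A", "B"),
  `2019` = c(10, 4),
  `2020` = c(12, 6),
  check.names = FALSE  # keep the year column names as-is
)

# Tidy format: one row per site-year observation, one column per variable
long <- pivot_longer(
  wide,
  cols = -site,         # reshape every column except `site`
  names_to = "year",
  values_to = "count"
)
long$year <- as.integer(long$year)  # year names come through as text
```

With the data in this shape, grouping, filtering, and plotting by year no longer require looping over columns.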
Some ways to make code more human-readable include referring to columns by name rather than by position, e.g. `select(data, columnname)` instead of `data[,5]`.
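The sketch below, built on a made-up data frame, shows one reason the named version tends to be more robust: when a column is added upstream, the positional index silently points at a different column, while the name still returns the intended one.

```r
library(dplyr)

# Hypothetical data; `count` happens to be the fifth column
data <- data.frame(
  id = 1:3,
  site = c("A", "B", "C"),
  year = 2020,
  depth = c(5, 8, 2),
  count = c(12, 0, 7)
)

by_position <- data[, 5]        # the reader has to look up what column 5 is
by_name <- select(data, count)  # the intent is clear from the code

# A new first column shifts every position by one
data2 <- data.frame(observer = "x", data)

shifted <- data2[, 5]                  # now returns `depth`, with no warning
still_count <- select(data2, count)    # still returns `count`
```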
Here are a few other books you might want to check out:
And for getting help, check out the slide deck or recorded talk for Jenny Bryan’s talk “Object of type ‘closure’ is not subsettable” at the 2020 RStudio conference. You can also check out her Reprex webinar.
Dashboards:
Check out the dashboard developed for the continually updated Coronavirus dataset we worked with earlier in the course. Note that you can grab all the code under Source code in the top right corner.
Dashboards can also be made interactive with dynamic and user-controlled displays of data through use of Shiny, an R package that makes it easy to build interactive web apps straight from R.
For an example, check out this dynamic visualization of the gapminder dataset we worked with in our last class (make sure to check out the cool video with Hans Rosling).
Here is a much simpler Shiny app with interactive display of the same data.
For more examples, check out the RStudio Flexdashboard website. And check out Mastering Shiny by Hadley Wickham.
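To give a rough idea of what a Shiny app looks like in code, here is a minimal sketch. It is not the code behind any of the apps linked above; the data frame and the input and output names (`site`, `trend`) are made up for illustration.

```r
library(shiny)

# Hypothetical data: yearly counts at two sites
counts <- data.frame(
  year = rep(2018:2020, each = 2),
  site = rep(c("A", "B"), times = 3),
  count = c(10, 4, 12, 6, 9, 8)
)

# User interface: a dropdown to pick a site and a placeholder for the plot
ui <- fluidPage(
  selectInput("site", "Site", choices = unique(counts$site)),
  plotOutput("trend")
)

# Server logic: the plot is redrawn whenever the selected site changes
server <- function(input, output, session) {
  output$trend <- renderPlot({
    site_data <- counts[counts$site == input$site, ]
    plot(site_data$year, site_data$count, type = "b",
         xlab = "Year", ylab = "Count", main = paste("Site", input$site))
  })
}

# Build the app object; in an interactive session, launch it with runApp(app)
app <- shinyApp(ui, server)
```

The split into `ui` and `server` is the core pattern: the UI declares inputs and outputs, and the server describes how outputs depend on inputs.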
Tips on how to start engaging on Twitter from R for Excel Users by Julia Lowndes and Allison Horst
and
Finding the YOU in the R community by Thomas Mock
From the Ocean Health Index Data Science Training:
If there are 3 things to communicate to others after this workshop, I think they would be:
1. Data science is a discipline that can improve your analyses
This helps your science:
2. Open data science tools exist
This helps your science:
3. Learn these tools with collaborators and community (redefined):
This helps your science: