NTRES 6100

Cornell University, Fall 2024

This website complements the class Canvas site.


Instructor: Associate Professor Nina Overgaard Therkildsen
TA: Ph.D. Student Jaime Ortiz Pachar


Meeting times and locations

Lectures:
Tuesdays and Thursdays 10:10am - 11:25am (August 27 - November 5, 2024), Morrison Hall 163

Optional lab sessions:
Hands-on practice sessions in groups and with TA support
Thursdays or Fridays 12:20pm - 2:15pm, Kennedy Hall 101


Course description

As datasets grow larger and more complex across all areas of science, computational skills are in increasingly high demand. This course introduces a series of practical tools that enable researchers to spend less time wrestling with software or repeating error-prone manual data processing, and more time getting research done in efficient and transparent ways that facilitate collaboration and reproducibility. We will work in R/RStudio, primarily with the tidyverse packages and with Git and GitHub integration. The course emphasizes practical skill development and will be structured around hands-on-the-keyboard learning.

By the end of this course, students will be able to:

  • Describe strategies for ensuring that their data analysis is reproducible
  • Demonstrate best practices for coding and project-oriented workflows in RStudio
  • Import and clean messy data files using a variety of packages and functions in R
  • Subset, reorganize, and merge diverse datasets in R
  • Effectively explore and visualize patterns in complex datasets with ggplot2 in R
  • Write simple functions/programs and data analysis pipelines in R
  • Automate repeated analysis tasks in R
  • Track the history of file changes (version control) and collaborate effectively with others on scripts using Git and GitHub
  • Use R Markdown to combine text, equations, code, tables, and figures into reports, websites, and presentations