Workshop Schedule


Day 1

Lesson | Overview | Instructor | Time
Workshop Introduction | Welcome and housekeeping | Elizabeth | 10:00-10:30
Intro to R and RStudio | Introduction to R and RStudio | Noor | 10:30-11:45
Self learning materials | Overview of self-learning materials | Elizabeth | 11:45-12:00

Before the next class

A. Please study the contents and work through all the code within the following lessons.

B. Complete the exercises:

  • Each of the lessons listed below contains exercises; please go through each of them.

  • Copy your solutions into the Google Form, using the submit link below, by the day before the next class.

Questions?

If you get stuck due to an error while running code in the lesson, email us.

  • 1. R Syntax and Data Structure

    About data types and data structure

    In order to utilize R effectively, you will need to understand what types of data you can use in R and also how you can store data in "objects" or "variables".

    This lesson will cover:

    • Assigning a value to an object

    • The types of information you can store in R

    • The different objects you can use to store data in R
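
    As a minimal sketch of the three topics above (every object name here is hypothetical and not taken from the lesson itself), assignment, data types, and data structures might look like this in base R:

    ```r
    # Assign a value to an object with the assignment operator `<-`
    x <- 3
    y <- x + 2                  # y now holds 5

    # A few of the basic data types
    class(2.5)                  # "numeric"
    class("gene")               # "character"
    class(TRUE)                 # "logical"

    # A few of the data structures used to store data
    glengths  <- c(4.6, 3000, 50000)                    # vector
    expr_level <- factor(c("low", "high", "low"))       # factor
    df  <- data.frame(glengths, expr_level)             # data frame
    lst <- list(glengths, df)                           # list
    ```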

  • 2. Functions and Arguments

    Functions and Arguments in R

    Functions are the basic "commands" used in R to get something done. To use a function (written as function_name followed by "()"), you enter some information within the parentheses and, optionally, some arguments that change the function's default behavior.

    You can also create your own functions! When you want to perform a task or a series of tasks more than once, creating a custom function is the best way to go.

    In this lesson you will explore:

    • Using built-in functions

    • Creating your own custom functions
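
    A short illustration of both ideas, assuming nothing beyond base R (the function name square_it is a hypothetical example):

    ```r
    # Built-in function: the `digits` argument changes the default behavior
    round(3.14159)               # 3
    round(3.14159, digits = 2)   # 3.14

    # Custom function: wrap a task you repeat often
    square_it <- function(x) {
      x * x
    }
    square_it(5)                 # 25
    ```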

  • 3. Reading in and inspecting data

    Read and inspect data structures in R

    When using R, it is almost a certainty that you will have to bring data into the R environment.

    In this lesson you will learn:

    • Reading different types (formats) of data

    • Inspecting the contents and structure of the dataset once you have read it in
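
    The sketch below shows one way this could look with base R. So that it runs on its own, it first writes a tiny made-up CSV to a temporary file; the file contents and the object name metadata are assumptions for illustration only.

    ```r
    # Create a small example file (hypothetical contents)
    tmp <- tempfile(fileext = ".csv")
    writeLines(c("sample,genotype,celltype",
                 "s1,Wt,typeA",
                 "s2,KO,typeB"), tmp)

    # Read the comma-separated file into a data frame
    metadata <- read.csv(tmp)

    # Inspect contents and structure
    head(metadata)       # first rows
    str(metadata)        # column types and a preview of values
    dim(metadata)        # 2 rows, 3 columns
    colnames(metadata)
    ```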

Day 2

Lesson | Overview | Instructor | Time
Review self-learning | Questions about self-learning | All | 10:00-10:50
In-class exercises | Using and customizing functions and arguments | Noor | 10:50-11:15
Data Wrangling | Subsetting Vectors and Factors | Will | 11:15-12:00

Before the next class

A. Please study the contents and work through all the code within the following lessons.

B. Complete the exercises:

  • Each of the lessons listed below contains exercises; please go through each of them.

  • Copy your solutions into the Google Form, using the submit link below, by the day before the next class.

Questions?

If you get stuck due to an error while running code in the lesson, email us.

  • 1. Packages and libraries

    Installing and loading packages in R

    Base R is incredibly powerful, but it cannot do everything. R has been built to encourage community involvement in expanding its functionality. Thousands of supplemental add-ons, also called "packages", have been contributed by the community. Each package comprises several functions that enable users to perform their desired analysis.

    This lesson will cover:

    • Descriptions of package repositories

    • Installing a package

    • Loading a package

    • Accessing the documentation for your installed packages and getting help

  • 2. Data wrangling: data frames, matrices and lists
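
    A hedged sketch of the install, load, and help steps. The install line is commented out because it needs an internet connection, and the tools package is used for loading only because it ships with R; the lesson itself may use different packages.

    ```r
    # Install once from CRAN (commented out here: requires internet access)
    # install.packages("ggplot2")

    # Load a package into the current session; `tools` ships with R
    library(tools)
    file_ext("counts.csv")      # a function provided by `tools`; returns "csv"

    # Check which packages are attached in this session
    search()

    # Open the documentation index for a package, or the help page of a function
    # help(package = "tools")
    # ?file_ext
    ```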

    Subset, merge, and create new datasets

    In class we covered data wrangling (extracting/subsetting information) for one-dimensional objects (vectors, factors). The next step is to learn how to wrangle data in two-dimensional objects.

    This lesson will cover:

    • Examining and extracting values from two-dimensional data structures using indices, row names, or column names

    • Retrieving information from lists
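
    For illustration, here is a small base R sketch of these extraction styles. The data frame and list are made up for the example.

    ```r
    # A small, hypothetical data frame with row names
    df <- data.frame(genotype  = c("Wt", "KO", "Wt"),
                     replicate = c(1, 1, 2),
                     row.names = c("s1", "s2", "s3"),
                     stringsAsFactors = FALSE)

    df[2, 1]                     # by indices [row, column]: "KO"
    df["s3", ]                   # by row name: the whole row for s3
    df[, "replicate"]            # by column name: 1 1 2
    df$genotype                  # `$` shortcut for a single column
    wt_only <- df[df$genotype == "Wt", ]   # rows matching a condition

    # Lists: use [[ ]] or $ to retrieve a component
    lst <- list(species = c("human", "mouse"), meta = df)
    lst[[1]]                     # first component, by position
    lst$meta                     # component by name
    ```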

  • 3. The %in% operator

    %in% operator, any and all functions

    Very often you will have to compare two vectors to figure out if, and which, values are common between them. The %in% operator can be used for this purpose.

    This lesson will cover:

    • Implementing the %in% operator to evaluate two vectors

    • Distinguishing %in% from == and other logical operators

    • Using any() and all() functions
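
    A minimal sketch of the difference, using two made-up vectors A and B:

    ```r
    A <- c(1, 3, 5, 7, 9)
    B <- c(3, 4, 5, 6, 7)

    # %in% asks, for each element of A, whether it appears anywhere in B
    A %in% B          # FALSE TRUE TRUE TRUE FALSE
    A[A %in% B]       # the shared values: 3 5 7

    # == compares position by position, so order matters
    A == B            # FALSE FALSE TRUE FALSE FALSE

    # any() and all() summarize a logical vector into a single value
    any(A %in% B)     # TRUE:  at least one element of A is in B
    all(A %in% B)     # FALSE: not every element of A is in B
    ```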

  • 4. Reordering and matching

    Ordering of vectors and data frames

    Sometimes you will want to rearrange values within a vector (row names or column names). The match() function can be very powerful for this task.

    This lesson will cover:

    • Manually rearranging values within a vector

    • Implementing the match() function to automatically rearrange the values within a vector
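
    To make the two approaches concrete, here is a small sketch with hypothetical vectors first and second:

    ```r
    first  <- c("A", "B", "C", "D", "E")
    second <- c("B", "D", "E", "A", "C")

    # Manual reordering: supply the positions yourself
    second[c(4, 1, 5, 2, 3)]        # "A" "B" "C" "D" "E"

    # match() returns, for each element of `first`, its position in `second`
    idx <- match(first, second)     # 4 1 5 2 3

    # Use those positions to put `second` in the same order as `first`
    second_reordered <- second[idx] # "A" "B" "C" "D" "E"
    ```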

  • 5. Data frame for plotting

    Learn about the map() function for iterative tasks

    We will be starting with visualization in the next class. To set up for this, you need to create a new metadata data frame with information from the counts data frame. This requires applying a function to every column of the counts data frame. You could do that manually, one column at a time, but that is error-prone; the map() family of functions makes it more efficient.

    This lesson will cover:

    • Utilizing map_dbl() to take the average of every column in a data frame

    • Briefly discussing other functions within the map() family of functions

    • Creating a new data frame for plotting
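
    A rough sketch of the idea, using a tiny made-up counts data frame. purrr::map_dbl() is used when the purrr package is installed; otherwise the base R equivalent vapply() stands in, which is an assumption of this example rather than part of the lesson.

    ```r
    # Hypothetical counts: one column per sample
    counts <- data.frame(sample1 = c(10, 20, 30),
                         sample2 = c(6, 15, 42))

    # Apply mean() to every column and collect the results in a numeric vector
    if (requireNamespace("purrr", quietly = TRUE)) {
      samplemeans <- purrr::map_dbl(counts, mean)
    } else {
      samplemeans <- vapply(counts, mean, numeric(1))   # base R fallback
    }
    samplemeans        # sample1 = 20, sample2 = 21

    # Combine the per-sample means into a new data frame for plotting
    plot_df <- data.frame(sample = names(samplemeans),
                          samplemeans = samplemeans)
    ```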

Prepare for in-class exercise:

  • Download the data and place the file into the data directory.
    Data | Download link
    Animal data | Right click & Save link as...
  • Read the .csv file into your environment and assign it to a variable called animals. Be sure to check that your row names are the different animals.

  • Save the R project when you close RStudio.


Day 3

Lesson | Overview | Instructor | Time
Review self-learning | Questions about self-learning | All | 10:00-10:35
In-class exercises | Customizing functions and arguments | Will | 10:50-11:15
Plotting with ggplot2 | ggplot2 for data visualization | Noor | 11:15-12:00

Before the next class

A. Please study the contents and work through all the code within the following lessons.

B. Complete the exercises:

  • Each of the lessons listed below contains exercises; please go through each of them.

  • Copy your solutions into the Google Form, using the submit link below, by the day before the next class.

Questions?

If you get stuck due to an error while running code in the lesson, email us.

  • 1. Custom functions for plots

    Consistent formats for plotting

    When creating your plots in ggplot2 you may want to have consistent formatting (using theme() functions) across your plots, e.g. if you are generating plots for a manuscript.

    This lesson will cover:

    • Developing a custom function for creating consistently formatted plots
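
    One possible shape for such a function is sketched below. It assumes the ggplot2 package is installed (the code is skipped otherwise); the function name personal_theme and the specific theme settings are hypothetical choices, not the lesson's own.

    ```r
    if (requireNamespace("ggplot2", quietly = TRUE)) {
      library(ggplot2)

      # Bundle the formatting you want on every plot into one function
      personal_theme <- function() {
        theme_bw() +
          theme(axis.title = element_text(size = rel(1.5)),
                plot.title = element_text(size = rel(1.5), hjust = 0.5))
      }

      # Reuse it by adding it to any plot (mtcars is a dataset that ships with R)
      p <- ggplot(mtcars, aes(x = wt, y = mpg)) +
        geom_point() +
        ggtitle("Weight vs. fuel efficiency") +
        personal_theme()
    }
    ```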
  • 2. Boxplot with ggplot2

    Customizing boxplots with ggplot2

    Previously, you created a scatterplot using ggplot2. However, ggplot2 can be used to create a very wide variety of plots. One of the other frequently used plots you can create with ggplot2 is a boxplot.

    This lesson will cover:

    • Creating and customizing a boxplot using ggplot2
  • 3. Exporting files and plots

    Writing files and plots in different formats

    Now that you have completed some analysis in R, you will eventually need to export that work from R/RStudio. R provides a lot of flexibility in what you export and how you export it.

    This lesson will cover:

    • Exporting your figures from R using a variety of file formats

    • Writing your data from R to a file
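
    A small base R sketch of both kinds of export. It writes to a temporary directory, and the file names are made up for the example.

    ```r
    out_dir <- tempdir()

    # Write a data frame to a comma-separated file
    csv_path <- file.path(out_dir, "mtcars_subset.csv")
    write.csv(head(mtcars), file = csv_path, quote = FALSE)

    # Write a figure to a PDF: open the device, draw the plot, close the device
    pdf_path <- file.path(out_dir, "scatter.pdf")
    pdf(pdf_path)
    plot(mtcars$wt, mtcars$mpg, xlab = "Weight", ylab = "Miles per gallon")
    dev.off()

    # Other devices such as png() or jpeg() follow the same open/draw/close pattern
    ```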

  • 4. Finding help

    How to best look for help

    Hopefully, this course has given you the basic tools you need to be successful when using R. However, it would be impossible to cover every aspect of R and you will need to be able to troubleshoot future issues as they arise.

    This lesson will cover:

    • Suggestions for how to best ask for help

    • Where to look for help

  • 5. Tidyverse

    Data wrangling within Tidyverse

    The Tidyverse suite of integrated packages is designed to work together to make common data science operations more user friendly. Tidyverse is becoming increasingly prevalent, so R users benefit from being conversant in its basics. We have already used two Tidyverse packages in this workshop (ggplot2 and purrr), and in this lesson we will learn some key features from a few additional packages that make up Tidyverse.

    This lesson will cover:

    • Usage of pipes for connecting together multiple commands

    • Tibbles for two-dimensional data storage

    • Data wrangling within Tidyverse
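
    As a hedged sketch of how pipes, tibbles, and wrangling verbs fit together, the example below uses the built-in mtcars dataset and assumes the dplyr package is installed (the code is skipped otherwise). The object names are hypothetical.

    ```r
    if (requireNamespace("dplyr", quietly = TRUE)) {
      library(dplyr)

      # Convert a data frame to a tibble, keeping the row names as a column
      cars_tbl <- as_tibble(mtcars, rownames = "model")

      # The pipe (%>%) passes the result of one step on to the next
      heavy <- cars_tbl %>%
        filter(wt > 3.5) %>%          # keep rows matching a condition
        select(model, mpg, wt) %>%    # keep only these columns
        arrange(desc(mpg))            # sort by mpg, highest first

      # Without pipes, the same steps have to be nested inside one another:
      # arrange(select(filter(cars_tbl, wt > 3.5), model, mpg, wt), desc(mpg))
    }
    ```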


Day 4

Lesson | Overview | Instructor | Time
Review self-learning | Questions about self-learning | All | 10:00-10:35
In-class exercises | In-class exercises | Will | 10:50-11:15
Discussion | Q&A | Noor | 11:15-11:45
Wrap Up | Wrap up and checking out | Noor | 11:45-12:00

Additional exercises and answer keys

Additional resources


Attribution & Citation

  • These materials have been developed by members of the teaching team at the Harvard Chan Bioinformatics Core (HBC). These are open access materials distributed under the terms of the Creative Commons Attribution license (CC BY 4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

  • Some materials used in these lessons were derived from work that is Copyright © Data Carpentry. All Data Carpentry instructional material is made available under the Creative Commons Attribution license (CC BY 4.0).

  • To cite material from this course in your publications, please use:

    Meeta Mistry, Mary Piper, Jihe Liu, & Radhika Khetani. (2021, May 5). hbctraining/Intro-to-R-flipped: R workshop first release. Zenodo. https://doi.org/10.5281/zenodo.4739342

  • A lot of time and effort went into the preparation of these materials. Citations help us understand the needs of the community, gain recognition for our work, and attract further funding to support our teaching activities. Thank you for citing this material if it helped you in your data analysis.