Functional Analysis of Gene Lists

Audience | Computational skills required | Duration
Biologists | Beginner/Intermediate R | 3-hour workshop (~3 hours of trainer-led time)

Description

This repository contains teaching materials for a 3-hour, hands-on Functional Analysis workshop led at a relaxed pace. Functional analysis methods help us gain insight into the biology underlying a list of genes. These genes could be the output of a differential expression analysis, a GWAS, a proteomics analysis, etc. Regardless of the source of the gene list, functional analysis can explore whether particular pathways or processes are enriched among the genes in the list.

In this workshop, we will use over-representation analysis (ORA) and functional class scoring (FCS) methods to identify pathways that may be associated with our list of genes. We will use the clusterProfiler R package to test whether any Gene Ontology (GO) processes are enriched in a list of genes and to generate plots from the results. We will also give a brief introduction to using clusterProfiler to perform FCS with gene set enrichment analysis (GSEA), followed by visualization with the Pathview R package.
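To make the idea behind ORA concrete, here is a minimal base R sketch of a one-sided hypergeometric test for a single gene set. All counts are hypothetical and chosen only for illustration; they do not come from the workshop dataset, and this is not the workshop's own code.

```r
# Hypothetical counts for one gene set (illustration only, not workshop data)
universe_size <- 20000  # background genes considered in the experiment
set_size      <- 300    # background genes annotated to the pathway / GO term
list_size     <- 500    # genes in our list of interest
overlap       <- 20     # genes in our list that are annotated to the term

# Overlap expected by chance if the list were drawn at random from the background
expected_overlap <- list_size * set_size / universe_size

# One-sided hypergeometric test: P(X >= overlap).
# phyper() with lower.tail = FALSE returns P(X > q), so we pass overlap - 1.
p_value <- phyper(overlap - 1,
                  m = set_size,                  # "successes" in the background
                  n = universe_size - set_size,  # "failures" in the background
                  k = list_size,                 # number of genes drawn
                  lower.tail = FALSE)

cat("Expected overlap:", expected_overlap, "\n")
cat("Observed overlap:", overlap, "\n")
cat("Enrichment p-value:", signif(p_value, 3), "\n")
```

In a real analysis this test is repeated for many gene sets, so the resulting p-values need to be adjusted for multiple testing before they are interpreted.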

Learning Objectives

These materials were developed for a trainer-led workshop, but they are also amenable to self-guided learning.

Contents

Lessons | Estimated Duration
Setting up | 15 min
Gene annotations | 30 min
Functional analysis methods | 120 min

Dataset

Download the R project and data for this workshop here. Decompress and move the folder to the location on your computer where you would like to perform the analysis.

Installation Requirements

Download the most recent versions of R and RStudio for your laptop.

Install the required R packages by running the following code in RStudio:

# Install CRAN packages
install.packages(c("BiocManager", "devtools", "tidyverse"))

# Install Bioconductor packages
BiocManager::install(c("clusterProfiler", "DOSE", "org.Hs.eg.db", "pathview", "AnnotationDbi", "EnsDb.Hsapiens.v75"))

Load the libraries to make sure the packages installed properly:

library(clusterProfiler)
library(DOSE)
library(org.Hs.eg.db) 
library(pathview)
library(tidyverse)
library(AnnotationDbi)
library(EnsDb.Hsapiens.v75)
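If one of the library() calls fails, it can help to see every missing package at once rather than one error at a time. The following is a small optional sketch using base R's requireNamespace(); the variable names are illustrative and are not part of the workshop materials.

```r
# Packages this workshop expects (same names as in the install step above)
required_pkgs <- c("clusterProfiler", "DOSE", "org.Hs.eg.db", "pathview",
                   "tidyverse", "AnnotationDbi", "EnsDb.Hsapiens.v75")

# requireNamespace() returns TRUE if the package can be loaded, FALSE otherwise
is_installed <- vapply(required_pkgs, requireNamespace,
                       logical(1), quietly = TRUE)

missing_pkgs <- required_pkgs[!is_installed]

if (length(missing_pkgs) > 0) {
  message("Missing packages: ", paste(missing_pkgs, collapse = ", "))
} else {
  message("All required packages are installed.")
}
```

Any package reported as missing can be installed again with the matching install.packages() or BiocManager::install() call shown earlier.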

NOTE: The package that provides the gene annotations (here, org.Hs.eg.db) depends on the organism. For example, if you are studying mouse, you would need to install and load org.Mm.eg.db instead. The list of organism packages is given here.

These materials have been developed by members of the teaching team at the Harvard Chan Bioinformatics Core (HBC). These are open access materials distributed under the terms of the Creative Commons Attribution license (CC BY 4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.