Introduction to Peak Analysis Workshop
Audience | Computational Skills | Prerequisites | Duration |
---|---|---|---|
Biologists | Intermediate | Introduction to R | |
Learning Objectives
- Describe peak data and different file formats generated from peak calling algorithms
- Evaluate various metrics used to assess the quality of peak calls
- Compare peak calls across samples within a dataset
- Create visualizations to evaluate peak annotations
- Evaluate differentially enriched regions between two sample groups
These materials were developed for a trainer-led workshop, but are also amenable to self-guided learning.
Lessons
Description
This repository has teaching materials for a hands-on Introduction to Peak Analysis workshop. The workshop uses the R statistical programming environment to evaluate files generated from peak calling of ChIP-seq data (and data from related approaches, e.g. CUT&RUN and ATAC-seq). We will provide participants with a suite of tools and a basic workflow that runs from quality metrics through to annotation and visualization. This workshop will introduce participants to:
- File formats for peak data
- Approaches to check peak quality and reproducibility across replicates
- Peak annotation methods and tools for visualization
- Differential peak enrichment analysis and functional analysis
A working knowledge of R, or completion of the Introduction to R workshop, is required.
Note for Trainers: The schedule linked below assumes that learners will spend 3-4 hours between classes reading through selected lessons and completing the exercises in them. The online component of the workshop focuses on additional exercises and discussion/Q & A.
Dataset
The R project for this workshop can be downloaded with this link.
Installation Requirements
Download the most recent versions of R and RStudio for your laptop:
NOTE: When installing the following packages, if you are asked to select (a/s/n) or (y/n), please select “a” or “y” as applicable.
(1) Install the packages below on your laptop from CRAN. You DO NOT have to go to the CRAN webpage; you can use the following function to install them:
install.packages("BiocManager")
install.packages("tidyverse")
install.packages("pheatmap")
install.packages("UpSetR")
install.packages("RColorBrewer")
install.packages("ggrepel")
install.packages("ggupset")
Note that these package names are case sensitive!
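As an alternative to running the seven calls one by one, the sketch below collects the same CRAN package names in a vector and installs only those that are not yet present. The helper name `missing_packages()` and the `interactive()` guard are our own assumptions for illustration, not part of the workshop materials; the guard simply keeps the snippet from contacting CRAN when it is run outside an interactive session such as the RStudio console.

```r
# CRAN packages listed in step (1); names are case sensitive
cran_pkgs <- c("BiocManager", "tidyverse", "pheatmap", "UpSetR",
               "RColorBrewer", "ggrepel", "ggupset")

# Hypothetical helper: return the subset of `pkgs` not found in the R library.
# system.file(package = p) returns "" when package p is not installed.
missing_packages <- function(pkgs) {
  found <- vapply(pkgs, function(p) nzchar(system.file(package = p)), logical(1))
  pkgs[!found]
}

to_install <- missing_packages(cran_pkgs)

# Install only what is missing, and only from an interactive session
if (interactive() && length(to_install) > 0) {
  install.packages(to_install)
}
```

Running the snippet a second time should leave `to_install` empty if every installation succeeded.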
(2) Install the packages below from Bioconductor. Load BiocManager, then run BiocManager’s install() function 9 times for the 9 packages:
library(BiocManager)
install("insert_first_package_name_in_quotations")
install("insert_second_package_name_in_quotations")
# ... and so on for each remaining package
Note that these package names are case sensitive!
BiocManager::install("ChIPseeker")
BiocManager::install("ChIPpeakAnno")
BiocManager::install("DiffBind")
BiocManager::install("clusterProfiler")
BiocManager::install("TxDb.Mmusculus.UCSC.mm10.knownGene")
BiocManager::install("IRanges")
BiocManager::install("GenomicRanges")
BiocManager::install("DESeq2")
BiocManager::install("org.Mm.eg.db")
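Because `BiocManager::install()` accepts a character vector of package names, the nine calls above can also be written as a single call. The following is a minimal sketch under that assumption; the variable name `bioc_pkgs` and the `interactive()` guard are ours, added so that nothing is downloaded when the snippet is run from a non-interactive script.

```r
# Bioconductor packages listed in step (2); names are case sensitive
bioc_pkgs <- c("ChIPseeker", "ChIPpeakAnno", "DiffBind", "clusterProfiler",
               "TxDb.Mmusculus.UCSC.mm10.knownGene", "IRanges",
               "GenomicRanges", "DESeq2", "org.Mm.eg.db")

# One call installs the whole vector; requires BiocManager from step (1)
if (interactive() && requireNamespace("BiocManager", quietly = TRUE)) {
  BiocManager::install(bioc_pkgs)
}
```

If you are prompted with (a/s/n) or (y/n) during this call, the note at the top of this section still applies.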
NOTE: The library used for the annotations associated with genes (here we are using TxDb.Mmusculus.UCSC.mm10.knownGene) will change based on the organism. The list of different organism packages is given here.
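To illustrate how the annotation packages change with the organism, the sketch below pairs each genome with its transcript (TxDb) and gene-level (OrgDb) annotation package. The mouse entry is the one used in this workshop; the human hg38 entry is included only as an example of a different organism, and the names `annotation_pkgs` and `organism` are hypothetical.

```r
# Annotation packages per organism/genome build (mouse is used in this workshop;
# the human entry is an illustrative example of another organism)
annotation_pkgs <- list(
  mouse_mm10 = c(txdb = "TxDb.Mmusculus.UCSC.mm10.knownGene", orgdb = "org.Mm.eg.db"),
  human_hg38 = c(txdb = "TxDb.Hsapiens.UCSC.hg38.knownGene", orgdb = "org.Hs.eg.db")
)

# Pick the entry that matches your data
organism <- "mouse_mm10"
pkgs <- annotation_pkgs[[organism]]

pkgs[["txdb"]]   # transcript annotation package name
pkgs[["orgdb"]]  # gene-level annotation package name
```

The selected names could then be passed to `BiocManager::install()` in place of the mouse packages.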
(3) Finally, please check that all the packages were installed successfully by loading them one at a time using the library() function.
library(tidyverse)
library(pheatmap)
library(UpSetR)
library(ChIPseeker)
library(ChIPpeakAnno)
library(DiffBind)
library(clusterProfiler)
library(TxDb.Mmusculus.UCSC.mm10.knownGene)
library(IRanges)
library(GenomicRanges)
library(DESeq2)
library(RColorBrewer)
library(ggrepel)
library(ggupset)
library(org.Mm.eg.db)
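If you would like a quick summary of which packages failed to load, something along the following lines may help. Note that this sketch uses `requireNamespace()` rather than `library()`: it checks that each package can be loaded without attaching it, so it complements the `library()` calls above and does not replace them. The helper name `check_packages()` is our own.

```r
# All packages from steps (1) and (2) that are loaded above
all_pkgs <- c("tidyverse", "pheatmap", "UpSetR", "ChIPseeker", "ChIPpeakAnno",
              "DiffBind", "clusterProfiler", "TxDb.Mmusculus.UCSC.mm10.knownGene",
              "IRanges", "GenomicRanges", "DESeq2", "RColorBrewer", "ggrepel",
              "ggupset", "org.Mm.eg.db")

# Hypothetical helper: named logical vector, TRUE where the package loads cleanly
check_packages <- function(pkgs) {
  vapply(pkgs, function(p) {
    suppressPackageStartupMessages(requireNamespace(p, quietly = TRUE))
  }, logical(1))
}

status <- check_packages(all_pkgs)

# Names of any packages that could not be loaded (empty if all is well)
names(status)[!status]
```

Any package name printed by the last line needs to be reinstalled using the matching command from step (1) or (2).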
(4) Once all packages have been loaded, run sessionInfo().
sessionInfo()
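One possible use of this output is to keep a record of the R and package versions on your machine, for example to share with a trainer when troubleshooting. The sketch below writes the printed output to a text file; the file name `sessionInfo.txt` is an arbitrary choice of ours.

```r
# Hypothetical output file in the current working directory
session_file <- "sessionInfo.txt"

# capture.output() collects the printed sessionInfo() lines as a character vector
session_lines <- capture.output(sessionInfo())
writeLines(session_lines, session_file)
```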