Workshop Schedule

NOTE: The Basic Data Skills Introduction to R workshop is a prerequisite. If you would like some practice with R prior to taking this workshop, please work through this R refresher lesson.

Pre-reading:

Please study the contents and work through all the exercises within the following lessons:
- Workflow overview: From sequenced reads to peaks
- Existing workflows for ChIP-seq analysis
- Download the compressed R Project that we will be using by right-clicking on this link and selecting “Save Link As…”. Save it to your desired location, then double-click on the compressed ZIP file to uncompress it.

Day 1

Time	Topic	Instructor
09:30 - 09:45	Workshop Introduction	Will
09:45 - 10:15	Pre-reading discussion	Meeta
10:15 - 11:00	Understanding peaks and peak file formats	Will
11:00 - 11:05	Break
11:05 - 12:00	Assessing peak quality metrics	Meeta

Before the next class:

I. Please study the contents and work through all the code within the following lessons:

Assessing sample similarity and identifying potential outliers

One step in the QC of samples is to see how samples compare to one another. Generally, we expect replicates from each sample group to be more similar to each other than to replicates from a different sample group. Here, we use read density (counts across the genome) and peak signal data to check whether the samples meet these expectations.

In this lesson you will:
- Create PCA plots and inter-sample correlation heatmaps
- Evaluate plots to identify potential outliers and other effects
- Create visualizations using signal data from peaks to identify proposed thresholds for downstream analysis
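
As a rough illustration of the first two points, the sketch below builds an inter-sample correlation heatmap and a PCA plot using base R only. The count matrix is simulated and the sample names (`A_rep1`, `B_rep1`, and so on) are hypothetical; the lesson itself works with real read density data and may use different functions and packages.

```r
# Toy read-count matrix: rows are genomic bins, columns are samples.
# All values are simulated purely for illustration.
set.seed(42)
n_bins <- 200
baseline <- rnorm(n_bins, mean = 8, sd = 1)        # signal shared by all samples
group_shift <- rnorm(n_bins, mean = 0, sd = 1.5)   # signal specific to group B

counts <- cbind(
  A_rep1 = baseline + rnorm(n_bins, sd = 0.3),
  A_rep2 = baseline + rnorm(n_bins, sd = 0.3),
  B_rep1 = baseline + group_shift + rnorm(n_bins, sd = 0.3),
  B_rep2 = baseline + group_shift + rnorm(n_bins, sd = 0.3)
)

# Inter-sample correlation matrix (the input to a correlation heatmap)
sample_cor <- cor(counts, method = "pearson")
heatmap(sample_cor, symm = TRUE, main = "Inter-sample correlation")

# PCA on samples: prcomp() expects observations in rows, so transpose
pca <- prcomp(t(counts), center = TRUE, scale. = FALSE)
plot(pca$x[, 1], pca$x[, 2], xlab = "PC1", ylab = "PC2", pch = 19,
     col = rep(c("steelblue", "tomato"), each = 2))
text(pca$x[, 1], pca$x[, 2], labels = rownames(pca$x), pos = 3)
```

In a well-behaved dataset, replicates from the same group correlate more strongly with each other than with the other group and sit close together on PC1; a replicate that falls far from its group is a candidate outlier.
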

Concordance across replicates using peak overlaps

A quantitative way of evaluating how similar replicates are is to identify how many of the same peaks were called in each replicate. Biological replicates will inevitably exhibit some amount of variability, but the hope is that the majority of our peaks are identified in each sample. By looking at peak overlaps we can identify and remove a weaker replicate and/or use the overlap to create a consensus set of peaks.

In this lesson, we will:
- Discuss IRanges and GRanges data structures in R
- Compute peak overlaps and create visualizations for the results
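
To give a feel for what computing peak overlaps looks like, here is a minimal sketch using the Bioconductor GenomicRanges package. The peak coordinates and the object names (`rep1`, `rep2`, `consensus`) are made up for illustration, and the lesson may take a different approach to building a consensus set.

```r
library(GenomicRanges)

# Hypothetical peak calls for two replicates (made-up coordinates)
rep1 <- GRanges(seqnames = c("chr1", "chr1", "chr2"),
                ranges = IRanges(start = c(100, 500, 1000),
                                 end   = c(200, 600, 1100)))
rep2 <- GRanges(seqnames = c("chr1", "chr1", "chr2"),
                ranges = IRanges(start = c(150, 900, 1050),
                                 end   = c(250, 950, 1200)))

# Pairs of overlapping peaks (any overlap of at least 1 bp counts)
hits <- findOverlaps(rep1, rep2)

# Number and fraction of rep1 peaks that are also found in rep2
n_shared <- length(unique(queryHits(hits)))
frac_shared <- n_shared / length(rep1)

# Peaks from rep1 that overlap a rep2 peak: one simple consensus set
consensus <- subsetByOverlaps(rep1, rep2)
```

Here two of the three `rep1` peaks overlap a `rep2` peak. Keeping the `rep1` coordinates is only one possible choice; an assumption of this sketch is that any overlap is enough to call a peak shared, whereas a real analysis might require a minimum overlap width.
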

Complete the exercises:
- Each lesson above contains exercises; please go through each of them.
- Copy over your solutions into the Google Forms the day before the next class.

Questions?

If you get stuck due to an error while running code in the lesson, email us.

Day 2

Time	Topic	Instructor
09:30 - 10:00	Self-learning review	All
10:00 - 10:45	Peak annotation and visualization using ChIPseeker	Will
10:45 - 10:55	Break
10:55 - 12:00	Differential enrichment analysis using DiffBind	Meeta

Before the next class:

I. Please study the contents and work through all the code within the following lessons:

Peak visualization using IGV

Now that we have identified regions that are differentially enriched, it would be good to perform a qualitative assessment. To do this, we will take a look at the data in IGV, a genome browser, and see what read density looks like in significant regions.

In this lesson, we will:
- Learn how to navigate IGV and introduce various features
- Evaluate significant regions from DiffBind

Annotation and functional analysis of DE regions

To gain biological insight from the genomic coordinates identified as differentially bound, we need to map them back to genomic features and see if there is some over-representation of target genes in specific pathways.

In this lesson, we will:
- Use ChIPseeker to annotate the DE regions
- Perform functional analysis on the DE target genes

Complete the exercises:
- The Functional Analysis lesson above contains exercises; please go through each of them.
- Copy over your solutions into the Google Forms the day before the next class.

Questions?

If you get stuck due to an error while running code in the lesson, email us.

Day 3

Time	Topic	Instructor
09:30 - 10:30	Self-learning review	All
10:30 - 11:15	Motif analysis/discovery	Meeta
11:15 - 11:25	Break
11:25 - 11:45	Discussion Q&A	All
11:45 - 12:00	Wrap-up	Will

Answer keys

Resources

Comprehensive assessment of differential ChIP-seq tools guides optimal algorithm selection
GRanges Tutorial
ChIPpeakAnno Vignette
HBC Introduction to Chromatin Biology Workshop - Workflow upstream of peaks
A short lesson on integrating ChIP-seq target genes with RNA-seq

These materials have been developed by members of the teaching team at the Harvard Chan Bioinformatics Core (HBC). These are open access materials distributed under the terms of the Creative Commons Attribution license (CC BY 4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.