Workshop Schedule
NOTE: The Basic Data Skills Introduction to R workshop is a prerequisite. If you would like some practice with R prior to taking this workshop, please work through this R refresher lesson.
Pre-reading:
- Please study the contents and work through all the exercises within the following lessons:
- Workflow overview: From sequenced reads to peaks
- Existing workflows for ChIP-seq analysis
- Download the compressed R Project that we will be using by right-clicking on this link and selecting “Save Link As…”, then save it to your desired location. Double-click the compressed ZIP file to uncompress it.
Day 1
Time | Topic | Instructor |
---|---|---|
09:30 - 09:45 | Workshop Introduction | Meeta |
09:45 - 10:15 | Pre-reading discussion | Meeta |
10:15 - 11:00 | Understanding peaks and peak file formats | Meeta |
11:00 - 11:05 | Break | |
11:05 - 12:00 | Assessing peak quality metrics | Will |
Before the next class:
I. Please study the contents and work through all the code within the following lessons:
- Assessing sample similarity and identifying potential outliers
Click here for a preview of this lesson
One step in the QC of samples is to see how samples compare to one another. Generally, we expect replicates from the same sample group to be more similar to each other than to replicates from a different sample group. Here, we use read density (counts across the genome) and peak signal data to check whether the samples meet these expectations.
In this lesson you will:
- Create PCA plots and inter-sample correlation heatmaps
- Evaluate plots to identify potential outliers and other effects
- Create visualizations using signal data from peaks to identify proposed thresholds for downstream analysis
- Concordance across replicates using peak overlaps
Click here for a preview of this lesson
A quantitative way of evaluating how similar replicates are is to identify how many of the same peaks were called in each replicate. Biological replicates will inevitably exhibit some amount of variability, but the hope is that the majority of our peaks are identified in each sample. By looking at peak overlaps we can identify and remove a weaker replicate and/or use the overlap to create a consensus set of peaks.
In this lesson, we will:
- Discuss IRanges and GRanges data structures in R
- Compute peak overlaps and create visualizations for the results
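The idea of replicate concordance can be sketched in a few lines. The lesson itself works with IRanges and GRanges objects in R; the snippet below is only a language-agnostic illustration in plain Python. The peak coordinates, the half-open interval convention, and the helper names (`overlaps`, `peaks_with_overlap`, `overlap_fraction`) are all assumptions made for this example, not part of the workshop materials.

```python
from typing import List, Tuple

# A peak is (chromosome, start, end). Half-open coordinates are assumed here
# purely for illustration; real peak files and GRanges may use other conventions.
Peak = Tuple[str, int, int]


def overlaps(a: Peak, b: Peak) -> bool:
    """Return True if two peaks are on the same chromosome and share at least one base."""
    return a[0] == b[0] and a[1] < b[2] and b[1] < a[2]


def peaks_with_overlap(query: List[Peak], subject: List[Peak]) -> List[Peak]:
    """Return the peaks in `query` that overlap at least one peak in `subject`."""
    return [p for p in query if any(overlaps(p, s) for s in subject)]


def overlap_fraction(query: List[Peak], subject: List[Peak]) -> float:
    """Fraction of `query` peaks that are also found in `subject`."""
    if not query:
        return 0.0
    return len(peaks_with_overlap(query, subject)) / len(query)


# Hypothetical peak calls from two replicates (made-up coordinates).
rep1 = [("chr1", 100, 200), ("chr1", 500, 650), ("chr2", 50, 120)]
rep2 = [("chr1", 150, 260), ("chr2", 300, 400)]

shared = peaks_with_overlap(rep1, rep2)
print(shared)
print(overlap_fraction(rep1, rep2), overlap_fraction(rep2, rep1))
```

Note that the fraction depends on which replicate is treated as the query, so it is worth computing it in both directions. The all-against-all comparison here is quadratic and is only suitable for small toy inputs; dedicated interval libraries handle genome-scale peak sets much more efficiently.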
- Complete the exercises:
- Each lesson above contains exercises; please go through each of them.
- Copy over your solutions into the Google Forms the day before the next class.
Questions?
- If you get stuck due to an error while running code in the lesson, email us
Day 2
Time | Topic | Instructor |
---|---|---|
09:30 - 10:00 | Self-learning review | |
10:00 - 10:45 | Peak annotation and visualization using ChIPseeker | |
10:45 - 10:55 | Break | |
10:55 - 12:00 | Differential enrichment analysis using DiffBind | |
Before the next class:
I. Please study the contents and work through all the code within the following lessons:
- Peak visualization using IGV
Click here for a preview of this lesson
A two sentence summary of the lesson....
In this lesson, we will:
- Point 1
- Point 2
- Annotation and functional analysis of DE regions
Click here for a preview of this lesson
A two sentence summary of the lesson....
In this lesson, we will:
- Point 1
- Point 2
- Complete the exercises:
- Each lesson above contains exercises; please go through each of them.
- Copy over your solutions into the Google Forms the day before the next class.
Questions?
- If you get stuck due to an error while running code in the lesson, email us
Day 3
Time | Topic | Instructor |
---|---|---|
09:30 - 10:30 | Self-learning review | |
10:30 - 11:15 | Motif analysis/discovery | |
11:15 - 11:25 | Break | |
11:25 - 11:45 | Discussion Q&A | |
11:45 - 12:00 | Wrap-up | |
Answer keys
- Day 2 exercises
- Day 3 In-class
Resources
These materials have been developed by members of the teaching team at the Harvard Chan Bioinformatics Core (HBC). These are open access materials distributed under the terms of the Creative Commons Attribution license (CC BY 4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.