Introduction to Peak Analysis

Learning Objectives

Installations

On your desktop

  1. R
  2. RStudio
  3. Integrative Genomics Viewer (IGV)
  4. The listed R packages

On your HPCC (if not using Harvard’s O2 cluster)

Required

  1. Nextflow version 24.11.0-edge

Alternative to Nextflow

  1. samtools version 1.15.1
  2. bedtools version 2.30.0
  3. Picard version 2.27.5
  4. phantompeakqualtools version 1.2.2
  5. deepTools version 3.5.6
  6. bedGraphToBigWig version 302.1

NOTE: If you are not working on the O2 cluster and have different versions of these programs installed, the provided commands may still work. However, this workshop was designed around the specific versions listed above, so you may need to adjust some commands if your versions differ.
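Before starting, it can help to confirm which of the command-line tools are actually visible on your HPCC. The following is a minimal sketch, not part of the workshop materials: it only checks whether an executable is on your `PATH` and does not verify its version. The executable names are assumptions and vary by installation (for example, Picard is often distributed as a `.jar` file and phantompeakqualtools is typically run through an R script, so neither is included here).

```python
import shutil

# Executable names below are assumptions; adjust them to match your installation.
# Values are the versions this workshop was designed on (for display only).
EXPECTED_TOOLS = {
    "samtools": "1.15.1",
    "bedtools": "2.30.0",
    "deeptools": "3.5.6",
    "bedGraphToBigWig": "302.1",
}


def find_tools(names):
    """Map each executable name to its full path on PATH, or None if not found."""
    return {name: shutil.which(name) for name in names}


if __name__ == "__main__":
    for name, path in find_tools(EXPECTED_TOOLS).items():
        status = path if path else "NOT FOUND"
        print(f"{name} (workshop version {EXPECTED_TOOLS[name]}): {status}")
```

If a tool is reported as not found, it may simply need to be loaded through your cluster's module system or activated in an environment before it appears on your `PATH`.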

Lessons

  1. Workflow overview: From sequenced reads to peaks
  2. Existing workflows for ChIP-seq analysis
  3. Understanding peaks and peak file formats
  4. Assessing peak quality metrics
  5. Assessing sample similarity and identifying potential outliers
  6. Concordance across replicates using peak overlaps
  7. Peak annotation and visualization using ChIPseeker
  8. Differential enrichment analysis using DiffBind
  9. Peak visualization using IGV
  10. Annotation and functional analysis of DE regions
  11. Motif analysis/discovery

NOTE: If you aren’t working on Harvard’s O2 cluster, the directory structure on your HPCC is likely different, and you will need to modify the paths in the commands to match it.

Answer key

These materials have been developed by members of the teaching team at the Harvard Chan Bioinformatics Core (HBC). These are open access materials distributed under the terms of the Creative Commons Attribution license (CC BY 4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.