Workshop Schedule
Prerequisite for this workshop: the Basic Data Skills "Introduction to the command-line interface" workshop, or a working knowledge of the command line and cluster computing.
Pre-reading
Day 1
Time | Topic | Instructor |
---|---|---|
09:30 - 09:45 | Workshop Introduction | Meeta |
09:45 - 10:25 | Working in an HPC environment - Review | Meeta |
10:25 - 11:05 | Project Organization (using Data Management best practices) | Will |
11:05 - 11:45 | Quality Control of Sequence Data: Running FASTQC | Jihe |
11:45 - 12:00 | Overview of self-learning materials and homework submission | Jihe |
Before the next class:
- Please study the contents and work through all the code within the following lessons:
- Experimental design considerations
- Quality Control of Sequence Data: Running FASTQC on multiple samples
- Quality Control of Sequence Data: Evaluating FASTQC reports
NOTE: To run through the code above, you will need to be logged into FAS-RC and working on a compute node (i.e. your command prompt should have the word `compute` in it).
1. Log in using `ssh username@login.rc.fas.harvard.edu` and enter your password (replace username with your username).
2. Once you are on the login node, use `salloc -p test -t 0-2:30 --mem 8G` to get on a compute node, or use the command specified in the lesson.
3. Proceed only once your command prompt does not have the word `login` in it.
4. If you log out between lessons (using the `exit` command twice), please follow points 1. and 2. above to log back in and get on a compute node when you restart with the self learning.
- Complete the exercises:
- Each lesson above contains exercises; please go through each of them.
- Copy over your code from the exercises into a text file.
- Upload the saved text file to Dropbox the day before the next class.
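The login steps in the NOTE above can be sketched as the shell session below. This is only an illustration: the `ssh` and `salloc` commands are shown as comments because each one opens a new interactive shell and has to be typed by hand. The helper `on_login_node` is a hypothetical name introduced here, and it assumes that the machine name reported by `uname -n` contains the same word (`login`) that appears in your command prompt, which may not hold on every cluster.

```shell
# Step 1 (on your own computer): log in to the cluster.
# Replace "username" with your own username.
#   ssh username@login.rc.fas.harvard.edu

# Step 2 (on the login node): request an interactive session on a compute node.
#   salloc -p test -t 0-2:30 --mem 8G

# Step 3: check that you are no longer on a login node before running lesson code.
# on_login_node is a hypothetical helper: it succeeds when the given
# machine name contains the word "login" (an assumption about naming).
on_login_node() {
    case "$1" in
        *login*) return 0 ;;
        *)       return 1 ;;
    esac
}

if on_login_node "$(uname -n)"; then
    echo "Still on a login node: run salloc before continuing."
else
    echo "Not on a login node: OK to proceed."
fi
```

If the check reports a login node, repeat step 2 and wait for the prompt to change before starting the lesson.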
Questions?
- If you get stuck due to an error while running code in the lesson, email us.
- Post any conceptual questions that you would like to have reviewed in class here.
Day 2
Time | Topic | Instructor |
---|---|---|
09:30 - 10:30 | Self-learning lessons review | All |
10:30 - 11:10 | Sequence Alignment Theory | Meeta |
11:10 - 11:50 | Quantifying expression using alignment-free methods (Salmon) | Will |
11:50 - 12:00 | Review of workflow | Meeta |
Before the next class:
- Please study the contents and work through all the code within the following lessons:
- Quantifying expression using alignment-free methods (Salmon on multiple samples)
- QC with Alignment Data
- Documenting Steps in the Workflow with MultiQC
NOTE: To run through the code above, you will need to be logged into FAS-RC and working on a compute node (i.e. your command prompt should have the word `compute` in it).
1. Log in using `ssh username@login.rc.fas.harvard.edu` and enter your password (replace username with your username).
2. Once you are on the login node, use `salloc -p test -t 0-2:30 --mem 8G` to get on a compute node, or use the command specified in the lesson.
3. Proceed only once your command prompt does not have the word `login` in it.
4. If you log out between lessons (using the `exit` command twice), please follow points 1. and 2. above to log back in and get on a compute node when you restart with the self learning.
- Complete the exercises:
- Each lesson above contains exercises; please go through each of them.
- Copy over your code from the exercises into a text file.
- Upload the saved text file to Dropbox the day before the next class.
Questions?
- If you get stuck due to an error while running code in the lesson, email us.
- Post any conceptual questions that you would like to have reviewed in class here.
Day 3
Time | Topic | Instructor |
---|---|---|
09:30 - 10:10 | Self-learning lessons review | All |
10:10 - 10:45 | Troubleshooting RNA-seq Data Analysis | Will |
10:45 - 11:45 | Automating the RNA-seq workflow | Meeta |
11:45 - 12:00 | Wrap up | Will |
- Downloadable Answer Keys (Day 2 exercises):
- Downloadable Answer Keys (Day 3 exercises):
- Automation Script
Resources
- Video about the statistics behind Salmon quantification
- Obtaining reference genomes or transcriptomes
- Training materials from FAS-RC
Building on this workshop
- Introduction to R workshop materials
- Introduction to Differential Gene Expression analysis (bulk RNA-seq) workshop materials
These materials have been developed by members of the teaching team at the Harvard Chan Bioinformatics Core (HBC). These are open access materials distributed under the terms of the Creative Commons Attribution license (CC BY 4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.