Experimental design considerations
Understanding the steps of RNA extraction and RNA-Seq library preparation is helpful when designing an RNA-Seq experiment, but a few design considerations deserve special attention because they can greatly affect the quality of a differential expression analysis.
These important considerations include:
- Number and type of replicates
- Avoiding confounding
- Addressing batch effects
We will go over each of these considerations in detail, discussing best practice and optimal design.
Replicates
Experimental replicates can be performed as technical replicates or biological replicates.
Image credit: Klaus B., EMBO J (2015) 34: 2727-2730
- Technical replicates use the same biological sample to repeat the technical or experimental steps in order to accurately measure technical variation and remove it during analysis.
- Biological replicates use different biological samples of the same condition to measure the biological variation between samples.
In the days of microarrays, technical replicates were considered a necessity; however, with current RNA-Seq technologies, technical variation is much lower than biological variation, and technical replicates are unnecessary.
In contrast, biological replicates are absolutely essential. For differential expression analysis, the more biological replicates, the better the estimates of biological variation and the more precise our estimates of the mean expression levels. This leads to more accurate modeling of our data and identification of more differentially expressed genes. We will revisit this later today.
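To make the link between replicate number and precision concrete, here is a minimal simulation sketch. It assumes, purely for illustration, that a gene's expression varies between biological replicates as a normal distribution with a hypothetical mean of 100 and standard deviation of 20; real RNA-Seq counts are not normally distributed, and the function name `spread_of_mean` is ours, not part of any analysis package.

```python
import random
import statistics

def spread_of_mean(n_replicates, true_mean=100.0, bio_sd=20.0, n_sim=2000, seed=0):
    """Simulate many experiments with n_replicates biological replicates each
    and return how much the estimated mean expression varies between them."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    estimated_means = []
    for _ in range(n_sim):
        # Hypothetical expression values for one gene in one condition
        replicates = [rng.gauss(true_mean, bio_sd) for _ in range(n_replicates)]
        estimated_means.append(statistics.fmean(replicates))
    # Standard deviation of the estimated means: smaller is more precise
    return statistics.stdev(estimated_means)

for n in (2, 3, 6, 12):
    print(f"{n:>2} replicates: spread of estimated mean = {spread_of_mean(n):.2f}")
```

Under these assumptions the spread shrinks roughly in proportion to one over the square root of the number of replicates, so going from 3 to 12 replicates about halves the uncertainty in the mean.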
Confounding
A confounded RNA-Seq experiment is one where you cannot distinguish the separate effects of two different sources of variation in the data.
For example, we know that sex has large effects on gene expression, and if all of our control mice were female and all of the treatment mice were male, then our treatment effect would be confounded by sex. We could not differentiate the effect of treatment from the effect of sex.
To AVOID confounding:
- Ensure animals in each condition are all the same sex, age, litter, and batch, if possible.
- If that is not possible, split the animals equally between conditions (for example, the same number of males and females in each condition).
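A check for complete confounding can be run on the sample metadata before any sequencing is done. The sketch below is one possible approach, not a standard tool: the function name `confounded_covariates` and the metadata fields (`condition`, `sex`, `age_weeks`) are hypothetical. It flags a covariate when every one of its levels occurs in only one condition, as in the all-female control and all-male treatment example above.

```python
from collections import defaultdict

def confounded_covariates(samples, covariates, condition_key="condition"):
    """Return the covariates that are completely confounded with condition.

    A covariate is flagged when it has more than one level and each level
    appears in exactly one condition, so its effect cannot be separated
    from the effect of condition.
    """
    flagged = []
    for cov in covariates:
        conditions_by_level = defaultdict(set)
        for sample in samples:
            conditions_by_level[sample[cov]].add(sample[condition_key])
        has_several_levels = len(conditions_by_level) > 1
        each_level_in_one_condition = all(
            len(conds) == 1 for conds in conditions_by_level.values()
        )
        if has_several_levels and each_level_in_one_condition:
            flagged.append(cov)
    return flagged

# Hypothetical metadata: sex is confounded with condition, age is held constant
mice = [
    {"condition": "control", "sex": "F", "age_weeks": 8},
    {"condition": "control", "sex": "F", "age_weeks": 8},
    {"condition": "treatment", "sex": "M", "age_weeks": 8},
    {"condition": "treatment", "sex": "M", "age_weeks": 8},
]
print(confounded_covariates(mice, ["sex", "age_weeks"]))
```

This only detects complete confounding. A design that is merely unbalanced (say, three females and one male in one condition) passes the check but is still worth fixing.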
Batch effects
Batch effects are a significant issue for RNA-Seq analyses, because expression can differ significantly between samples due solely to batch.
Image credit: Hicks SC, et al., bioRxiv (2015)
How do you know whether you have batches?
- Were all RNA isolations performed on the same day?
- Were all library preparations performed on the same day?
- Did the same person perform the RNA isolation/library preparation for all samples?
- Did you use the same reagents/kits for all samples?
- Did you perform the RNA isolation/library preparation in the same location?
If any of the answers is ‘No’, then you have batches.
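The same questions can be asked of a metadata table. The sketch below assumes each sample record stores one field per question; the field names (`rna_isolation_date`, `library_prep_date`, `operator`, `kit_lot`, `location`) and the helper functions are hypothetical. Any processing variable that takes more than one value across samples corresponds to a "No" answer.

```python
PROCESSING_KEYS = ["rna_isolation_date", "library_prep_date", "operator", "kit_lot", "location"]

def batch_variables(samples, processing_keys=PROCESSING_KEYS):
    """Return the processing variables that differ between samples."""
    return [key for key in processing_keys if len({s[key] for s in samples}) > 1]

def batch_labels(samples, processing_keys=PROCESSING_KEYS):
    """Label each sample by its combination of processing values.
    Samples sharing a label were handled identically and form one batch."""
    return {s["sample"]: tuple(s[key] for key in processing_keys) for s in samples}

# Hypothetical metadata: only the RNA isolation day differs between samples
shared = {"library_prep_date": "day3", "operator": "A", "kit_lot": "lot1", "location": "lab1"}
samples = [
    {"sample": "ctrl_1", "rna_isolation_date": "day1", **shared},
    {"sample": "treat_1", "rna_isolation_date": "day1", **shared},
    {"sample": "ctrl_2", "rna_isolation_date": "day2", **shared},
    {"sample": "treat_2", "rna_isolation_date": "day2", **shared},
]

print("Variables that create batches:", batch_variables(samples))
print("Number of batches:", len(set(batch_labels(samples).values())))
```

An empty list from `batch_variables` means every answer was "Yes" for the recorded variables. It cannot rule out batch effects from factors that were never written down.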
Best practices regarding batches:
- Design the experiment from start to finish to avoid batches, if possible. If you are unsure what could introduce a batch effect, talk with a biostatistics consultant before starting the experiment.
- If unable to avoid batches:
  - Do NOT confound your experiment by batch:
    Image credit: Hicks SC, et al., bioRxiv (2015)
  - DO split replicates of the different sample groups across batches. The more replicates the better (definitely 3 or more).
    Image credit: Hicks SC, et al., bioRxiv (2015)
  - DO include batch information in your experimental metadata. If we have that information, we can regress out the variation due to batch during the analysis so that it does not affect our results.
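The toy example below illustrates why these practices matter. It is a noise-free sketch with made-up numbers (a baseline of 10, a treatment effect of 2, and a shift of 5 for batch B), and it averages within-batch differences as a simple stand-in for including batch as a term in the statistical model. All function names are hypothetical.

```python
from statistics import fmean

def simulate(design, baseline=10.0, treatment_effect=2.0, batch_shift=5.0):
    """design is a list of (condition, batch) pairs; returns rows of
    (condition, batch, expression) with no noise added."""
    return [
        (cond, batch,
         baseline + treatment_effect * (cond == "treated") + batch_shift * (batch == "B"))
        for cond, batch in design
    ]

def naive_effect(rows):
    """Treated minus control, ignoring batch entirely."""
    treated = [y for cond, _, y in rows if cond == "treated"]
    control = [y for cond, _, y in rows if cond == "control"]
    return fmean(treated) - fmean(control)

def batch_adjusted_effect(rows):
    """Average of the treated-minus-control differences within each batch.
    Returns None when no batch contains both conditions (confounded design)."""
    diffs = []
    for batch in sorted({b for _, b, _ in rows}):
        treated = [y for cond, b, y in rows if b == batch and cond == "treated"]
        control = [y for cond, b, y in rows if b == batch and cond == "control"]
        if treated and control:
            diffs.append(fmean(treated) - fmean(control))
    return fmean(diffs) if diffs else None

designs = {
    # Each condition split equally across batches
    "balanced": [("control", "A"), ("treated", "A")] * 2 + [("control", "B"), ("treated", "B")] * 2,
    # Both conditions in both batches, but unequally
    "unbalanced": [("control", "A")] * 3 + [("treated", "A")]
                  + [("control", "B")] + [("treated", "B")] * 3,
    # All controls in batch A, all treated samples in batch B
    "confounded": [("control", "A")] * 4 + [("treated", "B")] * 4,
}

for name, design in designs.items():
    rows = simulate(design)
    print(name, "naive:", naive_effect(rows), "adjusted:", batch_adjusted_effect(rows))
```

With these numbers, the balanced design recovers the treatment effect of 2 whether or not batch is considered. The unbalanced design gives a biased naive estimate that the recorded batch labels can correct. In the confounded design, the treatment and batch effects are added together and no adjustment can separate them.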