Seurat Cheatsheet - Answer Key

Exercise 1

What are the last 5 cell barcodes and the last 5 genes in the integrated Seurat object?

# Barcodes
Cells(seurat_integrated) %>% tail()
[1] "stim_TTTGCATGCGACAT-1" "stim_TTTGCATGCTAAGC-1" "stim_TTTGCATGGGACGA-1"
[4] "stim_TTTGCATGGTGAGG-1" "stim_TTTGCATGGTTTGG-1" "stim_TTTGCATGTCTTAC-1"
# Barcodes (equivalent, using colnames())
colnames(seurat_integrated) %>% tail()
[1] "stim_TTTGCATGCGACAT-1" "stim_TTTGCATGCTAAGC-1" "stim_TTTGCATGGGACGA-1"
[4] "stim_TTTGCATGGTGAGG-1" "stim_TTTGCATGGTTTGG-1" "stim_TTTGCATGTCTTAC-1"
# Genes
Features(seurat_integrated) %>% tail()
[1] "PNPT1"  "ZFAND1" "GNB5"   "GSN"    "MRPL23" "BAK1"  
# Genes (equivalent, using rownames())
rownames(seurat_integrated) %>% tail()
[1] "PNPT1"  "ZFAND1" "GNB5"   "GSN"    "MRPL23" "BAK1"  
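Note that tail() returns the last six elements by default, which is why six values are printed above. To return exactly five, pass n = 5. A minimal sketch in base R, using a toy vector (the six barcodes copied from the output above) as a stand-in for Cells(seurat_integrated):

```r
# Toy stand-in for Cells(seurat_integrated): the six barcodes printed above.
barcodes <- c(
  "stim_TTTGCATGCGACAT-1", "stim_TTTGCATGCTAAGC-1", "stim_TTTGCATGGGACGA-1",
  "stim_TTTGCATGGTGAGG-1", "stim_TTTGCATGGTTTGG-1", "stim_TTTGCATGTCTTAC-1"
)

# tail() keeps the last 6 elements unless n is given.
default_tail <- tail(barcodes)

# Ask for exactly the last 5 elements.
last_five <- tail(barcodes, n = 5)
print(last_five)
```

The same argument works in a pipe, for example Cells(seurat_integrated) %>% tail(n = 5).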

Exercise 2

What are the last 5 identities for the cells in the integrated Seurat object?

Idents(seurat_integrated) %>% tail()
stim_TTTGCATGCGACAT-1 stim_TTTGCATGCTAAGC-1 stim_TTTGCATGGGACGA-1 
    Activated T cells                     6                     6 
stim_TTTGCATGGTGAGG-1 stim_TTTGCATGGTTTGG-1 stim_TTTGCATGTCTTAC-1 
                   15                     0     Activated T cells 
16 Levels: CD14+ monocytes Activated T cells 0 4 5 6 7 8 9 10 11 12 13 ... 16

Exercise 3

What are the 5 least variable genes in the integrated Seurat object?

VariableFeatures(seurat_integrated) %>% tail()
[1] "PNPT1"  "ZFAND1" "GNB5"   "GSN"    "MRPL23" "BAK1"  
Note

You may have noticed that these are the same genes we saw when looking at the last genes in our Seurat object. This is because we ran integration on the top 3,000 variable genes, so the integrated assay contains only those genes. As a result, the order of the genes in the integrated dataset follows the same order as the variable features!
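One way to check this is to compare the two gene lists directly. The sketch below assumes that seurat_integrated is already loaded in your session and that its default assay is the integrated assay. The comparison is TRUE only if both lists contain the same genes in the same order.

```r
library(Seurat)

# Assumption: `seurat_integrated` is the integrated object from the lessons.
# Confirm which assay rownames() and VariableFeatures() will read from.
DefaultAssay(seurat_integrated)

# TRUE only when both vectors hold the same genes in the same order.
same_order <- identical(
  rownames(seurat_integrated),
  VariableFeatures(seurat_integrated)
)
same_order
```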

Exercise 4

What are the dimensions of each assay in the integrated Seurat object?

dim(seurat_integrated[["RNA"]])
[1] 14065 29629
dim(seurat_integrated[["SCT"]])
[1] 14065 29629
dim(seurat_integrated[["integrated"]])
[1]  3000 29629
Note

Notice that the number of genes is higher in the RNA and SCT assays (14,065) than in the integrated assay (3,000). This goes back to the previous note: only the expression of the variable genes is stored in the integrated assay.
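Rather than typing one dim() call per assay, the same comparison can be sketched as a loop over the assay names. This assumes seurat_integrated is loaded; the names assay_names and assay_dims are only illustrative.

```r
library(Seurat)

# Assumption: `seurat_integrated` is the integrated object from the lessons.
# Assays() returns the assay names, e.g. "RNA", "SCT", "integrated".
assay_names <- Assays(seurat_integrated)

# dim() gives c(genes, cells) for each assay; sapply() binds them as columns.
assay_dims <- sapply(assay_names, function(a) dim(seurat_integrated[[a]]))
rownames(assay_dims) <- c("genes", "cells")
assay_dims
```

The result is a small matrix with one column per assay, which makes the difference in gene counts easy to see side by side.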

Exercise 5

Show the code to get the entire SCT normalized (data) count matrix.

LayerData(seurat_integrated, assay="SCT", layer="data")
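The full matrix is large (every gene by every cell in the SCT assay), so printing it whole is rarely useful. A minimal sketch for inspecting it, assuming seurat_integrated is loaded; the variable name sct_data is only illustrative.

```r
library(Seurat)

# Assumption: `seurat_integrated` is the integrated object from the lessons.
sct_data <- LayerData(seurat_integrated, assay = "SCT", layer = "data")

# Check the storage class and size before printing anything.
class(sct_data)
dim(sct_data)

# Look at the first 5 genes and first 5 cells only.
sct_data[1:5, 1:5]
```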

Exercise 6

Show how you would use the FetchData() function to generate a data frame of UMAP_1, UMAP_2, and sample values for each cell.

FetchData(seurat_integrated, vars=c("UMAP_1", "UMAP_2", "sample")) %>% head()
                          UMAP_1     UMAP_2 sample
ctrl_AAACATACAATGCC-1   7.270473  0.9072988   ctrl
ctrl_AAACATACATTTCC-1  -8.742020  1.5622634   ctrl
ctrl_AAACATACCAGAAA-1 -10.032904  4.7139827   ctrl
ctrl_AAACATACCAGCTA-1  -8.363044  5.0377137   ctrl
ctrl_AAACATACCATGCA-1   6.875784 -4.6442526   ctrl
ctrl_AAACATACCTCGCT-1  -9.338899  2.2808882   ctrl
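Because FetchData() returns an ordinary data frame with one row per cell, it can be saved and summarized with base R. A short sketch, assuming seurat_integrated is loaded and that the UMAP columns are named UMAP_1 and UMAP_2 as in the output above (a different reduction key would change these names); umap_df is an illustrative name.

```r
library(Seurat)

# Assumption: `seurat_integrated` is the integrated object from the lessons.
umap_df <- FetchData(seurat_integrated, vars = c("UMAP_1", "UMAP_2", "sample"))

# One row per cell, so the row count should equal the number of cells.
nrow(umap_df) == ncol(seurat_integrated)

# Number of cells per sample.
table(umap_df$sample)
```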