Anatomy of a Seurat Object

category_1
category_2
category_3
category_4

Write a description of the lesson here.

Author

Noor Sohail

Published

April 15, 2026

Keywords

keyword_1, keyword_2, keyword_3, keyword_4, keyword_5, keyword_6

Approximate time: XX minutes

Learning objectives

In this lesson, we will:

  • Learning Objective 1
  • Learning Objective 2
  • Learning Objective 3

Overview of lesson

When doing XYZ…

Load dataset

library(Seurat)
library(tidyverse)
crc <- Load10X_Spatial(data.dir = "data/P5CRC_cropped/",
                       bin.size = c(8, 16),
                       slice = "P5CRC")

Anatomy of a Seurat object

As we can see from our Seurat callout, there are a lot of different slots inside our object. Here, we will go through each of the major components of a Seurat object and how you would access key pieces of information.

Assays

Assays are where we can store different counts matrices; we are not forced to keep the same features and variable genes across assays. Each assay contains its own Layers, which can be distinct from those of other assays in the object. This is useful in several cases:

  • Multi-modal assays, where you can keep the expression matrices for RNA, ATAC, or protein in a single Seurat object
  • Storing counts matrices from a variety of different normalization techniques
  • Batch integration methods will sometimes generate a transformed counts matrix

Here we can print the different assays that exist within our crc object:

Assays(crc)
[1] "Spatial.008um" "Spatial.016um"

We have 2 distinct assays, one for each bin size. This makes sense because each bin size produces its own count matrix, with a different set of bins as its columns.
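To illustrate that assays in one object need not share features, here is a minimal sketch on a toy object rather than on crc. The gene, protein, and cell names and the assay names "RNA" and "ADT" are all hypothetical, and the counts are randomly generated.

```r
library(Seurat)
library(Matrix)

set.seed(42)
cells <- paste0("s_toy_", 1:6)  # hypothetical cell names

# Hypothetical RNA counts: 20 genes x 6 cells
rna <- Matrix(rpois(20 * 6, lambda = 2), nrow = 20, sparse = TRUE,
              dimnames = list(paste0("gene", 1:20), cells))

# Hypothetical protein counts: 3 proteins x the same 6 cells
adt <- Matrix(rpois(3 * 6, lambda = 5), nrow = 3, sparse = TRUE,
              dimnames = list(paste0("protein", 1:3), cells))

# Create the object from the RNA counts, then attach a second assay
toy <- CreateSeuratObject(counts = rna, assay = "RNA")
toy[["ADT"]] <- CreateAssay5Object(counts = adt)

Assays(toy)         # names of both assays
nrow(toy[["RNA"]])  # number of features in the RNA assay
nrow(toy[["ADT"]])  # number of features in the ADT assay
```

Each assay keeps its own feature set, which is the same mechanism that lets crc hold the 8um and 16um matrices side by side.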

The DefaultAssay() function shows us which assay information will be used in other Seurat function calls, unless explicitly specified otherwise.

DefaultAssay(crc)
[1] "Spatial.008um"

We can also change what our default assay is. Let’s set it to the 016um bins:

DefaultAssay(crc) <- "Spatial.016um"
crc
An object of class Seurat 
36170 features across 97570 samples within 2 assays 
Active assay: Spatial.016um (18085 features, 0 variable features)
 1 layer present: counts
 1 other assay present: Spatial.008um
 2 spatial fields of view present: P5CRC.008um P5CRC.016um

Now we see that the callout says: Active assay: Spatial.016um
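The effect of the default assay can be sketched on a small toy object. Everything here (the helper make_counts(), the feature and cell names, the assay names) is hypothetical and only meant to show that accessors follow the default assay unless an assay is named explicitly.

```r
library(Seurat)
library(Matrix)

set.seed(1)
cells <- paste0("s_toy_", 1:4)  # hypothetical cell names

# Hypothetical helper: random counts for n features across the toy cells
make_counts <- function(prefix, n) {
  Matrix(rpois(n * length(cells), lambda = 3), nrow = n, sparse = TRUE,
         dimnames = list(paste0(prefix, seq_len(n)), cells))
}

toy <- CreateSeuratObject(counts = make_counts("gene", 10), assay = "RNA")
toy[["ADT"]] <- CreateAssay5Object(counts = make_counts("protein", 2))

DefaultAssay(toy)    # the assay the object was created with
Features(toy)        # features of the default assay

DefaultAssay(toy) <- "ADT"
Features(toy)        # now the features of the ADT assay

# Naming the assay explicitly overrides the default
dim(LayerData(toy, assay = "RNA", layer = "counts"))
```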

Features and Cells

Our count matrices function as any other matrix does, with rows and columns.

In Seurat, the rows correspond to Features. In the case of spatial transcriptomics, our features are genes. In other experiments, features could refer to chromatin peaks or proteins. The important thing to keep in mind is what technology you are using. Since this is a Visium HD dataset, we are quantifying RNA expression (genes).

We can see what the first few genes/features in our count matrix are:

Features(crc) %>% head()
[1] "SAMD11"  "NOC2L"   "KLHL17"  "PLEKHN1" "PERM1"   "HES4"   

As well as see the number of genes that are found in each of our assays:

nrow(crc[["Spatial.008um"]])
[1] 18085
nrow(crc[["Spatial.016um"]])
[1] 18085

The columns correspond to Cells (or samples as it appears in the callout). We can see what the first few cells in our count matrix are:

Cells(crc) %>% head()
[1] "s_016um_00050_00315-1" "s_016um_00064_00214-1" "s_016um_00101_00317-1"
[4] "s_016um_00049_00195-1" "s_016um_00032_00133-1" "s_016um_00061_00268-1"

As well as see the number of cells that are found in each of our assays:

ncol(crc[["Spatial.008um"]])
[1] 77896
ncol(crc[["Spatial.016um"]])
[1] 19674

  1. What differences do you see between the 8um and 16um bins?

Layers

Layers are the individual count matrices stored within an assay. Layers() lists the layers present in the default assay:

Layers(crc)
[1] "counts"

By default, Seurat uses the following naming convention for the counts matrices within an Assay:

Table 1: Description of each Layer() count matrix
Layer        Description
counts       Raw counts
data         Normalized counts
scale.data   Scaled-normalized counts

You may notice that our Seurat object only contains counts right now. This is because we have not run any normalization steps yet (we will discuss how to do so in future lessons).
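As a preview, the sketch below shows how a second layer appears after normalization. It uses a toy object with randomly generated counts and hypothetical names, not crc, and it calls NormalizeData() with its default settings; the settings used later in this course may differ.

```r
library(Seurat)
library(Matrix)

set.seed(7)

# Hypothetical counts: 20 genes x 6 cells
rna <- Matrix(rpois(20 * 6, lambda = 2), nrow = 20, sparse = TRUE,
              dimnames = list(paste0("gene", 1:20), paste0("s_toy_", 1:6)))
toy <- CreateSeuratObject(counts = rna, assay = "RNA")

Layers(toy)   # only the raw counts so far

# Normalizing adds a "data" layer next to "counts"
toy <- NormalizeData(toy, verbose = FALSE)
Layers(toy)

# The normalized values are no longer whole numbers
LayerData(toy, layer = "data")[1:3, 1:3]
```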

Using the LayerData() function we can access the entire counts matrix. Furthermore, we can specify the assay if we would prefer to not use the DefaultAssay.

LayerData(crc, 
          assay = "Spatial.016um", 
          layer = "counts")[1:5, 1:5]
Table 2: First 5 rows and columns of raw counts matrix
         s_016um_00050_00315-1  s_016um_00064_00214-1  s_016um_00101_00317-1  s_016um_00049_00195-1  s_016um_00032_00133-1
SAMD11   0                      0                      0                      0                      0
NOC2L    0                      0                      0                      0                      0
KLHL17   0                      0                      0                      0                      0
PLEKHN1  0                      0                      0                      0                      0
PERM1    0                      0                      0                      0                      0

We print only the first 5 features and cells in our object for easier visualization. We can see that we are working with whole numbers, which reinforces the idea that this is the raw data, with no transformations applied.
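Rather than judging by eye from a 5 x 5 corner, the whole-number property can be checked across an entire matrix. This is a minimal sketch on a toy object with hypothetical names and random counts; converting to a dense matrix with as.matrix() is only reasonable here because the toy matrix is tiny.

```r
library(Seurat)
library(Matrix)

set.seed(3)

# Hypothetical counts: 20 genes x 6 cells
rna <- Matrix(rpois(20 * 6, lambda = 1), nrow = 20, sparse = TRUE,
              dimnames = list(paste0("gene", 1:20), paste0("s_toy_", 1:6)))
toy <- CreateSeuratObject(counts = rna, assay = "RNA")

counts_mat <- LayerData(toy, assay = "RNA", layer = "counts")
dim(counts_mat)    # features x cells

# Dense copy for simple checks (avoid this on a full-size dataset)
dense <- as.matrix(counts_mat)

all(dense == round(dense))  # TRUE when every entry is a whole number
mean(dense == 0)            # fraction of entries that are zero
```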

Spatial fields

We do not just have expression data associated with our sample; we also have the spatial slide, which comes with its own set of values and information.

For example, we can grab the x,y coordinates of each bin using the GetTissueCoordinates() function.

GetTissueCoordinates(crc) %>% View()
Table 3: Coordinates on spatial slide for each cell
                       x         y         cell
s_016um_00050_00315-1  62715.76  61103.62  s_016um_00050_00315-1
s_016um_00064_00214-1  56816.46  60261.12  s_016um_00064_00214-1
s_016um_00101_00317-1  62844.88  58123.60  s_016um_00101_00317-1
s_016um_00049_00195-1  55702.46  61133.16  s_016um_00049_00195-1
s_016um_00032_00133-1  52074.99  62111.72  s_016um_00032_00133-1

Or visualize what our slide looks like with SpatialDimPlot():

SpatialDimPlot(crc, pt.size.factor = 15)
Figure 1: Spatial visualization of spots with SpatialDimPlot()

Metadata

Seurat automatically creates some metadata for each of the cells when the object is created. This information is stored in the @meta.data slot within the Seurat object. The rownames are automatically set to be the cell names.

crc@meta.data %>% View()
Table 4: Seurat default @meta.data
                       orig.ident  nCount_Spatial.008um  nFeature_Spatial.008um  nCount_Spatial.016um  nFeature_Spatial.016um
s_008um_00078_00444-1  s           65                    57                      NA                    NA
s_008um_00128_00278-1  s           1300                  906                     NA                    NA
s_008um_00052_00559-1  s           128                   121                     NA                    NA
s_008um_00121_00413-1  s           538                   326                     NA                    NA
s_008um_00167_00326-1  s           44                    39                      NA                    NA

What does each column represent?

Table 5: Columns automatically populated in @meta.data
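Column             Description
orig.ident         Sample identity if known; “s” by default in this object
nCount_<assay>     Number of UMIs per cell in that assay (e.g. nCount_Spatial.008um)
nFeature_<assay>   Number of genes detected per cell in that assay (e.g. nFeature_Spatial.008um)

Each bin belongs to only one assay, so its nCount and nFeature columns for the other assay are NA.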

While it may seem intimidating at first, the important thing to remember is that this is a dataframe. Therefore, we can modify and work with it just like we would any other dataframe in R! For example, we can set our orig.ident column to be our sample name rather than “s”.

crc@meta.data$orig.ident <- "P5CRC"
crc@meta.data %>% View()
Table 6: Seurat default @meta.data after updating orig.ident
                       orig.ident  nCount_Spatial.008um  nFeature_Spatial.008um  nCount_Spatial.016um  nFeature_Spatial.016um
s_008um_00078_00444-1  P5CRC       65                    57                      NA                    NA
s_008um_00128_00278-1  P5CRC       1300                  906                     NA                    NA
s_008um_00052_00559-1  P5CRC       128                   121                     NA                    NA
s_008um_00121_00413-1  P5CRC       538                   326                     NA                    NA
s_008um_00167_00326-1  P5CRC       44                    39                      NA                    NA

Additionally, we do not have to use @meta.data each time we want to access a single column. We can use $ followed by the column name as a shorthand.

crc$nCount_Spatial.008um %>% head()
s_008um_00078_00444-1 s_008um_00128_00278-1 s_008um_00052_00559-1 
                   65                  1300                   128 
s_008um_00121_00413-1 s_008um_00167_00326-1 s_008um_00202_00633-1 
                  538                    44                   365 
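To make the meaning of the automatic columns concrete, the sketch below recomputes them by hand on a toy object and then adds a new column with the $ shorthand. The object, its names, and the column log10_nCount are hypothetical; because the toy assay is called "RNA", the automatic columns are named nCount_RNA and nFeature_RNA.

```r
library(Seurat)
library(Matrix)

set.seed(11)

# Hypothetical counts: 20 genes x 6 cells
rna <- Matrix(rpois(20 * 6, lambda = 2), nrow = 20, sparse = TRUE,
              dimnames = list(paste0("gene", 1:20), paste0("s_toy_", 1:6)))
toy <- CreateSeuratObject(counts = rna, assay = "RNA")

counts_mat <- as.matrix(LayerData(toy, assay = "RNA", layer = "counts"))

# nCount is the column sum; nFeature is the number of non-zero entries
manual_ncount   <- unname(colSums(counts_mat))
manual_nfeature <- unname(colSums(counts_mat > 0))

all.equal(as.numeric(toy$nCount_RNA), manual_ncount)
all.equal(as.numeric(toy$nFeature_RNA), manual_nfeature)

# The $ shorthand also works for adding a new metadata column
toy$log10_nCount <- log10(toy$nCount_RNA)
head(toy@meta.data)
```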

Idents

Cell identities are accessed with Idents(), which holds the default labels used for cells. For example, to label each cell by the sample it came from, we could run:

Idents(crc) <- "orig.ident"
Idents(crc) %>% head()
s_008um_00078_00444-1 s_008um_00128_00278-1 s_008um_00052_00559-1 
                P5CRC                 P5CRC                 P5CRC 
s_008um_00121_00413-1 s_008um_00167_00326-1 s_008um_00202_00633-1 
                P5CRC                 P5CRC                 P5CRC 
Levels: P5CRC

Here we see that the identity of each cell is the value stored in the @meta.data column orig.ident.
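Identities become more informative once a metadata column has more than one value. The following sketch assumes a toy object and a hypothetical region column with made-up labels ("tumor" and "stroma"); it is not derived from the crc data.

```r
library(Seurat)
library(Matrix)

set.seed(5)

# Hypothetical counts: 20 genes x 6 cells
rna <- Matrix(rpois(20 * 6, lambda = 2), nrow = 20, sparse = TRUE,
              dimnames = list(paste0("gene", 1:20), paste0("s_toy_", 1:6)))
toy <- CreateSeuratObject(counts = rna, assay = "RNA")

# Hypothetical annotation: first three cells "tumor", last three "stroma"
toy$region <- rep(c("tumor", "stroma"), each = 3)

# Use the metadata column as the active identity
Idents(toy) <- "region"

levels(Idents(toy))  # the distinct identity classes
table(Idents(toy))   # number of cells per identity
```

Any column of @meta.data can be promoted to the active identity in this way.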



Reuse

CC-BY-4.0