Normalization and Sketch Downsampling - Answer Key

Author

Noor Sohail

Published

July 22, 2025

Exercise 1

  1. Using GeneCards, look up one of the top genes and read more about its role and how it could possibly relate to CRC.
| Gene | GeneCards description | Marker |
|---|---|---|
| IGHG1, IGLC1, IGLC7, IGHG3, IGHA1 | membrane-bound immunoglobulins serve as receptors which, upon binding of a specific antigen, trigger the clonal expansion and differentiation of B lymphocytes into immunoglobulins-secreting plasma cells | B cell -> plasma cell marker |
| IGHM | antigen recognition molecules of B cells | B cell marker |
| SST | Somatostatin also affects rates of neurotransmission in the central nervous system and proliferation of both normal and tumorigenic cells. | ? |
| INSL5 | May have a role in gut contractility | ? |
| CHGA | It is found in secretory vesicles of neurons and endocrine cells | Neuron + endocrine celltype marker |
| CXCL8 | ? | ? |
| GCG | secreted from gut endocrine cells and promote nutrient absorption | ? |
| HBA2 | ? | ? |
| VIP | It stimulates myocardial contractility, causes vasodilation, increases glycogenolysis, lowers arterial blood pressure and relaxes the smooth muscle of trachea, stomach and gall bladder | ? |
| MMP12 | Macrophage Metalloelastase | Macrophage marker |
| PYY | secreted by endocrine cells in the gut | Endocrine marker |

You might not always be able to identify the exact role of a gene at this point. However, we now have a better understanding of which genes will be driving the downstream steps of our analysis.

  2. Visualize one of the top variable genes on the spatial slide to see if expression varies across the dataset.
SpatialFeaturePlot(seurat_sketch,
                   features = "IGHG1",
                   image.alpha = 0,
                   pt.size.factor = 15)

We can see that IGHG1 has patches of bins with very high expression, whereas in other areas its expression is near zero. This confirms that the gene is variably expressed across the dataset.
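The visual impression of patchy expression can also be checked numerically. The following is a minimal sketch, not part of the original lesson: `summarize_expression()` is a hypothetical helper introduced here for illustration, and the snippet assumes that `IGHG1` is present in the default assay of `seurat_sketch` and that `FetchData()` returns its per-bin values in the first column.

```r
# Hypothetical helper (not from the lesson): summarize how unevenly
# a gene is expressed across bins.
summarize_expression <- function(x) {
  c(
    frac_zero = mean(x == 0),                       # fraction of bins with zero expression
    median    = median(x),                          # typical expression level
    p99       = unname(quantile(x, probs = 0.99)),  # expression in the top 1% of bins
    max       = max(x)                              # highest expression in any bin
  )
}

# Only runs if the lesson's object is loaded in the session.
if (exists("seurat_sketch")) {
  # Assumption: IGHG1 is found in the default assay of seurat_sketch.
  expr <- Seurat::FetchData(seurat_sketch, vars = "IGHG1")
  print(summarize_expression(expr[[1]]))
}
```

A large fraction of zeros combined with a high 99th percentile would be consistent with the pattern seen on the slide: most bins have little or no expression, while a small number of bins express the gene strongly.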

SpatialFeaturePlot(seurat_sketch,
                   features = "IGHM",
                   image.alpha = 0,
                   pt.size.factor = 15)

Similarly, we can see that IGHM, which is a B-cell marker gene, is expressed selectively across the dataset.

Reuse

CC-BY-4.0