Installing and using UMAP
Several dimensionality reduction techniques can help visualize the cell clusters. The most popular are t-distributed stochastic neighbor embedding (t-SNE) and Uniform Manifold Approximation and Projection (UMAP).
The Seurat package provides a function for UMAP visualization, RunUMAP(); however, it requires the user to first install the umap-learn Python package. To install this package, follow the instructions below.
Getting Going with Python
These materials were modified from https://unidata.github.io/online-python-training/conda-windows.html.
The aim of this web page is to help you get started with Python. We will explain what a package management tool is, show how to download the conda package management tool via the Anaconda installer, and guide you to the Mac/Windows command prompt so that you can use conda from the command line. Finally, we will wrap up by installing one library with conda.
What Is a Package Management Tool?
A package management tool is a software application that helps you manage software libraries that enable you to get your work done. These software libraries may relate to plotting for scientific publication or accessing certain kinds of data, for example.
When you start using Python, you will want to use software libraries that are not part of the standard Python installation. For example, we wish to use umap-learn for cluster visualization. Anaconda, from Anaconda, Inc. (formerly Continuum Analytics), will help you install umap-learn easily.
Installing the conda Package Management Tool
Before we install conda, close R and RStudio.
The conda package management tool is part of the Anaconda software package. Install conda by navigating to the Anaconda download page. Scroll down to choose a tab for the OS of your computer:
Download Anaconda (which includes Python) by clicking on the “64-bit Graphical Installer” link. It is a big download, so it is best to be on a fast network. Open the installer file you just downloaded. It should be named something like Anaconda[version]-Windows-x86_64.
The installer will guide you through the conda installation.
For Mac OS, the installation will automatically make Anaconda the default Python, which is great.
For Windows OS, the last step of the installation process will ask you if you want to add Anaconda to the PATH environment variable and whether you would like to make this your default Python. Ensure both options are checked.
The following warning will also pop up on Windows if you have Miniconda3 or another Python installed, to which you should select OK (unless you need this other Python for other purposes):
Command prompt
Mac command prompt
Use Spotlight to search for the Terminal program. When it opens, you will see a prompt that looks something like: Janes-iMac:~ janedoe$. This is known as the command line. The command line is where you give text instructions to your computer.
Windows command prompt
The Windows Command Prompt is a program that lets you give text-based instructions to your computer. It ships with Windows; the Anaconda installer additionally adds an Anaconda Prompt entry to the Start menu.
To open the Windows Command Prompt, left-click on the Windows Start menu located in the lower left portion of the desktop and search for ‘Command Prompt’ in the search box located at the bottom of the menu.
In the Windows Command Prompt, you will see some text such as C:\Users\Jane>. This is known as the command line. The command line is where you give text instructions to your computer.
Interacting with conda
Let’s make sure conda is installed by entering this instruction on the command line:
conda list
yields
# packages in environment at C:\Users\Jane\AppData\Local\Continuum\Anaconda3:
#
alabaster 0.7.7 py35_0
anaconda 4.0.0 np110py35_0
anaconda-client 1.4.0 py35_0
...
numexpr 2.5 np110py35_0
numpy 1.10.4 py35_0
odo 0.4.2 py35_0
...
yaml 0.1.6 0
zeromq 4.1.3 0
zlib 1.2.8 0
This command lists the packages linked into the current conda environment. You’ll notice libraries such as the scientific computing library numpy that you will probably be making use of.
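If conda list fails with a "command not found" style error, the usual cause is that conda is not on the PATH. The check can also be scripted. The following is a minimal Python sketch, not part of the original materials; the helper names find_conda and conda_version are hypothetical, and it assumes only that the executable is called conda.

```python
import shutil
import subprocess


def find_conda():
    """Return the full path of the conda executable found on PATH, or None."""
    return shutil.which("conda")


def conda_version(conda_path):
    """Ask the given executable for its version string; None if the call fails."""
    try:
        result = subprocess.run(
            [conda_path, "--version"],
            capture_output=True,
            text=True,
            check=True,
        )
    except (OSError, subprocess.CalledProcessError):
        return None
    # Some older conda releases printed the version to stderr.
    return result.stdout.strip() or result.stderr.strip()


if __name__ == "__main__":
    path = find_conda()
    if path is None:
        print("conda was not found on PATH; revisit the installer's PATH option.")
    else:
        print("conda found at:", path)
        print("version:", conda_version(path))
```

If the script reports that conda was not found, reopening the terminal (so that it picks up the updated PATH) is worth trying before reinstalling.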
Install umap-learn with conda
We first have to tell conda where to find umap-learn, which is distributed on the conda-forge channel.
conda config --add channels conda-forge
We can now install umap-learn:
conda install -c conda-forge umap-learn
You should receive the warning:
The following packages will be SUPERCEDED by a higher priority channel:
certifi
conda
You will be asked y/n, and you should type y.
Let’s verify we installed umap-learn with the following command:
conda list
should yield amongst other libraries:
# packages in environment at /Users/Jane/anaconda:
#
...
umap-learn 0.3.9 py37_0 conda-forge
...
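The same verification can be done from inside Python, which also confirms that the interpreter you are running is the one conda installed into. This is a minimal sketch under stated assumptions: the helper package_status is hypothetical, it needs Python 3.8 or newer for importlib.metadata, and it relies on the fact that the umap-learn distribution is imported under the module name umap.

```python
from importlib import metadata, util


def package_status(import_name, dist_name):
    """Return (importable, version) for a module name and its distribution name.

    importable is True when the module can be located by this interpreter;
    version is the installed distribution's version string, or None if the
    distribution is not installed.
    """
    importable = util.find_spec(import_name) is not None
    try:
        version = metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        version = None
    return importable, version


if __name__ == "__main__":
    # umap-learn is the distribution name; umap is the name used in "import umap".
    importable, version = package_status("umap", "umap-learn")
    if importable:
        print("umap-learn is importable, version:", version)
    else:
        print("umap-learn is not visible to this Python interpreter.")
```

If conda list shows umap-learn but this script reports it as not visible, the script is probably being run by a different Python than the Anaconda one.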
Now we can restart our R session, load our libraries, and continue with running the RunUMAP() function.
NOTE: If your R session crashes on Mac OS, then you can try running the following in R:
library(reticulate)
use_python(python = "~/Anaconda3/bin/python", required = TRUE)
Then try re-running RunUMAP().
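The path given to use_python() in the note above is only an example; the Anaconda install location varies between machines. One way to find the right value is to ask the Anaconda Python where it lives. The sketch below is an illustration rather than part of the original materials, and the helper name interpreter_path is hypothetical.

```python
import sys


def interpreter_path():
    """Return the path of the Python interpreter running this code.

    Falls back to an empty string if Python cannot determine the path.
    """
    return sys.executable or ""


if __name__ == "__main__":
    # Run this with the Anaconda Python, then use the printed path
    # as the python argument of use_python() in R.
    print(interpreter_path())
```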