Loading and Installing Libraries
This lesson introduces Python libraries, showing how to install packages, import them into your environment and explore library functions to extend Python functionality for data analysis.
Import libraries, Package management, Bioconda, PyPI, Conda-forge
Approximate time: 45 minutes
Learning objectives
In this lesson, we will:
- Explain different ways to install external Python libraries
- Demonstrate how to load a library and how to find functions specific to a library
Overview of lesson
Much of the power and appeal of Python comes from its rich ecosystem of libraries. Instead of writing everything from scratch, you can install and import packages that handle tasks like reading CSV files, working with biological sequences, or creating plots. The open-source community has developed a vast array of libraries to support various domains and use cases. You do not have to reinvent the wheel every time you need to perform a common task.
In this lesson, you will learn how to install, load and explore libraries so you can tap into this broader ecosystem for your own work.
Libraries in Python
Libraries are collections of Python functions, data and compiled code in a well-defined format, created to add specific functionality. Just as we created our own functions in the previous lesson, other users have created packages of functions they have shared with the community in the form of libraries. These packages can be installed and loaded into your Python environment so you can use the functions that they contain.
A set of standard (or base) packages, known as the standard library, is included with every Python installation. These packages contain the basic functions that allow Python to work, along with modules for common tasks such as mathematics, simple statistics and reading files. For example, all of the functions that we have been using so far in our examples are available without installing anything extra.
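As a small illustration of what is available without installing anything, the sketch below uses two standard library modules, math and statistics. The list of values is made up for the example.

```python
# Standard library modules ship with Python, so no installation step is needed.
import math
import statistics

# Hypothetical example data
values = [2, 4, 4, 4, 5, 5, 7, 9]

mean_value = statistics.mean(values)       # arithmetic mean
spread = statistics.pstdev(values)         # population standard deviation
root = math.sqrt(16)                       # square root

print(mean_value, spread, root)
```

Anything beyond the standard library, such as pandas or scanpy, has to be installed before it can be imported.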
In Python, a package is a directory of modules (files of Python code), and library is an informal term for one or more packages distributed together. Note that the terms package and library are often used interchangeably, and there has been some discussion in the community about how to resolve this.
Channels for Python libraries
There are many different channels for Python libraries. You can think of these as different places where Python libraries are stored and can be accessed from. Some of the most commonly used channels include:
| channel | description |
|---|---|
| bioconda | Conda channel for bioinformatics tools and libraries (e.g. scanpy, anndata). |
| conda-forge | Community conda channel with many packages for science, data and general Python use (e.g. NumPy, pandas). |
| PyPI | Main online index for Python packages, used with pip to install and manage libraries (e.g. scipy, matplotlib). |
If you click on the “Channels” button from the “Environments” tab of the Anaconda Navigator, you can see the channels that are currently being used to search for packages. You can add additional channels here to search for packages that are not available in the default channels. For example, if you want to search for bioinformatics tools, you can add the bioconda channel.
Additionally, you can click on the “Not installed” filter to see a list of packages that are not currently installed in your environment. This can be a useful way to find new packages to explore and install.
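You can also check from within Python whether a library is available in the current environment. The following is a minimal sketch using the standard library's importlib; the helper name is_installed and the list of names checked are illustrative choices, not part of the lesson materials.

```python
from importlib.util import find_spec


def is_installed(module_name):
    """Return True if a top-level module can be found in the current environment."""
    return find_spec(module_name) is not None


# Names to check are examples only; the result depends on your environment.
for name in ["math", "numpy", "scanpy"]:
    status = "installed" if is_installed(name) else "not installed"
    print(f"{name}: {status}")
```

Note that this checks the name you would use in an import statement, which can differ from the name used when installing (for example, scikit-learn is imported as sklearn).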
Importing libraries
Now that we have installed the libraries we want to use, we need to import them into our Python environment in order to access their functions.
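The sketch below shows the common forms of the import statement and one way to list the functions a library provides. It uses the standard library's statistics module so that it runs in any environment; the same pattern should apply to installed libraries such as pandas.

```python
import statistics                 # load the whole module under its full name
import statistics as st           # load it under a shorter alias
from statistics import median     # load a single function directly

# Functions are reached through the module name (or alias) with a dot.
print(statistics.mean([1, 2, 3]))
print(st.mean([1, 2, 3]))
print(median([1, 3, 2]))

# dir() lists the names a module defines; skip private names starting with "_".
public_names = [name for name in dir(st) if not name.startswith("_")]
print(public_names)
```

In a notebook, running help(st.median) prints the documentation for a single function, which is a quick way to explore an unfamiliar library.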
Changing the kernel in Jupyter Lab
First, we will close our Jupyter Lab notebook and relaunch it to make sure that we are using the intro_python environment by setting our “kernel” to intro_python. In Jupyter Lab, a kernel is how a notebook refers to the Python environment that runs its code. When we relaunch Jupyter Lab, it should ask us which kernel we would like to use. Select Python [conda env:intro_python]* and click “Select”. We should now see that our kernel is Python [conda env:intro_python]* in the top right of our Jupyter notebook.
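To confirm from inside the notebook which environment the kernel is using, you can inspect the running interpreter. This is a minimal sketch; it assumes the environment's folder name contains intro_python, which is the usual conda layout but may differ on your machine.

```python
import sys

# Path to the Python interpreter that the current kernel is running
print(sys.executable)

# Root folder of the active environment
print(sys.prefix)

# Assumption: the conda environment folder is named after the environment
using_intro_python = "intro_python" in sys.prefix
print("Using intro_python:", using_intro_python)
```

If the last line prints False, switch the kernel from the kernel name shown in the top right of the notebook.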