This lesson introduces data visualization in Python using Matplotlib and Seaborn, showing how to build scatterplots, adjust aesthetics and customize labels to create clear figures.
Learning objectives
In this lesson, we will:
- Explain the concept of layering in plotting and how to build a plot step-by-step
- Create a scatterplot using Matplotlib and customize its aesthetics with Seaborn
- Apply different themes to a plot and adjust axis labels and titles
Overview of lesson
Plots are one of the best ways to communicate and summarize results to others. With Matplotlib and Seaborn, you can create customizable visualizations from your data. Data scientists and researchers use these tools every day to explore trends and create publication-ready figures for presentations and manuscripts. In this lesson, you will learn the basics of building a scatterplot and adjusting its aesthetics, which gives you the foundation for creating other plots in the future.
Plotting basics
Matplotlib is one of the most widely used plotting packages in Python. With it, we can create many different types of plots, including scatterplots, line plots, bar plots, boxplots and more. The important thing to remember is that you can build a plot gradually, adding layers of information to make it more informative and visually appealing. There is no need to create a perfect plot in one step!
We will start by drawing a simple x-y scatterplot of mean_expression versus age_in_days from new_metadata.
Initialize a plot with Matplotlib
First, we will import the Matplotlib and Pandas libraries and load new_metadata, which we created in the previous lesson:

# Import libraries
import matplotlib.pyplot as plt
import pandas as pd

# Load the new metadata data frame that we created in the previous lesson
new_metadata = pd.read_csv("data/new_metadata.csv", index_col=0)

# Print out new_metadata
new_metadata
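Table 1: DataFrame containing updated metadata for each of our 12 samples.

|          | genotype | celltype | replicate | mean_expression | age_in_days |
|----------|----------|----------|-----------|-----------------|-------------|
| sample1  | Wt       | typeA    | 1         | 10.266102       | 40          |
| sample2  | Wt       | typeA    | 2         | 10.849759       | 32          |
| sample3  | Wt       | typeA    | 3         | 9.452517        | 38          |
| sample4  | KO       | typeA    | 1         | 15.833872       | 35          |
| sample5  | KO       | typeA    | 2         | 15.590184       | 41          |
| sample6  | KO       | typeA    | 3         | 15.551529       | 32          |
| sample7  | Wt       | typeB    | 1         | 15.522219       | 34          |
| sample8  | Wt       | typeB    | 2         | 13.808281       | 26          |
| sample9  | Wt       | typeB    | 3         | 14.108399       | 28          |
| sample10 | KO       | typeB    | 1         | 10.743292       | 28          |
| sample11 | KO       | typeB    | 2         | 10.778318       | 30          |
| sample12 | KO       | typeB    | 3         | 9.754733        | 32          |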
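If data/new_metadata.csv is not at hand, the table can be rebuilt in memory. The following is only a stand-in sketch, not the lesson's own loading step: it copies the values from Table 1 into a CSV string (the name csv_text is our own) and reads it with the same index_col=0 argument, which turns the first, unnamed column of sample names into the row index.

```python
import io

import pandas as pd

# Values copied from Table 1; this text stands in for data/new_metadata.csv
csv_text = """,genotype,celltype,replicate,mean_expression,age_in_days
sample1,Wt,typeA,1,10.266102,40
sample2,Wt,typeA,2,10.849759,32
sample3,Wt,typeA,3,9.452517,38
sample4,KO,typeA,1,15.833872,35
sample5,KO,typeA,2,15.590184,41
sample6,KO,typeA,3,15.551529,32
sample7,Wt,typeB,1,15.522219,34
sample8,Wt,typeB,2,13.808281,26
sample9,Wt,typeB,3,14.108399,28
sample10,KO,typeB,1,10.743292,28
sample11,KO,typeB,2,10.778318,30
sample12,KO,typeB,3,9.754733,32
"""

# index_col=0 uses the first (unnamed) column as the row index
new_metadata = pd.read_csv(io.StringIO(csv_text), index_col=0)

# Confirm the number of rows and columns and the first few sample names
print(new_metadata.shape)
print(new_metadata.index[:3].tolist())
```

Without index_col=0, the sample names would be read as an ordinary column and the rows would be numbered from 0 instead.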
We will first initialize a plot by using the figure() function from Matplotlib. Let us look at some of the arguments we can use with the help() function:

# Look at the help for the figure function
help(plt.figure)

Next, we’ll create an empty plot that is 8 inches wide and 6 inches tall:

# Initialize a plot with a specific size
plt.figure(figsize = (8, 6))

# Render the plot
plt.show()
<Figure size 768x576 with 0 Axes>
Figure 1: An empty plot initialized with Matplotlib.
Note
As we go through the lesson, you may notice that most plots render just fine without plt.show(). Even so, including plt.show() is good practice because it states explicitly where in the code the figure should be rendered. Some computing set-ups require plt.show() in order to render an image at all, while others may overlay several plots on the same figure without it.
Adding a scatterplot layer
We will once again initialize the plot with figure() and then add the scatterplot layer with scatter(). We need to specify where the data come from, which in this case is the new_metadata DataFrame, and which columns supply the x and y values: age_in_days for x and mean_expression for y.
The plt.figure and plt.scatter calls are connected because Matplotlib commands build upon each other. The plt.figure command initializes the plot and sets its size, while the plt.scatter command adds the scatterplot layer to that existing plot. When we call these functions sequentially, we are building our plot layer by layer.

# Initialize a plot with a specific size
plt.figure(figsize = (8, 6))

# Add a scatterplot layer to the plot
plt.scatter(data = new_metadata,
            x = "age_in_days",
            y = "mean_expression")

# Render the plot
plt.show()
Figure 2: Scatterplot of age in days vs. mean expression.
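To make the idea of commands "building upon each other" concrete, the sketch below checks that plt.scatter() really draws onto the figure created by plt.figure(). It uses plt.gcf() and plt.gca(), which return the current figure and the current axes. The three data points are the first three samples from Table 1, and the variable names are our own.

```python
import matplotlib.pyplot as plt

# Values for sample1, sample2 and sample3 from Table 1
age_in_days = [40, 32, 38]
mean_expression = [10.266102, 10.849759, 9.452517]

# plt.figure() creates a figure and makes it the "current" figure
fig = plt.figure(figsize=(8, 6))

# plt.scatter() draws on the current figure, creating its axes if needed
points = plt.scatter(x=age_in_days, y=mean_expression)

# Check that both calls acted on the same figure and axes
ax = plt.gca()
same_figure = plt.gcf() is fig
on_same_axes = points.axes is ax and ax.figure is fig
print(same_figure, on_same_axes)
```

Any later pyplot call, such as plt.xlabel(), is applied to this same current figure until a new one is created.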
Alternative
Instead of providing the data argument, you could pass the columns from new_metadata directly as the x and y values.

# Initialize a plot with a specific size
plt.figure(figsize = (8, 6))

# Add a scatterplot layer to the plot
plt.scatter(x = new_metadata["age_in_days"],
            y = new_metadata["mean_expression"])

# Render the plot
plt.show()
Now that we have the required fundamentals, let’s add some extra details like color to the plot. We can color the points on the plot based on the genotype column with the c argument.
# Initialize a plot with a specific size
plt.figure(figsize = (8, 6))

# Add a scatterplot layer to the plot, coloring points by genotype
plt.scatter(data = new_metadata,
            x = "age_in_days",
            y = "mean_expression",
            c = "genotype")

# Render the plot
plt.show()
ValueError: Invalid RGBA argument: 'Wt'

The above exception was the direct cause of the following exception:

ValueError                                Traceback (most recent call last)
Cell In[4], line 5
      2 plt.figure(figsize = (8, 6))
      4 # Add a scatterplot layer to the plot, coloring points by genotype
----> 5 plt.scatter(data=new_metadata,
      6             x="age_in_days",
      7             y="mean_expression",
      8             c="genotype")
     10 # Render the plot
     11 plt.show()

(intermediate frames inside Matplotlib omitted)

ValueError: 'c' argument must be a color, a sequence of colors, or a sequence of numbers, not sample1     Wt
sample2     Wt
sample3     Wt
sample4     KO
sample5     KO
sample6     KO
sample7     Wt
sample8     Wt
sample9     Wt
sample10    KO
sample11    KO
sample12    KO
Name: genotype, dtype: object
Figure 3: Initial attempt to color the scatterplot of age in days vs. mean expression by genotype, which results in an error.
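We are getting an error from trying to set the color. As the last line of the error message says, the c argument in scatter() expects a color, a sequence of colors, or a sequence of numbers. We are instead providing it with the labels in the genotype column ("Wt" and "KO"), which are not valid color names.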
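For reference, plain Matplotlib can color by category if each label is first translated into a color. The sketch below is one possible approach, not the route this lesson takes: it uses four rows copied from Table 1, and the genotype_colors dictionary and its two color names are our own arbitrary choices.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Four rows copied from Table 1 (sample1, sample4, sample7, sample10)
new_metadata = pd.DataFrame(
    {
        "genotype": ["Wt", "KO", "Wt", "KO"],
        "mean_expression": [10.266102, 15.833872, 15.522219, 10.743292],
        "age_in_days": [40, 35, 34, 28],
    },
    index=["sample1", "sample4", "sample7", "sample10"],
)

# Hypothetical color choice: any valid Matplotlib color names would work
genotype_colors = {"Wt": "tab:blue", "KO": "tab:orange"}

# Translate each genotype label into a color that the c argument accepts
point_colors = new_metadata["genotype"].map(genotype_colors).tolist()

# Initialize the plot and add the scatterplot layer with one color per point
plt.figure(figsize=(8, 6))
plt.scatter(
    x=new_metadata["age_in_days"],
    y=new_metadata["mean_expression"],
    c=point_colors,
)
# Add plt.show() here to render the figure, as in the rest of the lesson
```

This works, but it does not produce a legend, and the mapping has to be maintained by hand.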
Changing aesthetics
To work around the error from plt.scatter, we will instead use the scatterplot() function from the seaborn package. Its hue argument, unlike c, accepts a categorical variable and colors the points by category. The documentation for seaborn.scatterplot() is quite extensive and can be found on the seaborn website (https://seaborn.pydata.org/generated/seaborn.scatterplot.html).
You will notice that there is a default set of colors, so we do not have to specify a color ourselves. The legend and axis labels have also been added for us automatically!

# Import library
import seaborn as sns

# Initialize a plot with a specific size
plt.figure(figsize = (8, 6))

# Add a scatterplot layer to the plot, coloring points by genotype
sns.scatterplot(data = new_metadata,
                x = "age_in_days",
                y = "mean_expression",
                hue = "genotype")

# Render the plot
plt.show()
Figure 4: Scatterplot of age in days vs. mean expression, colored by genotype.
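The default colors can be replaced when specific ones are wanted. As a minimal sketch, the palette argument of scatterplot() accepts a dictionary that maps each level of the hue variable to a color. The data are four rows copied from Table 1, and the genotype_palette name and its colors are our own assumptions.

```python
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Four rows copied from Table 1 (sample1, sample4, sample7, sample10)
new_metadata = pd.DataFrame(
    {
        "genotype": ["Wt", "KO", "Wt", "KO"],
        "mean_expression": [10.266102, 15.833872, 15.522219, 10.743292],
        "age_in_days": [40, 35, 34, 28],
    },
    index=["sample1", "sample4", "sample7", "sample10"],
)

# Hypothetical mapping from each genotype to a color of our choosing
genotype_palette = {"Wt": "tab:blue", "KO": "tab:orange"}

# Initialize a plot with a specific size
plt.figure(figsize=(8, 6))

# scatterplot() returns the Matplotlib axes it drew on
ax = sns.scatterplot(
    data=new_metadata,
    x="age_in_days",
    y="mean_expression",
    hue="genotype",
    palette=genotype_palette,
)

# Collect the entries that seaborn placed in the legend
legend_labels = [text.get_text() for text in ax.get_legend().get_texts()]
print(legend_labels)
# Add plt.show() here to render the figure, as in the rest of the lesson
```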
seaborn is built on top of Matplotlib and makes it easier to map columns of a DataFrame to aesthetics such as color and shape. You will often find yourself using a blend of both packages to create the plot you want.
Let’s try to have both celltype and genotype represented on the plot. We can assign the celltype column to the style argument in scatterplot(), so that each celltype is plotted with a differently shaped data point.

# Initialize a plot with a specific size
plt.figure(figsize = (8, 6))

# Add a scatterplot layer to the plot, coloring points by genotype
# and shaping them by celltype
sns.scatterplot(data = new_metadata,
                x = "age_in_days",
                y = "mean_expression",
                hue = "genotype",
                style = "celltype")

# Render the plot
plt.show()
Figure 5: Scatterplot of age in days vs. mean expression, colored by genotype and shaped by celltype.
Note
You may have noticed that the figure legend moved when we added our style argument. The placement of the legend is controlled by the loc argument of plt.legend(), which accepts one of nine named locations ('upper left', 'upper right', 'lower left', 'lower right', 'upper center', 'lower center', 'center left', 'center right', 'center'). The default value, 'best', selects whichever of those nine locations minimizes the overlap between the legend and the data. Adding the style argument made the legend longer, so a different location now gives the least overlap with the data points.
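As a small illustration of the loc argument, the sketch below fixes the legend in one corner instead of leaving the choice to 'best'. It uses plain Matplotlib with two labeled scatter() calls so that the legend entries are explicit. The four points come from Table 1 (sample1 and sample7 for Wt, sample4 and sample10 for KO), and the choice of 'upper left' is arbitrary.

```python
import matplotlib.pyplot as plt

# Initialize a plot with a specific size
plt.figure(figsize=(8, 6))

# One scatter layer per genotype; label sets the text of the legend entry
plt.scatter(x=[40, 34], y=[10.266102, 15.522219], label="Wt")
plt.scatter(x=[35, 28], y=[15.833872, 10.743292], label="KO")

# Pin the legend to a fixed corner instead of relying on loc="best"
legend = plt.legend(loc="upper left", title="genotype")

# Collect the legend entries to confirm what was drawn
legend_labels = [text.get_text() for text in legend.get_texts()]
print(legend_labels)
# Add plt.show() here to render the figure, as in the rest of the lesson
```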
The data points are quite small. We can also adjust the s (size) of the data points within the scatterplot() function. Since we do not want the size of the data points to be scaled according to a column in new_metadata, we will just specify a number for this argument.
# Initialize a plot with a specific size
plt.figure(figsize = (8, 6))

# Add a scatterplot layer to the plot, coloring points by genotype,
# shaping them by celltype and setting a fixed point size
sns.scatterplot(data = new_metadata,
                x = "age_in_days",
                y = "mean_expression",
                hue = "genotype",
                style = "celltype",
                s = 50)

# Render the plot
plt.show()
Figure 6: Scatterplot of age in days vs. mean expression, colored by genotype and shaped by celltype, with adjusted size.
Themes
There are a variety of themes that you can apply to your plot to change the background and gridlines. darkgrid is the default style of seaborn’s own theme (the one applied by sns.set_theme()), and you can select a different style with the set_style() function from seaborn.

# Set the theme to "whitegrid"
sns.set_style(style = "whitegrid")

# Initialize a plot with a specific size
plt.figure(figsize = (8, 6))

# Add a scatterplot layer to the plot, coloring points by genotype,
# shaping them by celltype and setting a fixed point size
sns.scatterplot(data = new_metadata,
                x = "age_in_days",
                y = "mean_expression",
                hue = "genotype",
                style = "celltype",
                s = 50)

# Render the plot
plt.show()
Figure 7: Scatterplot of age in days vs. mean expression, colored by genotype and shaped by celltype, with adjusted size and a different theme.
Customizing themes
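You can also customize themes further with the rc argument of set_style(), which takes a dictionary of parameters, when you want to adjust specific elements of the theme. The documentation for set_style() can be found on the seaborn website (https://seaborn.pydata.org/generated/seaborn.set_style.html).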
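A minimal sketch of such a customization follows. It first uses sns.axes_style() to look at the parameters that make up the whitegrid style, then applies the style while overriding one of them through rc. Dashed gridlines are just an example override of our own choosing.

```python
import matplotlib.pyplot as plt
import seaborn as sns

# Inspect the parameters that make up the "whitegrid" style
whitegrid = sns.axes_style("whitegrid")
print(whitegrid["axes.grid"])

# Apply the style, overriding one parameter through the rc dictionary
sns.set_style(style="whitegrid", rc={"grid.linestyle": "--"})

# set_style() writes its settings into Matplotlib's global rcParams
print(plt.rcParams["grid.linestyle"])
```

Because the settings are global, they stay in effect for every plot drawn afterwards until the style is changed again.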
Changing labels
The axis labels and tick labels don’t get any larger when we change themes. We can, however, change both the text and the size of the x-axis label with the plt.xlabel() function from matplotlib. Since we add this layer “on top” of, or after, the earlier layers, anything we set here overrides the label that those layers produced.
Let’s increase the font size of the x-axis title to 20.

# Set the theme to "whitegrid"
sns.set_style(style = "whitegrid")

# Initialize a plot with a specific size
plt.figure(figsize = (8, 6))

# Add a scatterplot layer to the plot, coloring points by genotype,
# shaping them by celltype and setting a fixed point size
sns.scatterplot(data = new_metadata,
                x = "age_in_days",
                y = "mean_expression",
                hue = "genotype",
                style = "celltype",
                s = 50)

# Change the size and text of the x-axis label
plt.xlabel(xlabel = "Age in Days", fontsize = 20)

# Render the plot
plt.show()
Figure 8: Scatterplot of age in days vs. mean expression, colored by genotype and shaped by celltype, with adjusted size, a different theme and larger x-axis title.
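The tick labels can be enlarged in a similar way. The sketch below is one option among several: it uses plt.tick_params() with the labelsize argument, applied to both axes. The three data points are the first three samples from Table 1, and the size of 14 is an arbitrary choice.

```python
import matplotlib.pyplot as plt

# Initialize a plot and add three points from Table 1 (sample1 to sample3)
plt.figure(figsize=(8, 6))
plt.scatter(x=[40, 32, 38], y=[10.266102, 10.849759, 9.452517])

# Change the size and text of the x-axis label
plt.xlabel(xlabel="Age in Days", fontsize=20)

# Enlarge the tick labels on both axes
plt.tick_params(axis="both", labelsize=14)

# Read the sizes back from the current axes to confirm the change
ax = plt.gca()
label_size = ax.xaxis.label.get_fontsize()
tick_size = ax.get_xticklabels()[0].get_fontsize()
print(label_size, tick_size)
# Add plt.show() here to render the figure, as in the rest of the lesson
```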
Saving plots
If you want to save this plot, you can use the savefig() function from matplotlib and specify the file name and format. This function saves the current figure, so call savefig() in the same code block as the plotting commands, after the plot has been built and before any call to plt.show(), since the figure may no longer be available once it has been shown. For example, to save the plot as a PNG file, you can use:

# Set the theme to "whitegrid"
sns.set_style(style = "whitegrid")

# Initialize a plot with a specific size
plt.figure(figsize = (8, 6))

# Add a scatterplot layer to the plot, coloring points by genotype,
# shaping them by celltype and setting a fixed point size
sns.scatterplot(data = new_metadata,
                x = "age_in_days",
                y = "mean_expression",
                hue = "genotype",
                style = "celltype",
                s = 50)

# Change the size and text of the x-axis label
plt.xlabel(xlabel = "Age in Days", fontsize = 20)

# Save the plot as a PNG file
plt.savefig(fname = "figures/scatterplot.png", format = "png")
If you want to control the resolution of the saved figure, you can pass the dpi argument (dots per inch) to savefig(). The size of the figure in inches is set by the figsize argument of plt.figure(). For example, to save the plot as a PNG file with a resolution of 300 DPI we can use:

# Set the theme to "whitegrid"
sns.set_style(style = "whitegrid")

# Initialize a plot with a specific size
plt.figure(figsize = (8, 6))

# Add a scatterplot layer to the plot, coloring points by genotype,
# shaping them by celltype and setting a fixed point size
sns.scatterplot(data = new_metadata,
                x = "age_in_days",
                y = "mean_expression",
                hue = "genotype",
                style = "celltype",
                s = 50)

# Change the size and text of the x-axis label
plt.xlabel(xlabel = "Age in Days", fontsize = 20)

# Save the plot as a PNG file with a specific DPI
plt.savefig(fname = "figures/scatterplot_dpi.png", format = "png", dpi = 300)
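Together, figsize and dpi determine the pixel dimensions of the saved image: an 8 by 6 inch figure at 300 DPI should be 8 × 300 = 2400 pixels wide and 6 × 300 = 1800 pixels tall. The sketch below checks this by saving a small plot and reading the file back with plt.imread(). It writes to a temporary folder of our own rather than the lesson's figures/ folder, and it assumes Matplotlib's default save settings (no tight bounding box).

```python
import os
import tempfile

import matplotlib.pyplot as plt

# Initialize an 8 x 6 inch plot with three points from Table 1
plt.figure(figsize=(8, 6))
plt.scatter(x=[40, 32, 38], y=[10.266102, 10.849759, 9.452517])

# Temporary folder used only for this sketch; the lesson saves into figures/
output_dir = tempfile.mkdtemp()
output_path = os.path.join(output_dir, "scatterplot_dpi.png")

# Save the current figure at 300 dots per inch
plt.savefig(fname=output_path, format="png", dpi=300)

# Read the file back: image arrays are ordered (height, width, channels)
image = plt.imread(output_path)
height_px, width_px = image.shape[:2]
print(width_px, height_px)
```

Note that savefig() does not create missing folders, so a folder such as figures/ needs to exist before saving into it, for example via os.makedirs("figures", exist_ok=True).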
Exercise 1
1. Add a plt.ylabel() layer to the current plot such that the y-axis is labeled “Mean expression”.
2. Use the plt.title() layer to add a plot title of your choice.
3. When you add the argument loc="right" to the plt.title() function, what does it change?
4. Let’s remove the loc = "right" argument from plt.title(). Try adding the layer plt.legend(loc = "center right") to the end of your code. What does this do? How many layers can be added to a plot, in your estimation?
Title: Plotting Basics with Matplotlib and Seaborn
Authors: Noor Sohail, Will Gammerdinger
Date: 2026-03-16
License: CC-BY-4.0
Approximate time: 40 minutes