Accessing genomic reference and experimental sequencing data

| Audience | Computational Skills | Prerequisites | Duration |
| --- | --- | --- | --- |
| Biologists | Beginner bash | Introduction to the command-line interface | 3 hour workshop (~3 hours of trainer-led time) |


This repository has teaching materials for a 3-hour, hands-on workshop, "Accessing genomic reference and experimental sequencing data", led at a relaxed pace.

For many types of sequencing analyses, we need access to public data stored in various databases and repositories. This workshop will discuss the types of genomic reference data available through public databases such as Ensembl, NCBI, and UCSC, and step through how to find and download this data. The workshop will also explore how to find and download publicly available experimental data, such as the FASTQ files and count matrices from published papers, using the GEO and SRA repositories. While most of the workshop will access data using a web browser, downloading data from the SRA will require beginner knowledge of the command-line interface.
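As a rough illustration of the kind of command-line download step mentioned above, the sketch below builds the URL of a reference genome FASTA file on the Ensembl FTP site. The function name `ensembl_fasta_url`, the release number (110), and the species and assembly (human, GRCh38) are illustrative assumptions rather than choices made by the workshop, and some species may not provide a `primary_assembly` file.

```bash
#!/usr/bin/env bash
# Minimal sketch: construct the download URL for an Ensembl genome FASTA.
# ensembl_fasta_url is a hypothetical helper; release, species, and
# assembly are example values, not workshop requirements.

ensembl_fasta_url() {
    local release="$1"    # Ensembl release number, e.g. 110
    local species="$2"    # species as used in file names, e.g. Homo_sapiens
    local assembly="$3"   # assembly name, e.g. GRCh38

    # Directory names on the FTP site are all lower-case
    local species_dir
    species_dir="$(printf '%s' "$species" | tr '[:upper:]' '[:lower:]')"

    echo "https://ftp.ensembl.org/pub/release-${release}/fasta/${species_dir}/dna/${species}.${assembly}.dna.primary_assembly.fa.gz"
}

url="$(ensembl_fasta_url 110 Homo_sapiens GRCh38)"
echo "$url"

# The file is large, so the download itself is left commented out:
# curl -O "$url"
```

Keeping the URL construction separate from the download makes it easy to check the address in a web browser before committing to a large transfer.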

Learning Objectives

These materials were developed for a trainer-led workshop, but are also amenable to self-guided learning.


| Lessons | Estimated Duration |
| --- | --- |
| Accessing genome reference data | 60 min |
| Accessing publicly available experimental data | 90 min |

Installation Requirements

Mac users: No installation requirements.

Windows users: Git Bash (installed with Git for Windows)