Reading in and inspecting data
- Read the .csv file into your environment and assign it to a variable called animals. Be sure to check that your row names are the different animals.
- Check to make sure that animals is a dataframe.
- How many rows are in the animals dataframe? How many columns?
- Extract the speed value of 40 km/h from the
- Return the rows with animals that are the
- Return the rows with animals that have speed greater than 50 km/h and output only the color column. Keep the output as a data frame.
- Change the color of “Grey” to “Gray”.
- Create a list called animals_list in which the first element contains the speed column of the animals dataframe and the second element contains the color column of the
- Give each element of your list the appropriate name (i.e., speed and color).
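The steps in this list can be sketched in base R roughly as follows. The exercise's actual .csv file is not reproduced here, so the sketch writes a small stand-in file first; the animal names and values in it are made up for illustration, and only the column names speed and color come from the exercise text.

```r
# Hypothetical stand-in for the exercise's .csv file (all values are made up).
csv_path <- tempfile(fileext = ".csv")
writeLines(c(
  ",speed,color",
  "cheetah,100,Tan",
  "elephant,40,Grey",
  "tortoise,0.1,Green",
  "lion,80,Tan"
), csv_path)

# Read the file, using the first column as row names.
animals <- read.csv(csv_path, row.names = 1, stringsAsFactors = FALSE)
rownames(animals)        # should list the animals

# Check the class and the dimensions.
is.data.frame(animals)
nrow(animals)
ncol(animals)

# Extract the speed value of 40 by logical indexing.
animals[animals$speed == 40, "speed"]

# Rows with speed > 50, color column only; drop = FALSE keeps a data frame.
fast_colors <- animals[animals$speed > 50, "color", drop = FALSE]

# Replace "Grey" with "Gray" in the color column.
animals$color[animals$color == "Grey"] <- "Gray"

# Build the list and name its elements.
animals_list <- list(animals$speed, animals$color)
names(animals_list) <- c("speed", "color")
```

Without drop = FALSE, selecting a single column with square brackets returns a plain vector instead of a data frame.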
The %in% operator, reordering and matching
Read in the project summary file (“project-summary.txt”) to a variable called proj_summary; this file contains quality metric information for an RNA-seq dataset. Be sure to specify that the row names are in column 1 and the separator is a tab.
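As a minimal sketch of that kind of call, the example below reads a tab-delimited file with read.table(), taking row names from column 1. It does not use the real project-summary.txt: the file contents, the column names, and the variable name proj_summary_demo are all hypothetical.

```r
# Hypothetical miniature tab-delimited file (not the real project-summary.txt).
txt_path <- tempfile(fileext = ".txt")
writeLines(c(
  "sample\ttreatment\tMapping_Rate",
  "sample1\thigh\t95.2",
  "sample2\tlow\t93.8"
), txt_path)

# header = TRUE: the first line holds column names
# row.names = 1: the first column becomes the row names
# sep = "\t":    fields are separated by tabs
proj_summary_demo <- read.table(txt_path, header = TRUE,
                                row.names = 1, sep = "\t")
rownames(proj_summary_demo)
```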
We have obtained batch information for the control samples in this dataset. Copy and paste the code below to create a dataframe of control samples with the associated batch information:
ctrl_samples <- data.frame(row.names = c("sample3", "sample10", "sample8", "sample4", "sample15"), date = c("01/13/2018", "03/15/2018", "01/13/2018", "09/20/2018","03/15/2018"))
- How many of the ctrl_samples are also in the proj_summary dataframe? Use the %in% operator to compare sample names.
- Keep only the rows in proj_summary which correspond to those in ctrl_samples. Do this with the %in% operator. Save it to a variable called
- We would like to add in the batch information for the samples in proj_summary_ctrl. Find the rows that match in
cbind() to add a column called
proj_summary_ctrl dataframe. Assign this new dataframe back to
proj_summary to keep only the “high” and “low” samples based on the treatment column. Save the new dataframe to a variable called
- Further, subset the dataframe to remove the non-numeric columns “Quality_format” and “treatment”. Try to do this using the map_lgl() function in addition to is.numeric(). Save the new dataframe back to
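One possible way to approach these steps is sketched below. The proj_summary dataframe here is a small made-up stand-in for the real file (only the column names treatment and Quality_format come from the text). The names batch, high_low, and high_low_numeric are illustrative assumptions, since the text does not give them. If the purrr package is not installed, the sketch falls back to a base R equivalent of map_lgl().

```r
# Control samples with batch dates, as given in the exercise text.
ctrl_samples <- data.frame(
  row.names = c("sample3", "sample10", "sample8", "sample4", "sample15"),
  date = c("01/13/2018", "03/15/2018", "01/13/2018", "09/20/2018", "03/15/2018"))

# Hypothetical stand-in for the real proj_summary (values are made up).
proj_summary <- data.frame(
  row.names = paste0("sample", 1:6),
  treatment = c("high", "low", "control", "control", "high", "low"),
  Quality_format = rep("fmt", 6),
  Mapping_Rate = c(95.1, 93.4, 96.0, 94.7, 92.8, 95.5),
  stringsAsFactors = FALSE)

# How many control samples also appear in proj_summary?
in_summary <- rownames(ctrl_samples) %in% rownames(proj_summary)
sum(in_summary)

# Keep only the proj_summary rows that are control samples.
proj_summary_ctrl <-
  proj_summary[rownames(proj_summary) %in% rownames(ctrl_samples), ]

# match() gives, for each kept row, its position in ctrl_samples,
# so the dates line up row by row before cbind().
m <- match(rownames(proj_summary_ctrl), rownames(ctrl_samples))
proj_summary_ctrl <- cbind(proj_summary_ctrl,
                           batch = ctrl_samples[m, "date"])  # "batch" is an assumed name

# Keep only the "high" and "low" samples (variable name is hypothetical).
high_low <- proj_summary[proj_summary$treatment %in% c("high", "low"), ]

# Flag numeric columns: purrr::map_lgl() if available, else base vapply().
is_num <- if (requireNamespace("purrr", quietly = TRUE)) {
  purrr::map_lgl(high_low, is.numeric)
} else {
  vapply(high_low, is.numeric, logical(1))
}
high_low_numeric <- high_low[, is_num, drop = FALSE]
```

%in% answers "is this name present?" with TRUE or FALSE, while match() returns the position of each name, which is what reordering one dataframe to line up with another requires.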