CUNY Borough of Manhattan Community College Data in Science Questions
I’m working on a biology question and need an explanation and answer to help me learn.
Please write this up using Rmarkdown. Make sure everything runs. Answer questions in text. Comment with abandon.
- Create a vector of 100 randomly distributed numbers between 0 and 100 using `runif` and save the vector into the variable `my_vec`. What information do `str` and `summary` tell you about `my_vec`? How do they differ?
- Load the `readxl` and `readr` libraries. They are part of tidyverse and you should have them. If not, `install.packages()` is your friend! Then, load the following data files: https://biol355.github.io/Data/my_data.csv using `read.csv` and `read_csv`, and https://biol355.github.io/Data/my_data.xlsx using `read_excel`. Looking at the three objects you loaded in, what are the differences or similarities between them?
- What does the output of `str`, `summary`, `skimr::skim()`, and `visdat::vis_dat` tell you about the data you loaded? What is different or the same?
- Add a column to the mtcars data called `Model` which uses the row names of mtcars (`rownames(mtcars)`) as its values. Show me the head of the data frame to see if it's been done correctly. Note, to add a column to a data frame, we can specify `yourdf$new_col_name <- new_vector_we_are_adding` (note, that's pseudo-code). Note how we are using the `$` notation to add a new column.
- Let's use the `bind_rows` function in dplyr, as it's pretty powerful. Let's say you want to add a new row to mtcars for a new model. Make a new data frame with the following columns: Model = Fizzywig, mpg=31.415, awesomness=11. Now try to make a new data frame where you `rbind` `mtcars` and this new data frame. What happens? Don't do this in a markdown code chunk; just try it, and then report what happens. It might or might not go as planned (and Rmarkdown can choke unless you add the appropriate argument to the code chunk; more on that soon)! Then, make a new data frame where you use `dplyr::bind_rows` to combine them. Examine the resulting data frame. What do you see? You can try this in a code chunk for your markdown. How do the two methods differ? Look at their help files for some information that might help you.
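
For the first question, a minimal sketch of one way to build and inspect the vector might look like the following. The call to `set.seed()` and the seed value 123 are my own additions so that the draw is reproducible; the question does not ask for them.

```r
# Fix the seed so the "random" draw is the same every time the document is knit.
# (Assumption: the seed value 123 is arbitrary and not part of the assignment.)
set.seed(123)

# Draw 100 values from a uniform distribution between 0 and 100
my_vec <- runif(100, min = 0, max = 100)

# str() describes the structure of the object:
# its type (num), its length ([1:100]), and the first few values
str(my_vec)

# summary() describes the distribution of the values:
# minimum, 1st quartile, median, mean, 3rd quartile, maximum
summary(my_vec)
```

In short, `str` tells you what kind of object you have, while `summary` tells you about the values inside it.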
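
For the `Model` column question, the pseudo-code `yourdf$new_col_name <- new_vector_we_are_adding` could be turned into something like the sketch below. The name `my_mtcars` is a hypothetical working copy introduced here for illustration, so that the built-in `mtcars` data set stays unchanged.

```r
# Work on a copy so the built-in mtcars data set is left as it is
# (my_mtcars is a hypothetical name, not one given in the assignment)
my_mtcars <- mtcars

# Use the $ notation to add a new column holding the row names
my_mtcars$Model <- rownames(mtcars)

# Check the first six rows: Model should appear as the last column
head(my_mtcars)
```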
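
For the last question, here is a hedged sketch of the comparison between `rbind` and `dplyr::bind_rows`. The names `my_mtcars`, `new_car`, `rbind_result`, and `combined` are hypothetical. The `tryCatch()` wrapper is my own choice so that the `rbind` error is captured as a message rather than stopping the document; the "appropriate argument to the code chunk" in the question is presumably knitr's `error = TRUE` chunk option, which is another way to let a chunk fail without halting the knit.

```r
library(dplyr)

# mtcars with the car names stored in a Model column
my_mtcars <- mtcars
my_mtcars$Model <- rownames(mtcars)

# The new one-row data frame described in the question
# (the column name awesomness is spelled as in the assignment)
new_car <- data.frame(Model = "Fizzywig", mpg = 31.415, awesomness = 11)

# Base rbind() expects both data frames to have the same columns.
# Here they differ (12 columns vs. 3), so rbind() throws an error;
# tryCatch() stores the error message instead of stopping the script.
rbind_result <- tryCatch(
  rbind(my_mtcars, new_car),
  error = function(e) conditionMessage(e)
)
rbind_result

# bind_rows() matches columns by name and fills missing cells with NA
combined <- bind_rows(my_mtcars, new_car)

# The last row is Fizzywig: mpg and awesomness are filled in,
# every other mtcars column is NA for that row
tail(combined)
```

The difference, under these assumptions, is that `rbind` refuses to combine data frames whose columns do not line up, while `bind_rows` keeps every column from both inputs and pads the gaps with `NA`.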