Goals for today


  1. Practice writing custom functions

  2. Practice iterations using for loops and map functions



General instructions


  • Today, we will write some useful custom functions, and will combine these functions together with for loops and map functions to automate repetitive tasks


  • To start, first open a new RMarkdown file in your course repo, set the output format to github_document, save it in your lab folder as lab9.Rmd, and work in this RMarkdown file for the rest of this lab. Load the following package.
library(tidyverse)


  • We provide some possible solutions for each question, but we highly recommend that you don’t look at them unless you are really stuck.




Exercise 1. DNA or RNA? (45 minutes)



1.1 Write a function, dna_or_rna(sequence), that determines if a sequence of base pairs is DNA, RNA, or if it is not possible to tell given the sequence provided.

  • Since all the function will know about the material is the sequence, the only way to tell the difference between DNA and RNA is that RNA has the base Uracil ("u") instead of the base Thymine ("t").

  • Have the function return one of three outputs: “DNA”, “RNA”, or “unknown”. Then run the following three lines of code:

dna_or_rna("attggc")

dna_or_rna("gccaau")

dna_or_rna("ccagac")

dna_or_rna("tgcacug")


Hint: the str_split function might be helpful.


dna_or_rna("attggc")
## [1] "DNA"
dna_or_rna("gccaau")
## [1] "RNA"
dna_or_rna("ccagac")
## [1] "unknown"
dna_or_rna("tgcacug")
## [1] "unknown"


One possible solution
click to expand
dna_or_rna
## function(sequence){
##   bases <- str_split(sequence, pattern = "") %>%
##     unlist() %>%
##     unique()
##   if (all(bases %in% c("a", "t", "g", "c")) & "t" %in% bases){
##     return("DNA")
##   } else if (all(bases %in% c("a", "u", "g", "c")) & "u" %in% bases){
##     return("RNA")
##   } else {
##     return("unknown")
##   }
## }
## <bytecode: 0x000001af25b13f20>



1.2 Use the dna_or_rna() function and a for loop to print the type of the sequences in the following list.


sequences = c("ttgaatgccttacaactgatcattacacaggcggcatgaagcaaaaatatactgtgaaccaatgcaggcg", 
              "gauuauuccccacaaagggagugggauuaggagcugcaucauuuacaagagcagaauguuucaaaugcau", 
              "gaaagcaagaaaaggcaggcgaggaagggaagaagggggggaaacc", 
              "guuuccuacaguauuugaugagaaugagaguuuacuccuggaagauaauauuagaauguuuacaacugcaccugaucagguggauaaggaagaugaagacu", 
              "gataaggaagaugaagacutucaggaaucuaauaaaaugcacuccaugaauggauucauguaugggaaucagccggguc")


One possible solution
click to expand
sequence_type <- vector("double", length(sequences))
for (i in seq_along(sequences)){
  sequence_type[i] <- dna_or_rna(sequences[i])
}
sequence_type
## [1] "DNA"     "RNA"     "unknown" "RNA"     "unknown"



1.3 Use the dna_or_rna() function and an appropriate map function to print the type of the sequences in the above list.


One possible solution
click to expand
map_chr(sequences, dna_or_rna)
## [1] "DNA"     "RNA"     "unknown" "RNA"     "unknown"
## Alternatively
map(sequences, dna_or_rna) %>% unlist()
## [1] "DNA"     "RNA"     "unknown" "RNA"     "unknown"



1.4 Make your function work with both upper and lower case letters, or even strings with mixed capitalization. Test your function with the following three lines of code:

dna_or_rna("ATTGGC")

dna_or_rna("gCCAAu")

dna_or_rna("ggcacgG")


dna_or_rna("ATTGGC")
## [1] "DNA"
dna_or_rna("gCCAAu")
## [1] "RNA"
dna_or_rna("ggcacgG")
## [1] "unknown"


One possible solution
click to expand
dna_or_rna
## function(sequence){
##   bases <- str_split(sequence, pattern = "") %>%
##     unlist() %>%
##     tolower() %>%
##     unique()
##   if (all(bases %in% c("a", "t", "g", "c")) & "t" %in% bases){
##     return("DNA")
##   } else if (all(bases %in% c("a", "u", "g", "c")) & "u" %in% bases){
##     return("RNA")
##   } else {
##     return("unknown")
##   }
## }
## <bytecode: 0x000001af26e934b0>



Recap (5 minutes)


Share your findings, challenges, and questions with the class.


Short break (10 minutes)



Exercise 2: Rounding (50 minutes)



Rounding appears to be a very simple arithmetic operation. However, things get a little bit complicated when it comes to the number 5, which is at the exact mid-point between rounding up and rounding down.

The round function in base R is weird. It is supposed to use a round half to even rule when rounding off a 5 (see https://en.wikipedia.org/wiki/Rounding#Round_half_to_even). However, this is dependent on your computer’s operating system, and therefore this rule is sometimes inconsistent. For example:

# This is how a "round half to even" rule should work
round(0.5, digits=0)
## [1] 0
round(1.5, digits=0)
## [1] 2
round(-0.5, digits=0)
## [1] 0
round(-1.5, digits=0)
## [1] -2
# Things get weird sometimes though
round(0.55, digits=1) # This is what we would expect
## [1] 0.6
round(2.45, digits=1) # Under a "round half to even" rule, we are expecting it to reture 2.4. However, here it returns 2.5 on my operating system (might be different for yours)
## [1] 2.5



2.1 To correct this inconsistency issue, write a custom function that consistently applies a round half away from zero rule.

  • Under this rule, when rounding off a 5 , your function should round up when it’s positive, and down when it is negative
  • Your function should takes a “digits” argument exactly as in the original R function.

Hint: you may need the arithmetic operator %/% and the sign() function.


The following is how it should work:

#Example output
round_away(0.55, digits=0)
## [1] 1
round_away(2.45, digits=0) 
## [1] 2
round_away(-0.55, digits=0)
## [1] -1
round_away(-2.45, digits=0) 
## [1] -2
round_away(0.55, digits=1)
## [1] 0.6
round_away(2.45, digits=1) 
## [1] 2.5
round_away(-0.55, digits=1)
## [1] -0.6
round_away(-2.45, digits=1) 
## [1] -2.5


One possible solution
click to expand
round_away
## function(x, digits=0){
##   x_new<-abs(x*10^digits)
##   x_sign <- sign(x)
##   integer <- x_new%/%1
##   decimal <- x_new-integer
##   if(decimal<0.5){
##     x_new <- integer
##   } else {
##     x_new <- integer +1
##   }
##   x_rounded <- x_new/10^digits*x_sign
##   return(x_rounded)
## }
## <bytecode: 0x000001af2826b830>



2.2 Now, building on the previous question, write a custom function that consistently applies a round half to even rule when rounding off a 5.

Hint: you will need the arithmetic operator %%.


The following is how this function should work:

#Example output
round_even(0.55, digits=0)
## [1] 1
round_even(2.45, digits=0) 
## [1] 2
round_even(-0.55, digits=0)
## [1] -1
round_even(-2.45, digits=0) 
## [1] -2
round_even(0.55, digits=1)
## [1] 0.6
round_even(2.45, digits=1) 
## [1] 2.4
round_even(-0.55, digits=1)
## [1] -0.6
round_even(-2.45, digits=1) 
## [1] -2.4


One possible solution
click to expand
round_even
## function(x, digits=0){
##   x_new<-abs(x*10^digits)
##   x_sign <- sign(x)
##   integer <- x_new%/%1
##   decimal <- x_new-integer
##   if(decimal<0.5){
##     x_new <- integer
##   } else if (decimal ==0.5){
##     if(integer%%2==0){
##       x_new <- integer 
##     } else {
##       x_new <- integer +1
##     }
##   } else {
##       x_new <- integer +1
##   }
##   x_rounded <- x_new/10^digits*x_sign
##   return(x_rounded)
## }
## <bytecode: 0x000001af2100d718>



Recap (5 minutes)


Share your findings, challenges, and questions with the class.



END LAB 9