2018-07-30

Readable code with base R (part 1)


Producing readable R code is of great importance, especially if there is a chance that you will share your code with people other than your future self. The now widely used magrittr pipe operator and the dplyr tools are great frameworks for this purpose. However, if you want to keep your dependencies on other packages to a minimum, you may need to fall back on base R. In this series of blog posts, I will present some (maybe underused) base R tools for producing very readable R code.

Using startsWith and endsWith for string-matching

Base R (since version 3.3.0) has dedicated functions for matching the prefix or suffix of a string.

# Basic usage:
w <- "Hello World!"
startsWith(w, "Hell")
## [1] TRUE

startsWith(w, "Helo")
## [1] FALSE

endsWith(w, "!")
## [1] TRUE

Of course, both functions are vectorized, so they also work on character vectors. Can’t remember the exact name of a base function? Try this… ;)

base_funcs <- ls("package:base")

base_funcs[startsWith(base_funcs, "row")]
##  [1] "row"                    "rowMeans"              
##  [3] "rownames"               "row.names"             
##  [5] "row.names<-"            "rownames<-"            
##  [7] "row.names<-.data.frame" "row.names.data.frame"  
##  [9] "row.names<-.default"    "row.names.default"     
## [11] "rowsum"                 "rowsum.data.frame"     
## [13] "rowsum.default"         "rowSums"
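
Two properties of these functions are worth keeping in mind: the prefix or suffix is matched literally (it is not a regular expression), and missing values propagate. A minimal sketch; the vector `files` and its contents are made up for illustration:

```r
# The prefix is a fixed string, so "." is just a dot, not a wildcard.
literal_match <- startsWith("axb", "a.")   # FALSE
regex_match   <- grepl("^a.", "axb")       # TRUE: in a regex, "." matches any character

# Vectorized over the input; an NA element yields NA in the result.
files  <- c("report.csv", "notes.txt", NA)  # hypothetical file names
is_csv <- endsWith(files, ".csv")           # TRUE FALSE NA
```

So when special characters such as "." appear in the pattern, there is no need to escape them as you would with grepl.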

The ‘readable’ property really shines when combined with control flow.

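tell_file_type <- function(fn) {
    # Check different file endings; the leading dot keeps names
    # such as "mytxt" from being matched by accident
    if (endsWith(fn, ".txt")) {
        print("A text file.")
    }
    # endsWith is vectorized over the suffixes, any() collapses the result
    if (any(endsWith(fn, c(".xlsx", ".xls")))) {
        print("An Excel file.")
    }
}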
tell_file_type("A.txt")
## [1] "A text file."

tell_file_type("B.xls")
## [1] "An Excel file."

The resulting code reads very well.
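
The same checks combine just as naturally with else branches. Below is a small sketch of a variant that returns a label instead of printing it; the function name `file_type_label`, the "unknown" fallback and the case-insensitive matching are my assumptions, not part of the example above:

```r
# Hypothetical variant: returns a label and falls back to "unknown".
file_type_label <- function(fn) {
    fn <- tolower(fn)  # assumption: "A.TXT" should count as a text file too
    if (endsWith(fn, ".txt")) {
        "text"
    } else if (any(endsWith(fn, c(".xlsx", ".xls")))) {
        "Excel"
    } else {
        "unknown"
    }
}

file_type_label("A.TXT")   # "text"
file_type_label("B.xls")   # "Excel"
file_type_label("C.pdf")   # "unknown"
```

Like the original, this expects a single file name; for a vector of names you would need to loop over it, for example with vapply.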

Filter

Using another nice base function, Filter, the above code can be further improved.

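get_file_type <- function(fn) {
  # Names are the file types, values the endings to test for
  file_endings <- c(text="txt", Excel="xls", Excel="xlsx")
  # Keep only the endings that fn actually ends with
  Filter(file_endings, f = function(x) endsWith(fn, x))
}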

get_file_type("C.xlsx")
##  Excel 
## "xlsx"

Again, very readable to my eyes. Note that for this particular problem tools::file_ext is even more appropriate, since it extracts the extension directly, but I think the point has been made.
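
For completeness, here is a rough sketch of how tools::file_ext could be used for the same task. The lookup vector `file_types` is a hypothetical helper introduced only for this illustration:

```r
# file_ext returns everything after the last dot, or "" if there is no extension.
ext <- tools::file_ext(c("C.xlsx", "archive.tar.gz", "README"))
# ext is now: "xlsx" "gz" ""

# Hypothetical lookup table: extension -> file type
file_types <- c(txt = "text", xls = "Excel", xlsx = "Excel")
type_of_c <- unname(file_types[tools::file_ext("C.xlsx")])   # "Excel"
```

The tools package ships with R, so this still adds no external dependency.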

Last but not least, since Filter works on lists and a data frame is a list of columns, you can use it to select columns of a data frame as well.

dat <- data.frame(A=1:3, B=5:3, L=letters[1:3])
dat
##   A B L
## 1 1 5 a
## 2 2 4 b
## 3 3 3 c

Filter(dat, f = is.numeric)
##   A B
## 1 1 5
## 2 2 4
## 3 3 3

Filter(dat, f = Negate(is.numeric))  # or Filter(dat, f = function(x) !is.numeric(x))
##   L
## 1 a
## 2 b
## 3 c
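
To see what Filter does here, it may help to compare it with plain logical subsetting of the columns. This is only a sketch of the idea, not how you have to write it; note also that Filter's arguments are conventionally given positionally as Filter(f, x):

```r
dat <- data.frame(A = 1:3, B = 5:3, L = letters[1:3])

# Apply the predicate to every column, then subset the columns by the result.
keep <- vapply(dat, is.numeric, logical(1))   # A: TRUE, B: TRUE, L: FALSE
numeric_cols <- dat[keep]

names(numeric_cols)               # "A" "B"
names(Filter(is.numeric, dat))    # "A" "B", same columns with positional arguments
```

The Filter call says the same thing in one step, which is exactly why it reads so well.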
