# Chapter 14 The apply() family

R provides the apply() family of functions, which iterate over the elements of an object for you and so reduce the need to write explicit loops.

## 14.1 apply()

Suppose we have a matrix, height, containing the heights (in metres) of five individuals (in rows) measured at four different times (in columns).

(height <- matrix(runif(20, 1.5, 2), nrow = 5, ncol = 4))
##          [,1]     [,2]
## [1,] 1.950628 1.620880
## [2,] 1.911863 1.618455
## [3,] 1.637759 1.828076
## [4,] 1.678700 1.958679
## [5,] 1.604016 1.794074
##          [,3]     [,4]
## [1,] 1.680440 1.629621
## [2,] 1.900025 1.728687
## [3,] 1.977953 1.878461
## [4,] 1.862474 1.838020
## [5,] 1.876476 1.742830

We would like to obtain the average height at each time step.

One option is to use a for() {} loop to iterate from column 1 to 4, use the function mean() to calculate the average of the values, and sequentially store the output value in a vector.
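The loop-based option can be sketched as follows. This is a minimal illustration, not the only way to write it: the names heightDemo and avgHeight are hypothetical, and the fixed seed is an assumption added so that the sketch is reproducible.

```r
# Hypothetical 5 x 4 matrix of simulated heights (metres)
set.seed(42)  # assumption: fixed seed, only so the sketch is reproducible
heightDemo <- matrix(runif(20, 1.5, 2), nrow = 5, ncol = 4)

# Pre-allocate the output vector, one slot per column (time step)
avgHeight <- numeric(ncol(heightDemo))

# Iterate over the columns and store each column mean in turn
for (j in seq_len(ncol(heightDemo))) {
  avgHeight[j] <- mean(heightDemo[, j])
}

avgHeight
```

The loop needs three pieces of bookkeeping (the output vector, the index, and the assignment), which is exactly what the apply() family hides.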

Alternatively, we can use the apply() function to apply mean() to every column of the height matrix. See the example below:

apply(X = height, MARGIN = 2, FUN = mean)
## [1] 1.756593 1.764033 1.859474
## [4] 1.763524

The apply() function takes three main arguments: X, a matrix or array (a data frame is coerced to a matrix first); MARGIN, which is 1 for row-wise computations or 2 for column-wise computations; and FUN, the function that is applied over the chosen margin of X.
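To make the effect of MARGIN concrete, here is a small sketch on a hand-made matrix. The object names (m, rowTotals, colTotals, rowSpread) are hypothetical and chosen only for this illustration.

```r
# A 2 x 3 matrix filled column by column:
#      [,1] [,2] [,3]
# [1,]    1    3    5
# [2,]    2    4    6
m <- matrix(1:6, nrow = 2)

# MARGIN = 1: apply FUN to each row
rowTotals <- apply(X = m, MARGIN = 1, FUN = sum)

# MARGIN = 2: apply FUN to each column
colTotals <- apply(X = m, MARGIN = 2, FUN = sum)

# FUN can also be an anonymous function, here the spread of each row
rowSpread <- apply(m, 1, function(r) max(r) - min(r))

rowTotals  # 9 12
colTotals  # 3 7 11
rowSpread  # 4 4
```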

## 14.2 lapply()

lapply() applies a function to every element of a list.

The output is also a list (which explains the “l” in lapply) and has the same number of elements as the object passed to it.

SimulatedData <- list(SimpleSequence = 1:4, Norm10 = rnorm(10),
                      Norm20 = rnorm(20, 1), Norm100 = rnorm(100, 5))

# Apply mean to each element of the list
lapply(X = SimulatedData, FUN = mean)
## $SimpleSequence
## [1] 2.5
##
## $Norm10
## [1] -0.1416001
##
## $Norm20
## [1] 0.6315101
##
## $Norm100
## [1] 4.93122

If the object passed to lapply() is not a list, it is first coerced to one via base::as.list().
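A short sketch of this coercion, using an atomic vector as input. The name squares is hypothetical; the point is only that the result is a list even though the input was not.

```r
# 1:3 is an integer vector, not a list; lapply() treats it as
# as.list(1:3), i.e. a list with the elements 1, 2 and 3
squares <- lapply(X = 1:3, FUN = function(i) i^2)

is.list(squares)  # TRUE
length(squares)   # 3, one element per element of the input
unlist(squares)   # 1 4 9
```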

## 14.3 sapply()

sapply() is a ‘wrapper’ function for lapply() that simplifies the output where possible, returning a vector (or a matrix) instead of a list.

SimulatedData <- list(SimpleSequence = 1:4, Norm10 = rnorm(10),
                      Norm20 = rnorm(20, 1), Norm100 = rnorm(100, 5))

# Apply mean to each element of the list
sapply(SimulatedData, mean)
## SimpleSequence         Norm10
##      2.5000000     -0.6693939
##         Norm20        Norm100
##      1.1531907      5.0468430
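The shape of the simplified result depends on what the function returns for each element. The sketch below, on a small hypothetical list (x, means and rng are names made up for this illustration), contrasts a function that returns one value per element with one that returns two.

```r
# Hypothetical input: a named list of two numeric vectors
x <- list(a = 1:4, b = c(2, 10))

# mean() returns a single value per element,
# so sapply() simplifies the result to a named vector
means <- sapply(x, mean)

# range() returns two values (min, max) per element,
# so sapply() simplifies the result to a matrix with one column per element
rng <- sapply(x, range)

means  # a = 2.5, b = 6
dim(rng)  # 2 2
```

Because the output type depends on the results, it is worth checking what sapply() returned before using it further in a script.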

## 14.4 mapply()

mapply() works as a multivariate version of sapply().

It applies a given function to the first elements of each argument, then to the second elements, and so on. For example:

lilySeeds <- c(80, 65, 89, 23, 21)
poppySeeds <- c(20, 35, 11, 77, 79)

# Add the lily and poppy seed counts element by element
mapply(sum, lilySeeds, poppySeeds)
## [1] 100 100 100 100 100
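The function passed to mapply() can take one named parameter per input vector. As a minimal sketch (the anonymous function and the names propLily and repeated are assumptions for illustration), the proportion of lily seeds in each pair can be computed like this:

```r
lilySeeds <- c(80, 65, 89, 23, 21)
poppySeeds <- c(20, 35, 11, 77, 79)

# The function receives one element from each vector per call:
# (80, 20), then (65, 35), and so on
propLily <- mapply(function(lily, poppy) lily / (lily + poppy),
                   lilySeeds, poppySeeds)
propLily  # 0.80 0.65 0.89 0.23 0.21

# When the results have different lengths they cannot be simplified,
# so mapply() returns a list: rep(1, 3), rep(2, 2), rep(3, 1)
repeated <- mapply(rep, 1:3, 3:1)
```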

## 14.5 tapply()

tapply() is used to apply a function over subsets of a vector.

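It is primarily used when a dataset contains different groups (i.e. the levels of a factor) and we want to apply a function to each of these groups. For example, the built-in mtcars dataset records, among other variables, the number of cylinders (cyl) and the horsepower (hp) of each car.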

head(mtcars)
##                    mpg cyl disp  hp drat    wt  qsec vs am gear carb
## Mazda RX4         21.0   6  160 110 3.90 2.620 16.46  0  1    4    4
## Mazda RX4 Wag     21.0   6  160 110 3.90 2.875 17.02  0  1    4    4
## Datsun 710        22.8   4  108  93 3.85 2.320 18.61  1  1    4    1
## Hornet 4 Drive    21.4   6  258 110 3.08 3.215 19.44  1  0    3    1
## Hornet Sportabout 18.7   8  360 175 3.15 3.440 17.02  0  0    3    2
## Valiant           18.1   6  225 105 2.76 3.460 20.22  1  0    3    1

tapply(mtcars$hp, mtcars$cyl, FUN = mean)
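To see how tapply() splits a vector by group, here is a minimal sketch on made-up data, small enough to check by hand. The vectors seedCount and species and the result meanBySpecies are hypothetical and not part of mtcars.

```r
# Hypothetical seed counts for six plants of two species
seedCount <- c(80, 65, 89, 23, 21, 40)
species <- factor(c("lily", "lily", "lily", "poppy", "poppy", "poppy"))

# Split seedCount by the levels of species, then apply mean() to each subset
meanBySpecies <- tapply(X = seedCount, INDEX = species, FUN = mean)

meanBySpecies
# lily:  (80 + 65 + 89) / 3 = 78
# poppy: (23 + 21 + 40) / 3 = 28
```

The result has one element per level of the grouping factor, named after that level. The mtcars call follows the same pattern, with cyl as the grouping variable.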