Chapter 24 Writing functions
Imagine that we would like to rescale variables to the range of 0 to 1.
# our.dataset has 4 variables
our.dataset <- data.frame(a = rnorm(10), b = rnorm(10), c = rnorm(10),
                          d = rnorm(10))
The equation for rescaling a variable to the unit interval [0, 1] is:
\[x_{\text{new}} = \frac{x_i - \text{min}(x)}{ \text{max}(x) - \text{min}(x)}\]
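As a quick check of the formula, here is a minimal sketch on a made-up vector. The values are purely illustrative and are not taken from our.dataset.

```r
# Illustrative values only (an assumption, not part of our.dataset)
x <- c(2, 4, 10)

# Subtract the minimum, then divide by the range (max - min)
scaled <- (x - min(x)) / (max(x) - min(x))
scaled
# 0.00 0.25 1.00
```

The smallest value maps to 0, the largest to 1, and everything else falls proportionally in between.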
We could rescale these four variables to the range 0 to 1 by doing the following:
our.dataset$a <- (our.dataset$a - min(our.dataset$a, na.rm = TRUE))/(max(our.dataset$a,
    na.rm = TRUE) - min(our.dataset$a, na.rm = TRUE))
our.dataset$b <- (our.dataset$b - min(our.dataset$b, na.rm = TRUE))/(max(our.dataset$b,
    na.rm = TRUE) - min(our.dataset$b, na.rm = TRUE))
our.dataset$c <- (our.dataset$c - min(our.dataset$c, na.rm = TRUE))/(max(our.dataset$c,
    na.rm = TRUE) - min(our.dataset$c, na.rm = TRUE))
our.dataset$d <- (our.dataset$d - min(our.dataset$d, na.rm = TRUE))/(max(our.dataset$d,
    na.rm = TRUE) - min(our.dataset$d, na.rm = TRUE))
What if our dataset had 31 variables?
Repeating that equation and this chunk of code 31 times would be a tedious and inefficient process:
our.dataset$a <- (our.dataset$a - min(our.dataset$a, na.rm = TRUE))/(max(our.dataset$a,
    na.rm = TRUE) - min(our.dataset$a, na.rm = TRUE))
However, we can see that, apart from the input, the code is practically the same for every variable.
The function used here was deliberately hidden so as not to confuse the participants.
# our secret hidden function
rescale01(our.dataset$a)
rescale01(our.dataset$b)
rescale01(our.dataset$c)
rescale01(our.dataset$d)
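The hidden function itself is not shown here. As a rough guess at its shape, and not the author's actual definition, a function that wraps the rescaling formula could look like the sketch below. The name rescale01_sketch and the toy data frame are assumptions made only for illustration.

```r
# Hypothetical stand-in for the hidden function; not the author's code.
rescale01_sketch <- function(x) {
  # range() returns c(min, max); na.rm = TRUE ignores missing values
  rng <- range(x, na.rm = TRUE)
  (x - rng[1]) / (rng[2] - rng[1])
}

# Toy data, assumed for illustration only
toy <- data.frame(a = c(1, 2, 3), b = c(10, NA, 30))

# Apply the same function to every column instead of repeating the code;
# assigning into toy[] keeps the result as a data frame
toy[] <- lapply(toy, rescale01_sketch)
toy
```

With a function like this, the number of variables no longer matters: 4 columns or 31 are handled by the same line. Note that in this sketch a column whose values are all identical would divide zero by zero and return NaN.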