Calculating Entropy the Functional Way
Previously, I wrote a short article on how to implement a left fold in R. It seemed fairly obvious that R must already have a built-in function for it. At the time, I assumed it would either be called "reduce" or not exist at all; the actual function is "Reduce", with a capital R. As a side note, I do not really understand the naming scheme of functions in the R base library.
So here is a fairly straightforward way to calculate Shannon's entropy in R using Reduce:
> fentropy <- function(x,y) { x + (-y * log2(y)) }
> Reduce(fentropy, c(0.5,0.5), 0)
[1] 1
> Reduce(fentropy, c(0.25,0.25,0.25,0.25), 0)
[1] 2
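The first call is the binary case, with an entropy of 1 bit; the second uses four uniformly distributed values, with an entropy of 2 bits.
Last but not least, we could also write an entropy function the "R way", using R's vectorized functions. This version also treats zero probabilities as contributing nothing to the sum: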
entropy <- function(ps) {
  H <- -sum(ifelse(ps > 0, ps * log2(ps), 0))
  return(H)
}
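As a quick check, the two approaches can be compared on a distribution that contains a zero probability. This is only a sketch: the step function fentropy0 is a hypothetical variant of fentropy that skips zeros (it is not part of the original code), and entropy is repeated so the block runs on its own.

```r
# Vectorized entropy, repeated here so the block is self-contained.
entropy <- function(ps) {
  -sum(ifelse(ps > 0, ps * log2(ps), 0))
}

# Hypothetical fold step that skips zero probabilities (an assumption,
# not part of the original fentropy).
fentropy0 <- function(acc, p) {
  if (p > 0) acc - p * log2(p) else acc
}

ps <- c(0.5, 0.25, 0.25, 0)
h_vec  <- entropy(ps)               # vectorized version
h_fold <- Reduce(fentropy0, ps, 0)  # fold-based version
print(c(h_vec, h_fold))             # both are 1.5 bits
```

The original fentropy would yield NaN on this input, because 0 * log2(0) evaluates to NaN in R.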
Fold Left in R
Left and right folds are among the most frequently used higher-order functions in functional programming.
A left fold [foldleft f accu l] applies the function f to the accumulator accu and the head of the list l. The result becomes the new accumulator, which is passed to the next recursive call together with the tail of the list.
A left fold is easily implemented in R as follows (a right fold is analogous):
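foldleft <- function(f, accu, l) {
  if (length(l) == 0) {
    # Nothing left to fold: the accumulator is the result.
    accu
  } else {
    head <- l[[1]]  # first element ([[ ]] also works when l is a list)
    tail <- l[-1]   # everything except the first element
    foldleft(f, f(accu, head), tail)
  }
}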
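For comparison, a right fold could be sketched along the same lines. The name foldright and its argument order are assumptions made here for illustration; the comparison with Reduce uses base R's right = TRUE argument.

```r
# Hypothetical right-fold counterpart to foldleft: the head is combined
# with the result of folding the tail.
foldright <- function(f, accu, l) {
  if (length(l) == 0) {
    accu
  } else {
    f(l[[1]], foldright(f, accu, l[-1]))
  }
}

# Subtraction is not associative, so the fold direction shows in the result:
# 1 - (2 - (3 - 0)) = 2
r_manual <- foldright(function(x, acc) x - acc, 0, c(1, 2, 3))

# Base R's Reduce with right = TRUE folds in the same order.
r_base <- Reduce(function(x, acc) x - acc, c(1, 2, 3), 0, right = TRUE)
print(c(r_manual, r_base))
```

Note that in the right fold the step function receives the element first and the accumulator second, the reverse of the left fold.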
To see how it works, we could apply it to calculate the variance of a fair die. Remember the variance is just $\mathrm{Var}(X) = \sum_{i=1}^{6} \frac{1}{6} (i - \mu)^2$, where $\mu$ is the mean; each term of the sum is added by the following function f:
mu <- sum(1:6) / 6  # mean of a fair die (3.5); named mu to avoid masking base::mean
f <- function(accu, i) {
  accu + (1/6 * (i - mu)^2)
}
foldleft(f, 0, c(1, 2, 3, 4, 5, 6))
where our last call to foldleft evaluates to 2.916667 (that is, 35/12).
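The result can be cross-checked against base R. The sketch below recomputes the variance once with Reduce and once with a vectorized expression; the names faces, mu, and step are chosen here for illustration only.

```r
faces <- 1:6
mu <- sum(faces) / 6                   # mean of a fair die: 3.5

# Same step as in the fold: add one weighted squared deviation per face.
step <- function(accu, i) accu + (1 / 6) * (i - mu)^2

v_fold <- Reduce(step, faces, 0)       # fold with base R's Reduce
v_vec  <- sum((faces - mu)^2) / 6      # vectorized population variance
print(c(v_fold, v_vec))                # both print as 2.916667
```

Note that base R's var(1:6) returns 3.5 rather than 2.916667, because var computes the sample variance with an n - 1 denominator.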