## Example of Profiling R code.

These examples are courtesy of Phil Spector.

Consider the problem of removing rows from a data frame where any value in the row has an NA, i.e. there is any incomplete data.

Here are four different functions for doing this.

This first function, creates the output data frame one row at a time, that is it adds a new row to the object res each time it finds a row that has no NAs.

```funAgg = function(x) {
# initialize res
res = NULL
n = nrow(x)

for (i in 1:n) {
if (!any(is.na(x[i,]))) res = rbind(res, x[i,])
}
res
}
```

An alternative here is to initialise the res object to the correct size (or roughly the correct size). Then replace the rows one at a time as a row in the input with no NAs is found. Notice that we have to keep track of where we are in the output

```funLoop = function(x) {
# Initialize res with x
res = x
n = nrow(x)
k = 1

for (i in 1:n) {
if (!any(is.na(x[i,]))) {
res[k, ] = x[i,]
k = k + 1
}
}
res[1:(k-1), ]
}
```

Our next attempt, uses the property of the is.na function (i.e. that it returns a matrix of logicals when given a data frame) and it employs the apply function. It is very simple.

```funApply = function(x) {
drop = apply(is.na(x), 1, any)
x[!drop, ]
}
```

This function is what is used in R when you call omit.na.

```funOmit = function(x) {
# The or operation is very fast, it is replacing the any function
# Also note that it doesn't require having another data frame as big as x

drop = F
n = ncol(x)
for (i in 1:n)
drop = drop | is.na(x[, i])
x[!drop, ]
}
```
```#Make up large test case
xx = matrix(rnorm(2000000),100000,20)
xx[xx>2] = NA
x = as.data.frame(xx)

# Call the R code profiler and give it an output file to hold results
Rprof("exampleAgg.out")
# Call the function to be profiled
y = funAgg(xx)
Rprof(NULL)

Rprof("exampleLoop.out")
y = funLoop(xx)
Rprof(NULL)

Rprof("exampleApply.out")
y = funApply(xx)
Rprof(NULL)

Rprof("exampleOmit.out")
y = funOmit(xx)
Rprof(NULL)
```

The output from the profiling.

The function funAgg - interrupted before completion
Each sample represents 0.02 seconds.
Total run time: 73.9000000000024 seconds. Total seconds: time spent in function and callees.
Self seconds: time spent in function alone.

```   %       total       %       self
total    seconds     self    seconds    name
100.00     73.90      0.32      0.24     "funAgg"
99.16     73.28     99.16     73.28     "rbind"
0.46      0.34      0.38      0.28     "any"
0.08      0.06      0.08      0.06     "is.na"
0.05      0.04      0.05      0.04     "!"

%       self        %       total
self     seconds    total    seconds    name
99.16     73.28     99.16     73.28     "rbind"
0.38      0.28      0.46      0.34     "any"
0.32      0.24    100.00     73.90     "funAgg"
0.08      0.06      0.08      0.06     "is.na"
0.05      0.04      0.05      0.04     "!"
```

The Function funLoop
Each sample represents 0.02 seconds.
Total run time: 3.56 seconds.

Total seconds: time spent in function and callees.
Self seconds: time spent in function alone.

```   %       total       %       self
total    seconds     self    seconds    name
100.00      3.56     50.56      1.80     "funLoop"
46.07      1.64     39.89      1.42     "any"
6.18      0.22      6.18      0.22     "is.na"
2.25      0.08      2.25      0.08     "!"
1.12      0.04      1.12      0.04     "+"

%       self        %       total
self     seconds    total    seconds    name
50.56      1.80    100.00      3.56     "funLoop"
39.89      1.42     46.07      1.64     "any"
6.18      0.22      6.18      0.22     "is.na"
2.25      0.08      2.25      0.08     "!"
1.12      0.04      1.12      0.04     "+"
```

The function funApply
Each sample represents 0.02 seconds.
Total run time: 3.94 seconds.

Total seconds: time spent in function and callees.
Self seconds: time spent in function alone.

```   %       total       %       self
total    seconds     self    seconds    name
100.00      3.94      2.54      0.10     "funApply"
97.46      3.84     46.70      1.84     "apply"
23.86      0.94     23.86      0.94     "FUN"
18.27      0.72      1.52      0.06     "unlist"
17.26      0.68      0.00      0.00     "any"
16.75      0.66     15.74      0.62     "lapply"
6.60      0.26      6.60      0.26     "aperm"
2.03      0.08      2.03      0.08     "is.na"
1.02      0.04      1.02      0.04     "names<-.default"
1.02      0.04      0.00      0.00     "names<-"

%       self        %       total
self     seconds    total    seconds    name
46.70      1.84     97.46      3.84     "apply"
23.86      0.94     23.86      0.94     "FUN"
15.74      0.62     16.75      0.66     "lapply"
6.60      0.26      6.60      0.26     "aperm"
2.54      0.10    100.00      3.94     "funApply"
2.03      0.08      2.03      0.08     "is.na"
1.52      0.06     18.27      0.72     "unlist"
1.02      0.04      1.02      0.04     "names<-.default"
```

The function funOmit
Each sample represents 0.02 seconds.
Total run time: 0.62 seconds.

Total seconds: time spent in function and callees.
Self seconds: time spent in function alone.

```   %       total       %       self
total    seconds     self    seconds    name
100.00      0.62     38.71      0.24     "funOmit"
48.39      0.30     48.39      0.30     "|"
12.90      0.08     12.90      0.08     "is.na"

%       self        %       total
self     seconds    total    seconds    name
48.39      0.30     48.39      0.30     "|"
38.71      0.24    100.00      0.62     "funOmit"
12.90      0.08     12.90      0.08     "is.na"
```