Loops in R and Python: Which is faster?

This post compares R and Python in terms of the time they need to loop and generate pseudo-random numbers. To accomplish the task, the following steps were performed in both Python and R:
(1) loop 100k times (\(i\) is the loop index);
(2) generate a random integer from the array of integers from 1 to the current loop index \(i\) (\(i\)+1 for Python, where the loop index starts at 0);
(3) record the elapsed time at the probe loop steps: \(i\) (\(i\)+1 for Python) in [10, 100, 1000, 5000, 10000, 25000, 50000, 75000, 100000].

R code

library(magrittr)
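# number of loop iterations
n_elements <- 1e5
# probe points at which the elapsed time is recorded
x <- c(10, 100, 1000, 5000, 10000, 25000, 50000, 75000, 100000)
# for loop
t <- Sys.time()
vec <- NULL
elapsed <- NULL
for (i in seq_len(n_elements))
{
    # c() builds a new, longer vector on every iteration
    vec <- c(vec, sample(i, size = 1, replace = TRUE))
    # units must be passed by name: the third positional argument of difftime() is tz
    if (i %in% x)
        elapsed <- c(elapsed, as.numeric(difftime(Sys.time(), t, units = 'secs')))
}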
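# lapply function
t <- Sys.time()
vec <- NULL
elapsed_lapply <- lapply(seq_len(n_elements), function(i) {
    # note: this assignment is local to the function, so the outer vec is never grown
    vec <- c(vec, sample(i, size = 1, replace = TRUE))
    # units must be passed by name: the third positional argument of difftime() is tz
    if (i %in% x)
        return(as.numeric(difftime(Sys.time(), t, units = 'secs')))
}) %>% Filter(Negate(is.null), .) %>% unlist()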

Python code

from numpy import random as rand
import datetime as dt

#number of the loop iterations
n_elements = int(1e5)
#probe points
x = [10,100,1000,5000,10000,25000,50000,75000,100000]

#for loop
t = dt.datetime.now()
vec = []
elapsed = []

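for i in range(n_elements):
    # rand.choice(k) draws from 0..k-1, so i+1 gives a pool of i+1 integers;
    # size=1 makes each appended item a one-element array
    vec.append(rand.choice(i+1, size=1, replace=True))
    # i starts at 0, so i+1 is the number of completed iterations
    if i+1 in x:
        elapsed.append((dt.datetime.now() - t).total_seconds())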
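
For readers who want to repeat the measurement, here is a minimal sketch of the same timing loop with two substitutions that are assumptions of this sketch, not part of the original benchmark: `time.perf_counter()` replaces `datetime.now()` because it is a monotonic, high-resolution clock, and the standard-library `random.randint` replaces `numpy.random.choice` so the snippet runs without NumPy. The function name `time_loop` is hypothetical, and the timings it prints will not match the results reported in this post.

```python
import random
import time


def time_loop(n_elements, probes):
    """Return the generated values and the elapsed seconds at each probe step."""
    probe_set = set(probes)          # constant-time membership test per iteration
    vec = []
    elapsed = []
    start = time.perf_counter()      # monotonic, high-resolution clock
    for i in range(1, n_elements + 1):
        # random.randint is inclusive on both ends, so this draws from 1..i
        vec.append(random.randint(1, i))
        if i in probe_set:
            elapsed.append(time.perf_counter() - start)
    return vec, elapsed


probes = [10, 100, 1000, 5000, 10000]
vec, elapsed = time_loop(10_000, probes)
for step, seconds in zip(probes, elapsed):
    print(f"{step:>6} iterations: {seconds:.6f} s")
```

Running the loop from 1 to `n_elements` removes the `i+1` offset, so the probe steps can be compared with the loop index directly.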

Results

The results are presented in the plot below; an interactive Plotly version is available for exploring the distributions.

Conclusions

The following conclusions can be drawn:

Python is faster than R when the number of iterations is below 1000. Below 100 steps, Python is up to 8 times faster than R, whereas above 1000 steps R beats Python when the lapply function is used.
Try to avoid the for loop in R, especially when the number of loop steps exceeds 1000. Use the lapply function instead.
The runtime of the R for loop starts to run away at 10k loop steps.
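
One likely explanation for the runaway, which this post did not measure, is that the R for loop grows `vec` with `c(vec, ...)`, and `c()` builds a new vector on every step. The sketch below is a simplified cost model, not a benchmark. It counts element writes under that assumption and compares them with an append that stores only the new element, as Python's `list.append` roughly does if occasional reallocations are ignored. Both function names are hypothetical.

```python
def writes_when_rebuilding(n):
    """Element writes if each of n appends rebuilds the whole vector."""
    # Append number k produces a fresh vector of length k: 1 + 2 + ... + n.
    return n * (n + 1) // 2


def writes_when_appending_in_place(n):
    """Element writes if each append stores only the new element."""
    return n


for n in [10, 100, 1000, 10000, 100000]:
    rebuilt = writes_when_rebuilding(n)
    in_place = writes_when_appending_in_place(n)
    print(f"{n:>6} steps: {rebuilt:>12} vs {in_place:>6} writes "
          f"(ratio {rebuilt / in_place:.1f})")
```

Under this model the work for the rebuilding strategy grows quadratically with the number of steps, so the gap is small for a few hundred iterations and large by 10k.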

If you have questions please comment below.
