This post compares R and Python by the time they need to loop and generate pseudo-random numbers. To accomplish the task, the following steps were performed in both Python and R:
(1) loop 100k times (\(i\) is the loop index);
(2) at each step, draw one random integer from a pool whose size equals the step number: 1 to \(i\) in R, and 0 to \(i\) in Python, where the index starts at 0 and the step number is \(i\)+1;
(3) record the elapsed time at the probe steps \(i\) (\(i\)+1 for Python) in [10, 100, 1000, 5000, 10000, 25000, 50000, 75000, 100000].
R code
library(magrittr)
#number of the loop iterations
n_elements <- 1e5
#probe points
x <- c(10,100,1000,5000,10000,25000,50000,75000,100000)
#for loop
t <- Sys.time()
vec <- NULL
elapsed <- NULL
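for (i in seq_len(n_elements))
{
  # grow the vector by one random draw from 1..i
  vec <- c(vec, sample(i, size = 1, replace = TRUE))
  # 'units' is passed by name: the third positional argument of difftime() is 'tz'
  if (i %in% x)
    elapsed <- c(elapsed, as.numeric(difftime(Sys.time(), t, units = 'secs')))
}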
#lapply function
t <- Sys.time()
vec <- NULL
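elapsed_sapply <- lapply(seq_len(n_elements), function(i) {
  # note: this assignment is local to the function call, so the global vec is not grown here
  vec <- c(vec, sample(i, size = 1, replace = TRUE))
  # 'units' is passed by name: the third positional argument of difftime() is 'tz'
  if (i %in% x)
    return(as.numeric(difftime(Sys.time(), t, units = 'secs')))
}) %>% Filter(Negate(is.null), .) %>% unlist()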
Python code
from numpy import random as rand
import datetime as dt
#number of the loop iterations
n_elements = int(1e5)
#probe points
x = [10,100,1000,5000,10000,25000,50000,75000,100000]
#for loop
t = dt.datetime.now()
vec = []
elapsed = []
for i in range(n_elements):
    # rand.choice(i+1, ...) draws from 0..i, a pool of i+1 integers
    vec.append(rand.choice(i+1, size=1, replace=True))
    if i+1 in x:
        elapsed.append((dt.datetime.now() - t).total_seconds())
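A possible variant of the Python loop is sketched below. It is not the code behind the timings reported in this post. It swaps `datetime.now()` for `time.perf_counter()`, a monotonic clock meant for measuring intervals, and keeps the probe points in a set so the membership check does not scan a list at every step. The function name `time_loop` and the dictionary return value are assumptions introduced here for illustration.

```python
import time
from numpy import random as rand


def time_loop(n_elements, probes):
    """Run the loop from the post and return {step: seconds elapsed} at each probe step.

    Hypothetical helper: the name and return shape are not from the original post.
    """
    probes = set(probes)            # constant-time membership test
    vec = []
    elapsed = {}
    start = time.perf_counter()     # monotonic clock for interval timing
    for i in range(n_elements):
        # same draw as in the post: one integer from 0..i
        vec.append(rand.choice(i + 1, size=1, replace=True))
        step = i + 1
        if step in probes:
            elapsed[step] = time.perf_counter() - start
    return elapsed


if __name__ == "__main__":
    for step, seconds in time_loop(1000, [10, 100, 1000]).items():
        print(step, seconds)
```

Only probe points that fall inside the loop range appear in the result, so a short run such as `time_loop(1000, x)` returns the first three probe steps only.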
Results
The results are shown in the plot below (click here to explore the distributions in Plotly).
Conclusions
The following conclusions can be drawn:
- Python is faster than R when the number of iterations is below 1000. Below 100 steps, Python is up to 8 times faster than R, but above 1000 steps R beats Python when the lapply function is used.
- Try to avoid the for loop in R, especially when the number of looping steps exceeds 1000. Use the lapply function instead.
- The timing runaway of the R for loop starts at 10k looping steps.
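One likely contributor to the runaway in the last point, which this post does not measure separately, is that `c(vec, ...)` builds a new vector at every step, so the amount of copying grows with the loop index. The sketch below reproduces that pattern in Python with `np.append`, which also returns a fresh copy on each call, and sets it against a preallocated array. The function names `grow_by_copy`, `fill_preallocated`, and `timed` are hypothetical, and no timings are claimed here.

```python
import time
import numpy as np


def grow_by_copy(n, seed=0):
    """Append one draw per step; np.append copies the whole array on every call."""
    rng = np.random.default_rng(seed)
    vec = np.empty(0, dtype=np.int64)
    for i in range(n):
        vec = np.append(vec, rng.integers(0, i + 1))   # draw from 0..i
    return vec


def fill_preallocated(n, seed=0):
    """Allocate the array once and write each draw in place."""
    rng = np.random.default_rng(seed)
    vec = np.empty(n, dtype=np.int64)
    for i in range(n):
        vec[i] = rng.integers(0, i + 1)                # draw from 0..i
    return vec


def timed(fn, n):
    """Return the wall-clock seconds taken by fn(n)."""
    start = time.perf_counter()
    fn(n)
    return time.perf_counter() - start


if __name__ == "__main__":
    for n in (1000, 10000):
        print(n, timed(grow_by_copy, n), timed(fill_preallocated, n))
```

With the same seed both functions produce the same numbers, so any difference in run time comes from how the storage is grown rather than from the random draws.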
If you have questions please comment below.
