Report generation is an important part of any organization’s Business Intelligence and Analytics division. The ability to create automated reports from the given data is something any innovative team would strive for. It is also one area where SAS is considered more mature than R, not because R lacks those features, but primarily because R practitioners are less familiar with them. That was my feeling today when I stumbled upon the glue package in R, which is a very good and competitive alternative to reporting-template packages like whisker and brew.
The package can be installed directly from CRAN.
install.packages('glue')
Let us try to put together a very minimal reporting template to output basic information about the given Dataset.
library(glue)
df <- mtcars
# Template string: R expressions inside {} are evaluated when glue() is called
msg <- 'Dataframe Info: \n\n This dataset has {nrow(df)} rows and {ncol(df)} columns. \n There {ifelse(sum(is.na(df)) == 1, "is", "are")} {sum(is.na(df))} Missing Values.'
glue(msg)

Dataframe Info:

This dataset has 32 rows and 11 columns.
There are 0 Missing Values.
As in the above code, glue() is the primary function. It takes a string containing R expressions enclosed in curly braces {}, evaluates each expression, and interpolates the result into the string at that position. The templatised string itself is what we created with msg.
This whole exercise wouldn’t make much sense if it were required for only one dataset; it serves its purpose when the same code is used for different datasets with no code change. Hence let us try running it on a different dataset, R’s built-in iris dataset. Also, since we are outputting the count of missing values, let’s manually assign NA to two cells and run the code.
df <- iris
# Introduce two missing values by hand
df[2, 3] <- NA
df[4, 2] <- NA
msg <- 'Dataframe Info: \n\n This dataset has {nrow(df)} rows and {ncol(df)} columns. \n There {ifelse(sum(is.na(df)) == 1, "is", "are")} {sum(is.na(df))} Missing Values.'
glue(msg)

Dataframe Info:

This dataset has 150 rows and 5 columns.
There are 2 Missing Values.
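Since the point is to reuse the same template across datasets, one option is to wrap it in a small function. The following is only a sketch: describe_df() is a hypothetical helper name, not part of glue, and it relies on glue() looking up variables in the calling environment by default, so the local variables inside the function are visible to the template.

```r
library(glue)

# Hypothetical helper (not part of glue): summarises any data frame.
describe_df <- function(df) {
  n_missing <- sum(is.na(df))
  # Pick singular or plural wording based on the count
  verb <- if (n_missing == 1) "is" else "are"
  noun <- if (n_missing == 1) "value" else "values"

  # glue() evaluates the {} expressions using this function's local variables
  glue("
    Dataframe Info:

    This dataset has {nrow(df)} rows and {ncol(df)} columns.
    There {verb} {n_missing} missing {noun}.
  ")
}

describe_df(mtcars)

iris_na <- iris
iris_na[2, 3] <- NA
iris_na[4, 2] <- NA
describe_df(iris_na)
```

With this, switching datasets only means changing the argument, and the singular/plural wording is handled in one place.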
The summary looks fine. But what if we also want to report the contents of the data frame? That’s where coupling glue's glue_data() function with magrittr's %>% operator helps.
library(magrittr)
head(mtcars) %>% glue_data("* {rownames(.)} has {cyl} cylinders and {hp} hp")

* Mazda RX4 has 6 cylinders and 110 hp
* Mazda RX4 Wag has 6 cylinders and 110 hp
* Datsun 710 has 4 cylinders and 93 hp
* Hornet 4 Drive has 6 cylinders and 110 hp
* Hornet Sportabout has 8 cylinders and 175 hp
* Valiant has 6 cylinders and 105 hp
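The pipe is a convenience rather than a requirement. As a minimal sketch, glue_data() can also be called directly with the data frame as its first argument. Here the row names are copied into a column (model, a name chosen only for this illustration) so the template can refer to them like any other column, and glue_collapse() (available in recent versions of glue) joins the per-row strings into a single block of text.

```r
library(glue)

cars <- head(mtcars)
cars$model <- rownames(cars)  # expose row names as a regular column

# One string per row; column names are looked up in the data frame
bullets <- glue_data(cars, "* {model} has {cyl} cylinders and {hp} hp")

# Join the rows into a single string, one bullet per line
report <- glue_collapse(bullets, sep = "\n")
report
```

The collapsed string can then be passed to cat() or writeLines() to place it in a larger report.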
This is just to introduce glue and its possibilities. It could potentially help in automating a lot of reports and in getting started with exception-based reporting. The code used in the article can be found on my GitHub.
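As one illustration of the exception-based reporting mentioned above, the sketch below reports only the rows that breach a limit and prints a single all-clear line otherwise. The threshold hp_limit and the wording of the messages are assumptions made up for this example.

```r
library(glue)

hp_limit <- 300  # hypothetical threshold, chosen only for illustration

cars <- mtcars
cars$model <- rownames(cars)

# Keep only the rows that breach the limit
flagged <- cars[cars$hp > hp_limit, ]

report <- if (nrow(flagged) == 0) {
  glue("No cars exceed {hp_limit} hp.")
} else {
  # hp_limit is not a column, so glue_data() finds it in the calling environment
  glue_data(flagged, "ALERT: {model} has {hp} hp (limit: {hp_limit})")
}
report
```

Changing hp_limit, or swapping in a different data frame and condition, is all it takes to reuse the same pattern for another check.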