Forest plots in R are often built with the forestplot package. However, I find that ggplot2 has several advantages for this task, such as allowing several variables with many categories to be shown together in a lattice (faceted) layout. You can also use any scale of your choice, such as a log scale. In this post, I show how to plot risk ratios and their confidence intervals for several conditions.
Let's start by loading the ggplot2 package.
library(ggplot2)
Data
For demonstration purposes, I will load a data set with five columns: Condition, RiskRatio, LowerLimit, UpperLimit, and Group. The data is in long format; if yours is not, check out the melt function in the reshape package, which provides an easy way to reshape data into long format. The reference group (GroupA) has RR = 1. My data is in xlsx format, so I load it using read_excel from the readxl package, as demonstrated below.
library(readxl)
RR_data <- data.frame(read_excel("C:/Users/fatakora/Dropbox/MY write R write ups/Risk_Ratio_data.xlsx"))
    Condition RiskRatio LowerLimit UpperLimit  Group
1  Condition1    1.0512     1.0174     1.0863 GroupB
2  Condition2    1.0169     0.9638     1.0731 GroupB
3  Condition3    1.0391     1.0185     1.0601 GroupB
10 Condition1    1.1057     1.0667     1.1463 GroupC
11 Condition2    1.4204     1.3471     1.4978 GroupC
12 Condition3    1.0344     1.0105     1.0589 GroupC
19 Condition1    1.0000     1.0000     1.0000 GroupA
20 Condition2    1.0000     1.0000     1.0000 GroupA
21 Condition3    1.0000     1.0000     1.0000 GroupA
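If you do not have the Excel file, one way to follow along is to type the rows printed above into a data frame directly. This is only a minimal sketch: it reproduces the nine rows shown, and the original file may well contain more (the row numbers jump from 3 to 10), so treat it as an illustration rather than the full data set.

```r
# Rebuild the rows shown in the printout
# (assumption: the real file may hold more rows than these nine)
RR_data <- data.frame(
  Condition  = rep(c("Condition1", "Condition2", "Condition3"), times = 3),
  RiskRatio  = c(1.0512, 1.0169, 1.0391,   # GroupB
                 1.1057, 1.4204, 1.0344,   # GroupC
                 1.0000, 1.0000, 1.0000),  # GroupA (reference)
  LowerLimit = c(1.0174, 0.9638, 1.0185,
                 1.0667, 1.3471, 1.0105,
                 1.0000, 1.0000, 1.0000),
  UpperLimit = c(1.0863, 1.0731, 1.0601,
                 1.1463, 1.4978, 1.0589,
                 1.0000, 1.0000, 1.0000),
  Group      = rep(c("GroupB", "GroupC", "GroupA"), each = 3)
)

# Quick sanity check: every interval should contain its point estimate
stopifnot(all(RR_data$LowerLimit <= RR_data$RiskRatio),
          all(RR_data$RiskRatio <= RR_data$UpperLimit))
str(RR_data)
```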
To keep the demonstration simple, we cap the upper limits at a maximum of 2 and the lower limits at a minimum of 0.5.
RR_data$UpperLimit[RR_data$UpperLimit > 2] <- 2
RR_data$LowerLimit[RR_data$LowerLimit < 0.5] <- 0.5
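The same truncation can also be written with pmin() and pmax() from base R, which take element-wise minima and maxima. The sketch below wraps this in a small helper; the name clamp_limits and the example vectors are hypothetical, chosen only to show limits that fall outside the 0.5 to 2 range.

```r
# Hypothetical helper: squeeze values into the interval [lower, upper]
clamp_limits <- function(x, lower = 0.5, upper = 2) {
  pmin(pmax(x, lower), upper)
}

# Made-up limits for illustration (not taken from the data above)
example_upper <- c(1.09, 2.70, 1.50)
example_lower <- c(0.96, 0.31, 1.02)

clamp_limits(example_upper)  # 2.70 is capped at 2
clamp_limits(example_lower)  # 0.31 is raised to 0.5
```

With the real data this would read RR_data$UpperLimit <- clamp_limits(RR_data$UpperLimit), and likewise for LowerLimit.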
ggplot2
The following code produces the forest plot.
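# Horizontal forest plot: one panel per condition, one point-range per group
p <- ggplot(data = RR_data,
            aes(x = Group, y = RiskRatio, ymin = LowerLimit, ymax = UpperLimit)) +
  geom_pointrange(aes(col = Group)) +
  # dashed reference line at RR = 1 (a fixed intercept needs no aes())
  geom_hline(yintercept = 1, linetype = 2) +
  xlab('Group') + ylab("Risk Ratio (95% Confidence Interval)") +
  geom_errorbar(aes(ymin = LowerLimit, ymax = UpperLimit, col = Group), width = 0.5, cex = 1) +
  facet_wrap(~Condition, strip.position = "left", nrow = 9, scales = "free_y") +
  theme(plot.title = element_text(size = 16, face = "bold"),
        axis.text.y = element_blank(),
        axis.ticks.y = element_blank(),
        axis.text.x = element_text(face = "bold"),
        axis.title = element_text(size = 12, face = "bold"),
        # rotate the panel labels so they read horizontally on the left;
        # newer ggplot2 releases may need strip.text.y.left here instead
        strip.text.y = element_text(hjust = 0, vjust = 1, angle = 180, face = "bold")) +
  coord_flip()  # flip so the risk ratio runs along the horizontal axis
p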
To use a log scale, add scale_y_log10. On a log scale, “half risk” (RR = 0.5) is the same distance from 1 as “double risk” (RR = 2.0). Note that the position argument controls where the axis appears (here I chose top; the default is bottom).
p + scale_y_log10(breaks=c(0.5,1,2),position="top",limits=c(0.5,2)) + guides(col = guide_legend(reverse = TRUE))
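The equidistance of 0.5 and 2 from 1 can be checked numerically. This short sketch compares the two distances on the original scale and on the log10 scale; it uses only base R, and the variable names are illustrative.

```r
rr <- c(half = 0.5, double = 2)

# Distance from the reference value 1 on the original scale
linear_dist <- abs(rr - 1)
# Distance from log10(1) = 0 on the log scale
log_dist <- abs(log10(rr) - log10(1))

linear_dist  # half is 0.5 away from 1, double is 1 away
log_dist     # both are log10(2) away, about 0.301
```

On the original scale a halving looks smaller than a doubling, while on the log scale the two are drawn symmetrically around 1, which is usually what you want for ratio measures.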
To make the lines vertical, just remove coord_flip() from p. In that case strip.text.y is no longer needed, since we don’t have to rotate or adjust the panel labels (the conditions). The strip.position argument in facet_wrap is also changed to “top”, and the y-axis ticks and text are no longer set to blank, as shown in the following code.
# Vertical forest plot: panels side by side, risk ratio on a log10 y-axis
p <- ggplot(data = RR_data,
            aes(x = Group, y = RiskRatio, ymin = LowerLimit, ymax = UpperLimit)) +
  geom_pointrange(aes(col = Group)) +
  # dashed reference line at RR = 1 (a fixed intercept needs no aes())
  geom_hline(yintercept = 1, linetype = 2) +
  xlab('Group') + ylab("Risk Ratio (95% Confidence Interval)") +
  geom_errorbar(aes(ymin = LowerLimit, ymax = UpperLimit, col = Group), width = 0.2, cex = 1) +
  facet_wrap(~Condition, strip.position = "top", nrow = 1, scales = "free_x") +
  theme(plot.title = element_text(size = 16, face = "bold"),
        axis.text.x = element_text(face = "bold"),
        axis.title = element_text(size = 12, face = "bold")) +
  scale_y_log10(breaks = c(0.5, 1, 2))
p
Conclusion
I have shown how to make lattice-like forest plots in R using ggplot2. The same approach extends to other estimates or measures and their confidence intervals. You can tweak the graphs further by adjusting the arguments of the functions used above.


