Forest plots in R are often built with the forestplot package. However, I find that ggplot2 has several advantages for this task: it can display several variables with many categories in a lattice (faceted) layout, and it lets you use any axis scale you like, such as a log scale. In this post, I will show how to plot risk ratios and their confidence intervals for several conditions.
Let's start by loading the ggplot2 package in R.
library(ggplot2)
Data
For demonstration purposes, I will load a data set with the columns Condition, RiskRatio, LowerLimit, UpperLimit, and Group. The data is in long format; if yours is not, check out the melt function in the reshape package, which provides an easy way to reshape data into long format. The reference group (GroupA here) has RR = 1. My data is in xlsx format, so I load it using read_excel from the readxl package, as demonstrated below.
library(readxl)
RR_data <- data.frame(read_excel("C:/Users/fatakora/Dropbox/MY write R write ups/Risk_Ratio_data.xlsx"))

    Condition RiskRatio LowerLimit UpperLimit  Group
1  Condition1    1.0512     1.0174     1.0863 GroupB
2  Condition2    1.0169     0.9638     1.0731 GroupB
3  Condition3    1.0391     1.0185     1.0601 GroupB
10 Condition1    1.1057     1.0667     1.1463 GroupC
11 Condition2    1.4204     1.3471     1.4978 GroupC
12 Condition3    1.0344     1.0105     1.0589 GroupC
19 Condition1    1.0000     1.0000     1.0000 GroupA
20 Condition2    1.0000     1.0000     1.0000 GroupA
21 Condition3    1.0000     1.0000     1.0000 GroupA
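If you do not have the spreadsheet at hand, the rows printed above can be typed in directly. This is only a sketch for following along: the values are copied from the printed output, and the original row numbers from the spreadsheet are not reproduced.

```r
# Rebuild the example data from the values printed in this post.
# Assumption: the spreadsheet's original row numbers are not kept.
RR_data <- data.frame(
  Condition  = rep(c("Condition1", "Condition2", "Condition3"), times = 3),
  RiskRatio  = c(1.0512, 1.0169, 1.0391, 1.1057, 1.4204, 1.0344, 1, 1, 1),
  LowerLimit = c(1.0174, 0.9638, 1.0185, 1.0667, 1.3471, 1.0105, 1, 1, 1),
  UpperLimit = c(1.0863, 1.0731, 1.0601, 1.1463, 1.4978, 1.0589, 1, 1, 1),
  Group      = rep(c("GroupB", "GroupC", "GroupA"), each = 3)
)

# Inspect the column types and the first values.
str(RR_data)
```

With this data frame in place, the rest of the code in this post should run without the Excel file.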
For ease of demonstration, we truncate the upper limits at a maximum of 2 and the lower limits at a minimum of 0.5.
RR_data$UpperLimit[RR_data$UpperLimit > 2] <- 2
RR_data$LowerLimit[RR_data$LowerLimit < 0.5] <- 0.5
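The same truncation can also be written with pmax() and pmin(), which work element-wise on a vector. Below is a minimal sketch on a small made-up data frame; the name `toy` and its values are hypothetical, chosen so that some limits actually get clipped.

```r
# Hypothetical intervals; two of them extend beyond the plotting range.
toy <- data.frame(
  LowerLimit = c(0.30, 0.90, 1.10),
  UpperLimit = c(0.80, 1.20, 3.50)
)

# Raise lower limits below 0.5 up to 0.5, and cap upper limits above 2 at 2.
toy$LowerLimit <- pmax(toy$LowerLimit, 0.5)
toy$UpperLimit <- pmin(toy$UpperLimit, 2)

toy
```

Keep in mind that truncated intervals no longer show the true limits, so it is worth mentioning the truncation in the figure caption.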
ggplot2
The following code produces the forest plot, with one panel per condition.
p <- ggplot(data = RR_data,
            aes(x = Group, y = RiskRatio, ymin = LowerLimit, ymax = UpperLimit)) +
  # point estimate with its interval, coloured by group
  geom_pointrange(aes(col = Group)) +
  # dashed reference line at RR = 1 (it needs no aesthetic mapping)
  geom_hline(yintercept = 1, linetype = 2) +
  xlab("Group") +
  ylab("Risk Ratio (95% Confidence Interval)") +
  geom_errorbar(aes(ymin = LowerLimit, ymax = UpperLimit, col = Group),
                width = 0.5, cex = 1) +
  # one panel per condition, with the panel labels on the left
  facet_wrap(~Condition, strip.position = "left", nrow = 9, scales = "free_y") +
  theme(plot.title = element_text(size = 16, face = "bold"),
        axis.text.y = element_blank(),
        axis.ticks.y = element_blank(),
        axis.text.x = element_text(face = "bold"),
        axis.title = element_text(size = 12, face = "bold"),
        strip.text.y = element_text(hjust = 0, vjust = 1, angle = 180, face = "bold")) +
  # flip the axes so that the intervals are drawn horizontally
  coord_flip()
p
To use a log scale, add scale_y_log10(). On a log scale, "half risk" (RR = 0.5) is as far from 1 as "double risk" (RR = 2.0). Note that the position argument controls where the axis appears (here I chose "top"; the default is "bottom").
p +
  scale_y_log10(breaks = c(0.5, 1, 2), position = "top", limits = c(0.5, 2)) +
  guides(col = guide_legend(reverse = TRUE))
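The symmetry of the log scale can be checked with a little arithmetic. The sketch below compares the distance of 0.5 and 2 from the reference value 1 on the raw and on the log10 scale; the variable name `rr` is only for illustration.

```r
rr <- c(0.5, 1, 2)

# Distance from RR = 1 on the raw scale: 0.5, 0, 1 (not symmetric).
abs(rr - 1)

# Distance from RR = 1 on the log10 scale: about 0.301, 0, 0.301,
# so halving and doubling the risk are equally far from the reference.
abs(log10(rr) - log10(1))
```

This is why ratio measures such as risk ratios are usually easier to read on a log axis.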
To make the lines vertical, just take coord_flip() out of p. In this case strip.text.y is not needed, since we no longer have to rotate or adjust the panel (condition) labels. The strip.position argument in facet_wrap is also changed to "top", and the y-axis ticks and text are no longer set to blank, as shown in the following code.
p <- ggplot(data = RR_data,
            aes(x = Group, y = RiskRatio, ymin = LowerLimit, ymax = UpperLimit)) +
  geom_pointrange(aes(col = Group)) +
  # dashed reference line at RR = 1 (it needs no aesthetic mapping)
  geom_hline(yintercept = 1, linetype = 2) +
  xlab("Group") +
  ylab("Risk Ratio (95% Confidence Interval)") +
  geom_errorbar(aes(ymin = LowerLimit, ymax = UpperLimit, col = Group),
                width = 0.2, cex = 1) +
  # panels side by side, with the condition labels on top
  facet_wrap(~Condition, strip.position = "top", nrow = 1, scales = "free_x") +
  theme(plot.title = element_text(size = 16, face = "bold"),
        axis.text.x = element_text(face = "bold"),
        axis.title = element_text(size = 12, face = "bold")) +
  scale_y_log10(breaks = c(0.5, 1, 2))
p
Conclusion
I have shown how to make lattice-like forest plots in R using ggplot2. The approach can be extended to other estimates/measures and their confidence intervals. Note that you can tweak the graphs by adjusting the arguments of the functions.