Plotting Error Bars in R

24 August 2009 jhj1 53 Comments

One common frustration that I have heard expressed about R is that there is no automatic way to plot error bars (whiskers really) on bar plots. I just encountered this issue revising a paper for submission and figured I'd share my code. The following simple function will plot reasonable error bars on a bar plot.

[r]
# Draw vertical error bars at positions x, from y - lower to y + upper.
error.bar <- function(x, y, upper, lower = upper, length = 0.1, ...) {
  if (length(x) != length(y) || length(y) != length(lower) || length(lower) != length(upper))
    stop("vectors must be same length")
  # angle = 90 with code = 3 draws flat caps at both ends of each bar
  arrows(x, y + upper, x, y - lower, angle = 90, code = 3, length = length, ...)
}
[/r]

Now let's use it. First, I'll create 5 means, each computed from 100 draws of a Gaussian random variable with unit mean and variance. I want to point out another mild annoyance with the way that R handles bar plots, and how to fix it. By default, barplot() suppresses the X-axis. Not sure why. If you want the axis to show up with the same line style as the Y-axis, include the argument axis.lty=1, as below. By creating an object to hold your bar plot, you capture the midpoints of the bars along the abscissa that can later be used to plot the error bars.

[r]
y <- rnorm(500, mean = 1)
y <- matrix(y, 100, 5)          # 100 draws for each of 5 replicates
y.means <- apply(y, 2, mean)
y.sd <- apply(y, 2, sd)
barx <- barplot(y.means, names.arg = 1:5, ylim = c(0, 1.5), col = "blue", axis.lty = 1,
                xlab = "Replicates", ylab = "Value (arbitrary units)")
error.bar(barx, y.means, 1.96 * y.sd / 10)   # 95% CI half-width: 1.96 * sd / sqrt(100)
[/r]

Now let's say we want to create the very common plot in reporting the results of scientific experiments: adjacent bars representing the treatment and the control with 95% confidence intervals on the estimates of the means. The trick here is to create a 2 x n matrix of your bar values, where each row holds the values to be compared (e.g., treatment vs. control, male vs. female, etc.). Let's look at our same Gaussian means but now compare them to a Gaussian r.v. with mean 1.1 and unit variance.

[r]
y1 <- rnorm(500, mean = 1.1)
y1 <- matrix(y1, 100, 5)
y1.means <- apply(y1, 2, mean)
y1.sd <- apply(y1, 2, sd)
yy <- matrix(c(y.means, y1.means), 2, 5, byrow = TRUE)
ee <- matrix(c(y.sd, y1.sd), 2, 5, byrow = TRUE) * 1.96 / 10
barx <- barplot(yy, beside = TRUE, col = c("blue", "magenta"), ylim = c(0, 1.5),
                names.arg = 1:5, axis.lty = 1,
                xlab = "Replicates", ylab = "Value (arbitrary units)")
error.bar(barx, yy, ee)
[/r]

Clearly, a sample size of 100 is too small to show that the means are significantly different. The effect size is very small for the variability in these r.v.'s. Try 10000.

[r]
y <- rnorm(50000, mean = 1)
y <- matrix(y, 10000, 5)
y.means <- apply(y, 2, mean)
y.sd <- apply(y, 2, sd)
y1 <- rnorm(50000, mean = 1.1)
y1 <- matrix(y1, 10000, 5)
y1.means <- apply(y1, 2, mean)
y1.sd <- apply(y1, 2, sd)
yy <- matrix(c(y.means, y1.means), 2, 5, byrow = TRUE)
ee <- matrix(c(y.sd, y1.sd), 2, 5, byrow = TRUE) * 1.96 / sqrt(10000)
barx <- barplot(yy, beside = TRUE, col = c("blue", "magenta"), ylim = c(0, 1.5),
                names.arg = 1:5, axis.lty = 1,
                xlab = "Replicates", ylab = "Value (arbitrary units)")
error.bar(barx, yy, ee)
[/r]

That works. Maybe I'll show some code for doing power calculations next time...
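
As a rough check on the sample-size point above, base R's power.t.test() can estimate how likely a two-sample t-test is to detect the difference. This is only a sketch: it assumes equal group sizes, a true difference of 0.1, and unit standard deviation, which mirrors the simulation but is not a worked analysis from this post.

```r
# Power of a two-sample t-test for the simulated effect (assumed: delta = 0.1, sd = 1).
pw.small <- power.t.test(n = 100,   delta = 0.1, sd = 1, sig.level = 0.05)
pw.large <- power.t.test(n = 10000, delta = 0.1, sd = 1, sig.level = 0.05)

# Per-group sample size needed for 80% power under the same assumptions.
n.needed <- power.t.test(delta = 0.1, sd = 1, power = 0.8, sig.level = 0.05)$n

pw.small$power   # low: n = 100 rarely separates the two means
pw.large$power   # close to 1
ceiling(n.needed)
```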

53 thoughts on “Plotting Error Bars in R”

jhj1 says:

24 August 2009 at 21:59

Clearly, I need to do something about these annoying line numbers in the text boxes and, for that matter, the text boxes themselves! Will work on that...
Anton says:

11 September 2009 at 08:37

Thank you! Going right into my scripts collection.
N. says:

11 March 2010 at 09:42

Why do you divide 1.96*Standard Deviation by ten for your error bar?

Wouldn't 1.96* SD be the 95% confidence interval?
jhj1 says:

19 March 2010 at 20:57

standard error = standard deviation/sqrt(N). The simulated data had N=100, therefore the 95% CI was approximately 1.96*sd/10.
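
To avoid hard-coding the 10, the same calculation can be written in terms of the sample size. A minimal sketch, where ci.halfwidth is a hypothetical helper name that does not appear in the post:

```r
# Half-width of a normal-approximation CI for the mean of each column of m.
# ci.halfwidth is a hypothetical helper, not part of the original code.
ci.halfwidth <- function(m, conf = 0.95) {
  z <- qnorm(1 - (1 - conf) / 2)   # about 1.96 when conf = 0.95
  n <- nrow(m)                     # observations per column
  z * apply(m, 2, sd) / sqrt(n)    # z times the standard error
}

set.seed(1)                        # arbitrary seed, only for reproducibility
y <- matrix(rnorm(500, mean = 1), 100, 5)
hw <- ci.halfwidth(y)              # one half-width per replicate
```

With N=100 this agrees with 1.96*sd/10 up to rounding of the 1.96.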
Vitória Piai says:

8 August 2010 at 07:37

That was really helpful, thank you!

I'm having a problem with the ylim, though. If I set it to be
ylim = c(450, 950)
then I get the axis correctly, but I still get the bars appearing as if they were starting at zero.
I don't know whether this is a limitation of ylim in barplot in itself or whether I'm doing something wrong.
jhj1 says:

8 August 2010 at 13:42

Hard to know without the specifics of the problem, but it sounds like you didn't give the means as the y argument of the function.
Guekoman says:

16 August 2010 at 06:07

How can I get the same CI error bars when my samples have different sizes?
jhj1 says:

17 August 2010 at 21:22

The bit of code that calculates the CIs is: ee <- matrix(c(y.sd,y1.sd),2,5,byrow=TRUE)*1.96/10. All you are doing is making a 2x5 matrix of the CI half-widths here by first constructing a matrix of standard deviations and then multiplying by z/sqrt(n). One very simple (if inelegant) way to plot error bars for two samples of different sizes is to move the z/sqrt(n) inside the matrix() argument. Say that the size of one sample was 100 and the other was 144, then you could simply do this: ee <- matrix(c(y.sd*1.96/10,y1.sd*1.96/12),2,5,byrow=TRUE).
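
A slightly more general sketch of the same idea computes each half-width from the length of its own sample, so nothing needs to be hard-coded. The data and the helper name half.ci are made up for illustration:

```r
set.seed(2)                                   # arbitrary seed
# Two replicates per condition; the conditions have different sample sizes.
control   <- list(rnorm(100, mean = 1),   rnorm(100, mean = 1))
treatment <- list(rnorm(144, mean = 1.1), rnorm(144, mean = 1.1))

# half.ci is a hypothetical helper: z * sd / sqrt(n) for one sample.
half.ci <- function(v) qnorm(0.975) * sd(v) / sqrt(length(v))

# One row per condition, one column per replicate, as in the post.
yy <- rbind(sapply(control, mean),    sapply(treatment, mean))
ee <- rbind(sapply(control, half.ci), sapply(treatment, half.ci))
```

yy and ee have matching dimensions, so they can be handed to barplot(..., beside=TRUE) and error.bar() as in the post.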
Cano says:

25 August 2010 at 09:19

I have tried this but i get this message all the time
"Error: object 'error.bar' not found"
jhj1 says:

1 September 2010 at 21:14

You need to source the code that I provide -- the function called "error.bar". Either copy the function (minus the annoying line numbers that the blog software adds) and paste it at the command line, save it to a file (e.g., error.bar.R) and source it, i.e., source("error.bar.R") (which assumes you saved the file in your working directory; otherwise, include the path), or use the capabilities of an R-aware text editor (e.g., ess mode in Emacs or Tinn-R). Once you have sourced it, the code should work.
sp says:

7 September 2010 at 09:30

How can one plot error bars when using "plot" with type="b"?
jhj1 says:

8 September 2010 at 07:21

@sp: I don't know, but it would be similar to what I did with barplot() -- shouldn't be too hard.
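
For what it's worth, one way this might look: since plot() takes the x positions directly, the same arrows() call that error.bar() makes can reuse them. A sketch with simulated data, not tested against sp's actual problem:

```r
set.seed(3)                                   # arbitrary seed
x <- 1:5
y <- matrix(rnorm(500, mean = 1), 100, 5)
y.means <- apply(y, 2, mean)
y.ci <- 1.96 * apply(y, 2, sd) / sqrt(nrow(y))

# Leave room on the y-axis for the full extent of the bars.
plot(x, y.means, type = "b", ylim = range(y.means - y.ci, y.means + y.ci),
     xlab = "Replicates", ylab = "Value (arbitrary units)")
# Same call that error.bar() makes, using the x values passed to plot().
arrows(x, y.means + y.ci, x, y.means - y.ci, angle = 90, code = 3, length = 0.1)
```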
Andrew says:

7 December 2010 at 08:36

Hi,

Thanks for demonstrating this; you are right, it is a major problem with R, particularly for new users like myself.

As this is going straight into a piece of work that I am currently doing, how do you prefer to be referenced?
jhj1 says:

9 December 2010 at 19:51

Not really a citation. Put me in the acknowledgements.
Ron Li says:

21 December 2010 at 13:30

Hello!

I tried your script but got an error message:

Error in error.bar(barx, yy, ee) : vectors must be same length

Thanks,

Ron
jhj1 says:

21 December 2010 at 19:43

Well, your three vectors do need to be the same length! barx, yy, and ee should all be of length k...
Merry Christmas says:

27 December 2010 at 05:48

Hello,

Thanks a lot for your function.
I just have a problem though : I have already SE and means in my data, and when entering these arguments in the error.bar function
e.g. error.bar(graph, y_value, y_SE)
it says "unexpected input in "error.bar(barplot(etc..))"".
Do you have any idea why ?

Many thanks in advance,

Cheers
jhj1 says:

27 December 2010 at 21:38

It's hard to know without seeing the exact inputs/outputs. I usually get the "unexpected input" error when I forget a comma between arguments, so I would check on that. At a minimum, you need three arguments, all separated by commas: error.bar(x,y,upper) [where 'upper' is the bit you want to add/subtract from your vector of means, y]. This assumes symmetric CIs (which you can change by adding 'lower' as a separate argument).
Merry Christmas says:

28 December 2010 at 01:06

Thank you very much, it works !
Now another challenge : I want to use the "sort" function to show the bars from the lowest to the highest y value. When doing it, the SD error bars are not sorted and don't match their y values.
How can I implement the sort function in the error.bar function?
Many thanks in advance for your help !
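
One possible approach, offered as a sketch rather than a change to error.bar() itself: compute the ordering once with order() and apply the same index to the means, the errors, and the labels, so they stay matched.

```r
set.seed(4)                                   # arbitrary seed
y <- matrix(rnorm(500, mean = 1), 100, 5)
y.means <- apply(y, 2, mean)
y.se <- apply(y, 2, sd) / sqrt(nrow(y))

ord <- order(y.means)                         # permutation, lowest mean first
barx <- barplot(y.means[ord], names.arg = (1:5)[ord], ylim = c(0, 1.5), axis.lty = 1)
# Index the errors with the same permutation as the means.
arrows(barx, y.means[ord] + y.se[ord], barx, y.means[ord] - y.se[ord],
       angle = 90, code = 3, length = 0.1)
```

Calling sort() on the means alone discards the permutation, which is why the error bars end up on the wrong bars.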
Russell W says:

26 January 2011 at 20:10

Wow. I even got this to work with a line plot I was doing. It's a little artificial: instead of handing off the object assigned to the plot (like what you did with barx <-), you can just hand the function a c() vector with the right number of points. Works like a charm. Thanks!
marion says:

9 March 2011 at 06:27

Thanks for this. Was looking for a way to plot barplots with errors before and totally happy to have found your explanation.
Anna says:

1 April 2011 at 03:27

Hello, thank you so much for this information, it's indeed very helpful! I have only one question: how do you obtain "z" for the CI?
Cheers, Anna
jhj1 says:

2 April 2011 at 21:54

Anna, not quite sure what you mean by "z" but I'll take a stab. I assume what you're asking is where the 1.96 came from. This comes directly from the normal approximation: $z = \Phi^{-1}(\Phi(z)) = \Phi^{-1}(0.975) \approx 1.96$, where $\Phi(z)$ is the cumulative distribution function of the standard normal distribution, and $\Phi^{-1}$ is its inverse. You know, it's that whole 95% of the observations in a normal distribution lie within $\pm$ 2 standard deviations of the mean? It's actually more like 1.96...
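
In R the inverse normal CDF is qnorm(), so the value can be checked directly. A small sketch:

```r
# z for a two-sided 95% interval: the 97.5th percentile of the standard normal.
z <- qnorm(0.975)

# Probability mass within +/- z of the mean; this recovers the 95%.
coverage <- pnorm(z) - pnorm(-z)

# For another confidence level, use qnorm(1 - (1 - conf) / 2).
z99 <- qnorm(1 - (1 - 0.99) / 2)
```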
Carmen says:

29 April 2011 at 01:47

Great! Very useful!
Thanks!
Arnau says:

9 May 2011 at 05:10

Many thanks for the code! very useful!!
Jonny says:

20 July 2011 at 08:34

Great code! Very easy to use.
Thanks for posting this!
Mick says:

29 July 2011 at 16:59

Thanks for this nice little code!

It works great; I also added the option to specify sd, se or CI for the error bars.

One thing I'm having trouble with is specifying ylim with a minimum other than zero. Any ideas?

Cheers,
Mick says:

30 July 2011 at 10:36

Oops, posted the question too quickly. 'xpd=FALSE' fixed my problem with the bars extending below the axis.

Cheers!
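
For anyone else hitting the ylim issue, here is a sketch of what Mick describes. The numbers are made up; xpd is a documented barplot() argument that defaults to TRUE, which lets bars be drawn outside the plot region.

```r
# Made-up means and standard errors on a scale far from zero.
y.means <- c(520, 610, 705, 840)
y.se <- c(25, 30, 20, 35)

# xpd = FALSE clips the bars to the plot region, so they start at the lower ylim.
barx <- barplot(y.means, ylim = c(450, 950), xpd = FALSE,
                names.arg = 1:4, axis.lty = 1)
arrows(barx, y.means + y.se, barx, y.means - y.se, angle = 90, code = 3, length = 0.1)
```

Bear in mind that bars which do not start at zero can exaggerate differences between groups.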
Fred says:

21 August 2011 at 09:34

Thank you so much. This was invaluable to me.
Greg Benison says:

6 September 2011 at 05:24

Just finished writing a postscript library to produce bar plots with error bars when I came across your R solution. Thanks for posting this! It's surprising how non-trivial it seems to find tools to do this...
sarah says:

14 September 2011 at 15:36

thanks dear!!!
Maria says:

26 September 2011 at 19:55

Thank you so much for posting this code, I lost so much time trying to get those error bars!
Pancho says:

13 October 2011 at 01:27

Yes, this is great! I have used it several times in reports. Thanks again. But do I need to cite you? And how?
Kevin says:

24 January 2012 at 17:28

plotCI() in the plotrix package
Chris says:

26 February 2012 at 10:14

Thanks for writing this, way better than the way I was doing it previously with lines()!

Just a couple recommendations:

1) I added a lwd argument to the function so users can specify the line width.
2) I changed the standard error calculations to sqrt(length(COLUMN)) to get around the different sample size problem.

Best,

Chris
jhj1 says:

3 March 2012 at 13:38

no need to cite, thanks.
Sabine says:

11 May 2012 at 06:54

Thank you very much for sharing the code! Unfortunately, I couldn't make the code work for a boxplot with "beside=F". Do you have any idea?
Best,
Sabine
Matteo says:

17 July 2012 at 11:00

Thanks for your help!
I would like to modify the function in order to have a smaller arrow. How do I do it?
ciao
Matteo
jhj1 says:

17 July 2012 at 15:38

Sorry, not much to go on here. I hope it worked out for you...
jhj1 says:

17 July 2012 at 15:41

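Look at the help file for arrows(). The "length", "angle", and "code" arguments allow you to change the style of the arrow head. (There is no "width" argument; line thickness is set with "lwd".) For a smaller cap, pass a smaller length to error.bar(), e.g., length=0.05.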
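
A short sketch of both adjustments, using the error.bar() function from the post. Because the function forwards "..." to arrows(), graphical parameters such as lwd can be passed without modifying it. The data values are arbitrary:

```r
# error.bar() as defined in the post, repeated so this snippet runs on its own.
error.bar <- function(x, y, upper, lower = upper, length = 0.1, ...) {
  if (length(x) != length(y) || length(y) != length(lower) || length(lower) != length(upper))
    stop("vectors must be same length")
  arrows(x, y + upper, x, y - lower, angle = 90, code = 3, length = length, ...)
}

y.means <- c(0.9, 1.0, 1.1, 1.0, 0.95)        # arbitrary illustrative values
y.ci <- rep(0.2, 5)
barx <- barplot(y.means, ylim = c(0, 1.5), axis.lty = 1)
# length = 0.05 gives narrower caps; lwd = 2 is passed through to arrows().
error.bar(barx, y.means, y.ci, length = 0.05, lwd = 2)
```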
Steve says:

28 August 2012 at 20:48

Great function--this helped me out greatly! Thanks so much for posting. It is often these short functions that can save tons of time...
Raj says:

30 August 2012 at 03:56

Thank you
Jakob says:

7 September 2012 at 02:25

Thanks!! Will use this all the time.
Justin says:

16 September 2012 at 12:20

Just wanted to add my thanks for posting this. It's exactly what I was looking for
George says:

21 October 2012 at 20:24

Very elegant way to do it! As shown in the examples, it works well for bar plots. For point plots I want to rub out the middle of the arrow so that it doesn't overlap my plotting symbol. To do this I plotted the error bars first and then plotted my points with pch = 21 and bg = "white". Does anyone have a more elegant or more general method?
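
George's approach might look something like the following sketch (values are made up): set up an empty plot, draw the bars, then overplot filled points so the symbols hide the middle of each bar.

```r
x <- 1:5
y.means <- c(0.9, 1.0, 1.1, 1.0, 0.95)        # arbitrary illustrative values
y.ci <- rep(0.15, 5)

# type = "n" sets up the axes without drawing anything.
plot(x, y.means, type = "n", ylim = range(y.means - y.ci, y.means + y.ci),
     xlab = "Replicates", ylab = "Value (arbitrary units)")
arrows(x, y.means + y.ci, x, y.means - y.ci, angle = 90, code = 3, length = 0.1)
# pch = 21 is a fillable circle; the white background covers the bar behind it.
points(x, y.means, pch = 21, bg = "white")
```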
jhj1 says:

21 October 2012 at 20:48

That's probably what I would do!
Andrew Park says:

30 November 2012 at 07:11

I used this code to produce error bars in a "side by side" bar plot (like the second graph in your example). While the code works and produced error bars, they are mysteriously divorced from the values in the bars; some of them float above the bars and others are completely embedded somewhere within the bar. Any thoughts?
jhj1 says:

4 December 2012 at 19:00

Hard to know without more specifics. My guess is that your vectors don't match up so that you're getting weird effects of recycling. This is a natural way that error bars might get offset from the means.
mwj says:

3 February 2013 at 08:13

Hi, great post, thanks heaps! I'm wondering how I need to adapt the code to help me calculate the appropriate confidence intervals when I have a poisson response variable and factor predictor variable...If you have some quick hints, they are greatly appreciated!
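
One possible starting point, offered as a sketch and not as an answer from the post's author: base R's poisson.test() returns an exact confidence interval for a Poisson rate, and because that interval is asymmetric it maps onto separate upper and lower limits. The data and group names below are simulated, and this ignores any modelling a glm() would do.

```r
set.seed(5)                                   # arbitrary seed
group <- factor(rep(c("a", "b", "c"), each = 30))
counts <- rpois(90, lambda = c(2, 4, 6)[as.integer(group)])

totals <- as.vector(tapply(counts, group, sum))   # total events per group
n <- as.vector(table(group))                      # observations per group
means <- totals / n

# Exact 95% CI for each group's mean count; ci is 2 x 3 (row 1 lower, row 2 upper).
ci <- sapply(seq_along(totals),
             function(i) poisson.test(totals[i], T = n[i])$conf.int)

barx <- barplot(means, names.arg = levels(group), ylim = c(0, max(ci[2, ]) * 1.1),
                axis.lty = 1)
# Draw from the upper limit to the lower limit; the bars need not be symmetric.
arrows(barx, ci[2, ], barx, ci[1, ], angle = 90, code = 3, length = 0.1)
```

With error.bar() from the post, the equivalent call would pass upper = ci[2, ] - means and lower = means - ci[1, ].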
clear says:

18 March 2013 at 07:13

Awesome post.

monkey's uncle

notes on human ecology, population, and infectious disease