Box plot for Excel 2007
Oct 15th, 2007 by Jesper
Keywords: Boxplot, box plot, stem and leaf plots, Excel 2007, how to make
Version: Excel 2007
Since the previous entries, I have received quite a few questions about box plots in Excel 2007, so I decided to describe one way to create decent-looking box plots in Excel 2007. In my example I start with a data set containing six samples with ten replicates each, and from this I want to create a box plot showing the extremes, the median and the quartiles.

I create five new rows (12-16) for max, 3rd quartile, median, 1st quartile and min, and then calculate the statistics in cells B12:B16:
=MAX(B2:B10)
=PERCENTILE(B2:B10,0.75)
=MEDIAN(B2:B10)
=PERCENTILE(B2:B10,0.25)
=MIN(B2:B10)
Then copy to cells C12:G16.
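
For readers who want to check the worksheet results outside Excel, here is a minimal Python sketch of the same five statistics. It assumes Excel's PERCENTILE interpolates linearly between the closest ranks; the function names and the sample data are made up for illustration and are not part of the workbook.

```python
import math


def excel_percentile(values, k):
    """Approximate Excel's PERCENTILE(range, k) by linear interpolation."""
    if not values:
        raise ValueError("values must not be empty")
    if not 0 <= k <= 1:
        raise ValueError("k must be between 0 and 1")
    data = sorted(values)
    # Position of the requested percentile on a 0-based rank scale.
    pos = k * (len(data) - 1)
    lower = math.floor(pos)
    upper = math.ceil(pos)
    # Interpolate between the two neighbouring data points.
    return data[lower] + (data[upper] - data[lower]) * (pos - lower)


def five_number_summary(values):
    """Return the statistics in the same order as rows 12-16."""
    return {
        "max": max(values),
        "q3": excel_percentile(values, 0.75),
        "median": excel_percentile(values, 0.5),
        "q1": excel_percentile(values, 0.25),
        "min": min(values),
    }


# Hypothetical sample with ten replicates.
sample = [2, 4, 4, 5, 7, 8, 9, 10, 12, 15]
summary = five_number_summary(sample)
print(summary)
```

Comparing this output with the values in one column of the worksheet is a quick way to spot a range typo in the formulas.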

Since we will “trick” Excel into drawing a box plot with a stacked column chart, we have to modify our data slightly. The first segment of the stacked column will be invisible and ends where the 2nd quartile begins, i.e. at the 1st quartile ( =PERCENTILE(B2:B10,0.25) ). The next segment is the 2nd quartile (median - 1st quartile, or B14-B15). The third segment is the 3rd quartile (3rd quartile - median, or B13-B14). The lengths of the whiskers representing the min and max values are calculated as 1st quartile - min, or B15-B16, and max - 3rd quartile, or B12-B13.
These values are calculated in a new range, see image below.
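
The arithmetic behind the helper range can be sketched in Python as follows. This is only an illustration of the subtractions described above; the function name, the dictionary keys and the example statistics are hypothetical.

```python
def box_plot_segments(summary):
    """Turn the five statistics into stacked-column segments and whisker lengths."""
    return {
        # Invisible base of the column: reaches up to the 1st quartile.
        "hidden": summary["q1"],
        # Lower half of the box: median - 1st quartile (B14-B15).
        "lower_box": summary["median"] - summary["q1"],
        # Upper half of the box: 3rd quartile - median (B13-B14).
        "upper_box": summary["q3"] - summary["median"],
        # Whisker lengths used later as custom error amounts.
        "whisker_down": summary["q1"] - summary["min"],
        "whisker_up": summary["max"] - summary["q3"],
    }


# Hypothetical statistics for one sample.
example = {"max": 15, "q3": 9.75, "median": 7.5, "q1": 4.25, "min": 2}
segments = box_plot_segments(example)
print(segments)
```

A useful sanity check: the three stacked segments should add up to the 3rd quartile, and adding the upper whisker on top should give the max.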

Now I’m ready to insert the chart. I select the range B19:G21 (see image below) and choose a 2-D stacked column from Insert –> Column (in the Charts group).

Next we add the whiskers. Select the second segment, click Chart Tools –> Layout –> Error Bars –> More Error Bars Options, pick Display Direction: Minus, set the Error Amount to Custom and click the Specify Value button. Leave the Positive Error Value as is and, for the Negative Error Value, select the range containing the min whisker lengths (1st quartile - min).
Repeat for the max value whiskers, this time with the Plus direction and the range containing the max whisker lengths (max - 3rd quartile). The chart should now look like the one in the image below.

To make the chart a bit neater, right-click the lowest segment series (the green series in the image), select Format Data Series and set the fill to No fill to make it invisible. Format the rest of the chart to your liking. Done!

Good luck, and enjoy your new Box plots.
Dear Sir,
I have been trying to find a way to draw a continuous graph, for programs, on a continuous timeline.
For example, the TV channel programming starts at 6 am and goes on until 12 midnight. I would like to plot two variables on a bar chart: the x coordinate of each bar should represent the duration of the program and the y coordinate should represent the viewership.
My question is: would this be possible in Excel?
Are there any solutions for this?
I would be much obliged if you could help me.
thanks and regards,
Praveen
Hi
Yes, it is possible; you use the same basic technique as described here. Choose a horizontal bar chart and trick Excel by hiding part of the bar with a dummy series to achieve the effect you are looking for.
If you need more hands on help please contact me via the Contact page.
great…i make it…
:)))
Hi Jesper, great stuff, however, the error bars are not the max and min values in a box plot, they are 1.5 times the inter-quartile range. With your approach there is no room for outliers.
Have a great day
Cute trick…and very useful. What about the instance where there are negative values? Your method of tricking Excel can be “reversed” where all the values are negative, but the method breaks if only some of the values are negative.
It gets tricky because what range you hide (or not) depends on where (or if) any of the ranges straddle the zero point.
A more robust solution would be to include some boolean logic in the formulas. I’ve been playing around with it and am stymied because the changing relative location of the zero point means:
1) the order of the stack changes
2) the series which should be made invisible changes. It will be either 1stQ, 3rdQ or none.
3) if none, then either the 1st and 2nd, or the 2nd and 3rd series need to be rendered in the same color.
The cute trick starts to become an inelegant mess. Macros seem much easier at this point.
Ahh, what going to lunch will do. I figured out a better way to handle negative values.
In the “trick” range, add an offset constant equal to or greater than the magnitude of the lowest negative value to all of the numbers. Proceed as in the original instructions.
Then add another range of data that has two points, the offset constant and the highest value in the original range. Change the axis of that range to “secondary” and set the marker and line style to “none” to make it invisible.
Finally, change the tick labels of the primary axis to “none.”
Now you have a good looking chart with an appropriate scale. You may have to tweak the scale range on the secondary axis to make sure it lines up properly.
Art, that’s a good solution. I will definitely write up a short instruction about it if you don’t object. Thanks.
Dear Sirs,
Many thanks for your responses, but my question is still unanswered.
Is it possible to plot columns of different thicknesses, with the widths signifying the durations of programs and the heights denoting the program ratings?
For example, if a program has a duration of 2 hours and 1 TRP, the column is 2 units wide and 1 unit tall; if another program has a duration of half an hour and 5 TRPs, the column is half a unit wide and 5 units tall.
I would appreciate any solution for this.
thanks , Praveen