PRACTICAL BUSINESS ANALYSIS

The author of the article “How to Display Data Badly” presented the 12 most powerful techniques/rules to display data badly.  Once you are done reading the article, please search on the internet to find an example that uses at least one of the 12 rules to show data badly.  Post your example, comment on which rule is used, and offer a solution to re-display the data.

How to Display Data Badly

Author(s): Howard Wainer

Source: The American Statistician, May, 1984, Vol. 38, No. 2 (May, 1984), pp. 137-147

Published by: Taylor & Francis, Ltd. on behalf of the American Statistical Association

Stable URL: https://www.jstor.org/stable/2683253

Commentaries are informative essays dealing with viewpoints of statistical practice, statistical education, and other topics considered to be of general interest to the broad readership of The American Statistician. Commentaries are similar in spirit to Letters to the Editor, but they involve longer discussions of background, issues, and perspectives. All commentaries will be refereed for their merit and compatibility with these criteria.

HOWARD WAINER*

Methods for displaying data badly have been developing for many years, and a wide variety of interesting and inventive schemes have emerged. Presented here is a synthesis yielding the 12 most powerful techniques that seem to underlie many of the realizations found in practice. These 12 (the dirty dozen) are identified and illustrated.

KEY WORDS: Graphics; Data display; Data density;
Data-ink ratio.

1. INTRODUCTION

The display of data is a topic of substantial contemporary interest and one that has occupied the thoughts of many scholars for almost 200 years. During this time there have been a number of attempts to codify standards of good practice (e.g., ASME Standards 1915; Cox 1978; Ehrenberg 1977) as well as a number of books that have illustrated them (i.e., Bertin 1973, 1977, 1981; Schmid 1954; Schmid and Schmid 1979; Tufte 1983). The last decade or so has seen a tremendous increase in the development of new display techniques and tools that have been reviewed recently (Macdonald-Ross 1977; Fienberg 1979; Cox 1978; Wainer and Thissen 1981). We wish to concentrate on methods of data display that leave the viewers as uninformed as they were before seeing the display or, worse, those that induce confusion. Although such techniques are broadly practiced, to my knowledge they have not as yet been gathered into a single source or carefully categorized. This article is the beginning of such a compendium.

The aim of good data graphics is to display data accurately and clearly. Let us use this definition as a starting point for categorizing methods of bad data display. The definition has three parts. These are (a) showing data, (b) showing data accurately, and (c) showing data clearly. Thus, if we wish to display data badly, we have three avenues to follow. Let us examine them in sequence, parse them into some of their component parts, and see if we can identify means for measuring the success of each strategy.

2. SHOWING DATA

Obviously, if the aim of a good display is to convey information, the less information carried in the display, the worse it is. Tufte (1983) has devised a scheme for measuring the amount of information in displays, called the data density index (ddi), which is “the number of numbers plotted per square inch.” This easily calculated index is often surprisingly informative. In popular and technical media we have found a range from .1 to 362. This provides us with the first rule of bad data display.

[Figure 1 graphic not reproduced: “Change in Science Achievement of 9-, 13-, and 17-Year-Olds, by Type of Exercise: 1969-1977”; three panels (9-, 13-, and 17-year-olds) showing change in percent correct for biological science and physical science.]

Figure 1. An example of a low density graph (from SI3 [ddi = .3]).

[Figure 2 graphic not reproduced; horizontal axis: location difference.]

Figure 2. A low density graph (from Friedman and Rafsky 1981 [ddi = .5]).

*Howard Wainer is Senior Research Scientist, Educational Testing Service, Princeton, NJ 08541. This is the text of an invited address to the American Statistical Association. It was supported in part by the Program Statistics Research Project of the Educational Testing Service. The author would like to express his gratitude to the numerous friends and colleagues who read or heard this article and offered valuable suggestions for its improvement. Especially helpful were David Andrews, Paul Holland, Bruce Kaplan, James O. Ramsay, Edward Tufte, the participants in the Stanford Workshop on Advanced Graphical Presentation, two anonymous referees, the long-suffering associate editor, and Gary Koch.

Rule 1-Show as Few Data as Possible (Minimize the Data Density)

What does a data graphic with a ddi of .3 look like? Shown in Figure 1 is a graphic from the book Social Indicators III (SI3), originally done in four colors (original size 7″ by 9″) that contains 18 numbers (18/63 = .3). The median data graph in SI3 has a data density of .6 numbers/in²; this one is not an unusual choice. Shown in Figure 2 is a plot from the article by Friedman and Rafsky (1981) with a ddi of .5 (it shows 4 numbers in 8 in²). This is unusual for JASA, where the median data graph has a ddi of 27. In defense of the producers of this plot, the point of the graph is to show that a method of analysis suggested by a critic of their paper was not fruitful. I suspect that prose would have worked pretty well also.

[Figure 3 graphic not reproduced: U.S. vs. Japan labor chart; legible values are 44%, 62.3%, and 70%, against a 100% reference.]

Figure 3. A low density graph (© 1978, The Washington Post) with chartjunk to fill in the space (ddi = .2).

[Figure 4 graphic not reproduced: “Public and Private Elementary Schools, Selected Years 1929-1970”; series: Public, Private; vertical axis: thousands of schools; horizontal axis: school year, 1929-30 to 1969-70.]

Figure 4. Hiding the data in the scale (from SI3).
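The data density arithmetic quoted under Rule 1 can be checked with a few lines of Python. This is only a minimal sketch: the function name `data_density_index` is ours, not Tufte's or the author's, and the inputs are the counts and areas stated in the text.

```python
def data_density_index(numbers_plotted: int, area_sq_in: float) -> float:
    """Numbers plotted per square inch of graphic (Tufte's ddi as quoted in the text)."""
    if area_sq_in <= 0:
        raise ValueError("graphic area must be positive")
    return numbers_plotted / area_sq_in


# Figure 1: 18 numbers on a 7 in by 9 in graphic (63 square inches).
fig1_ddi = data_density_index(18, 7 * 9)

# Figure 2: 4 numbers in 8 square inches.
fig2_ddi = data_density_index(4, 8)

print(round(fig1_ddi, 1), round(fig2_ddi, 1))
```

Both results round to the values reported in the figure captions (.3 and .5).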

Although arguments can be made that high data density does not imply that a graphic will be good, nor one with low density bad, it does reflect on the efficiency of the transmission of information. Obviously, if we hold clarity and accuracy constant, more information is better than less. One of the great assets of graphical techniques is that they can convey large amounts of information in a small space.

[Figure 5 graphic not reproduced: “The Number of Private Elementary Schools from 1930-1970”; vertical axis from 9 to 15; horizontal axis 1930 to 1970. Values printed on the plot:]

Year    Value
1930     9.275
1940    10.000
1950    10.375
1960    13.574
1970    14.372

Figure 5. Expanding the scale and showing the data in Figure 4 (from SI3).

[Figure 6 graphic not reproduced: “A New Set of Projections for the U.S. Supply of Energy”; bars comparing two projections for the year 2000 with the actual 1977 supply, in quads.]

Figure 6. Ignoring the visual metaphor (© 1978, The New York Times).

We note that when a graph contains little or no information the plot can look quite empty (Figure 2) and thus raise suspicions in the viewer that there is nothing to be communicated. A way to avoid these suspicions is to fill up the plot with nondata figurations, what Tufte has termed “chartjunk.” Figure 3 shows a plot of the labor productivity of Japan relative to that of the United States. It contains one number for each of three years. Obviously, a graph of such sparse information would have a lot of blank space, so filling the space hides the paucity of information from the reader.

A convenient measure of the extent to which this practice is in use is Tufte’s “data-ink ratio.” This measure is the ratio of the amount of ink used in graphing the data to the total amount of ink in the graph. The closer to zero this ratio gets, the worse the graph. The notion of the data-ink ratio brings us to the second principle of bad data display.

Rule 2-Hide What Data You Do Show (Minimize the Data-Ink Ratio)

One can hide data in a variety of ways. One method that occurs with some regularity is hiding the data in the grid. The grid is useful for plotting the points, but only rarely afterwards. Thus to display data badly, use a fine grid and plot the points dimly (see Tufte 1983, pp. 94-95 for one repeated version of this).

A second way to hide the data is in the scale. This corresponds to blowing up the scale (i.e., looking at the data from far away) so that any variation in the data is obscured by the magnitude of the scale. One can justify this practice by appealing to “honesty requires that we start the scale at zero,” or other sorts of sophistry.

Figure 4 is a plot (from SI3) that effectively hides the growth of private schools in the scale. A redrawing of the number of private schools on a different scale conveys the growth that took place during the mid-1950’s (Figure 5). The relationship between this rise and Brown vs. Topeka School Board becomes an immediate question.
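How much a scale hides can be roughed out by asking what fraction of the vertical axis the data actually occupy. The sketch below uses the five values printed on Figure 5; the axis limits are assumptions read from the surviving tick labels (a top tick of 300 on Figure 4, ticks from 9 to 15 on Figure 5), the units are assumed to be thousands of schools, and `axis_fraction_used` is a hypothetical helper, not a measure proposed in the article.

```python
# Values printed on Figure 5 (assumed to be thousands of private schools).
private_schools = {1930: 9.275, 1940: 10.000, 1950: 10.375, 1960: 13.574, 1970: 14.372}


def axis_fraction_used(values, axis_min, axis_max):
    """Share of the axis length spanned by the range of the data."""
    values = list(values)
    if axis_max <= axis_min:
        raise ValueError("axis_max must exceed axis_min")
    return (max(values) - min(values)) / (axis_max - axis_min)


# Assumed axis limits: 0 to 300 for the Figure 4 style, 9 to 15 for Figure 5.
wide_scale = axis_fraction_used(private_schools.values(), 0, 300)
tight_scale = axis_fraction_used(private_schools.values(), 9, 15)

print(f"wide scale uses {wide_scale:.1%} of the axis, tight scale uses {tight_scale:.1%}")
```

Under these assumed limits the same series fills under 2% of the wide axis and roughly 85% of the tight one, which is the effect the redrawing in Figure 5 exploits.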

To conclude this section, we have seen that we can display data badly either by not including them (Rule 1) or by hiding them (Rule 2). We can measure the extent to which we are successful in excluding the data through the data density; we can sometimes convince viewers that we have included the data through the incorporation of chartjunk. Hiding the data can be done either by using an overabundance of chartjunk or by cleverly choosing the scale so that the data disappear. A measure of the success we have achieved in hiding the data is through the data-ink ratio.

[Figure 7 graphic not reproduced: two panels, each in millions of U.S. dollars. Left panel: U.S. exports to China and U.S. imports from China, 1972 to 1980, ticks at 1,000, 2,000, and 3,000. Right panel: U.S. imports from Taiwan and U.S. exports to Taiwan, 1970 to 1980, ticks at 2,000, 4,000, and 6,000. Source: Department of Commerce.]

Figure 7. Reversing the metaphor in mid-graph while changing scales on both axes (© June 14, 1981, The New York Times).

3. SHOWING DATA ACCURATELY

The essence of a graphic display is that a set of numbers having both magnitudes and an order are represented by an appropriate visual metaphor: the magnitude and order of the metaphorical representation match the numbers. We can display data badly by ignoring or distorting this concept.

Rule 3-Ignore the Visual Metaphor Altogether

If the data are ordered and if the visual metaphor has a natural order, a bad display will surely emerge if you shuffle the relationship. In Figure 6 note that the bar labeled 14.1 is longer than the bar labeled 18. Another method is to change the meaning of the metaphor in the middle of the plot. In Figure 7 the dark shading represents imports on one side and exports on the other. This is but one of the problems of this graph; more serious still is the change of scale. There is also a difference in the time scale, but that is minor. A common theme in Playfair’s (1786) work was the difference between imports and exports. In Figure 8, a 200-year-old graph tells the story clearly. Two such plots would have illustrated the story surrounding this graph quite clearly.

[Figure 8 graphic not reproduced: Playfair's chart of exports and imports.]

Figure 8. A plot on the same topic done well two centuries earlier (from Playfair 1786).

Rule 4-Only Order Matters

One frequent trick is to use length as the visual metaphor when area is what is perceived. This was used quite effectively by The Washington Post in Figure 9. Note that this graph also has a low data density (.1), and its data-ink ratio is close to zero. We can also calculate Tufte’s (1983) measure of perceptual distortion (PD) for this graph. The PD in this instance is the perceived change in the value of the dollar from Eisenhower to Carter divided by the actual change. I read and measure thus:

$$\text{Actual} = \frac{1.00 - .44}{.44} = 1.27 \qquad \text{Measured} = \frac{22.00 - 2.06}{2.06} = 9.68$$

$$\text{PD} = \frac{9.68}{1.27} = 7.62$$

This distortion of over 700% is substantial but by no means a record.

[Figure 9 graphic not reproduced: the dollar shown at shrinking size under successive presidents; legible labels include 1958 - Eisenhower: $1.00, 1963 - Kennedy: 94¢, 1968 - Johnson, and 1978 - Carter: 44¢ (August). Source: Labor Department.]

Figure 9. An example of how to goose up the effect by squaring the eyeball (© 1978, The Washington Post).

[Figure 10 graphic not reproduced: line chart with points labeled by president (Eisenhower, Kennedy, and Johnson are legible); vertical axis from 0.0 to 1.0; horizontal axis: year, 1958 to 1978.]

Figure 10. The data in Figure 9 as an unadorned line chart (from Wainer, 1980).

A less distorted view of these data is provided in Figure 10. In addition, the spacing suggested by the presidential faces is made explicit on the time scale.
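The perceptual distortion figure quoted for Figure 9 can be reproduced as follows. This is a sketch under two stated assumptions: the helper names are ours, and the intermediate values are rounded to two decimals because the text does so (without that rounding the ratio comes out near 7.61 rather than 7.62).

```python
def relative_change(first: float, last: float) -> float:
    """Change from first to last, expressed relative to the last value, as in the text."""
    return (first - last) / last


def perceptual_distortion(actual_first, actual_last, measured_first, measured_last):
    """Perceived (measured) relative change divided by the actual relative change."""
    actual = round(relative_change(actual_first, actual_last), 2)        # 1.27 in the text
    measured = round(relative_change(measured_first, measured_last), 2)  # 9.68 in the text
    return measured / actual


# Dollar values from the text: $1.00 (Eisenhower) to $.44 (Carter).
# Measured sizes on the graphic, as read by the author: 22.00 and 2.06.
pd_figure9 = perceptual_distortion(1.00, 0.44, 22.00, 2.06)
print(round(pd_figure9, 2))
```

A PD of 1 would mean that the graphic's visual change matches the change in the data; the value here is more than seven times that.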

Rule 5-Graph Data Out of Context

Often we can modify the perception of the graph (particularly for time series data) by choosing carefully the interval displayed. A precipitous drop can disappear if we choose a starting date just after the drop. Similarly, we can turn slight meanders into sharp changes by focusing on a single meander and expanding the scale. Often the choice of scale is arbitrary but can have profound effects on the perception of the display. Figure 11 shows a famous example in which President Reagan gives an out-of-context view of the effects of his tax cut. The Times’ alternative provides the context for a deeper understanding. Simultaneously omitting the context as well as any quantitative scale is the key to the practice of Ordinal Graphics (see also Rule 4). Automatic rules do not always work, and wisdom is always required.

In Section 3 we discussed three rules for the accurate display of data. One can compromise accuracy by ignoring visual metaphors (Rule 3), by only paying attention to the order of the numbers and not their magnitude (Rule 4), or by showing data out of context (Rule 5). We advocated the use of Tufte’s measure of perceptual distortion as a way of measuring the extent to which the accuracy of the data has been compromised by the display. One can think of modifications that would allow it to be applied in other situations, but we leave such expansion to other accounts.

4. SHOWING DATA CLEARLY

In this section we discuss methods for badly displaying data that do not seem as serious as those described previously; that is, the data are displayed, and they might even be accurate in their portrayal. Yet subtle (and not so subtle) techniques can be used to effectively obscure the most meaningful or interesting aspects of the data. It is more difficult to provide objective measures of presentational clarity, but we rely on the reader to judge from the examples presented.

[Figure 11 graphic not reproduced: from The New York Times, Sunday, August 2, 1981. One chart, with a dollar scale from $500 to $2500 over 1982 to 1986, compares payments under the Ways and Means Committee plan with payments under the President's proposal. The other, headed “YOUR TAXES” (average family income $20,000), shows lines labeled “THEIR BILL” and “OUR BILL.”]

Figure 11. The White House showing neither scale nor context (© 1981, The New York Times, reprinted with permission).

Rule 6-Change Scales in Mid-Axis

This is a powerful technique that can make large differences look small and make exponential changes look linear.

Figure 12 is a graph that supports the associated story about the skyrocketing circulation of The New York Post compared to the plummeting Daily News circulation. The reason given is that New Yorkers “trust” the Post. It takes a careful look to note the 700,000 jump that the scale makes between the two lines.

Figure 13 is a plot of physicians’ incomes over time. It appears to be linear, with a slight tapering off in recent years. A careful look at the scale shows that it starts out plotting every eight years and ends up plotting yearly. A more regular scale (in Figure 14) tells quite a different story.
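The uneven time scale described for Figure 13 is easy to detect mechanically. The sketch below takes the year labels that survive on the figure's horizontal axis and lists the gaps between consecutive ticks; `tick_gaps` and `is_evenly_spaced` are hypothetical helper names chosen for illustration.

```python
# Year labels printed along the horizontal axis of Figure 13.
figure13_years = [1939, 1947, 1951, 1955, 1963, 1965, 1967,
                  1970, 1972, 1973, 1974, 1975, 1976]


def tick_gaps(ticks):
    """Differences between consecutive tick values."""
    return [later - earlier for earlier, later in zip(ticks, ticks[1:])]


def is_evenly_spaced(ticks):
    """True when every pair of neighbouring ticks is the same distance apart."""
    return len(set(tick_gaps(ticks))) <= 1


print(tick_gaps(figure13_years))
print(is_evenly_spaced(figure13_years))
```

If equally spaced positions on the page stand for gaps that shrink from eight years to one, as the text describes, early growth is compressed and recent growth is stretched, which is how an accelerating series can be made to look roughly linear.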

[Figure 12 graphic not reproduced: “The soaraway Post, the daily paper New Yorkers trust”; circulation chart with a line labeled NEWS and legible values 1,829,000, 1,636,000, 1,555,000, and 1,491,000 against ticks from 1,500,000 to 1,900,000.]

Figure 12. Changing scale in mid-axis to make large differences small.

[Figure 13 graphic not reproduced: “Incomes of Doctors Vs. Other Professionals (Median Net Incomes)”; series: office-based nonsalaried physicians, rising from $3,262 to 62,799. Horizontal axis labels: 1939, 1947, 1951, 1955, 1963, 1965, 1967, 1970, 1972, 1973, 1974, 1975, 1976. Source: Council on Wage and Price Stability.]

Figure 13. Changing scale in mid-axis to make exponential growth linear (© The Washington Post).

Rule 7-Emphasize the Trivial (Ignore the Important)

Sometimes the data that are to be displayed have one important aspect and others that are trivial. The graph can be made worse by emphasizing the trivial part. In Figure 15 we have a page from SI3 that compares the income levels of men and women by educational levels. It reveals the not surprising result that better educated individuals are paid better than more poorly educated ones and that changes across time expressed in constant dollars are reasonably constant. The comparison of greatest interest and current concern, comparing salaries between sexes within education level, must be made clumsily by vertically transposing from one graph to another. It seems clear that Rule 7 must have been operating here, for it would have been easy to place the graphs side by side and allow the comparison of interest to be made more directly. Looking at the problem from a strictly data-analytic point of view, we note that there are two large main effects (education and sex) and a small time effect. This would have implied a plot that showed the large effects clearly and placed the smallish time trend into the background (Figure 16).

[Figure 14 graphic not reproduced: “Incomes of Doctors vs. Other Professionals”; lines for doctors and other professionals, with an annotation marking when Medicare started; horizontal axis: year, 1939 to 1974 in five-year steps.]

Figure 14. Data from Figure 13 redone with linear scale (from Wainer 1980).

[Figure 15 graphic not reproduced: “Median Income of Year-Round, Full-Time Workers 25 to 34 Years Old, by Sex and Educational Attainment: 1968-1977”; constant 1977 dollars; one panel for males and one for females, stacked vertically, with a line for each level of educational attainment; vertical axis from $0 to $20,000.]

Figure 15. Emphasizing the trivial: Hiding the main effect of sex differences in income through the vertical placement of plots (from SI3).

[Figure 16 graphic not reproduced: “Median Income of Year-Round Full Time Workers 25-34 Years Old by Sex and Educational Attainment: 1968-1977 (in Constant 1977 Dollars)”; males and females plotted on a common scale; the legend includes maximum and median (over time).]

[Figure 17 graphic not reproduced: “U.S. Imports of Red Meats”; stacked chart in billions of pounds with bands including beef and veal and lamb, mutton, and goat meat; horizontal axis 1960 to 1978.]

Figure 17. Jiggling the baseline makes comparisons more difficult (from Handbook of Agricultural Charts).

Rule 8-Jiggle the Baseline

Making comparisons is always aided when the quantities being compared start from a common base. Thus we can always make the graph worse by starting from different bases. Such schemes as the hanging or suspended rootogram and the residual plot are meant to facilitate comparisons. In Figure 17 is a plot of U.S. imports of red meat taken from the Handbook of Agricultural Charts published by the U.S. Department of Agriculture. Shading beneath each line is a convention that indicates summation, telling us that the amount of each kind of meat is added to the amounts below it. Because of the dominance of and the fluctuations in importation of beef and veal, it is hard to see what the changes are in the other kinds of meat. Is the importation of pork increasing? Decreasing? Staying constant? The only purpose for stacking is to indicate graphically the total summation. This is easily done through the addition of another line for TOTAL. Note that a TOTAL will always be clear and will never intersect the other lines on the plot. A version of these data is shown in Figure 18 with the separate amounts of each meat, as well as a summation line, shown clearly. Note how easily one can see the structure of import of each kind of meat now that the standard of comparison is a straight line (the time axis) and no longer the import amount of those meats with greater volume.

[Figure 18 graphic not reproduced: “U.S. Imports of Red Meats”; separate lines for each kind of meat (pork is legible), in billions of pounds; horizontal axis 1960 to 1978. Source: Handbook of Agricultural Charts, U.S. Department of Agriculture, 1976, p. 93. Chart source: original.]

Figure 18. An alternative version of Figure 17 with a straight line used as the basis of comparison.

[Figure 19 graphic not reproduced: “Life Expectancy at Birth, by Sex, Selected Countries, Most Recent Available Year”; paired bars for males and females; horizontal axis: years of life expectancy, 0 and 50 to 90. Rows in printed order: Austria, Canada, Finland, France, Germany (Fed. Rep.), Japan, U.S.S.R., Sweden, United Kingdom, United States.]

Figure 19. Austria First! Obscuring the data structure by alphabetizing the plot (from SI3).

Rule 9-Austria First!

Ordering graphs and tables alphabetically can obscure structure in the data that would have been obvious had the display been ordered by some aspect of the data. One can defend oneself against criticisms by pointing out that alphabetizing “aids in finding entries of interest.” Of course, with lists of modest length such aids are unnecessary; with longer lists the indexing schemes common in 19th century statistical atlases provide easy lookup capability.

Figure 19 is another graph from SI3 showing life expectancies, divided by sex, in 10 industrialized nations. The order of presentation is alphabetical (with the USSR positioned as Russia). The message we get is that there is little variation and that women live longer than men. Redone as a stem-and-leaf diagram (Figure 20 is simply a reordering of the data with spacing proportional to the numerical differences), the magnitude of the sex difference leaps out at us. We also note that the USSR is an outlier for men.
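The contrast between alphabetical order and data order can be sketched with the values readable from the stem-and-leaf display in Figure 20. Two caveats: the men's value for the USSR is taken as 63 (the printed label is garbled, and 63 is the row between 64 and 62), and sorting by the female-minus-male gap is our choice of ordering for illustration, not one prescribed by the article.

```python
# Life expectancy at birth in years, read from Figure 20.
women = {"Sweden": 78, "France": 76, "US": 76, "Japan": 76, "Canada": 76,
         "Finland": 75, "Austria": 75, "UK": 75, "USSR": 74, "Germany": 74}
men = {"Sweden": 72, "Japan": 71, "Canada": 69, "UK": 69, "US": 69,
       "France": 69, "Germany": 68, "Austria": 68, "Finland": 67,
       "USSR": 63}  # 63 is an assumed reading of a garbled label

# Rule 9 in action: alphabetical order puts Austria first and hides structure.
alphabetical = sorted(women)

# Ordering by an aspect of the data instead: largest sex difference first.
sex_gap = {country: women[country] - men[country] for country in women}
by_gap = sorted(sex_gap, key=sex_gap.get, reverse=True)

# Ordering men's values from lowest to highest exposes the outlier.
men_ascending = sorted(men, key=men.get)

for country in by_gap:
    print(f"{country:8s} women {women[country]}  men {men[country]}  gap {sex_gap[country]}")
```

With these readings the USSR heads the list with the widest gap and also sits at the bottom of the men's values, which matches the outlier the text points out.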

[Figure 20, reconstructed from the printed layout: “Life Expectancy at Birth, by Sex, Most Recent Available Year.”]

WOMEN                        YEARS   MEN
SWEDEN                        78
                              77
FRANCE, US, JAPAN, CANADA     76
FINLAND, AUSTRIA, UK          75
USSR, GERMANY                 74
                              73
                              72     SWEDEN
                              71     JAPAN
                              70
                              69     CANADA, UK, US, FRANCE
                              68     GERMANY, AUSTRIA
                              67     FINLAND
                              66
                              65
                              64
                              63     USSR
                              62

Figure 20. Ordering and spacing the data from Figure 19 as a stem-and-leaf diagram provides insights previously difficult to extract (from SI3).

Rule 10-Label (a) Illegibly, (b) Incompletely, (c) Incorrectly, and (d) Ambiguously

[Figure 21 graphic not reproduced: “Commission Payments to Travel Agents”; bar chart by airline (Eastern and United are legible).]

Figure 21. Mixing a changed metaphor with a tiny label reverses the meaning of the data (© 1978, The New York Times).

[Figure 22 graphic not reproduced: “Commission Payments to Travel Agents”; lines for United, TWA, Eastern, and Delta; vertical axis: millions of dollars, 30 to 150; horizontal axis: year, 1976, 1977, and 1978 (estimated).]

Figure 22. Figure 21 redrawn with 1978 data placed on a comparable basis (from Wainer 1980).

There are many instances of labels that either do not tell the whole story, tell the wrong story, tell two or more stories, or are so small that one cannot figure out what story they are telling. One of my favorite examples of small labels is from The New York Times (August 1978), in which the article complains that fare cuts lower commission payments to travel agents. The graph (Figure 21) supports this view until one notices the tiny label indicating that the small bar showing the decline is for just the first half of 1978. This omits such …

The author of the article “How to Display Data Badly” presented the 12 most powerful techniques/rules to display data badly.  Once you are done reading the article, please search on the internet to find an example that uses at least one of the 12 rules to show data badly.  Post your example, comment on which rule is used, and offer a solution to re-display the data.

How to Display Data Badly

Author(s): Howard Wainer

Source: The American Statistician , May, 1984, Vol. 38, No. 2 (May, 1984), pp. 137-147

Published by: Taylor & Francis, Ltd. on behalf of the American Statistical Association

Stable URL: https://www.jstor.org/stable/2683253

REFERENCES
Linked references are available on JSTOR for this article:
https://www.jstor.org/stable/2683253?seq=1&cid=pdf-
reference#references_tab_contents
You may need to log in to JSTOR to access the linked references.

JSTOR is a not-for-profit service that helps scholars, researchers, and students discover, use, and build upon a wide
range of content in a trusted digital archive. We use information technology and tools to increase productivity and
facilitate new forms of scholarship. For more information about JSTOR, please contact [email protected]

Your use of the JSTOR archive indicates your acceptance of the Terms & Conditions of Use, available at

Terms and Conditions of Use

Taylor & Francis, Ltd. and American Statistical Association are collaborating with JSTOR to
digitize, preserve and extend access to The American Statistician

This content downloaded from
������������128.193.164.203 on Fri, 08 Jan 2021 03:00:58 UTC������������

All use subject to https://about.jstor.org/terms

https://www.jstor.org/stable/2683253

https://www.jstor.org/stable/2683253?seq=1&cid=pdf-reference#references_tab_contents

https://www.jstor.org/stable/2683253?seq=1&cid=pdf-reference#references_tab_contents

Commentaries are informative essays dealing with viewpoints of sta-

tistical practice, statistical education, and other topics considered to

be of general interest to the board readership of The American Statis-

tician. Commentaries are similar in spirit to Letters to the Editor, but

they involve longer discussions of background, issues, and perspec-

tives. All commentaries will be refereed for their merit and com-

patibility with these criteria.

HOWARD WAINER*

Methods for displaying data badly have been devel-
oping for many years, and a wide variety of interesting

and inventive schemes have emerged. Presented here is

a synthesis yielding the 12 most powerful techniques

that seem to underlie many of the realizations found in

practice. These 12 (the dirty dozen) are identified and

illustrated.

KEY WORDS: Graphics; Data display; Data density;
Data-ink ratio.

1. INTRODUCTION

The display of data is a topic of substantial contemporary interest and one that has occupied the thoughts of many scholars for almost 200 years. During this time there have been a number of attempts to codify standards of good practice (e.g., ASME Standards 1915; Cox 1978; Ehrenberg 1977) as well as a number of books that have illustrated them (i.e., Bertin 1973, 1977, 1981; Schmid 1954; Schmid and Schmid 1979; Tufte 1983). The last decade or so has seen a tremendous increase in the development of new display techniques and tools that have been reviewed recently (Macdonald-Ross 1977; Fienberg 1979; Cox 1978; Wainer and Thissen 1981). We wish to concentrate on methods of data display that leave the viewers as uninformed as they were before seeing the display or, worse, those that induce confusion. Although such techniques are broadly practiced, to my knowledge they have not as yet been gathered into a single source or carefully categorized. This article is the beginning of such a compendium.

The aim of good data graphics is to display data accurately and clearly. Let us use this definition as a starting point for categorizing methods of bad data display. The definition has three parts. These are (a) showing data, (b) showing data accurately, and (c) showing data clearly. Thus, if we wish to display data badly, we have three avenues to follow. Let us examine them in sequence, parse them into some of their component parts, and see if we can identify means for measuring the success of each strategy.

2. SHOWING DATA

Obviously, if the aim of a good display is to convey
information, the less information carried in the display,

[Figure 1 image omitted: "Change in Science Achievement of 9-, 13-, and 17-Year-Olds, by Type of Exercise: 1969-1977"; panels: 9-year-olds, 13-year-olds, 17-year-olds; series: biological science, physical science; vertical axis: change in percent correct; horizontal axis: 1969, 1970, 1973, 1977.]

Figure 1. An example of a low density graph (from S13 [ddi = .3]).

*Howard Wainer is Senior Research Scientist, Educational Testing Service, Princeton, NJ 08541. This is the text of an invited address to the American Statistical Association. It was supported in part by the Program Statistics Research Project of the Educational Testing Service. The author would like to express his gratitude to the numerous friends and colleagues who read or heard this article and offered valuable suggestions for its improvement. Especially helpful were David Andrews, Paul Holland, Bruce Kaplan, James O. Ramsay, Edward Tufte, the participants in the Stanford Workshop on Advanced Graphical Presentation, two anonymous referees, the long-suffering associate editor, and Gary Koch.

© The American Statistician, May 1984, Vol. 38, No. 2


[Figure 2 image omitted; horizontal axis: location difference.]

Figure 2. A low density graph (from Friedman and Rafsky 1981 [ddi = .5]).

the worse it is. Tufte (1983) has devised a scheme for
measuring the amount of information in displays, called
the data density index (ddi), which is “the number of
numbers plotted per square inch.” This easily calcu-
lated index is often surprisingly informative. In popular
and technical media we have found a range from .1 to
362. This provides us with the first rule of bad data
display.

Rule 1-Show as Few Data as Possible (Minimize the
Data Density)

What does a data graphic with a ddi of .3 look like?

Shown in Figure 1 is a graphic from the book Social
Indicators III (S13), originally done in four colors (orig-
inal size 7″ by 9″) that contains 18 numbers (18/63 = .3).
The median data graph in S13 has a data density of .6
numbers/in²; this one is not an unusual choice. Shown in
Figure 2 is a plot from the article by Friedman and
Rafsky (1981) with a ddi of .5 (it shows 4 numbers in 8

[Figure 3 image omitted: chart comparing Japanese and U.S. labor productivity (output per man-hour in manufacturing), with the U.S. at 100%; value labels 44%, 62.3%, and 70%.]

Figure 3. A low density graph (© 1978, The Washington Post) with chart-junk to fill in the space (ddi = .2).

[Figure 4 image omitted: "Public and Private Elementary Schools, Selected Years 1929-1970"; series: public, private; vertical axis: thousands of schools (tick at 300 legible); horizontal axis: school year, 1929-30 to 1969-70.]

Figure 4. Hiding the data in the scale (from S13).

in²). This is unusual for JASA, where the median data graph has a ddi of 27. In defense of the producers of this plot, the point of the graph is to show that a method of analysis suggested by a critic of their paper was not fruitful. I suspect that prose would have worked pretty well also.
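
The data density arithmetic quoted for Figures 1 and 2 can be reproduced with a few lines of Python. This is only a sketch of the definition given in the text (numbers plotted per square inch); the function name `data_density_index` is a label invented for this illustration, not something from Tufte or from this article.

```python
def data_density_index(numbers_plotted: int, area_sq_in: float) -> float:
    """Numbers plotted per square inch of graphic (the ddi as defined in the text)."""
    if area_sq_in <= 0:
        raise ValueError("area must be positive")
    return numbers_plotted / area_sq_in


# Figure 1: 18 numbers on a graphic originally 7 by 9 inches.
ddi_fig1 = data_density_index(18, 7 * 9)

# Figure 2: 4 numbers in 8 square inches.
ddi_fig2 = data_density_index(4, 8)

print(round(ddi_fig1, 1), round(ddi_fig2, 1))
```

Both values round to the figures quoted in the captions (.3 and .5).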

Although arguments can be made that high data density does not imply that a graphic will be good, nor one with low density bad, it does reflect on the efficiency of the transmission of information. Obviously, if we hold clarity and accuracy constant, more information is better than less. One of the great assets of graphical techniques is that they can convey large amounts of information in a small space.

[Figure 5 image omitted: line chart titled "The Number of Private Elementary Schools from 1930-1970"; horizontal axis: year, 1930 to 1970; vertical axis ticks from 9 to 15. Values as labeled on the chart:

    1930    9.275
    1940   10.000
    1950   10.375
    1960   13.574
    1970   14.372
]

Figure 5. Expanding the scale and showing the data in Figure 4 (from S13).

[Figure 6 image omitted: New York Times chart headed "A New Set of Projections for the U.S. Supply of Energy," comparing the actual 1977 supply with two projections for the year 2000, in quads, broken down by energy source.]

Figure 6. Ignoring the visual metaphor (© 1978, The New York Times).

We note that when a graph contains little or no information the plot can look quite empty (Figure 2) and thus raise suspicions in the viewer that there is nothing to be communicated. A way to avoid these suspicions is to fill up the plot with nondata figurations, what Tufte has termed “chartjunk.” Figure 3 shows a plot of the labor productivity of Japan relative to that of the United States. It contains one number for each of three years. Obviously, a graph of such sparse information would have a lot of blank space, so filling the space hides the paucity of information from the reader.

A convenient measure of the extent to which this practice is in use is Tufte’s “data-ink ratio.” This measure is the ratio of the amount of ink used in graphing the data to the total amount of ink in the graph. The closer to zero this ratio gets, the worse the graph. The notion of the data-ink ratio brings us to the second principle of bad data display.

Rule 2-Hide What Data You Do Show
(Minimize the Data-Ink Ratio)

One can hide data in a variety of ways. One method
that occurs with some regularity is hiding the data in the
grid. The grid is useful for plotting the points, but only
rarely afterwards. Thus to display data badly, use a fine
grid and plot the points dimly (see Tufte 1983,
pp. 94-95 for one repeated version of this).

A second way to hide the data is in the scale. This corresponds to blowing up the scale (i.e., looking at the data from far away) so that any variation in the data is obscured by the magnitude of the scale. One can justify this practice by appealing to “honesty requires that we start the scale at zero,” or other sorts of sophistry.

Figure 4 (from S13) is a plot that effectively hides the growth of private schools in the scale. A redrawing of the number of private schools on a different scale conveys the growth that took place during the mid-1950’s (Figure 5). The relationship between this rise and Brown vs. Topeka School Board becomes an immediate question.
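
One rough way to quantify how thoroughly a scale hides the data is to ask what fraction of the axis the data's own range occupies. The sketch below uses the five values labeled in Figure 5. The axis limits (0 to 300 for the Figure 4 style, 9 to 15 for the Figure 5 style) are assumptions read off the surviving tick labels, and `share_of_axis` is a hypothetical helper written for this illustration.

```python
# Private elementary schools as labeled in Figure 5 (apparently in thousands).
private_schools = {1930: 9.275, 1940: 10.000, 1950: 10.375, 1960: 13.574, 1970: 14.372}


def share_of_axis(values, axis_min, axis_max):
    """Fraction of the axis length spanned by the data's own range."""
    values = list(values)
    return (max(values) - min(values)) / (axis_max - axis_min)


# Assumed axis limits; the original charts may differ.
wide = share_of_axis(private_schools.values(), 0, 300)   # scale shared with public schools
tight = share_of_axis(private_schools.values(), 9, 15)   # scale fitted to the data

print(f"wide axis: {wide:.3f}, tight axis: {tight:.3f}")
```

Under these assumptions the private-school series uses under 2% of the wide axis and roughly 85% of the fitted one, which is why the growth is invisible in one plot and obvious in the other.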

To conclude this section, we have seen that we can display data badly either by not including them (Rule 1) or by hiding them (Rule 2). We can measure the extent to which we are successful in excluding the data through the data density; we can sometimes convince viewers that we have included the data through the incorporation of chartjunk. Hiding the data can be done either by using an overabundance of chartjunk or by cleverly choosing the scale so that the data disappear. The data-ink ratio measures the success we have achieved in hiding the data.

[Figure 7 image omitted: two side-by-side panels, both in millions of U.S. dollars. Left panel: U.S. exports to China and U.S. imports from China, 1972 to 1980, vertical ticks at 1,000, 2,000, 3,000. Right panel: U.S. imports from Taiwan and U.S. exports to Taiwan, 1970 to 1980, vertical ticks at 2,000, 4,000, 6,000. Source: Department of Commerce.]

Figure 7. Reversing the metaphor in mid-graph while changing scales on both axes (© June 14, 1981, The New York Times).

3. SHOWING DATA ACCURATELY

The essence of a graphic display is that a set of numbers having both magnitudes and an order are represented by an appropriate visual metaphor: the magnitude and order of the metaphorical representation match the numbers. We can display data badly by ignoring or distorting this concept.

Rule 3-Ignore the Visual Metaphor Altogether

If the data are ordered and if the visual metaphor has a natural order, a bad display will surely emerge if you shuffle the relationship. In Figure 6 note that the bar labeled 14.1 is longer than the bar labeled 18. Another method is to change the meaning of the metaphor in the middle of the plot. In Figure 7 the dark shading represents imports on one side and exports on the other. This is but one of the problems of this graph; more serious still is the change of scale. There is also a difference in the time scale, but that is minor. A common theme in Playfair’s (1786) work was the difference between imports and exports. In Figure 8, a 200-year-old graph tells the story clearly. Two such plots would have illustrated the story surrounding this graph quite clearly.

Rule 4-Only Order Matters

One frequent trick is to use length as the visual metaphor when area is what is perceived. This was used quite effectively by The Washington Post in Figure 9. Note that this graph also has a low data density (.1), and its data-ink ratio is close to zero. We can also calculate Tufte’s (1983) measure of perceptual distortion (PD) for this graph. The PD in this instance is the perceived change in the value of the dollar from Eisenhower to Carter divided by the actual change. I read and measure thus:

    Actual:    (1.00 - .44) / .44 = 1.27
    Measured:  (22.00 - 2.06) / 2.06 = 9.68

    PD = 9.68 / 1.27 = 7.62

This distortion of over 700% is substantial but by no means a record.

Figure 8. A plot on the same topic done well two centuries earlier (from Playfair 1786).

[Figure 9 image omitted: Washington Post graphic on the diminishing dollar, drawn as shrinking dollar bills bearing presidential faces; legible labels: 1958, Eisenhower, $1.00; 1963, Kennedy, 94¢; 1968, Johnson; 1978, Carter, 44¢ (August). Source: Labor Department.]

Figure 9. An example of how to goose up the effect by squaring the eyeball (© 1978, The Washington Post).

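
The perceptual distortion arithmetic for Figure 9 can be checked directly. The sketch below follows the calculation as printed (change divided by the ending value, then measured over actual); the units of the two measured values are not stated in the text, and the names `relative_change` and `distortion` are illustrative only. Rounding the intermediate ratios to two decimals, as the text does, gives 7.62; carrying full precision gives a value closer to 7.61.

```python
def relative_change(start: float, end: float) -> float:
    """Change from start to end, expressed relative to the ending value."""
    return (start - end) / end


actual = relative_change(1.00, 0.44)      # value of the dollar, Eisenhower to Carter
measured = relative_change(22.00, 2.06)   # the author's measurements of the drawing

distortion = measured / actual                                # full precision
distortion_as_printed = round(measured, 2) / round(actual, 2)  # 9.68 / 1.27

print(round(actual, 2), round(measured, 2), round(distortion_as_printed, 2))
```
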
A less distorted view of these data is provided in Figure 10. In addition, the spacing suggested by the presidential faces is made explicit on the time scale.

[Figure 10 image omitted: line chart with points labeled by president (Eisenhower, Kennedy, Johnson legible); horizontal axis: year, 1958 to 1978.]

Figure 10. The data in Figure 9 as an unadorned line chart (from Wainer, 1980).

Rule 5-Graph Data Out of Context

Often we can modify the perception of the graph (particularly for time series data) by choosing carefully the interval displayed. A precipitous drop can disappear if we choose a starting date just after the drop. Similarly, we can turn slight meanders into sharp changes by focusing on a single meander and expanding the scale. Often the choice of scale is arbitrary but can have profound effects on the perception of the display. Figure 11 shows a famous example in which President Reagan gives an out-of-context view of the effects of his tax cut. The Times’ alternative provides the context for a deeper understanding. Simultaneously omitting the context as well as any quantitative scale is the key to the practice of Ordinal Graphics (see also Rule 4). Automatic rules do not always work, and wisdom is always required.

In Section 3 we discussed three rules for the accurate display of data. One can compromise accuracy by ignoring visual metaphors (Rule 3), by only paying attention to the order of the numbers and not their magnitude (Rule 4), or by showing data out of context (Rule 5). We advocated the use of Tufte’s measure of perceptual distortion as a way of measuring the extent to which the accuracy of the data has been compromised by the display. One can think of modifications that would allow it to be applied in other situations, but we leave such expansion to other accounts.

4. SHOWING DATA CLEARLY

In this section we discuss methods for badly displaying data that do not seem as serious as those described previously; that is, the data are displayed, and they might even be accurate in their portrayal. Yet subtle (and not so subtle) techniques can be used to effectively obscure the most meaningful or interesting aspects of the data. It is more difficult to provide objective measures of presentational clarity, but we rely on the reader to judge from the examples presented.

[Figure 11 image omitted: charts from The New York Times, Sunday, August 2, 1981, headed "Your Taxes" for an average family income of $20,000; curves labeled "their bill" and "our bill"; legend entries "Payments under the Ways and Means Committee plan" and "Payments under the President's proposal"; horizontal axis: 1982 to 1986; dollar ticks from 500 to $2500.]

Figure 11. The White House showing neither scale nor context (© 1981, The New York Times, reprinted with permission).

Rule 6-Change Scales in Mid-Axis

This is a powerful technique that can make large differences look small and make exponential changes look linear.

In Figure 12 is a graph that supports the associated story about the skyrocketing circulation of The New York Post compared to the plummeting Daily News circulation. The reason given is that New Yorkers “trust” the Post. It takes a careful look to note the 700,000 jump that the scale makes between the two lines.

In Figure 13 is a plot of physicians’ incomes over time. It appears to be linear, with a slight tapering off in recent years. A careful look at the scale shows that it starts out plotting every eight years and ends up plotting yearly. A more regular scale (in Figure 14) tells quite a different story.
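
The irregular time axis in Figure 13 is easy to detect mechanically. The sketch below takes the tick years printed along that figure's horizontal axis, computes the gap between successive ticks, and compares where a year sits when ticks are drawn equally spaced with where it would sit on a true linear time scale. The helper names are hypothetical and serve only this illustration.

```python
# Tick years as printed along the horizontal axis of Figure 13.
tick_years = [1939, 1947, 1951, 1955, 1963, 1965, 1967, 1970,
              1972, 1973, 1974, 1975, 1976]

# Years elapsed between successive ticks; a regular scale would give one value.
tick_gaps = [later - earlier for earlier, later in zip(tick_years, tick_years[1:])]
is_regular = len(set(tick_gaps)) == 1


def drawn_position(year: int) -> float:
    """Position (0 to 1) when every tick is drawn the same distance apart."""
    return tick_years.index(year) / (len(tick_years) - 1)


def linear_position(year: int) -> float:
    """Position (0 to 1) on a scale where equal distances mean equal time."""
    return (year - tick_years[0]) / (tick_years[-1] - tick_years[0])


print(tick_gaps, is_regular)
print(round(drawn_position(1963), 2), round(linear_position(1963), 2))
```

The gaps shrink from eight years to one, and 1963 sits a third of the way along the drawn axis although it is nearly two thirds of the way through the period, which compresses the early decades and stretches the recent years.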

[Figure 12 image omitted: newspaper chart headed "The soaraway Post, the daily paper New Yorkers trust"; vertical axis: circulation, with ticks from 1,900,000 down to 1,500,000 legible; value labels 1,829,000 (News), 1,636,000, 1,555,000, and 1,491,000.]

Figure 12. Changing scale in mid-axis to make large differences small.


[Figure 13 image omitted: "Incomes of Doctors Vs. Other Professionals (Median Net Incomes)"; source: Council on Wage and Price Stability; series: office-based nonsalaried physicians, with value labels running from $3,262 to 62,799; horizontal axis ticks: 1939, 1947, 1951, 1955, 1963, 1965, 1967, 1970, 1972, 1973, 1974, 1975, 1976.]

Figure 13. Changing scale in mid-axis to make exponential growth linear (© The Washington Post).

Rule 7-Emphasize the Trivial (Ignore the Important)

Sometimes the data that are to be displayed have one important aspect and others that are trivial. The graph can be made worse by emphasizing the trivial part. In Figure 15 we have a page from S13 that compares the income levels of men and women by educational levels. It reveals the not surprising result that better educated individuals are paid better than more poorly educated ones and that changes across time expressed in constant dollars are reasonably constant. The comparison of greatest interest and current concern, comparing salaries between sexes within education level, must be made clumsily by vertically transposing from one graph to another. It seems clear that Rule 7 must have been operating here, for it would have been easy to place the graphs side by side and allow the comparison of interest to be made more directly. Looking at the problem from a strictly data-analytic point of view, we note that there are two large main effects (education and sex) and a small time effect. This would have implied a plot that showed the large effects clearly and placed the smallish time trend into the background (Figure 16).

[Figure 14 image omitted: "Incomes of Doctors vs. Other Professionals"; series: doctors, other professionals; annotation: "Medicare started"; horizontal axis: year, 1939 to 1974 in five-year steps.]

Figure 14. Data from Figure 13 redone with linear scale (from Wainer 1980).

[Figure 15 image omitted: "Median Income of Year-Round, Full-Time Workers 25 to 34 Years Old, by Sex and Educational Attainment"; constant 1977 dollars; separate male and female panels stacked vertically, each with lines by years of schooling; vertical axes from $0 to $20,000; horizontal axis: 1968 to 1980.]

Figure 15. Emphasizing the trivial: Hiding the main effect of sex differences in income through the vertical placement of plots (from S13).

[Figure 16 image omitted: "Median Income of Year-Round Full Time Workers 25-34 Years Old by Sex and Educational Attainment: 1968-1977 (in Constant 1977 Dollars)"; groups labeled Males and Females; vertical axis ticks from 4 to 20; legend entries include maximum and median (over time).]


[Figure 17 image omitted: stacked area chart "U.S. Imports of Red Meats"; vertical axis: billion lb., 0 to 2.5; horizontal axis: 1960 to 1978; bands include beef and veal, and lamb, mutton, and goat meat.]

Figure 17. Jiggling the baseline makes comparisons more difficult (from Handbook of Agricultural Charts).

Rule 8-Jiggle the Baseline

Making comparisons is always aided when the quantities being compared start from a common base. Thus we can always make the graph worse by starting from different bases. Such schemes as the hanging or suspended rootogram and the residual plot are meant to facilitate comparisons. In Figure 17 is a plot of U.S. imports of red meat taken from the Handbook of Agricultural Charts published by the U.S. Department of Agriculture. Shading beneath each line is a convention that indicates summation, telling us that the amount of each kind of meat is added to the amounts below it. Because of the dominance of and the fluctuations in importation of beef and veal, it is hard to see what the changes are in the other kinds of meat. Is the importation of pork increasing? Decreasing? Staying constant? The only purpose for stacking is to indicate graphically the total summation. This is easily done through the addition of another line for TOTAL. Note that a TOTAL will always be clear and will never intersect the other lines on the plot. A version of these data is shown in Figure 18 with the separate amounts of each meat, as well as a summation line, shown clearly. Note how easily one can see the structure of import of each kind of meat now that the standard of comparison is a straight line (the time axis) and no longer the import amount of those meats with greater volume.

[Figure 18 image omitted: line chart "U.S. Imports of Red Meats"; vertical axis: billion lb.; horizontal axis: 1960 to 1978; separate lines for each kind of meat (pork label legible). Source: Handbook of Agricultural Charts, U.S. Department of Agriculture, 1976, p. 93. Chart source: original.]

Figure 18. An alternative version of Figure 17 with a straight line used as the basis of comparison.

[Figure 19 image omitted: paired bars, male and female, titled "Life Expectancy at Birth, by Sex, Selected Countries, Most Recent Available Year"; rows in order: Austria, Canada, Finland, France, Germany (Fed. Rep.), Japan, U.S.S.R., Sweden, United Kingdom, United States; horizontal axis: years of life expectancy, 0 and 50 to 90.]

Figure 19. Austria First! Obscuring the data structure by alphabetizing the plot (from S13).
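
The effect of a jiggled baseline can be seen in numbers as well as in pictures. The sketch below uses made-up import volumes (they are hypothetical, not the USDA figures behind Figures 17 and 18) in which pork is perfectly constant. In a stacked chart the line a reader sees for pork is the running sum of pork and everything beneath it, so it inherits the fluctuations of beef and veal; plotting each series from the time axis and adding a separate total keeps every series readable.

```python
# Hypothetical volumes in hundreds of millions of pounds; NOT the USDA data.
imports = {
    "beef and veal": [10, 16, 13, 20],
    "pork": [3, 3, 3, 3],
    "lamb, mutton, goat": [1, 1, 2, 1],
}


def stacked_tops(series):
    """Upper edge of each band in a stacked chart: a running sum across series."""
    n_points = len(next(iter(series.values())))
    running = [0] * n_points
    tops = {}
    for name, values in series.items():
        running = [r + v for r, v in zip(running, values)]
        tops[name] = running
    return tops


tops = stacked_tops(imports)

# Alternative display: every series from a common baseline, plus one total line.
total = [sum(column) for column in zip(*imports.values())]

print("pork as drawn when stacked:", tops["pork"])
print("pork from a common baseline:", imports["pork"])
print("total:", total)
```

The stacked edge for pork moves with beef and veal even though pork itself never changes, while the total is never smaller than any single series, which is why a TOTAL line cannot cross the others.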

Rule 9-Austria First!

Ordering graphs and tables alphabetically can obscure structure in the data that would have been obvious had the display been ordered by some aspect of the data. One can defend oneself against criticisms by pointing out that alphabetizing “aids in finding entries of interest.” Of course, with lists of modest length such aids are unnecessary; with longer lists the indexing schemes common in 19th century statistical atlases provide easy lookup capability.

Figure 19 is another graph from S13 showing life expectancies, divided by sex, in 10 industrialized nations. The order of presentation is alphabetical (with the USSR positioned as Russia). The message we get is that there is little variation and that women live longer than men. Redone as a stem-and-leaf diagram (Figure 20 is simply a reordering of the data with spacing proportional to the numerical differences), the magnitude of the sex difference leaps out at us. We also note that the USSR is an outlier for men.
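
Reordering by the data rather than by the alphabet takes one line of code. The sketch below uses whole-year values read from the rows of the stem-and-leaf display in Figure 20; they are the rounded row positions, not the exact source figures, and the value for USSR men (63) is inferred from its row position, so treat it as an assumption. The variable names are illustrative.

```python
# (women, men) life expectancy in whole years, read from Figure 20's rows.
# The 63 for USSR men is inferred from its row position (an assumption).
life_expectancy = {
    "Austria": (75, 68), "Canada": (76, 69), "Finland": (75, 67),
    "France": (76, 69), "Germany": (74, 68), "Japan": (76, 71),
    "Sweden": (78, 72), "UK": (75, 69), "US": (76, 69), "USSR": (74, 63),
}

# Alphabetical order says nothing about the data.
alphabetical = sorted(life_expectancy)

# Ordering by the male value puts the structure, and the outlier, in plain view.
by_men = sorted(life_expectancy, key=lambda c: life_expectancy[c][1], reverse=True)

# The sex difference the bar chart hides.
sex_gap = {country: women - men for country, (women, men) in life_expectancy.items()}

for country in by_men:
    women, men = life_expectancy[country]
    print(f"{country:8s} men {men}  women {women}  gap {sex_gap[country]}")
```

Sorted this way, Sweden leads and the USSR falls well below the rest for men, with the largest gap between the sexes.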

Rule 10-Label (a) Illegibly, (b) Incompletely, (c) Incorrectly, and (d) Ambiguously

LIFE EXPECTANCY AT BIRTH, BY SEX,
MOST RECENT AVAILABLE YEAR

                        WOMEN   YEARS   MEN

                       SWEDEN    78
                                 77
    FRANCE, US, JAPAN, CANADA    76
         FINLAND, AUSTRIA, UK    75
                USSR, GERMANY    74
                                 73
                                 72     SWEDEN
                                 71     JAPAN
                                 70
                                 69     CANADA, UK, US, FRANCE
                                 68     GERMANY, AUSTRIA
                                 67     FINLAND
                                 66
                                 65
                                 64
                                 63     USSR
                                 62

Figure 20. Ordering and spacing the data from Figure 19 as a stem-and-leaf diagram provides insights previously difficult to extract (from S13).

[Figure 21 image omitted: New York Times bar chart of commission payments to travel agents, by airline (Eastern and United legible), with accompanying text on discount fares raising travel agents' overhead and offsetting revenue gains from higher volume.]

Figure 21. Mixing a changed metaphor with a tiny label reverses the meaning of the data (© 1978, The New York Times).

[Figure 22 image omitted: line chart "Commission Payments to Travel Agents"; vertical axis: millions of dollars, 0 to 150 in steps of 30; horizontal axis: year, 1976, 1977, 1978 (estimated); series: United, TWA, Eastern, Delta.]

Figure 22. Figure 21 redrawn with 1978 data placed on a comparable basis (from Wainer 1980).

There are many instances of labels that either do not tell the whole story, tell the wrong story, tell two or more stories, or are so small that one cannot figure out what story they are telling. One of my favorite examples of small labels is from The New York Times (August 1978), in which the article complains that fare cuts lower commission payments to travel agents. The graph (Figure 21) supports this view until one notices the tiny label indicating that the small bar showing the decline is for just the first half of 1978. This omits such …