Fake Graph: The Actual “Dunning-Kruger Effect” Is NOTHING Like I Thought It Was

For years, I’ve been teaching a fake graph.

In pretty much every course I teach, on some day when students seem discouraged or distracted, I’ll draw an X axis labeled “Experience” and a Y axis labeled “Confidence,” and sketch out the “Dunning-Kruger Effect” curve, as preparation for an informal pep talk, which goes something like this:

Figure 1: A curve labeled as “Dunning-Kruger Effect,” but with a typo on the X axis, which is labeled “No nothing” instead of “Know nothing.”

When learning anything new, beginners learn so much at first, as they are exposed to new ways of looking at the world and pick up the vocabulary to talk about them, that they get a huge initial confidence boost.

But as soon as you gain just a bit of knowledge, your confidence level starts to drop. Even though your professor can see you are making steady progress along the “experience” axis, the more you know, the more you realize how much you don’t know.

So regardless of the subject, student confidence takes a sharp nose-dive, just when the course really gets rolling.

Lots of beginners give up on the downslope, but eventually the curve bottoms out, and as you look back on your accomplishments, your confidence will rise again.

Dunning and Kruger won the Nobel Prize in Psychology for this work. Isn’t that encouraging to know?

Like many people who share a meme on social media because it helps them make a point they really want to make, I have sadly perpetuated a falsehood.

(Record scratch; freeze-frame.)

But let’s go back to when this all started. (Fast-rewind video effect.)

Monday morning, I figured that it was about time for me to give my Dunning-Kruger Effect pep talk, and since I’m teaching all my classes online (thanks, COVID-19), I thought I’d work this concept into a short video or maybe a new handout.

I looked at the graph I was intending to use, just to re-familiarize myself with what it actually says, and noticed a typo — instead of “Know nothing,” the label at X = 0 is “No nothing.” As I started Googling for a better graphic, I noticed just how many iterations of this curve are out there.

Most of the results looked like the curve I was familiar with, but a significant number showed a variation, with a rounded peak instead of a needle-sharp spike, and a gradual rise instead of an abrupt curve.

Figure 2: A very different curve, also labeled “Dunning-Kruger Effect,” but with the Y axis labeled “Low” to “High” confidence, and terms like “Mount Stupid” and “Valley of Despair” added.

All these images are labeled “Dunning-Kruger Effect,” but the time I’ve recently spent studying coronavirus pandemic curves has made me appreciate just how significantly different some of these graphs are.

For instance, this one I’ve called “Figure 2” adds labels such as “Mount Stupid” and “Valley of Despair,” which strike me not only as a poor match for the kind of pep talk I wanted to give, but also too informal for Nobel Prize-winning scholarship.

So I wondered, what exactly were Dunning and Kruger measuring when they plotted “Confidence” on the Y axis? Did they come up with the terms “Mt. Stupid” and “Valley of Despair”? How did they measure the acquisition of knowledge?

I then wondered about the scale on the Y axis. What exactly were Dunning and Kruger measuring when they labeled a point on the Y axis as “100% Confidence”? The graph plots the curve through the origin (0,0). But who begins any course having exactly *zero* confidence and knowing exactly “nothing”? Certainly SOME students will rate above zero in both categories, which should push the average at least slightly above zero.

How did they know to stop their study as soon as their test subjects became what the first figure calls an “expert” but the second calls a “guru”? How do they define expertise or guru status? Did this study actually follow test subjects from when they were babies (when they knew exactly “nothing”) to when they became experts in their fields? How many of the babies in their study went on to become experts with 100% knowledge in a field?

Since the label for Figure 1 cites the title of the 1999 article by Dunning and Kruger, it wasn’t hard for me to find the full text in Academic Search Elite.

Kruger, Justin, and David Dunning. “Unskilled and Unaware of It: How Difficulties in Recognizing One’s Own Incompetence Lead to Inflated Self-Assessments.” Journal of Personality and Social Psychology, vol. 77, no. 6, Dec. 1999, pp. 1121–1134. Academic Search Elite, doi:10.1037/0022-3514.77.6.1121.

I looked for the familiar curve, but what I found instead were these four graphs:

What the actual what?

Their “Figure 2” does show a slight valley, but nowhere do I see anything that comes close to the beautiful curve I have drawn many times on whiteboards.

The D-K graphs floating around on the Internet label the X axis “Experience” or “knowledge in field.” But in the actual article, each of these charts plots exactly four points, one for each quartile. The Y axis, instead of having some kind of externally verified scale of “Confidence,” is labeled “percentile.”

So the narrative I’ve been giving in the name of Dunning and Kruger is totally wrong.

Dunning and Kruger did not measure the confidence of students at the start of a class (at X = 0), and then track them through the course by measuring their confidence after the first, second, third, and final quarters.

No evidence from their study supports the narrative that the confidence of learners starts out at zero, spikes, nose-dives, and then climbs again, even though the Internet is full of graphics purporting to illustrate that very narrative.

What data are Dunning and Kruger actually plotting?

Each chart breaks down the responses of groups of students who were asked to predict their score on a single test. Their responses were then sorted into four groups according to how they scored on that one test. (That’s why there’s a perfect 45-degree angle for “test scores”: the students were deliberately sorted that way.)
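To make that sorting concrete, here is a minimal Python sketch of how a four-point chart like this could be assembled. Everything in it is an assumption for illustration: the eight scores and self-estimates are invented (they are not data from the article), the function names are hypothetical, and I am assuming the self-estimates are expressed as percentiles to match the Y axis label. It is not the authors' actual procedure.

```python
from statistics import mean


def percentile_ranks(scores):
    """Return each score's percentile rank (0-100) within the group.

    Uses the share of scores strictly below, plus half of any ties.
    """
    n = len(scores)
    ranks = []
    for s in scores:
        below = sum(1 for t in scores if t < s)
        ties = sum(1 for t in scores if t == s)
        ranks.append(100.0 * (below + 0.5 * ties) / n)
    return ranks


def quartile_summary(actual_scores, perceived_percentiles):
    """Sort participants by actual score, cut into four groups, and
    average the actual and perceived percentiles within each group."""
    actual_pct = percentile_ranks(actual_scores)
    order = sorted(range(len(actual_scores)), key=lambda i: actual_scores[i])
    size = len(order) // 4
    summary = []
    for q in range(4):
        start = q * size
        end = (q + 1) * size if q < 3 else len(order)  # last group takes any remainder
        members = order[start:end]
        summary.append((
            q + 1,
            mean(actual_pct[i] for i in members),
            mean(perceived_percentiles[i] for i in members),
        ))
    return summary


# Invented example data: eight test scores and eight self-estimated percentiles.
scores = [3, 5, 8, 9, 12, 14, 17, 19]
perceived = [60, 55, 58, 62, 65, 70, 72, 76]

summary = quartile_summary(scores, perceived)
for quartile, actual, guess in summary:
    print(f"Quartile {quartile}: actual {actual:.1f}, perceived {guess:.1f}")
```

Because the groups are defined by actual score, the “actual” column has to climb steadily from the bottom quartile to the top no matter what the students guessed. Only the “perceived” column carries any information about self-assessment, and the chart has just four X positions, none of which is a point in time.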

Here is the take-away message that Dunning and Kruger leave their readers with:

In sum, we present this article as an exploration into why people tend to hold overly optimistic and miscalibrated views about themselves. We propose that those with limited knowledge in a domain suffer a dual burden: Not only do they reach mistaken conclusions and make regrettable errors, but their incompetence robs them of the ability to realize it. Although we feel we have done a competent job in making a strong case for this analysis, studying it empirically, and drawing out relevant implications, our thesis leaves us with one haunting worry that we cannot vanquish. That worry is that this article may contain faulty logic, methodological errors, or poor communication. Let us assure our readers that to the extent this article is imperfect, it is not a sin we have committed knowingly.

Just as I wouldn’t want to use “Mt. Stupid” or “The Valley of Despair” in an informal pep talk to students who are frustrated in the middle of a course, I wouldn’t want to use terms like “unskilled” and “incompetent,” which have specific meanings in the professional world where Dunning and Kruger live, but carry unpleasant emotional connotations that might make my students feel I am belittling their efforts to learn.

I can understand why an educator might want to take Dunning and Kruger’s negatively phrased finding — that students who lack knowledge of a domain also lack the ability to recognize the errors they make in that domain — and rephrase it more positively: “As students learn more, they are better able to recognize their errors.”

That positive version nicely supports the observation that students were better able to predict their test scores as they learned more.

However, Dunning and Kruger’s study did not actually measure student “confidence” on the Y axis, and the X axis does not measure how much experience students gain over time.

We are not looking at what happens to students over time as they learn; instead, we are looking at how accurately students are able to predict their scores on a single test, and those students are sorted into four groups (graphed at X=1 through X=4) according to their test scores.

Students at all levels predicted they would get roughly the same scores, slightly above average. The students in the bottom quartile vastly over-estimated their scores, while the students in the top quartile under-estimated theirs.

The Y axis includes the students’ “perceived test score” at four different points, but Dunning and Kruger don’t provide us with any information to flesh out the left side, where X = 0. We only have data points for X at 1, 2, 3, and 4.

I just don’t see any support for the distinctive peak and valley curve that so many online sources associate with the Dunning-Kruger effect.