The Igon Value Problem

Priceless. Steve Pinker wrote a spectacular review of Malcolm Gladwell's latest book, What the Dog Saw and Other Adventures, in the New York Times today. I regularly read and enjoy Gladwell's essays in the New Yorker, but I find his style sometimes problematic, verging on anti-intellectual, and I'm thrilled to see a scientist of Pinker's stature calling him out.

Pinker coins a term for the problem with Gladwell's latest book and his work more generally: the "Igon Value Problem," a clever play on the eigenvalue problem in mathematics. You see, Gladwell apparently quotes someone referring to an "igon value," a concept he clearly never dealt with himself, even though it is a ubiquitous tool in statistics and decision science, fields about which Gladwell is frequently so critical. According to Pinker, the Igon Value Problem occurs "when a writer’s education on a topic consists in interviewing an expert," leading him or her to offer "generalizations that are banal, obtuse or flat wrong." In other words, the Igon Value Problem is one of dilettantism. Now, this is clearly a constant concern for any science writer, who has the unenviable task of rendering extremely complex and frequently quite technical information into something that is simultaneously accurate, understandable, and interesting. However, when the bread and butter of one's work is criticizing scientific orthodoxy, it seems one needs to be extremely vigilant about getting the scientific orthodoxy right.

Pinker raises the extremely important point that the decisions we make using the formal tools of decision science (and cognate fields) represent solutions to the inevitable trade-offs between information and cost. This cost can take many forms: money, time spent on the problem, or computational resources, to name a few. Pinker writes:

Improving the ability of your detection technology to discriminate signals from noise is always a good thing, because it lowers the chance you’ll mistake a target for a distractor or vice versa. But given the technology you have, there is an optimal threshold for a decision, which depends on the relative costs of missing a target and issuing a false alarm. By failing to identify this trade-off, Gladwell bamboozles his readers with pseudoparadoxes about the limitations of pictures and the downside of precise information.
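To make the trade-off in that passage concrete, here is the standard decision-theoretic version of the threshold Pinker is describing (my illustration, not something from the review). Suppose an observation x must be classified as target or distractor, a missed target costs $C_{\mathrm{miss}}$, a false alarm costs $C_{\mathrm{fa}}$, and correct decisions cost nothing. The rule that minimizes expected cost is to call "target" whenever the likelihood ratio clears a threshold set by the costs and the base rates:

\[
\frac{p(x \mid \text{target})}{p(x \mid \text{distractor})} \;>\; \frac{C_{\mathrm{fa}}\, P(\text{distractor})}{C_{\mathrm{miss}}\, P(\text{target})}.
\]

Better detection technology sharpens the likelihood ratio on the left; the threshold on the right is fixed by the relative costs and the base rates, which is exactly the trade-off Pinker says Gladwell fails to identify.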

Pinker is particularly critical of an analogy Gladwell draws in one of his essays between predicting the success of future teachers and of future professional quarterbacks. Both are difficult decision tasks fraught with uncertainty. Predicting whether an individual will be a quality teacher from his or her performance on standardized tests or the presence or absence of teaching credentials is an imperfect process, just as predicting the success of a quarterback in the N.F.L. from his collegiate performance is. Gladwell argues that anyone with a college degree should be allowed to teach, and that qualification for the job beyond the degree should be judged only after they have taught. This solution, he argues, is better than the standard practice of credentialing, evaluating, and "going back and looking for better predictors.” You know, science? Pinker doesn't hold back in his evaluation of this logic:

But this “solution” misses the whole point of assessment, which is not clairvoyance but cost-effectiveness. To hire teachers indiscriminately and judge them on the job is an example of “going back and looking for better predictors”: the first year of a career is being used to predict the remainder. It’s simply the predictor that’s most expensive (in dollars and poorly taught students) along the accuracy-cost trade-off. Nor does the absurdity of this solution for professional athletics (should every college quarterback play in the N.F.L.?) give Gladwell doubts about his misleading analogy between hiring teachers (where the goal is to weed out the bottom 15 percent) and drafting quarterbacks (where the goal is to discover the sliver of a percentage point at the top).

This evaluation is spot-on. As a bit of an aside, the discussion of predicting the quality of prospective quarterbacks also reminds me of one of the great masterpieces of statistical science, and the approach described in that paper certainly has a bearing on the types of predictive problems on which Gladwell ruminates. In a 1975 paper, Brad Efron and Carl Morris present a method for predicting 18 major league baseball players' batting averages over the remainder of the 1970 season based on their first 45 at-bats. The naïve method of prediction (no doubt, the approach Gladwell's straw "we" would take) is simply to use each player's average after the first 45 at-bats. It turns out there is a better way to solve the problem, in the sense that you can make more precise predictions (though hardly clairvoyant ones). The method turns on what a Bayesian would call "exchangeability." Basically, the idea is that being a major league baseball player buys you a certain base prediction for the batting average. So if we combine the grand average across the 18 players with each individual's own average in a weighted manner, we can make a prediction with less variation in it. A player's average after a small number of at-bats reflects his abilities, but also lots of forces that are out of his control -- that is, chance. Thus, the uncertainty we have about a player's batting based on this small record is partly due to the inherent variability in his performance, but also partly due to sampling error. By pooling across players, we borrow strength and remove some of this sampling error, allowing us to make more precise predictions. This approach is lucidly discussed in great detail in my colleague Simon Jackman's new book, draft chapters of which we used when we taught our course on Bayesian statistical methods for the social sciences.
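For readers who want to see the mechanics, here is a minimal sketch of that kind of pooling in Python, using a James-Stein-style shrinkage toward the grand mean in the spirit of what Efron and Morris analyze. The hit counts are made up for illustration; they are not the data from the paper.

```python
# Minimal sketch of shrinkage ("borrowing strength") toward the grand mean.
# The hit counts below are illustrative, not the Efron-Morris data.
import numpy as np

hits = np.array([18, 17, 16, 15, 14, 14, 13, 12, 11,
                 11, 10, 10, 10, 10, 10,  9,  8,  7])  # hits in first 45 at-bats
n_ab = 45
y = hits / n_ab                       # naive estimates: each player's own average
k = len(y)
grand_mean = y.mean()

# Approximate binomial sampling variance of an observed average,
# treated as common to all players for simplicity.
sigma2 = grand_mean * (1.0 - grand_mean) / n_ab

# Shrinkage factor: the share of each player's deviation from the grand
# mean that we keep; the rest is attributed to sampling error.
shrink = 1.0 - (k - 3) * sigma2 / np.sum((y - grand_mean) ** 2)
shrink = max(0.0, shrink)

theta_hat = grand_mean + shrink * (y - grand_mean)   # pooled predictions
print(np.round(theta_hat, 3))
```

Every prediction is pulled toward the league-wide average, with the players farthest from the mean pulled hardest in absolute terms; that is the pooling that strips out some of the sampling error.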

Teacher training and credentialing can be thought of as strategies for ensuring exchangeability among teachers, aiding the prediction of teacher performance.  I am not an expert, but it seems we have a long way to go before we can make good predictions about who will become an effective teacher and who will not.  This doesn't mean we should stop trying.

Janet Maslin, in her review of What the Dog Saw, waxes enthusiastic about Gladwell's scientific approach to his essays. She writes that the dispassionate tone of his essays "tames visceral events by approaching them scientifically." I fear that this sentiment, like the statements made in so many of Gladwell's works, reflects the great gulf between most educated Americans and the realities of scientific practice (we won't even talk about the gulf between less educated Americans and science).  Science is actually a passionate, messy endeavor, and sometimes we really do get better by going back and finding better predictors.

Risk-Aversion and Finishing One's Dissertation

It's that time of the year again, it seems, when I have lots of students writing proposals to submit to NSF to fund their graduate education or dissertation research.  This always sets me to thinking about the practice of science and how one goes about being a successful scientist. I've written about "productive stupidity" before, and I still think that is very important. Before I had a blog, I composed a series of notes on how to write a successful NSF Doctoral Dissertation Improvement Grant, after seeing the same mistakes over and over again while sitting on the Cultural Anthropology panel.

This year, I find myself thinking a lot about what Craig Loehle dubbed "the Medawar Zone." This is a nod to the great British scientist Sir Peter Medawar, whose book, The Art of the Soluble: Creativity and Originality in Science, argued that the best kind of scientific problems are those that can actually be solved.  In his classic 1990 paper, Loehle argues that "there is a general parabolic relationship between the difficulty of a problem and its likely payoff." Re-reading this paper got me thinking.

In his figure 1, Loehle defines the Medawar Zone.  I have reproduced a sketch of it here.

[Figure: the Medawar Zone]

Now, what occurred to me on this most recent reading of the paper is that, for the net payoff curve to look like this, the benefits are almost certainly concave in the difficulty of the problem.  That is, they show diminishing marginal returns to increased difficulty.  It is harder to say what the cost curve looks like as a function of difficulty: linear? Convex? Either way, there is an intermediate maximum (akin to Gadgil and Bossert's analysis of intermediate levels of reproductive effort), and the best plan is to pick a problem of intermediate difficulty, because that is where the scientific benefits, net of the costs, are maximized.
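To see why concave benefits plus roughly linear costs produce an interior optimum, here is a toy functional form; the specific curves are my choice for illustration, not Loehle's. Take benefit $B(d) = a\sqrt{d}$ and cost $C(d) = c\,d$ for difficulty $d > 0$ and positive constants $a$ and $c$. Then the net payoff is

\[
N(d) = a\sqrt{d} - c\,d, \qquad N'(d) = \frac{a}{2\sqrt{d}} - c = 0 \;\Longrightarrow\; d^{*} = \left(\frac{a}{2c}\right)^{2},
\]

and since $N''(d) = -\tfrac{a}{4}\,d^{-3/2} < 0$, the net payoff peaks at an intermediate difficulty. The Medawar Zone sits around $d^{*}$.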

Suppose that a dissertation is a risky endeavor.  This is not hard for me to suppose, since I know many people from my grad school days who had at least one failed dissertation project.  Sometimes this led to choosing another, typically less ambitious, project.  Sometimes it led to an exit from grad school, sans Ph.D.  Stanford (like Harvard now, but not when I was a student) funds its Ph.D. students for effectively the entirety of their Ph.D.  This is a great thing for students, because nothing interferes with your ability to think and be intellectually productive more than worrying about how you're going to pay rent.  The downside of this generous funding is that it runs out on a fixed schedule, so students do not have much time to come up with an interesting dissertation project, write grants, go to the field, collect data, and write up before the money is gone. So writing a dissertation is risky.  There is always a chance that if you pick too hard a problem, you might not finish in time and your funding will run out. Well, it just so happens that a concave utility function combined with a risk of failure is pretty much the textbook setup for a risk-averse decision-maker.

Say there is an average degree of difficulty in a field.  A student can choose to work on a topic that is more challenging than average, but there is a very real chance that such a project will fail, and in order to finish the Ph.D. the student will then have to quickly complete work on a problem that is easier than average.  Because the payoff curve is concave in difficulty, the amount you lose relative to the mean if you fail is much greater than the amount you gain relative to the mean if you succeed.  That is, your downside cost is much greater than your upside benefit.

[Figure: risk aversion with a concave payoff curve]

In the figure, note that d1 >> d2.  Here, I have labeled the ordinate as w, which is the population genetics convention for fitness (i.e., the payoff).  The bar-x is the mean difficulty, while x2 and x1 are the high- and low-difficulty projects, respectively.
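Just to spell out what the figure shows: if the hard and easy projects are symmetric about the mean, so that $x_2 - \bar{x} = \bar{x} - x_1$, and the payoff $w$ is strictly concave, then

\[
d_1 = w(\bar{x}) - w(x_1) \;>\; w(x_2) - w(\bar{x}) = d_2, \qquad \text{equivalently} \qquad \tfrac{1}{2}\,w(x_1) + \tfrac{1}{2}\,w(x_2) \;<\; w(\bar{x}).
\]

The second inequality is just Jensen's inequality: a fifty-fifty gamble between the easy and hard projects has a lower expected payoff than the sure thing at the mean difficulty, which is what it means for the downside to dwarf the upside.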

The way economists typically think about risk aversion is that a risk-averse agent is one who is willing to pay a premium for certainty.  This certainty premium is depicted by the dotted line stretching back horizontally from the vertical dashed line at x = bar-x to the utility curve.  The certain payoff the agent is willing to accept in place of the uncertain mean is where this dotted line hits the utility curve. Being at this point on the utility curve (where you have paid the certainty premium) probably puts you at the lower end of the Medawar Zone envelope, but hopefully you're still in it.
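In the usual notation (mine, beyond the symbols already defined for the figure), if attempting the hard project pays $w(x_2)$ with success probability $p$ and $w(x_1)$ otherwise, with $\bar{x} = p\,x_2 + (1-p)\,x_1$, the certainty equivalent $x_{\mathrm{CE}}$ is the sure difficulty whose payoff matches the gamble's expected payoff:

\[
w(x_{\mathrm{CE}}) = p\,w(x_2) + (1-p)\,w(x_1), \qquad \text{certainty premium} = \bar{x} - x_{\mathrm{CE}} > 0
\]

for a strictly concave, increasing $w$. Graphically, $x_{\mathrm{CE}}$ is exactly where the horizontal dotted line meets the utility curve, a bit to the left of the mean difficulty.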

I think that this very standard analysis actually provides the graduate student with pretty good advice. Pick a project you can do, and maybe be a bit conservative.  The Ph.D. isn't a career – it's a launching point for a career. The best dissertation, after all, is a done dissertation.  While I think this is sensible advice for just about anyone working on a Ph.D., the thought of science progressing in such a conservative manner frankly gives me chills.  Talk about a recipe for normal science!  It seems what we need, institutionally, is a period in which conservatism is not the best option. This may just be the post-doc period.  For me, my time at the University of Washington (CSSS and CSDE) was a period when I had nearly unfettered freedom to explore methods relevant to what I was hired to do.  I learned more in two years than in – I'd rather not say how many – years of graduate school. The very prestigious post-doctoral programs, such as the Miller Fellowships at Berkeley or the Society of Fellows at Harvard or Michigan, seem specifically designed to provide an environment in which the concavity of the difficulty-payoff curve is reversed, favoring gambles on more difficult projects.

There is, unfortunately, a bit of folklore that has diffused to me through graduate-student networks which says that anthropologists need to get a faculty position straight out of their Ph.D. or they will never succeed professionally.  This is just the sort of received wisdom that makes my skin crawl and, I fear, it is far too common in our field.  If our hurried-through Ph.D.s can't take the time to take risks, when can we ever expect them to do great work and solve truly difficult problems?