Thoughts on Black Swans and Antifragility

I have recently read the latest book by Nassim Nicholas Taleb, Antifragile. I read his famous The Black Swan a while back while in the field and wrote lots of notes. I never got around to posting those notes since they were quite telegraphic (and often not even electronic!), as they were written in the middle of the night while fighting insomnia under mosquito netting. The publication of his latest, along with the time afforded by my holiday displacement, gives me an excuse to formalize some of these notes here. Like Andy Gelman, I have so many things to say about this work on so many different topics, this will be a bit of a brain dump.

Taleb’s work is quite important for my thinking on risk management and human evolution so it is with great interest that I read both books. Nonetheless, I find his works maddening to say the least. Before presenting my critique, however, I will pay the author as big a compliment as I suppose can be made. He makes me think. He makes me think a lot, and I think that there are some extremely important ideas in his writings. From my rather unsystematic readings of other commentators, this seems to be a pretty common conclusion about his work. For example, Brown (2007) writes in The American Statistician, “I predict that you will disagree with much of what you read, but you’ll be smarter for having read it. And there is more to agree with than disagree. Whether you love it or hate it, it’s likely to change public attitudes, so you can’t ignore it.” The problem is that I am so distracted by all the maddening bits that I regularly nearly miss the ideas, and it is the ideas that are important. There is so much ego and so little discipline on display in his books, The Black Swan and Antifragile.

Some of these sentiments have been captured in Michiko Kakutani’s excellent review of Antifragile. There are some even more hilarious sentiments communicated in Tom Bartlett’s non-profile in the Chronicle of Higher Education.

I suspect that if Taleb and I ever sat down over a bottle of wine, we would not only have much to discuss but we would find that we are annoyed — frequently to the point of apoplexy — by the same people. Nonetheless, I find one of the most frustrating things about reading his work the absurd stereotypes he deploys and broad generalizations he uses to dismiss the work of just about any academic researcher. His disdain for academic research interferes with his ability to make cogent critique. Perhaps I have spent too much time at Stanford, where the nerd is glorified, but, among other things, I find his pejorative use of the term “nerd” for people like Dr. John, as contrasted to man-of-his-wits Stereotyped, I mean, Fat Tony off-putting and rather behind the times. Gone are the days when being labeled a nerd is a devastating put down.

My reading of Taleb’s critiques of prediction and risk management is that the primary problem is hubris. Is there anything fundamentally wrong with risk assessment? I am not convinced there is, and there are quite likely substantial benefits to systematic inquiry. The problem is that the risk assessment models become reified into a kind of reality. I warn students – and try to regularly remind myself – never to fall in love with one’s own model. Something that many economists and risk modelers do is start to believe that their models are something more real than heuristic. George Box’s adage has become a bit cliche but nonetheless always bears repeating: all models are wrong, but some are useful. We need to bear in mind the wrongness of models without dismissing their usefulness.

One problem about both projection and risk analysis, that Taleb does not discuss, is that risk modelers, demographers, climate scientists, economists, etc. are constrained politically in their assessments. The unfortunate reality is that no one wants to hear how bad things can get and modelers get substantial push-back from various stakeholders when they try to account for real worst-case scenarios.

There are ways of building in more extreme events than have been observed historically (Westfall and Hilbe (2007), e.g., note the use of extreme-value modeling). I have written before about the ideas of Martin Weitzman in modeling the disutility of catastrophic climate change. While he may be a professor at Harvard, my sense is that his ideas on modeling the risks of catastrophic climate change are not exactly mainstream. There is the very tangible evidence that no one is rushing out to mitigate the risks of climate change despite the fact that Weitzman’s model makes it pretty clear that it would be prudent to do so. Weitzman uses a Bayesian approach which, as noted by Westfall and Hilbe, is a part of modern statistical reasoning that was missed by Taleb. While beyond the scope of this already hydra-esque post, briefly, Bayesian reasoning allows one to combine empirical observations with prior expectations based on theory, prior research, or scenario-building exercises. The outcome of a Bayesian analysis is a compromise between the observed data and prior expectations. By placing non-zero probability on extreme outcomes, a prior distribution allows one to incorporate some sense of a black swan into expected (dis)utility calculations.
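To make the last point concrete, here is a minimal sketch of how a prior can keep a never-observed catastrophe in an expected-loss calculation. It is not Weitzman's model. The three outcome categories, the loss values, the counts, and the prior pseudo-counts are all hypothetical numbers chosen for illustration, and the Dirichlet-multinomial setup is simply the most basic Bayesian model for categorical data.

```python
# Minimal sketch: Dirichlet prior + observed counts -> posterior mean probabilities.
# All numbers below are hypothetical and chosen only for illustration.

def posterior_probs(counts, prior_pseudocounts):
    """Posterior mean of category probabilities under a Dirichlet prior."""
    total = sum(counts) + sum(prior_pseudocounts)
    return [(c + a) / total for c, a in zip(counts, prior_pseudocounts)]

def expected_loss(probs, losses):
    """Expected (dis)utility: probability-weighted sum of losses."""
    return sum(p * l for p, l in zip(probs, losses))

outcomes = ["mild", "moderate", "catastrophic"]
losses = [1.0, 10.0, 10_000.0]   # hypothetical disutility of each outcome
counts = [80, 20, 0]             # 100 observed years, no catastrophe seen
prior = [0.80, 0.15, 0.05]       # prior worth one observation in total

# Purely empirical frequencies assign zero probability to the catastrophe.
freq_probs = [c / sum(counts) for c in counts]
post_probs = posterior_probs(counts, prior)

freq_loss = expected_loss(freq_probs, losses)
post_loss = expected_loss(post_probs, losses)

print(f"empirical expected loss: {freq_loss:.2f}")
print(f"posterior expected loss: {post_loss:.2f}")
```

Even a very weak prior (equivalent to a single observation) changes the answer substantially here, because the loss attached to the extreme outcome is so large. The posterior is still a compromise: the 100 observed years dominate the probabilities, but the catastrophe never drops to exactly zero.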

Nor does the existence of black swans mean that planning is useless. By their very definition, black swans are rare – though highly consequential – events. Does it not make sense to have a plan for dealing with the 99% of the time when we are not experiencing a black swan event? To be certain, this planning should not interfere with our ability to respond to major events but I don’t see any evidence that planning for more-or-less likely outcomes necessarily trades off with responding to unlikely outcomes.

Taleb is disdainful about explanations for why the bubonic plague didn’t kill more people: “People will supply quantities of cosmetic explanations involving theories about the intensity of the plague and ‘scientific models’ of epidemics.” (Black Swan, p. 120) Does he not understand that epidemic models are a variety of that lionized category of nonlinear processes he waxes about? He should know better. Epidemic models are not one of these false bell-curve models he so despises. Anyone who thinks hard about an epidemic process – in which an infectious individual must come in contact with a susceptible one in order for a transmission event to take place – should be able to infer that an epidemic cannot infect everyone. Epidemic models work and make useful predictions. We should, naturally, exhibit a healthy skepticism about them as we should any model. But they are an important tool for understanding and even planning.

Indeed, our understanding gained from the study of (nonlinear) epidemic models has provided us with the most powerful tools we have for control and even eradication. As Hans Heesterbeek has noted, the idea that we could control malaria by targeting the mosquito vector of the disease is one that was considered ludicrous before Ross’s development of the first epidemic model. The logic was essentially that there are so many mosquitoes that it would be absurdly impractical to eliminate them all. But the Ross model revealed that epidemics – because of their nonlinearity – have thresholds. We don’t have to eliminate all the mosquitoes to break the malaria transmission cycle; we just need to eliminate enough to bring the system below the epidemic threshold. This was a powerful idea and it is central to contemporary public health. It is what allowed epidemiologists and public health officials to eliminate smallpox and it is what is allowing us to very nearly eliminate polio if political forces (black swans?) will permit.
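The threshold logic can be sketched with the simplest textbook epidemic model rather than Ross's malaria model itself. The sketch below assumes a basic SIR epidemic in a closed, well-mixed population, where the fraction ultimately infected, z, solves z = 1 - exp(-R0 z). It also assumes, as a simplification, that the reproduction number R0 scales in proportion to whatever is being controlled (for example, mosquito density). The function names are mine.

```python
import math

def final_size(r0, tol=1e-12, max_iter=10_000):
    """Fraction ultimately infected in a simple SIR epidemic.

    Solves z = 1 - exp(-r0 * z) by fixed-point iteration.
    Below the threshold (r0 <= 1) no large epidemic occurs.
    """
    if r0 <= 1.0:
        return 0.0
    z = 0.5  # any positive starting guess converges to the positive root
    for _ in range(max_iter):
        z_new = 1.0 - math.exp(-r0 * z)
        if abs(z_new - z) < tol:
            return z_new
        z = z_new
    return z

def critical_reduction(r0):
    """Proportional cut in r0 needed to bring it down to the threshold of 1.

    Assumes r0 is proportional to the controlled quantity (an assumption).
    """
    return max(0.0, 1.0 - 1.0 / r0)

for r0 in (0.8, 1.5, 2.0, 4.0):
    print(f"R0={r0}: final size={final_size(r0):.3f}, "
          f"required reduction={critical_reduction(r0):.2f}")
```

Two things fall out of this. First, even well above the threshold the final size is strictly less than one, so the epidemic does not infect everyone. Second, the required reduction is always less than 100%: under these assumptions, a system with R0 = 4 drops below the threshold once the controlled quantity is cut by three quarters.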

Taleb’s ludic fallacy (i.e., games of chance are somehow an adequate model of randomness in the world) is great. Quite possibly the most interesting and illuminating section of The Black Swan happens on p. 130 where he illustrates the major risks faced by a casino. Empirical data make a much stronger argument than do snide stereotypes. This said, Lund (2007) makes the important point that we need to ask what exactly is being modeled in any risk assessment or projection? One of the most valuable outcomes of any formalized risk assessment (or formal model construction more generally) is that it forces the investigator to be very explicit about what is being modeled. The output of the model is often of secondary importance.

Much of the evidence deployed in his books is what Herb Gintis has called “stylized facts” and, of course, is subject to Taleb’s own critique of “hidden evidence.” Because the stylized facts are presented anecdotally, there is no way to judge what is being left out. A fair rejoinder to this critique might be that these are trade publications meant for a mass market and are therefore not going to be rich in data regardless. However, the tone of the books – ripping on economists and bankers but also statisticians, historians, neuroscientists, and any number of other professionals who have the audacity to make a prediction or provide a causal explanation – makes the need for more measured empirical claims more important. I suspect that many of these people actually believe things that are quite compatible with the conclusions of both The Black Swan and Antifragile.

On Stress

The notion of antifragility turns on systems getting stronger when exposed to stressors. But we know that not all stressors are created equal. This is where the work of Robert Sapolsky really comes into play. In his book Why Zebras Don’t Get Ulcers, Sapolsky, citing the foundational work of Hans Selye, notes that some stressors certainly make the organism stronger. Certain types of stress (“good stress”) improve the state of the organism, making it more resistant to subsequent stressors. Rising to a physical or intellectual challenge, meeting a deadline, competing in an athletic competition, working out: these are examples of good stresses. They train body, mind, and emotions and improve the state of the individual. It is not difficult to imagine that there could be similar types of good stressors at levels of organization higher than the individual too. The way the United States came together as a society to rise to the challenge of World War II and emerge as the world’s preeminent industrial power comes to mind. An important commonality of these good stressors is the time scale over which they act. They are all acute stressors that allow recovery and therefore permit the subsequently improved performance.

However, as Sapolsky argues so nicely, when stress becomes chronic, it is no longer good for the organism. The same glucocorticoids (i.e., “stress hormones”) that liberate glucose and focus attention during an acute crisis induce fatigue, exhaustion, and chronic disease when they are secreted at high levels chronically.

Any coherent theory of antifragility will need to deal with the types of stress to which systems are resistant and which, importantly, have a strengthening effect. Using the idea of hormesis – that a positive biological outcome can arise from taking low doses of toxins – is scientifically hokey and borders on mysticism. It unfortunately detracts from the good ideas buried in Antifragile.

I think that Taleb is on to something with the notion of antifragility but I worry that the policy implications end up being just so much orthodox laissez-faire conservatism. There is the idea that interventions – presumably by the State – can do nothing but make systems more fragile and generally worse. One area where the evidence very convincingly suggests that intervention works is public health. Life expectancy has doubled in the rich countries of the developed world from the beginning of the twentieth century to today. Many of the gains were made before the sort of dramatic things that come to mind when many people think about modern medicine. It turns out that sanitation and clean water went an awful long way toward decreasing mortality well before we had antibiotics or MRIs. Have these interventions made us more fragile? I don’t think so. The jury is still out, but it appears that reducing the infectious disease burden early in life (as improved sanitation does) has synergistic effects on later-life mortality, an effect mediated by inflammation.

On The Academy

Taleb drips derision throughout his work on university researchers. There is a lot to criticize in the contemporary university. However, as with so many other external critics of the university, I think that Taleb misses essential features and his criticisms end up being off base. Echoing one of the standard talking points of right-wing critics, Taleb belittles university researchers as being writers rather than doers (echoing the H.L. Mencken witticism “Those who can do; those who can’t teach”). Skin in the game purifies thought and action, a point with which I actually agree. However, thinking that university researchers live in a world lacking consequences is nonsense. Writing is skin in the game. Because we live in a quite free society – and because of important institutional protections on intellectual freedom like tenure (another popular point of criticism from the right) – it is easy to forget that expressing opinions – especially when one speaks truth to power – can be dangerous. Literally. Note that intellectuals are often the first ones to go to the gallows when there are revolutions from both the right and the left: Nazis, Bolsheviks, and Mao’s Cultural Revolution to name a few. I occasionally get, for lack of a better term, unbalanced letters from people who are offended by the study of evolution and I know that some of my colleagues get this a lot more than I. Intellectuals get regular hate mail, a phenomenon amplified by the ubiquity of electronic communication. Writers receive death threats for their ideas (think Salman Rushdie). Ideas are dangerous and communicating them publicly is not always easy, comfortable, or even safe, yet it is the professional obligation of the academic.

There are more prosaic risks that academics face that suggest to me that they do indeed have substantial skin in the game. There is a tendency for critics from outside the academy to see universities as ossified places where people who “can’t do” go to live out their lives, but the university is a dynamic place. Professors do not emerge fully formed from the ivory tower. They must be trained and promoted. This is the most obvious and ubiquitous way that what academics write has “real world” consequences – i.e., for themselves. If peers don’t like your work, you won’t get tenure. One particularly strident critic can sink a tenure case. Both the trader and the assistant professor have skin in their respective games – their continued livelihoods depend upon their trading decisions and their writing. That’s pretty real. By the way, it is a huge sunk investment that is being risked when an assistant professor comes up for tenure. Not much fun to be forty and let go from your first “real” job since you graduated with your terminal degree… (I should note that there are problems with this – it can lead to particularly conservative scholarship by junior faculty, among other things, but this is a topic for its own post.)

Now, I certainly think that there are more and less consequential things to write about. I have gotten more interested in applied problems in health and the environment as I’ve moved through my career because I think that these are important topics about which I have potentially important things to say (and, yes, do). However, I also think it is of utmost importance to promote the free flow of ideas, whether or not they have obvious applications. Instrumentally, the ability to pursue ideas freely is what trains people to solve the sort of unknown and unforecastable problems that Taleb discusses in The Black Swan. One never knows what will be relevant and playing with ideas (in the personally and professionally consequential world of the academy) is a type of stress that makes academics better at playing with ideas and solving problems.

One of the major policy suggestions of Antifragile is that tinkering with complex systems will be superior to top-down management. I am largely sympathetic to this idea and to the idea that high-frequency-of-failure tinkering is also the source of innovation. Taleb contrasts this idea of tinkering with “top-down” or “directed” research, which he argues regularly fails to produce innovations or solutions to important problems. This notion of “top-down,” “directed” research is among the worst of his various straw men and a fundamental misunderstanding of the way that science works. A scientist writes a grant with specific scientific questions in mind, but the real benefit of a funded research program is the unexpected results one discovers while pursuing the directed goals. As a simple example, my colleague Tony Goldberg has discovered two novel simian hemorrhagic viruses in the red colobus monkeys of western Uganda as a result of our big grant to study the transmission dynamics and spillover potential of primate retroviruses. In the grant proposal, we discussed studying SIV, SFV, and STLV. We didn’t discuss the simian hemorrhagic fever viruses because we didn’t know they existed! That’s what discovery means. Their not being explicitly in the grant didn’t stop Tony and his collaborators from the Wisconsin Regional Primate Center from discovering these viruses but the systematic research meant that they were in a position to discover them.

The recommendation of adaptive, decentralized tinkering in complex systems is in keeping with work in resilience (another area about which Taleb is scornful because it is the poor step-child of antifragility). Because of the difficulty of making long-range predictions that arises from nonlinear, coupled systems, adaptive management is the best option for dealing with complex environmental problems. I have written about this before here.

So, there is a lot that is good in the works of Taleb. He makes you think, even if you spend a lot of time rolling your eyes at the trite stereotypes and stylized facts that make up much of the rhetoric of his books. Importantly, he draws attention to probabilistic thinking for a general audience. Too much popular communication of science trades in false certainties and the mega-success of The Black Swan in particular has done a great service to increasing awareness among decision-makers and the reading public about the centrality of uncertainty. Antifragility is an interesting idea though not as broadly applicable as Taleb seems to think it is. The inspiration for antifragility seems to lie in largely biological systems. Unfortunately, basing an argument on general principles drawn from physiology, ecology, and evolutionary biology pushes Taleb’s knowledge base a bit beyond its limit. Too often, the analogies in this book fall flat or are simply on shaky ground empirically. Nonetheless, recommendations for adaptive management and bricolage are sensible for promoting resilient systems and innovation. Thinking about the world as an evolving complex system rather than the result of some engineering design is important and if throwing his intellectual cachet behind this notion helps it to get as ingrained into the general consciousness as the idea of a black swan has become, then Taleb has done another major service.

On Global State Shifts

This is an edited version of a post I sent out to the E-ANTH listserv in response to a debate over a recent paper in Nature and the response to it on the website “Clear Science,” written by Todd Myers. In this debate, it was suggested that the Barnosky paper is the latest iteration of alarmist environmental narratives in the tradition of the master of that genre, Paul Ehrlich. Piqued by this conversation, I read the Barnosky paper and passed along my reading of it.

Myers’s piece on the “Clear Science” web site is quite rhetorically clever. Climate-change deniers have a difficult task if they want to convincingly buck the overwhelming majority of reputable scientists on this issue. Myers uses ideas about the progress of science developed by the philosopher Thomas Kuhn in his classic book, The Structure of Scientific Revolutions. By framing Barnosky et al. as mindlessly toeing the Kuhnian normal-science line, he has come up with a shrewd strategy for dealing with the serious scientific consensus around global climate change. Myers suggests that “Like scientists blindly devoted to a failed paradigm, the Nature piece simply tries to force new data to fit a flawed concept.”

I think that a pretty strong argument can be made that the perspective represented in the Barnosky et al. paper is actually paradigm-breaking. For 200 years the reigning paradigm in the historical sciences has been uniformitarianism. Hutton’s notion — that processes that we observe today have always been working — greatly extended the age of the Earth and allowed Lyell and Darwin to make their remarkable contributions to human understanding. This same principle allows us to make sense of the archaeological record and of ethnographic experience. It is a very useful foil for all manner of exceptionalist explanatory logic and I use it frequently.

However, there are plenty of ways that uniformitarianism fails. If we wanted to follow the Kuhnian narrative, we might say that evidence has mounted that leads to increased contradictions arising from the uniformitarian explanatory paradigm. Rates of change show heterogeneities and when we are trying to understand connected systems characterized by extensive feedback, our intuitions based on gradual change can fail, sometimes spectacularly. This is actually a pretty revolutionary idea, apocalyptic popular writings aside, in mainstream science.

Barnosky et al. draw heavily on contemporary work in complex systems. The theoretical paper (Scheffer et al. 2009) upon which the Barnosky paper relies heavily represents a real step forward in the theoretical sophistication of this corpus and does so by making unique and testable predictions about systems approaching critical transitions. I have written about it previously here.

The most difficult part of projecting the future state of complex systems is the human element. This leads too many physical and biological scientists to simply ignore social and behavioral inputs. This said, there are far too few social and behavioral scientists willing to step up and do the hard collaborative work necessary to make progress on this extremely difficult problem. The difficulty of projecting human behavior often leads to projections of the business-as-usual variety and, unfortunately, these are often mischaracterized by the media and other readers. Such projections simply assume no change in behavior and look at the consequences some time down the line. A business-as-usual projection actually provides a lot of information, albeit about a very hypothetical future. What if things stayed the way they are? Yes, behavior changes. People adapt. Agricultural production becomes more efficient. Prices increase, reducing demand and allowing sustainable substitutes. Of course, sometimes things get worse too. Despite tremendous global awareness and lots of calls to reduce greenhouse gas emissions, carbon emissions have continued to rise. So, there is nothing inherently flawed about a business-as-usual projection. We just need to be clear about what it means when we use one.

A criticism that emerged on the list is that Barnosky et al. is essentially “an opinion piece.” However, the great majority of the Barnosky et al. paper is, in fact, simply a review. There are numerous facts to be reviewed: biodiversity has declined, fisheries have crashed, massive amounts of forest have been converted and degraded, the atmosphere has warmed. They are facts. And they are facts in which many vested interests would like to sow artificial uncertainty for political purposes. Positive things have happened too (e.g., malaria eradication in temperate climes, increased food security in some places that used to be highly insecure, increased agricultural productivity – though this may be of dubious sustainability), though these are generally on more local scales and, in some cases, may simply reflect exporting the problems of rich countries to the Global South. The fact that they are not reviewed does not mean that the paper belongs in an hysterical chicken-little genre.

A common critique of the doomsday genre is the certainty with which the horrible outcomes are framed. The Barnosky paper is suffused with uncertainty. In fact, this is the main point I take away from it! The first conclusion of the paper is that “it is essential to improve biological forecasting by anticipating critical transitions that can emerge on a planetary scale and understanding how such global forcings cause local changes.” This suggests to me that the authors are acknowledging massive uncertainty about the future, not saying that we are doomed with certainty. Or how about: “the plausibility of a future planetary state shift seems high, even though considerable uncertainty remains about whether it is inevitable and, if so, how far in the future it may be”?

Myers writes “they base their conclusions on the simplest linear mathematical estimate that assumes nothing will change except population over the next 40 years. They then draw a straight line, literally, from today to the environmental tipping point.” This is a profoundly misleading statement. Barnosky et al. are using the fold catastrophe model discussed in Scheffer et al. (2009). The Scheffer et al. analysis of the fold catastrophe model uses some fairly sophisticated ideas from complex systems theory, but the ideas are relatively simple. The straight line that so offends Myers arises because this is the direction of the basin of attraction. In the figure below, I show the fold-catastrophe model. The abscissa represents the forcing conditions of the system (e.g., population size or greenhouse gas emissions). The ordinate represents the state of the system (e.g., land cover or one of many ecosystem services). The sideways N represents an attractor – a more general notion of an equilibrium. The state of the system tends toward this curve whenever it is perturbed away.

The region in the interior of the fold (indicated by the dashed line) is unstable while the upper and lower tails (indicated by solid lines) are stable and tend to draw perturbations from the attractor toward them. The grey arrows indicate the basin of attraction. When the system is perturbed off of the attractor by some random shock, the state tends to move in the direction indicated by the arrow. When the state is forced all the way down the top arc of the fold, it enters a region where a relatively small shock can send the state into a qualitatively different regime of rapid degradation. This is illustrated by the black arrow (a shock) pushing the state away from point F2. The state will settle again on the attractor, but a second shock will send the state rapidly down toward the bottom arm of the fold (point F1). Note that this region of the attractor is stable so it would take a lot of work to get it back up again (e.g., reduce population or drastically reduce total greenhouse gases). This is what people mean when they colloquially refer to a “global tipping point.”
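The qualitative behavior described above can be reproduced with a toy model. The sketch below assumes the standard cubic form dx/dt = x - x^3 - f, where x is the state and f is the forcing. This is my choice of a minimal system with a fold, not the parameterization used by Scheffer et al. or Barnosky et al., and the sweep range and step sizes are arbitrary. The forcing is ramped slowly up and then back down, letting the state relax toward the attractor at each step.

```python
# Toy fold-catastrophe model: dx/dt = x - x**3 - forcing (an assumed minimal form).
# The upper and lower branches of the attractor are stable; the middle is unstable.

def relax(x, forcing, dt=0.01, steps=5000):
    """Let the state settle toward the attractor at fixed forcing (Euler steps)."""
    for _ in range(steps):
        x += dt * (x - x**3 - forcing)
    return x

def sweep(forcings, x0):
    """Track the state as the forcing is changed slowly, one value at a time."""
    states = []
    x = x0
    for f in forcings:
        x = relax(x, f)
        states.append(x)
    return states

forcing_up = [i / 100 for i in range(0, 61)]   # ramp forcing from 0.00 to 0.60
forcing_down = list(reversed(forcing_up))      # then back down to 0.00

states_up = sweep(forcing_up, x0=1.0)                 # start on the upper branch
states_down = sweep(forcing_down, x0=states_up[-1])   # start where the ramp ended

for f in (0.0, 0.3, 0.6):
    i_up = forcing_up.index(f)
    i_down = forcing_down.index(f)
    print(f"forcing={f:.1f}: state on the way up={states_up[i_up]:+.2f}, "
          f"on the way down={states_down[i_down]:+.2f}")
```

In this toy system the state declines gradually along the upper branch until the forcing passes the fold, at which point it drops to the lower branch. Returning the forcing to its original value does not restore the original state, which is the hysteresis that makes recovery so much work.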

This is the model. It may not be right, but thanks to Scheffer et al. (2009), it makes testable predictions. By framing global change in terms of this model, Barnosky et al. are making a case for empirical investigation of the types of data that can falsify the model. Maybe because of the restrictions placed on them by Nature (and these are severe!), maybe because of some poor choices of their own, they include an insufficiently explained, fundamentally complex figure that a critic with clear interests in muddying the scientific consensus can seize on to dismiss the whole paper as just more Ehrlich-style hysteria.

For me – as I suspect for the authors of the Barnosky et al. paper – massive, structural uncertainty about the state of our planet, coupled with a number of increasingly well-supported models of the behavior of nonlinear systems (i.e., not simply normal science) strongly suggests a precautionary principle. This is something that the economist Marty Weitzman suggested in his (highly technical and therefore not widely read) paper in 2009 and that I have written about before here and here. This is not inflammatory fear-mongering, nor is it grubbing for grant money (I wish it were that easy!). It is responsible scientists doing their best to communicate the state of the science within the constraints of society and the primary mode of scientific communication. Let’s not be taken in by writers pretending to present “just the facts” in a cool, detached manner but who actually have every reason to try to foment unnecessary uncertainty about the state of our world and impugn the integrity of people doing their level best to understand a rapidly changing planet.

References

Kuhn, T. 1962. The Structure of Scientific Revolutions. Chicago: University of Chicago Press.

Scheffer, M., J. Bascompte, W. A. Brock, V. Brovkin, S. R. Carpenter, V. Dakos, H. Held, E. H. van Nes, M. Rietkerk, and G. Sugihara. 2009. Early-Warning Signals for Critical Transitions. Nature. 461 (7260):53-59.

Weitzman, M. L. 2009. On Modeling and Interpreting the Economics of Catastrophic Climate Change. The Review of Economics and Statistics. XCI (1):1-19.

Wealth and Cheating

I recently read a story in the Los Angeles Times about a team of psychologists at UC Berkeley who showed, in a series of experimental and naturalistic studies, that wealthy individuals are more likely to cheat or violate social norms about fairness. The story in the Times referred to the paper by Piff et al. in the 27 February edition of PNAS. Here is the abstract of this paper:

Seven studies using experimental and naturalistic methods reveal that upper-class individuals behave more unethically than lower-class individuals. In studies 1 and 2, upper-class individuals were more likely to break the law while driving, relative to lower-class individuals. In follow-up laboratory studies, upper-class individuals were more likely to exhibit unethical decision-making tendencies (study 3), take valued goods from others (study 4), lie in a negotiation (study 5), cheat to increase their chances of winning a prize (study 6), and endorse unethical behavior at work (study 7) than were lower-class individuals. Mediator and moderator data demonstrated that upper-class individuals’ unethical tendencies are accounted for, in part, by their more favorable attitudes toward greed.

This study was apparently motivated by observations that people in expensive luxury cars are more likely to bolt ahead of their turn at four-way stop intersections in the San Francisco Bay Area, a daily experience for anyone driving in Palo Alto! It’s terrific that these authors actually took the trouble to systematize their casual observations of driving behavior and make an interesting and compelling scientific statement.

On Friday, I made my own observations about class, cheating, and the violation of norms as I flew down to LAX to attend Sunbelt XXXII (the annual conference for the International Network for Social Network Analysis). Of late, I’ve racked up a lot of miles on United and, as a result, occasionally get upgraded to first class or business class seating. My trip Friday was one of those occasions. As I sat in the (relatively) comfy leather seat of the first-class cabin reading Jeremy Boissevain’s rather appropriate (1974) book Friends of Friends: Networks, Manipulators, and Coalitions, I noticed that nearly everyone around me was busily chatting away or otherwise fiddling around with their smart phones. When the cabin door finally closed and the announcement was made requesting that phones be switched off, none of the people in my neighborhood did so. They put their phones down or in their shirt pockets and watched the flight attendants.  When the flight attendants passed through the cabin and were occupied with other business, out came the smart phones again. The one gentleman across the aisle from me looked like a school kid writing a note in class or something. He kept a wary half-eye out for the flight attendants and looked extremely guilty about his actions, but he nonetheless kept doing his, no doubt, extremely important business.  The man on the phone in the row ahead of me was a little more shameless. He seemed completely unconcerned that he might get busted. The woman in the row ahead of me and across the aisle moved her phone so that it was partially hidden by the arm-rest of her seat as she continued to scroll through her very, very important email. Of the six people I could easily see in my neighborhood, fully half of them continued to use their phones right into taxi and take-off.  Based on their attempts at concealment, at least two of them knew what they were doing was wrong. 
Now, any regular traveler has seen people using their phones on the plane after they are supposed to have switched them off. However, I had never seen this sort of density of norm violation on a single flight before.

Of course, this is an anecdote but the study by Piff et al. (2012) shows how anecdotes about social behavior can go on to be systematized into interesting scientific studies.

Three Questions About Norms

Well, it certainly has been a while since I’ve written anything here. Life has gotten busy with new projects, new responsibilities, etc. Yesterday, I participated in a workshop on campus sponsored by the Woods Institute for the Environment, the Young Environmental Scholars Conference. I was asked to stand in for a faculty member who had to cancel at the last minute. I threw together some rather hastily written notes and figured I’d share them here (especially since I spoke quite a bit about the importance of public communication!).

The theme of the conference was “Environmental Policy, Behavior, and Norms” and we were asked to answer three questions: (1) What does doing normative research mean to you? (2) How do your own norms and values influence your research? (3) What room and role do you see for normative research in your field? So, in order, here are my answers.

What does doing normative research mean to you?

I actually don’t particularly like the term “normative research” because it sounds a little too much like imposing one’s values on other people. I am skeptical of the imposition of norms that have more to do with (often unrecognized) ideology than with empirical truth – an idea that was later reinforced by a terrific concluding talk by Debra Satz. If I can define “normative” to mean “with the intent to improve people’s lives,” then OK. Otherwise, I prefer to do “positive” research.

For me, normative research is about doing good science. As a biosocial scientist with broad interests, I wear a lot of hats. I have always been interested in questions about the natural world, and (deep) human history in particular. However, I find that the types of questions that really hold my interest these days are more and more engaged in the substantial challenges we face in the world with inequality and sustainability. In keeping with my deep pragmatist sympathies, I increasingly identify with Charles Sanders Peirce’s idea that given the “great ocean of truth” that can potentially be uncovered by science, there is a moral burden to do things that have social value. (As an aside, I think that there is social value in understanding the natural world, so I don’t mean to imply a crude instrumentalism here.) In effect, there is a lot of cool science to be done; one may as well do something of relevance. I personally have little patience for people who pursue racist or otherwise socially divisive agendas and cloak their work in a veil of free scientific inquiry. This said, I worry when advocacy interferes with intellectual fairness or breeds an unwillingness to accept that one’s position is not actually true.

I think that we are fooling ourselves if we believe that our norms somehow don’t have an effect on our research. Recognizing the norms that shape your research – whether implicitly or explicitly – helps you manage your bias. Yes, I said manage. I’m not sure we can ever completely eliminate it. I see this as more of a management of a necessary trade-off, drawing an analogy between the practice of science and a classic problem in statistics, the trade-off between bias and variance. The more biased one is, the less variance there is in the outcome of one’s investigation. The less bias, the greater the likelihood that results will differ from one’s expectations (or wishes). Recognizing how norms shape our research also deals with that murky area of pre-science: where do our ideas for what to study come from?

How do your own norms and values influence your research?

Some of the norms that shape my own research and teaching include:

transparency: science works best when it is open. This places a premium on sharing data, methods, and communicating results in a manner that maximizes access to information. As a simple example, this norm shapes my belief that we should not train students from poor countries in the use of proprietary software (and other technologies) that they won’t be able to afford when they return to their home countries when there are free or otherwise open-source alternatives.

fairness: this naturally includes a sense of social justice or people playing on an equal playing field, but it also includes fairness to different ideas, alternative hypotheses, the possibility that one is wrong. This type of fairness is essential for one’s credibility as a public intellectual in science (particularly supporting policy), as noted eloquently in this interview with Dick Lewontin.

respect for people’s ultimate rationality: Trying to understand the social, ecological, and economic context of people’s decision-making, even if it violates our own normative – particularly market-based economic – expectations.

flexibility: solving real problems means that we need to be flexible in our approach, willing to go where the solutions lead us, learning new tools and collaborating. Flexibility also means a willingness to give up on a research program that is doing harm.

good-faith communication: I believe that there is no room for obscurantism in the academy of the 21st century. This includes public communication. There are, of course, complexities here with regard to the professional development of young scholars. One of the key trade-offs for young scholars is between the need for professional advancement (which comes from academic production) and activism, policy, and public communication. Within the elite universities, the reality is that neither public communication nor activism counts much for tenure. However, as Jon Krosnick noted, tenure is a remarkable privilege and, while it may seem impossibly far away for a student just finishing a Ph.D., it’s not really. Once you prove that you have the requisite disciplinary chops, you have plenty of time to use tenure for what it is designed for (i.e., protecting intellectual freedom) and to engage in critical public debate and communication.

humility: solving problems (in science and society) means caring more about the answer to a problem than one’s own pet theory. Humility is intimately related to respect for others’ rationality. It also means recognizing the inherently collaborative nature of contemporary science: giving credit where it is due, seeking help when one is in over one’s head, etc. John DeGioia, President of Georgetown University, quoted St. Augustine in his letter of support for Georgetown Law student Sandra Fluke against the crude attacks by radio personality Rush Limbaugh, and I think those words are quite applicable here as well. Augustine implored his interlocutors to “lay aside arrogance” and to “let neither of us assert that he has found the truth; let us seek it as if it were unknown to both.” This is not a bad description of the way that science really should work.

What room and role do you see for normative research in your field?

I believe that there is actually an enormous amount of room for normative research, if by “normative research,” we mean research that has the potential to have a positive effect on people’s lives. If instead we mean imposing values on people, then I am less sure of its role.

Anthropology is often criticized from outside the field, and to a lesser extent from within it, for being overly politicized. You can see this in Nicholas Wade’s critical pieces in the New York Times Science Times section following the excision of the word “science” from the field’s long-range planning document by the American Anthropological Association’s executive committee. Wade writes,

The decision [to remove the word ‘science’ from the long-range planning document] has reopened a long-simmering tension between researchers in science-based anthropological disciplines — including archaeologists, physical anthropologists and some cultural anthropologists — and members of the profession who study race, ethnicity and gender and see themselves as advocates for native peoples or human rights.

This is a common sentiment. And it is a complete misunderstanding. It suggests that scientists can’t be advocates for native peoples or human rights.  It also suggests that one can’t study race, ethnicity, or gender from a scientific perspective.  Both these ideas are complete nonsense.  For all the leftist rhetoric, I am not impressed with the actual political practice of what I see in contemporary anthropology. There is plenty of posturing about power asymmetries and identity politics but it is always done in such a mind-numbingly opaque language and with no apparent practical tie-in to policies that make people’s lives better. And, of course, there is the outright disdain for “applied” work one sees in elite anthropology departments.

Writing specifically about Foucault, Chomsky captured my take on this whole mode of intellectual production:

The only way to understand [the mode of scholarship] is if you are a graduate student or you are attending a university and have been trained in this particular style of discourse. That’s a way of guaranteeing…that intellectuals will have power, prestige and influence. If something can be said simply, say it simply, so that the carpenter next door can understand you. Anything that is at all well understood about human affairs is pretty simple.

Ultimately, the simple truths about human affairs that I find anyone can relate to are subsistence, health, and the well-being of one’s children. These are the themes at the core of my own research and I hope that the work I do ultimately can effect some good in these areas.

My Erdős Number

Paul Erdős was the great peripatetic, and highly prolific, mathematician of the 20th century. A terrific web page run by Jerry Grossman at Oakland University provides details of the Erdős Number Project. Erdős was a pioneer in graph theory, which provides the formal tools for the analysis of social networks. A collaboration graph is a special graph in which the nodes are authors and an edge connects two authors if they have co-authored a publication. Erdős was such a prolific collaborator that he forms a major hub in the mathematics collaboration graph, linking many disparate authors in the different realms of pure and applied mathematics.

For whatever reason, today I used Grossman’s directions for finding one’s number. <drum roll> My Erdős number is 4.  The path that leads me to Erdős is pretty sweet, I have to say.  This past year, I published a paper in PNAS with Marc Feldman.  Marc wrote a number of papers (here’s one) with Sam Karlin (who, I’m proud to say, came and slept through at least one talk I gave at the Morrison Institute). Karlin wrote a paper with Gábor Szegő, who wrote a paper with Erdős.  Lots of Stanford greatness there that I feel privileged to be a part of. It turns out that I have independent (though longer) paths through my co-authors Marcel Salathé and Mark Handcock as well.
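For the curious, an Erdős number is just the length of the shortest path to Erdős in the collaboration graph, which a breadth-first search finds directly. Here is a minimal Python sketch, assuming a toy graph that contains only the chain described above ("me" stands in for the author of this post; the function names are made up for illustration, and a real collaboration graph has vastly more edges):

```python
from collections import deque

# Toy edge list: only the co-authorship chain described in the post.
# "me" is a placeholder for the post's author.
COAUTHORSHIPS = [
    ("me", "Feldman"),
    ("Feldman", "Karlin"),
    ("Karlin", "Szegő"),
    ("Szegő", "Erdős"),
]


def build_graph(edges):
    """Build an undirected adjacency map from co-authorship pairs."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph


def shortest_path(graph, start, target):
    """Breadth-first search; returns the shortest path or None."""
    if start not in graph or target not in graph:
        return None
    previous = {start: None}  # also serves as the visited set
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == target:
            # Walk the predecessor links back to the start.
            path = []
            while node is not None:
                path.append(node)
                node = previous[node]
            return path[::-1]
        for neighbor in sorted(graph[node]):
            if neighbor not in previous:
                previous[neighbor] = node
                queue.append(neighbor)
    return None


graph = build_graph(COAUTHORSHIPS)
path = shortest_path(graph, "me", "Erdős")
erdos_number = len(path) - 1  # number of edges on the path
print(" -> ".join(path), "| Erdős number:", erdos_number)
```

Because breadth-first search explores the graph one co-authorship step at a time, the first path it finds to Erdős is guaranteed to be a shortest one, which is why longer independent paths do not change the number.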

That's How Science Works

The RealClimate blog has a very astute entry on the controversy surrounding the recent report in the prestigious journal Science that bacteria living in the arsenic-rich waters of Mono Lake in California can substitute arsenic for phosphorus in their DNA. If true, this would be a major finding because it expands the range of environments in which we could conceivably find extraterrestrial life. In effect, this result would suggest a wider range of building blocks for life. Pretty heavy stuff. Now, I am way out of my depth on this topic, but it sounds like the paper published in Science suffers from some fairly serious problems. Some of the problems noted by experts in the field have been assembled by Carl Zimmer on his blog. Carl also provides a pithy treatment of the controversy in an article at Slate.com. John Roach has a similarly excellent review of the controversy, including what we learn about science from it, on his Cosmic Log blog.

Regardless of the scientific merits of this work, this episode is actually an instructive example of the way that science works. As the RealClimate folks write,

The arseno-DNA episode has displayed this process in full public view. If anything, this incident has demonstrated the credibility of scientists, and should promote public confidence in the scientific establishment.

The post then goes on to list three important lessons we can draw from this whole incident:

  1. “Major funding agencies willingly back studies challenging scientific consensus.” It helps if the challenge to scientific consensus is motivated by carefully reasoned theoretical challenges or, even better, data that challenge the consensus. Some yahoo saying that evolution is “just a theory” or that climate change isn’t real because it was really cold last winter isn’t enough. In the case of arseno-DNA, as Carl Zimmer notes, the National Academy of Sciences published a report in 2007 that suggested the theoretical possibility of arsenic-based biology. Carl also notes that some of the authors of this report are highly critical of the Science paper as well. The report challenged the orthodoxy that phosphate was a necessary building block of DNA, and the report’s authors later called out NASA (the major funding source for this kind of extreme biology) for publishing sloppy science. Lots of orthodoxy being challenged here…
  2. “Most everyone would be thrilled to overturn the consensus. Doing so successfully can be a career-making result. Journals such as Science and Nature are more than willing to publish results that overturn scientific consensus, even if data are preliminary – and funding agencies are willing to promote these results.” Individual scientists have enormous individual and institutional incentives to overturn orthodoxies if it is within their power. You become a star when you pull this feat off. And you better believe that every funding agency out there would like to take credit for funding the critical research that helped overturn a fundamental scientific paradigm.
  3. “Scientists offer opinions based on their scientific knowledge and a critical interpretation of data. Scientists willingly critique what they think might be flawed or unsubstantiated science, because their credibility – not their funding – is on the line.” As a scientist, you have to do this if you are going to be taken seriously by your peers — you know, the ones who do all that peer review that climate deniers, e.g., seem to get their collective panties in a wad about?

The RealClimate piece summarizes by noting:

This is the key lesson to take from this incident, and it applies to all scientific disciplines: peer-review continues after publication. Challenges to consensus are seriously entertained – and are accepted when supported by rigorous data. Poorly substantiated studies may inspire further study, but will be scientifically criticized without concern for funding opportunities. Scientists are not “afraid to lose their grant money”.

Read the RealClimate post to get the full story. Obviously, these authors (who do excellent science and amazing public education work, a rare combination) are interested in what this controversy has to say about accusations of bias in climate science — check out the RealClimate archives for some back-story on this. However, the post is so much more broadly applicable, as they note in the quote above. Science is not a monolithic body of information; it is a process, a system designed to produce positive (as opposed to normative) statements about the world around us. When it works correctly, science is indifferent to politics or the personal motivations of individual scientists because results get replicated.  Everything about a scientific paper is designed to allow other researchers to replicate the results that are presented in that paper.  If other researchers can’t replicate some group’s findings, those findings become suspect (and get increasingly so as more attempts to replicate fail).

So what does this mean for Anthropology as a science? You may remember that there has been some at times shrill “discussion” (as well as some genuine intellectual discussion) about the place for science in Anthropology and the American Anthropological Association in particular. For me, replicability is a sine qua non of science. The nature of much anthropological research, particularly research in cultural anthropology, makes the question of replication challenging. When you observe some group of people behaving in a particular way in a particular place at a particular time, who is to say otherwise? I don’t claim to have easy answers here, but there are a few things we can do to ensure the quality of our science.

First, we need to have scientific theories that are sufficiently robust that they can generate testable predictions that transcend the particularities of time and place. Results generated in one population/place/time can then be challenged by testing in other populations/places/times. Of course, it is of the utmost importance that we try to understand how the differences in population and place and time will change the results, but this is what our research is really about, right?  When we control for these differences, do we still see the expected results?

Second, we need to be scrupulous in our documentation of our results and the methods we employ to generate these results.  You know, like science? It’s never easy to read someone else’s lab notebook, but we need to be able to do this in anthropology, at least in principle.  Going back to the raw data as they are reduced in a lab notebook or its equivalent is probably the primary means through which scientific fraud is discovered. Of course, there are positive benefits to having scrupulously-kept field notes as well.  They serve as a rich foundation for future research by the investigator, for instance.

Third, we need to be willing to share our data. This is expected in the natural sciences (in fact, it is a condition for publication in journals like Science and Nature) and it should be in Anthropology as well.

I think that the points of the RealClimate post all apply to anthropology as well. Surrounding the latest brouhaha on science in anthropology, one hears a lot of grousing about various cartels (e.g., the AAA Executive Board, the editorial boards of various journals, etc.) that keep anthropologists of different stripes (yes, it happens on both sides) from receiving grants or getting published or invited to serve on various boards, etc. Speaking from my experience as both panelist and applicant, I can confidently say that the National Science Foundation’s Cultural Anthropology Program funds good cultural anthropology of a variety of different approaches (there are also other BCS programs that entertain, and sometimes fund, applications from anthropologists) and the panel will happily fund orthodoxy-busting proposals if they are sufficiently meritorious. The editorial position of American Ethnologist not in line with your type of research? If you’ve done good science, there are lots of general science journals that will gladly take interesting and important anthropology papers (and, might I add, have much higher impact factors). I co-authored a paper with Rebecca and Doug Bird that appeared in PNAS not too long ago. Steve Lansing has also had a couple of nice papers in PNAS, as have Richard McElreath, Herman Pontzer, and … a bunch of other anthropologists! Mike Gurven at UCSB has had some luck getting papers into Proceedings of the Royal Society B. Mhairi Gibson and Ruth Mace have papers in Biology Letters and PLoS Medicine. Rebecca Sear has various papers in Proceedings of the Royal Society B. Monique Borgerhoff Mulder and a boat-load of other anthropologists (and at least one economist) have a paper in Science. Ruth Mace has papers in most of these journals as well as at least one in Science. Rob Boyd, Richard McElreath, Joe Henrich, and I all even have papers about human social behavior, culture, etc. in theoretical biology journals such as Theoretical Population Biology and the Journal of Theoretical Biology. There’s lots more. As with my previous post, this is a total convenience sample of work with which I am already familiar. The point is that there are outlets for good scientific anthropology out there even if people like me are unlikely to publish in journals like PoLAR.

So, I’m sanguine about the process of science and the continuing ability of anthropologists to pursue science. My winter break is drawing to a close and I’m going to try to continue some of this myself!

Typologies of Critique

Greg Downey over at Neuroanthropology has a fantastic post on the most recent flare-up of the anthropology-is-it-science-or-is-it-literature wars. There is an awful lot of wise prose to be found in this post (and some disturbing information about the labor action at Macquarie University), but the thing that tickled me more than anything was his typology of criticism. I love these sorts of typologies as intellectual play-things and have lots of my own (that probably any of my grad students or post-docs would be happy to tell you about over a beer some time). Greg’s typology of stupid criticisms:

  1. Critique for incompleteness, “where the critic points out something tangentially related to the author’s topic or argument and then asserts that this missing element is THE most important consideration, so the argument is hopelessly, fatally flawed.”
  2. Critique from creative misunderstanding,  where “the critic latches onto a single term or phrase, intentionally misunderstands it or comes up with an interpretation that could only occur to the most hostile, cranky, ill-disposed reader, and then projects the misunderstanding onto a straw version of the presenter.”
  3. Critique from guilt by association, where “the critic sees some sort of link between what the author writes and some deeply loathed intellectual villain, draws some sort of tenuous connection, and then just substitutes the villain’s ideas for the argument, essay or analysis in question.”

Awesome.  I will need to get to work thinking of other willfully bone-headed modes of critique. I will think of this post every time I review a paper or grant proposal from now on…

A similar typology that I came up with attending demography talks, first at the Harvard Center for Population and Development and later at the Population Association of America meetings, deals with discussants. The phenomenon of the discussant is still something I find a bit bizarre, as I find having a discussant adds absolutely nothing to the intellectual merit of a talk or panel in the vast majority of cases.  It also chafes a bit at my science-as-meritocracy ethos (why exactly do I need to have the talk I just sat through explained to me by some guy in a suit?).

The different flavors of discussant that I have identified include:

  1. The redundant discussant: “Author #1 said this.  Author #2 said this other thing. Author #3 said something else…” Snooze.
  2. The bitchy discussant: “The author claimed to use a Mann-Whitney U when he really used Kendall’s tau. It’s not clear why they used Coale-Demeny West 5 when a UN life table would have clearly been preferable. The assumptions of the stable model are not exactly met. And you didn’t cite me!”
  3. The pandering discussant: “In brief, this paper will change the course of human affairs.  I feel an extraordinary privilege just being in the same room as this author on this day. Hosanna.”
  4. The orthogonal discussant: “Well, we just heard a number of very interesting talks, now let me tell you about my work…”

Very rarely (so much so that it doesn’t really merit a category), a discussant does what he or she is supposed to do: synthesize and provide novel insight about how the papers in a session relate to each other. I have personally experienced all of the forms of discussant except the panderer (at least in its fullest form). I did witness a friend receive the panderer’s treatment, much to her embarrassment and, frankly, that of everyone in the room. I think it’s fair to say that everyone thought she had indeed given a very fine paper, though one that had not quite changed history. I think I actually prefer the orthogonal discussant to all the others because that way you get to see another talk rather than just hearing a bunch of [redundancy, bitchiness, pandering], which is not the best use of time at academic meetings. As anyone who has ever been to an academic meeting knows, the best use of one’s time is, as Greg notes in his post, “drink[ing] heavily with my friends, sneak[ing] off repeatedly for Mexican food, and spend[ing] most daylight hours in the publishers’ expo.” Honestly, this is one of the reasons why I’ve decided I actually like the AAAs. True, there is generally very little in the program that actually interests me. However, there are lots of people who interest me who attend. I can hang out and have long lunches and long dinners and even longer sessions drinking and talking anthropology with cool people and not feel the slightest bit of guilt at missing all those sessions! What could be better?

Measuring Epidemiological Contacts in Schools

I am happy to report that our paper describing the measurement of casual contacts within an American high school is finally out in the early edition of PNAS. Stanford’s great social science reporter, Adam Gorlick, has written a very nice overview of our paper for the Stanford Report (also here in the LA Times and here on Medical News Today). The lead author, and general force of nature behind this paper, is Marcel Salathé, who until recently was a post-doc here at Stanford in Marc Feldman’s lab. This summer, Marcel moved to the Center for Infectious Disease Dynamics at Penn State, a truly remarkable place and now all the better for having Marcel. From the Penn State end, there is a nice video describing our results as well as a brief note on Marcel’s blog. This paper has not been picked up quite like our paper on plague dynamics this summer, probably because measuring casual contacts in an American high school generally does not involve carnivorous mice.

With generous NSF funding, we were able to buy a lot of wireless sensor motes — enough to outfit every student, teacher, and staff member at a largish American high school so that we could record all of their close contacts in a single, typical day. By “close contact,” we mean any more-or-less face-to-face interaction within a radius of three meters.  As Marcel was putting together this project, we were (once again) exceptionally lucky to find ourselves at Stanford along with one of the world authorities on wireless sensor technology, Phil Levis, of Stanford’s Computer Science department.  Phil and his students, Maria and Jung Woo Lee, made this work come together in ways that I can’t even begin to fathom.  This actually leads me to a brief diversion to reflect on the nature of collaboration.  As with our plague paper or SIV mortality paper, this paper is one where collaboration between very different types of researchers (viz., Biologists, Computer Scientists, Anthropologists) is absolutely fundamental to the success of the work.  In coming up for tenure — and generally living in an anthropology department — the question of what I might call the partible paternity of papers (PPP) comes up fairly regularly. “I see you have a paper with five co-authors; I guess that means you contributed 17% to this paper, no?”  Well, no, actually.  I call this the “additive fallacy of collaboration.” When a paper is truly collaborative, then the contributions of the paper are not mutually exclusive from each other and so do not simply sum.  To use a familiar phrase, the whole is greater than the sum of the parts.  Our current paper is an example of such a truly collaborative project.  Without the contributions of all the collaborators, it’s not that the paper would be 17% less complete; it probably wouldn’t exist. 
I can’t speak particularly fluently to what Phil, Maria, and Jung Woo did other than by saying, “wow” (thus our collaboration), but I can say that we couldn’t have done it without them.

I’ll talk more about our actual results later.  For now, you’ll either have to read the paper (which is open access), watch the video, or read the overview in the Stanford Report.

The Igon Value Problem

Priceless. Steve Pinker wrote a spectacular review of Malcolm Gladwell’s latest book, What the Dog Saw and Other Adventures, in the New York Times today. I regularly read and enjoy Gladwell’s essays in the New Yorker, but I find his style sometimes problematic, verging on anti-intellectual, and I’m thrilled to see a scientist of Pinker’s stature calling him out.

Pinker coins a term for the problem with Gladwell’s latest book and his work more generally. Pinker’s term, “The Igon Value Problem,” is a clever play on the Eigenvalue Problem in mathematics. You see, Gladwell apparently quotes someone referring to an “igon value.” This is clearly a concept he never dealt with himself even though it is a ubiquitous tool in the statistics and decision science about which Gladwell is frequently so critical. According to Pinker, the Igon Value Problem occurs “when a writer’s education on a topic consists in interviewing an expert,” leading him or her to offer “generalizations that are banal, obtuse or flat wrong.” In other words, the Igon Value Problem is one of dilettantism. Now, this is clearly a constant concern for any science writer, who has the unenviable task of rendering extremely complex and frequently quite technical information down to something that is simultaneously accurate, understandable, and interesting. However, when the bread and butter of one’s work involves criticizing scientific orthodoxy, it seems like one needs to be extremely vigilant to get the scientific orthodoxy right.

Pinker raises the extremely important point that the decisions we make using the formal tools of decision science (and cognate fields) represent solutions to the inevitable trade-offs between information and cost.  This cost can take the form of financial cost, time spent on the problem, or computational resources, to name a few. Pinker writes:

Improving the ability of your detection technology to discriminate signals from noise is always a good thing, because it lowers the chance you’ll mistake a target for a distractor or vice versa. But given the technology you have, there is an optimal threshold for a decision, which depends on the relative costs of missing a target and issuing a false alarm. By failing to identify this trade-off, Gladwell bamboozles his readers with pseudoparadoxes about the limitations of pictures and the downside of precise information.
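Pinker's point about the optimal threshold can be made concrete with the textbook equal-variance Gaussian model from signal detection theory. The following Python sketch rests on assumptions that are mine, not Pinker's or Gladwell's: the evidence is a single number drawn from N(0, 1) for a distractor and N(d', 1) for a target, and the function and parameter names are invented for illustration. Under that model, the cost-minimizing cutoff is d'/2 plus a correction that depends only on the prior odds and the ratio of the two error costs.

```python
import math


def optimal_threshold(d_prime, p_target, cost_miss, cost_false_alarm):
    """Evidence cutoff that minimizes expected cost.

    Assumes noise ~ N(0, 1) and target ~ N(d_prime, 1); we call
    "target" whenever the observed evidence exceeds the cutoff.
    """
    # Likelihood-ratio criterion: prior odds against a target,
    # weighted by the relative cost of the two kinds of error.
    beta = ((1 - p_target) * cost_false_alarm) / (p_target * cost_miss)
    return d_prime / 2 + math.log(beta) / d_prime


def normal_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1 + math.erf(x / math.sqrt(2)))


def expected_cost(threshold, d_prime, p_target, cost_miss, cost_false_alarm):
    """Expected cost per decision for a given cutoff."""
    p_miss = normal_cdf(threshold - d_prime)   # target falls below cutoff
    p_false_alarm = 1 - normal_cdf(threshold)  # noise falls above cutoff
    return (p_target * cost_miss * p_miss
            + (1 - p_target) * cost_false_alarm * p_false_alarm)


# Same detector (d' = 2), two different cost structures.
balanced = optimal_threshold(2.0, 0.5, cost_miss=1.0, cost_false_alarm=1.0)
misses_costly = optimal_threshold(2.0, 0.5, cost_miss=10.0, cost_false_alarm=1.0)
print("cutoff, equal costs:  ", balanced)
print("cutoff, costly misses:", misses_costly)
```

The detector itself (d') is identical in both calls; only the costs change, and the best cutoff moves with them. That is the trade-off Pinker describes: given the technology you have, where you draw the line depends on what each kind of mistake costs you.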

Pinker is particularly critical of an analogy Gladwell draws in one of his essays between predicting the success of future teachers and future professional quarterbacks. Both are difficult decision tasks fraught with uncertainty. Predicting whether an individual will be a quality teacher based on his or her performance on standardized tests or the presence or absence of teaching credentials is an imperfect process, just as predicting the success of a quarterback in the N.F.L. based on his performance at the collegiate level is. Gladwell argues that anyone with a college degree should be allowed to teach and that the determination of the qualification for the job beyond the college degree should only be made after they have taught. This solution, he argues, is better than the standard practice of credentialing, evaluating, and “going back and looking for better predictors.” You know, science? Pinker doesn’t hold back in his evaluation of this logic:

But this “solution” misses the whole point of assessment, which is not clairvoyance but cost-effectiveness. To hire teachers indiscriminately and judge them on the job is an example of “going back and looking for better predictors”: the first year of a career is being used to predict the remainder. It’s simply the predictor that’s most expensive (in dollars and poorly taught students) along the accuracy-cost trade-off. Nor does the absurdity of this solution for professional athletics (should every college quarterback play in the N.F.L.?) give Gladwell doubts about his misleading analogy between hiring teachers (where the goal is to weed out the bottom 15 percent) and drafting quarterbacks (where the goal is to discover the sliver of a percentage point at the top).

This evaluation is spot-on. As a bit of an aside, the discussion of predicting the quality of prospective quarterbacks also reminds me of one of the great masterpieces of statistical science, and the approach described in that paper certainly has a bearing on the types of predictive problems on which Gladwell ruminates. In a 1975 paper, Brad Efron and Carl Morris present a method for predicting 18 major league baseball players’ 1970 season batting averages based on their first 45 at-bats. The naïve method for predicting (no doubt, the approach Gladwell’s straw “we” would take) is simply to use each player’s average after the first 45 at-bats. Turns out, there is a better way to solve the problem, in the sense that you can make more precise predictions (though hardly clairvoyant). The method turns on what a Bayesian would call “exchangeability.” Basically, the idea is that being a major league baseball player buys you a certain base prediction for the batting average. So if we combine the average across the 18 players with each individual’s average in a weighted manner, we can make a prediction that has less variation in it. A player’s average after a small number of at-bats is a reflection of his abilities but also of lots of forces that are out of his control, i.e., chance. Thus, the uncertainty we have in a player’s batting based on this small record is partly due to the inherent variability in his performance but also due to sampling error. By pooling across players, we borrow strength and remove some of this sampling error, allowing us to make more precise predictions. This approach is lucidly discussed in great detail in my colleague Simon Jackman’s new book, draft chapters of which we used when we taught our course on Bayesian statistical methods for the social sciences.
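
For the curious, the weighted combination can be sketched in a few lines. This is a simplified James-Stein-style shrinkage toward the grand mean, not a reproduction of Efron and Morris’s analysis (they work on a transformed scale, which I skip here). The hit counts are made up for illustration and are not the 1970 data; the function name and the binomial approximation to the sampling variance are likewise my assumptions.

```python
def shrink_toward_mean(averages, at_bats):
    """Shrink each early-season average toward the group mean.

    A simplified James-Stein-style estimator (an illustrative sketch):
    the sampling variance of one average is approximated by the binomial
    formula p * (1 - p) / n, evaluated at the grand mean.
    Returns the shrunken averages and the shrinkage weight c.
    """
    k = len(averages)
    if k < 4:
        raise ValueError("shrinking toward the grand mean needs at least 4 players")
    grand_mean = sum(averages) / k
    sampling_var = grand_mean * (1.0 - grand_mean) / at_bats
    spread = sum((y - grand_mean) ** 2 for y in averages)
    if spread == 0.0:
        # Every player has the same average, so there is nothing to shrink.
        return list(averages), 0.0
    # c near 1 trusts each player's own record; c near 0 trusts the group.
    c = max(0.0, 1.0 - (k - 3) * sampling_var / spread)
    return [grand_mean + c * (y - grand_mean) for y in averages], c


if __name__ == "__main__":
    # Hypothetical hit counts out of 45 at-bats (NOT the Efron-Morris data).
    hits = [18, 16, 14, 12, 11, 10, 9, 7]
    raw = [h / 45 for h in hits]
    shrunk, c = shrink_toward_mean(raw, 45)
    print(f"shrinkage weight c = {c:.3f}")
    for r, s in zip(raw, shrunk):
        print(f"raw {r:.3f} -> shrunken {s:.3f}")
```

The hot starters get pulled down and the slow starters get pulled up, each by the same proportion. How hard they are pulled depends on how much of the spread among players looks like sampling error.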

Teacher training and credentialing can be thought of as strategies for ensuring exchangeability in teachers, aiding the prediction of teacher performance. I am not an expert, but it seems like we have a long way to go before we can make good predictions about who will become an effective teacher and who will not. This doesn’t mean that we should stop trying.

Janet Maslin, in her review of What the Dog Saw, waxes enthusiastic about Gladwell’s scientific approach to his essays. She writes that the dispassionate tone of his essays “tames visceral events by approaching them scientifically.” I fear that this sentiment, like the statements made in so many Gladwell works, reflects the great gulf between most educated Americans and the realities of scientific practice (we won’t even talk about the gulf between less educated Americans and science). Science is actually a passionate, messy endeavor, and sometimes we really do get better by going back and finding better predictors.