Thoughts on Black Swans and Antifragility

I have recently read the latest book by Nassim Nicholas Taleb, Antifragile. I read his famous The Black Swan a while back while in the field and wrote lots of notes. I never got around to posting those notes since they were quite telegraphic (and often not even electronic!), as they were written in the middle of the night while fighting insomnia under mosquito netting. The publication of his latest, along with the time afforded by my holiday displacement, gives me an excuse to formalize some of these notes here. Like Andy Gelman, I have so many things to say about this work on so many different topics that this will be a bit of a brain dump.

Taleb’s work is quite important for my thinking on risk management and human evolution so it is with great interest that I read both books. Nonetheless, I find his works maddening to say the least. Before presenting my critique, however, I will pay the author as big a compliment as I suppose can be made. He makes me think. He makes me think a lot, and I think that there are some extremely important ideas in his writings. From my rather unsystematic readings of other commentators, this seems to be a pretty common conclusion about his work. For example, Brown (2007) writes in The American Statistician, “I predict that you will disagree with much of what you read, but you’ll be smarter for having read it. And there is more to agree with than disagree. Whether you love it or hate it, it’s likely to change public attitudes, so you can’t ignore it.” The problem is that I am so distracted by all the maddening bits that I regularly nearly miss the ideas, and it is the ideas that are important. There is so much ego and so little discipline on display in his books, The Black Swan and Antifragile.

Some of these sentiments have been captured in Michiko Kakutani’s excellent review of Antifragile. There are some even more hilarious sentiments communicated in Tom Bartlett’s non-profile in the Chronicle of Higher Education.

I suspect that if Taleb and I ever sat down over a bottle of wine, we would not only have much to discuss but we would find that we are annoyed – frequently to the point of apoplexy – by the same people. Nonetheless, one of the most frustrating things about reading his work is the absurd stereotypes he deploys and the broad generalizations he uses to dismiss the work of just about any academic researcher. His disdain for academic research interferes with his ability to make a cogent critique. Perhaps I have spent too much time at Stanford, where the nerd is glorified, but, among other things, I find his pejorative use of the term “nerd” for people like Dr. John, as contrasted with man-of-his-wits Stereotyped, I mean, Fat Tony, off-putting and rather behind the times. Gone are the days when being labeled a nerd is a devastating put-down.

My reading of Taleb’s critiques of prediction and risk management is that the primary problem is hubris. Is there anything fundamentally wrong with risk assessment? I am not convinced there is, and there are quite likely substantial benefits to systematic inquiry. The problem is that the risk assessment models become reified into a kind of reality. I warn students – and try to regularly remind myself – never to fall in love with one’s own model. Something that many economists and risk modelers do is start to believe that their models are something more real than heuristic. George Box’s adage has become a bit cliche but nonetheless always bears repeating: all models are wrong, but some are useful. We need to bear in mind the wrongness of models without dismissing their usefulness.

One problem about both projection and risk analysis, that Taleb does not discuss, is that risk modelers, demographers, climate scientists, economists, etc. are constrained politically in their assessments. The unfortunate reality is that no one wants to hear how bad things can get and modelers get substantial push-back from various stakeholders when they try to account for real worst-case scenarios.

There are ways of building in more extreme events than have been observed historically (Westfall and Hilbe (2007), e.g., note the use of extreme-value modeling). I have written before about the ideas of Martin Weitzman in modeling the disutility of catastrophic climate change. While he may be a professor at Harvard, my sense is that his ideas on modeling the risks of catastrophic climate change are not exactly mainstream. There is the very tangible evidence that no one is rushing out to mitigate the risks of climate change despite the fact that Weitzman’s model makes it pretty clear that it would be prudent to do so. Weitzman uses a Bayesian approach which, as noted by Westfall and Hilbe, is a part of modern statistical reasoning that was missed by Taleb. While beyond the scope of this already hydra-esque post, briefly, Bayesian reasoning allows one to combine empirical observations with prior expectations based on theory, prior research, or scenario-building exercises. The outcome of a Bayesian analysis is a compromise between the observed data and prior expectations. By placing non-zero probability on extreme outcomes, a prior distribution allows one to incorporate some sense of a black swan into expected (dis)utility calculations.
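
To make the idea of a Bayesian compromise concrete, here is a minimal sketch in Python. It is not Weitzman’s model: it assumes a textbook normal prior and normal data with known variance, and every number in it (the prior mean, the loss values, the 1% catastrophe probability) is hypothetical and chosen only for illustration. The first function shows the posterior mean landing between the prior expectation and the observed average; the second shows how a small non-zero probability on a catastrophic outcome can dominate an expected-loss calculation.

```python
def posterior_normal(prior_mean, prior_var, data_mean, data_var, n):
    """Posterior mean and variance for a normal mean (data variance assumed known).

    The posterior mean is a precision-weighted average of the prior mean
    and the sample mean, i.e. a compromise between the two.
    """
    prior_precision = 1.0 / prior_var
    data_precision = n / data_var
    post_var = 1.0 / (prior_precision + data_precision)
    post_mean = post_var * (prior_precision * prior_mean + data_precision * data_mean)
    return post_mean, post_var


def expected_loss(p_catastrophe, ordinary_loss, catastrophic_loss):
    """Expected disutility when a catastrophe has probability p_catastrophe."""
    return (1.0 - p_catastrophe) * ordinary_loss + p_catastrophe * catastrophic_loss


# Hypothetical numbers: theory suggests 3.0, four observations average 1.0.
post_mean, post_var = posterior_normal(
    prior_mean=3.0, prior_var=4.0, data_mean=1.0, data_var=4.0, n=4
)
print("posterior mean:", post_mean, "posterior variance:", post_var)

# Hypothetical losses: ignoring the black swan versus giving it a 1% prior weight.
print("loss with zero weight on catastrophe:", expected_loss(0.0, 1.0, 1000.0))
print("loss with 1% weight on catastrophe:", expected_loss(0.01, 1.0, 1000.0))
```

With more observations the data precision grows and the posterior moves toward the data, while the second calculation shows why the choice of prior weight on extreme outcomes matters so much for the policy conclusion.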

Nor does the existence of black swans mean that planning is useless. By their very definition, black swans are rare – though highly consequential – events. Does it not make sense to have a plan for dealing with the 99% of the time when we are not experiencing a black swan event? To be certain, this planning should not interfere with our ability to respond to major events but I don’t see any evidence that planning for more-or-less likely outcomes necessarily trades off with responding to unlikely outcomes.

Taleb is disdainful about explanations for why the bubonic plague didn’t kill more people: “People will supply quantities of cosmetic explanations involving theories about the intensity of the plague and ‘scientific models’ of epidemics.” (Black Swan, p. 120) Does he not understand that epidemic models are a variety of that lionized category of nonlinear processes he waxes about? He should know better. Epidemic models are not one of these false bell-curve models he so despises. Anyone who thinks hard about an epidemic process – in which an infectious individual must come in contact with a susceptible one in order for a transmission event to take place – should be able to infer that an epidemic cannot infect everyone. Epidemic models work and make useful predictions. We should, naturally, exhibit a healthy skepticism about them as we should any model. But they are an important tool for understanding and even planning.

Indeed, our understanding gained from the study of (nonlinear) epidemic models has provided us with the most powerful tools we have for control and even eradication. As Hans Heesterbeek has noted, the idea that we could control malaria by targeting the mosquito vector of the disease is one that was considered ludicrous before Ross’s development of the first epidemic model. The logic was essentially that there are so many mosquitoes that it would be absurdly impractical to eliminate them all. But the Ross model revealed that epidemics – because of their nonlinearity – have thresholds. We don’t have to eliminate all the mosquitoes to break the malaria transmission cycle; we just need to eliminate enough to bring the system below the epidemic threshold. This was a powerful idea and it is central to contemporary public health. It is what allowed epidemiologists and public health officials to eliminate smallpox and it is what is allowing us to very nearly eliminate polio if political forces (black swans?) will permit.
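
The threshold logic can be sketched in a few lines of Python. This is only an illustration under a stated assumption: that the basic reproduction number R0 is defined so that it scales in direct proportion to mosquito density, as in the standard Ross-Macdonald formulation. The function name and the R0 values are hypothetical, not estimates for any real setting.

```python
def critical_reduction(r0):
    """Fraction of the mosquito population to remove so that R0 falls to 1.

    Assumes R0 is proportional to mosquito density, so removing a fraction f
    of mosquitoes scales R0 by (1 - f). Solving r0 * (1 - f) = 1 gives
    f = 1 - 1 / r0.
    """
    if r0 <= 1.0:
        # Already at or below the epidemic threshold: no reduction needed.
        return 0.0
    return 1.0 - 1.0 / r0


# Hypothetical values of R0, for illustration only.
for r0 in (0.5, 2.0, 5.0, 20.0):
    f = critical_reduction(r0)
    print(f"R0 = {r0:5.1f}: remove {f:.0%} of mosquitoes, R0 after = {r0 * (1.0 - f):.2f}")
```

The point of the sketch is that the required fraction is always less than one: the nonlinearity means control has to be good enough, not perfect.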

Taleb’s ludic fallacy (i.e., games of chance are somehow an adequate model of randomness in the world) is great. Quite possibly the most interesting and illuminating section of The Black Swan happens on p. 130 where he illustrates the major risks faced by a casino. Empirical data make a much stronger argument than do snide stereotypes. This said, Lund (2007) makes the important point that we need to ask what exactly is being modeled in any risk assessment or projection. One of the most valuable outcomes of any formalized risk assessment (or formal model construction more generally) is that it forces the investigator to be very explicit about what is being modeled. The output of the model is often of secondary importance.

Much of the evidence deployed in his books is what Herb Gintis has called “stylized facts” and, of course, is subject to Taleb’s own critique of “hidden evidence.” Because the stylized facts are presented anecdotally, there is no way to judge what is being left out. A fair rejoinder to this critique might be that these are trade publications meant for a mass market and are therefore not going to be rich in data regardless. However, the tone of the books – ripping on economists and bankers but also statisticians, historians, neuroscientists, and any number of other professionals who have the audacity to make a prediction or provide a causal explanation – makes the need for more measured empirical claims more important. I suspect that many of these people actually believe things that are quite compatible with the conclusions of both The Black Swan and Antifragile.

On Stress

The notion of antifragility turns on systems getting stronger when exposed to stressors. But we know that not all stressors are created equal. This is where the work of Robert Sapolsky really comes into play. In his book Why Zebras Don’t Get Ulcers, Sapolsky, citing the foundational work of Hans Selye, notes that some stressors certainly make the organism stronger. Certain types of stress (“good stress”) improve the state of the organism, making it more resistant to subsequent stressors. Rising to a physical or intellectual challenge, meeting a deadline, competing in an athletic competition, working out: these are examples of good stresses. They train body, mind, and emotions and improve the state of the individual. It is not difficult to imagine that there could be similar types of good stressors at levels of organization higher than the individual too. The way the United States came together as a society to rise to the challenge of World War II and emerge as the world’s preeminent industrial power comes to mind. An important commonality of these good stressors is the time scale over which they act. They are all acute stressors that allow recovery and therefore permit the subsequently improved performance.

However, as Sapolsky argues so nicely, when stress becomes chronic, it is no longer good for the organism. The same glucocorticoids (i.e., “stress hormones”) that liberate glucose and focus attention during an acute crisis induce fatigue, exhaustion, and chronic disease when they are secreted at high levels chronically.

Any coherent theory of antifragility will need to deal with the types of stress to which systems are resistant and, importantly, which of them have a strengthening effect. Using the idea of hormesis – that a positive biological outcome can arise from taking low doses of toxins – is scientifically hokey and borders on mysticism. It unfortunately detracts from the good ideas buried in Antifragile.

I think that Taleb is on to something with the notion of antifragility but I worry that the policy implications end up being just so much orthodox laissez-faire conservatism. There is the idea that interventions – presumably by the State – can do nothing but make systems more fragile and generally worse. One area where the evidence very convincingly suggests that intervention works is public health. Life expectancy has doubled in the rich countries of the developed world from the beginning of the twentieth century to today. Many of the gains were made before the sort of dramatic things that come to mind when many people think about modern medicine. It turns out that sanitation and clean water went an awful long way toward decreasing mortality well before we had antibiotics or MRIs. Have these interventions made us more fragile? I don’t think so. The jury is still out, but it seems that reducing the infectious disease burden early in life (as improved sanitation does) has synergistic effects on later-life mortality, an effect that is mediated by inflammation.

On The Academy

Taleb drips derision throughout his work on university researchers. There is a lot to criticize in the contemporary university. However, as with so many other external critics of the university, I think that Taleb misses essential features and his criticisms end up being off base. Echoing one of the standard talking points of right-wing critics, Taleb belittles university researchers as being writers rather than doers (echoing the H.L. Mencken witticism “Those who can do; those who can’t teach”). Skin in the game purifies thought and action, a point with which I actually agree. However, thinking that university researchers live in a world lacking consequences is nonsense. Writing is skin in the game. Because we live in a quite free society – and because of important institutional protections on intellectual freedom like tenure (another popular point of criticism from the right) – it is easy to forget that expressing opinions – especially when one speaks truth to power – can be dangerous. Literally. Note that intellectuals are often the first ones to go to the gallows when there are revolutions from both the right and the left: Nazis, Bolsheviks, and Mao’s Cultural Revolution to name a few. I occasionally get, for lack of a better term, unbalanced letters from people who are offended by the study of evolution and I know that some of my colleagues get this a lot more than I. Intellectuals get regular hate mail, a phenomenon amplified by the ubiquity of electronic communication. Writers receive death threats for their ideas (think Salman Rushdie). Ideas are dangerous and communicating them publicly is not always easy, comfortable, or even safe, yet it is the professional obligation of the academic.

There are more prosaic risks that academics face that suggest to me that they do indeed have substantial skin in the game. There is a tendency for critics from outside the academy to see universities as ossified places where people who “can’t do” go to live out their lives, but the university is a dynamic place. Professors do not emerge fully formed from the ivory tower. They must be trained and promoted. This is the most obvious and ubiquitous way that what academics write has “real world” consequences – i.e., for themselves. If peers don’t like your work, you won’t get tenure. One particularly strident critic can sink a tenure case. Both the trader and the assistant professor have skin in their respective games – their continued livelihoods depend upon their trading decisions and their writing. That’s pretty real. By the way, it is a huge sunk investment that is being risked when an assistant professor comes up for tenure. Not much fun to be forty and let go from your first “real” job since you graduated with your terminal degree… (I should note that there are problems with this – it can lead to particularly conservative scholarship by junior faculty, among other things, but this is a topic for its own post.)

Now, I certainly think that there are more and less consequential things to write about. I have gotten more interested in applied problems in health and the environment as I’ve moved through my career because I think that these are important topics about which I have potentially important things to say (and, yes, do). However, I also think it is of utmost importance to promote the free flow of ideas, whether or not they have obvious applications. Instrumentally, the ability to pursue ideas freely is what trains people to solve the sort of unknown and unforecastable problems that Taleb discusses in The Black Swan. One never knows what will be relevant and playing with ideas (in the personally and professionally consequential world of the academy) is a type of stress that makes academics better at playing with ideas and solving problems.

One of the major policy suggestions of Antifragile is that tinkering with complex systems will be superior to top-down management. I am largely sympathetic to this idea and to the idea that high-frequency-of-failure tinkering is also the source of innovation. Taleb contrasts this idea of tinkering with “top-down” or “directed” research, which he argues regularly fails to produce innovations or solutions to important problems. This notion of “top-down,” “directed” research is among the worst of his various straw men and a fundamental misunderstanding of the way that science works. A scientist writes a grant with specific scientific questions in mind, but the real benefit of a funded research program is the unexpected results one discovers while pursuing the directed goals. As a simple example, my colleague Tony Goldberg has discovered two novel simian hemorrhagic viruses in the red colobus monkeys of western Uganda as a result of our big grant to study the transmission dynamics and spillover potential of primate retroviruses. In the grant proposal, we discussed studying SIV, SFV, and STLV. We didn’t discuss the simian hemorrhagic fever viruses because we didn’t know they existed! That’s what discovery means. Their not being explicitly in the grant didn’t stop Tony and his collaborators from the Wisconsin Regional Primate Center from discovering these viruses but the systematic research meant that they were in a position to discover them.

The recommendation of adaptive, decentralized tinkering in complex systems is in keeping with work in resilience (another area about which Taleb is scornful because it is the poor step-child of antifragility). Because of the difficulty of making long-range predictions that arises from nonlinear, coupled systems, adaptive management is the best option for dealing with complex environmental problems. I have written about this before here.

So, there is a lot that is good in the works of Taleb. He makes you think, even if you spend a lot of time rolling your eyes at the trite stereotypes and stylized facts that make up much of the rhetoric of his books. Importantly, he draws attention to probabilistic thinking for a general audience. Too much popular communication of science trades in false certainties and the mega-success of The Black Swan in particular has done a great service to increasing awareness among decision-makers and the reading public about the centrality of uncertainty. Antifragility is an interesting idea though not as broadly applicable as Taleb seems to think it is. The inspiration for antifragility seems to lie largely in biological systems. Unfortunately, basing an argument on general principles drawn from physiology, ecology, and evolutionary biology pushes Taleb’s knowledge base a bit beyond its limit. Too often, the analogies in this book fall flat or are simply on shaky ground empirically. Nonetheless, recommendations for adaptive management and bricolage are sensible for promoting resilient systems and innovation. Thinking about the world as an evolving complex system rather than the result of some engineering design is important and if throwing his intellectual cachet behind this notion helps it to get as ingrained into the general consciousness as the idea of a black swan has become, then Taleb has done another major service.

On Global State Shifts

This is an edited version of a post I sent out to the E-ANTH listserv in response to a debate over a recent paper in Nature and the response to it on the website “Clear Science,” written by Todd Myers. In this debate, it was suggested that the Barnosky paper is the latest iteration of alarmist environmental narratives in the tradition of the master of that genre, Paul Ehrlich. Piqued by this conversation, I read the Barnosky paper and passed along my reading of it.

Myers’s piece on the “Clear Science” web site is quite rhetorically clever. Climate-change deniers have a difficult task if they want to convincingly buck the overwhelming majority of reputable scientists on this issue. Myers uses ideas about the progress of science developed by the philosopher Thomas Kuhn in his classic book, The Structure of Scientific Revolutions. By framing Barnosky et al. as mindlessly toeing the Kuhnian normal-science line, he has come up with a shrewd strategy for dealing with the serious scientific consensus around global climate change. Myers suggests that “Like scientists blindly devoted to a failed paradigm, the Nature piece simply tries to force new data to fit a flawed concept.”

I think that a pretty strong argument can be made that the perspective represented in the Barnosky et al. paper is actually paradigm-breaking. For 200 years the reigning paradigm in the historical sciences has been uniformitarianism. Hutton’s notion — that processes that we observe today have always been working — greatly extended the age of the Earth and allowed Lyell and Darwin to make their remarkable contributions to human understanding. This same principle allows us to make sense of the archaeological record and of ethnographic experience. It is a very useful foil for all manner of exceptionalist explanatory logic and I use it frequently.

However, there are plenty of ways that uniformitarianism fails. If we wanted to follow the Kuhnian narrative, we might say that evidence has mounted that leads to increased contradictions arising from the uniformitarian explanatory paradigm. Rates of change show heterogeneities and when we are trying to understand connected systems characterized by extensive feedback, our intuitions based on gradual change can fail, sometimes spectacularly. This is actually a pretty revolutionary idea, apocalyptic popular writings aside, in mainstream science.

Barnosky et al. draw heavily on contemporary work in complex systems. The theoretical paper (Scheffer et al. 2009) upon which the Barnosky paper relies heavily represents a real step forward in the theoretical sophistication of this corpus and does so by making unique and testable predictions about systems approaching critical transitions. I have written about it previously here.

The most difficult part of projecting the future state of complex systems is the human element. This leads too many physical and biological scientists to simply ignore social and behavioral inputs. This said, there are far too few social and behavioral scientists willing to step up and do the hard collaborative work necessary to make progress on this extremely difficult problem. The difficulty of projecting human behavior often leads to projections of the business-as-usual variety and, unfortunately, these are often mischaracterized by the media and other readers. Such projections simply assume no change in behavior and look at the consequences some time down the line. A business-as-usual projection actually provides a lot of information, albeit about a very hypothetical future. What if things stayed the way they are? Yes, behavior changes. People adapt. Agricultural production becomes more efficient. Prices increase, reducing demand and allowing sustainable substitutes. Of course, sometimes things get worse too. Despite tremendous global awareness and lots of calls to reduce greenhouse gas emissions, carbon emissions have continued to rise. So, there is nothing inherently flawed about a business-as-usual projection. We just need to be clear about what it means when we use one.

A criticism that emerged on the list is that Barnosky et al. is essentially “an opinion piece.” However, the great majority of the Barnosky et al. paper is, in fact, simply a review. There are numerous facts to be reviewed: biodiversity has declined, fisheries have crashed, massive amounts of forest have been converted and degraded, the atmosphere has warmed. They are facts. And they are facts in which many vested interests would like to sow artificial uncertainty for political purposes. Positive things have happened too (e.g., malaria eradication in temperate climes, increased food security in some places that used to be highly insecure, increased agricultural productivity, though this may be of dubious sustainability), though these are generally on more local scales and, in some cases, may simply reflect exporting the problems of rich countries to the Global South. The fact that they are not reviewed does not mean that the paper belongs in an hysterical chicken-little genre.

A common critique of the doomsday genre is the certainty with which the horrible outcomes are framed. The Barnosky paper is suffused with uncertainty. In fact, this is the main point I take away from it! The first conclusion of the paper is that “it is essential to improve biological forecasting by anticipating critical transitions that can emerge on a planetary scale and understanding how such global forcings cause local changes.” This suggests to me that the authors are acknowledging massive uncertainty about the future, not saying that we are doomed with certainty. Or how about: “the plausibility of a future planetary state shift seems high, even though considerable uncertainty remains about whether it is inevitable and, if so, how far in the future it may be”?

Myers writes “they base their conclusions on the simplest linear mathematical estimate that assumes nothing will change except population over the next 40 years. They then draw a straight line, literally, from today to the environmental tipping point.” This is a profoundly misleading statement. Barnosky et al. are using the fold catastrophe model discussed in Scheffer et al. (2009). The Scheffer et al. analysis of the fold catastrophe model uses some fairly sophisticated ideas from complex systems theory, but the ideas are relatively simple. The straight line that so offends Myers arises because this is the direction of the basin of attraction. In the figure below, I show the fold-catastrophe model. The abscissa represents the forcing conditions of the system (e.g., population size or greenhouse gas emissions). The ordinate represents the state of the system (e.g., land cover or one of many ecosystem services). The sideways N represents an attractor – a more general notion of an equilibrium. The state of the system tends toward this curve whenever it is perturbed away.

The region in the interior of the fold (indicated by the dashed line) is unstable while the upper and lower tails (indicated by solid lines) are stable and tend to draw perturbations from the attractor toward them. The grey arrows indicate the basin of attraction. When the system is perturbed off of the attractor by some random shock, the state tends to move in the direction indicated by the arrow. When the state is forced all the way down the top arc of the fold, it enters a region where a relatively small shock can send the state into a qualitatively different regime of rapid degradation. This is illustrated by the black arrow (a shock) pushing the state away from point F2. The state will settle again on the attractor, but a second shock will send the state rapidly down toward the bottom arm of the fold (point F1). Note that this region of the attractor is stable so it would take a lot of work to get it back up again (e.g., reduce population or drastically reduce total greenhouse gases). This is what people mean when they colloquially refer to a “global tipping point.”
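
For readers who want to see the hysteresis for themselves, here is a minimal Python sketch of a fold catastrophe. It is not the model in Barnosky et al. or Scheffer et al. (2009): I am assuming the simplest textbook cubic, dx/dt = x - x^3 - c, where c plays the role of the forcing (the abscissa) and x the state of the system (the ordinate). The function names, step sizes, and forcing values are all hypothetical choices made for illustration.

```python
def settle(x, forcing, dt=0.01, steps=5000):
    """Let the state relax toward the attractor at a fixed forcing (Euler steps).

    Assumed toy dynamics: dx/dt = x - x**3 - forcing.
    """
    for _ in range(steps):
        x += dt * (x - x**3 - forcing)
    return x


def sweep(forcings, x0):
    """Change the forcing step by step, carrying the state along."""
    x = x0
    states = []
    for c in forcings:
        x = settle(x, c)
        states.append(x)
    return states


# Ramp the forcing up from 0.0 to 0.6, then back down to 0.0.
up = [i / 10 for i in range(7)]
down = list(reversed(up))

states_up = sweep(up, x0=1.0)              # start on the upper (healthy) branch
states_down = sweep(down, x0=states_up[-1])  # start where the ramp-up ended

for c, x in zip(up, states_up):
    print(f"forcing rising,  c = {c:.1f}: state = {x:+.2f}")
for c, x in zip(down, states_down):
    print(f"forcing falling, c = {c:.1f}: state = {x:+.2f}")
```

In this toy version the state sits on the upper branch while the forcing rises, drops to the lower branch once the forcing passes the fold (near c = 0.385 for this particular cubic), and stays on the lower branch even when the forcing is returned all the way to zero. That asymmetry is the sense in which recovery takes far more work than the original collapse.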

This is the model. It may not be right, but thanks to Scheffer et al. (2009), it makes testable predictions. By framing global change in terms of this model, Barnosky et al. are making a case for empirical investigation of the types of data that can falsify the model. Maybe because of the restrictions placed on them by Nature (and these are severe!), maybe because of some poor choices of their own, they include an insufficiently explained, fundamentally complex figure that a critic with clear interests in muddying the scientific consensus can seize on to dismiss the whole paper as just more Ehrlich-style hysteria.

For me – as I suspect for the authors of the Barnosky et al. paper – massive, structural uncertainty about the state of our planet, coupled with a number of increasingly well-supported models of the behavior of nonlinear systems (i.e., not simply normal science) strongly suggests a precautionary principle. This is something that the economist Marty Weitzman suggested in his (highly technical and therefore not widely read) paper in 2009 and that I have written about before here and here. This is not inflammatory fear-mongering, nor is it grubbing for grant money (I wish it were that easy!). It is responsible scientists doing their best to communicate the state of the science within the constraints of society and the primary mode of scientific communication. Let’s not be taken in by writers pretending to present “just the facts” in a cool, detached manner but who actually have every reason to try to foment unnecessary uncertainty about the state of our world and impugn the integrity of people doing their level best to understand a rapidly changing planet.

References

Kuhn, T. 1962. The Structure of Scientific Revolutions. Chicago: University of Chicago Press.

Scheffer, M., J. Bascompte, W. A. Brock, V. Brovkin, S. R. Carpenter, V. Dakos, H. Held, E. H. van Nes, M. Rietkerk, and G. Sugihara. 2009. Early-Warning Signals for Critical Transitions. Nature. 461 (7260):53-59.

Weitzman, M. L. 2009. On Modeling and Interpreting the Economics of Catastrophic Climate Change. The Review of Economics and Statistics. 91 (1):1-19.

Three Questions About Norms

Well, it certainly has been a while since I’ve written anything here. Life has gotten busy with new projects, new responsibilities, etc. Yesterday, I participated in a workshop on campus sponsored by the Woods Institute for the Environment, the Young Environmental Scholars Conference. I was asked to stand in for a faculty member who had to cancel at the last minute. I threw together some rather hastily written notes and figured I’d share them here (especially since I spoke quite a bit about the importance of public communication!).

The theme of the conference was “Environmental Policy, Behavior, and Norms” and we were asked to answer three questions: (1) What does doing normative research mean to you? (2) How do your own norms and values influence your research? (3) What room and role do you see for normative research in your field? So, in order, here are my answers.

What does doing normative research mean to you?

I actually don’t particularly like the term “normative research” because it sounds a little too much like imposing one’s values on other people. I am skeptical of the imposition of norms that have more to do with (often unrecognized) ideology and less to do with empirical truth – an idea that was later reinforced by a terrific concluding talk by Debra Satz. If I can define “normative” to mean with the intent to improve people’s lives, then OK. Otherwise, I prefer to do “positive” research.

For me, normative research is about doing good science. As a biosocial scientist with broad interests, I wear a lot of hats. I have always been interested in questions about the natural world, and (deep) human history in particular. However, I find that the types of questions that really hold my interest these days are more and more engaged in the substantial challenges we face in the world with inequality and sustainability. In keeping with my deep pragmatist sympathies, I increasingly identify with Charles Sanders Peirce’s idea that given the “great ocean of truth” that can potentially be uncovered by science, there is a moral burden to do things that have social value. (As an aside, I think that there is social value in understanding the natural world, so I don’t mean to imply a crude instrumentalism here.) In effect, there is a lot of cool science to be done; one may as well do something of relevance. I personally have little patience for people who pursue racist or otherwise socially divisive agendas and cloak their work in a veil of free scientific inquiry. This said, I worry when advocacy interferes with intellectual fairness or leads to an unwillingness to accept that one’s position is not actually true.

I think that we are fooling ourselves if we believe that our norms somehow don’t have an effect on our research. Recognizing the norms that shape your research – whether implicitly or explicitly – helps you manage your bias. Yes, I said manage. I’m not sure we can ever completely eliminate it. I see this as the management of a necessary trade-off, by analogy with a classic problem in statistics: the trade-off between bias and variance. The more biased one is, the less variance there is in the outcome of one’s investigation. The less bias, the greater the likelihood that results will differ from one’s expectations (or wishes). Recognizing how norms shape our research also deals with that murky area of pre-science: where do our ideas for what to study come from?

How do your own norms and values influence your research?

Some of the norms that shape my own research and teaching include:

transparency: science works best when it is open. This places a premium on sharing data, methods, and communicating results in a manner that maximizes access to information. As a simple example, this norm shapes my belief that we should not train students from poor countries in the use of proprietary software (and other technologies) that they won’t be able to afford when they return to their home countries when there are free or otherwise open-source alternatives.

fairness: this naturally includes a sense of social justice or people playing on an equal playing field, but it also includes fairness to different ideas, alternative hypotheses, the possibility that one is wrong. This type of fairness is essential for one’s credibility as a public intellectual in science (particularly supporting policy), as noted eloquently in this interview with Dick Lewontin.

respect for people’s ultimate rationality: Trying to understand the social, ecological, and economic context of people’s decision-making, even if it violates our own normative – particularly market-based economic – expectations.

flexibility: solving real problems means that we need to be flexible in our approach, willing to go where the solutions lead us, learning new tools and collaborating. Flexibility also means a willingness to give up on a research program that is doing harm.

good-faith communication: I believe that there is no room for obscurantism in the academy of the 21st century. This includes public communication. There are, of course, complexities here with regard to the professional development of young scholars. One of the key trade-offs for young scholars is between the need for professional advancement (which comes from academic production) and activism, policy, and public communication. Within the elite universities, the reality is that neither public communication nor activism counts much for tenure. However, as Jon Krosnick noted, tenure is a remarkable privilege and, while it may seem impossibly far away for a student just finishing a Ph.D., it’s not really. Once you prove that you have the requisite disciplinary chops, you have plenty of time to use tenure for what it is designed for (i.e., protecting intellectual freedom) and to engage in critical public debate and communication.

humility: solving problems (in science and society) means caring more about the answer to a problem than one’s own pet theory. Humility is intimately related to respect for others’ rationality. It also means recognizing the inherently collaborative nature of contemporary science: giving credit where it is due, seeking help when one is in over one’s head, etc. John DeGioia, President of Georgetown University, quoted St. Augustine in his letter of support for Georgetown Law student Sandra Fluke against the crude attacks by radio personality Rush Limbaugh, and I think those words are quite applicable here as well. Augustine implored his interlocutors to “lay aside arrogance” and to “let neither of us assert that he has found the truth; let us seek it as if it were unknown to both.” This is not a bad description of the way that science really should work.

What room and role do you see for normative research in your field?

I believe that there is actually an enormous amount of room for normative research, if by “normative research,” we mean research that has the potential to have a positive effect on people’s lives. If instead we mean imposing values on people, then I am less sure of its role.

Anthropology is often criticized from outside the field and, to a lesser extent, from within it, for being overly politicized. You can see this in Nicholas Wade’s critical pieces in the New York Times Science Times section following the American Anthropological Association executive committee’s excision of the word “science” from the field’s long-range planning document. Wade writes,

The decision [to remove the word ‘science’ from the long-range planning document] has reopened a long-simmering tension between researchers in science-based anthropological disciplines — including archaeologists, physical anthropologists and some cultural anthropologists — and members of the profession who study race, ethnicity and gender and see themselves as advocates for native peoples or human rights.

This is a common sentiment. And it is a complete misunderstanding. It suggests that scientists can’t be advocates for native peoples or human rights. It also suggests that one can’t study race, ethnicity, or gender from a scientific perspective. Both these ideas are complete nonsense. For all the leftist rhetoric, I am not impressed with the actual political practice of what I see in contemporary anthropology. There is plenty of posturing about power asymmetries and identity politics, but it is always done in such mind-numbingly opaque language and with no apparent practical tie-in to policies that make people’s lives better. And, of course, there is the outright disdain for “applied” work one sees in elite anthropology departments.

Writing specifically about Foucault, Chomsky captured my take on this whole mode of intellectual production:

The only way to understand [the mode of scholarship] is if you are a graduate student or you are attending a university and have been trained in this particular style of discourse. That’s a way of guaranteeing…that intellectuals will have power, prestige and influence. If something can be said simply, say it simply, so that the carpenter next door can understand you. Anything that is at all well understood about human affairs is pretty simple.

Ultimately, the simple truths about human affairs that I find anyone can relate to are subsistence, health, and the well-being of one’s children. These are the themes at the core of my own research and I hope that the work I do ultimately can effect some good in these areas.

Guess What: Food Prices Still Near All-Time Highs

The FAO Food Price Index (FPI) remains near record highs, and this at a time when record droughts and calamitous famine threaten the Horn of Africa. Using the latest data from the FAO FPI page, I plot here the FPI time series from 1990 to 2011.

FAO Food Price Index time series, 1990 to 2011.

World food prices are high and have remained so since the beginning of this year, though there have been some pretty dramatic swings between 2008 and now.  There is some argument that the real problem for poverty alleviation is actually price volatility and not high prices per se.  However, a recent paper in Foreign Affairs by Barrett and Bellemare argues that the problem for the world’s poor is really high prices (a more complete working paper can be found here). I find their arguments quite persuasive. Among these, the authors wryly note “Perhaps not coincidentally, [commentators’ and politicians’] emphasis on tempering price volatility favors the same large farmers who already enjoy tremendous financial support from G-20 governments.”

Nicholas Wade on Science and Anthropology

Nicholas Wade, who normally writes really terrific stuff on science in the New York Times, has a brief piece on our Anthropology fracas du jour. It’s good to see an expression of concern for the place of science in anthropology in such a prominent place and by such an important science writer.  I just wish he had gotten a few more things right.  While the Darkness in El Dorado fiasco was not a high point for the AAA, I suspect that this had not one iota to do with the re-wording of AAA’s long-range planning document. Secondly, I was pretty horrified to learn that science can’t be used as a framework for studying gender, ethnicity, and race, nor, apparently, can scientists advocate for indigenous people’s or human rights:

The decision [to remove the word ‘science’ from the long-range planning document] has reopened a long-simmering tension between researchers in science-based anthropological disciplines — including archaeologists, physical anthropologists and some cultural anthropologists — and members of the profession who study race, ethnicity and gender and see themselves as advocates for native peoples or human rights.

I think that this will come as quite an unpleasant surprise to many fine scientific anthropologists who are apparently fooling themselves by attempting to understand race or gender or working to improve the lives of the people with whom they work.

So, I’m left with mixed feelings about this turn of events.  On the one hand, the prominence of a Science Times piece by Nicholas Wade means that debate is likely to continue for a while to come. It would be particularly helpful if this work helped engage what I suspect is a quiet majority of anthropologists who are (1) sympathetic to science maintaining a prominent place in anthropology, and (2) too busy with their work to worry about yet another shrill controversy in the professional society they may or may not belong to (having given up membership because they already felt it didn’t represent their interests). On the other hand, I think we’re going to need to stop being inflammatory and falling back on facile received categories (e.g., “postmodernists,” “sociobiologists,” etc.) at every opportunity if we are going to make this debate productive and fashion a society that is friendly to rigorous scholarship in whatever form it may take. For my part, I am sticking with my view that the best way to promote science in anthropology is to do it, do it well, and communicate with a broad scientific readership.

Back to grading my final exams…

Uncertainty and Fat Tails

A major challenge in science writing is how to effectively communicate real, scientific uncertainty. Sometimes we just don’t have enough information to make accurate predictions. This is particularly problematic in the case of rare events in which the potential range of outcomes is highly variable. Two topics that are close to my heart come to mind immediately as examples of this problem: (1) understanding the consequences of global warming and (2) predicting the outcome of the emerging A(H1N1) “swine flu” influenza-A virus.

Harvard economist Martin Weitzman has written about the economics of catastrophic climate change (something I have discussed before). When you want to calculate the expected cost or benefit of some fundamentally uncertain event, you basically take the probabilities of the different outcomes and multiply them by the utilities (or disutilities) and then sum them. This gives you the expected value across your range of uncertainty. Weitzman has noted that we have a profound amount of structural uncertainty (i.e., there is little we can do to become more certain on some of the central issues) regarding climate change. He argues that this creates “fat-tailed” distributions of the climatic outcomes (i.e., the disutilities in question). That is, extreme outcomes (read: end of the world as we know it) have a probability that, while low, isn’t as low as might make us comfortable.
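To make the expected-value calculation concrete, here is a minimal sketch in Python. The outcome probabilities and disutilities are invented purely for illustration (they are not Weitzman's numbers), and the function name is my own. The only difference between the two scenarios is that the fat-tailed one moves a little probability from the middling outcome to the catastrophic one.

```python
# Hypothetical outcomes as (probability, disutility in arbitrary cost units).
# All numbers are made up for illustration; they come from no real analysis.
thin_tail = [(0.90, 1.0), (0.099, 10.0), (0.001, 1000.0)]
fat_tail = [(0.90, 1.0), (0.09, 10.0), (0.01, 1000.0)]


def expected_value(outcomes):
    """Sum of probability * (dis)utility over a discrete set of outcomes."""
    total_probability = sum(p for p, _ in outcomes)
    if abs(total_probability - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to 1")
    return sum(p * u for p, u in outcomes)


print("thin tail:", expected_value(thin_tail))
print("fat tail: ", expected_value(fat_tail))
```

Shifting less than one percentage point of probability into the catastrophic outcome roughly quadruples the expected cost in this toy example, which is the sense in which the tail can dominate the calculation.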

A very similar set of circumstances besets predicting the severity of the current outbreak of swine flu.  There is a distribution of possible outcomes.  Some have high probability; some have low.  Some are really bad; some less so.  When we plan public health and other logistical responses we need to be prepared for the extreme events that are still not impossibly unlikely.

So we have some range of outcomes (e.g., the number of degrees C that the planet warms in the next 100 years or the number of people who become infected with swine flu in the next year) and we have a measure of probability associated with each possible value in this range. Some outcomes are more likely and some are less. Rare events are, by definition, unlikely but they are not impossible. In fact, given enough time, most rare events are inevitable. From a predictive standpoint, the problem with rare events is that they’re, well, rare. Since you don’t see rare events very often, it’s hard to say with any certainty how likely they actually are. It is this uncertainty that fattens up the tails of our probability distributions. Say there are two rare events. One has a probability of 10^{-6} and the other has a probability of 10^{-9}. The latter is certainly much more rare than the former. You are nonetheless very, very unlikely to ever witness either event, so how can you make any judgement that the one is 1000 times more likely than the other?
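A quick back-of-the-envelope calculation shows why the data can't help much here. The sketch below assumes independent trials and a made-up sample of 10,000 observations (both are my assumptions for illustration), and asks how likely we are to see each event at least once.

```python
import math


def prob_at_least_one(p, n):
    """Chance that an event with per-trial probability p occurs at least
    once in n independent trials, i.e. 1 - (1 - p)**n, computed with
    log1p/expm1 so that tiny probabilities are not lost to rounding."""
    return -math.expm1(n * math.log1p(-p))


n_observations = 10_000  # hypothetical sample size
for p in (1e-6, 1e-9):
    chance = prob_at_least_one(p, n_observations)
    print(f"p = {p:g}: chance of seeing it at least once = {chance:.3g}")
```

In both cases the overwhelmingly likely data set contains zero events, so the observations alone give essentially no leverage for telling the two probabilities apart.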

Say we have a variable that is normally distributed. This is the canonical and ubiquitous bell-shaped distribution that arises when many independent factors contribute to the outcome. It’s not necessarily the best distribution to model the type of outcomes we are interested in but it has the tremendous advantage of familiarity. The normal distribution has two parameters: the mean (\mu) and the standard deviation (\sigma). If we know \mu and \sigma exactly, then we know lots of things about the value of the next observation. For instance, we know that the most likely value is actually \mu and we can be 95% certain that the value will fall between about \mu - 1.96\sigma and \mu + 1.96\sigma (that is, between -1.96 and 1.96 for a standard normal with \mu = 0 and \sigma = 1).
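When the parameters are known, the 95% interval is a one-liner. Here is a small sketch using Python's standard library; the function name and the second set of parameter values are just illustrative choices of mine.

```python
from statistics import NormalDist


def central_interval(mu, sigma, coverage=0.95):
    """Central interval of a normal distribution with known mu and sigma."""
    dist = NormalDist(mu, sigma)
    tail = (1.0 - coverage) / 2.0
    return dist.inv_cdf(tail), dist.inv_cdf(1.0 - tail)


# Standard normal: the familiar interval of roughly -1.96 to 1.96.
print(central_interval(0.0, 1.0))
# Any other normal: mu plus or minus about 1.96 * sigma.
print(central_interval(10.0, 2.0))
```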

Of course, in real scientific applications we almost never know the parameters of a distribution with certainty. What happens to our prediction when we are uncertain about the parameters? Given some set of data that we have collected (call it y) and from which we can estimate our two normal parameters \mu and \sigma, we want to predict the value of some as-yet unobserved data (which we call \tilde{y}). We can predict the value of \tilde{y} using a device known as the posterior predictive distribution. Essentially, we average our predictions across all the uncertainty that we have in our parameters. We can write this as

 p(\tilde{y}|y) = \int \int p(\tilde{y}|\mu,\sigma) p(\mu,\sigma|y) d\mu d\sigma.

OK, what does that mean? p(\tilde{y}|\mu,\sigma) is the probability of the new observation, given the values of the two parameters; it has the same form as the likelihood of the data. p(\mu,\sigma|y) is the probability of the two parameters given the observed data, known as the posterior distribution. The two integrals mean that we are averaging p(\tilde{y}|\mu,\sigma) across the range of uncertainty in our two parameters, weighting each pair of parameter values by p(\mu,\sigma|y) (in this context, “integrating” simply means taking a weighted average).
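For the simulation-minded, the averaging can be approximated by brute force: draw parameter values from their posterior, then draw a new observation given each draw. The sketch below rests on assumptions that are mine rather than anything stated above: a standard noninformative prior proportional to 1/\sigma^2 for the normal model, and five made-up data points. Under that prior, \sigma^2 given the data is a scaled inverse chi-square and \mu given \sigma^2 is normal.

```python
import math
import random
import statistics


def posterior_predictive_draws(y, n_draws=20000, seed=1):
    """Monte Carlo draws of a new observation from a normal model with
    unknown mean and variance, assuming the prior p(mu, sigma^2) ~ 1/sigma^2."""
    rng = random.Random(seed)
    n = len(y)
    ybar = statistics.fmean(y)
    s2 = statistics.variance(y)  # sample variance (n - 1 denominator)
    draws = []
    for _ in range(n_draws):
        # sigma^2 | y  ~  (n - 1) * s^2 / chi-square(n - 1);
        # a chi-square(k) variate is Gamma(shape=k/2, scale=2).
        chi2 = rng.gammavariate((n - 1) / 2.0, 2.0)
        sigma2 = (n - 1) * s2 / chi2
        # mu | sigma^2, y  ~  Normal(ybar, sigma^2 / n)
        mu = rng.gauss(ybar, math.sqrt(sigma2 / n))
        # new observation | mu, sigma  ~  Normal(mu, sigma)
        draws.append(rng.gauss(mu, math.sqrt(sigma2)))
    return draws


y = [4.1, 5.3, 4.8, 6.0, 5.1]  # hypothetical data
draws = posterior_predictive_draws(y)

# Compare with the naive "plug-in" normal that treats the estimates as exact.
plug_in = statistics.NormalDist(statistics.fmean(y), statistics.stdev(y))
lo, hi = plug_in.inv_cdf(0.025), plug_in.inv_cdf(0.975)
outside = sum(1 for d in draws if d < lo or d > hi) / len(draws)
print(f"share of predictive draws outside the plug-in 95% interval: {outside:.3f}")
```

If the parameters were known exactly, 5% of new observations would fall outside the plug-in interval. With only five data points the simulated share is well above that, which is the extra tail probability that parameter uncertainty buys you.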

If you’ve hummed your way through these last couple paragraphs, no worries.  What really matters are the consequences of this averaging.

When we do this for a normal distribution with unknown standard deviation, it turns out that we get a t-distribution. t-distributions are characterized by “fat tails.” This doesn’t mean they look like this. What it means is that the probabilities of unlikely events aren’t as unlikely as we might be comfortable with. The probability in the tail(s) of the distribution approaches zero more slowly than an exponential decay. This means that there is non-negligible probability on very extreme events. Here I plot a standard normal distribution in the solid line and a t-distribution with 2 (dashed) and 20 (dotted) degrees of freedom.

Standard normal (solid) and t distributions with 2 (dashed) and 20 (dotted) df.

We can see that the dashed and dotted curves have higher probabilities at the extreme values, the dashed curve markedly so. Remember that 95% of the normal observations will be between -1.96 and 1.96, whereas the dashed line is still pretty high for outcome values beyond 4. In fact, for the dashed curve, 95% of the values fall between -4.3 and 4.3. In all fairness, this is a pretty uncertain distribution, but you can see the same thing, to a lesser degree, with the dotted line (where the 95% central interval is plus/minus 2.09). Unfortunately, when we are faced with the types of structural uncertainty we have in events of interest like the outcome of global climate change or an emerging epidemic, our predictive distributions are going to be more like the very fat-tailed distribution represented by the dashed line.
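The numbers for the dashed curve can be checked without any special software, because the t-distribution with 2 degrees of freedom happens to have a closed-form cumulative distribution function, F(x) = 1/2 + x / (2 \sqrt{2 + x^2}). The sketch below (the function names are my own) recovers the plus/minus 4.3 interval and compares the probability of landing more than 4 units from the center under the two distributions.

```python
import math
from statistics import NormalDist


def t2_cdf(x):
    """CDF of Student's t with 2 degrees of freedom (closed form)."""
    return 0.5 + x / (2.0 * math.sqrt(2.0 + x * x))


def t2_quantile(p):
    """Inverse of t2_cdf for 0 < p < 1."""
    return (2.0 * p - 1.0) / math.sqrt(2.0 * p * (1.0 - p))


def two_sided_tail(cdf, x):
    """Probability of an outcome more extreme than +/- x (symmetric case)."""
    return 2.0 * (1.0 - cdf(x))


normal_tail = two_sided_tail(NormalDist().cdf, 4.0)
t2_tail = two_sided_tail(t2_cdf, 4.0)

print(f"97.5% quantile of t with 2 df: {t2_quantile(0.975):.2f}")
print(f"P(|x| > 4), standard normal:   {normal_tail:.2e}")
print(f"P(|x| > 4), t with 2 df:       {t2_tail:.2e}")
print(f"ratio:                         {t2_tail / normal_tail:.0f}")
```

An outcome that is a few-in-a-hundred-thousand event under the normal is closer to a one-in-twenty event under the fat-tailed alternative.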

As scientists with an interest in policy, how do we communicate this type of uncertainty? It is a very difficult question.  The good news about the current outbreak of swine flu is that it seems to be fizzling in the northern hemisphere. Despite the rapid spread of the novel flu strain, sustained person-to-person transmission is not occurring in most parts of the northern hemisphere. This is not surprising since we are already past flu season.  However, as I wrote yesterday, it seems well within the realm of possibility that the southern hemisphere will be slammed by this flu during the austral winter and that it will come right back here in the north with the start of our own flu season next winter.  What I worry about is that all the hype followed by a modest outbreak in the short-term will cause people to become inured to public health warnings and predictions of potentially dire outcomes. I don’t suppose that it will occur to people that the public health measures undertaken to control this current outbreak actually worked (fingers crossed).  I think this might be a slightly different issue in the communication of science but it is clearly tied up in this fundamental problem of how to communicate uncertainty.  Lots to think about, but maybe I should get back to actually analyzing the volumes of data we have gathered from our survey!

On Journal Impact Factors

How do we evaluate the quality of published work? This has become an issue for me recently for one general and two more specific reasons. The general reason is that as one approaches one’s tenure decision, one tends to think about the impact of one’s oeuvre. The specific reasons are, first, I have a paper that I know has been read (and used) by a substantial number of people but was published in a journal (The Journal of Statistical Software) that is not indexed by Thomson Scientific, the keepers of the impact factor. Will this hurt me, or any of the other people who write useful and important software (and perform all the research entailed in creating such a product), when we are evaluated on the quality of our work? The second reason this question has taken on relevance for me is that I am an Associate Editor of PLoS ONE, another journal that is not indexed by Thomson. One of my duties as an AE is to encourage people to submit high-quality papers to PLoS ONE. This can be tricky when people live and die by a journal’s impact factor.

The thing that irks me about Thomson’s impact factors is how opaque they are. Thomson doesn’t have to answer to anyone, so they are free to do whatever they want (as long as people continue to consume their products). Why do some journals get listed and others don’t? What constitutes a “substantive paper” (the denominator for the impact factor calculation)? What might the possible confounds be? What about biases? We actually know quite a bit about these last two. We know very little about the first two.

Moyses Szklo has a nice brief editorial in the journal Epidemiology, describing a paper in that same journal by Miguel Hernán criticizing the use of impact factors in epidemiology. The points clearly apply to science more generally. Three key issues affecting a journal’s impact factor listed by Szklo are: (1) the frequency of self-citation, (2) the proportion of a journal’s articles that are reviews (review papers get cited a lot), and (3) the size of the field being served by the journal. Hernán’s paper is absolutely marvelous. He notes that the bibliographic impact factor (BIF) is flawed as a statistical measure, quite apart from the manipulations described by Szklo, for three reasons: (1) a bad choice of denominator (total number of papers published), (2) the need to adjust for variables that are known to affect the measure, (3) the questionability of the mean as a summary measure for highly skewed distributions (as we know BIFs have). Hernán makes his case by presenting a parallel case of a fictional epidemiological study. To anyone trained in epidemiological methods, this case is clearly flawed. It is exactly analogous to the way that Thomson calculates BIFs, yet we continue to use them. The journal, Epidemiology, also published a number of interesting responses to Hernán’s paper criticizing the use of BIFs (Rich Rothenberg, social network epidemiologist-extraordinaire, has a nice counterpoint essay to these). The irony is that on the Epidemiology front page, they advertise the journal by touting its impact factor!
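Hernán’s third point, about the mean being a poor summary of a highly skewed distribution, is easy to see with a toy example. The citation counts below are entirely made up (ten papers from an imaginary journal, one of which is a heavily cited review); they are not real data from any journal.

```python
import statistics

# Hypothetical citation counts for ten papers in one imaginary journal.
# One heavily cited paper dominates, as is typical of skewed distributions.
citations = [0, 0, 1, 1, 2, 2, 3, 4, 5, 120]

mean_citations = statistics.fmean(citations)
median_citations = statistics.median(citations)
share_below_mean = sum(c < mean_citations for c in citations) / len(citations)

print(f"mean:   {mean_citations}")
print(f"median: {median_citations}")
print(f"share of papers cited less than the mean: {share_below_mean:.0%}")
```

A mean-based score describes the journal as one whose papers get about 14 citations each, when nine of its ten papers got five or fewer. The median tells a very different, and arguably more honest, story about the typical paper.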

The rub, of course, is that formulating a less flawed metric of intellectual impact is clearly a very demanding task. Michael Jensen, of the National Academies, has written The New Metrics of Scholarly Authority. One of the key concepts is devising a metric that measures quality at the level of the paper rather than the level of the journal. We’ve all seen fundamentally important papers that, for whatever reason, get published in obscure journals. Similarly, we regularly see the crap that comes out in high-prestige journals like Science, Nature, and PNAS every week! Pete Binfield, the managing editor of PLoS ONE, notes that Jensen’s ideas are very difficult to implement. Pete is leading the way for PLoS to think about alternative metrics like the number of downloads, the number of ping-backs from relevant (uh-oh, more subjectivity!) blogs, number of bookmarks on social bookmark pages, etc. Another way to handle Thomson’s monopoly is to use alternative metrics such as those created by Scopus or Google Scholar. This last suggestion, while worth pursuing in the spirit of competition, is still not entirely satisfying because to whom in Science do these organizations have to answer? I am particularly leery of Scopus because it is run by Elsevier, a big for-profit publishing house that also clearly has its own agenda. PubMed is, at least, public and for the public benefit. Of course, they don’t index all journals either; not too many Anthropology journals are indexed there!

Björn Brembs, another PLoS ONE AE, makes the very reasonable suggestion that an impact factor should, at the very least, be a multivariate measure (in accordance with the criticism of lack-of-adjustment for confounders in Hernán’s essay).  Björn, in another blog posting, cites a paper published last year in PLoS ONE that I have not yet read, but clearly need to.  This paper shows that BIF inconsistently ranks journals in terms of impact (largely because the mean is such a poor measure for citation distributions) and proposes a more consistent measure.  I need to carve some time out of my schedule to read this one carefully.

More on Science in the Obama Times

As a follow-up to my post on science and the Obama Inaugural, I wanted to note a terrific essay  by Dennis Overbye on the civic virtues of science in the New York Times. He argues that virtue emerges from the process of science: “Science is not a monument of received Truth but something that people do to look for truth.”  Continuing, he writes,

That endeavor, which has transformed the world in the last few centuries, does indeed teach values. Those values, among others, are honesty, doubt, respect for evidence, openness, accountability and tolerance and indeed hunger for opposing points of view. These are the unabashedly pragmatic working principles that guide the buzzing, testing, poking, probing, argumentative, gossiping, gadgety, joking, dreaming and tendentious cloud of activity — the writer and biologist Lewis Thomas once likened it to an anthill — that is slowly and thoroughly penetrating every nook and cranny of the world.

There is a certain egalitarian, round-table ethos to science done well.  It doesn’t matter what degrees you have or where from.  What matters is whether you ask and answer interesting questions. Of course, institutions that support science frequently care about degrees and where they’re from, but in my experience, good scientists don’t. While there are certainly barriers to entry (e.g., the cost of higher education, the difficulty of mastering a subject), there is no fundamentally esoteric knowledge in science.  When it’s working right, everything is transparent.  It has to be because no one will believe you unless it can be repeated.

I certainly hope the rhetoric of respect for science and the idea that empirical research will inform policy continues and gets translated into tangible support for research in the coming years.

Unusual Editorial

This is something you don’t typically see in the editorial pages of the New York Times, viz., advocacy for reinstating US Air Force investigations of unidentified flying objects.  Pope has a point; closed-mind policies are probably never a good idea.  This is not to say that we have little green men coming to cut crop circles or probe guys in pickup trucks on rural highways.  But if there are things in the air the identity of which we don’t know, maybe we should try to figure out what they are?  I guess I’d rather see us fix health care and improve infrastructure first, but I am generally in favor of ridding ourselves of self-induced blind spots.