More on Diamond

I've been thinking some more about the issues that are raised by the debacle over Jared Diamond's 21 April 2008 New Yorker piece and the recent announcement of a lawsuit against him.  There are many things to think about here.  Probably foremost amongst these are the ethical concerns relating to preserving research subjects' privacy and informed consent.  There are secondary concerns regarding scholarship, standards of research, and obligations to adequately describe research methodology.

I am troubled by a point raised by Alex Golub in the Savage Minds blog. Golub writes, "There is also a more serious problem with [Diamond's New Yorker] article which is also the most obvious thing about it: it contrasts ‘tribal societies’ with ‘modern state societies’."  This is something that bothers me too, though I think that my response may be somewhat different from that of many contemporary cultural anthropologists. In general, I have sensibilities very much akin to Diamond's. I see tremendous value in comparative studies, and I think that there is something that we can call, for lack of a better term, a robust and fairly general Human Nature.  Human beings are biological entities with material needs and (many) material motivations, and we ignore these at our explanatory (and possibly literal) peril.

The Myth-of-Isolation criticism, which also arises in the Diamond debacle, is not new in Anthropology.  I am reminded of the Kalahari Debate of Lee, Wilmsen and others. Globalization as a phenomenon of anthropological inquiry has certainly increased in currency of late and I think that this scholarship tends to make many of my colleagues skeptical of any research on, say, foraging decisions by hunting and gathering people.  The answer to this criticism is that foraging people in a globalized world, like all people, still make decisions about what to eat, what not to eat, how to eat, etc.  Their choices may be constrained by a hegemonic state or by extra-state organizations, but choices are still being made.  Understanding how such choices are made in a globalized world strikes me as being at least as important as it was 50 or 100 years ago.   This goes for hunter-gatherers as well as urban elites, agrarian peasants or just about anyone else.

Rather than taking labels such as "tribal" or "state" as sufficient descriptions of the differences between groups, I think that the science requires us to describe (and hopefully quantify) the dimensions of their difference.  I have been thinking a lot about social networks lately.  One dimension on which two societies might differ is the composition of ego networks.  How many people does a given person know?  What fraction of those are kin?  What is the gender composition of the ego network? How socially similar are the members of ego's network to ego him/herself? How many would provide emotional/economic/agonistic support to you in a crisis?  Does an individual's ego network include socially important figures like government functionaries, doctors, lawyers, or the equivalent? How much does ego's network overlap with his/her spouse's? A brother's? A neighbor's? That of a member of the next village or town? Gathering such data is clearly a major undertaking, but that's what science is about, no?
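As a sketch of what quantifying one of these dimensions might look like, here is a toy calculation of ego-network composition in Python. The alters, their attributes, and the spouse's network are entirely hypothetical; collecting the real data is, of course, the hard part.

```python
# Toy ego network: alters and their attributes are invented for illustration.
ego_alters = {"sister", "cousin", "coworker", "doctor", "neighbor"}
attrs = {
    "sister":   {"kin": True,  "gender": "F"},
    "cousin":   {"kin": True,  "gender": "M"},
    "coworker": {"kin": False, "gender": "F"},
    "doctor":   {"kin": False, "gender": "M"},
    "neighbor": {"kin": False, "gender": "F"},
}

n = len(ego_alters)                                           # network size
kin_fraction = sum(attrs[a]["kin"] for a in ego_alters) / n   # fraction kin
frac_female = sum(attrs[a]["gender"] == "F" for a in ego_alters) / n

# Overlap with the spouse's (equally hypothetical) network, as Jaccard similarity
spouse_alters = {"sister", "neighbor", "priest"}
overlap = len(ego_alters & spouse_alters) / len(ego_alters | spouse_alters)

print(n, kin_fraction, frac_female, round(overlap, 3))
```

Once such data exist, composition questions like gender mix or spousal overlap reduce to simple set arithmetic; it is the fieldwork behind the data that takes years.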

The fraught question of how to do ethical, meaningful anthropology in a globalized world that struggles with the legacy of colonial depredations has, in my view, driven too many anthropologists from science. Protecting human subjects and doing unto others as we would have done unto us are important guiding principles for anthropological research, indeed, any research in the human sciences.  Describing -- and, ultimately, understanding -- how societies differ and what the implications of these differences are for human behavior should, in my opinion, be another principle.  Facile labels relating to social or economic complexity, ethnicity, religion, nationality, etc. do not help us understand the diversity of human behavior.

Jared Diamond and Anthropological Ethics

Brian McKenna sent around to the EANTH Listserv a couple of blog posts today detailing the trouble that Jared Diamond has gotten into over a New Yorker story he wrote a year ago on the power of vengeance.  This seems like a rather sordid affair, but I think that Alex Golub should be commended for his very fair treatment of it in his most recent post on the Savage Minds blog.  It certainly sounds like there were some major ethical lapses in the production of Diamond's New Yorker piece, though I wonder about the motivations of Shearer, the muck-raker who has brought these issues to light.

Jeanine Pfeiffer, the director for social science at the Earthwatch Institute, posted a list of ethical guidelines from the International Society for Ethnobiology which I think are very relevant for understanding the apparent ethical lapses in Diamond's work. Honestly, I can't imagine publishing an article like Diamond's and using the informants' real names!  Then there's the issue of informed consent.  People need to know that they are "on the record" -- whatever that means in this grey area where anthropology meets journalism. It seems to me that two minimal standards should guide anyone's writing about the lives of people "in the field": (1) would you write this way about people in your own immediate community? and (2) would your article's evidentiary and rhetorical standards pass muster if it were a paper submitted by an undergraduate for a class you were teaching?

As Golub notes, Jared Diamond is an easy target for academic anthropologists (or historians or geographers) because he writes well and clearly about complex topics and reaches a large audience.  There are always people grumbling about Diamond's originality, authenticity, and willingness to attribute ideas to others (in that vein, a highly recommended book). Diamond's ecological work on community assembly of pigeons in New Guinea is top-notch and I still assign his chapter from the Ecology and Evolution of Communities volume he edited with Martin Cody in 1975. Regarding books like Guns, Germs, and Steel or Collapse, I take them for what they are: synthetic popularizations that maybe could have stood a little more attribution.  It would certainly be nice if the popular spokesperson for Anthropology were actually an anthropologist. Alas, anthropologists, for the most part, have given up on the goal of writing prose that can be broadly read, appreciated, and understood.

This is a story I will be keeping my eye on.

On Intelligence

Nicholas Kristof has an interesting Op-Ed piece this week in the Times.  Reporting on University of Michigan Professor Richard Nisbett's new book, Intelligence and How to Get It, Kristof argues for the general malleability of intelligence.  He writes,

If intelligence were deeply encoded in our genes, that would lead to the depressing conclusion that neither schooling nor antipoverty programs can accomplish much. Yet while this view of I.Q. as overwhelmingly inherited has been widely held, the evidence is growing that it is, at a practical level, profoundly wrong.

I think that this is an important point that is worth pursuing.  There is indeed a widely held view that intelligence is "genetically determined" (whatever that means -- how you define it matters), perhaps most infamously articulated in Charles Murray and Richard Herrnstein's book, The Bell Curve. This idea comes from numerous studies of the correlation of relatives' scores on standardized intelligence tests, the most common design for which is the twin study.  The basic idea is that you compare the concordance in test scores of monozygotic (i.e., genetically identical) twins with that of dizygotic twins, who share on average only 50% of their segregating genes.  The assumption is that monozygotic and dizygotic twin pairs share rearing environments to the same degree.  Therefore, differences that appear in the observed concordance should be attributable to genes.
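The arithmetic behind this design is simple: under its assumptions, doubling the gap between the monozygotic and dizygotic correlations yields a crude heritability estimate (Falconer's formula, h2 = 2 x (r_MZ - r_DZ)). The correlation values in this sketch are invented for illustration, not taken from any actual study.

```python
def falconer_h2(r_mz: float, r_dz: float) -> float:
    """Crude twin-study heritability estimate: h2 = 2 * (r_MZ - r_DZ).
    Valid only under the design's assumptions: equal environments for
    both twin types, purely additive gene action, no GxE interaction."""
    return 2.0 * (r_mz - r_dz)

# Hypothetical test-score correlations for the two twin types
h2_est = falconer_h2(r_mz=0.85, r_dz=0.60)
print(h2_est)  # roughly 0.5
```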

Twin studies show that IQ, like many other features of human behavior, is moderately "heritable."  Now, a key to understanding this field and the debate that it has spawned is understanding what is meant by heritability.  Geneticists distinguish two conceptions of heritability.  "Broad-sense heritability" is the fraction of total phenotypic variance attributable to all genetic variance -- additive, dominance, and epistatic effects combined -- and is the closer of the two to the common-parlance notion that a trait is genetically determined and can therefore be inherited from one's parents.  In contrast, "narrow-sense heritability" has a more restrictive technical meaning: it is the fraction of total phenotypic variance attributable to additive genetic variance alone.  Based simply on these definitions, laden with unfamiliar terms, you can see why most people fall back on the loose, common-parlance sense.
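A toy variance decomposition makes the two definitions concrete. All of the variance components below are arbitrary numbers chosen for round arithmetic, not estimates from any real study.

```python
# Hypothetical variance components (arbitrary units)
V_A = 40.0   # additive genetic variance
V_D = 10.0   # dominance variance
V_I = 10.0   # epistatic (interaction) variance
V_E = 40.0   # environmental variance

V_P = V_A + V_D + V_I + V_E          # total phenotypic variance

H2_broad = (V_A + V_D + V_I) / V_P   # broad sense: all genetic variance
h2_narrow = V_A / V_P                # narrow sense: additive variance only

print(H2_broad, h2_narrow)  # 0.6 0.4
```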

So let's parse out the definition of narrow-sense heritability.  First, "total phenotypic variance" simply means the total observed variance in the trait in question (e.g., IQ) for some well-defined population (e.g., the sample of individuals in the study).  This variance arises from a variety of sources, some genetic, some environmental, some both.  It is very important to note that variance is central to both definitions of heritability.  A trait can be completely genetically determined (whatever that means) but have no variance in a population.  Think head-number among human beings.  This trait is so deeply developmentally canalized that there is no variance (everybody has one) and, thus, zero heritability.

As sexual beings, when we reproduce, our alleles (variants of genes) reshuffle whenever we generate our gametes, or reproductive cells (i.e., eggs and sperm), during the process of meiosis.  One of the principles that Mendel is known for is the principle of independent assortment: when our alleles get reshuffled during meiosis, the alleles at one locus appear in any given gamete independently of the alleles at other loci.  It turns out that independent assortment is not in any way universal.  Some loci assort independently while others are linked, typically because they are near each other on a chromosome (but sometimes for more interesting reasons).  The additive genetic variance in the definition of heritability refers to the variance attributable to the average effects of alleles -- effects that simply add up across alleles and across loci rather than interacting.  These are the so-called additive effects.  We care so much about the additive effects because these are what let us make predictive models.  When an animal breeder wants to know the response to selection of some quantitative trait (e.g., body size, milk fat percentage, age at maturity), she uses an equation that multiplies the narrow-sense heritability by the selection differential for the trait in question.  Now, our scientific interest in the heritability of intelligence ostensibly arises from the desire to create predictive and explanatory models like this breeder's equation.  In the absence of explanatory or predictive power, I don't see much scientific value.
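A concrete instance of the breeder's calculation, with made-up numbers (here S, the selection differential, is the difference between the mean of the selected parents and the population mean):

```python
def response_to_selection(h2: float, S: float) -> float:
    """Breeder's equation: R = h2 * S.  The expected per-generation
    change in the population mean equals the narrow-sense heritability
    times the selection differential S."""
    return h2 * S

# Hypothetical dairy example: h2 = 0.3 for milk yield, and the cows
# selected for breeding average 200 kg more milk than the herd mean.
R = response_to_selection(h2=0.3, S=200.0)
print(R)  # 60.0 kg expected gain in the next generation
```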

Genes can express their effects in ways other than through their additive effects.  For example, there is that familiar concept from Mendelian genetics, dominance.  Dominance is a type of allele-allele interaction, limited to the special case of occurring within a single locus.  A more general case of allelic interaction is epistasis.  An epistatic gene is one that affects the expression of one or more other genes.  The epistatic gene is a regulator which can either increase or decrease (possibly turn off entirely) the effect of other genes.  These interactions are harder to predict and typically end up in the error term of models like the breeder's equation.

The real gotcha in heritability analysis, though, is the existence of genotype-environment (GxE) interactions. These are generally not measured and can be quite large.  Lewontin, in his classic (1974) paper, first suggested that GxE interactions (in addition to other types of difficult-to-measure interactions like those arising from epistasis) might actually be large.  Much of the thought that followed has supported this idea (see, e.g., Pigliucci 2001). In twin study designs, GxE interactions are non-identifiable, meaning that we don't have enough information to simultaneously estimate the interaction, genetic, and environmental effects, so they are generally assumed to be zero. I think that it is fair to say that the consensus among population geneticists is that heritability analyses, as done through twin studies, for example, are misleading at best because of this fundamental flaw.

In my mind, the fundamental problem with twin studies of the heritability of intelligence is that they can't begin to measure GxE interactions and therefore their estimates of heritability are hopelessly suspect.

Where is heritability of intelligence likely to be large and not quite as fraught with the problems of unmeasurable and potentially large GxE interactions? One possibility is in homogeneous, affluent communities, not entirely unlike Palo Alto.  Kristof notes in his Op-Ed piece that "Intelligence does seem to be highly inherited in middle-class households." In such communities, external ("environmental") sources of variation are relatively small.  Most kids have stable homes with (two) college-educated parents who place a high value on achievement in school, go to safe, well-funded schools with motivated and highly trained teachers, eat nutritious food, and live fairly enriched lives.  When the total variance is low, whatever variance is explained by additive genetic effects is likely to be a higher fraction of the total variance. Hence, high heritability. This is a quite general point: the more environmentally homogeneous a population is, the higher we should expect heritability to be.
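The arithmetic behind this point is simple: hold the additive genetic variance fixed and shrink the environmental variance, and the heritability ratio necessarily rises. The variance values below are, again, arbitrary round numbers.

```python
def h2(V_A: float, V_E: float) -> float:
    """Narrow-sense heritability when phenotypic variance is just
    additive genetic plus environmental: h2 = V_A / (V_A + V_E)."""
    return V_A / (V_A + V_E)

# Same (hypothetical) additive genetic variance in both populations
h2_heterogeneous = h2(V_A=50.0, V_E=150.0)  # environmentally diverse population
h2_homogeneous = h2(V_A=50.0, V_E=50.0)     # affluent, uniform suburb

print(h2_heterogeneous, h2_homogeneous)  # 0.25 0.5
```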

It is very, very important, however, to note that most populations are not like this.  When we move out of relatively homogeneous and affluent communities, the sources of environmental variance increase and compound.  The fact that a trait with such high measured heritability can be modified as extensively as Nisbett's book discusses suggests that intelligence is a trait with an enormous environmental effect and, I'm betting, a huge GxE interaction effect. It seems to me that the Flynn effect, the observation that measured IQ has increased over time, provides further suggestive evidence for a massive environmental interaction. While the genomic evidence for recent strong selection on humans is mounting (in contrast to the bizarre idea that somehow selection came to a screeching halt with the advent of the Holocene), I doubt that there have been significant selective changes in the genes for intelligence (whatever that means) in the past century.  The environment, however, certainly has changed in the last 100 years.  This is what makes me think big GxE interactions.

So, in a phrase, sure, genes help determine intelligence.  But the action of these genes is so fundamentally tied up in environmental interactions that the explanatory power of simple genetic models for intelligence and other complex social traits -- such as political and economic behavior or social network measures -- is very low indeed. Moreover, the predictive power of these models in changing environments is low.  Without explanatory or predictive potential, we are left with something that isn't really science. I applaud efforts to understand more deeply how productive environments, good schools, and healthy decisions can maximize human potential. Heritability studies of IQ (and, I worry, of these other fashionable traits) seem to provide an excuse for the inexcusable failure to deal with the fundamental social inequalities that continue to mar our country -- and the larger world.

References

Lewontin, R. 1974. The analysis of variance and the analysis of causes. American Journal of Human Genetics 26: 400-411.

Pigliucci, M. 2001. Phenotypic Plasticity: Beyond Nature and Nurture. Baltimore, Johns Hopkins University Press.

New SARS on Trans-Siberian Railway?

Scary.  A woman traveling from Blagoveshchensk to Moscow by rail died, apparently of pneumonia, and all of the roughly 60 people riding in her carriage have been taken to a hospital for observation and quarantine.  Officials think this could be a case of SARS.  From the RIA Novosti report:

The train was stopped in the central Russian city of Kirov and around 60 train passengers were sent to a local hospital. 6 of them are reported as running fevers, the source said, although Kirov Region officials have said that none of them were suffering from SARS. The carriage in which the woman was travelling was disconnected from the rest of the train. The train then continued on its way to Moscow. Russia's [public health] watchdog spokesman was unable to confirm that the woman had died from SARS. "Doctors are currently establishing a preliminary diagnosis," he said.

This is definitely something to keep our collective eye on.  Full report, and any new developments can be found on ProMED.

Platform for Developing Mathematical Models of Infectious Disease

Every once in a while someone asks me for advice on the platform to use for developing models of infectious disease.  I typically make the same recommendations -- unless the person asking has something very specific in mind. This happened again today and I figured I would turn it into a blog post.

The answer depends largely on (1) what types of models you want to run, (2) how comfortable you are with programming, and (3) what local resources (or lack thereof) you might have to help you when you inevitably get stuck.  If you are not comfortable with programming and you want to stick to fairly basic compartmental models, then something like STELLA or Berkeley Madonna would work just fine.  There are a few books that provide guidance on developing models in these systems.  I have a book by Hannon and Ruth that is ten years old now but, if memory serves me correctly, was a pretty good introduction both to STELLA and to ecological modeling. They have a slightly newer book as well.  Models created in both systems appear in the primary scientific literature, which is always a good sign for the (scientific) utility of a piece of software.  These graphical systems lack a great deal of flexibility and I personally find them cumbersome to use, but they match the cognitive style of many people quite nicely, I think, and probably serve as an excellent introduction to mathematical modeling.

Moving on to more powerful, general-purpose numerical software...

Based on my unscientific convenience sample, I'd say that most mathematical epidemiologists use Maple.  Maple is extremely powerful software for doing symbolic calculations.  I've tried Maple a few times but, for whatever reason, it never clicked for me.  Because I am mostly self-taught, the big obstacle to my using Maple has always been the lack of resources, either print or internet, for doing ecological/epidemiological models in this system. Evolutionary anthropologist Alan Rogers does have some excellent notes for doing population biology in Maple.

Mathematica has lots of advantages but, for the beginner, I think these are heavily outweighed by the start-up costs (in terms of learning curve). I use Mathematica some and even took one of their courses (which was excellent if a little pricey), but I do think that Mathematica handles dynamic models in a rather clunky way. Linear algebra is worse.  I would like Mathematica more if the notebook interface didn't seem so much like Microsoft Word.  Other platforms (see below) either allow Emacs key bindings or can even be run through Emacs (this is not a great selling point for everyone, I realize, but given the likely audience for Mathematica, I have always been surprised by the interface). The real power of Mathematica comes from symbolic computation and some of the very neat and eclectic programming tools that are part of the Mathematica system. I suspect I will use Mathematica more as time goes on.

Matlab, for those comfortable with a degree of procedural-style programming, is probably the easiest platform to use to get into modeling. Again, based on my unscientific convenience sample, my sense is that most quantitative population biologists and demographers use Matlab. There are some excellent resources.  For infectious disease modeling in particular, Keeling and Rohani have a relatively new book that contains extensive Matlab code. In population biology, books by Caswell and by Morris and Doak both contain extensive Matlab code.  Matlab's routines for linear algebra and for solving systems of differential equations are highly optimized, so code is typically pretty fast and these calculations are relatively simple to perform.  There is an option in the preferences that allows you to set Emacs key bindings.  In fact, there is code that allows you to run Matlab from Emacs as a minor mode.  Matlab is notably bad at dealing with actual data.  For instance, you can't mix and match data types in a data frame (spreadsheet-like structure) very easily, and forget about labeling columns of a data frame or rows and columns of a matrix. While its matrix capabilities are unrivaled, there is surprisingly little development of network models, a real growth area in infectious disease modeling. It would be really nice to have some capabilities in Matlab to import and export various network formats, thereby leveraging Matlab's terrific implementation of sparse matrix methods.

Perhaps not surprisingly, the best general tool, I think, is R.  This is where the best network tools can be found (outside of pure Java). R packages for dealing with social networks include the statnet suite (sna, network, ergm), igraph, blockmodeling, RBGL, etc. (the list goes on).  It handles compartmental models in a manner similar to Matlab using the deSolve package, though I think Matlab is generally a little easier for this.  One of the great things about R is that it makes it very easy to incorporate C or Fortran code. Keeling and Rohani's book also contains C++ and Fortran code for running their models (and such code is often available for published models).  R and Matlab are about equally easy/difficult (depending on how you see it) to learn.  Matlab is somewhat better at numerically solving systems of differential equations and R is much better at dealing with data and modeling networks.  R can be run through Emacs using ESS (Emacs Speaks Statistics).  This gives you all the text-editing benefits of a state-of-the-art text editor plus an effectively unlimited buffer size.  It can be very frustrating indeed to lose your early commands in a Matlab session only to realize that you forgot to turn on the diary function. No such worries when you run R through Emacs using ESS.
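For concreteness, here is the sort of basic compartmental model all of these platforms get used for -- a standard SIR model -- sketched in plain Python with a naive fixed-step Euler integrator rather than deSolve or Matlab's solvers. The parameter values are arbitrary, and for real work you would want an adaptive solver.

```python
def sir_step(S, I, R, beta, gamma, dt):
    """One Euler step of the standard SIR equations, with S, I, R as
    fractions of the population:
        dS/dt = -beta*S*I
        dI/dt =  beta*S*I - gamma*I
        dR/dt =  gamma*I
    """
    new_infections = beta * S * I * dt
    recoveries = gamma * I * dt
    return S - new_infections, I + new_infections - recoveries, R + recoveries

# Arbitrary illustrative parameters: R0 = beta/gamma = 3
S, I, R = 0.999, 0.001, 0.0
beta, gamma, dt = 1.5, 0.5, 0.01

for _ in range(int(100 / dt)):  # integrate for 100 time units
    S, I, R = sir_step(S, I, R, beta, gamma, dt)

print(round(S + I + R, 9))  # the compartments always sum to 1
```

With R0 = 3 the epidemic burns through most of the population before dying out, which is easy to check by printing the final S.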

One of the greatest benefits of R is its massive online (and, increasingly, print publishing) help community. I think that this is how R really trumps all the other platforms to be the obvious choice for the autodidacts out there.

I moved from doing nearly all my work in Matlab to doing most work in R, with some in Mathematica and a little still in Matlab.  These are all amazingly powerful tools.  Ultimately, it's really a matter of taste and the availability of help resources that push people to use one particular tool as much as anything else.  This whole discussion has been predicated on the notion that one wants to use numerical software.  There are, of course, compelling reasons to use totally general programming tools like C, Java, or Python, but this route is definitely not for everyone, even among those who are interested in developing mathematical models.

A Sign of the Times

Every time I go by the Stanford Shopping Center -- which is a truly absurd place, I should add -- I am reminded of an event that seems like an apt metaphor for the economic melt-down, the consequences of which we are only beginning to understand. I am, of course, talking about the replacement of Long Life Noodle House with Sprinkles Cupcakes.  Now, don't get me wrong.  Long Life Noodle House was a completely mediocre restaurant.  But it served real food.  You could go there, say, for dinner.  We did this on a regular basis, not because of its outstanding food, but because it was close, convenient, relatively inexpensive, and (given judicious choices) offered nutritious fare.  It also seems relevant to note that it was typically quite busy; it hardly seems like they were lacking for business. One night we went there after the kids' swim practice only to find it abruptly closed.  Within a month or so, the restaurant was replaced by this more than slightly ridiculous confectioner. 

Stanford Shopping Center replaced a restaurant that served real food with one that serves frivolous little confections, much as the United States substituted financial gimmickry for innovation and production.