Sebastian Zearing

how to be progressive without being a stupid liberal

Human biodiversity

This prediction was made on 28 July 2015.

Human biodiversity (HBD) is the notion that the biological entity known as Homo sapiens sapiens is subject to subpopulation differences in virtually all conceivable characteristics because of uneven distributions of genetic alleles. Some identify this with “racism,” and others with “race realism.” I reject both labels, since “race” isn’t a well-defined biological concept and essentialism is false. The main purpose of HBD is to provide a view specifically opposed to blank-slatism: though many human characteristics (such as skin color, tooth shape, susceptibility to sickle-cell anemia, and adult ability to digest milk) are known to vary across subpopulations (which, unlike race, is a well-defined, non-essentialist biological concept), many find it difficult to extend this simple concept into the neurological, psychological, cognitive, and behavioral realms.

Probably the most important technique today for finding links between genes and phenotypes is the genome-wide association study (GWAS). However, GWAS are not well suited to this task, because in their standard form they investigate only two-point correlations (the ordinary notion of correlation between two variables, in this case an allele and a phenotype). As the principle of superposition does not apply in the general case to metabolic systems (equivalently, metabolic systems are highly non-linear), GWAS are usually severely limited in how much phenotype variance they can explain. In other words, an allele rarely has a separable, discernible influence on a phenotype by itself. Rather, it is only the contingent conjunction of one gene or set of genes with other genes that produces the phenotype.
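To make the two-point limitation concrete, here is a minimal toy sketch, not a model of any real trait. It assumes two hypothetical loci coded as -1/+1 and a phenotype defined purely by their interaction (the product of the two codes). Every name in it is illustrative. Each locus taken alone shows zero correlation with the phenotype, while the three-point average over both loci and the phenotype recovers the relationship exactly.

```python
from itertools import product


def mean(values):
    """Arithmetic mean of a non-empty list."""
    return sum(values) / len(values)


def pearson(xs, ys):
    """Two-point (Pearson) correlation between two equal-length lists."""
    mx, my = mean(xs), mean(ys)
    cov = mean([(x - mx) * (y - my) for x, y in zip(xs, ys)])
    var_x = mean([(x - mx) ** 2 for x in xs])
    var_y = mean([(y - my) ** 2 for y in ys])
    return cov / (var_x * var_y) ** 0.5


# Assumption: two hypothetical loci, each coded -1 or +1, with all four
# genotype combinations equally common in the population.
genotypes = list(product((-1, 1), repeat=2))
locus_a = [g[0] for g in genotypes]
locus_b = [g[1] for g in genotypes]

# Assumption: a purely interaction-driven phenotype (no additive effect).
phenotype = [a * b for a, b in genotypes]

# Two-point view: each locus against the phenotype, one at a time.
two_point_a = pearson(locus_a, phenotype)
two_point_b = pearson(locus_b, phenotype)

# Three-point view: average of locus_a * locus_b * phenotype.
three_point = mean([a * b * p for a, b, p in zip(locus_a, locus_b, phenotype)])

print("locus A vs phenotype:", two_point_a)
print("locus B vs phenotype:", two_point_b)
print("three-point correlation:", three_point)
```

In this contrived case a locus-by-locus scan would report nothing, even though the two loci together determine the phenotype completely. Real traits presumably sit somewhere between fully additive and fully interaction-driven.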

Fortunately, a data analysis method that can capture 3-point and higher correlations has matured in the past few years. Known as Deep Learning, this method has been strikingly successful across a wide range of tasks. Unfortunately, Deep Learning typically requires much more data than GWAS, simply because many more model parameters are needed to represent the 3-point and higher correlations. However, recent advances in human genome sequencing have greatly reduced the cost of the necessary data and are making it much more widely available. Further, from the link, it seems that even though the price tag of sequencing a human genome may not drop substantially below $1000, the usefulness of doing so for health care purposes can only continue to grow, pointing to an explosion in the amount of human genome data in the coming years.
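A rough way to see why higher-order correlations demand more parameters, and hence more data, is to count the distinct interaction terms a model would need if it listed them explicitly. The sketch below assumes a hypothetical panel of 1,000 loci and simply counts the ways to choose k of them. A neural network does not literally enumerate these terms, so this is only an illustration of the combinatorial growth, and the helper name `interaction_terms` is made up for the example.

```python
from math import comb


def interaction_terms(n_loci, order):
    """Number of distinct sets of `order` loci drawn from `n_loci` loci."""
    return comb(n_loci, order)


# Assumption: a hypothetical panel size chosen only for illustration.
N_LOCI = 1000

for order in (1, 2, 3):
    count = interaction_terms(N_LOCI, order)
    print(f"{order}-locus terms among {N_LOCI} loci: {count:,}")
```

For 1,000 loci this gives 1,000 single-locus terms, 499,500 pairs, and 166,167,000 triples, so each extra order of interaction multiplies the number of quantities to estimate by a factor of a few hundred.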

Specific Prediction

I predict that within 10 years (although probably in more like half that), there will be a scientific paper using Deep Learning or a very similar machine learning algorithm that confirms the central notion and purpose of human biodiversity, namely that human subpopulations differ in the distribution of alleles at genetic loci causally implicated in cognitive function. I make no specific subclaims about any subsidiary HBD notions; I assert only that I believe many people, including the champions of HBD, will be very surprised. I may begin making subclaims after this paper has been published.

A secondary prediction I make is that this paper will have a disproportionate number of Asian authors, perhaps including Stephen Hsu.

