Sex Distribution, Life Expectancy and Educational Attainment of Comedians

Although humor plays a central role in human life (Carroll, 2014; Hurley, Dennett & Adams, 2011), especially in dating (Campbell, Martin & Ward, 2008; Cann, Davis & Zapata, 2011), there have been surprisingly few quantitative studies of comedians, despite a fair number of non-quantitative descriptions (e.g. Fisher & Fisher, 1981; Mizejewski, 2014). A few studies have examined cognitive ability in comedians, as well as the relationship between cognitive ability and humor ability in college students (Greengross, Martin & Miller, 2012; Janus, 1975; Janus, Bess & Janus, 1978). Comedians were found to have very high average IQ scores, of 126-138.¹

It has been noted that comedians tend to be male, and some have proposed evolutionary reasons for this state of affairs (Hitchens, 2007, 2008; Stanley, 2008). However, no one seems to have systematically compiled data about comedians, so the sex distribution is not known at present. The purpose of this study was to examine the sex distribution of comedians in a large, worldwide sample. There were also two secondary goals: a) to briefly examine the longevity of comedians, and b) to briefly examine whether comedians have high educational attainment, which would corroborate the unusually high ability scores reported.


General approach

All analyses were done in R; see the supplementary materials for details. Data were either scraped from Wikipedia using the rvest package (Wickham & RStudio, 2016) or downloaded through Wikimedia's API using the WikipediR package (Keyes & Tilbert, 2016). The initial list of comedians was taken from Wikipedia's List of comedians article. This article contains only a few comedy groups and writers, which were excluded. This resulted in a list of 1408 links to pages. Moving averages were fitted with local regression (loess) (James et al., 2013). The span parameter was chosen by cross validation using the bisoreg package (Curtis, 2015).
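To make the span selection concrete, the following is a minimal Python sketch of the general idea: a tricube-weighted local linear smoother whose span is picked by leave-one-out cross validation. It is not the R code used in this study and does not reproduce the internals of loess or bisoreg. The function names, the candidate spans and the synthetic data are all assumptions made for illustration.

```python
import numpy as np


def local_linear_fit(x, y, x0, span):
    """Predict y at x0 by tricube-weighted local linear regression."""
    n = len(x)
    k = min(n, max(3, int(np.ceil(span * n))))  # points inside the window
    dist = np.abs(x - x0)
    h = np.sort(dist)[k - 1]                    # distance to the k-th nearest point
    h = max(h, np.finfo(float).eps)             # guard against ties at x0
    w = np.clip(1.0 - (dist / h) ** 3, 0.0, None) ** 3
    sw = np.sqrt(w)
    design = np.column_stack([np.ones(n), x - x0])
    # Weighted least squares; the intercept is the fitted value at x0.
    beta, *_ = np.linalg.lstsq(design * sw[:, None], y * sw, rcond=None)
    return float(beta[0])


def loo_cv_error(x, y, span):
    """Mean squared leave-one-out prediction error for a given span."""
    idx = np.arange(len(x))
    errors = []
    for i in idx:
        keep = idx != i
        pred = local_linear_fit(x[keep], y[keep], x[i], span)
        errors.append((y[i] - pred) ** 2)
    return float(np.mean(errors))


def choose_span(x, y, spans):
    """Return the candidate span with the lowest cross-validated error."""
    errors = [loo_cv_error(x, y, s) for s in spans]
    return spans[int(np.argmin(errors))], errors


# Synthetic data, purely illustrative (not from the comedian dataset).
rng = np.random.default_rng(0)
x_demo = np.sort(rng.uniform(0, 10, 60))
y_demo = np.sin(x_demo) + rng.normal(0, 0.3, 60)
candidate_spans = [0.2, 0.35, 0.5, 0.75, 1.0]
best_span, cv_errors = choose_span(x_demo, y_demo, candidate_spans)
print(best_span)
```

A small span follows the data closely but is noisy, while a large span is smooth but may miss real trends. Cross validation picks the span that predicts held-out points best.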

There are two general ways to extract information from the Wikipedia articles. The first is to look for keywords in the free text and identify patterns that are useful for inferring the information of interest. This approach can be cumbersome and error-prone due to the messiness of natural language. The second approach is to rely on Wikipedia's categories. These are standardized, so classifying persons is easy. However, less developed pages may lack some categories, which leads to false negatives or missing data, depending on interpretation. Furthermore, there is only a fixed set of categories, so if no category covers the desired information, this approach cannot be used.
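The category-based approach can be sketched as follows. This is a minimal Python illustration, not the study's R code. The category strings follow the general pattern of Wikipedia category names, but the specific list, the `DEMONYMS` subset and the matching rules are assumptions made for the example.

```python
import re

# Illustrative subset of demonyms; the study's actual mapping is not shown here.
DEMONYMS = ("American", "British", "Canadian", "Australian")


def parse_categories(categories):
    """Extract birth year, death year and nationalities from category names."""
    info = {"birth_year": None, "death_year": None, "nationalities": set()}
    for cat in categories:
        # Categories such as "1950 births" or "2010 deaths".
        match = re.fullmatch(r"(\d{4}) (births|deaths)", cat)
        if match:
            key = "birth_year" if match.group(2) == "births" else "death_year"
            info[key] = int(match.group(1))
            continue
        # Categories such as "American male comedians".
        for demonym in DEMONYMS:
            if cat.startswith(demonym + " ") and "comedian" in cat:
                info["nationalities"].add(demonym)
    return info


# Hypothetical category list for a single page.
example = ["1950 births", "American male comedians", "American stand-up comedians"]
print(parse_categories(example))
```

A page with no matching category returns `None` or an empty set, which is the false-negative or missing-data case described above.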

Nationality and Jewishness

No attempt was made to estimate detailed ethnicities or nationalities, but a few were estimated based on categories. These were American (69%), British (18%), Canadian (6%), Australian (3%), New Zealander (<1%, 1 person), and Scandinavian (<1%, 2 persons). Together these shares sum to about 97%. This figure is somewhat misleading as a measure of coverage, because a comedian can be assigned more than one nationality: 91% were assigned one, 2.98% were assigned two, 0.14% were assigned three, and 6% were assigned none. For the English-speaking countries, the fractions of comedians in the sample were approximately proportional to the countries' population sizes. This is consistent with the null model that no English-speaking country is particularly better than the others at being funny.²
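The gap between the summed shares and the actual coverage can be checked with a short calculation. The Python sketch below uses only the percentages reported above; the variable names are illustrative. It suggests that the 97% figure corresponds to the average number of nationality labels per comedian rather than to the fraction of comedians with at least one label.

```python
# Share of comedians by number of assigned nationalities (reported values).
share_by_count = {0: 0.06, 1: 0.91, 2: 0.0298, 3: 0.0014}

# Each comedian with k labels is counted k times when the shares are summed.
labels_per_person = sum(k * p for k, p in share_by_count.items())

# Coverage counts each comedian at most once.
coverage = sum(p for k, p in share_by_count.items() if k >= 1)

print(round(labels_per_person, 4), round(coverage, 4))
```

The reported shares themselves add up to slightly more than 100% because of rounding, so both results are approximate.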

It has been claimed that (female) comedians tend to be Jewish (Hitchens, 2007), so this was investigated. Jewishness is not a nationality, so it is not included in the previously reported numbers. In fact, a cross tabulation showed that 89% of Jews in the sample of comedians were Americans. In total, 17% of the sample was Jewish. There was no noteworthy effect of sex (17% among males, 15% among females).

Year of birth and death

It is possible to estimate each comedian's year of birth/death by either looking for birth/death information in the free text, or by relying on the categories. …

