Academic journal article Population

The Vocabulary of Demography, from Its Origins to the Present Day: A Digital Exploration

Article excerpt

Textual analysis begins with a count of how frequently words appear in a text or a corpus of texts. Ngram Viewer provides a means to do this on an all-inclusive scale, by counting the annual occurrences of words or word groups in the entire corpus of books contained in the Google Books database, which currently includes more than seven million titles published over five centuries in eight languages. What are the strengths and weaknesses of this new tool? What does it teach us about the diffusion of words, in the field of demography especially, over the last century? How is word usage in the Google Books corpus linked to historical events? Analysing the rise to prominence or the demise of certain demographic terms, François Héran discovers that the vocabulary of formal demography has fallen out of favour, replaced by new themes and new notions that reflect the broadening horizons of population science.

A mere generation ago, who could have imagined that the millions of books printed since Gutenberg might be instantly accessible, and provide us with an immediate record of the frequency of word usage over a period of several centuries? First brought online on 16 December 2010, and updated in November 2013, Ngram Viewer uses the Google Books database to explore the lexicon of eight million volumes published since the sixteenth century. It instantly counts the occurrences of words or word groups in eight languages (nine, if British and American English are counted separately), opening the door to an infinite number of queries. The use of this application to study the vocabulary of demography has been tested on two occasions, with the findings published in Population and Societies (Héran, 2013), and in Demographic Research (Bijak et al., 2014). While pointing up the limits of Ngram Viewer, these first two experiments have demonstrated its potential value for demographers.
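
To make the principle of counting annual occurrences of words or word groups concrete, the following minimal sketch computes, for each publication year, the share of all word groups of the same length that match a given phrase. It is an illustration only and not Google's actual processing chain: the function names, the letters-only tokenization rule and the three-sentence toy corpus are all assumptions introduced here.

```python
import re
from collections import Counter


def ngrams(text, n):
    """Return every sequence of n consecutive words in text, lowercased.

    Tokenization is an assumption of this sketch: a word is any run of
    letters, so digits and punctuation are discarded.
    """
    tokens = re.findall(r"[^\W\d_]+", text.lower())
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def yearly_relative_frequency(corpus, phrase):
    """Relative frequency of phrase per year.

    corpus is an iterable of (year, text) pairs. For each year, the result
    is the number of occurrences of phrase divided by the total number of
    n-grams of the same length found in that year's texts.
    """
    n = len(phrase.split())
    target = " ".join(phrase.lower().split())
    hits = Counter()
    totals = Counter()
    for year, text in corpus:
        grams = ngrams(text, n)
        totals[year] += len(grams)
        hits[year] += grams.count(target)
    # Years with no n-grams at all are skipped to avoid dividing by zero.
    return {y: hits[y] / totals[y] for y in sorted(totals) if totals[y]}


# Hypothetical toy corpus, invented for illustration.
toy_corpus = [
    (1950, "The birth rate fell while the death rate rose"),
    (1950, "A falling birth rate worries demographers"),
    (2000, "Migration and ageing now dominate population studies"),
]

freq = yearly_relative_frequency(toy_corpus, "birth rate")
for year, share in freq.items():
    print(year, round(share, 4))
```

Dividing by the yearly total rather than reporting raw counts matters because the number of books published grows strongly over time. Without that normalization, almost any term would appear to rise.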

This article proposes a more detailed historical exploration of the vocabulary of demography, focusing on moments of innovation or abrupt change, of revival or decline. How do these patterns of usage shed light on the place of demography in the body of scientific knowledge? Here, the exploration initiated in the two above-mentioned articles will focus more closely on methodological aspects, starting with two simple questions: How does Google Books select the documents to be included in its database? And how do we move from this corpus to that of Ngram Viewer? Does the omission of scientific journals from the corpus bias the results or, on the contrary, does it shed light on the relationships between demography and society, demography and culture? After an attentive look at the capabilities and limitations of Ngram Viewer using examples taken from demography and elsewhere, we will move on to a sensitive question: is demography really losing ground in written culture? If so, what are the reasons for this, and how could the situation be remedied?

I. Ngram Viewer, or how to search through an ocean of words

1. The unstoppable advance of Google Books

We cannot produce or analyse Ngram Viewer's graphs without first understanding how they are constructed and, before that, how the source it exploits - the Google Books digital library - is organized. Explanations can be found on the official websites and on some institutional blogs. The most detailed information, including a description of some of Google's methods, is contained in the technical appendix to the article in which Ngram Viewer was first presented (Michel et al., 2010). Some manuals have also recently taken an interest in this application (Hai-Jew, 2014). But these disparate elements need to be drawn together and, if possible, assessed with a critical eye.

First, it is important to distinguish between Google Books and Ngram Viewer. Google, the American search engine giant, launched the Google Books programme in October 2004 with the aim of digitizing as many as possible of the books produced since the invention of printing, starting with the major libraries of the United States and Europe (Library Project), then moving on to publishers' catalogues (Partner Program). …
