Behind the Word Clouds: Electronic Text, Machine Reading and Corpus Linguistics: Tim Shortis Argues That Corpus Linguistics Is Changing Knowledge about Language, and Explains the Theory Behind It and Its Potential for the Classroom

By Shortis, Tim | English Drama Media, October 2009

A revolution in knowledge about language

Hyperbole comes easily in the excited discourse around the impact of ICTs and their ongoing penultimate promises. So it is with caution that I suggest that a quiet revolution is going on in what counts as knowledge about language and meaningful reading, and that little of this has so far permeated what is done in school English lessons. This may be about to change as we come face to face with ever-larger collections of electronically mediated text and exemplification of new methods for reading them.

The new and ever-larger collections of resources are apparent, the means of reading them less so, although recent higher education research may point a way. UK Public Library memberships now offer free home access to the full Oxford English Dictionary (www.oed.com) along with digital archives of contemporary and historical newspapers. Agencies such as the National Archive, the Old Bailey, JISC and The British Library have all put significant collections of searchable text online. Some of these are plain text, some facsimiles, and some both. For example, the Oxford University/JISC collaboration's magnificent World War 1 collection (http://www.oucs.ox.ac.uk/ww1lit/) offers 5,000 textual artefacts, mainly in facsimile form but with searchable words-only transcripts. All these collections involve engaging with a different order of textual scale and will require different kinds of literacy from being curled up in your chair reading a book under the light of an Anglepoise, although that will, of course, remain important. The question then is how we, as English teachers, as a professional community of practice specializing in the learning of literacy, are to respond to these changes in the representational resources of the written word. What is our responsibility as such archives become available to students and future citizens, including our role in equipping those in our care to understand and resist the abuse of the associated technologies of machine-reading in its aggressive forms: data-mining for commercial and political exploitation, with its infringements of privacy, for example?

The data-driven study of very large collections of electronic text, assisted by the machine reading capacities of computers, or corpus linguistics as it has become known, has transformed understanding of core domains of language study, and even of the concepts thought to be required to study it. Linguists have re-considered the relationship of speech to writing (Biber, 1991), the actual nature of informal spoken interaction, including its 'grammar' and routine creativity (see Carter and McCarthy 1997, Carter 2004), gendered patterns in computer mediated communication (Herring 1996, or her homepage), the relationship of text messaging to spoken language (Caroline Tagg, forthcoming), the histories of languages, including the actual levels of standardisation over time, and the forensic identification of the Unabomber (see Coulthard and Johnson, 2007). Corpus linguistics has tested core concepts about language to the point of their destruction, sometimes having to reach for new terms and concepts to make sense of what is being found.

While the juggernaut of the National Strategy has been 'rolling out' the grammatical terminology of the 1960s in schools in England (Alexander 2007), linguists have been asking first questions about the role of grammar and its structuring. This has even led to 'a completely new theory of language based on how words are used in the real world': lexical priming (Hoey 2005). This argues, counter to the received wisdom of countless linguistic authorities, that 'vocabulary is complexly and systematically structured and that grammar is an outcome of this lexical structure'. Machine-reading has enabled linguists to perceive such deeper patterns in language in 'collocation': the property of language whereby two or more words seem to appear frequently in each other's company (e.
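The idea of collocation described above can be made concrete with a small machine-reading sketch. What follows is an illustration only, not a method taken from the article or from Hoey: it assumes a very simple tokenizer, a made-up sample text, and hypothetical function names (`tokenize`, `collocates`), and it merely counts which words turn up within a fixed window of a chosen "node" word. Real corpus tools work on far larger collections and usually apply statistical measures of association rather than raw counts.

```python
from collections import Counter
import re


def tokenize(text):
    """Lowercase the text and split it into word tokens (letters and apostrophes only)."""
    return re.findall(r"[a-z']+", text.lower())


def collocates(tokens, node, window=2):
    """Count the words occurring within `window` tokens either side of `node`.

    This is a raw co-occurrence count, an assumption made for illustration;
    it is not a statistical measure of collocational strength.
    """
    counts = Counter()
    for i, tok in enumerate(tokens):
        if tok == node:
            lo = max(0, i - window)
            hi = min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    counts[tokens[j]] += 1
    return counts


# Invented sample text, standing in for a real corpus.
sample = (
    "She made a strong cup of tea. He prefers strong tea in the morning. "
    "They drank strong tea and talked about the strong wind."
)

tokens = tokenize(sample)
top = collocates(tokens, "strong", window=1)

# List the neighbours of "strong", most frequent first.
for word, freq in top.most_common():
    print(word, freq)
```

Even in this tiny sample, "tea" keeps company with "strong" more often than any other word does. Scaled up to millions of words, the same kind of counting is what allows patterns of this sort to become visible.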
