Comparing English Worldwide: The International Corpus of English

By Sidney Greenbaum

15
The Corpus as a Research Domain

GRAEME KENNEDY

The design and development of machine-readable corpora and tools for their analysis has been a major preoccupation of corpus linguistics for more than three decades. During this period there have been massive changes in the capacity and speed of computers, an increasing use of microcomputers with CD-ROM as the basis for storage, the development of new and faster means of text capture through optical scanning, and the development of more sophisticated software packages for the analysis of corpora. At the same time there have been continuing issues in the design and use of corpora. How big should a corpus be to provide a valid and reliable picture of how a language is structured and used? What aspects of a language can be validly and reliably described using a corpus of a particular size? Can a corpus be designed to be a representative sample of a language 'as a whole'? Should particular genres be represented in a corpus and, if so, which genres? What are the respective roles of automatic and manual analysis in corpus-based research?

As Quirk (1992) has noted, machine-readable corpora have grown in size from the one-million-word standard of the Survey of English Usage (SEU) Corpus and the Brown Corpus of the early 1960s to the 100-million-word British National Corpus (BNC) of the 1990s. The most recent developments, including vast monitor corpora of potentially unlimited size and the International Corpus of English (ICE), promise to open up new directions in the use of corpora.

Although there have been numerous corpus-based studies of English completed since 1960, the changes in technology mentioned above and issues in the design and development of corpora have in a sense been necessary prerequisites for systematic corpus-based research. It is just such systematic research, involving comprehensive lexical and grammatical description and comparisons across genres, registers, and major regional varieties, which the ICE project will encourage and facilitate. The purpose of this paper is to outline some matters which might be considered part of a research agenda for the ICE corpus.

To a considerable extent, the size, nature, and structure of machine-readable corpora and the associated software determine the kind of linguistic research which can be undertaken. Obviously, for example, however big a corpus may be, if it includes only written texts, it cannot reasonably be used as a basis for the description of the 'language as a whole' nor for lexical, grammatical, or discourse characteristics of
