Data Mining in the Humanities and Social Sciences

By Porritt, Glyn | Information Today, December 2015


We are all aware that the past 20 years have seen an explosion in the amount of research material and primary sources found online. However, this revolution in the availability of information naturally presents problems, not only for a range of researchers, but also for librarians as guardians and promoters of its use. How does one get the most value out of what has been collectively termed Big Data?

Data mining, often referred to as text analysis, is a concept of growing prominence that is closely associated with Big Data. Most people in the industry have heard of it, but the term can mean different things to different people.

The core concept is that computer software applies automated analytical techniques to interrogate datasets for patterns, trends, and other useful information. Carried out by hand, this work would typically be extremely labor-intensive to complete, or difficult even to conceive, with traditional human research.

One of the key assumptions therein is that the data is available to the software to carry this out. Indeed, the more controversial aspects of data mining revolve around the practicalities and responsibilities related to making information available to the software as well as the software "mining" that data from sources in order to analyze it. However, before addressing this in more detail, let's reflect first on the exciting possibilities that text analysis can bring.

The Benefits

Data mining has the potential to empower a new breed of humanities scholar and to change how he or she approaches research. Applications range from the relatively complex, such as software that can recognize syntax in order to analyze literary composition, to the simpler illustration of new pathways and associations.

For example, traditional online searching of a well-cataloged criminal court record archive would return thousands of cases containing a certain crime. Hits are highlighted, but that is a matter primarily of discovery. Where does the undergraduate go from there?

Imagine if text analysis had been harnessed to immediately present all the crimes tried in order of frequency. One could focus on how conviction or sentencing rates changed based on gender, occupation, location, or social class. How did these trends change over time, and do they reflect key social and economic events such as economic depressions or a demobilized army? This type of text analysis is an end in itself, but it also brings the material to life.

The consistency and quality of the data are of great importance. If you are reliant on electronic excavation of information and trends, the basics must be correct in order to avoid erroneous results. Full-text accuracy is critical. Tagging text and data with useful identifiers can also become invaluable when applied at scale.
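To make the point about identifiers concrete, here is a minimal sketch of tagging terms in a transcribed passage. The vocabulary, the labels, and the `tag_text` function are assumptions invented for this example; production projects typically rely on richer markup schemes and curated authority lists.

```python
import re

# Hypothetical vocabulary mapping surface terms to identifier labels.
VOCABULARY = {
    "theft": "CRIME",
    "forgery": "CRIME",
    "labourer": "OCCUPATION",
    "London": "PLACE",
}


def tag_text(text, vocabulary):
    """Return (matched term, label, start offset) for each vocabulary hit."""
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(term) for term in vocabulary) + r")\b",
        re.IGNORECASE,
    )
    # Case-insensitive lookup from matched text back to its label.
    lookup = {term.lower(): label for term, label in vocabulary.items()}
    return [
        (m.group(0), lookup[m.group(0).lower()], m.start())
        for m in pattern.finditer(text)
    ]


sample = "John Smith, labourer, of London, was tried for theft."
for term, label, start in tag_text(sample, VOCABULARY):
    print(f"{start:>3}  {label:<10}  {term}")
```

Because matching is exact, a transcription error such as "thcft" would simply go untagged in this sketch, which is one way inaccurate full text turns into erroneous results.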

Collaboration across skill sets in the sector is important to get the most value out of the data. Many are already heavily involved in the digital humanities, but software development and manipulation may not be the natural home of the arts scholar. However, this should in no way deter anyone from using text analysis. The crossover already exists in many areas, not least in the creation of existing online collections, and this relationship should be expanded.

Copyright Legislation

An Association of Research Libraries (ARL) issue brief (bit. …
