Mining Meets the Web

By Zorn, Peggy; Emanoil, Mary et al. | Online, September-October 1999

Peggy Zorn (Peggy.Zorn@wl.com) and Mary Emanoil (Mary.Emanoil@wl.com) are Manager, Document Administration Services, and Documentum Consultant, respectively, at Parke-Davis Pharmaceutical Research. Lucy Marshall (edgeinfo@daneris.com) is with Edge Information Services, and Mary Panek (Mary.Panek@carrier.utc.com) is Information Manager at United Technologies Research Center.

For most industries, very large databases holding critical information are not new, and pulling out the data that is needed, when it is needed, is an age-old challenge. It is estimated that typical Fortune 500 companies manage over a terabyte of electronic information each day, with annual growth projected at 57% [1]. Most companies track sales, marketing, and other financial data in large databases, often referred to as "data warehouses." These databases allow employees to retrieve specific portions of the data or to run statistical tests on the data to forecast trends. Data mining technologies have been the standard choice for retrieving information from these types of databases, and their use is expanding, with well-established vendors such as SAS and Oracle providing the major products. Data mining use is increasing due to a variety of factors:

* Trend toward use of "data warehouses" (see sidebar) for consolidation and management of large sets of related data in organizations

* Explosion in amount of information that is captured electronically

* Dramatic price decreases in data storage hardware

* Focus on knowledge management in organizations, which has increased pressure to share and use captured electronic data as a competitive advantage

WHAT IS DATA MINING?

Data mining can be defined as analyzing the data in large databases to identify trends, similarities, and patterns to support managerial decision making. Data mining technologies generally use algorithms and advanced statistical models to analyze data according to rules set forth by the particular application at hand. Data mining models fall into three basic categories: classification, clustering, and associations and sequencing (see Figure 1).

* Classification--involves analyzing data and assigning it to predefined concept categories or "tags," based on predefined rules. Automatically assigning controlled vocabulary terms to records based on word occurrence is an example of classification.

* Clustering--similar to classification in that different concept categories are identified through analysis of the data using distance or proximity measures; however, no predefined groups are used. All groups are auto-generated through patterns identified in the data. Clustering could be used to dynamically create a controlled vocabulary based on patterns present in the data and then format retrieval groups according to the vocabulary terms or concept categories.

* Associations and sequencing--generate descriptive models based on the data that identify rules to allow for prediction of future trends. Associations and sequences allow for modeling of "if, then" scenarios based on patterns identified in the data.
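The difference between the first two categories can be made concrete with a small sketch. The Python below is only an illustration under stated assumptions, not a method described in the article: the vocabulary, the sample records, the Jaccard distance measure, and the greedy grouping scheme are all hypothetical choices made for brevity. Classification assigns predefined tags based on word occurrence, while clustering forms groups from the data alone.

```python
# Hypothetical controlled vocabulary (assumed): tag -> trigger words.
VOCABULARY = {
    "finance": {"sales", "revenue", "profit"},
    "pharma": {"drug", "clinical", "trial"},
}


def tokens(text):
    """Lowercase a record and split it into a set of words."""
    return set(text.lower().split())


def classify(text):
    """Classification: assign predefined tags whose trigger words occur."""
    words = tokens(text)
    return sorted(tag for tag, triggers in VOCABULARY.items() if words & triggers)


def jaccard_distance(a, b):
    """Distance between two word sets: 1 - |intersection| / |union|."""
    union = a | b
    return 1.0 - len(a & b) / len(union) if union else 0.0


def cluster(records, threshold=0.7):
    """Clustering: group records without any predefined categories.

    Single-pass greedy scheme (an assumption chosen for simplicity): a record
    joins the first cluster whose first member lies within `threshold`;
    otherwise it starts a new cluster.
    """
    clusters = []
    for record in records:
        words = tokens(record)
        for group in clusters:
            if jaccard_distance(words, tokens(group[0])) <= threshold:
                group.append(record)
                break
        else:
            clusters.append([record])
    return clusters


# Hypothetical sample records.
records = [
    "quarterly sales revenue report",
    "annual sales revenue summary",
    "clinical trial drug results",
    "drug trial safety results",
]

print(classify(records[0]))
print(cluster(records))
```

Note that `classify` can only ever return tags listed in `VOCABULARY`, whereas `cluster` produces however many groups the distance threshold implies. Production tools would typically use more robust measures and algorithms than this greedy pass.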

All data mining models can be predictive and are often used for forecasting of future behavior.
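As one illustration of how the "if, then" rules of the associations category might be derived, the sketch below scores candidate rules with support and confidence. These two measures are standard in the association-rule literature but are not named in the article, and the transactions and thresholds are hypothetical values chosen for the example.

```python
from itertools import permutations

# Hypothetical transactions (assumed data): items recorded together.
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]


def support(itemset):
    """Fraction of transactions that contain every item in `itemset`."""
    return sum(itemset <= t for t in transactions) / len(transactions)


def confidence(antecedent, consequent):
    """Estimate of P(consequent | antecedent) from the transactions."""
    base = support(antecedent)
    return support(antecedent | consequent) / base if base else 0.0


def rules(min_support=0.5, min_confidence=0.6):
    """Return (if_item, then_item, support, confidence) for single-item rules."""
    items = sorted(set().union(*transactions))
    found = []
    for a, b in permutations(items, 2):
        s = support({a, b})
        c = confidence({a}, {b})
        if s >= min_support and c >= min_confidence:
            found.append((a, b, s, c))
    return found


for a, b, s, c in rules():
    print(f"if {a} then {b}: support={s:.2f}, confidence={c:.2f}")
```

Each surviving rule reads as "if a record contains the first item, then it tends to contain the second," which is the kind of pattern that can then be applied to new data for prediction.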

Data mining applications in the sales and marketing, actuarial, strategic planning, and risk-benefit analysis areas are prevalent. Analysis of sales data over a period of time can be used to predict future consumer trends and expected profit levels. Predictive statistical models run against huge databases form the basis of actuarial work in all areas, and strategic planning and risk-benefit analysis rely heavily on analysis of large sets of past data to forecast future trends. While these applications for data mining are maturing, a growing market for mining and management of Web data is emerging. …
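The sales forecasting use mentioned above can be sketched in its simplest form as a straight-line trend fitted by least squares. This is a minimal illustration, assuming evenly spaced periods and made-up sales figures. The predictive models used in practice would typically be far richer, and nothing here is taken from the article itself.

```python
def fit_line(values):
    """Least-squares slope and intercept for values at periods 0, 1, 2, ...

    Requires at least two values.
    """
    n = len(values)
    xs = range(n)
    mean_x = sum(xs) / n
    mean_y = sum(values) / n
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, values)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    intercept = mean_y - slope * mean_x
    return slope, intercept


def forecast(values, periods_ahead=1):
    """Extrapolate the fitted line `periods_ahead` periods past the last value."""
    slope, intercept = fit_line(values)
    return intercept + slope * (len(values) - 1 + periods_ahead)


# Hypothetical quarterly sales figures (assumed data).
sales = [100, 110, 120, 130]
print(forecast(sales))
```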
