Unsupervised Learning Aided by Clustering and Local-Global Hierarchical Analysis in Knowledge Exploration

By Zhang, Yihao; Orgun, Mehmet A. et al. | Journal of Digital Information Management, August 2007

Zhang, Yihao; Orgun, Mehmet A.; Lin, Weiqiang. Journal of Digital Information Management


ABSTRACT: Unsupervised learning plays an important role in knowledge exploration and discovery. The basic task of unsupervised learning is to find latent variables or relationships in a given dataset without any assumed regularities or patterns. In this paper we apply two advanced models, clustering analysis and hierarchical analysis, to accomplish unsupervised learning. K-Means clustering shows its strength in large-scale clustering: the original data can be preprocessed and the potential variables targeted. Correlations among these variables are then explored in the subsequent sets by Local-Global Hierarchical Analysis (LGHA), which proceeds in three main steps. In the first step, we use a structural approach to find qualitative patterns in the given variables. The second step applies a quantitative algorithm to find quantitative patterns in those variables. The third and last step generates global hybrid patterns by combining the local patterns obtained from the first two steps according to a certain criterion. Both the K-Means and LGHA models are applied in experiments with real-world longitudinal medical datasets.

Keywords: Hierarchical Analysis, K-Means Clustering, Unsupervised Learning, Knowledge Exploration

Categories and Subject Descriptors: I.2.4 [Knowledge Representation Formalisms and Methods]; I.2.6 [Learning]; I.5.3 [Clustering]

General Terms

Knowledge representation, Learning classification

1. Introduction

From a traditional point of view, knowledge exploration can be categorized into supervised learning and unsupervised learning (Jordan and Jacobs 1994). Over the last decade, there has been active research on supervised learning approaches and techniques, in which class information is available before any knowledge exploration takes place. The most widely used approach is to obtain a predetermined independent measurement in order to preferentially target classes. A classification algorithm is then applied in the data pre-processing stage (Liu and Motoda 1998, Liu and Yu 2005). However, this approach is not robust enough to be applied effectively to features with irregular sizes or to nonrecurring, high-dimensional variables.

Unsupervised learning is a recent approach in knowledge exploration. It is widely used on unlabeled data, for example to extract the relevance that exists among records. Unsupervised learning is an important supplementary method for categorizing data, since it can increase the precision of clustering results. Unlike supervised learning, unsupervised learning attempts to find the most reasonable patterns by best uncovering relationships instead of using preferential classification labels (Dy and Brodley 2000, 2004). Because the idea behind unsupervised learning is to run an unsupervised algorithm on raw data (Kohavi and John 1997), most researchers consider the applications of data clustering and data reduction (including dimension reduction, size reduction, etc.) as two key issues in the framework of knowledge exploration. The use of an unsupervised learning method can save time in data processing by removing the matching and ranking process used for specified classes and by avoiding redundant analysis.

In this paper, we propose to combine two models to achieve unsupervised learning. K-Means Clustering Analysis (K-Means) is used to partition the original data according to a certain criterion. As a robust model, K-Means semi-automatically generates clusters and assigns data to different clusters. The data within these clusters are labelled before we collect the observational sets.
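The partitioning step can be illustrated with a minimal sketch of the standard K-Means procedure (Lloyd's algorithm), in which the criterion is assumed to be squared Euclidean distance to the nearest centroid. The excerpt does not specify the authors' criterion, initialization, or data layout, so the function name `kmeans`, its parameters, and the toy two-dimensional records below are hypothetical and serve only as an illustration.

```python
import random


def squared_distance(p, q):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k, init=None, max_iter=100, seed=0):
    """Partition `points` into k clusters (illustrative sketch only).

    Assumption: the clustering criterion is squared Euclidean distance.
    `init` optionally supplies k starting centroids; otherwise k records
    are drawn at random from the data.
    """
    if init is not None:
        centroids = [tuple(c) for c in init]
    else:
        centroids = random.Random(seed).sample(points, k)
    labels = None
    for _ in range(max_iter):
        # Assignment step: label each record with its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the centroids are too
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = tuple(sum(d) / len(members) for d in zip(*members))
    return labels, centroids


# Hypothetical toy records: two well-separated groups of measurements.
points = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (8.0, 8.0), (8.2, 7.9), (7.9, 8.1)]
labels, centroids = kmeans(points, k=2, init=[points[0], points[3]])
print(labels)  # [0, 0, 0, 1, 1, 1]
```

The returned labels play the role of the cluster labels attached to the data before the observational sets are collected. With random initialization the procedure may converge to a different local optimum, which is why the usage above fixes the starting centroids.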

Local-Global Hierarchical Analysis (LGHA) attempts to discover accurate and relevant correlations from observational data (Lin and Orgun 2000, Lin and Orgun 2004, Lin et al. 2000, Zhang et al. 2006). …
