Academic journal article
By Yu, Gong
Communications of the IIMA , Vol. 7, No. 4
Social network analysis is a new research field in data mining. The clustering in social network analysis is different from traditional clustering. It requires grouping objects into classes based on their links as well as their attributes. While traditional clustering algorithms group objects only based on objects' similarity, and it can't be applied to social network analysis. So on the basis of BSP (business system planning) clustering algorithm, a social network clustering analysis algorithm is proposed. The proposed algorithm, different from traditional clustering algorithms, can group objects in a social network into different classes based on their links and identify relation among classes.
Social network analysis, which can be applied to analysis of the structure and the property of personal relationship, web page links, and the spread of messages, is a research field in sociology. Recently social network analysis has attracted increasing attention in the data mining research community. From the viewpoint of data mining, a social network is a heterogeneous and multi-relational dataset represented by graph (Han & Kamber, 2006).
Research on social network analysis in the data mining community includes following areas: clustering analysis (Bhattacharya & Getoor, 2005; Kubica, Moore and Schneider, 2003), classification (Lu & Getoor, 2003), link prediction (Liben-Nowell & Kleinberg, 2003; Krebs, 2002). Other achievements include PageRank (Page, Brin, Motwani and Winograd, 1998) and Hub-Authority (Kleinberg, 1999) in web search engine.
In this paper, clustering analysis of social network is studied. In the second section, a social network clustering algorithm is proposed based on BSP clustering algorithm. The algorithm can group objects in a social network into different classes based on their links, and it can also identify the relations among classes. In the third section, an example of social network clustering algorithm is presented, and then the conclusion and the future work direction are given.
SOCIAL NETWORK ANALYSIS BASED ON BSP CLUSTERING
There has been extensive research work on clustering in data mining. Traditional clustering algorithms (Han & Kamber, 2006) divide objects into classes based on their similarity. Objects in a class are similar to each other and are very dissimilar from objects in different classes.
Social network clustering analysis, which is different from traditional clustering problem, divides objects into classes based on their links as well as their attributes. The biggest challenge of social network clustering analysis is how to divide objects into classes based on objects' links, thus we need find algorithms that can meet this challenge.
The BSP (business system planning) clustering algorithm (Gao, Wu and Yu, 2002) is proposed by IBM. It designed to define information architecture for the firm in business system planning. This algorithm analyses business process and their data classes, cluster business process into sub-systems, and define the relationship of these sub-systems.
Basically BSP clustering algorithm uses objects (business processes) and links among objects (data classes) to make clustering analysis. Similarly social network also includes objects and links among these objects. In view of the same pre-condition, the BSP clustering algorithm can be used in social network clustering analysis.
According to graph theory, social network is a directed graph composed by objects and their relationship. Figure 1 shows a sample of social network, the circle in the figure represents an object; the line with arrow is an edge of the graph, and it represents directed link between two objects, so a social network is a directed graph.
[FIGURE 1 OMITTED]
In figure 1, let [O.sub.i] be an object in social network (i = 1 ... m), let [E. …