Academic journal article Kuram ve Uygulamada Egitim Bilimleri

Feature Extraction and Learning Effect Analysis for MOOCS Users Based on Data Mining

Article excerpt

(ProQuest: ... denotes formulae omitted.)

Massive open online courses (MOOCs) are becoming an emerging research area for education analytics (Breslow, Pritchard & Deboer, 2013). Modeling the behavior of MOOC users helps to understand user demands better, so that analytics results more conducive to learning can be offered. The data that users generate on MOOC platforms make it possible to study learning effects, predict dropout rates, and design targeted interventions for teaching guidance (Lockyer, Heathcote & Dawson, 2013).

Despite the increasing number of MOOC users, the behaviors of these users on MOOC platforms share some common features. By extracting these behavioral features from large amounts of data, we can use machine learning algorithms for prediction.

In this paper, we first propose some input features and learning behavior features extracted from the data of MOOC platforms. Then, two machine learning algorithms for predicting dropout from MOOC courses are proposed, based on support vector machines and artificial neural networks, respectively.

The rest of the paper is organized as follows. Related work on the machine learning algorithms used to predict dropout, and on the various types of features they use, is discussed in Section 2. The proposed machine learning algorithm based on artificial neural networks is described in Section 3. Another machine learning algorithm, based on support vector machines, is proposed in Section 4. The experiments and analysis are given in Section 5. Finally, Section 6 presents the conclusions.

Related work

With the popularity of MOOCs, there have been many studies that predict the future behavior of MOOC users by extracting a wide variety of features from user behavioral data and applying various machine learning algorithms.

Wen et al. used a linguistic algorithm to analyze MOOC forum data and discovered valuable features for predicting user dropout (Sinclair & Kalvala, 2016). Brinton, Buccapatnam & Chiang (2015) relied on the clickstream of video lectures in MOOCs to capture users' behavioral patterns, which were used to construct information processing indicators for users. Hughes & Dobbins (2015) proposed a hidden Markov model and related features, such as the number of forum posts and the percentage of lectures watched, to predict user attrition in MOOCs. Sinha, Li & Jermann (2014) used graph theory to capture sequences of active and passive user behavior and used graph metrics as features for predicting dropout. Amnueypornsakul, Bhat & Chinprutthiwong (2014) proposed a new prediction model using quiz-related and behavior-related features. Kloft, Stiehler & Zheng (2014) extracted more than 15 features representing user behavior from clickstream logs.
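As a rough illustration of how behavioral features such as the number of forum posts and the percentage of lectures watched might be derived from a clickstream log, the following minimal sketch may help. The log layout, the event names (`video_play`, `forum_post`), and the function `extract_features` are assumptions made for this example; they are not taken from any of the cited studies or from a particular MOOC platform.

```python
from collections import defaultdict

# Hypothetical clickstream records: (user_id, event_type, item_id).
# The event names and the log layout are assumptions for illustration.
SAMPLE_LOG = [
    ("u1", "video_play", "lec1"),
    ("u1", "video_play", "lec2"),
    ("u1", "forum_post", "thread7"),
    ("u2", "video_play", "lec1"),
    ("u2", "video_play", "lec1"),  # repeat view of the same lecture
]


def extract_features(log, total_lectures):
    """Return {user_id: {"forum_posts": int, "pct_lectures_watched": float}}."""
    posts = defaultdict(int)
    watched = defaultdict(set)
    users = set()
    for user_id, event_type, item_id in log:
        users.add(user_id)
        if event_type == "forum_post":
            posts[user_id] += 1
        elif event_type == "video_play":
            watched[user_id].add(item_id)  # a set counts each lecture once
    return {
        user: {
            "forum_posts": posts[user],
            "pct_lectures_watched": 100.0 * len(watched[user]) / total_lectures,
        }
        for user in sorted(users)
    }


features = extract_features(SAMPLE_LOG, total_lectures=4)
```

Counting distinct lectures rather than raw play events keeps repeated views of one lecture from inflating the percentage; whether that is the right choice depends on how a given study defines "lectures watched".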

All of the above approaches use different machine learning algorithms, including logistic regression, support vector machines, hidden Markov models and random forests. These algorithms have one point in common: they all require feature extraction.

Feature extraction is also called attribute selection. It is the process of selecting a subset of relevant features for the predictive problem during model construction (Ahmed, Qahwaji & Colak, 2013). Feature extraction can be used to identify the attributes in a dataset that actually contribute to the accuracy of the predictive model. Rossi & Gnawali (2014) studied discussion threads in Coursera MOOC forums and used machine learning techniques with feature extraction to analyze and classify them. Zhang, Yang & Huang (2017) designed a personalized MOOC recommendation system based on feature extraction that can recommend the most suitable course to each user.
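One simple way to select a subset of relevant features is a filter method that ranks each feature by the absolute value of its Pearson correlation with the dropout label and keeps the top k. The sketch below illustrates that idea only; it is not the selection procedure used in this paper or in the cited works, and the toy data, the feature names, and the helper `select_features` are invented for the example.

```python
import math


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def select_features(columns, labels, k):
    """Keep the k feature names whose |correlation| with the labels is highest."""
    scores = {name: abs(pearson(values, labels)) for name, values in columns.items()}
    ranked = sorted(scores, key=lambda name: scores[name], reverse=True)
    return ranked[:k]


# Toy data, invented for illustration: 1 = dropped out, 0 = stayed.
labels = [1, 1, 0, 0]
columns = {
    "pct_lectures_watched": [10.0, 20.0, 80.0, 90.0],  # strongly related to dropout
    "forum_posts": [0, 2, 1, 3],                        # weakly related
    "account_age_days": [5, 5, 5, 5],                   # constant, carries no signal
}
selected = select_features(columns, labels, k=2)
```

A correlation filter only captures linear, one-feature-at-a-time relationships, so it is best read as a cheap first pass; features that matter only in combination would need a wrapper or embedded method instead.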

Compared with the existing literature, the objectives of our feature extraction are to improve the prediction performance of the predictors and to provide faster and more cost-effective predictors. …
