Data pre-processing: a case study in predicting student’s retention in MOOC

  • N. Mohamad
  • N.B. Ahmad
  • S. Sulaiman
Keywords: data pre-processing, e-learning, massive open online course, student’s retention, prediction.


Data pre-processing is a crucial phase prior to any analytic task, yet it is rarely discussed,
especially for e-learning data, which is multilevel. Reliable data pre-processing is essential
for producing a quality dataset. This study therefore investigates the problems that arise in
data pre-processing, in this case when identifying the significant factors needed to implement
a prediction task. A MOOC dataset is selected for the data pre-processing task. The process of
generating the summary of the dataset is explained, and the ultimate aim is to produce a
dataset with features that are ready for data mining tasks. The study also proposes a process
model and suggestions that can be applied to support more comprehensible tools for end users
in the educational domain. As a result, data pre-processing becomes more efficient for
predicting student’s retention in MOOC.

Journal Identifiers

eISSN: 1112-9867