In any distance learning environment, the ability to predict a student’s
performance is very important, because it helps teachers and tutors identify
students with different capabilities. In university education, where many
students follow their studies through open and distance learning, an
identification process is needed to measure whether students are achieving the
required level of performance. Otherwise, given the nature of distance
education, some students can lag far behind their peers. If teachers and tutors
can recognize these students at an early stage of a course module, the
necessary steps can be taken to prevent them from dropping out.
S. Kotsiantis, C. Pierrakeas, and P. Pintelas [2003] suggested an
approach that applies machine learning algorithms to LMS data in order to
prevent student dropouts in university distance education. They investigated
the efficiency of machine learning techniques in such an environment using
training data sets provided by the “informatics” course of the Hellenic Open
University.
In their research they applied five different algorithms to student data and
found that these algorithms can be used to predict student dropouts in study
programs. The five represent the most common families of machine learning
techniques: Decision Trees, Bayesian Nets, Perceptron-based Learning,
Instance-Based Learning and Rule-learning.
In their data collection process they gathered student data under two
categories of attributes: Demographic attributes and Performance attributes.
The Demographic attributes cover each student’s sex, age, marital status,
number of children and occupation. The Performance attributes were taken from
tutors’ records and cover each student’s marks on the written assignments and
their presence or absence in face-to-face meetings.
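
The two attribute categories described above could be held in a record like the one sketched below. This is only an illustration: the class name, the field names and the `to_features` helper are assumptions made here, not the schema used in the paper, and `None` is used to mark a value that has not been recorded.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class StudentRecord:
    # Demographic attributes (field names are hypothetical)
    sex: str
    age: int
    marital_status: str
    number_of_children: int
    occupation: str
    # Performance attributes from tutors' records; None marks a missing value
    assignment_marks: List[Optional[float]] = field(default_factory=list)
    meeting_attendance: List[Optional[bool]] = field(default_factory=list)
    # Label to predict: True if the student dropped out, None while unknown
    dropped_out: Optional[bool] = None


def to_features(record: StudentRecord) -> dict:
    """Flatten a record into one attribute-name -> value mapping."""
    features = {
        "sex": record.sex,
        "age": record.age,
        "marital_status": record.marital_status,
        "number_of_children": record.number_of_children,
        "occupation": record.occupation,
    }
    # One feature per written assignment and per face-to-face meeting
    for i, mark in enumerate(record.assignment_marks, start=1):
        features[f"assignment_{i}"] = mark
    for i, present in enumerate(record.meeting_attendance, start=1):
        features[f"meeting_{i}"] = present
    return features
```

Flattening the record this way keeps the demographic values, which are known at enrolment, next to the performance values, which only fill in as the course progresses.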
Each of the categories mentioned above was represented by one algorithm: C4.5
for decision trees, Naive Bayes for Bayesian networks, RIPPER for
rule-learning techniques, WINNOW for perceptron-based algorithms and finally
3-NN (3-Nearest Neighbor) for Instance-Based Learning.
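
To give a feel for the simplest of these representatives, here is a minimal sketch of 3-Nearest Neighbor classification. It assumes purely categorical attributes and counts mismatching attributes as the distance; the function names and the toy data are made up for illustration, and the paper's actual distance measure and implementation may differ.

```python
from collections import Counter


def mismatch_distance(a, b):
    """Count the attributes on which two equal-length tuples differ."""
    return sum(1 for x, y in zip(a, b) if x != y)


def knn_predict(train, query, k=3):
    """Return the majority label among the k training rows nearest to query.

    train is a list of (attributes, label) pairs.
    """
    nearest = sorted(train, key=lambda pair: mismatch_distance(pair[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Toy data (hypothetical): (first assignment result, first meeting attendance)
train = [
    (("pass", "present"), "continue"),
    (("pass", "absent"), "continue"),
    (("fail", "absent"), "dropout"),
    (("fail", "present"), "dropout"),
    (("none", "absent"), "dropout"),
]

print(knn_predict(train, ("fail", "absent")))
print(knn_predict(train, ("pass", "present")))
```

Instance-based learners like this one store the training records and defer all work to prediction time, which is why they are often used as a baseline next to model-building algorithms such as C4.5 or RIPPER.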
Prediction accuracy was the criterion used to rank the representative
algorithms. In the evaluation they found no statistically significant
difference between the algorithms, although Naive Bayes and RIPPER had better
accuracy than the others. Of these two, Naive Bayes has the advantage of a
short computation time and, importantly, the Naive Bayes classifier can accept
data with missing values as input, whereas RIPPER cannot. This indicates that
Naive Bayes is the most appropriate learning algorithm for building a software
support tool in Learning Management Systems.
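
The point about missing values can be illustrated with a small categorical Naive Bayes sketch. It is a hedged illustration rather than the authors' implementation: the class name, the toy data, the use of `None` for a missing value and the add-one (Laplace) smoothing are all assumptions made here. A missing attribute is simply skipped, both when counting during training and when scoring a new record.

```python
import math
from collections import Counter, defaultdict


class CategoricalNaiveBayes:
    """Naive Bayes over categorical attributes; None marks a missing value."""

    def fit(self, rows, labels):
        self.class_counts = Counter(labels)
        self.value_counts = defaultdict(Counter)  # (attribute index, label) -> value counts
        self.values_seen = defaultdict(set)       # attribute index -> observed values
        for row, label in zip(rows, labels):
            for i, value in enumerate(row):
                if value is None:
                    continue  # a missing value contributes no count
                self.value_counts[(i, label)][value] += 1
                self.values_seen[i].add(value)
        return self

    def predict(self, row):
        total = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, count in self.class_counts.items():
            score = math.log(count / total)  # log prior
            for i, value in enumerate(row):
                if value is None:
                    continue  # a missing value is left out of the product
                counts = self.value_counts[(i, label)]
                # Add-one smoothing so unseen values never give log(0)
                numerator = counts[value] + 1
                denominator = sum(counts.values()) + len(self.values_seen[i] | {value})
                score += math.log(numerator / denominator)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Toy data (hypothetical): (first assignment result, first meeting attendance)
rows = [
    ("pass", "present"),
    ("pass", None),
    ("pass", "present"),
    ("fail", "absent"),
    (None, "absent"),
    ("fail", "absent"),
]
labels = ["continue", "continue", "continue", "dropout", "dropout", "dropout"]

model = CategoricalNaiveBayes().fit(rows, labels)
print(model.predict(("fail", None)))     # attendance not yet recorded
print(model.predict((None, "present")))  # assignment not yet marked
```

Because each attribute contributes an independent factor, dropping the factor for an unknown attribute still leaves a usable score, which suits a setting where tutors' records fill in gradually over the course.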
Beyond this, they found that some obvious and some less obvious attributes
show a strong correlation with student performance, with some attributes
carrying more weight than others. It can also be argued that the learning
algorithms could enable tutors to predict student performance with satisfactory
accuracy long before the final examination.
Reference: S. Kotsiantis, C. Pierrakeas, and P.
Pintelas, “Efficiency of Machine Learning Techniques in Predicting Students’
Performance in Distance Learning Systems,” Citeseer, 2002.