Student Mark Prediction
Abstract
Predicting students’ academic potential is a useful strategy to mitigate failure, promote better results, and manage resources better in educational institutions. The ability to predict student performance in a course also creates opportunities to improve educational outcomes. With effective performance prediction approaches, teachers can allocate resources and training to students more accurately. Existing methods have used features that are mostly related to tests conducted in educational institutions, while features relating to students’ reading and writing are ignored.
In this paper, an effort is made to explore academic success inferred from both reading and writing scores. The Linear Regression technique is applied to predict whether a student will be able to achieve a given performance level. The data, collected from Kaggle, contain the students’ gender, ethnicity, parental level of education, lunch, reading score, and writing score. Furthermore, data transformation and pre-processing techniques were carried out to reduce the features. Additionally, Decision Tree, Naïve Bayes, and rule-based classification techniques are applied to the students’ data in order to produce the best model for predicting students’ academic performance.
Experimental results show that the proposed method significantly outperforms existing methods because it exploits the students’ reading and writing feature sets. Linear regression is the best model among the techniques considered, achieving the highest accuracy of 81.3%. The knowledge extracted from the prediction model will be used to identify and profile students in order to determine their level of success in the examination results.
INTRODUCTION
Motivation:
The economic success of any country depends heavily on making higher education more affordable, which is one of the main concerns of any government. One factor that contributes to educational expenses is the time students spend studying in order to graduate. For example, the loan debt of American students has increased because many students fail to graduate on time. In Iraq, higher education is provided to students for free by the government. Yet failure to graduate on time costs the government extra expenses. To avoid these expenses, the government has to ensure that students graduate on time. Machine learning techniques can be used to forecast students’ performance and identify at-risk students as early as possible, so that appropriate action can be taken to improve their performance. One of the most important steps when using these techniques is choosing the attributes, or descriptive features, which are used as input to the machine learning algorithm. The attributes can be categorized into GPA and grades, demographics, psychological profile, culture, academic progress, and educational background.
Proposed system:
In the proposed system, we predict the performance of students using two different data mining classifiers, namely the ID3 classifier and the Naive Bayes classifier. The main aim of the proposed system is to find the more efficient of the two classifiers. This would yield a more efficient and time-saving algorithm for prediction.
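As background on the first of these classifiers: ID3 builds a decision tree by repeatedly splitting on the attribute with the highest information gain. The sketch below shows only that selection step, not a full tree builder. The toy records and the names `prep`, `lunch`, and `result` are hypothetical and are not taken from the dataset.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(rows, attribute, target):
    """Entropy reduction obtained by splitting `rows` on `attribute`."""
    base = entropy([row[target] for row in rows])
    remainder = 0.0
    for value in {row[attribute] for row in rows}:
        subset = [row[target] for row in rows if row[attribute] == value]
        remainder += len(subset) / len(rows) * entropy(subset)
    return base - remainder


# Hypothetical toy data: attribute names and values are illustrative only.
students = [
    {"prep": "completed", "lunch": "standard", "result": "pass"},
    {"prep": "completed", "lunch": "free", "result": "pass"},
    {"prep": "none", "lunch": "standard", "result": "fail"},
    {"prep": "none", "lunch": "free", "result": "fail"},
]

# ID3 would split first on the attribute with the largest gain.
best = max(["prep", "lunch"], key=lambda a: information_gain(students, a, "result"))
print("best first split:", best)
```

In this toy example `prep` separates the two classes perfectly, so it carries all the information and `lunch` carries none.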
Description of the Data
About dataset:
The dataset used in this research was collected from the Archeology department and the Sociology department of the College of Humanities at Al-Muthanna University during the 2015 and 2016 academic years. Two data sources were used: a survey collected from the students and the students’ grade records.
Size of data:
The dataset contains 700 student records: 333 male and 367 female.
It consists of 700 rows and 9 columns.
The dataset contains twenty attributes.
Features of data:
The attributes can be divided into 8 categories, which are:
· gender,
· ethnicity,
· parental level of education,
· lunch,
· test preparation course,
· reading score,
· writing score,
· math score.
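Because several of these attributes are categorical, they have to be turned into numbers before a regression model can use them. The following is a minimal sketch of one-hot encoding using only the Python standard library. The three records are made up for illustration, and treating the math score as the prediction target is an assumption rather than something stated in the text.

```python
# Hypothetical records: column names follow the feature list above,
# but the values are invented for illustration.
records = [
    {"gender": "female", "lunch": "standard", "test preparation course": "none",
     "reading score": 72, "writing score": 74, "math score": 70},
    {"gender": "male", "lunch": "free/reduced", "test preparation course": "completed",
     "reading score": 65, "writing score": 60, "math score": 68},
    {"gender": "female", "lunch": "free/reduced", "test preparation course": "none",
     "reading score": 90, "writing score": 93, "math score": 85},
]

CATEGORICAL = ["gender", "lunch", "test preparation course"]
NUMERIC = ["reading score", "writing score"]
TARGET = "math score"  # assumption: the mark being predicted


def build_encoder(rows, columns):
    """Collect the sorted list of categories seen in each categorical column."""
    return {col: sorted({row[col] for row in rows}) for col in columns}


def encode_row(row, categories):
    """One-hot encode the categorical columns, then append the numeric ones."""
    features = []
    for col, values in categories.items():
        features.extend(1.0 if row[col] == v else 0.0 for v in values)
    features.extend(float(row[col]) for col in NUMERIC)
    return features


categories = build_encoder(records, CATEGORICAL)
X = [encode_row(r, categories) for r in records]
y = [float(r[TARGET]) for r in records]

print(X[0])
print(y)
```

Each categorical column expands into one 0/1 indicator per category, so the three categorical columns plus two numeric scores produce eight input values per student in this toy example.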
Source and Methods of Collecting Data:
Model Architecture
The models used here are:
Linear regression,
Support Vector Regressor,
Decision Tree Regression and
Random Forest Regressor.
Algorithms Applied
In the proposed system, we used different machine learning algorithms to see whether they are able to accurately predict student marks.
The regression algorithms used are listed below.
1. Linear Regression,
2. Support Vector Regressor,
3. Decision Tree Regression and
4. Random Forest Regressor.
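The four regressors listed above can be fitted and compared along the following lines. This is a sketch using scikit-learn with default hyperparameters. The data are synthetic stand-ins, not the student dataset, and using R^2 on a held-out split as the comparison metric is an assumption.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split
from sklearn.svm import SVR
from sklearn.tree import DecisionTreeRegressor

# Synthetic stand-in for the student data (an assumption, not the real dataset):
# two score-like features and a target that depends on them linearly plus noise.
rng = np.random.default_rng(0)
X = rng.uniform(40, 100, size=(300, 2))
y = 0.5 * X[:, 0] + 0.4 * X[:, 1] + rng.normal(0, 3, size=300)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

models = {
    "Linear Regression": LinearRegression(),
    "Support Vector Regressor": SVR(),
    "Decision Tree Regression": DecisionTreeRegressor(random_state=0),
    "Random Forest Regressor": RandomForestRegressor(n_estimators=100, random_state=0),
}

# Fit each model on the training split and score it on the held-out split.
scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    scores[name] = r2_score(y_test, model.predict(X_test))

for name, score in sorted(scores.items(), key=lambda item: item[1], reverse=True):
    print(f"{name}: R^2 = {score:.3f}")
```

On real data the ranking may differ from what this synthetic example produces, so the comparison should be rerun on the actual features.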
Finally, we ensemble the four best-performing algorithms and compare their performance with that of a single algorithm.
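One simple way to build such an ensemble is to average the predictions of the individual regressors. The sketch below does this with scikit-learn's `VotingRegressor`. The averaging scheme, the synthetic data, and the choice of Linear Regression as the single baseline are all assumptions, since the text does not say how the ensemble is combined.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor, VotingRegressor
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.svm import SVR
from sklearn.tree import DecisionTreeRegressor

# Synthetic stand-in data (an assumption, not the student dataset).
rng = np.random.default_rng(0)
X = rng.uniform(40, 100, size=(200, 2))
y = 0.5 * X[:, 0] + 0.4 * X[:, 1] + rng.normal(0, 3, size=200)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

ensemble = VotingRegressor(
    estimators=[
        ("linear", LinearRegression()),
        ("svr", SVR()),
        ("tree", DecisionTreeRegressor(random_state=0)),
        ("forest", RandomForestRegressor(n_estimators=50, random_state=0)),
    ]
)
ensemble.fit(X_train, y_train)

# Single-model baseline to compare against the ensemble.
single = LinearRegression().fit(X_train, y_train)

print("ensemble R^2:", round(ensemble.score(X_test, y_test), 3))
print("single   R^2:", round(single.score(X_test, y_test), 3))

# The ensemble prediction is the unweighted mean of its fitted members.
member_preds = np.column_stack([m.predict(X_test) for m in ensemble.estimators_])
ensemble_pred = ensemble.predict(X_test)
```

Averaging does not guarantee a better score than the best single member; it mainly reduces the variance contributed by the less stable models.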
CONCLUSION
Summary:
In the proposed system, several regression algorithms were considered, namely Linear Regression, the KNN regressor, and the Decision Tree regressor, but none of these proved sufficiently efficient. We therefore turned to ensemble techniques. Among the available ensemble methods, such as the AdaBoost regressor and the XGBoost regressor, Random Forest is easy to use and gave high accuracy, so it was selected as the final algorithm.