
Society of Information Technology Students Journal 1

PROPOSED FORECASTING MODEL FOR THE STUDENTS ACADEMIC PERFORMANCE OF BSCS STUDENTS IN NEW ERA UNIVERSITY
Teddy Eddie Q. Dispo Jr.
Libis Dike 1, Brgy. Balite St., Montalban, Rodriguez, Rizal
teqdispojr@gmail.com

Re#iel Kenneth P. Tugano
#4 Manansala St., Krus na Ligas, Diliman, Quezon City
rkenneth_tugano@yahoo.com

ABSTRACT
Data mining is accepted as a decision-making tool that can facilitate better resource utilization with respect to students' performance. It is essential for decision-makers to obtain early feedback on academic performance and on the effectiveness of different learning strategies. In this paper, data from Computer Science students were taken and various data mining methods were applied, with the aim of improving students' academic performance and reversing the decline in the Computer Science student population from first year to fourth year. A descriptive method was used to analyze the data and to forecast which Bachelor of Science in Computer Science students will finish the course within four years and graduate on time. To ensure impartiality of the data, the researchers used all elements of the population as the sample, making it more inclusive and representative, so that the study has sufficient and adequate data for greater statistical efficiency. The aim of this study is to apply different data mining techniques and determine the model that best fits the forecasting of students' academic performance. Two decision tree methods were used because they produce rules that are easy to interpret; of these, the ID3 algorithm gave 98.48% accurate results.

Keywords: Data Mining, Classification, Forecasting, Decision tree, Regression, Performance

INTRODUCTION
The ability to predict a student's academic performance is very important in educational environments. Effective prediction of student performance calls for models that include personal, social, psychological and other environmental variables.
Predicting student performance with high accuracy helps to identify which students need special attention in their studies. The identified students should then receive more assistance from the teacher so that their performance improves in the future [1]. Data mining extracts interesting, non-trivial, implicit, previously unknown and potentially useful information or patterns from data. It can be applied to a number of different tasks, such as data summarization, learning classification rules, finding associations, analyzing changes and detecting anomalies [2]. Data mining is a data analysis methodology used to identify hidden knowledge in large databases, and it has been used successfully in different areas, including the educational environment. It is used to study students' performance and provides many tasks that can be used in predicting and forecasting academic performance. The reasons for the good or bad performance of students should be one of the main interests of teachers, who can plan and customize their teaching program based on feedback from the students [3]. Data mining is a powerful analytical approach that can provide effective assistance in revealing the complex relationships behind students' grades and performance [4].

METHODOLOGY
2.1 DESCRIPTIVE RESEARCH
This study is descriptive: it quantitatively describes the main features of a collection of information. A descriptive study is one in which information is collected without changing the environment, and it can involve a one-time interaction with the groups. Correlational research determines the relationship between two or more variables: data are collected on the variables and correlational statistical techniques are then applied [5]. The researchers used all elements of the population as the sample, making it more inclusive and representative, so that the study has sufficient and adequate data for greater statistical efficiency. The researchers also used several statistical tools to evaluate the forecasting model: percentage, mean, standard deviation, percentage error, t-test, MAPE (mean absolute percentage error) and multiple linear regression. Statistical software packages such as RapidMiner, SPSS and WEKA were used to process the data for faster and more reliable results.
RESEARCH FRAMEWORK

Figure 1: Framework for Academic Performance
Data from the student or applicant are stored in a database. The system retrieves data from the database and from flat files and combines them to determine which indicators or predictors will be used. The combined data are then cleansed and transformed to obtain the predictor values that serve as input to the forecasting model. The model predicts the probability that a student will finish the Bachelor of Science in Computer Science course within four years and identifies the students who will not. The decision variables serve as the independent variables in this study, and the probability of graduating is the dependent variable. Pattern recognition provides a reasonable answer for all possible inputs, and the decision makers take part in the visualization and validation of the results for the probability of graduating. Overall, the decision makers influence what is decided and can iterate the process of the proposed study to make the model more efficient and accurate.
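As a rough illustration of the cleanse-and-transform step in the framework, the sketch below merges grade records from two sources and reduces them to one set of predictor values per student. The record layout (student number, subject code, grade), the function name `build_predictors`, the sample student numbers and the 1.0 to 5.0 grade range are assumptions made for this example; the paper does not specify them.

```python
from collections import defaultdict

def build_predictors(db_rows, flat_file_rows, min_grade=1.0, max_grade=5.0):
    """Merge records from both sources, drop invalid grades, average per subject."""
    grades = defaultdict(dict)
    for student_no, subject, grade in list(db_rows) + list(flat_file_rows):
        try:
            value = float(grade)
        except (TypeError, ValueError):
            continue  # cleanse: skip blank or non-numeric grades
        if not (min_grade <= value <= max_grade):
            continue  # cleanse: skip grades outside the assumed range
        code = subject.strip().upper()  # transform: normalize the subject code
        grades[student_no].setdefault(code, []).append(value)
    # transform: one averaged grade per subject per student
    return {
        student: {code: sum(vals) / len(vals) for code, vals in subjects.items()}
        for student, subjects in grades.items()
    }

# Hypothetical records: (student number, subject code, grade)
db_rows = [("2010-001", "CS-132", "1.75"), ("2010-001", "MAT-171", 2.5)]
flat_rows = [("2010-001", "cs-132 ", "2.25"), ("2010-002", "ENGL-1", "INC")]
predictors = build_predictors(db_rows, flat_rows)
print(predictors)
```

A student whose only records are invalid, like the second hypothetical student here, drops out of the result entirely; whether that is the right policy depends on how the real data are collected.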

EXPECTED OUTPUT
This section covers the analysis, interpretation and implications of the findings from the data gathered by the researchers, and looks ahead to the probable outcomes that may trigger or modify a process. It also discusses the types of testing performed on the forecasting model in this study. The data used in the study were tabulated and placed into a data file using statistical software packages.

4.1 PREDICTORS IN FORECASTING STUDENTS' ACADEMIC PERFORMANCE
The variables used in this study were divided into two types: independent variables and dependent variables. An independent variable, also known as a predictor variable, represents the inputs or causes being tested, while the dependent variable represents the output or effect. The researchers had an internal variable, the profile of the respondents, which includes student name, student number, subjects or subject codes, and grades. These variables were treated as the predictors (independent variables) of whether BSCS students finish the course within four years, while graduation was the dependent variable, or output, used in this study.

The predictors considered were the subject codes of the mathematics and science subjects, major subjects and general education subjects in the Computer Science curriculum from first year to fourth year: CS-TECH, NSTP1, ENGL-1, CS-231 GWA, MAT-172, CS-442, CS-132 GWA, CS-242, PE-3, NSTP2, ENGL-8, ENGL-2, MAT-171, CS-344, PHILO-1, CS-335 GWA, PE-2, CS-433, CS-434, MAT-341, CS-332, CS-242 GWA, PE-4, FIL-2A, POL-SCI-2, CS-142 GWA, PHY-2 GWA, CS-341 GWA, CS-141 GWA, ENGL-4, VALUES, CS-341 GWA, CS-241 GWA, PHY-1 GWA, CS-233 GWA, FIL-1, CS-331 GWA, LIT-1, CS-333, CS-432, CS-232 GWA and CS-342 GWA. These variables can be considered to have an influence on the performance of students [6].

4.2 CORRELATIONS OF THE PREDICTORS TO THE ACADEMIC PERFORMANCE OF BSCS STUDENTS
Correlation describes the degree of correspondence between two or three variables. The bivariate correlation test requires that both variables have a scale level of measurement, that is, the values can be ordered and the distance between values can be determined [7]. The researchers simplified the predictor variables into three categories: Mathematics and Science (Mat & Sci.) subjects, Major subjects and General Education (GenEd) subjects.

Pearson's correlation between variables is a measure of how well they are related. The most common measure of correlation in statistics is the Pearson correlation (technically, the Pearson Product Moment Correlation or PPMC), which shows the linear relationship between two sets of data. The relationship between the variables is strong if the correlation coefficient is close to 1 in absolute value, meaning that changes in one variable are strongly correlated with changes in the other. The Sig. (2-tailed) value tells whether the correlation is statistically significant: if it is less than .01, the researchers conclude that there is a significant correlation between the variables. In this case, the correlation coefficient is .650 for Major subjects, .449 for Mat & Sci. subjects and .559 for GenEd subjects, which means that Major and GenEd subjects show a moderate association while Mat & Sci. subjects are weakly correlated. The Sig. (2-tailed) value for Major, Mat & Sci. and GenEd subjects is .000, which means that all three correlations are significant.

4.3 DATA MINING TECHNIQUES AND ALGORITHMS
4.3.1 REGRESSION
Regression analysis is a statistical technique for studying linear relationships among variables and for predicting a continuous dependent variable from a number of independent variables. The researchers used regression analysis to help understand how the typical value of the dependent variable changes when any one of the independent variables is varied, and to model the relationship between scalar variables.

Figure 2: Model Summary in Multiple Linear Regression
In the model summary of the Multiple Linear Regression model, R is the multiple correlation coefficient between the observed and predicted values of the dependent variable. R square measures how close the data are to the fitted regression line: it is the proportion of the variability of the response data around its mean that the model explains. From the historical data of the respondents, R is equal to .780 and R square is equal to .609, meaning that the model explains about 61% of the variability of the response data around its mean; the adjusted R square is .419. In general, the higher the R square, the better the model fits the data, and an R square of 0% would mean that the model explains none of the variability of the response data around its mean. The standard error of the estimate is closely related to the standard deviation. It is equal to .063, which the researchers interpret as an error of about 6%, that is, an accuracy of about 94% out of 100%. The researchers consider this acceptable because the true value of the standard deviation is usually unknown; in such cases it is important to be clear about what has been done and to take proper account of the fact that the standard error is only an estimate. The researchers tested the accuracy of the Multiple Linear Regression model using MAPE (mean absolute percentage error).

The normal probability plot compares the distribution of the residuals to a normal distribution in order to assess whether a data set is approximately normally distributed. The data are plotted against a theoretical normal distribution in such a
way that the points should form an approximately straight line. The diagonal line represents the normal distribution. The closer the observed cumulative probabilities of the residuals are to this line, the closer the distribution of the residuals is to the normal distribution.

Figure 3: Normal Probability Plot Using Multiple Linear Regression
The plot of observed against expected cumulative probability shows an S shape. The researchers read this as a long-tailed departure from the normal distribution, because the curve starts below the normal line, bends to follow it and ends above it. This means that there is more variance than would be expected in a normal distribution, and the researchers agree that the normal distribution can be improved upon as a model for testing.

4.3.2 DECISION TREE
A decision tree creates a tree-based classification model. It classifies cases into groups or predicts values of a dependent variable based on the values of independent variables. The procedure provides validation tools for exploratory and confirmatory classification analysis [8].

The researchers used two decision tree methods, the CHAID and ID3 algorithms. CHAID (Chi-squared Automatic Interaction Detection) can be used for prediction as well as classification and for detecting interaction between variables; it chooses the independent (predictor) variable that has the strongest interaction with the dependent variable. ID3 uses the information gain measure to choose the splitting attribute and constructs the decision tree by a top-down, greedy search through the given sets, testing each attribute at every tree node.
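The information gain measure that ID3 uses to choose a splitting attribute can be sketched as follows. This is a minimal illustration of the standard entropy-based definition, not the researchers' implementation; the toy records, the attribute names `major` and `gened`, and the `on_time` label are invented for the example.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())

def information_gain(rows, attribute, target):
    """Entropy of the target minus the weighted entropy after splitting on attribute."""
    base = entropy([r[target] for r in rows])
    remainder = 0.0
    for value in {r[attribute] for r in rows}:
        subset = [r[target] for r in rows if r[attribute] == value]
        remainder += len(subset) / len(rows) * entropy(subset)
    return base - remainder

def best_split(rows, attributes, target):
    """Greedy ID3 step: pick the attribute with the highest information gain."""
    return max(attributes, key=lambda a: information_gain(rows, a, target))

# Hypothetical records: grade level per subject category and graduation outcome
rows = [
    {"major": "high", "gened": "high", "on_time": "yes"},
    {"major": "high", "gened": "low", "on_time": "yes"},
    {"major": "low", "gened": "high", "on_time": "no"},
    {"major": "low", "gened": "low", "on_time": "no"},
]
print(best_split(rows, ["major", "gened"], "on_time"))
```

In this toy data the `major` attribute separates the outcomes perfectly while `gened` carries no information, so the greedy step selects `major`. A full ID3 tree would repeat the same step recursively on each resulting subset.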

Figure 4: Model Summary Using CHAID Method
The researchers used the CHAID method to merge the categories of each predictor that are not significantly different with respect to the dependent variable. Figure 4 indicates that only one of the selected independent variables, OJT-441, made a significant enough contribution to be included in the model.
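CHAID ranks candidate predictors by a chi-square test of independence against the dependent variable. The sketch below computes only the Pearson chi-square statistic for a contingency table; it leaves out the category merging and adjusted p-values of full CHAID, it assumes every row and column has at least one observation, and the counts are made up for illustration.

```python
def chi_square(table):
    """Pearson chi-square statistic for a contingency table given as rows of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # expected count under independence of predictor and outcome
            expected = row_totals[i] * col_totals[j] / grand
            stat += (observed - expected) ** 2 / expected
    return stat

# Rows: hypothetical predictor categories (e.g. passed / failed a subject)
# Columns: graduated on time (yes, no)
associated = [[30, 10], [10, 30]]
unrelated = [[20, 20], [20, 20]]
print(chi_square(associated), chi_square(unrelated))
```

A larger statistic means a stronger association with the outcome, which is the sense in which a predictor makes "a significant enough contribution" to enter the tree.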


Figure 5: Model Summary Produced by J48 Decision Tree
The J48 summary shows the error level when the classifier is applied to the training data. The most important figures in the model summary are the numbers of correctly and incorrectly classified instances. Using the J48 classifier, 62% of the instances were correctly classified and 38% were incorrectly classified. The mean absolute error, which measures how close the forecasts or predictions are to the eventual outcomes, is 0.0152. The accuracy obtained with the CHAID method is higher than that of the J48 classifier.
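The summary figures used in this study, the share of correctly classified instances, the mean absolute error and the mean absolute percentage error (MAPE), are simple to compute from a model's predictions. The sketch below shows one way to do so; the function names and sample values are illustrative assumptions, not data from the study.

```python
def percent_correct(actual, predicted):
    """Percentage of instances whose predicted class equals the actual class."""
    hits = sum(1 for a, p in zip(actual, predicted) if a == p)
    return 100.0 * hits / len(actual)

def mean_absolute_error(actual, predicted):
    """Average absolute difference between actual and predicted values."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def mape(actual, predicted):
    """Mean absolute percentage error; actual values must be non-zero."""
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)

# Hypothetical classifier output: graduated on time or not
labels = ["yes", "yes", "no", "no", "yes"]
guesses = ["yes", "no", "no", "no", "yes"]
# Hypothetical regression output: actual and forecast grades
grades = [2.0, 2.5, 4.0]
forecast = [2.2, 2.5, 3.8]

print(percent_correct(labels, guesses))
print(mean_absolute_error(grades, forecast))
print(mape(grades, forecast))
```

The paper reports accuracy as 100% minus the percentage error, so a MAPE of 2.93% corresponds to an accuracy of 97.07%.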

Table 3 shows the accuracy of the CHAID, ID3 and Multi-layer feed-forward algorithms for classification applied to the data. The CHAID technique has the highest accuracy, 72.6%, compared with the other methods. The ID3 algorithm also showed an acceptable level of accuracy, while Multi-layer feed-forward has the lowest accuracy, 50.8% [9].

Table 4 shows the accuracy and efficiency of the model. The ID3 technique has the lowest percentage error, 0.0152 or 1.52%, which indicates an accuracy level of 98.48% out of 100% [10]. The CHAID method also showed an acceptable level of accuracy. For Multiple Linear Regression, the researchers used the Mean Absolute Percentage Error (MAPE) to calculate the efficiency of the model, which resulted in a percentage error of 2.93%. This means that the accuracy level using regression analysis is 97.07%. The Multi-layer feed-forward algorithm showed the highest percentage error.

CONCLUSION
This study could be a great help for Computer Science students and for teachers in improving students' academic performance, trimming down the failure rate, better understanding students' behavior and improving teaching. It can help build confidence in data mining techniques so that present education systems may adopt them as a strategic management tool. Grade point average (GPA) is used in higher learning institutions to discover knowledge from education data, and students' performance plays an important role in producing the best quality graduates. Academic achievement and grades are among the main factors that can secure a stable job in life, so all students must give their greatest effort. When the variables are simplified into three categories, Mathematics & Science, Major, and General Education subjects, there is a significant relationship between them. The results of this study indicate that data mining techniques provide effective tools for improving students' academic performance. The study shows how useful data mining can be in higher learning institutions, especially Decision tree and Regression, which predict a number by estimating the value of the target as a function of the predictors for each case in the build data. SPSS also supports the entire analytical process, from
planning to data collection, analysis, reporting and deployment of the results.

RECOMMENDATION
Based on the summary of findings and conclusions of the study, the researchers recommend extending this forecasting model to external variables that can influence students' grades, such as location, social factors, behavior and family support. They also recommend applying other data mining techniques to an expanded data set with more distinctive attributes to get more accurate and efficient results. Data mining techniques in the educational field can also be used to develop performance monitoring and evaluation tools.

REFERENCES
[1] Bhardwaj, B.K. and Pal, S. 2011. Data Mining: A prediction for performance improvement using classification. International Journal of Computer Science and Information Security.
[2] Hegland, M., Roberts, S., and Williams, G. A Data Mining Tutorial.
[3] Singh, R., Tiwari, M. and Vimal, N. 2013. An Empirical Study of Application of Data Mining Techniques for Predicting Student Performance in Higher Education. International Journal of Computer Science and Mobile Computing.
[4] Tiwari, M., and Vimal, N. Evaluation of Student Performance by an Application of Data Mining Techniques.
[5] http://www.ask.com/question/definition-of-descriptive-correlational-research
[6] Ahmad, W.3., Azrai, A., Nayan, Y., Nordin, S., and Yahya, N. 2012. A Conceptual Framework in Examining the Contributing Factors to Low Academic Achievement: Self-Efficacy, Cognitive Ability, Support System and Socio-Economic. International Conference on Management, Social Science and Humanities 2012.
[7] http://cooklibrary.towson.edu/helpguides/guides/correlationspss.pdf
[8] Chuchra, R. 2012. Use of Data Mining Techniques for the Evaluation of Student Performance: A Case Study.
[9] Bharadwaj, B., Pal, S., and Yadav, S.K. 2011. Data Mining Applications: A Comparative Study for Predicting Students' Performance. International Journal of Innovative Technology and Creative Engineering.
[10] Tiwari, M., and Vimal, N. Evaluation of Student Performance by an Application of Data Mining Techniques.
