
IEEE 4th International Conference on Knowledge-Based Engineering and Innovation (KBEI)
Dec. 22nd, 2017
Iran University of Science and Technology, Tehran, Iran

An Overview on Extractive Text Summarization

Shohreh Rad Rahimi
Department of Computer Engineering
Marlik Higher Education Institute
Nowshahre, Iran
Shohrehradrahimi@gmail.com

Ali Toofanzadeh Mozhdehi
Faculty of Computer Engineering, Qazvin Branch
Islamic Azad University
Qazvin, Iran
Mozhdehi@qiau.ac.ir

Mohamad Abdolahi
Iranian Academic Center for Education, Culture and Research (ACECR)
Mashhad, Iran
Mabdolahi@yahoo.com

Abstract: With the growth of online information and textual resources, text summarization has become an essential and increasingly popular way to preserve and present the main purpose of textual information. Manually summarizing large documents is very difficult for human beings. Text summarization is the process of automatically condensing a given document into a shorter version that preserves its information content and overall meaning. Today it is one of the most active research areas in natural language processing and has attracted growing attention from NLP researchers. Text mining and text summarization are also closely related. Because summary requirements differ with the input text, summarization systems should be designed and classified according to the type of input text. This study first considers text mining and its relationship with text summarization. It then reviews some summarization approaches and their important parameters for extracting the most significant sentences, identifies the main stages of the summarization process, and presents the most significant extraction criteria. Finally, the most fundamental evaluation methods are considered.

Keywords: Text summarization; Text mining; Fuzzy text summarization; Statistical text summarization; Text clustering.

I. INTRODUCTION

With the increasing amount of data available on the web, the rise of news websites, the publication of various electronic books, and the significant growth in the number of articles published in different fields of study, one of the main challenges for researchers in the 21st century has been accessing accurate and reliable data. The sheer volume of available information on the one hand, and limited time on the other, has directed researchers to the area of text summarization and to powerful systems for summarizing documents. Much research has been done on summarization and many articles have been published about it, but the field still has many weaknesses, and there is a large gap to close before an efficient system can act like a human agent. The problems in Persian are far greater than in other languages: the complexity of the language and the lack of precise tools are current obstacles to Persian language processing. Therefore, reviewing the operations and procedures applied to other languages, considering the semantic field, and exploiting semantic relationships with tools such as graph theory, statistical methods, fuzzy logic, and data mining techniques can make a significant contribution to Persian language processing.

A text is made up of components such as words, phrases, and sentences that are connected completely and meaningfully. One of the main areas of natural language processing is text mining, which means discovering and extracting new information from documents. Text mining analyzes documents to extract valuable hidden patterns from the text. It also involves detecting the connections between words and sentences, and classifying and summarizing texts. The main purpose of this paper is to give an overview of text summarization.

More than half a century has passed since the first research on automatic text summarization. Studies of text summarization systems emerged in the 1950s and, because of the lack of powerful computers and other problems in natural language processing, focused on basic features of the text such as the position of sentences. Since then, many methods with powerful tools have been presented to simulate the way the human brain processes text.

II. A REVIEW OF TEXT MINING AND ITS RELATIONSHIP WITH TEXT SUMMARIZATION

A significant portion of available information is stored as text data. Text databases are growing rapidly because of the increasing amount of information in electronic form. Today, most of the data available in industry, business, and other organizations is stored in text databases.

Text mining is a research field that has emerged from the combination of several other research areas, such as data mining, natural language processing, and information retrieval. There are, however, some differences between data mining and text mining.

Data mining usually deals with structured data, whereas text mining deals with unstructured or semi-structured data. Even where a text has some structure, there is another reason that text mining is much harder than data mining: unlike other data such as numerical data, text carries semantic concepts that can be difficult to model with traditional knowledge structures. For example, there are words with the same meaning but different spellings, and words with the same spelling but different meanings.

A. Text mining applications

Text mining is broadly defined, and as a result there are different views about its applications. Text mining is a new field in NLP, but its analysis software has been available since the late […]. Search engines are among the most common applications of text mining: they can find the most relevant documents even when the user types a misspelled word or phrase. Some other applications of text mining are:

Spam identification: Analyzing the title and content of an email to determine whether it is spam.

Supervision: Monitoring the behavior of a person or group of people. Some software can detect and monitor people's behavior over telephone, internet, and other communication devices.

Aliases identification

Concepts relationship: The occurrence of some words depends on some other words.

Search and Retrieval

Classification and clustering data

Text summarization

Text mining faces some major problems. The biggest is that any document is a very large set of words. If every word is treated as an element of a vector and the different scenarios of its presence in the text are considered, we find that we are dealing with an NP-hard problem in a high-dimensional space. Word reduction is one of the preprocessing operations that reduces the dimension of the problem. Preprocessing and word reduction must be efficient and robust, because eliminating words may introduce noise into the text, such as grammatical errors. Apart from reducing the size of the text, preprocessing consists of the following steps:

Case Folding: At this stage, all words with uppercase and lowercase letters are brought into a uniform form, usually by converting uppercase letters to lowercase.

Stemming: This stage extracts the root of each word regardless of variations such as singular or plural form, tense, and all prefixes and suffixes. It requires knowledge of the particular language, and many algorithms have been proposed for each language. In English, for example, "compressed" and "compression" are both converted to "compress".

Stop Words: Some frequent words that carry no concept. English examples are "do", "does", "will", and so on.

N-grams: An N-gram is a sequence of N words that occur together. They should be preserved, but it is better to bring them into a uniform form.

Tokenization: Tokenization divides the text into smaller units, which are often words. In some cases, however, it is difficult to define a word and its scope. A word as a unit of text has the following properties:

• Each run of contiguous characters is a word. These characters can also include digits.
• The boundary of a word can be a whitespace character, a symbol, or the end of the line.
• Sometimes whitespace characters and punctuation do not mark the boundary of a word or token.

In languages such as English and Persian, where words are separated by whitespace, defining word boundaries is not always easy. Examples are acronyms that contain periods, words in hypertext links, words separated by symbolic characters, and words that contain a space, such as "New York".

One important function of text mining is text clustering. Its main aim is to place similar sentences in the same cluster. The number of clusters can be determined by the user or automatically by the program. The first step of text clustering is to divide the text into its components and separate the sentences.

B. Text mining and text summarization

Text mining and text summarization clearly cannot be separated. Several studies have been conducted in both fields, and all of them show that text summarization is also a kind of mining: it explores a document to find its important parts. The main problems facing text processing and summarization are the lack of qualified linguists working with programmers, grammars based on linguistic theories, and the lack of machines that can reason and think.
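
The preprocessing steps described in Section II.A can be illustrated with a short Python sketch. It is a minimal example under stated assumptions, not the authors' implementation: the stop-word list is a tiny hypothetical sample, tokens are taken to be runs of ASCII letters and digits (which would not suit Persian), and stemming is left out because it is language-specific. The function names are illustrative only.

```python
import re

# Hypothetical, deliberately tiny stop-word list (assumption, for illustration only).
STOP_WORDS = {"do", "does", "will", "the", "a", "an", "is", "of", "and", "to", "in"}


def tokenize(text):
    """Case-fold the text, then split it into tokens.

    Assumption: a token is a run of ASCII letters or digits, so whitespace,
    symbols, and line ends all act as word boundaries.
    """
    return re.findall(r"[a-z0-9]+", text.lower())


def remove_stop_words(tokens, stop_words=STOP_WORDS):
    """Drop frequent words that carry little meaning on their own."""
    return [t for t in tokens if t not in stop_words]


def ngrams(tokens, n):
    """Return all sequences of n consecutive tokens."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


tokens = remove_stop_words(tokenize("Text mining does extract NEW information."))
bigrams = ngrams(tokens, 2)
print(tokens)
print(bigrams)
```

A real system would replace the regular expression with a tokenizer suited to the target language and add a stemmer for that language.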

The first problem can be overcome by employing linguistic experts. The second, which concerns semantics, can be resolved by revising the language theories. It is difficult, however, to create machines with the power to reason and think.

On the other hand, when dealing with the high volume of data on the internet, especially when the goal is to understand the main concept of documents, using machines instead of humans can be easier, faster, and more reasonable.

To begin, consider the different views on automatic summaries and summaries created by humans. There are three main views: that there is no difference between human and machine summaries, that human summaries are preferable to machine summaries, and that machine summaries are preferable to human summaries.

The view that human processing and summarization are superior holds that the human mind is more powerful than a machine processor, and that the decision-making power and judgment of current machines fall short of the human mind. The view that machine processing and summarization are superior holds that, in the near future, computers will process linguistic information with more speed and accuracy than humans, on the grounds that machines perform mathematical and computational operations faster than a human.

Automatic summary generation by machines has three major advantages: the summary size is controllable, its content is predictable, and each part of the summary can be traced back to the part of the original text it comes from.

III. DIFFERENT CRITERIA IN THE DESIGN OF TEXT SUMMARIZATION SYSTEMS

There are various approaches to text summarization, some of which have existed for about […] years. Summarization approaches can be categorized in different ways according to various measures and features. According to Hovy and Lin, for example, these measures and features relate to the input, output, and purpose, and result in different types of summary. The following subsections consider the various methods under a general classification.

A. Summarization based on output summary

Text summarization methods can be classified into extractive and abstractive summarization. An extractive summarization method selects important sentences or paragraphs from the original text and gathers them into a shorter text. The importance of a sentence is decided from statistical and linguistic features of the sentences, which are then extracted and placed in the output text. This paper emphasizes extraction techniques.

An abstractive summarization method attempts to express the main concept of the text in clear natural language without necessarily using phrases from the text. Each abstractive summarizer consists of a comprehension part, which interprets the text and finds the new concepts, and a production part, which generates a new, shorter text containing the most important information from the original document. In this method, sentences may be omitted or changed, and even new sentences may be generated. This method is very complicated, even more so than machine translation.

B. Summarization based on details

Text summarization methods can also be classified as indicative or informative. An indicative summary presents the main idea of the text and is usually about […] to […] percent of the original text. This kind of summary is used to encourage the reader to read the original text. An example is a brief summary of a movie or a story in an advertisement, which only raises further questions and encourages the audience to watch the film or read the story. An informative summary consists of the main abstraction and the important issues of the text. This kind of summary is between […] and […] percent of the original text and contains all of its main points.

C. Summarization based on contents

Another classification is based on content and has two categories: generic and query-based. Generic summarization does not depend on the topic of the text and assumes that the reader has no basic knowledge of the text. A generic summary covers all aspects and important issues of the main text, so readers can gain a thorough understanding of the subject without prior knowledge of the text.

Query-based summarization, by contrast, assumes that the reader has general knowledge of the topic and is just looking for specific information in the text. In this case, a summary related to the user's question is created. Most of these summarization systems are extractive.

D. Summarization based on limitation

Another classification is based on the limitations of the input text. It has three categories: independent, domain-dependent, and genre-specific.

Independent summarization resembles generic summarization: it accepts any text from any field and generates a general summary regardless of the scope or type of the text.

Domain-dependent summarization accepts texts from a specific field and of a specific type. There are many specific text patterns, such as news, scientific text, fiction, and sports. The summary generated by a domain-dependent system depends on the type of the input text. Genre-specific summarization tries to summarize much more specialized fields. This group
can cover narrowly defined literature, such as sports news, texts related to the field of geography, political news, and so on.

E. Summarization based on the number of input texts

Automatic summarization systems can also be divided into two main categories based on their input: single-document and multi-document summarization. The input of a single-document summarization system is only one text, whereas multi-document summarization, which is very popular these days, extends single-document summarization to collections of related documents. The main purpose of multi-document summarization is to summarize the texts, remove redundancy, and take account of the similarities and differences in the information content of the different documents. A multi-document summarizer accepts multiple documents on a common subject written from different perspectives, and is closely tied to question answering systems and search-based summarization. There are two major approaches to summarizing multiple documents. The first applies the usual single-document methods to summarize each document separately, then combines all of the summaries and tries to remove redundancy among overlapping, similar sentences to produce the final summary. Some proposed methods likewise take the single-document summaries as inputs and merge them to create the main summary.

The second approach is designed specifically for multiple documents. Here, all documents are treated as one document and the important sentences are extracted from all of them together, using methods such as graphs or clustering. This approach is more challenging, intelligent, and complicated. An example is the SUMMONS system, which extracts and combines information from multiple sources, passes it to a language generation component, and produces the final summary. In general, a single-document model is far less complex than a multi-document model.

F. Summarization based on language acceptance

A text summarization system can also be monolingual or multilingual. A monolingual summarizer accepts only documents in one specific language, such as English or Persian. A multilingual summarizer accepts documents in different languages and can summarize them.

IV. SIMILARITY MEASURES OF EXTRACTIVE SUMMARIZATION

As mentioned earlier, extractive summarization uses sentences of the main text to create the summary: the more important sentences are found and transferred to the summary. To determine which sentences are important, several parameters are considered. After the parameter values are determined, their sum is taken as the final weight of each sentence. Finally, the sentences with the maximum weight, as many as the compression rate allows, are extracted and transferred to the summary in the order in which they appear in the main text. Because some of these parameters reduce the importance of a sentence, their values are calculated as negative. Some of the parameters and features are described below.

A. Content keyword feature

Keywords are often nouns and are determined by the tf × idf criterion. Sentences containing keywords have a greater chance of appearing in the summarized output. Other methods have also been proposed for extracting keywords, such as morphological analysis of words, extracting and scoring statements, and extracting, clustering, and ranking noun phrases. Morphological analysis of words plays an important role in natural language processing and helps to resolve ambiguity in words. It studies the root of a word and the prefixes and suffixes attached to it.

B. Similarity of sentence and title of document

The similarity of a sentence and the document title is the number of words common to the title and the sentence. Sentences that include title words are highly important and have a greater chance of appearing in the output summary. In most of the proposed methods, the criterion is calculated as

$$X_i = \frac{|S_i \cap T|}{\sqrt{|S_i| \cdot |T|}}$$

where the numerator is the number of words shared by sentence $S_i$ and title $T$, and the denominator is the square root of the product of the title length and the sentence length.

C. Sentence location in the document

This measure is based on the assumption that sentences occurring at the beginning and end of both the text and individual paragraphs are more likely to be relevant to the main topic of the document.

Experimental results showed that the best correlation between automatic and human-made extracts was achieved by using the first and last sentences in the summarized output. The first sentence, however, is more important than the last: the main topic of a paragraph can be understood by reading its first sentence. The importance of sentence location is calculated from $PS_i$, the effective value of the relative position of the $i$-th sentence. $PS_i$ is computed with an entropy-based expression involving a constant $C$ between zero and one, and is then normalized by its maximum over all $n$ sentences to give the sentence location value $X_i$:

$$X_i = \frac{PS_i}{\max_{1 \le j \le n} PS_j}$$
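
The title-similarity measure of Section IV.B and the selection step described at the start of Section IV can be sketched in Python as follows. This is only an illustration under assumptions that are not in the paper: title similarity is the sole feature, words are lowercased runs of ASCII letters and digits, sentence and title lengths are counted in words, and the function names `words`, `title_similarity`, and `extract` are hypothetical.

```python
import math
import re


def words(text):
    """Lowercased word tokens (assumption: ASCII letters and digits only)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def title_similarity(sentence, title):
    """|S ∩ T| / sqrt(|S| * |T|), with lengths counted in words."""
    s, t = words(sentence), words(title)
    if not s or not t:
        return 0.0
    common = set(s) & set(t)
    return len(common) / math.sqrt(len(s) * len(t))


def extract(sentences, title, compression_rate=0.5):
    """Keep the highest-scoring sentences, restored to document order."""
    k = max(1, int(len(sentences) * compression_rate))
    ranked = sorted(
        range(len(sentences)),
        key=lambda i: title_similarity(sentences[i], title),
        reverse=True,
    )
    return [sentences[i] for i in sorted(ranked[:k])]


title = "Extractive text summarization"
sentences = [
    "The weather was pleasant.",
    "Extractive summarization selects sentences from the text.",
    "Lunch was served at noon.",
    "Text summarization shortens documents.",
]
summary = extract(sentences, title)
print(summary)
```

In a fuller system the single score would be replaced by the sum of all feature values for each sentence, with importance-reducing features contributing negative values, as the section describes.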

D. Important words and phrases

Some words and phrases increase the importance of a sentence. Numbers (including dates, times, prices, percentages, and weights) and proper names (including persons, places, and times) carry specific important information, and their importance depends on the type of text. The score of a sentence with respect to important words and phrases is obtained by dividing the number of these words by the total number of words in the sentence:

$$X_i = \frac{|S_{imp}|}{|S_i|}$$

E. Capitalized words and acronyms

Capitalized words and acronyms carry special concepts. Words with capital letters occur in languages written in the Latin alphabet; Persian has no such words. Acronyms, however, occur in all languages, and in Persian they can be used to determine the significance of a sentence.

F. Cue-Phrase

Cue phrases are phrases such as "finally", "as a result", and "in this paper". Statements containing them almost certainly include an important topic and must be preserved. Some words increase the importance of a noun and its sentence, for example "Dr.", "Mr.", and "Miss". Other words, placed at the beginning of a sentence, decrease its importance: such sentences serve to complete the meaning of the previous sentence and can be ignored. Examples are "because", "for example", "therefore", and "thus". The weight of statements containing positive and negative cue phrases is calculated as

$$X_i = \frac{|S_i| + |S_{pi}| - |S_{ni}|}{|S_i|}$$

where $|S_i|$ is the number of words in sentence $s_i$, $|S_{pi}|$ is the number of positive cue phrases, and $|S_{ni}|$ is the number of negative cue phrases.

G. Specific words

Specific words are words that have a special meaning, and they may vary with the keywords of the text. They can be terms from scientific subjects such as mathematics or psychology. They increase the importance of a sentence.

H. Sentences containing words with different fonts and font effects

Some important or specific words are written with an initial capital letter or in bold, italic, or underlined type. Sentences containing them have a higher chance of appearing in the output.

I. The continuity and similarity of a sentence with other sentences

For each sentence in the text, its resemblance to the other sentences is considered, and a sentence with more resemblance receives a higher score. The process is repeated for all sentences of the text, and sentences with higher scores are more likely to appear in the summary. The similarity of two sentences is measured by the similarity of the words they contain, their lengths, and the other criteria mentioned before.

J. The similarity of a sentence with a paragraph topic sentence

This measure considers the similarity of each sentence to the topic sentence of its paragraph. The topic sentence is usually the first sentence of the paragraph. The similarity measures are the same criteria as in the previous sections.

K. Specific terms

Specific terms and phrases in a text are marked by symbols such as angle brackets. A review of human-written summaries shows that sentences containing these phrases are more important. Sentences with specific terms are scored as

$$X_i = \frac{|S_{ip}|}{|S_i|}$$

where $|S_{ip}|$ is the number of specific terms in the sentence, $|S_i|$ is the length of the sentence, and $X_i$ is the score of the sentence under this criterion.

L. The length of sentences

Sentence length is an important criterion in text summarization. The best sentences for the summary output are those of average length. Sentences containing fewer than a predefined number of words are not included in the summary. Sometimes, however, the previous measures raise the importance of a short sentence so that it cannot simply be eliminated. The effect of sentence length is rated by the following equations; sentences outside the threshold, which receive a negative rate, are also considered.

$$X_i = RS_i \log(RS_i) - (1 - RS_i) \log(1 - RS_i)$$

$$RS_i = \frac{|S_i|}{\max_{1 \le j \le n} |S_j|}$$

Here $|S_i|$ is the length of sentence $i$ and $\max_{1 \le j \le n} |S_j|$ is the maximum sentence length in the document.

M. Pronouns

Sentences with pronouns are less important than sentences with nouns, because pronouns refer to nouns described in previous sentences. The position of the pronoun in the sentence also matters. For example, a sentence with a pronoun at the beginning is less important. But if the pronoun is
placed in the middle or at the end of a sentence, the sentence is slightly more important. The score of a sentence that includes pronouns is calculated as

$$X_i = \frac{BP_i \cdot |SP_i|}{|S_i|}$$

where $|SP_i|$ is the number of pronouns in the sentence, $|S_i|$ is the sentence length, and $BP_i$ is a constant parameter: if a pronoun occurs within the first three words of the sentence, $BP_i$ = […]; otherwise $BP_i$ = […].

V. DIFFERENT APPROACHES TO TEXT SUMMARIZATION

Although summaries created by humans are often abstractive, the most successful machine summaries are extractive. Because of the semantic ambiguity in natural language processing, a good extractive summary is a much better result than an abstractive one; in fact, an ideal automatic abstractive summarization method is still far away. The first automatic text summarizers and the most important newly proposed methods are extractive systems that extract important sentences based on heuristic features such as the position of a sentence in the text, the frequency of its words, and important keywords related to the text. Because languages and input texts differ, various machine learning and natural language processing methods and algorithms have been proposed to create high-quality systems. Most extractive methods concentrate on key sentences. The methods differ in the algorithm used to detect, identify, and extract sentences and in the way sentences are placed in the summary. The most important approaches are discussed below.

A. Statistical approaches

Using statistical methods to summarize text is an effective approach that has been used in many articles. In statistical methods, the important sentences are selected based on word frequency, indicator phrases, and the other features mentioned in the previous sections, regardless of the meaning of the words.

There are several methods for determining the key sentences, such as the Title Method, the Location Method, the Aggregation Similarity Method, the Frequency Method, the TF-Based Query Method, and Latent Semantic Analysis (LSA). The most common methods, however, are the Bayesian classifier and approaches based on relations between concepts. One of the best-known statistical methods for summarizing text is LSA. LSA is an algebraic-statistical method that extracts the hidden semantic structures of words and sentences. It is an unsupervised method that extracts text structures using only the information in the words of the sentences, without any other knowledge. The idea in LSA is that words shared between different sentences indicate a dependence in meaning. LSA can also represent the meaning of words and vocabulary simultaneously. It uses the algebraic singular value decomposition (SVD) method to determine the relationship between sentences and words. In addition to modeling the relationship between words and sentences, SVD can reduce noise and thus improve accuracy.

A summarization algorithm based on LSA consists of three steps: creating the input matrix, applying SVD to the created matrix, and extracting sentences.

LSA also has some limitations. The most important are as follows:

• The algorithm does not use information about word order in sentences, grammar, or morphological relationships, although this information can help in understanding words and sentences.
• The algorithm does not use any external word knowledge or word database.
• As the number of distinct words and the heterogeneity of the data increase, the performance of the algorithm drops sharply because of the time and memory complexity of SVD.

B. Lexical Chain based approaches

A lexical chain produces a representation of the cohesive structure of a text. Basically, it uses the WordNet database to determine the connections between terms and then builds chains of related terms. The first computational model of lexical chains was presented by Morris and Hirst in 1991, and the first use of lexical chains in text summarization was proposed by Barzilay and Elhadad.

Terms are scored according to the type and number of relationships in the set of chains. The final summary consists of the sentences with the strongest chain connections. Lexical chains can capture the semantic relations of words. They can also identify synonyms and hyponyms and place them together in the same lexical chain. Lexical chains are also used for information retrieval and grammatical error correction.

Lexical chain approaches have two drawbacks. The first is ambiguity in the word chain: if some words are semantically ambiguous, the resulting chain will be ambiguous as well. The second is the relation of the created chain to the main topic: not all chains are related to the main topic.

C. Graph based approach

The graph-based approach builds text summarization methods on graph theory. After common preprocessing steps such as stemming and stop-word removal, the sentences of the documents are represented as nodes in a directed graph. Sentences are connected to each other by edges according to sentence similarity. The basic idea of graph-based approaches resembles voting. As a result, when an edge connects a
node to another node, it casts a vote for that node, and the higher the in-degree of a node, the higher its priority. The method also assigns a voting score: if a node has a greater out-degree, its importance increases. The importance of each node $V_a$ is calculated recursively as the score $S(V_a)$ for all nodes in the graph, and the process continues until the scores converge and $S(V_a)$ no longer changes:

$$S(V_a) = (1 - d) + d \sum_{V_b \in in(V_a)} \frac{S(V_b)}{|out(V_b)|}$$

Here $V$ is the set of nodes, $E$ is the set of edges, $in(V_a)$ denotes the input edges of $V_a$, $out(V_a)$ denotes the output edges of $V_a$, $d$ is an input parameter between zero and one, and $S(V_a)$ and $S(V_b)$ are the scores of $V_a$ and $V_b$. The algorithm can also be applied to an undirected graph, but the output summary and the time complexity differ.

An effective graph-based summarization algorithm is TextRank. It is an unsupervised method that extracts keywords and sentences by scoring nodes based on the similarity measures mentioned previously. The algorithm starts with arbitrary node values and is repeated recursively until it converges to within a predefined threshold. The arbitrary initial values have no effect on the final scores.

D. Cluster based approaches

Some automatic summarization systems use clusters to produce meaningful summaries. In this approach, different clustering algorithms are applied to units of the text such as words, phrases, sentences, and even paragraphs. The cluster-based approach is extractive: the most similar sentences, according to the similarity measures mentioned above, are placed in the same clusters. The biggest cluster is selected as the main topic, and its sentences are extracted and placed in the summary output. Euclidean distance, Cartesian similarity, cosine similarity, and some other similarity measures are used to define the similarity and dissimilarity of clusters.

A. Agrawal and U. Gupta proposed a K-means clustering algorithm that places important sentences within clusters and chooses the biggest cluster as the main topic. The documents are represented using term frequency-inverse document frequency (TF-IDF). In this context, term frequency (TF) is the average number of occurrences per document in the cluster. The topic is represented by the words with the highest TF-IDF values in the cluster. The important sentences are selected according to their similarity to the topic of the cluster.

Their algorithm proposes two optimizations. The first is a method for calculating the sentence score, and the second is a way to obtain the optimal number of clusters. To calculate the sentence score, Score(x), the algorithm uses TF-IDF and the sentence length.

One of the most important issues in K-means clustering is determining the optimal number of clusters. Because this method takes the largest cluster as the main topic, the size of the input text has a direct impact on the number of clusters. For example, a large K leads to small clusters and hence a small, scattered summary with very low coherence, whereas a small K leads to big, dense clusters and hence a poorly compressed text.

N. I. Meghana and M. S. Bewoor proposed another technique that creates a query-based summarization system using clustering methods. They used the Expectation-Maximization (EM) clustering algorithm, and their implementation is divided into two parts: the EM algorithm is implemented in the first part, and query-based summarization is done in the second. The proposed method uses WordNet.

E. Fuzzy logic based approaches

Fuzzy logic based approaches treat each characteristic of a text as an input to a fuzzy system. These methods use fuzzy logic only to detect and extract the important sentences. Fuzzy text summarization methods differ in the extracted features, fuzzy rules, linguistic variables, membership functions, and fuzzification and defuzzification methods. Some changes in numerical values produce better linguistic variables. F. Kyoomarsi and H. Khosravi proposed a method that summarizes the text in two stages: the first is preprocessing and the second is fuzzy analysis of the text. The features they extract are the number of words common to the sentence and the title, the number of words in the sentence, the similarity of the sentence to the paragraph topic sentence, and the similarity of the sentence to the first sentence of the paragraph.

VI. INTEGRATION OF SIMILAR STATEMENTS IN EXTRACTIVE SUMMARIZATION

Redundancy among sentences in texts, especially on the web, is one of the problems of natural language processing, and at the same time it creates new research opportunities in text summarization. Every document contains many sentences with similar meaning and concept. Because the user determines the number of sentences in the summary, statements expressing a similar concept are placed in the summary, while other important concepts in the main text are not selected because of the limit on the number of sentences. Thus many sentences expressing one concept appear in the summary and many other important parts are ignored. The problem is more common in multi-document summarization. Sentences with similar concepts also create ambiguity in question answering systems.

Some methods have been proposed to combine similar sentences so that more of the important content can be extracted from the main text. One of the most important approaches to sentence fusion uses dependency trees: Katja Filippova proposed a directed acyclic graph (DAG) for sentence fusion. Clustering
DSSURDFKHV DUH DOVR FDQ EH XVHG IRU FRPELQLQJ VLPLODU FRXQW 1JUDP  LV DOVR WKH QXPEHU RI 1JUDPV LQ KXPDQ
VHQWHQFHV,QWKHVHDSSURDFKHVVPDOOFOXVWHUVDUHFRQVLGHUHGWR VXPPDU\
SODFH YHU\ VLPLODU VHQWHQFHV DQG WKHQ HDFK FOXVWHU FDQ EH
FRPELQHGWRRQHVHQWHQFH 9,,, &21&/86,216
7H[W VXPPDUL]DWLRQ LV RQH RI WKH PRVW H[FLWLQJ UHVHDUFK
9,, $66(660(17$1'(9$/8$7,212)6800$5,=,1* DUHDV LQ QDWXUDO ODQJXDJH SURFHVVLQJ ,W LV DQ RSHQ UHVHDUFK
0(7+2'6 DUHDV DQG D ORW RI UHVHDUFKHV DUH KDYLQJ EHHQ GRQH DERXW LW
'XHWRH[WHQVLYHDXWRPDWLFVXPPDUL]DWLRQPHWKRGVDVVHVV 7H[WVXPPDUL]DWLRQFDQEHFODVVLILHGLQWRGLIIHUHQWJURXSVDQG
WKHDFFXUDF\RIFUHDWHGPHWKRGVLVLPSRUWDQW,WFDQEHVDLGWKDW DSSURDFKHVDQGWKHPRVWQRWDEO\RIWKHPVWXGLHGLQWKLVSDSHU
WKH SURFHVV RI HYDOXDWLRQ LV PXFK PRUH GLIILFXOW WKDQ ,Q WKLV VWXG\ DW ILUVW WKH WRSLF RI WH[W PLQLQJ DQG LWV
VXPPDUL]LQJ WKH WH[W ,Q PRVW FDVHV WKHUH LV QRW DQ LGHDO UHODWLRQVKLSZLWKWH[WVXPPDUL]DWLRQDUHFRQVLGHUHG,PSRUWDQW
VXPPDUL]HRIDGRFXPHQW7KHPDLQSUREOHPLQVXPPDUL]LQJ GHVLJQLQJFULWHULDLQWH[WVXPPDUL]DWLRQV\VWHPVDUHSUHVHQWHG
DVVHVVPHQWLVXVLQJH[WHQVLYHFULWHULDDQGDEVHQFHRIDVWDQGDUG 'LIIHUHQW DSSURDFKHV IRU VXPPDUL]DWLRQ DQG LPSRUWDQW
PHWKRGIRUHYDOXDWLRQ6XFKSUREOHPVDUHUDUHLQRWKHUQDWXUDO SDUDPHWHUV QHHGHG WRUDWH LPSRUWDQW VHQWHQFHV DUH LQWURGXFHG
ODQJXDJHSURFHVVLQJFDVHV )LQDOO\LPSRUWDQWHYDOXDWLRQDSSURDFKHVDUHLQWURGXFHG
0RVW RI WKH GHYHORSHG PHWKRGV DUH XVHG WZR DSSURDFKHV
IRUHYDOXDWLRQ7KHILUVWDSSURDFKLVWRMXGJHE\KXPDQDQGWKH
VHFRQGDSSURDFKLVFRPSDUHGWRUHIHUHQFHVXPPDU\ 5()(5(1&(6
>@ +/XKQ³7KHDXWRPDWLFFUHDWLRQRIOLWHUDWXUHDEVWUDFWLRQ´,%0-RXUQDO
,QWKHILUVWDSSURDFKVXPPDU\JHQHUDWHGE\PDFKLQHFDQEH RIUHVHDUFKDQGGHYHORSPHQW9ROSS
FRPSDUH ZLWK WKH VDPH VXPPDU\ JHQHUDWHG E\ D PDQ 7KH >@ 9 *XSWD *6 /HKDO ³$ 6XUYH\ RI 7H[W 0LQLQJ 7HFKQLTXHV DQG
SUREOHPZLWKWKLVDSSURDFKLVWKHGLIIHUHQWSHUVRQDOWDVWHVDQG $SSOLFDWLRQV´ -RXUQDO RI (PHUJLQJ 7HFKQRORJLHV LQ :HE ,QWHOOLJHQFH
RSLQLRQV 7R UHGXFH WKH LPSDFW RI WKH HIIHFWV DXWRPDWH 9ROQR
VXPPDU\FRPSDUHVZLWKPRUHWKDQRQHVXPPDU\JHQHUDWHGE\ >@ *2 0DNEXOH , &LFHNOL ) 1XU $OSDVODQ ³7H[W 6XPPDUL]DWLRQ RI
KXPDQVDQGLWLVWLPHFRQVXPLQJ 7XUNLVK 7H[WV XVLQJ /DWHQW 6HPDQWLF $QDO\VLV´ UG ,QWHUQDWLRQDO
&RQIHUHQFHRQ&RPSXWDWLRQDO/LQJXLVWLFVSS±
,Q VHFRQG DSSURDFK WKHUH LV PRUH WKDQ RQH RULJLQDO WH[W >@ ( +RY\ & /LQ ´$XWRPDWHG 7H[W 6XPPDUL]DWLRQ DQG WKH
ZLWKWKHLUVXPPDU\LQDGDWDVHW7KHRULJLQDOGRFXPHQWDQGLWV 6800$5,67 6\VWHP ,Q $GYDQFHV LQ $XWRPDWLF 7H[W
RSWLPL]HG VXPPDU\ LV D JRRG EDVH IRU HYDOXDWH WKH SURSRVHG 6XPPDUL]DWLRQ´0,73UHVVSS
VXPPDUL]DWLRQDOJRULWKP >@ 6 *KRODPUH]D]DGHK 0 $PLQL % *KRODP]DGHK ³$ &RPSUHKHQVLYH
6XUYH\RQ7H[W6XPPDUL]DWLRQ6\VWHPV´,(((
2QHRIWKHPRVWIDPRXVHYDOXDWLRQPHWKRGVLV528*(1 >@ 7 +LUDR < 6DVDNL + ,VR]DNL ³$Q H[WULQVLF HYDOXDWLRQ IRU TXHVWLRQ
7KHPHQWLRQHYDOXDWLRQPHWKRGFRPSDUHVDXWRPDWHJHQHUDWHG ELDVHG WH[W VXPPDUL]DWLRQ RQ TD WDVNV´ ,Q 3URFHHGLQJV RI 1$$&/
VXPPDU\ ZLWK VXPPDULHV JHQHUDWHG E\ WKH KXPDQ     ZRUNVKRSRQ$XWRPDWLFVXPPDUL]DWLRQ
528*(1XVHVWKUHHHYDOXDWLRQFULWHULD3UHFLVLRQ S 5HFDOO >@ -9 *ROGVWHLQ - 0LWWDO - &DUERQHOO  0 .DQWURZLW]W ³0XOWL
5 DQG)PHDVXUHDQGWKH\DUHFDOFXODWHGDV(T   GRFXPHQW VXPPDUL]DWLRQ E\ VHQWHQFH H[WUDFWLRQ´ ,Q 3URF RI WKH
$1/31$$&/:RUNVKRSRQ$XWRPDWLF6XPPDUL]DWLRQ
3  _VXPBUHIŀVXPBFDQG_ _VXPBFDQG_ >@ *20DNEXOH³7H[W6XPPDUL]DWLRQXVLQJ/DWHQW6HPDQWLF$QDO\VLV´
5  _VXPBUHIŀVXPBFDQG_ _VXPBUHI_ 06WKHVLV0LGGOH(DVW7HFKQLFDO8QLYHUVLW\
) 35 35      >@ 9 *XSWD * /HKDO ³$ 6XUYH\ RI 7H[W 6XPPDUL]DWLRQ ([WUDFWLYH
7HFKQLTXHV´ -RXUQDO RI HPHUJLQJ WHFKQRORJLHV LQ ZHE LQWHOOLJHQFH
92/12
,Q WKH (T   VXPBUHIGHWHUPLQHV VXPPDU\ H[WUDFWHGE\ >@ +(GPXQGVRQ³1HZ0HWKRGVLQ$XWRPDWLF([WUDFWLQJ´-RXUQDORIWKH
H[SHUWV DQG VXPBFDQG GHWHUPLQHV VXPPDU\ H[WUDFWHG E\ $VVRFLDWLRQIRU&RPSXWLQJ0DFKLQHU\9RO  33
V\VWHP >@ . <RXQJNRRQJ 6 -XQJ\XQ ³$Q (IIHFWLYH 6HQWHQFH ([WUDFWLRQ
7HFKQLTXH8VLQJ&RQWH[WXDO,QIRUPDWLRQDQG6WDWLVWLFDO$SSURDFKHVIRU
$QRWKHU HYDOXDWLRQ FULWHULRQ PHQWLRQHG LQ 528*(1 LV 7H[W6XPPDUL]DWLRQ´3DWWHUQ5HFRJQLWLRQ/HWWHUV
FDOFXODWHGE\WKH(T  7KHPHDVXUHFRPSDUHVQXPEHURI1 >@ 0:DVVRQ³8VLQJOHDGLQJWH[WIRUQHZVVXPPDULHV(YDOXDWLRQUHVXOWV
JUDPVLQPDFKLQHVXPPDU\DQGKXPDQVXPPDU\ DQG LPSOLFDWLRQV IRU FRPPHUFLDO VXPPDUL]DWLRQ DSSOLFDWLRQV´ LQ 3URF
WK ,QWHUQDWLRQDO &RQIHUHQFH RQ &RPSXWDWLRQDO /LQJXLVWLFV DQG WK
528*(1     $QQXDO0HHWLQJRIWKH$&/SS
FRXQWBPDWFK 1JUDP   >@ : DOVDQLH ³7RZDUGV DQ LQIUDVWUXFWXUH IRU $UDELF WH[W VXPPDUL]DWLRQ
XVLQJUKHWRULFDOVWUXFWXUHWKHRU\´067KHVLV´'HSDUWPHQWRIFRPSXWHU
FRXQW 1JUDP    VFLHQFH.LQJ6DXGXQLYHUVLW\5L\DGK.LQJGRPRI6DXGL$UDELD
>@ * 6DOWRQ ³$XWRPDWLF 7H[W 3URFHVVLQJ 7KH 7UDQVIRUPDWLRQ $QDO\VLV
,QWKH(T  1LVWKHQXPEHURI1JUDPVFRXQWBPDWFK DQG5HWULHYDORI,QIRUPDWLRQE\&RPSXWHU´$GGLVRQ:HVOH\3XEOLVKLQJ
1JUDP  LV PD[LPXP QXPEHU RI 1JUDPV ZKLFK DUH LQ &RPSDQ\
PDFKLQH VXPPDU\ DQG KXPDQ VXPPDU\ VLPXOWDQHRXVO\

[15] J. Bellegarda, “Exploiting latent semantic information in statistical language modeling”, in Proc. IEEE, Vol. 88, No. 8, pp. 1279–1296, 2000.
[16] N. Alami, M. Meknassi, N. Rais, “Automatic Texts Summarization: Current State of the Art”, Journal of Asian Scientific Research.
[17] R. Barzilay, M. Elhadad, “Using Lexical Chains for Text Summarization”, the MIT Press.
[18] R. Mihalcea, “Graph-based ranking algorithms for sentence extraction, applied to text summarization”, in Proceedings of the ACL 2004 on Interactive Poster and Demonstration Sessions, 2004.
[19] A. Agrawal, U. Gupta, “Extraction based approach for text summarization using k-means clustering”, International Journal of Scientific and Research Publications.
[20] N.I. Meghana, M.S. Bewoor, S.H. Patil, “Text Summarization using Expectation Maximization Clustering Algorithm”, International Journal of Engineering Research and Applications (IJERA).
[21] L. Suanmali, M. Salem, B. Salim, N. Salim, “Sentence Features Fusion for Text Summarization using Fuzzy Logic”, IEEE.
[22] F. Kyoomarsi, H. Khosravi, E. Eslami, M. Davoudi, “Extraction-based Text Summarization using Fuzzy Analysis”, Iranian Journal of Fuzzy Systems.
[23] M. Pourvali, A. Abadeh Mohammad, “Automated text summarization base on lexical chain and graph using of word net and Wikipedia knowledge base”, IJCSI International Journal of Computer Science Issues.
[24] H. Shakeri, S. Gholamrezazadeh, M. Amini Salehi, F. Ghadamyari, “A New Graph-Based Algorithm for Persian Text Summarization”, Computer Science and Convergence, Lecture Notes in Electrical Engineering.
[25] R. Dragomir, J. Budzikowska, “Centroid-based summarization of multiple documents: Sentence extraction, utility-based evaluation, and user studies”, in Proceedings of the ANLP/NAACL Workshop on Automatic Summarization, 2000.
[26] K. Filippova, M. Strube, “Sentence Fusion via Dependency Graph Compression”, in Proceedings of the 2008 Conference on Empirical Methods in Natural Language Processing, pp. 177–185, 2008.
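
As an illustration of the cluster-based extraction described before Section VI, the following is a minimal Python sketch, not the algorithm of Agrawal and Gupta. It assumes that each sentence is treated as one document for TF-IDF, that the cluster centroid stands in for the "topic" of the cluster, and that the first k sentences seed the clusters. Their sentence-score formula and their procedure for choosing the number of clusters are not reproduced. All function names (`tfidf_vectors`, `kmeans`, `summarize`) are hypothetical.

```python
import math
from collections import Counter


def tfidf_vectors(sentences):
    """One sparse TF-IDF vector (dict) per sentence; a sentence counts as a document."""
    docs = [Counter(s.lower().split()) for s in sentences]
    n = len(docs)
    df = Counter(term for d in docs for term in d)
    # Smoothed IDF (an assumption) so that very common terms keep a small weight.
    return [{t: tf * (math.log((1 + n) / (1 + df[t])) + 1.0) for t, tf in d.items()}
            for d in docs]


def cosine(a, b):
    """Cosine similarity between two sparse vectors."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def mean_vector(vectors):
    """Component-wise mean of sparse vectors (the cluster 'topic' in this sketch)."""
    total = {}
    for v in vectors:
        for t, w in v.items():
            total[t] = total.get(t, 0.0) + w
    return {t: w / len(vectors) for t, w in total.items()}


def kmeans(vectors, k, max_iter=20):
    """Plain k-means under cosine similarity, seeded with the first k vectors."""
    centroids = [dict(v) for v in vectors[:k]]
    labels = None
    for _ in range(max_iter):
        new_labels = [max(range(k), key=lambda c: cosine(v, centroids[c]))
                      for v in vectors]
        if new_labels == labels:  # assignments are stable: converged
            break
        labels = new_labels
        for c in range(k):
            members = [v for v, l in zip(vectors, labels) if l == c]
            if members:  # keep the old centroid if a cluster becomes empty
                centroids[c] = mean_vector(members)
    return labels, centroids


def summarize(sentences, k=2, size=2):
    """Pick the biggest cluster and return its sentences closest to the centroid."""
    vectors = tfidf_vectors(sentences)
    labels, centroids = kmeans(vectors, k)
    biggest = Counter(labels).most_common(1)[0][0]
    members = [i for i, l in enumerate(labels) if l == biggest]
    ranked = sorted(members, key=lambda i: cosine(vectors[i], centroids[biggest]),
                    reverse=True)[:size]
    return [sentences[i] for i in sorted(ranked)]  # keep original sentence order
```

For example, `summarize(["cats chase mice", "stocks fell sharply", "cats eat mice", "cats like mice"], k=2, size=2)` returns two of the three sentences about cats, because they form the biggest cluster. Seeding with the first k sentences keeps the sketch deterministic; a practical system would choose the seeds and the value of k more carefully.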

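
The evaluation measures of Section VII can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not a reference implementation of ROUGE: summaries are tokenized by lowercasing and splitting on whitespace, `precision_recall_f` treats sum_ref and sum_cand as sets of extracted sentences, and `rouge_n` counts a matched N-gram at most as many times as it occurs in the machine summary. The function names are hypothetical.

```python
from collections import Counter


def ngrams(tokens, n):
    """Multiset of N-grams (as tuples) contained in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def precision_recall_f(sum_ref, sum_cand):
    """P, R and F from the overlap of expert (sum_ref) and system (sum_cand) extracts."""
    ref, cand = set(sum_ref), set(sum_cand)
    overlap = len(ref & cand)
    p = overlap / len(cand) if cand else 0.0
    r = overlap / len(ref) if ref else 0.0
    f = 2 * p * r / (p + r) if p + r else 0.0
    return p, r, f


def rouge_n(references, candidate, n=1):
    """count_match(N-gram) / count(N-gram), summed over all human summaries."""
    cand_counts = ngrams(candidate.lower().split(), n)
    matched = total = 0
    for ref in references:
        ref_counts = ngrams(ref.lower().split(), n)
        total += sum(ref_counts.values())  # N-grams in the human summary
        matched += sum(min(c, cand_counts[g]) for g, c in ref_counts.items())
    return matched / total if total else 0.0
```

With the human summary "the cat sat on the mat" and the machine summary "the cat lay on the mat", five of the six reference unigrams are matched, so ROUGE-1 is 5/6, and three of the five reference bigrams are matched, so ROUGE-2 is 0.6.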