An Overview on Extractive Text Summarization

Shohreh Rad Rahimi
Department of Computer Engineering
Marlik Higher Education Institute
Nowshahre, Iran
Shohrehradrahimi@gmail.com

Ali Toofanzadeh Mozhdehi
Faculty of Computer Engineering, Qazvin Branch
Islamic Azad University
Qazvin, Iran
Mozhdehi@qiau.ac.ir

Mohamad Abdolahi
Iranian Academic Center for Education, Culture and Research (ACECR)
Mashhad, Iran
Mabdolahi@yahoo.com
Abstract: With the growth of online information and resource texts, text summarization has become an essential and increasingly popular way to preserve and present the main purpose of textual information. It is very difficult for human beings to summarize large text documents manually. Text summarization is the process of automatically creating a condensed form of a given document that preserves its information content and overall meaning in a shorter version. Text summarization is now one of the most popular research areas in natural language processing and has attracted growing attention from NLP researchers. There is also a close relationship between text mining and text summarization. Because summary requirements differ with the input text, summarization systems should be built and classified according to the type of input text. In this study, the topic of text mining and its relationship with text summarization is considered first. Then some summarization approaches and their important parameters for extracting predominant sentences are reviewed, the main stages of the summarizing process are identified, and the most significant extraction criteria are presented. Finally, the most fundamental proposed evaluation methods are considered.

Keywords: Text summarization; Text mining; Fuzzy text summarization; Statistical text summarization; Text clustering.

I. INTRODUCTION
With the increasing amount of data available on the web, the rise of news websites, the publication of various electronic books, and significant growth in the number of articles published in different fields of study, one of the main challenges for researchers in the 21st century has been accessing accurate and reliable data. The sheer volume of available information on the one hand and limited time on the other have directed researchers to the area of text summarization and to powerful systems for summarizing documents. Much research has been done on summarization and many articles have been published about it, but the field still has many weaknesses, and a large gap remains before an efficient system can act like a human agent. The problems in the Persian language are far greater than in other languages: the complexity of the language and the lack of precise tools are current obstacles to Persian language processing. Therefore, reviewing the operations and procedures applied to other languages, taking semantics into account and exploiting semantic relationships with tools such as graph theory, statistical methods, fuzzy logic, and data mining techniques, can make a significant contribution to Persian language processing.

A text is made up of components such as words, phrases, and sentences that are connected completely and meaningfully. One of the main areas of natural language processing is text mining, which means discovering and extracting new information from documents. Text mining analyzes documents to extract valuable hidden patterns from the text. It also involves detecting connections between words and sentences, and classifying and summarizing texts. The main purpose of this research is to give an overview of text summarization.

More than half a century has passed since the first research on automatic text summarization. Early summarization systems focused on a few basic features of the text, such as the position of sentences, because of the lack of powerful computers and other problems in natural language processing. Since then, many methods with powerful tools have been proposed to make text processing resemble that of the human brain.

II. A REVIEW OF TEXT MINING AND ITS RELATIONSHIP WITH TEXT SUMMARIZATION
A significant portion of available information is stored as text data. Text databases are growing rapidly because of the increasing amount of information in electronic form. Today, most of the data available in industry, business, and other organizations is stored in text databases.
,(((
Text mining is a research field that has emerged from the combination of several other research areas, such as data mining, natural language processing, and information retrieval. There are, however, some differences between data mining and text mining.

Data mining usually deals with structured data, whereas text mining deals with unstructured or semi-structured data. Even where the text has some structure, there is another reason why text mining is much harder than data mining: unlike other data such as numerical data, text carries semantic concepts that are difficult to model with traditional knowledge structures. For example, some words have the same meaning but different spellings, and others have the same spelling but different meanings.

A. Text mining applications
Text mining is defined broadly, so there are different views about its applications. Text mining is a new field in NLP, although software for text analysis has been available for some time. Search engines are among the most common applications of text mining: they can find the most relevant documents even when the user types a misspelled word or phrase. Other applications of text mining include:

Spam identification: analyzing the title and content of an email to determine whether it is spam.

Supervision: monitoring the behavior of a person or group of people. Some software can detect and monitor people's behavior through telephone, internet, and other communication devices.

Alias identification

Concept relationships: the occurrence of some words depends on other words.

Search and retrieval

Classification and clustering of data

Text summarization

Text mining faces some major problems. The biggest is that every document is a very large set of words. If each word is treated as an element of a vector and the different ways it can occur in the text are considered, we are dealing with an NP-hard problem in a high-dimensional space. Word reduction is one of the preprocessing operations that reduce the dimension of the problem. Preprocessing and word reduction must be done carefully, because eliminating words may introduce noise into the text, such as grammatical errors. Apart from reducing the size of the text, preprocessing consists of the following steps:

Case folding: At this stage, all words with uppercase and lowercase letters are brought into a uniform form, usually by converting uppercase letters to lowercase.

Stemming: This stage extracts the root of each word regardless of variations such as singular or plural, tense, and prefixes and suffixes. It requires knowledge of the particular language, and many algorithms have been proposed for each language. For example, in English the words "compressed" and "compression" are both reduced to "compress".

Stop words: Frequent words that carry no content. English examples are "do", "does", "will", and so on.

N-grams: An N-gram is a sequence of N words that occur together. N-grams should be preserved, but it is better to bring them into a uniform form.

Tokenization: Tokenization divides the text into smaller units, which are usually words. In some cases, however, it is difficult to define a word and its boundaries. A word as a unit of text has the following properties:

• Every sequence of contiguous characters is a word. These characters may include digits.
• A word is delimited by a whitespace character, a symbol, or the end of a line.
• Sometimes whitespace characters and punctuation do not delimit a word or token.

Even in languages such as English and Persian, in which words are separated by whitespace, defining word boundaries is not always easy. Examples are acronyms that contain periods, words in hypertext links, words separated by symbol characters, and words that contain a space, such as "New York".

One of the important functions of text mining is text clustering. The main aim of text clustering is to place similar sentences in the same cluster. The number of clusters can be set by the user or determined automatically by the program. The first step of text clustering is to divide the text into its components and separate the sentences.

B. Text mining and text summarization
Text mining and text summarization obviously cannot be separated. Several studies have been conducted in both fields, and all of them show that text summarization is itself a kind of mining: exploring a document to find its important parts. The main problems facing text processing and summarization are the lack of qualified linguists working with programmers, the lack of grammars based on linguistic theories, and the lack of machines capable of reasoning and thinking.
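The preprocessing steps listed in Section II.A (case folding, tokenization, stop-word removal, and N-grams) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the stop-word list is a tiny hypothetical sample, the regular expression treats any run of Latin letters or digits as a token, and stemming is left out because it is language-specific. The function names are illustrative and do not come from the paper.

```python
import re

# Tiny illustrative stop-word list (an assumption; real lists are much longer).
STOP_WORDS = {"do", "does", "will", "the", "a", "an", "is", "of", "to", "and"}


def tokenize(text):
    """Case-fold the text, then split it into word tokens."""
    return re.findall(r"[a-z0-9]+(?:'[a-z]+)?", text.lower())


def remove_stop_words(tokens):
    """Drop frequent words that carry little content."""
    return [t for t in tokens if t not in STOP_WORDS]


def ngrams(tokens, n):
    """Return every run of n consecutive tokens as a tuple."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def preprocess(text, n=2):
    """Run case folding, tokenization, and stop-word removal, then build N-grams."""
    tokens = remove_stop_words(tokenize(text))
    return tokens, ngrams(tokens, n)


tokens, bigrams = preprocess("Text mining DOES extract hidden patterns.")
print(tokens)   # ['text', 'mining', 'extract', 'hidden', 'patterns']
print(bigrams)
```

Note that this simple tokenizer splits "New York" into two tokens, which is exactly the boundary problem described above; handling such cases needs a lexicon or a more careful tokenizer.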
The first problem can be overcome by employing linguistic experts. The second problem, which concerns semantics, can be resolved by revising the linguistic theories. But it is difficult to create machines with the power to reason and think.

On the other hand, when dealing with the high volume of data on the internet, especially when the goal is to understand the main concept of documents, using machines instead of humans can be easier, faster, and more reasonable.

To begin, consider the different views on automatic text summarization and summaries created by humans. There are three main views: that there is no difference between human and machine summaries, that human summaries are preferable to machine summaries, and that machine summaries are preferable to human summaries.

The view that human processing and summarizing is superior holds that the human mind is more powerful than a machine processor: the decision-making power and judgment of current machines fall short of the human mind. The view that machine processing and summarizing is superior holds that in the near future computers will process linguistic information faster and more accurately than humans. The reason given for this claim is that machines perform mathematical operations and computation faster than humans.

Automatic generation of summaries by machines has three major advantages: the summary size is controllable, its content is predictable, and each part of the summary can be traced to the part of the original text it came from.

III. DIFFERENT CRITERIA IN DESIGNING TEXT SUMMARIZATION SYSTEMS

There are various approaches to text summarization, some of which have been in use for decades. Text summarization approaches can be categorized in different ways according to various measures and features. For example, according to Hovy and Lin [4], the measures and features relate to input, output, and purpose, and they result in different types of summary. The following subsections consider the various methods in a general classification.

A. Summarization based on output summary
Text summarization methods can be classified into extractive and abstractive summarization. An extractive summarization method selects important sentences or paragraphs from the original text and gathers them into a shorter text. The importance of sentences is decided based on statistical and linguistic features of the sentences, which are then extracted and placed in the output text. This paper places more emphasis on extraction techniques.

An abstractive summarization attempts to express the main concept of the text in clear natural language without necessarily using phrases from the text. Each abstractive summarization consists of a comprehension part, which interprets the text and finds the new concepts, and a production part, which generates a new, shorter text containing the most important information from the original document. In this method, sentences may be omitted or changed, or even new sentences generated. This method is very complicated, even more so than machine translation.

B. Summarization based on details
Text summarization methods can also be classified as indicative or informative. An indicative summary presents the main idea of the text and is usually about [...] to [...] percent of the original text. This kind of summary is used to encourage the reader to read the original text. An example is a brief summary of a movie or a story used in its advertising, which only raises further questions and encourages the audience to watch the film or read the story. An informative summary consists of the main abstraction and the important issues of the text. This kind of summary is between [...] and [...] percent of the original text and contains all the main points of the text.

C. Summarization based on contents
Another classification in text summarization is based on content. It has two categories: generic and query-based. Generic summarization does not depend on the topic of the text, and it assumes that the reader has no basic knowledge about the text. A generic summary contains all aspects and important issues of the main text, so readers can gain a thorough understanding of the subject without prior knowledge of the text.

Query-based summarization, in contrast, assumes that the reader has general knowledge of the topic and is only looking for specific information in the text. In this case, a related summary is created based on the user's question. Most of these summarization systems are extractive.

D. Summarization based on limitation
Another classification is based on the limitations of the input text. It has three categories: independent, domain-dependent, and genre-specific.

Independent summarization is similar to generic summarization: it accepts any text from any field and generates a general summary regardless of the scope or type of the text.

Domain-dependent summarization accepts texts from a specific field and of a specific type. There are many specific text patterns, such as news, scientific text, fiction, and sports. The summary generated by a domain-dependent system depends on the type of the input text. Genre-specific summarization tries to summarize a much more specialized field. This group
can cover specific kinds of literature such as sports news, texts related to geography, political news, and so on.

E. Summarization based on the number of input texts
Automatic summarization systems can also be divided into two main categories based on their input: single-document and multi-document summarization. The input of a single-document summarization system is a single text, whereas multi-document summarization, which is very popular these days, extends single-document summarization to collections of related documents. The main purpose of multi-document summarization is to summarize texts while removing redundancy and taking into account the similarities and differences in the information content of the different documents. Multi-document summarization accepts multiple documents on a common subject written from different perspectives, and it is closely tied to question answering systems and search-based summarization. There are two major approaches to summarizing multiple documents. The first approach uses the usual methods of single-document summarization and summarizes each document separately. It then combines all of the summaries and tries to remove redundancy among overlapping, similar sentences to produce the final summary. Some proposed methods likewise treat the individual summaries as inputs and then merge them to create the main summary.

The second approach is designed specifically for multiple documents. In this approach, all documents are treated as one document, and the important sentences are extracted from all of the documents together using methods such as graphs or clustering. This approach is more challenging, more intelligent, and more complicated. An example of the second approach is the SUMMONS system, which extracts and combines information from multiple sources, passes it to a language generation component, and produces the final summary. In general, a single-document model is far less complex than a multi-document model.

F. Summarization based on language acceptance
A text summarization system can also be monolingual or multilingual. A monolingual summarization system accepts documents only in one specific language, such as English or Persian, whereas a multilingual system accepts documents in different languages and is able to summarize them.

IV. SIMILARITY MEASURES OF EXTRACTIVE SUMMARIZATION
As mentioned earlier, an extractive summary uses sentences of the main text to create the summary: the more important sentences are found and transferred to the summary. To determine the important sentences, several parameters should be considered. After the parameter values are determined, their sum is taken as the final weight of each sentence. Finally, the sentences with the highest weights are extracted, in a number set by the compression rate, and transferred to the summary in the order in which they appear in the main text. Because some of the parameters reduce the importance of a sentence, their values are counted as negative. Some of the parameters and features are described below.

A. Content keyword feature
Keywords are usually nouns and are determined by the tf-idf criterion. Sentences containing keywords have a greater chance of appearing in the summary. Other methods have also been proposed for extracting keywords, including morphological analysis of words, extracting and scoring statements, and extracting, clustering, and ranking noun phrases. Morphological analysis plays an important role in natural language processing and helps resolve ambiguity in words. It studies the root of a word and the prefixes and suffixes attached to it.

B. Similarity of sentence and title of document
The similarity of a sentence and the document title is based on the number of words they have in common. Sentences that include title words are highly important and have a greater chance of being placed in the output summary. In most of the proposed methods, the criterion is calculated as follows:

X_i = \frac{|S_i \cap T|}{\sqrt{|S_i| \cdot |T|}}

Here the numerator is the number of words shared by the sentence and the title, and the denominator is the square root of the product of the title length and the sentence length.

C. Sentence location in the document
This measure is based on the assumption that sentences occurring at the beginning and end of both the text and individual paragraphs have a higher probability of being relevant to the main topic of the document.

Experimental results showed that the best correlation between automatic and human-made extracts was achieved by using the first and last sentences in the summary, with the first sentence being more important than the last. The main topic of a paragraph can often be understood by reading its first sentence. The importance of sentence location is calculated as follows:

[Equation: entropy-based definition of P(S_i) in terms of the constant C; not recoverable from the source]

X_i = \frac{P(S_i)}{\max_{1 \le j \le n} P(S_j)}

P(S_i) is the effective value of the relative position of the i-th sentence, calculated on the basis of entropy. C is a constant between zero and one, and X_i is the sentence location value.
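The title-similarity criterion of Section IV.B (shared words divided by the square root of the product of the two lengths) can be sketched as below. This is a minimal sketch, not the implementation of any cited system. It assumes that "length" means the number of distinct words after case folding, and the helper name `words` is hypothetical.

```python
import math
import re


def words(text):
    """Return the set of case-folded word tokens in a text."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def title_similarity(sentence, title):
    """Shared words divided by sqrt(|sentence words| * |title words|)."""
    s, t = words(sentence), words(title)
    if not s or not t:
        return 0.0  # avoid division by zero for empty input
    return len(s & t) / math.sqrt(len(s) * len(t))


score = title_similarity("Text summarization is useful", "Text summarization")
print(round(score, 4))  # 0.7071
```

With distinct-word sets the value always lies between 0 and 1, and it equals 1 only when the sentence and the title contain exactly the same words.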
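The selection procedure described at the start of Section IV (sum the feature values into a weight, keep the heaviest sentences according to the compression rate, and output them in their original order) might look like the following sketch. The feature values are assumed to be computed elsewhere and passed in as one list per sentence. The function name, the default rate, and the rounding rule (round up, keep at least one sentence) are assumptions made for illustration.

```python
import math


def select_sentences(sentences, feature_scores, compression_rate=0.3):
    """Pick the highest-weighted sentences and keep their original order.

    feature_scores[i] holds the feature values of sentences[i]; features that
    lower a sentence's importance are expected to be negative.
    """
    weights = [sum(scores) for scores in feature_scores]
    # Number of sentences to keep (assumption: round up, at least one).
    k = max(1, math.ceil(len(sentences) * compression_rate))
    ranked = sorted(range(len(sentences)), key=lambda i: weights[i], reverse=True)
    chosen = sorted(ranked[:k])  # restore document order
    return [sentences[i] for i in chosen]


sentences = ["A", "B", "C", "D"]
feature_scores = [[0.1, 0.2], [0.9, 0.5], [0.4, -0.3], [0.6, 0.6]]
print(select_sentences(sentences, feature_scores, compression_rate=0.5))  # ['B', 'D']
```

Sorting the chosen indices before output is what keeps the summary in the order of the main text, as the section requires.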
D. Important words and phrases
Some words and phrases increase the importance of a sentence. Numbers (including dates, times, prices, percentages, and weights) and proper names (including persons, places, and times) carry specific important information, and how important they are depends on the type of text. The score of a sentence with respect to important words and phrases is obtained by dividing the number of these words by the total number of words in the sentence:

X_i = \frac{|S_{imp}|}{|S_i|}

E. Capitalized words and acronyms
Capitalized words and acronyms carry special concepts. Words with capital letters occur in languages written in the Latin alphabet, and there are no such words in Persian. Acronyms, however, exist in all languages, and in Persian they can be used to determine the significance of a sentence.

I. The continuity and similarity of a sentence with other sentences
For each sentence in the text, its resemblance to the other sentences is considered, and a sentence with more resemblance receives a higher score. The process is repeated for all sentences of the text, and sentences with higher scores are more likely to appear in the summary. The similarity of two sentences is based on the similarity of the words they contain, sentence length, and the other criteria mentioned before.

J. The similarity of a sentence with a paragraph topic sentence
Here the similarity of each sentence to the topic sentence of its paragraph is considered. The topic sentence of a paragraph is usually its first sentence. The similarity measures are the same as in the previous sections.
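The two similarity features above (resemblance to the other sentences, and resemblance to the paragraph's topic sentence) can be illustrated with a simple bag-of-words cosine similarity. The paper does not fix one particular measure, so cosine similarity is a swapped-in assumption here, and all function names are hypothetical. The topic sentence is taken to be the first sentence of a non-empty paragraph.

```python
import math
import re
from collections import Counter


def bag_of_words(sentence):
    """Count the case-folded word tokens of a sentence."""
    return Counter(re.findall(r"[a-z0-9]+", sentence.lower()))


def cosine_similarity(a, b):
    """Cosine of the angle between the word-count vectors of two sentences."""
    va, vb = bag_of_words(a), bag_of_words(b)
    dot = sum(va[w] * vb[w] for w in va.keys() & vb.keys())
    norm_a = math.sqrt(sum(c * c for c in va.values()))
    norm_b = math.sqrt(sum(c * c for c in vb.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def similarity_scores(sentences):
    """Score each sentence by its summed similarity to all other sentences."""
    return [
        sum(cosine_similarity(s, other) for j, other in enumerate(sentences) if j != i)
        for i, s in enumerate(sentences)
    ]


def topic_similarity_scores(paragraph_sentences):
    """Similarity of each sentence to the paragraph's first (topic) sentence."""
    topic = paragraph_sentences[0]  # assumes a non-empty paragraph
    return [cosine_similarity(s, topic) for s in paragraph_sentences]


sentences = [
    "text mining finds patterns",
    "text mining finds hidden patterns",
    "the weather is nice",
]
print(similarity_scores(sentences))
```

In this example the first two sentences score equally high because they share most of their words, while the unrelated third sentence scores zero.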
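The important-words ratio from Section IV.D, X_i = |S_imp| / |S_i|, can be approximated with very crude cues, as in the sketch below. The cues are assumptions made only for illustration: a token counts as a number if it looks like a date, time, price, or percentage made of digits, and as a proper name if it is capitalized and not the first word of the sentence. The capitalization cue does not carry over to Persian, as Section IV.E notes, so a real system would rely on a named-entity recognizer instead.

```python
import re

# Digits optionally joined by . , : / - and an optional trailing percent sign.
NUMBER = re.compile(r"^\d+([.,:/-]\d+)*%?$")


def important_word_score(sentence):
    """Fraction of tokens that look like numbers or proper names."""
    tokens = sentence.split()
    if not tokens:
        return 0.0
    important = 0
    for position, raw in enumerate(tokens):
        token = raw.strip(".,;:!?()\"'")
        if NUMBER.match(token):
            important += 1
        elif position > 0 and token[:1].isupper():
            # Crude proper-name cue: capitalized and not sentence-initial.
            important += 1
    return important / len(tokens)


score = important_word_score("The meeting in Tehran starts at 10:30 on 2017-05-01.")
print(round(score, 3))  # 0.333
```

Here three of the nine tokens ("Tehran", "10:30", and "2017-05-01") are counted as important.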
[...] placed in the middle or at the end of a sentence, the sentence is slightly more important. The score of a sentence that includes pronouns is calculated as follows:

X_i = BP_i \cdot \frac{|SP_i|}{|S_i|}

Here |SP_i| is the number of pronouns in the sentence, |S_i| is the sentence length, and BP_i is a constant parameter. If the pronoun is among the first three words of the sentence, BP_i = [...]; otherwise BP_i = [...].

V. DIFFERENT APPROACHES TO TEXT SUMMARIZATION

Although summaries created by humans are often abstractive, the most successful machine summaries are extractive. Because of semantic ambiguity in natural language processing, a good extractive summary gives a much better result than abstractive summarization; in fact, an ideal automatic abstractive summarization method is still far away. The first automatic text summarization systems, and the most important newly proposed methods, are extractive systems that extract important sentences based on heuristic features such as the position of sentences in the text, the frequency of the words in them, and important keywords related to the text. To create a high-quality system for different languages and types of input text, various machine learning and natural language processing methods and algorithms have been proposed. Most extractive summarization methods emphasize key sentences. The methods differ in the algorithms they use to detect, identify, and extract sentences and in how they place them in the summary. The most important approaches are discussed below.

A. Statistical approaches
Using statistical methods to summarize text is an effective approach that has been used in many articles. In statistical methods, the important sentences are selected based on word frequency, indicator phrases, and the other features mentioned in the previous sections, regardless of the meaning of the words.

There are several methods for determining the key sentences, such as the Title Method, the Location Method, the Aggregation Similarity Method, the Frequency Method, the TF-Based Query Method, and Latent Semantic Analysis. The most common methods, however, are the Bayesian classifier and concept-relation approaches. One of the best-known statistical methods for summarizing text is LSA. LSA is an algebraic-statistical method that extracts the hidden structure and meaning of words and sentences. It is an unsupervised method that extracts text structure using only information about the words in sentences, without the need for any other knowledge. The idea in LSA is that words shared between different sentences indicate a dependence in meaning. LSA can also represent the meaning of words and vocabulary simultaneously. It uses the algebraic singular value decomposition (SVD) method to determine the relationship between sentences and words. In addition to modeling the relationship between words and sentences, SVD can reduce noise and so improve accuracy.

A summarization algorithm based on LSA consists of three steps: creating the input matrix, applying SVD to the created matrix, and extracting sentences.

LSA also has some limitations. The most important are as follows:

• The algorithm does not use information about word order in sentences, grammar, or morphological relationships, although this information can help in understanding words and sentences.
• The algorithm does not use any world knowledge or word database.
• As the number of distinct words and the heterogeneity of the data increase, the performance of the algorithm drops sharply. This drop is due to the time and memory complexity of SVD.

B. Lexical Chain based approaches
A lexical chain produces a representation of the contiguous structures of a text. Basically, it uses the WordNet database to determine the connections between terms and then creates a continuum between these terms. The first computational model of lexical chains was presented by Morris and Hirst, and the first use of lexical chains in text summarization was proposed by Barzilay and Elhadad.

Terms are scored according to the type and number of their relationships in the chain set. The final summary consists of the sentences with the strongest chain connections. Lexical chains capture the semantic relations between words. They can also identify synonyms and hyponyms and group them into the same lexical chain. Lexical chains are also used for information retrieval and grammatical error correction.

Lexical chain approaches have two drawbacks. The first is ambiguity in the word chain: if some of the words are semantically ambiguous, the created chain is semantically ambiguous too. The second is the relation of the created chains to the main topic: not all chains are related to the main topic.

C. Graph based approach
The graph-based approach provides text summarization methods that use graph theory. After common preprocessing steps such as stemming and stop-word removal, the sentences in the documents are represented as nodes in a directed graph. Sentences are connected to each other by edges according to the relationship between them. The basic idea of graph-based approaches is something like voting. As a result, when an edge connects a
node to another node, it means that the first node votes for the second, and the higher the in-degree of a node, the higher its priority. In this method there is also a voting score: if a node has a greater out-degree, the importance of its vote changes accordingly. The importance of each node is calculated recursively for all nodes in the graph as SV(a), using the following equation. The process continues until convergence, when SV(a) no longer changes.

SV(a) = (1 - d) + d \sum_{V_b \in in(V_a)} \frac{SV(b)}{|out(V_b)|}

Here V is the set of nodes, E is the set of edges, in(V_a) denotes the edges entering V_a, out(V_b) denotes the edges leaving V_b, and d is an input parameter between zero and one. SV(a) is the score of V_a and SV(b) is the score of V_b. The algorithm can also be applied to an undirected graph, but the output summary differs and so does the time complexity.

An effective graph-based summarization algorithm is TextRank. It uses an unsupervised method to extract keywords and sentences by scoring nodes based on the similarity measures mentioned previously. The algorithm starts with arbitrary node values and is repeated recursively until it converges to a predefined threshold. The initial node values have no effect on the final scores.

D. Cluster based approaches
Some automatic summarization systems use clusters to produce meaningful summaries. In this approach, different clustering algorithms are applied after dividing the text into tokens such as words, phrases, sentences, and even paragraphs. The cluster-based approach is extractive: the most similar sentences, according to the similarity measures mentioned above, are placed in the same cluster. The biggest cluster is selected as the main topic, and its sentences are extracted and placed in the output summary. Euclidean distance, Cartesian similarity, cosine similarity, and other similarity measures are used to define the similarity and dissimilarity of clusters.

A. Agrawal and U. Gupta proposed using the K-means clustering algorithm to place important sentences within clusters and chose the biggest cluster as the main topic. The documents are represented using term frequency-inverse document frequency (TF-IDF). In this context, term frequency (TF) is the average number of occurrences per document in the cluster. The topic is represented by the words with the highest TF-IDF values in the cluster. The important sentences are selected based on the similarity of the sentences to the topic of the cluster.

Their algorithm proposes two optimization ideas. The first is a method for calculating the sentence score, and the second is a way to obtain the optimal number of clusters. To calculate the sentence score, the algorithm uses TF-IDF and the sentence length (X).

[Equation for Score(X): not recoverable from the source]

One of the most important issues in K-means clustering is determining the optimal number of clusters. Given that in this method the largest cluster is taken as the main topic, the size of the input text has a direct impact on the number of clusters. For example, a large K leads to small clusters and, as a result, a small and scattered summary with very low coherence, whereas a small K leads to large, dense clusters and, as a result, a text with low compression.

N.I. Meghana and M.S. Bewoor proposed another technique for creating a query-based summarization system using clustering methods. They used the Expectation Maximization (EM) clustering algorithm, and their implementation is divided into two parts: the EM algorithm is implemented in the first part, and query-based summarization is done in the second. The proposed method uses WordNet.

E. Fuzzy logic based approaches
Fuzzy-logic-based approaches treat each characteristic of a text as an input to a fuzzy system. These methods use only fuzzy logic to detect and extract the important sentences. Fuzzy text summarization methods differ in the features they extract and in their fuzzy rules, linguistic variables, membership functions, and fuzzification and defuzzification methods. Some changes in numerical values lead to better linguistic variables. F. Kyoomarsi and H. Khosravi proposed a method that summarizes the text in two stages: the first stage is preprocessing, and the second is fuzzy analysis of the text. The features they extracted are the number of words shared by the sentence and the title, the number of words in the sentence, the similarity of the sentence to the paragraph's topic sentence, and the similarity of the sentence to the first sentence of the paragraph.

VI. INTEGRATION OF SIMILAR STATEMENTS IN EXTRACTIVE SUMMARIZATION

Redundancy across sentences in texts, especially on the web, is one of the problems in natural language processing, and at the same time it creates new research opportunities in text summarization. Every document contains many sentences with similar meanings and concepts. Because the user sets the number of sentences in the summary, statements expressing a similar concept end up in the summary together, while other important concepts in the main text are not selected because of the limit on the number of sentences. Thus many sentences about one concept are placed in the summary and many other important parts are ignored. The problem is more common in multi-document summarization. Sentences with similar concepts also create ambiguity in question answering systems.

Some methods have been proposed for combining similar sentences so that more of the important content of the main text can be extracted. One of the most important sentence fusion techniques uses dependency trees. Katja Filippova proposed a directed acyclic graph (DAG) for sentence fusion. Clustering
approaches can also be used for combining similar sentences. In these approaches, small clusters are formed to hold very similar sentences, and each cluster can then be combined into one sentence.

VII. ASSESSMENT AND EVALUATION OF SUMMARIZING METHODS
Because there are so many automatic summarization methods, assessing their accuracy is important. It can be said that evaluation is much more difficult than summarizing the text itself. In most cases there is no single ideal summary of a document. The main problems in summary assessment are the wide range of criteria in use and the absence of a standard evaluation method. Such problems are rare in other natural language processing tasks.

Most of the developed methods use one of two approaches for evaluation. The first is human judgment, and the second is comparison with a reference summary.

In the first approach, the summary generated by the machine is compared with a summary of the same text written by a human. The problem with this approach is that personal tastes and opinions differ. To reduce this effect, the automatic summary is compared with more than one human-written summary, which is time consuming.

In the second approach, a dataset contains several original texts together with their summaries. An original document and its reference summary form a good basis for evaluating a proposed summarization algorithm.

One of the best-known evaluation methods is ROUGE-N. It compares an automatically generated summary with summaries written by humans. ROUGE-N uses three evaluation criteria, precision (P), recall (R), and F-measure, which are calculated as follows:

P = \frac{|sum_{ref} \cap sum_{cand}|}{|sum_{cand}|}

R = \frac{|sum_{ref} \cap sum_{cand}|}{|sum_{ref}|}

F = \frac{2PR}{P + R}

Here sum_ref denotes the summary extracted by experts and sum_cand denotes the summary extracted by the system.

Another evaluation criterion in ROUGE-N compares the number of N-grams in the machine summary and the human summary. In its formula, count(N-gram) is the number of N-grams in the human summary.

VIII. CONCLUSIONS
Text summarization is one of the most exciting research areas in natural language processing. It is an open research area, and a great deal of research is being done on it. Text summarization can be classified into different groups and approaches, the most notable of which were studied in this paper. First, the topic of text mining and its relationship with text summarization was considered. Then important design criteria for text summarization systems were presented. Different approaches to summarization and the important parameters needed to rate sentences were introduced. Finally, important evaluation approaches were introduced.

REFERENCES
[1] H. Luhn, "The automatic creation of literature abstracts," IBM Journal of Research and Development.
[2] V. Gupta, G.S. Lehal, "A Survey of Text Mining Techniques and Applications," Journal of Emerging Technologies in Web Intelligence.
[3] G.O. Makbule, I. Cicekli, F. Nur Alpaslan, "Text Summarization of Turkish Texts using Latent Semantic Analysis," International Conference on Computational Linguistics.
[4] E. Hovy, C. Lin, "Automated Text Summarization and the SUMMARIST System," in Advances in Automatic Text Summarization, MIT Press.
[5] S. Gholamrezazadeh, M. Amini, B. Gholamzadeh, "A Comprehensive Survey on Text Summarization Systems," IEEE.
[6] T. Hirao, Y. Sasaki, H. Isozaki, "An extrinsic evaluation for question-biased text summarization on QA tasks," in Proceedings of NAACL Workshop on Automatic Summarization.
[7] J.V. Goldstein, J. Mittal, J. Carbonell, M. Kantrowitz, "Multi-document summarization by sentence extraction," in Proc. of the ANLP/NAACL Workshop on Automatic Summarization.
[8] G.O. Makbule, "Text Summarization using Latent Semantic Analysis," M.S. thesis, Middle East Technical University.
[9] V. Gupta, G. Lehal, "A Survey of Text Summarization Extractive Techniques," Journal of Emerging Technologies in Web Intelligence.
[10] H. Edmundson, "New Methods in Automatic Extracting," Journal of the Association for Computing Machinery.
[11] K. Youngkoong, S. Jungyun, "An Effective Sentence Extraction Technique Using Contextual Information and Statistical Approaches for Text Summarization," Pattern Recognition Letters.
[12] M. Wasson, "Using leading text for news summaries: Evaluation results and implications for commercial summarization applications," in Proc. International Conference on Computational Linguistics and
528*(1
$QQXDO0HHWLQJRIWKH$&/SS
FRXQWBPDWFK1JUDP >@ : DOVDQLH ³7RZDUGV DQ LQIUDVWUXFWXUH IRU $UDELF WH[W VXPPDUL]DWLRQ
XVLQJUKHWRULFDOVWUXFWXUHWKHRU\´067KHVLV´'HSDUWPHQWRIFRPSXWHU
FRXQW1JUDP VFLHQFH.LQJ6DXGXQLYHUVLW\5L\DGK.LQJGRPRI6DXGL$UDELD
>@ * 6DOWRQ ³$XWRPDWLF 7H[W 3URFHVVLQJ 7KH 7UDQVIRUPDWLRQ $QDO\VLV
,QWKH(T1LVWKHQXPEHURI1JUDPVFRXQWBPDWFK DQG5HWULHYDORI,QIRUPDWLRQE\&RPSXWHU´$GGLVRQ:HVOH\3XEOLVKLQJ
1JUDP LV PD[LPXP QXPEHU RI 1JUDPV ZKLFK DUH LQ &RPSDQ\
PDFKLQH VXPPDU\ DQG KXPDQ VXPPDU\ VLPXOWDQHRXVO\
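The precision, recall, F-measure, and N-gram overlap measures described in the evaluation section (Section VII) can be illustrated with a short Python sketch. This is only a minimal illustration under stated assumptions, not the implementation used by the authors or by the official ROUGE toolkit. The function names `prf`, `ngrams`, and `rouge_n` are hypothetical, the summaries are tokenized by lowercasing and splitting on whitespace, only a single human summary is used, and each matching N-gram is counted at most as often as it appears in both summaries.

```python
from collections import Counter


def prf(sum_ref, sum_cand):
    """Precision, recall and F-measure over two collections of extracted units.

    sum_ref  -- units (e.g. sentences) chosen by experts
    sum_cand -- units chosen by the system
    """
    ref, cand = set(sum_ref), set(sum_cand)
    overlap = len(ref & cand)
    p = overlap / len(cand) if cand else 0.0
    r = overlap / len(ref) if ref else 0.0
    f = 2 * p * r / (p + r) if (p + r) > 0 else 0.0
    return p, r, f


def ngrams(tokens, n):
    """Count every contiguous run of n tokens."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n(reference, candidate, n=1):
    """Matched N-grams divided by the N-grams of the human summary.

    Assumption: whitespace tokenization, lowercasing, one reference summary.
    """
    ref_counts = ngrams(reference.lower().split(), n)
    cand_counts = ngrams(candidate.lower().split(), n)
    total = sum(ref_counts.values())
    if total == 0:
        return 0.0
    # An N-gram matches at most as many times as it occurs in both summaries.
    matched = sum(min(count, cand_counts[gram]) for gram, count in ref_counts.items())
    return matched / total


if __name__ == "__main__":
    print(prf({"s1", "s2", "s3"}, {"s2", "s3", "s4", "s5"}))
    print(rouge_n("the cat sat on the mat", "the cat lay on the mat", n=1))
    print(rouge_n("the cat sat on the mat", "the cat lay on the mat", n=2))
```

In the example, two of the four system sentences also appear among the three expert sentences, so precision is 2/4 and recall is 2/3. For the two toy sentences, five of the six reference unigrams and three of the five reference bigrams are matched.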