Vo Tri Thong.
1. INTRODUCTION
A. Neural Machine Translation (NMT)
This report discusses the architecture and implementation of the neural
machine translation system devised by Luong, Brevdo, and Zhao (2017). The
document also covers related concepts such as the thought vector, the attention
mechanism, and beam search.
B. BLEU Score
i. Introduction
BLEU is a method for the automatic evaluation of machine translation. Because
human evaluation is expensive and time-consuming, an automatic set of
metrics is needed to evaluate translation results. BLEU is relatively quick to
apply, and it correlates highly with human evaluation. BLEU has proven to be
one of the most prominent methods for evaluating translation results because
of its correlation with human judgments (Papineni et al., 2002).
ii. Algorithm
The BLEU score is a number between 0 and 1. To score a translation,
one compares the n-grams of the candidate translation with the n-grams of the
reference translation and counts the number of matches. The process evaluates
several n-gram sizes and computes a weighted average of the resulting precisions.
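The procedure above can be sketched in a few lines of Python. This is a minimal illustrative implementation of sentence-level BLEU (modified n-gram precision plus a brevity penalty), not the exact scoring code used by the NMT toolkit:

```python
from collections import Counter
import math

def ngrams(tokens, n):
    """All contiguous n-grams of a token list."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def bleu(candidate, reference, max_n=4):
    """Sentence-level BLEU sketch: geometric mean of modified n-gram
    precisions for n = 1..max_n, times a brevity penalty."""
    log_precision_sum = 0.0
    for n in range(1, max_n + 1):
        cand_counts = Counter(ngrams(candidate, n))
        ref_counts = Counter(ngrams(reference, n))
        # Clip each candidate n-gram count by its count in the reference.
        matches = sum(min(c, ref_counts[g]) for g, c in cand_counts.items())
        total = sum(cand_counts.values())
        if matches == 0:
            return 0.0  # no overlap at this order => BLEU is 0
        log_precision_sum += math.log(matches / total) / max_n
    # Brevity penalty punishes candidates shorter than the reference.
    bp = min(1.0, math.exp(1 - len(reference) / len(candidate)))
    return bp * math.exp(log_precision_sum)

cand = "the quick brown fox jumps over the lazy dog".split()
ref = "the fast brown fox jumps over the lazy dog".split()
print(bleu(cand, ref))
```

A perfect match scores 1.0; each mismatched n-gram lowers the corresponding precision, and shorter-than-reference candidates are further penalized by the brevity penalty.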
2. EXPERIMENTS
A. Dataset
Experiments in this report are trained and tested on the IWSLT English-
Vietnamese dataset. The training set contains 133K sentence pairs provided
by the IWSLT Evaluation Campaign.
B. The ‘Vanilla’ NMT model:
i. Configurations:
The parameters of this model are taken from the standard HParams file
iwslt15.json, with one adjustment: attention is set to none.
Key configurations:
"attention": "",
"attention_architecture": "standard",
"learning_rate": 1.0,
"num_units": 512,
"optimizer": "sgd",
"beam_width": 10
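The beam_width setting above controls decoding: instead of greedily picking the single most likely next token, the decoder keeps the 10 highest-scoring partial translations at each step. A toy sketch of the idea, independent of this codebase (the bigram "model" here is invented purely for illustration):

```python
import math

def beam_search(next_log_probs, beam_width, max_len):
    """Minimal beam search: keep the beam_width highest-scoring
    partial sequences at each step, scored by summed log-probability."""
    beams = [((), 0.0)]  # (token sequence, cumulative log prob)
    for _ in range(max_len):
        candidates = []
        for seq, score in beams:
            for tok, logp in next_log_probs(seq).items():
                candidates.append((seq + (tok,), score + logp))
        # Prune back down to the best beam_width hypotheses.
        beams = sorted(candidates, key=lambda c: c[1], reverse=True)[:beam_width]
    return beams

# Toy bigram model: next-token log-probs depend only on the last token.
table = {
    None: {"a": math.log(0.6), "b": math.log(0.4)},
    "a":  {"a": math.log(0.1), "b": math.log(0.9)},
    "b":  {"a": math.log(0.5), "b": math.log(0.5)},
}
model = lambda seq: table[seq[-1] if seq else None]
best_seq, best_score = beam_search(model, beam_width=2, max_len=3)[0]
print(best_seq)
```

With beam_width=1 this degenerates to greedy decoding; larger beams trade decoding time for a better chance of finding a higher-probability translation.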
ii. Results:
The BLEU score is relatively low without the attention mechanism; the maximum
test BLEU is 8.9.
# Best bleu, step 9000 lr 0.0625 step-time 0.00s wps 0.00K ppl 0.00 gN 0.00 dev
ppl 21.77, dev bleu 10.0, test ppl 24.12, test bleu 8.9, Mon Jan 21 09:14:51 2019
Time to train: 30 minutes on a rig with an Nvidia 1080 Ti.
C. NMT with Attention model
i. Model with SGD optimizer
1. Configurations:
The parameters of this model are taken from the standard HParams file
iwslt15, without any adjustment.
"attention": "scaled_luong",
"attention_architecture": "standard",
"learning_rate": 1.0,
"num_units": 512,
"optimizer": "sgd",
"beam_width": 10
2. Results
The BLEU score is higher with the attention mechanism; the maximum test BLEU is 23.1.
# Best bleu, step 12000 lr 0.125 step-time 0.14s wps 40.07K ppl 4.87 gN 5.96
dev ppl 9.88, dev bleu 20.3, test ppl 8.39, test bleu 23.1, Mon Jan 21 06:58:25
2019
Time to train: 30 minutes on a rig with an Nvidia 1080 Ti.
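The "scaled_luong" setting above corresponds to Luong's multiplicative ("general") attention score with an additional learned scale. A NumPy sketch of the attended-context computation, using random placeholder values rather than the repository's actual weights:

```python
import numpy as np

rng = np.random.default_rng(0)
d = 512  # num_units, matching the configuration above
T = 7    # source sentence length (arbitrary for illustration)

h_t = rng.normal(size=d)             # current decoder hidden state
H_s = rng.normal(size=(T, d))        # encoder (source) hidden states
W = rng.normal(size=(d, d)) * 0.01   # learned weight of the "general" score
g = 1.0                              # learned scalar of the "scaled" variant

# score(h_t, h_s) = g * h_t^T W h_s, computed for every source position
scores = g * (H_s @ (W @ h_t))
weights = np.exp(scores - scores.max())
weights /= weights.sum()             # softmax over source positions
context = weights @ H_s              # attention-weighted summary of the source
```

The context vector is then combined with the decoder state to predict the next target word; with "attention": "" (the vanilla model above), the decoder must instead rely on the fixed thought vector alone, which explains the large BLEU gap.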
ii. Model with Adam optimizer and a learning rate of 0.001:
1. Configurations:
"attention": "scaled_luong",
"attention_architecture": "standard",
"learning_rate": 0.001,
"num_units": 512,
"optimizer": "adam",
"beam_width": 10
2. Results
The Adam optimizer quickly reached a high BLEU score by step 5000, but it did
not reach the same peak as SGD. One further observation is that the Adam
optimizer generated significantly more log files.
# Best bleu, step 5000 lr 0.000125 step-time 0.18s wps 30.47K ppl 2.79 gN 8.81
dev ppl 10.88, dev bleu 19.6, test ppl 9.69, test bleu 21.6, Mon Jan 21 10:09:06
2019
Time to train: 1 hour.
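The different convergence behaviour follows from the two update rules. Below is a textbook sketch of each (SGD scales the raw gradient; Adam rescales it by bias-corrected moment estimates, which tends to make early progress fast but can plateau differently). This is a generic illustration, not the training code used in these experiments:

```python
import math

def sgd_step(x, grad, lr):
    """Plain SGD: move against the gradient, scaled by the learning rate."""
    return x - lr * grad

class Adam:
    """Adam update (Kingma & Ba): bias-corrected first and second
    moment estimates adaptively scale each parameter's step."""
    def __init__(self, lr=0.001, b1=0.9, b2=0.999, eps=1e-8):
        self.lr, self.b1, self.b2, self.eps = lr, b1, b2, eps
        self.m = self.v = 0.0
        self.t = 0

    def step(self, x, grad):
        self.t += 1
        self.m = self.b1 * self.m + (1 - self.b1) * grad        # 1st moment
        self.v = self.b2 * self.v + (1 - self.b2) * grad * grad  # 2nd moment
        m_hat = self.m / (1 - self.b1 ** self.t)  # bias correction
        v_hat = self.v / (1 - self.b2 ** self.t)
        return x - self.lr * m_hat / (math.sqrt(v_hat) + self.eps)
```

Because Adam normalizes by the gradient's recent magnitude, its effective step size is roughly the learning rate regardless of gradient scale, which is consistent with the fast early BLEU gains observed at step 5000.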