Maarten Versteegh
NLP Research Engineer
Overview
[Diagram: feedforward network with Input and Hidden layers, backpropagating the Error]
Rectified Linear Units
Backpropagation involves repeated multiplication with the derivative of the activation function
→ Problem if the result is always smaller than 1: the gradient vanishes!
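A minimal sketch (plain Python, with an assumed 20-layer depth and fixed pre-activation values for illustration) of why multiplying by derivatives below 1 shrinks the gradient, and why ReLU avoids this:

```python
import math

def sigmoid_derivative(x):
    # derivative of the logistic sigmoid: s(x) * (1 - s(x)), at most 0.25
    s = 1.0 / (1.0 + math.exp(-x))
    return s * (1.0 - s)

def relu_derivative(x):
    # derivative of ReLU: 1 for positive inputs, 0 otherwise
    return 1.0 if x > 0 else 0.0

# gradient signal after backpropagating through 20 layers,
# assuming the same pre-activation value at every layer
grad_sigmoid = 1.0
grad_relu = 1.0
for _ in range(20):
    grad_sigmoid *= sigmoid_derivative(0.0)  # 0.25 per layer
    grad_relu *= relu_derivative(1.0)        # 1.0 per layer

print(grad_sigmoid)  # 0.25 ** 20, vanishingly small
print(grad_relu)     # still 1.0
```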
Text Classification
Traditional approach: BOW + TFIDF
“The car might also need a front end alignment”
F1-Score*
BOW+TFIDF+SVM Some number
[Diagram: BOW Features (×1000) → Hidden ×512 → Hidden ×256 → Output]
from keras.layers import Input, Dense
from keras.models import Model

input_layer = Input(shape=(1000,))
fc_1 = Dense(512, activation='relu')(input_layer)
fc_2 = Dense(256, activation='relu')(fc_1)
output_layer = Dense(10, activation='softmax')(fc_2)

model = Model(inputs=input_layer, outputs=output_layer)
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy')
model.fit(bow, newsgroups.target)
predictions = model.predict(bow).argmax(axis=1)
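One way the `bow` matrix could be produced is with scikit-learn's TF-IDF vectorizer; a sketch on a toy corpus (the documents and the 1000-term vocabulary cap are illustrative):

```python
from sklearn.feature_extraction.text import TfidfVectorizer

# toy corpus standing in for the 20 newsgroups texts
docs = [
    "The car might also need a front end alignment",
    "The engine needs a new alignment and new brakes",
    "God exists, argues the theologian",
]

# cap the vocabulary at the 1000 most frequent terms,
# matching an Input(shape=(1000,)) layer
vectorizer = TfidfVectorizer(max_features=1000)
bow = vectorizer.fit_transform(docs).toarray()

print(bow.shape)  # (number of documents, vocabulary size)
```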
20 newsgroups performance
F1-Score*
BOW+TFIDF+SVM Some number
BOW+TFIDF+SVD+ 2-layer NN Some slightly higher number
layer = Embedding(...)(input_layer)
layer = Convolution1D(
    128,  # number of filters
    5,    # filter size
    activation='relu',
)(layer)
layer = MaxPooling1D(5)(layer)
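Assembled into a complete model, the convolutional pieces might look like this runnable sketch (the 100-token sequence length, vocabulary size, embedding width, and layer count are illustrative assumptions; Conv1D is the current Keras name for Convolution1D):

```python
from keras.layers import (Input, Embedding, Conv1D, MaxPooling1D,
                          GlobalMaxPooling1D, Dense)
from keras.models import Model

seq_len = 100       # padded token-sequence length (assumption)
vocab_size = 20000  # vocabulary size (assumption)

input_layer = Input(shape=(seq_len,))
layer = Embedding(vocab_size, 128)(input_layer)
layer = Conv1D(128, 5, activation='relu')(layer)
layer = MaxPooling1D(5)(layer)
layer = Conv1D(128, 5, activation='relu')(layer)
layer = GlobalMaxPooling1D()(layer)           # collapse time axis to one vector
output_layer = Dense(10, activation='softmax')(layer)

model = Model(inputs=input_layer, outputs=output_layer)
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy')
model.summary()
```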
Performance
F1-Score*
BOW+TFIDF+SVM Some number
CBOW+TFIDF+SVD+NN Some slightly higher number
ConvNet (3 layers) Quite a bit higher now
ConvNet (6 layers) Look mom, even higher!
It's a fact: Trump has tiny hands. Will this be the one that sinks him?    Guardian    595   17  17   225   2   8
Donald Trump Explains His Obama-Founded-ISIS Claim as ‘Sarcasm’            NYTimes     2059  32  284  1214  80  2167
Can hipsters stomach the unpalatable truth about avocado toast?            Guardian    3655  0   396  44    773 69
Tim Kaine skewers Donald Trump's military policy                           MSNBC       1094  111 6    12    2   26
Top 5 Most Antisemitic Things Hillary Clinton Has Done                     Breitbart   1067  7   134  35    22  372
17 Hilarious Tweets About Donald Trump Explaining Movies                   Buzzfeed    11390 375 16   4121  4   5
Go deeper: ResNet
Convolutional Layers with shortcuts
[Diagram: input_layer → ResNet block → … Conv (128) × 10 → ResNet block]
Source: wikipedia
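A sketch of one such block with a shortcut connection, in the same Keras functional style (filter count, kernel size, input shape, and block count are illustrative):

```python
from keras.layers import Input, Conv1D, Add, Activation
from keras.models import Model

def resnet_block(x, filters=128, kernel_size=3):
    # two convolutions on the main path; 'same' padding keeps the
    # sequence length so the shortcut can be added element-wise
    shortcut = x
    y = Conv1D(filters, kernel_size, padding='same', activation='relu')(x)
    y = Conv1D(filters, kernel_size, padding='same')(y)
    y = Add()([y, shortcut])  # the shortcut: add the block input back in
    return Activation('relu')(y)

input_layer = Input(shape=(100, 128))  # (timesteps, channels), illustrative
layer = input_layer
for _ in range(3):                     # stack a few blocks
    layer = resnet_block(layer)
model = Model(inputs=input_layer, outputs=layer)
```

The shortcut lets gradients flow past each block unchanged, which is what makes much deeper stacks trainable.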
Regularization
● Norm penalties on hidden layer weights, never on first and last
● Dropout
● Early stopping
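In Keras terms, the three techniques above might look like this sketch (the penalty strength, dropout rate, and patience value are illustrative assumptions):

```python
from keras.layers import Input, Dense, Dropout
from keras.models import Model
from keras.regularizers import l2
from keras.callbacks import EarlyStopping

input_layer = Input(shape=(1000,))
# L2 norm penalty on an inner hidden layer only, not on first or last
hidden = Dense(512, activation='relu')(input_layer)
hidden = Dense(256, activation='relu', kernel_regularizer=l2(1e-4))(hidden)
hidden = Dropout(0.5)(hidden)  # randomly zero half the activations in training
output_layer = Dense(10, activation='softmax')(hidden)
model = Model(inputs=input_layer, outputs=output_layer)

# early stopping: halt training when validation loss stops improving
early_stop = EarlyStopping(monitor='val_loss', patience=3)
# model.fit(x, y, validation_split=0.1, callbacks=[early_stop])
```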
Size of data set
● Just get more data already
● Augment data:
– Textual replacements
– Word vector perturbation
– Noise Contrastive Estimation
● Semi-supervised learning:
– Adapt word embeddings to your domain
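A toy sketch of the textual-replacement idea (the synonym table is invented for illustration; in practice it could come from WordNet or nearest neighbours in an embedding space):

```python
import random

# hypothetical synonym table, for illustration only
SYNONYMS = {
    "car": ["automobile", "vehicle"],
    "need": ["require"],
}

def augment(sentence, p=0.5, seed=0):
    # replace each known word with a random synonym with probability p
    rng = random.Random(seed)
    words = []
    for w in sentence.split():
        if w in SYNONYMS and rng.random() < p:
            words.append(rng.choice(SYNONYMS[w]))
        else:
            words.append(w)
    return " ".join(words)

print(augment("The car might also need a front end alignment"))
```

Each augmented variant keeps the original label, cheaply multiplying the training set.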
Monitor your model
Training loss:
– Does the model converge?
– Is the learning rate too low or too high?
Training loss and learning rate
www.textkernel.com/jobs
Questions?
Source: http://visualqa.org/