when there is too little data for the network to generalize the patterns well;
too many neurons: even with a sufficiently large amount of data, with a high-capacity network the overfitting grows as the number of epochs increases;
import tensorflow as tf
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt # plotting library
plt.style.use('ggplot') # set the plotting style
i.e., if we have too many neurons, they begin to tune themselves to details, to separate small groups of objects, and stop building generalized boundaries that take all the objects into account; i.e., the neurons of one layer become too strongly coordinated with the neurons of another layer;
to rule out this excessive co-adaptation, at every epoch or training step we will change the set of neurons available from the previous layer, i.e. we will "switch off/drop" some of the neurons of the previous layer so that the neurons of the next layer cannot tune themselves only to the outputs of particular neurons of the previous layer (i.e. overfit); see Fig. 1;
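The "dropping" described above can be sketched with a random Bernoulli mask in NumPy. This is our own minimal illustration, not the Keras implementation; the names `rate`, `mask`, and `a` are ours:

```python
import numpy as np

rng = np.random.default_rng(0)
rate = 0.25                    # fraction of neurons to drop on this step
a = np.ones(8)                 # outputs of the previous layer
mask = rng.random(8) >= rate   # True = neuron stays available on this step
d = a * mask                   # dropped neurons contribute 0 to the next layer
```

On every training step a new `mask` is drawn, so the next layer never sees the same fixed subset of outputs twice in a row.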
Dropout implementation
This layer should be placed after a Dense layer (or some other layer);
Denote the output of the layer after which the Dropout layer with parameter rate = p is placed by ā, where ā = (a1, a2, ..., am).
Applying Dropout to the output of the previous layer at the training stage can be written as

d_i = X_i ∗ a_i ,  i = 1..m,

where X_i equals 1 with probability q = 1 − p (the neuron is kept) and 0 with probability p (the neuron is dropped). Since this reduces the expected output by a factor of q, to preserve the output energy the nonzero outputs at the training stage are multiplied by 1/q:

d̄ = (1/q) ∗ X̄ ∗ ā
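The 1/q scaling can be checked directly on Keras's Dropout layer: with rate = p, the kept components are multiplied by 1/(1 − p) when the layer is called with training=True, and at inference the layer is the identity. A small check, assuming TensorFlow 2.x:

```python
import numpy as np
import tensorflow as tf

p = 0.25
layer = tf.keras.layers.Dropout(rate=p)
x = np.ones((1, 1000), dtype='float32')

train_out = layer(x, training=True).numpy()   # zeros and values scaled by 1/(1 - p)
infer_out = layer(x, training=False).numpy()  # identity at inference time
```

With p = 0.25 every surviving unit equals 1/0.75 ≈ 1.333, so the expected value of each output stays 1.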
# normalize the pixel values to [0, 1]
x_train = x_train.astype('float32') / 255
x_test = x_test.astype('float32') / 255
print('train shape:', x_train.shape, y_train.shape)
print('test shape:', x_test.shape, y_test.shape)
array([0., 0., 0., 0., 0., 1., 0., 0., 0., 0.], dtype=float32)
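The one-hot row above is the encoding of class 5 among 10 classes; `tf.keras.utils.to_categorical` produces exactly this from integer labels:

```python
import tensorflow as tf

# encode the integer label 5 as a one-hot vector of length 10
y = tf.keras.utils.to_categorical([5], num_classes=10)[0]
# → array([0., 0., 0., 0., 0., 1., 0., 0., 0., 0.], dtype=float32)
```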
model_2.add(tf.keras.layers.Dense(512, activation='tanh'))
model_2.add(tf.keras.layers.Dropout(rate=0.25))
model_2.add(tf.keras.layers.Dense(num_classes, activation='softmax'))
model_2.summary()
Model: "sequential_1"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
dense_3 (Dense) (None, 512) 401920
_________________________________________________________________
dropout_2 (Dropout) (None, 512) 0
_________________________________________________________________
dense_4 (Dense) (None, 512) 262656
_________________________________________________________________
dropout_3 (Dropout) (None, 512) 0
_________________________________________________________________
dense_5 (Dense) (None, 10) 5130
=================================================================
Total params: 669,706
Trainable params: 669,706
Non-trainable params: 0
_________________________________________________________________
# prepare the model for training
model_2.compile(
loss='categorical_crossentropy',
optimizer=tf.keras.optimizers.SGD(learning_rate=0.01),
metrics=['accuracy']
)
# train the model
batch_size = 256
epochs = 200
history_2 = model_2.fit(
x_train, y_train,
batch_size=batch_size,
epochs=epochs,
verbose=1,
validation_data=(x_test, y_test)
)
235/235 [==============================] - 1s 4ms/step - loss: 0.1463 - accuracy: 0.9
...
Epoch 198/200
235/235 [==============================] - 1s 4ms/step - loss: 0.1276 - accuracy: 0.9
(log truncated: over epochs 170-198 the training loss drifts from ~0.146 down to ~0.128; the accuracy column is cut off at "0.9" in the source)
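The `history_2` object returned by `fit` stores these per-epoch metrics in `history_2.history`, which is usually visualized with matplotlib. A sketch with a stand-in dictionary of the same shape (the `history` dict below is our placeholder; the real call would plot `history_2.history['loss']` and `history_2.history['val_loss']`):

```python
import matplotlib
matplotlib.use('Agg')  # non-interactive backend for scripted use
import matplotlib.pyplot as plt

# stand-in for history_2.history (same keys that model.fit returns)
history = {'loss': [0.30, 0.20, 0.15, 0.14],
           'val_loss': [0.28, 0.21, 0.18, 0.17]}

fig, ax = plt.subplots()
ax.plot(history['loss'], label='train loss')
ax.plot(history['val_loss'], label='val loss')
ax.set_xlabel('epoch')
ax.legend()
fig.savefig('loss_curves.png')
```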
model_3.summary()
Model: "sequential_7"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
dense_26 (Dense) (None, 2048) 1607680
_________________________________________________________________
dropout_19 (Dropout) (None, 2048) 0
_________________________________________________________________
dense_27 (Dense) (None, 1024) 2098176
_________________________________________________________________
dropout_20 (Dropout) (None, 1024) 0
_________________________________________________________________
dense_28 (Dense) (None, 512) 524800
_________________________________________________________________
dropout_21 (Dropout) (None, 512) 0
_________________________________________________________________
dense_29 (Dense) (None, 10) 5130
=================================================================
Total params: 4,235,786
Trainable params: 4,235,786
Non-trainable params: 0
_________________________________________________________________
# train the model
batch_size = 256
epochs = 200
history_3 = model_3.fit(
x_train, y_train,
batch_size=batch_size,
epochs=epochs,
verbose=1,
validation_data=(x_test, y_test)
)
235/235 [==============================] - 1s 4ms/step - loss: 0.1233 - accuracy: 0.9
...
Epoch 197/200
235/235 [==============================] - 1s 5ms/step - loss: 0.1127 - accuracy: 0.9
(log truncated: over epochs 169-197 the training loss of model_3 decreases from ~0.123 to ~0.113; the accuracy column is cut off at "0.9" in the source)
dir(model.layers[1])
model.layers[1].output_shape[-1]
# rough size estimate per Dense layer: inputs * (outputs + 1)
model2_sizes = [layer.trainable_weights[0].shape[0] * (layer.trainable_weights[0].shape[1] + 1)
                for layer in model_2.layers if layer.trainable_weights]
model2_sizes
model3_sizes = [layer.trainable_weights[0].shape[0] * (layer.trainable_weights[0].shape[1] + 1)
                for layer in model_3.layers if layer.trainable_weights]
model3_sizes
[1606416, 2099200, 525312, 5632]
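The sizes above are rough estimates (inputs × (outputs + 1) per Dense layer); Keras itself reports the exact count, inputs × outputs weights plus one bias per output, via `layer.count_params()`. A small self-contained check with the same 784 → 512 shape as dense_3 in the summary of model_2:

```python
import tensorflow as tf

# a stand-alone Dense layer with the same shape as dense_3 above (784 -> 512)
layer = tf.keras.layers.Dense(512)
layer.build((None, 784))      # create the weight matrix and bias for a 784-dim input

exact = layer.count_params()  # inputs*outputs + outputs = 784*512 + 512 = 401920
approx = 784 * (512 + 1)      # the rough estimate used for model2_sizes = 402192
```

The estimate overcounts slightly because it charges one bias per input rather than one per output, which is why 670480 and 4236560 differ from the 669,706 and 4,235,786 reported by summary().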
   layers                          model         size     epochs  train_acc  test_acc
0  512-512-512-512-10              sequential_1  670480   200     0.960600   0.9643
1  2048-2048-1024-1024-512-512-10  sequential_7  4236560  200     0.964167   0.9726
learning_step = 0.001
learning_step = 0.01
Summing up
dropout helps to avoid overfitting by switching off some of the neurons, so that an excess of neurons does not degrade the generalizing properties of the network;
using dropout, the capacity of the network, and with it the quality of its results, can be increased without fear of overfitting;