

Multilayer Perceptron

• Python 3 (3.5+) – Tutorial built using 3.6.3
• Installed packages:
– scikit-learn
– numpy
– matplotlib
• GitHub example code:
– https://github.com/acun1994/scikit-learn-
Dataset loading
• First we load the Iris flower dataset.
from sklearn import datasets

iris = datasets.load_iris()
X = iris.data    # feature matrix
y = iris.target  # class labels

• This snippet loads the Iris dataset from scikit-learn’s collection and splits it into the data (X) and label (y) parts.
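A quick way to confirm what was loaded is to inspect the shapes of the two arrays. The sketch below only uses attributes of the object returned by load_iris(); the printed shape comments reflect the standard Iris dataset shipped with scikit-learn.

```python
from sklearn import datasets

iris = datasets.load_iris()
X = iris.data    # feature matrix: one row per flower, one column per measurement
y = iris.target  # integer class label for each row

print(X.shape)             # (150, 4)
print(y.shape)             # (150,)
print(iris.feature_names)  # names of the four measurements
print(iris.target_names)   # names of the three species
```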
Best practice
• This part is not strictly necessary, but is a best practice.
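from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Dataset splitting
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.33)
# Dataset scaling
scaler = StandardScaler()
X_scaledTrain = scaler.fit_transform(X_train)  # fit on TRAIN only
X_scaledTest = scaler.transform(X_test)        # reuse the TRAIN statistics
X_scaledAll = scaler.transform(X)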

• MLP is extremely sensitive to feature scaling, so use StandardScaler to scale the features.
• We only fit on the TRAIN dataset. We then transform the TEST
dataset according to the TRAIN dataset’s values.
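The fit-on-train, transform-on-test rule can be checked numerically. The sketch below is an illustration, not part of the tutorial's source: the fixed random_state is an assumption added so the split is repeatable. It shows that the scaled TRAIN columns have mean 0 and standard deviation 1, while the TEST data is simply shifted and divided using the statistics stored from TRAIN.

```python
import numpy as np
from sklearn import datasets
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

X, y = datasets.load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.33, random_state=0  # assumption: fixed seed for repeatability
)

scaler = StandardScaler()
X_scaledTrain = scaler.fit_transform(X_train)  # learns mean_ and scale_ from TRAIN only
X_scaledTest = scaler.transform(X_test)        # applies the stored TRAIN statistics

# TRAIN columns are centred at 0 with unit standard deviation
train_means = X_scaledTrain.mean(axis=0)
train_stds = X_scaledTrain.std(axis=0)
# TEST columns are usually close to, but not exactly, 0
test_means = X_scaledTest.mean(axis=0)

print(np.round(train_means, 6), np.round(train_stds, 6))
print(np.round(test_means, 3))

# transform() is equivalent to (x - mean_) / scale_ using the TRAIN statistics
manual = (X_test - scaler.mean_) / scaler.scale_
print(np.allclose(manual, X_scaledTest))
```

Fitting the scaler on the full dataset instead would let information about the test samples leak into preprocessing, which tends to make the test score look better than it should.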
• We then train the MLP model
from sklearn.neural_network import MLPClassifier

clf = MLPClassifier(solver='lbfgs', alpha=1e-5,
                    hidden_layer_sizes=(5, 3)).fit(X_scaledTrain, y_train)

• You will need to adjust the hidden_layer_sizes depending on the problem.
Each value is the number of nodes in one hidden layer, so (5, 3) means two
hidden layers with 5 and 3 nodes. The output layer is not listed here:
scikit-learn sizes it automatically from the number of classes in y (3 for
Iris).
• There are other parameters of MLP that you can tweak to increase the
performance. Refer to https://
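To see how hidden_layer_sizes maps onto the network's layers, one can inspect the shapes of the fitted weight matrices in coefs_. This is a minimal sketch: the second configuration (8,), the fixed random_state, and fitting on all rows are assumptions made purely to inspect shapes. Convergence warnings are silenced because only the architecture matters here.

```python
import warnings

from sklearn import datasets
from sklearn.exceptions import ConvergenceWarning
from sklearn.neural_network import MLPClassifier
from sklearn.preprocessing import StandardScaler

X, y = datasets.load_iris(return_X_y=True)
X_scaled = StandardScaler().fit_transform(X)  # all rows: only the shapes matter here

shapes = {}
for hidden in [(5, 3), (8,)]:
    with warnings.catch_warnings():
        warnings.simplefilter("ignore", ConvergenceWarning)
        clf = MLPClassifier(solver='lbfgs', alpha=1e-5,
                            hidden_layer_sizes=hidden,
                            random_state=1).fit(X_scaled, y)
    # One weight matrix per pair of adjacent layers: (inputs, outputs)
    shapes[hidden] = [w.shape for w in clf.coefs_]
    print(hidden, shapes[hidden], "outputs:", clf.n_outputs_)

# (5, 3) [(4, 5), (5, 3), (3, 3)] outputs: 3
# (8,) [(4, 8), (8, 3)] outputs: 3
```

In both cases the last matrix ends in 3 columns, one per Iris class, even though the output size was never passed in.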
• Now we need to test our model
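import numpy as np

predict_y = []
correct = 0
for i in range(len(X_test)):
    # Reshape one scaled sample into a (1, n_features) row for predict()
    predict_me = np.array(X_scaledTest[i].astype(float))
    predict_me = predict_me.reshape(-1, len(predict_me))
    prediction = clf.predict(predict_me)
    predict_y.append(prediction[0])  # keep the prediction for the plots
    if prediction[0] == y_test[i]:
        correct += 1
print("Success rate : ", correct / len(X_test))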
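The per-sample loop can also be written as a single call, since predict() accepts the whole test matrix. The sketch below is an alternative formulation rather than the tutorial's own code: accuracy_score, the fixed random_state values, and max_iter=1000 are assumptions added for the example.

```python
from sklearn import datasets
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.preprocessing import StandardScaler

X, y = datasets.load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.33, random_state=0  # assumption: fixed seed
)
scaler = StandardScaler()
X_scaledTrain = scaler.fit_transform(X_train)
X_scaledTest = scaler.transform(X_test)

clf = MLPClassifier(solver='lbfgs', alpha=1e-5, hidden_layer_sizes=(5, 3),
                    random_state=1, max_iter=1000).fit(X_scaledTrain, y_train)

# Predict every test sample at once instead of looping row by row
predict_y = clf.predict(X_scaledTest)
success_rate = accuracy_score(y_test, predict_y)
print("Success rate : ", success_rate)

# clf.score() computes the same mean accuracy directly
same_rate = clf.score(X_scaledTest, y_test)

# Manual count, equivalent to the loop-based version
loop_rate = sum(int(p == t) for p, t in zip(predict_y, y_test)) / len(y_test)
```

All three numbers are the same quantity: the fraction of test samples whose predicted class matches the true class.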

• You may have to retrain the model several times to get good accuracy, because both the train/test split and the initial network weights are random.
• The source code includes functions to visualise
the clusters
# Visualization
# visualise() is defined in the tutorial's source code
visualise(
    np.concatenate((X_scaledTrain, X_scaledTest), axis=0),
    np.concatenate((y_train, predict_y), axis=0),
    "Train + Test", 1)
visualise(X_scaledAll, y, "True", 2)
• As you can see, the true clusters and the predicted clusters
are rather similar, which suggests the model has learned the classes well.
• Pay attention to the grouping of the datapoints. You can drag
around the scatter plots to see other perspectives.
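The earlier remark that the model may need retraining several times can be made concrete by repeating the whole split, scale, and train cycle with different random seeds and comparing the test accuracy. This is a rough sketch under assumptions: the seed range, the use of random_state, and the silenced convergence warnings are not from the tutorial's source.

```python
import warnings

from sklearn import datasets
from sklearn.exceptions import ConvergenceWarning
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.preprocessing import StandardScaler

X, y = datasets.load_iris(return_X_y=True)

scores = []
for seed in range(5):  # assumption: five runs is enough to see the spread
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.33, random_state=seed)
    scaler = StandardScaler()
    X_scaledTrain = scaler.fit_transform(X_train)
    X_scaledTest = scaler.transform(X_test)
    with warnings.catch_warnings():
        warnings.simplefilter("ignore", ConvergenceWarning)
        clf = MLPClassifier(solver='lbfgs', alpha=1e-5,
                            hidden_layer_sizes=(5, 3),
                            random_state=seed).fit(X_scaledTrain, y_train)
    scores.append(clf.score(X_scaledTest, y_test))

print("Accuracy per run:", [round(s, 3) for s in scores])
print("Worst / best:", round(min(scores), 3), "/", round(max(scores), 3))
```

If the spread between the worst and best run is large, a bigger hidden layer or a different solver may be worth trying before relying on a single lucky run.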
