
Machines learning about machines

16 February 2018

Let’s say person A has to answer a question asked by person Q, while person B observes person A and guesses whether person A will give the correct answer.

[Figure 1: Machinelearning_I – person A answering person Q’s question while person B observes]

Person B sees the question but doesn’t necessarily need to know the answer himself in order to perform his task. If person B is good at his job, it would be smart for person A to ask person B for his opinion before actually answering Q.

The purpose of this article is to transfer this situation into the machine learning world.

In this sense, machine A, machine B and machine Q replace person A, person B and person Q, respectively. Possible use cases where a ‘second opinion’ machine could be of value are:

  • Machine B recognizes whether the question deals with morally critical situations (e.g. the dilemma situations a driverless car can get into).
  • Unseen input combined with machine B’s guesses could serve as additional training data for machine A.
  • Machine B could detect misunderstandings between machines A and Q.

More generally, machine B learns about machine A so that in a further step machine A could learn from machine B.

I couldn’t think of a simpler way to illustrate this situation than the Iris flower data set. The code I will be using is based on Jason Brownlee’s very straightforward tutorial (https://machinelearningmastery.com/machine-learning-in-python-step-by-step/). I have to add that the goal of this article is not to present decent and useful results (actually they won’t be useful at all 😉); it is just about showing the idea in a simple way and maybe starting a discussion on how it could be used and implemented in a more advanced way. I’m just playing around a little bit without knowing how much sense this all makes.

Basically, the steps are the following:

  1. Train machine A to predict the correct flower based on some input.
  2. Validate machine A and train machine B.
  3. Validate machine B.

First of all, I load the needed libraries and the dataset:

# Load libraries
import pandas
from sklearn import model_selection
from sklearn.metrics import classification_report
from sklearn.metrics import confusion_matrix
from sklearn.metrics import accuracy_score
from sklearn.neighbors import KNeighborsClassifier


# Load dataset
url = "https://archive.ics.uci.edu/ml/machine-learning-databases/iris/iris.data"
names = ['sepal-length', 'sepal-width', 'petal-length', 'petal-width', 'class']
dataset = pandas.read_csv(url, names=names)
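
Optionally, a quick look at the shape and the first rows confirms that the data loaded correctly (the Iris dataset has 150 rows and the five columns named above):

# Optional sanity check: 150 rows, 5 columns
print(dataset.shape)
print(dataset.head())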

Now the tricky part begins. We have to decide how to split the dataset into training and validation data. To illustrate the idea, I would like machine A to do its job rather badly. Therefore I even ignore two columns of the dataset (I omit ‘petal-length’ and ‘petal-width’) and take only 20% of the dataset as training data. These are two decisions that obviously don’t make sense at all if the goal is to train machine A in the best possible way (any other goal may be useless anyway, but let’s not care about this right now 😊). I hope the following figure makes it clear how the dataset (150 flowers in total) is split up:

[Figure 2: Machinelearning_II – split of the 150 flowers into the training set for A (orange), validation set A1 (green) and validation set A2 (blue)]

# Split-out training set for A
array = dataset.values
X = array[:,0:2]
Y = array[:,4]
validation_size_A = 0.8
seed = 7
X_train_A, X_remaining, Y_train_A, Y_remaining = model_selection.train_test_split(X, Y, test_size=validation_size_A, random_state=seed)
# Split-out validation sets for A
validation_size_B = 0.2
X_validation_A1, X_validation_A2, Y_validation_A1, Y_validation_A2 = model_selection.train_test_split(X_remaining, Y_remaining, test_size=validation_size_B, random_state=seed)
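
To double-check that the split matches the figure above, we can print how many flowers end up in each set; with the proportions chosen here this should give 30 flowers for training machine A, 96 in validation set A1 and 24 in validation set A2:

# Sanity check: number of flowers in each set (expected: 30, 96, 24)
print(len(X_train_A), len(X_validation_A1), len(X_validation_A2))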


Now we can train machine A (using the orange flowers) and validate it using validation set A1 (the green flowers). I use the k-nearest neighbors algorithm since it delivered the most illustrative results:


# Make predictions on first validation set for A
knn_A = KNeighborsClassifier()
knn_A.fit(X_train_A, Y_train_A)
predictions_A1 = knn_A.predict(X_validation_A1)
print(accuracy_score(Y_validation_A1, predictions_A1))
print(confusion_matrix(Y_validation_A1, predictions_A1))
print(classification_report(Y_validation_A1, predictions_A1))


[Figure 3: machine A’s results on validation set A1 – accuracy, confusion matrix and classification report]

Subsequently, in order to create the validation set for machine B, we validate machine A again, this time using the smaller validation set A2 (the blue flowers). Check the results below.


# Make predictions on second validation set for A
predictions_A2 = knn_A.predict(X_validation_A2)
print(accuracy_score(Y_validation_A2, predictions_A2))
print(confusion_matrix(Y_validation_A2, predictions_A2))
print(classification_report(Y_validation_A2, predictions_A2))


[Figure 4: machine A’s results on validation set A2 – accuracy, confusion matrix and classification report]

Finally, we can create the training set for machine B. The input will be the same as in validation set A1 (the green flowers). However, the output will not contain flowers but a boolean indicating whether machine A’s answer was correct. We also create machine B’s validation set in a similar way.


# Create training and validation set for B
X_train_B = X_validation_A1
Y_train_B = Y_validation_A1 == predictions_A1
X_validation_B = X_validation_A2
Y_validation_B = Y_validation_A2 == predictions_A2
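
It is worth noting what machine B’s training labels look like: the share of True values in Y_train_B is, by construction, exactly machine A’s accuracy on validation set A1 shown in figure 3:

# Share of True labels for machine B = machine A's accuracy on validation set A1
print(Y_train_B.mean())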


Last but not least, we train machine B (using the green flowers) and validate it (using the blue flowers):


# Make predictions on validation set for B
knn_B = KNeighborsClassifier()
knn_B.fit(X_train_B, Y_train_B)
predictions_B = knn_B.predict(X_validation_B)
print(accuracy_score(Y_validation_B, predictions_B))
print(confusion_matrix(Y_validation_B, predictions_B))
print(classification_report(Y_validation_B, predictions_B))


[Figure 5: machine B’s results on its validation set – accuracy, confusion matrix and classification report]
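
With both machines trained, the ‘second opinion’ from the introduction could look roughly like the following sketch: machine A proposes an answer and machine B judges whether that answer can be trusted. This is just a hypothetical illustration on top of the walkthrough above; the flower measurements are made up:

# Hypothetical 'second opinion' at prediction time (made-up input values)
new_flower = [[5.1, 3.5]]  # sepal-length and sepal-width of an unseen flower

answer_A = knn_A.predict(new_flower)  # machine A's answer to the question
trust_B = knn_B.predict(new_flower)   # machine B's guess: will A be right?

if trust_B[0]:
    print("Machine A answers:", answer_A[0])
else:
    print("Machine B doubts machine A's answer:", answer_A[0])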


I massively weakened machine A in order to train machine B, which sounds quite stupid. However, I hope the idea is clear now. The points I find interesting are:

  • Can we apply an adapted version of cross-validation so that we don’t weaken machine A that much? (A rough sketch of this follows after the list.)
  • Let’s say machine A’s answer is Iris-setosa (97% precision according to the first validation) but machine B predicts that this answer is wrong (60% precision). How could this situation be interpreted?
  • Similarly, let’s say machine A’s answer is Iris-virginica (68% precision) and machine B backs this answer by predicting it is correct (84% precision). Is this really additional value, or would it be better to have a more precise machine A and forget about machine B altogether?
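
Regarding the first point, here is a rough sketch of how an adapted cross-validation could look: instead of sacrificing most of the data, machine A could be judged through out-of-fold predictions, so every flower gets an answer from a machine A that has not seen it, and machine B learns from those answers. This is only an assumption about how it could be done (the names oof_predictions_A, knn_B_cv and knn_A_full are made up):

# Sketch: out-of-fold predictions instead of a weakened machine A
X_all = dataset.values[:, 0:2]  # still only the sepal measurements
Y_all = dataset.values[:, 4]

# Every flower is predicted by a machine A trained on the other folds
oof_predictions_A = model_selection.cross_val_predict(KNeighborsClassifier(), X_all, Y_all, cv=5)

# Machine B learns whether machine A's out-of-fold answer was correct
knn_B_cv = KNeighborsClassifier()
knn_B_cv.fit(X_all, oof_predictions_A == Y_all)

# Machine A itself can now be trained on all the data
knn_A_full = KNeighborsClassifier()
knn_A_full.fit(X_all, Y_all)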
