Let’s say person A has to answer a question asked by person Q, while person B observes person A and guesses whether person A will give a correct answer.
Person B sees the question but doesn’t necessarily need to know the answer in order to perform this task. If person B were good at the job, it would be smart for person A to ask person B for an opinion before actually answering Q.
The purpose of this article is to transfer this situation into the machine learning world.
In this sense, machine A, machine B and machine Q replace person A, person B and person Q, respectively. Possible use cases where a ‘second opinion’ machine could be of value are:
- Machine B recognizes if the question deals with morally critical situations (e.g. all these dilemma situations a driverless car can slip into).
- Unseen input combined with machine B’s guesses could serve as additional training data for machine A.
- Machine B could detect misunderstandings between machines A and Q.
More generally, machine B learns about machine A so that in a further step machine A could learn from machine B.
I couldn’t think of a simpler way to illustrate this situation than the Iris flower data set. The code I will be using is based on Jason Brownlee’s very straightforward tutorial (https://machinelearningmastery.com/machine-learning-in-python-step-by-step/). I have to add that the goal of this article is not to present decent and useful results (actually they won’t be useful at all 😉); it is just about showing the idea in a simple way and maybe starting a discussion on how it could be used and implemented in a more advanced way. I’m just playing around a little bit without knowing how much sense all of this makes.
Basically, the steps are the following:
- Train machine A to predict the correct flower based on some input.
- Validate machine A and train machine B.
- Validate machine B.
First of all, I load the needed libraries and the dataset:
# Load libraries
import pandas
from sklearn import model_selection
from sklearn.metrics import classification_report
from sklearn.metrics import confusion_matrix
from sklearn.metrics import accuracy_score
from sklearn.neighbors import KNeighborsClassifier
# Load dataset
url = "https://archive.ics.uci.edu/ml/machine-learning-databases/iris/iris.data"
names = ['sepal-length', 'sepal-width', 'petal-length', 'petal-width', 'class']
dataset = pandas.read_csv(url, names=names)
Now the tricky part begins. We have to decide how we split the dataset into training and validation data. To illustrate the idea I would like machine A to do its job rather badly. Therefore I even ignore two columns of the dataset (I omit ‘petal-length’ and ‘petal-width’) and take only 20% of the dataset as training data. These are two decisions that obviously don’t make sense at all if the goal is to train machine A in the best possible way (any other goal may be useless anyway, but let’s not care about this right now 😊). I hope the following figure makes it clear how the dataset (150 flowers in total) is split up:
# Split-out training set for A
array = dataset.values
X = array[:,0:2]
Y = array[:,4]
validation_size_A = 0.8
seed = 7
X_train_A, X_remaining, Y_train_A, Y_remaining = model_selection.train_test_split(X, Y, test_size=validation_size_A, random_state=seed)
# Split-out validation sets for A
validation_size_B = 0.2
X_validation_A1, X_validation_A2, Y_validation_A1, Y_validation_A2 = model_selection.train_test_split(X_remaining, Y_remaining, test_size=validation_size_B, random_state=seed)
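As a sanity check on this nested split, here is a small self-contained sketch (using a dummy array of 150 rows instead of the actual iris data) confirming the resulting set sizes:

```python
import numpy as np
from sklearn import model_selection

# Dummy stand-in for the iris features (150 rows, 2 columns) and labels
X = np.zeros((150, 2))
Y = np.zeros(150)

# First split: 20% training data for machine A, 80% left over
X_train_A, X_remaining, Y_train_A, Y_remaining = model_selection.train_test_split(
    X, Y, test_size=0.8, random_state=7)

# Second split: the remaining 120 flowers become validation sets A1 and A2
X_validation_A1, X_validation_A2, Y_validation_A1, Y_validation_A2 = \
    model_selection.train_test_split(X_remaining, Y_remaining, test_size=0.2, random_state=7)

print(len(X_train_A), len(X_validation_A1), len(X_validation_A2))  # 30 96 24
```

So machine A trains on 30 flowers, validation set A1 has 96 flowers and the smaller validation set A2 has 24.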
Now we can train machine A (using orange flowers) and validate it using validation set A1 (green flowers). I use the k-nearest neighbors algorithm since it delivered the most demonstrative results:
# Train A and make predictions on first validation set for A
knn_A = KNeighborsClassifier()
knn_A.fit(X_train_A, Y_train_A)
predictions_A1 = knn_A.predict(X_validation_A1)
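The metrics imported at the top can be used to inspect machine A’s answers. A minimal self-contained example with made-up labels (not the actual iris predictions) shows the calls:

```python
from sklearn.metrics import accuracy_score, confusion_matrix, classification_report

# Made-up ground truth and predictions, just to demonstrate the metric functions
y_true = ['setosa', 'versicolor', 'virginica', 'versicolor']
y_pred = ['setosa', 'virginica', 'virginica', 'versicolor']

print(accuracy_score(y_true, y_pred))       # 0.75 (3 of 4 correct)
print(confusion_matrix(y_true, y_pred))     # rows = true class, columns = predicted class
print(classification_report(y_true, y_pred))  # per-class precision/recall/f1
```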
Subsequently, in order to create the validation set for machine B, we validate machine A again, but now using the smaller validation set A2 (blue flowers). Check the results below.
# Make predictions on second validation set for A
predictions_A2 = knn_A.predict(X_validation_A2)
Finally, we can create the training set for machine B. The input will be the same as in validation set A1 (green flowers). However, the output will not contain flowers but a boolean telling whether machine A’s answer was correct. We also create machine B’s validation set in a similar way.
# Create training and validation set for B
X_train_B = X_validation_A1
Y_train_B = Y_validation_A1 == predictions_A1
X_validation_B = X_validation_A2
Y_validation_B = Y_validation_A2 == predictions_A2
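The element-wise comparison that produces machine B’s boolean labels works like this (a toy example with made-up class names and guesses):

```python
import numpy as np

# Made-up true classes and machine A's guesses for three flowers
truth   = np.array(['setosa', 'versicolor', 'virginica'])
guesses = np.array(['setosa', 'virginica', 'virginica'])

# Element-wise comparison: True where A was right, False where it was wrong
labels_for_B = truth == guesses
print(labels_for_B)  # True, False, True
```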
Last but not least we train machine B (using green flowers) and validate it (using blue flowers):
# Train B and make predictions on validation set for B
knn_B = KNeighborsClassifier()
knn_B.fit(X_train_B, Y_train_B)
predictions_B = knn_B.predict(X_validation_B)
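Putting it all together, here is a compact self-contained sketch of the whole A/B pipeline. It uses scikit-learn’s built-in copy of the iris data so it runs without the download; variable names differ slightly from the snippets above:

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import accuracy_score

iris = load_iris()
X = iris.data[:, 0:2]  # again only sepal length/width, to keep machine A weak
Y = iris.target

# Same nested split as above: 20% for A, then the rest split 80/20 into A1/A2
X_train_A, X_rem, Y_train_A, Y_rem = train_test_split(X, Y, test_size=0.8, random_state=7)
X_val_A1, X_val_A2, Y_val_A1, Y_val_A2 = train_test_split(X_rem, Y_rem, test_size=0.2, random_state=7)

# Machine A: predicts the flower class
knn_A = KNeighborsClassifier().fit(X_train_A, Y_train_A)

# Machine B: predicts whether A is right, trained on A's hits/misses on A1
knn_B = KNeighborsClassifier().fit(X_val_A1, knn_A.predict(X_val_A1) == Y_val_A1)

# Validate B on A2: compare B's guesses against A's actual hits/misses there
B_truth = knn_A.predict(X_val_A2) == Y_val_A2
print("accuracy of B:", accuracy_score(B_truth, knn_B.predict(X_val_A2)))
```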
I massively weakened machine A in order to train machine B which sounds quite stupid. However, I hope the idea is clear now. Interesting points to me are:
- Can we apply an adapted version of cross-validation so that we don’t weaken machine A that much?
- Let’s say machine A’s answer is Iris-setosa (97% precision according to first validation) but machine B predicts that this answer is wrong (60% precision). How could this situation be interpreted?
- Similarly, let’s say machine A’s answer is Iris-virginica (68% precision) and machine B backs this answer by predicting it is correct (84% precision). Is this really added value, or would it be better to have a more precise machine A and forget about machine B altogether?