Logistic Regression in Python – Testing
Logistic Regression in Python – Testing
Before putting the classifier we created above into production, we need to test it. If testing reveals that the model does not achieve the expected accuracy, we will have to go back to the process above, select a different set of features (data fields), build the model again, and test it again. This will be an iterative process until the classifier meets your desired accuracy requirements. So, let’s test our classifier.
Predicting Test Data
To test the classifier, we use the test data generated in the previous stage. We call the predict method on the created object and pass it the X array of test data as shown in the following command.
In [24]: predicted_y = classifier.predict(X_test)
This will generate a single-dimensional array for the entire training dataset giving the predicted result for each row in the X array. You can check this array using the following command.
In [25]: predicted_y
Below is the output after executing the above two commands.
Out[25]: array([0, 0, 0, ..., 0, 0, 0])
The output shows that the first and last three customers are not potential candidates for fixed deposits. You can check the entire array to sort out the potential customers. To do this, use the following Python code snippet –
In [26]: for x in range(len(predicted_y)):
if (predicted_y[x] == 1):
print(x, end="t")
The output of running the above code is shown in the figure below.
The output shows the indices of all the rows that are possible candidates for subscribing to TD. You can now give this output to the bank’s marketing team, who will pick out the contact information of each customer in the selected row and continue their work.
Before we put this model into production, we need to verify the accuracy of the predictions.
Verifying Accuracy
To test the accuracy of the model, use the scoring method on the classifier as shown below −
In [27]: print('Accuracy: {:.2f}'.format(classifier.score(X_test, Y_test)))
The screen output of running this command is shown below
Accuracy: 0.90
It shows that our model has an accuracy of 90%, which is considered very good in most applications. Therefore, no further tuning is required. Now, our client is ready to run the next campaign, obtain a list of potential customers, and chase them to open TD, with a high success rate.