The Delusion of Accuracy

Accuracy is one of the metrics used to evaluate a predictive model. It is also an everyday English word, and how it is interpreted and communicated in meetings between business users and ML engineers can have significant implications. While there are many other metrics, such as precision, recall, F1 score, and so on[1], most business users relate to accuracy. Yet this metric can often be misleading, and decisions based on a surface-level reading of any single model metric can result in losses.

Let me clarify what I mean. A bank wants to predict who is likely to default on a loan and decide whether to disburse the loan at all. Now, what is an acceptable accuracy for the predictive model? Can I use a model that has 20% accuracy? Is 90% good enough to put my model in production? Well, it depends.

Is 95% accuracy good enough?

Accuracy is the percentage of records that the model classifies correctly. Suppose our model has 95% accuracy, and the bank sees about two defaulters for every 100 applications. That 95% is not good enough: a trivial model that classifies all 100 applications as non-defaulters would catch zero defaulters and still score 98% accuracy, beating ours while adding no value.
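
To make this concrete, here is a minimal sketch, using hypothetical data matching the 2-in-100 default rate above, of how such a do-nothing baseline scores:

```python
# Minimal sketch with made-up data: 2 defaulters in 100 applications.
# A trivial model that labels everyone "non-defaulter" scores 98% accuracy
# while catching zero defaulters.
from sklearn.metrics import accuracy_score, recall_score

y_true = [1] * 2 + [0] * 98   # 1 = defaulter, 0 = non-defaulter
y_pred = [0] * 100            # predict "non-defaulter" for every applicant

print(accuracy_score(y_true, y_pred))  # 0.98 -> "98% accurate"
print(recall_score(y_true, y_pred))    # 0.0  -> no defaulter identified
```

Recall, the fraction of actual defaulters the model flags, is zero here, which is exactly the kind of detail a single accuracy number hides.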

Let’s look at another hypothetical example. Imagine a self-driving car relying solely on AI-based image recognition to detect obstacles in its path. If the model is 99% accurate, it misjudges roughly one obstacle in every hundred; assuming the car faces about one critical obstacle a day, that means an accident once every 100 days. That is more than three accidents a year!
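
The back-of-the-envelope arithmetic behind that claim, under the stated assumption of one critical obstacle decision per day, is simple:

```python
# Back-of-the-envelope arithmetic, assuming one critical obstacle
# decision per day (an illustrative assumption, not a measured rate).
error_rate = 1 - 0.99                 # 99% accuracy -> 1% misclassification
accidents_per_year = error_rate * 365
print(accidents_per_year)             # ~3.65 expected accidents a year
```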

Can I put a model with 20% accuracy in production?

Looking at the other extreme, there are business cases where accuracy is very low yet the model is still useful. One of my previous employers received over 10K web enquiries every day. Tele-callers reached out to these enquiries to close sales, but tele-calling capacity was only 2.5K calls per day, and only about 3% of enquiries translated into a sale. Although our model was only 20% accurate, it captured most of the people who would actually buy the product, and the sales rate jumped to 5.5% of total enquiries. Note, too, that the accuracy metric itself can be shifted simply by adjusting the classification probability threshold, as the sketch below illustrates.
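
Here is a hedged sketch of that threshold effect on synthetic data; the dataset, model, and numbers are illustrative, not the employer’s actual system:

```python
# Illustrative sketch on synthetic data: lowering the decision threshold
# trades accuracy for recall (the share of real buyers captured).
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, recall_score

# Imbalanced data: roughly 3% positives, like the 3% sale conversion rate
X, y = make_classification(n_samples=10_000, weights=[0.97], random_state=0)
model = LogisticRegression(max_iter=1000).fit(X, y)
proba = model.predict_proba(X)[:, 1]   # predicted probability of a sale

for threshold in (0.5, 0.1, 0.02):
    y_pred = (proba >= threshold).astype(int)
    print(threshold,
          round(accuracy_score(y, y_pred), 3),  # accuracy shifts with threshold
          round(recall_score(y, y_pred), 3))    # recall rises as threshold drops
```

At a low threshold, accuracy falls because many non-buyers get flagged, but recall rises; when calling capacity is the constraint, capturing the real buyers in the called list is what drives sales, not overall accuracy.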

So it is important to understand the results of a predictive model in detail, well beyond “accuracy”. Both business stakeholders and ML teams should know what actions will be taken once a prediction is made, and what costs and benefits those actions carry. Every ML project should have an action plan attached to its predictions, and the model must be evaluated against the action the business intends to take. Too often there is a wide gap between the data science community and the business teams that run the programs: most of the time, business stakeholders interact with the machine learning team only after the models have been developed. These are among the reasons data science projects fail. Engaging consultants with the right skills can improve the chances of success for AI/ML projects.
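
One way to close that gap is to score the model in the currency the business actually cares about. Below is a minimal sketch, with made-up cost figures, of evaluating the trivial loan model from earlier by expected cost rather than accuracy:

```python
# Sketch: evaluate by business cost, not accuracy alone.
# The cost figures are assumptions for illustration only.
from sklearn.metrics import confusion_matrix

y_true = [1] * 2 + [0] * 98   # 2 defaulters in 100 applications
y_pred = [0] * 100            # trivial "approve everyone" model

tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()

COST_FN = 10_000  # assumed loss from lending to a defaulter
COST_FP = 500     # assumed profit forgone by rejecting a good applicant

print(fn * COST_FN + fp * COST_FP)  # 20000 lost, despite 98% accuracy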
