One of the most common lines of questioning from data scientists today goes something like this: “How do I know if my model is good? Should I just report the R^2 score? Should I use AUC? What does it even mean for a model to be good?”
The best way to decide whether or not a model is good is to have the business interests directly plugged into that decision.
That means instead of:
My accuracy is very high.
The question or concern should be something like:
How much money do I gain (or lose) from each prediction?
Data scientists who can’t answer whether the business is gaining or losing money from a given prediction shouldn’t even think about putting that model into production.
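To make that concrete, here is a minimal sketch of what “money per prediction” can look like for a binary classifier. The scenario and dollar figures are illustrative assumptions, not numbers from the text:

```python
import numpy as np

# Hypothetical dollar values for each outcome of a "will this loan default?"
# model. These figures are assumptions for illustration only.
VALUE = {
    "true_positive": 500.0,     # flagged a bad loan before issuing it
    "false_positive": -50.0,    # manual review of a loan that was actually fine
    "true_negative": 0.0,       # issued a good loan, business as usual
    "false_negative": -5000.0,  # issued a loan that later defaulted
}

def dollars_per_prediction(y_true, y_pred):
    """Average money gained (or lost) per prediction under the value matrix."""
    y_true = np.asarray(y_true, dtype=bool)
    y_pred = np.asarray(y_pred, dtype=bool)
    total = (
        VALUE["true_positive"] * np.sum(y_true & y_pred)
        + VALUE["false_positive"] * np.sum(~y_true & y_pred)
        + VALUE["true_negative"] * np.sum(~y_true & ~y_pred)
        + VALUE["false_negative"] * np.sum(y_true & ~y_pred)
    )
    return total / len(y_true)

# Two models with identical accuracy (7/8) but very different business value:
y_true  = [1, 1, 0, 0, 0, 0, 0, 0]
model_a = [1, 0, 0, 0, 0, 0, 0, 0]  # one missed default
model_b = [1, 1, 1, 0, 0, 0, 0, 0]  # one false alarm
print(dollars_per_prediction(y_true, model_a))  # -562.5 per prediction
print(dollars_per_prediction(y_true, model_b))  # 118.75 per prediction
```

The point of the toy comparison: an accuracy score treats a missed default and a false alarm identically, while a value matrix does not.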
Another component of knowing whether a model is good or not comes down to proper monitoring. Oftentimes, when putting together a dataset to build a model, data scientists think about building the model that one time, but not about how they’re going to keep track of how well it’s actually doing in the future. How does one get the information - the baseline - to compare the model against?
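One common way to get that baseline is to log live predictions and compare the model’s rolling performance against a naive reference, such as always guessing the majority class. The sketch below is illustrative; the function names, window size, and margin are assumptions:

```python
import numpy as np

def rolling_hit_rate(outcomes, predictions, window=500):
    """Fraction of correct calls over the most recent `window` resolved cases."""
    outcomes = np.asarray(outcomes[-window:])
    predictions = np.asarray(predictions[-window:])
    return float(np.mean(outcomes == predictions))

def baseline_hit_rate(outcomes, window=500):
    """What always guessing the majority class would score on the same cases."""
    outcomes = np.asarray(outcomes[-window:])
    majority = 1 if outcomes.mean() >= 0.5 else 0
    return float(np.mean(outcomes == majority))

def needs_attention(outcomes, predictions, margin=0.02):
    """Flag the model when its edge over the naive baseline evaporates."""
    return rolling_hit_rate(outcomes, predictions) < baseline_hit_rate(outcomes) + margin
```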
It’s a tricky feedback loop that must be built into workflows, but it gets even harder:
The rarer the thing you’re trying to predict is, or
The longer it takes for your prediction to be proven right or wrong.
For example, when trying to predict whether somebody will default on a loan, one might not know for 10 years whether the prediction was correct. Nevertheless, the data scientist must think about how to feed that outcome back into the model.
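Here is a sketch of the bookkeeping that makes such a slow feedback loop possible: log each prediction immediately, attach the outcome whenever it finally arrives, and only score or retrain on resolved records. The class and field names are illustrative assumptions:

```python
import datetime as dt
from dataclasses import dataclass
from typing import Dict, List, Optional

@dataclass
class LoanPrediction:
    loan_id: str
    model_version: str
    predicted_default_prob: float
    predicted_at: dt.date
    defaulted: Optional[bool] = None      # unknown until the loan term ends
    resolved_at: Optional[dt.date] = None

prediction_log: Dict[str, LoanPrediction] = {}

def log_prediction(loan_id: str, model_version: str, prob: float) -> None:
    """Record the prediction the moment it is made, before any outcome exists."""
    prediction_log[loan_id] = LoanPrediction(loan_id, model_version, prob, dt.date.today())

def record_outcome(loan_id: str, defaulted: bool) -> None:
    """Close the loop whenever ground truth arrives, even years later."""
    rec = prediction_log[loan_id]
    rec.defaulted = defaulted
    rec.resolved_at = dt.date.today()

def resolved_examples() -> List[LoanPrediction]:
    """Only resolved predictions can be scored or fed back into retraining."""
    return [r for r in prediction_log.values() if r.defaulted is not None]
```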
The bottom line is to work within a framework when building models, instead of building in isolation and then figuring out monitoring - or even pushing to production - after the fact.