One of the most common lines of questioning from data scientists today goes something like this: “How do I know if my model is good? Should I just report the R^2 score? Should I use AUC? What does it even mean for a model to be good?”
The best way to decide whether or not a model is good is to have the business interests directly plugged into that decision.
That means instead of:
My accuracy is very high.
The question or concern should be something like:
How much money do I gain (or lose) from each prediction?
Data scientists who can’t answer whether the business is gaining or losing money from a given prediction shouldn’t even think about putting that model into production.
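To make that concrete, here is a minimal sketch of what “money per prediction” can look like for a binary classifier. The scenario and dollar figures are illustrative assumptions, not numbers from the text:

```python
import numpy as np

# Hypothetical dollar values for each outcome of a "will this loan default?"
# model. These figures are assumptions for illustration only.
VALUE = {
    "true_positive": 500.0,     # flagged a bad loan before issuing it
    "false_positive": -50.0,    # manual review of a loan that was actually fine
    "true_negative": 0.0,       # issued a good loan, business as usual
    "false_negative": -5000.0,  # issued a loan that later defaulted
}

def dollars_per_prediction(y_true, y_pred):
    """Average money gained (or lost) per prediction under the value matrix."""
    y_true = np.asarray(y_true, dtype=bool)
    y_pred = np.asarray(y_pred, dtype=bool)
    total = (
        VALUE["true_positive"] * np.sum(y_true & y_pred)
        + VALUE["false_positive"] * np.sum(~y_true & y_pred)
        + VALUE["true_negative"] * np.sum(~y_true & ~y_pred)
        + VALUE["false_negative"] * np.sum(y_true & ~y_pred)
    )
    return total / len(y_true)

# Two models with identical accuracy (7/8) but very different business value:
y_true  = [1, 1, 0, 0, 0, 0, 0, 0]
model_a = [1, 0, 0, 0, 0, 0, 0, 0]  # one missed default
model_b = [1, 1, 1, 0, 0, 0, 0, 0]  # one false alarm
print(dollars_per_prediction(y_true, model_a))  # -562.5 per prediction
print(dollars_per_prediction(y_true, model_b))  # 118.75 per prediction
```

The point of the toy comparison: an accuracy score treats a missed default and a false alarm identically, while a value matrix does not.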
Another component of knowing whether a model is good or not comes down to proper monitoring. Oftentimes, when putting together a dataset to build a model, data scientists think about building the model that one time, but not about how they’re going to keep track of how well it’s actually doing in the future. How does one get the information - the baseline - to compare the model against?
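One common way to get that baseline is to log live predictions and compare the model’s rolling performance against a naive reference, such as always guessing the majority class. The sketch below is illustrative; the function names, window size, and margin are assumptions:

```python
import numpy as np

def rolling_hit_rate(outcomes, predictions, window=500):
    """Fraction of correct calls over the most recent `window` resolved cases."""
    outcomes = np.asarray(outcomes[-window:])
    predictions = np.asarray(predictions[-window:])
    return float(np.mean(outcomes == predictions))

def baseline_hit_rate(outcomes, window=500):
    """What always guessing the majority class would score on the same cases."""
    outcomes = np.asarray(outcomes[-window:])
    majority = 1 if outcomes.mean() >= 0.5 else 0
    return float(np.mean(outcomes == majority))

def needs_attention(outcomes, predictions, margin=0.02):
    """Flag the model when its edge over the naive baseline evaporates."""
    return rolling_hit_rate(outcomes, predictions) < baseline_hit_rate(outcomes) + margin
```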
It’s a tricky feedback loop that must be built into workflows, but it gets even harder:
The rarer the thing you’re trying to predict is, or
The longer it takes for your prediction to be proven right or wrong.
For example, when trying to predict whether somebody will default on a loan, one might not know for 10 years whether the prediction was correct. Nevertheless, the data scientist must think about how to feed that outcome back into the model.
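Here is a sketch of the bookkeeping that makes such a slow feedback loop possible: log each prediction immediately, attach the outcome whenever it finally arrives, and only score or retrain on resolved records. The class and field names are illustrative assumptions:

```python
import datetime as dt
from dataclasses import dataclass
from typing import Dict, List, Optional

@dataclass
class LoanPrediction:
    loan_id: str
    model_version: str
    predicted_default_prob: float
    predicted_at: dt.date
    defaulted: Optional[bool] = None      # unknown until the loan term ends
    resolved_at: Optional[dt.date] = None

prediction_log: Dict[str, LoanPrediction] = {}

def log_prediction(loan_id: str, model_version: str, prob: float) -> None:
    """Record the prediction the moment it is made, before any outcome exists."""
    prediction_log[loan_id] = LoanPrediction(loan_id, model_version, prob, dt.date.today())

def record_outcome(loan_id: str, defaulted: bool) -> None:
    """Close the loop whenever ground truth arrives, even years later."""
    rec = prediction_log[loan_id]
    rec.defaulted = defaulted
    rec.resolved_at = dt.date.today()

def resolved_examples() -> List[LoanPrediction]:
    """Only resolved predictions can be scored or fed back into retraining."""
    return [r for r in prediction_log.values() if r.defaulted is not None]
```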
The bottom line is to work within a framework when building models, instead of building in isolation and then figuring out monitoring - or even pushing to production - after the fact.