Expert Commentary

How to Choose among Three Forecasting Models: Machine Learning, Statistical and Expert

Get to know their strengths and weaknesses.

By Yue Li

First published on August 08, 2019


Forecasting methods usually fall into three categories: statistical models, machine learning models and expert forecasts, with the first two being automated and the last being manual. Statistical methods, including time series models and regression analysis, are considered traditional, while machine learning methods, such as neural networks, random forests and gradient-boosting models, are more modern. Yet when selecting a forecasting method, the “modern vs. traditional” or “automated vs. manual” comparisons can mislead. Preferences often depend on the modeler’s training: Those with data science training tend to prefer machine learning models, while modelers with business backgrounds tend to place more trust in expert forecasts. In fact, each of the three methods has different strengths and can play an important role in forecasting.

Statistical models

Statistical models usually have better explanatory power because they show, in an explicit form, how the forecast variable projects forward or how causal factors drive it. Because of that explicit form, however, the relationships they capture tend to be simpler than those machine learning models can represent.

The highly predictable behavior of statistical models makes them suited for individual series, such as a sales forecast for a particular SKU in a store or a total sales forecast for all SKUs in the store. Since each individual series is modeled independently, parallelization of the modeling process should be considered for scaling purposes.
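Because each series is fitted independently, the work can be spread across workers. The following is a minimal sketch, not a description of any particular production setup. It assumes non-empty numeric series held in a dictionary keyed by SKU, and it uses simple exponential smoothing as a stand-in for whatever per-series statistical model is chosen. The names `ses_forecast`, `forecast_all` and the SKU keys are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def ses_forecast(series, alpha=0.3):
    """One-step-ahead forecast from simple exponential smoothing.

    Stand-in for any per-series statistical model; assumes a non-empty series.
    """
    level = float(series[0])
    for value in series[1:]:
        level += alpha * (value - level)
    return level


def forecast_all(series_by_key, executor):
    """Fit every series independently and return {key: forecast}."""
    keys = list(series_by_key)
    # Executor.map preserves input order, so results line up with keys.
    results = executor.map(ses_forecast, (series_by_key[k] for k in keys))
    return dict(zip(keys, results))


# Hypothetical sales histories for two SKUs.
history = {"sku_a": [10, 10, 10], "sku_b": [0, 10]}

with ThreadPoolExecutor(max_workers=4) as pool:
    forecasts = forecast_all(history, pool)
```

A thread pool keeps the sketch easy to run anywhere. For CPU-heavy model fitting in pure Python, a `ProcessPoolExecutor` can be passed to `forecast_all` instead, since it exposes the same `map` interface.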

Different statistical models rest on different assumptions, so each works well on a specific pattern: the Croston method for intermittent demand series, for example, or an autoregressive integrated moving average (ARIMA) model for autocorrelated series. Because of these specific assumptions, applying statistical models usually requires the modeler to have deeper analytical knowledge.
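To make the idea of a pattern-specific assumption concrete, here is a rough sketch of the basic Croston method for intermittent demand. It smooths the size of non-zero demands and the interval between them separately, then forecasts their ratio as the expected demand per period. The function name, the smoothing constant `alpha` and the initialisation from the first non-zero demand are assumptions made for illustration; published variants differ in these details.

```python
def croston_forecast(demand, alpha=0.1):
    """Forecast demand per period for an intermittent series (basic Croston).

    Returns 0.0 if the history contains no positive demand.
    """
    size = None       # smoothed size of non-zero demands
    interval = None   # smoothed number of periods between non-zero demands
    periods_since_demand = 1

    for value in demand:
        if value > 0:
            if size is None:
                # Initialise from the first non-zero demand and its waiting time.
                size = float(value)
                interval = float(periods_since_demand)
            else:
                # Update both estimates only in periods where demand occurs.
                size += alpha * (value - size)
                interval += alpha * (periods_since_demand - interval)
            periods_since_demand = 1
        else:
            periods_since_demand += 1

    if size is None:
        return 0.0
    return size / interval


# Hypothetical SKU that sells 6 units every third period.
rate = croston_forecast([0, 0, 6, 0, 0, 6, 0, 0, 6])
```

For this regular pattern the smoothed size stays at 6 and the smoothed interval at 3, so the forecast is 2 units per period. A model that assumes demand in every period would have no natural way to represent the zeros.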

Machine learning models

Machine learning models can capture complicated relationships between the causal factors and forecast variables. They work more like a black box, however, in that they cannot express such relationships in a clear form. There have been efforts to make the black box more interpretable, with the interpretability coming from ranking the importance of the factors, such as the Gini index in a random forest model, or a unified approach, such as Shapley additive explanations.

For individual series, machine learning models can be computationally slow and can perform poorly because of overfitting. A good strategy, therefore, is to apply them to a group of series together, such as sales forecasts for all SKUs in a store.

Because this approach fits one big generic model to the whole group, machine learning models usually have good overall performance, but they might not generate similarly strong results at the individual series level. The differences in forecast quality usually come from feature generation and model parameter tuning, which require the modeler to have a good understanding of the data and to spend time on an iterative process of trial and error.

Expert forecasts

Experts can excel at incorporating qualitative information into a forecast. In the fashion industry, for example, trend information is hard to quantify, which makes an expert’s experience and judgment more valuable. In addition, automated forecasts assume that the future will resemble the past. When a market changes quickly, an expert who understands the market dynamics will have a more reliable sense of its future direction. Expert forecasts are subjective, however, and prone to bias. Forecast quality will hinge on the expert’s experience, the information he or she was exposed to and subjective impressions.

The amount of data collected is one factor that helps determine the forecast method. Expert forecasts require minimal or no data. Statistical models require more data, as the number of observations must exceed the number of parameters used in the model. Machine learning models tend to work effectively only on large data sets, since the models often are more complicated. A deep learning model, for example, is unlikely to forecast market growth well because the available data is too small and noisy for the model.

Stability requirements of forecast results also come into play. If a company wants high consistency of results each time it reruns the model, it should first consider a statistical model. This type of model runs individual series separately, has the flexibility to remodel a portion of the series as needed and, because of the high predictability of the model form, produces more stable results. Machine learning models, by contrast, treat a group of series as one big model and are more unpredictable in form, so they must be retrained for all series and may create a less stable forecast. The differences in stability between the two types of methods, however, will depend on the particular business and the data.

It’s essential to understand the priorities of the people using the forecast. We have seen situations in which the users had a complicated and highly automated business, so a machine learning model addressed their needs. In other situations, companies originally said they wanted a state-of-the-art machine learning model, but the end users of the forecasting system either did not trust results from black box models or needed additional information from the model to make decisions. To avoid implementing a forecast system that no one will use, engage end users in the design phase to understand what decisions they want to make from the forecast, how much interpretability they need to make those decisions and what types of models they are comfortable with. Doing so raises the odds of success.

When the situation permits, the best strategy may be to combine the strengths of different methods. We have done this in several recent demand-forecasting cases. By combining forecast results from statistical methods targeting individual series patterns with machine learning methods, which model the effect of complicated causal factors, we have significantly improved forecast accuracy for a large grocery store chain. By designing an appropriate tool to present the automated forecasting results and facilitate the forecasting adjustment process, a food company combined an expert forecast with the automated forecast to incorporate both the qualitative information and quantified results. This not only improved forecast accuracy, which led to millions of dollars in inventory cost savings and higher revenue from a reduction in lost sales, but also instilled more trust in the forecast from end users, making it easier for users to actually adopt the forecast and apply it in the business instead of producing numbers no one uses.
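The article does not say how the forecasts were combined in these cases. One common and simple option, shown here purely as an illustrative sketch, is to weight each method by the inverse of its error on a holdout period. The function names, the choice of mean absolute error and all the numbers are assumptions, not details from the engagements described above.

```python
def mean_absolute_error(actuals, forecasts):
    """Average absolute gap between actuals and forecasts of equal length."""
    return sum(abs(a - f) for a, f in zip(actuals, forecasts)) / len(actuals)


def inverse_error_weights(actuals, holdout_forecasts):
    """Give each model a weight proportional to 1 / its holdout error."""
    floor = 1e-9  # avoids division by zero for a perfect holdout fit
    inverse = {
        name: 1.0 / max(mean_absolute_error(actuals, f), floor)
        for name, f in holdout_forecasts.items()
    }
    total = sum(inverse.values())
    return {name: value / total for name, value in inverse.items()}


def combine(future_forecasts, weights):
    """Weighted average of the models' forecasts, period by period."""
    horizon = len(next(iter(future_forecasts.values())))
    return [
        sum(weights[name] * f[t] for name, f in future_forecasts.items())
        for t in range(horizon)
    ]


# Hypothetical holdout results: the statistical model was closer to actuals.
actuals = [100, 100]
holdout = {"statistical": [90, 110], "machine_learning": [70, 130]}
weights = inverse_error_weights(actuals, holdout)

# Hypothetical next-period forecasts from each method.
blended = combine({"statistical": [100], "machine_learning": [120]}, weights)
```

In this toy case the statistical model's holdout error is one-third of the machine learning model's, so it receives a weight of about 0.75 and the blended forecast lands near 105. An expert adjustment could be layered on top in the same way, as one more weighted input or as a reviewed override.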

Yue Li is an expert with Bain & Company’s Advanced Analytics practice. She is based in Los Angeles.
