      Expert Commentary

      Mission Possible: Driver Analysis with Collinear Variables

      Many commonly used methods have serious limitations when assessing the variable importance of collinear drivers.

      By Eleonora Nazander and Ilker Carikcioglu

      At a Glance
      • To determine which drivers have the greatest influence on an outcome variable, many analysts turn to techniques such as multiple linear regression, random forest, or Shapley values.
      • But these methods don’t work well when several drivers are highly collinear.
      • To understand overall variable importance, simple methods such as Pearson correlation can more effectively assess the strength of the relationship between a driver and the outcome variable independently of other potential drivers.
      • With that understanding, managers can then address how to improve performance along the relevant drivers, either singly or in logical clusters.

      Analytical techniques to perform driver analysis come in handy when a company seeks to understand a particular outcome, such as customer satisfaction or profit per store, as a function of several potential drivers. Ranking potential drivers by how strongly they affect the outcome metric allows the company to focus resources on improving performance along the right ones.

      Common techniques include multiple linear regression (MLR), random forest, Shapley values, Johnson’s relative weights, partial correlations, and Pearson correlation. Many of these methods control for the effects of other drivers, which can make them unsuitable for ranking drivers by importance. The reason is the distinction, explained below, between overall variable importance and marginal variable importance.

      Elements of Value® in retail banking

      Consider the case of a retail bank trying to understand what drives customer advocacy. The data comes from a survey that asked 2,500 consumers how likely they are to recommend a certain brand to a friend or colleague, which is the core Net Promoter Score℠ question. This likelihood to recommend becomes the outcome variable, and our goal is to understand which variables have the strongest associations with this metric.

      Potential drivers are a set of 30 attitudinal statements capturing how well a certain brand performs on the Elements of Value as experienced by customers (see Figure 1). The survey asked respondents to rate their experience with the bank on each Element of Value using a scale of 0–10. Delivering on multiple Elements of Value can lift products or services above commodity status.

      Figure 1: The Elements of Value®

      In such a scenario, some analysts would turn to MLR, interpreting standardized coefficients as indicators of variable importance. (Standardized coefficients imply normalization of driver variables for differences in scale, so they are more comparable across variables than raw coefficients.) For an MLR model to produce meaningful results, one must first select the appropriate variables. As is common in psychometric research, some of the 30 drivers correlate highly with others, a phenomenon called multicollinearity. Figure 2 contains model coefficients as well as additional statistical metrics for each driver selected by the algorithm.

      Figure 2: With a likelihood to recommend as the outcome variable, here’s what MLR produces

      Here we need to acknowledge that the set of variables included in MLR will differ depending on how the analyst selects them. A hypothesis-driven approach ensures the highest possible interpretability of the model. But regardless of the approach, the analyst can only include a subset of variables in the model. After we removed insignificant variables, 7 of our 30 potential drivers remained. The other 23 were excluded because they lacked relevance for predicting the outcome variable, or because of high collinearity with drivers already included in the model.

      Because collinearity was one reason for excluding certain variables, we cannot conclude that the seven drivers included in the model are the only important ones. In other words, MLR didn’t produce a ranking of all potential drivers by their importance.

      As for the seven included in the model, can we interpret the coefficients as the relative importance of each driver? Coefficients of MLR indicate what increase in the outcome variable is associated with a one-unit increase in each driver, keeping other drivers constant. In this case, though, it’s not possible to keep the other drivers constant. When drivers are collinear, which often happens with psychometric statements, improving one will likely improve others as well. MLR coefficients thus have limited practical importance here.
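To make the effect of collinearity on standardized coefficients concrete, here is a minimal sketch for the two-driver case, where the standardized coefficients follow directly from the three pairwise Pearson correlations. The function name and the correlation values are hypothetical and are not taken from the survey data.

```python
def standardized_betas(r_y1: float, r_y2: float, r_12: float) -> tuple[float, float]:
    """Standardized OLS coefficients for a two-driver model.

    r_y1, r_y2: Pearson correlation of each driver with the outcome.
    r_12: Pearson correlation between the two drivers.
    """
    denom = 1.0 - r_12 ** 2
    if denom <= 0.0:
        raise ValueError("drivers are perfectly collinear; coefficients are not identified")
    beta_1 = (r_y1 - r_12 * r_y2) / denom
    beta_2 = (r_y2 - r_12 * r_y1) / denom
    return beta_1, beta_2


# Hypothetical values: each driver correlates 0.6 with the outcome.
uncorrelated = standardized_betas(0.6, 0.6, 0.0)  # drivers unrelated to each other
collinear = standardized_betas(0.6, 0.6, 0.9)     # drivers highly collinear
print(uncorrelated)
print(collinear)
```

In this sketch both drivers relate to the outcome equally strongly in both scenarios, yet each coefficient falls from 0.6 to roughly 0.32 once the drivers correlate at 0.9 with each other.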

      MLR can still prove useful for predicting how the outcome variable would change if the company improved drivers included in the model.

      How random forest falls short

      Turning to another common predictive method, random forest, a useful feature of this algorithm is that it estimates how model performance would suffer if you left out a particular variable. (Random forest is an ensemble method that constructs a large number of decision trees and produces a mean or mode prediction depending on whether the dependent variable is numerical or categorical. We used the programming language R’s randomForest package with ntree=5,000 and kept the default value of the mtry hyperparameter, that is, the number of drivers divided by 3, or in our case mtry=10.) We rank our drivers from highest to lowest according to random forest’s metric of variable importance, %IncMSE (see Figure 3). %IncMSE is defined as the percentage increase in mean squared error after a driver is randomly permuted. It indicates the decrease in accuracy associated with leaving a certain driver out of the model.
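The permutation step behind this metric can be sketched in a few lines. This is an illustration of the definition given above, not a reproduction of the randomForest package's internal calculation. The synthetic data, the function names, and the fixed linear `predict` function standing in for a fitted forest are all assumptions.

```python
import random


def mse(y_true, y_pred):
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def pct_increase_in_mse(predict, rows, y, column, rng):
    """Percentage increase in MSE after randomly permuting one driver column."""
    baseline = mse(y, [predict(row) for row in rows])
    shuffled = [row[column] for row in rows]
    rng.shuffle(shuffled)
    permuted_rows = [
        row[:column] + [value] + row[column + 1:]
        for row, value in zip(rows, shuffled)
    ]
    permuted = mse(y, [predict(row) for row in permuted_rows])
    return 100.0 * (permuted - baseline) / baseline


rng = random.Random(0)
# Synthetic data: the outcome depends on driver 0 only; driver 1 is pure noise.
rows = [[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(500)]
y = [2.0 * row[0] + rng.gauss(0, 0.5) for row in rows]


def predict(row):
    # Hypothetical fitted model standing in for a random forest.
    return 2.0 * row[0]


inc_driver_0 = pct_increase_in_mse(predict, rows, y, 0, rng)
inc_driver_1 = pct_increase_in_mse(predict, rows, y, 1, rng)
print(inc_driver_0, inc_driver_1)
```

Permuting a driver the model relies on raises the error sharply, while permuting a driver the model ignores leaves the error unchanged.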

      Figure 3: Random forest’s %IncMSE produces this ranking of potential drivers

      As discussed earlier, psychometric data typically features high collinearity between some of the drivers. When choosing a method for analyzing variable importance, we need to make sure it provides robust results even when drivers are highly collinear. An important driver should, in theory, still be important even if it is collinear with others. To test whether random forest produces a reliable ranking of drivers even under a high degree of collinearity, we created a duplicate of the driver ranked second, “quality,” and included this duplicate variable in the model. (A duplicate variable is a copy of the original variable and is the extreme case of collinearity: the correlation coefficient between the variable and the duplicate is 1.)

      We would expect that duplicating a driver would have no effect on its ranking. However, in this experiment, neither “quality” nor its copy ranks second any longer (see Figure 4). Just like the MLR coefficients, the drivers in random forest were penalized for high collinearity with other drivers. (One can minimize the effect of collinearity in random forest by setting mtry=1. Such hyperparameter settings will make random forest consider only one independent variable at each split, making the decision trees less similar to each other. When using random forest for predictive purposes, though, this technique might decrease performance of the model.)

      Figure 4: After duplicating “quality,” which originally ranked second, here is the new ranking of drivers by %IncMSE

      This result makes sense once we recall how random forest defines variable importance: It indicates how model performance would suffer if we left out a particular driver—an illustration of marginal variable importance. What we want to understand, though, is how strongly each driver relates to our outcome variable independent of the effect of other drivers—the phenomenon of overall variable importance.
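The difference between the two notions of importance can be illustrated with a small duplicate-driver experiment on synthetic data. As a stand-in for marginal importance, the sketch uses the drop in R² of a linear fit when one driver is left out; this is a swapped-in technique, not random forest's %IncMSE. The data, the driver name `other_driver`, and all helper functions are assumptions.

```python
import math
import random


def center(values):
    mean = sum(values) / len(values)
    return [v - mean for v in values]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def pearson(x, y):
    xc, yc = center(x), center(y)
    return dot(xc, yc) / math.sqrt(dot(xc, xc) * dot(yc, yc))


def r_squared(drivers, y, tol=1e-9):
    """R^2 of a least-squares fit with intercept, via Gram-Schmidt."""
    yc = center(y)
    basis, explained = [], 0.0
    for driver in drivers:
        residual = center(driver)
        scale = math.sqrt(dot(residual, residual))
        for q in basis:
            weight = dot(residual, q)
            residual = [r - weight * qi for r, qi in zip(residual, q)]
        norm = math.sqrt(dot(residual, residual))
        if norm <= tol * scale:
            continue  # adds no new information, e.g. an exact duplicate
        q = [r / norm for r in residual]
        basis.append(q)
        explained += dot(q, yc) ** 2
    return explained / dot(yc, yc)


def marginal_importance(drivers, y, name):
    """Drop in R^2 when the named driver is left out of the model."""
    full = r_squared(list(drivers.values()), y)
    reduced = r_squared([v for k, v in drivers.items() if k != name], y)
    return full - reduced


rng = random.Random(1)
n = 400
quality = [rng.gauss(0, 1) for _ in range(n)]
other_driver = [rng.gauss(0, 1) for _ in range(n)]
y = [q + 0.5 * o + rng.gauss(0, 1) for q, o in zip(quality, other_driver)]

drivers = {"quality": quality, "other_driver": other_driver}
with_copy = dict(drivers, quality_copy=list(quality))

marginal_before = marginal_importance(drivers, y, "quality")
marginal_after = marginal_importance(with_copy, y, "quality")
overall_before = pearson(drivers["quality"], y)
overall_after = pearson(with_copy["quality"], y)
print(marginal_before, marginal_after)
print(overall_before, overall_after)
```

Once the copy is present, leaving out "quality" costs the model nothing, so its marginal importance collapses toward zero, while its correlation with the outcome is unchanged.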

      Other methods commonly employed for driver analysis include Shapley values, Johnson’s relative weights, and partial correlations. Similar to MLR coefficients and random forest variable importance measures, scores provided by these methods are affected by the collinearity of drivers. Such collinear drivers often receive lower scores than drivers of similar predictive strength that don’t correlate with other drivers.

      To truly understand overall variable importance, we need to explore methods that assess the strength of the relationship between a driver and the outcome variable independently of other potential drivers. The simplest, most common of such methods is Pearson correlation.

      First we rank potential drivers by their Pearson correlation with the likelihood to recommend (see Figure 5). Note that Pearson correlations do not give reliable estimates of the strength of relationships between variables if the variables are non-normally distributed, the data contains outliers, or the associations between variables are non-linear. In such cases, one should use other methods, such as Spearman rank correlation, which is more robust in the presence of outliers.
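The sensitivity of Pearson correlation to outliers, and the relative robustness of Spearman rank correlation, can be seen in a short sketch. The ratings below are invented for illustration, with the final outcome value of 100 standing in for a hypothetical data-entry error; the helper functions are written from the standard definitions rather than taken from any particular library.

```python
import math


def pearson(x, y):
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman(x, y):
    # Spearman rank correlation is Pearson correlation computed on ranks.
    return pearson(ranks(x), ranks(y))


# Hypothetical data: a monotone relationship with one extreme outcome value.
driver = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
outcome = [1, 2, 3, 4, 5, 6, 7, 8, 9, 100]

pearson_r = pearson(driver, outcome)
spearman_r = spearman(driver, outcome)
print(pearson_r, spearman_r)
```

Because the outlier preserves the ordering of respondents, the rank-based measure still reports a perfect monotone association, while the Pearson coefficient is pulled well below 1.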

      Figure 5: The Pearson correlation produces a different ranking of drivers

      Many companies conclude their driver analysis by shortlisting the top 5 or 10 drivers. We recommend taking an additional step, namely analyzing whether top drivers are interrelated and may be addressed simultaneously. We will explore this topic in an upcoming article.


      Elements of Value® is a registered trademark of Bain & Company, Inc.

      Authors
      • Eleonora Nazander, Expert Senior Manager, Data Science, Denver
      • Ilker Carikcioglu, Expert Associate Partner, Boston
      First published in November 2022

      Net Promoter®, NPS®, NPS Prism®, and the NPS-related emoticons are registered trademarks of Bain & Company, Inc., NICE Systems, Inc., and Fred Reichheld. Net Promoter Score℠ and Net Promoter System℠ are service marks of Bain & Company, Inc., NICE Systems, Inc., and Fred Reichheld.

      © 1996-2026 Bain & Company, Inc.
