Key Drivers


Last updated 2 days ago

Introduction

In predictive modeling, key drivers (or influencers) are pivotal in discerning which features within a dataset most significantly impact the target variable. These influencers provide insights into the relative importance of each variable, enabling data scientists and analysts to understand and predict outcomes more accurately.

By highlighting the strongest predictors, key influencers inform the prioritization of features for model optimization, ensuring that models are precise and interpretable in real-world scenarios. This foundational understanding is crucial for refining models and aligning them closely with the underlying patterns and trends present in the data.


Reading Key Drivers

When examining the visualization of key influencers in Graphite Note Models, you'll find features arranged by their influence on the target variable, ordered from most to least important starting at the left.

This ranking allows for a quick assessment of which factors are pivotal in the model's predictions.

By observing the length and direction of the bars associated with each feature, one can gauge the strength of influence they have on the target outcome.


Understanding Key Driver Influence

The image displays a visual breakdown of how different tenure values (the length of time a customer has stayed with a service) influence the likelihood of customer churn, specifically the outcome “Churn = Yes”.

The chart divides tenure into several ranges (or bins) and shows how each range increases or decreases the likelihood of churn compared to the average baseline. The churn likelihood is expressed as a multiplier:

  • Customers with short tenure (1.93 to 7.58 months) are 2.18 times more likely to churn, the highest risk group in this analysis.

  • Those in the 7.58 to 13.17 range also show an elevated risk, 1.46x more likely to churn.

  • Churn likelihood continues to decrease with tenure: between 13.17 and 18.75 months, customers are still 1.27x more likely to churn than average.

  • In contrast, customers with tenure between 18.75 and 24.33 months are 1.04x less likely to churn than average. The decrease is marginal, but this is the first range that falls below the baseline.

  • This trend improves further for tenure ranges of 24.33 to 29.92 and 29.92 to 35.5 months, both showing a 1.18x decrease in churn risk.

Key takeaway: Longer-tenured customers churn less. Churn risk is above average for roughly the first 18 months of a customer’s lifecycle, peaks in the shortest tenure range, and falls below average afterward. This suggests that businesses should focus retention efforts early in the customer journey, especially within the first year, where churn risk is highest.
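To make the multipliers above more concrete, the short Python sketch below converts them into absolute churn rates. It is only an illustration: the 25% baseline is a hypothetical figure (the chart does not state the average churn rate), the `implied_rate` helper is invented for this example, and it assumes that "N x less likely" means the baseline divided by N.

```python
def implied_rate(baseline_rate, multiplier, direction):
    """Convert an 'N x more/less likely' statement into an absolute rate.

    Assumption: 'more' multiplies the baseline and 'less' divides it.
    """
    if direction == "more":
        return baseline_rate * multiplier
    if direction == "less":
        return baseline_rate / multiplier
    raise ValueError("direction must be 'more' or 'less'")


BASELINE = 0.25  # hypothetical average churn rate, not taken from the chart

short_tenure = implied_rate(BASELINE, 2.18, "more")  # 1.93 to 7.58 months
long_tenure = implied_rate(BASELINE, 1.18, "less")   # 24.33 to 35.5 months

print(f"Short tenure: {short_tenure:.1%}, long tenure: {long_tenure:.1%}")
```

Under these assumptions, the shortest-tenure group churns at more than twice the baseline rate, while the longest-tenure groups sit only slightly below it.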


Statistical Methodology Used

Graphite Note uses a transparent approach to show how each feature (or column) in your dataset influences the prediction outcome, such as whether a customer is likely to churn. First, it calculates feature importance by measuring how much the model relies on each column to make accurate predictions.

This is done using a method called permutation importance: the system shuffles the values of one feature and observes how much the model’s performance drops. The bigger the drop, the more important that feature is.
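The shuffle-and-measure idea can be sketched in a few lines of Python. This is a minimal illustration, not Graphite Note's implementation: the function names, the toy model, and the data are all hypothetical, and accuracy is assumed as the performance metric.

```python
import random


def accuracy(model, rows, labels):
    """Fraction of rows where the model's prediction matches the label."""
    hits = sum(1 for row, y in zip(rows, labels) if model(row) == y)
    return hits / len(rows)


def permutation_importance(model, rows, labels, feature, n_repeats=10, seed=0):
    """Average drop in accuracy after shuffling one feature's values."""
    rng = random.Random(seed)
    baseline = accuracy(model, rows, labels)
    drops = []
    for _ in range(n_repeats):
        shuffled = [row[feature] for row in rows]
        rng.shuffle(shuffled)
        # Rebuild the rows with only the chosen feature permuted.
        permuted = [{**row, feature: v} for row, v in zip(rows, shuffled)]
        drops.append(baseline - accuracy(model, permuted, labels))
    return sum(drops) / n_repeats


# Hypothetical toy data: churn depends only on tenure, never on "noise".
rows = [{"tenure": t, "noise": (t * 7) % 5} for t in range(1, 41)]
labels = [1 if row["tenure"] < 12 else 0 for row in rows]


def toy_model(row):
    """Stand-in for a trained classifier; it looks at tenure only."""
    return 1 if row["tenure"] < 12 else 0


tenure_importance = permutation_importance(toy_model, rows, labels, "tenure")
noise_importance = permutation_importance(toy_model, rows, labels, "noise")
print(f"tenure: {tenure_importance:.3f}, noise: {noise_importance:.3f}")
```

Because the toy model ignores the "noise" column, shuffling it leaves accuracy unchanged and its importance is zero, while shuffling "tenure" degrades the predictions and produces a positive score.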

For numeric columns such as “tenure,” Graphite Note goes one step further. It splits the values into ranges (called bins) and measures how likely the predicted outcome (e.g., Churn = Yes) is within each range, then compares each bin's churn rate with the average churn rate across the full dataset. If the churn rate in a bin is higher than average, the chart reports that the likelihood “increases” (e.g., by 2.18x); if it is lower, it reports that the likelihood “decreases” (e.g., by 1.18x).
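The binning comparison can be sketched as follows. Several details here are assumptions rather than documented behavior: the bins are equal-width, an "increase" is computed as bin rate divided by overall rate, a "decrease" as overall rate divided by bin rate, and the `bin_multipliers` function and toy data are hypothetical.

```python
def bin_multipliers(values, outcomes, n_bins=4):
    """Compare each bin's outcome rate with the overall rate.

    Returns a list of (low, high, direction, multiplier) tuples.
    """
    overall = sum(outcomes) / len(outcomes)
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("cannot bin a constant column")
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    positives = [0] * n_bins
    for v, y in zip(values, outcomes):
        idx = min(int((v - lo) / width), n_bins - 1)  # keep the max value in the last bin
        counts[idx] += 1
        positives[idx] += y
    results = []
    for i in range(n_bins):
        if counts[i] == 0 or positives[i] == 0:
            continue  # empty bin, or rate of zero (the ratio would be undefined)
        rate = positives[i] / counts[i]
        if rate >= overall:
            direction, mult = "increases", rate / overall
        else:
            direction, mult = "decreases", overall / rate
        results.append((lo + i * width, lo + (i + 1) * width, direction, round(mult, 2)))
    return results


# Hypothetical toy data: 40 customers, 10 per bin, with 8, 5, 2, and 1 churners.
tenure = list(range(40))
churned_per_bin = [8, 5, 2, 1]
churn = [1 if (t % 10) < churned_per_bin[t // 10] else 0 for t in tenure]

multipliers = bin_multipliers(tenure, churn)
for low, high, direction, mult in multipliers:
    print(f"tenure {low:.2f} to {high:.2f}: likelihood {direction} by {mult}x")
```

With an overall churn rate of 40% in this toy data, the first two bins come out above average and the last two below it, mirroring the shape of the tenure chart described earlier.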

This lets you clearly see which value ranges increase or reduce the chances of a specific outcome. To keep insights relevant, the system automatically removes categories or ranges that don’t have enough data to be meaningful.
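A filter of this kind might look like the sketch below. The threshold value and the `well_supported` helper are hypothetical; the actual cutoff Graphite Note applies is not stated here.

```python
from collections import Counter

MIN_ROWS = 3  # hypothetical threshold; the real cutoff is not documented here


def well_supported(values, min_rows=MIN_ROWS):
    """Return the categories (or bin labels) that appear at least min_rows times."""
    counts = Counter(values)
    return {label for label, n in counts.items() if n >= min_rows}


# Toy column: "lifetime" appears only once, so it is dropped from the analysis.
contract = ["monthly"] * 5 + ["yearly"] * 4 + ["lifetime"] * 1
kept = well_supported(contract)
print(sorted(kept))
```

Dropping sparsely populated groups avoids reporting a large multiplier that is driven by only a handful of rows.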

This makes the analysis not only accurate but also easy to act on, giving you clarity on what drives results and where to focus your strategy.

Key drivers in Binary Classification
Visual representation of most important Key drivers