
Model Overview

This documentation page explains the Model Overview in Graphite Note, including key performance metrics, health checks, and suggestions to improve dataset quality.

The Model Overview page in Graphite Note helps you assess the quality and reliability of your trained model before using it for predictions. It appears under the Performance tab after model training and provides valuable diagnostics, including performance metrics, dataset analysis, and training behavior insights. The detailed Model Health Check section helps you judge how reliable and stable the model is, and whether it is ready to be used for prediction or needs improvement.

The Overview tab in Graphite Note is divided into three distinct sections, each designed to help you understand a different aspect of your machine learning model's performance, quality, and readiness for real-world application: Model Overview, Model Health Check, and Potential Dataset Improvements.

Depending on the model type (classification, regression, or time series forecast), different metrics and diagnostics will be shown.


Model Overview

This section summarizes all technical elements of your model setup. It includes:

  • Dataset Characteristics and Columns

    A list of included columns, their data quality (e.g., 0% nulls), and categorical/numerical split.

  • Target Column Analysis

    A breakdown of the target variable — its class distribution, type (binary, multiclass, continuous), and any known imbalances.

  • Model Run Parameters

    Key configuration settings: which models were tested, how imbalance was handled, percentage split between training/testing, outlier removal, and multicollinearity treatment.

  • Data Preprocessing

    A summary of how your data was cleaned and transformed — including imputation, normalization (e.g., z-score), encoding, and feature engineering.

  • Model Comparison & Selection

    An overview of tested models (e.g., Logistic Regression, Random Forest, LightGBM) and which model performed best based on your primary metric (e.g., F1 score).

  • Performance Metrics

    Full display of KPIs including Accuracy, F1, AUC, Precision, Recall (for classification), or R², MAE, MAPE, RMSE (for regression and time series).

  • Train vs. Test Consistency

Indicates how the model performed on unseen test data compared to training. Small performance drops are acceptable; large gaps should be investigated (see the sketch after this list).

  • Next Steps

    Instructions for what you can do next — such as using the Predict tab to forecast on new data, or exporting results via the Graphite Note API.
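
To see what the train/test consistency check looks like outside Graphite Note, here is a minimal scikit-learn sketch on a synthetic dataset. Graphite Note performs the equivalent comparison automatically during training, so the model, data, and threshold below are purely illustrative:

    from sklearn.datasets import make_classification
    from sklearn.model_selection import train_test_split
    from sklearn.linear_model import LogisticRegression
    from sklearn.metrics import f1_score

    # Synthetic stand-in for your dataset; X, y and the model are illustrative.
    X, y = make_classification(n_samples=1000, n_features=10, random_state=42)
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, random_state=42)

    model = LogisticRegression(max_iter=1000).fit(X_train, y_train)
    train_f1 = f1_score(y_train, model.predict(X_train))
    test_f1 = f1_score(y_test, model.predict(X_test))

    # A small train-to-test drop is acceptable; a large gap suggests overfitting.
    print(f"train F1={train_f1:.3f}  test F1={test_f1:.3f}  gap={train_f1 - test_f1:.3f}")

A gap of a few points is normal; anything much larger is worth investigating for overfitting.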


Performance KPIs

The performance KPIs shown at the top of the Overview screen depend on the model type.

For Binary & Multiclass Classification Models, you'll see the following (computed in the sketch after this list):

  • F1 Score: Balance between precision and recall; useful for imbalanced classes.

  • Accuracy: Overall percentage of correct predictions.

  • AUC (Area Under Curve): Measures how well the model distinguishes between classes.

  • Precision: How many predicted positives were actually correct.

  • Recall: How many actual positives were correctly predicted.
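
As a reference point, here is a minimal scikit-learn sketch (not Graphite Note's internal implementation) that computes the same five KPIs on a synthetic, imbalanced binary dataset:

    from sklearn.datasets import make_classification
    from sklearn.model_selection import train_test_split
    from sklearn.linear_model import LogisticRegression
    from sklearn.metrics import (accuracy_score, f1_score, precision_score,
                                 recall_score, roc_auc_score)

    # Synthetic, imbalanced binary dataset standing in for your own data.
    X, y = make_classification(n_samples=1000, n_features=8,
                               weights=[0.8], random_state=0)
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, random_state=0)

    model = LogisticRegression(max_iter=1000).fit(X_train, y_train)
    pred = model.predict(X_test)
    proba = model.predict_proba(X_test)[:, 1]   # positive-class probabilities

    print("Accuracy :", accuracy_score(y_test, pred))
    print("F1 Score :", f1_score(y_test, pred))
    print("AUC      :", roc_auc_score(y_test, proba))   # needs probabilities, not labels
    print("Precision:", precision_score(y_test, pred))
    print("Recall   :", recall_score(y_test, pred))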

For Regression Models, you'll see the following (again with a short sketch after the list):

  • R-Squared: Measures how much variance in the target variable is explained by the model.

  • MAPE (Mean Absolute Percentage Error): Shows prediction error as a percentage of actual values.

  • MAE (Mean Absolute Error): Average of absolute differences between predicted and actual values.

  • RMSE (Root Mean Squared Error): Penalizes larger errors more than MAE.

  • MSE (Mean Squared Error): Average of squared differences between predicted and actual values; RMSE is its square root.
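
The same idea for regression, again as an illustrative scikit-learn sketch rather than Graphite Note's own code:

    import numpy as np
    from sklearn.datasets import make_regression
    from sklearn.model_selection import train_test_split
    from sklearn.linear_model import LinearRegression
    from sklearn.metrics import (mean_absolute_error,
                                 mean_absolute_percentage_error,
                                 mean_squared_error, r2_score)

    # Synthetic regression data; the target is shifted away from zero
    # so that percentage errors (MAPE) are well defined.
    X, y = make_regression(n_samples=500, n_features=5, noise=10.0, random_state=0)
    y = y + 200
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, random_state=0)

    pred = LinearRegression().fit(X_train, y_train).predict(X_test)

    mse = mean_squared_error(y_test, pred)
    print("R-Squared:", r2_score(y_test, pred))
    print("MAPE     :", mean_absolute_percentage_error(y_test, pred))
    print("MAE      :", mean_absolute_error(y_test, pred))
    print("RMSE     :", np.sqrt(mse))   # RMSE is the square root of MSE
    print("MSE      :", mse)

Note that RMSE is simply the square root of MSE, which is why the two always move together.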

For Time Series Forecasting Models, you’ll see the same regression-style KPIs but without explanation popovers:

  • R-Squared

  • MAPE

  • MAE

  • RMSE

These values give you an immediate sense of how accurate your forecast is. Since time series involves predicting over future dates, low MAPE and RMSE values typically indicate strong forecasting performance.
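
For example, the arithmetic behind MAPE and RMSE on a hypothetical three-month forecast:

    import numpy as np

    actual   = np.array([100.0, 120.0, 110.0])   # hypothetical monthly actuals
    forecast = np.array([ 95.0, 126.0, 108.0])   # model's forecast for the same months

    mape = np.mean(np.abs((actual - forecast) / actual)) * 100
    rmse = np.sqrt(np.mean((actual - forecast) ** 2))
    print(f"MAPE = {mape:.1f}%  RMSE = {rmse:.2f}")   # MAPE = 3.9%  RMSE = 4.65

A MAPE around 4%, as in this toy example, would indicate a forecast that tracks actual values closely.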


Model Health Check

This section provides a deeper explanation of the internal diagnostics automatically generated by Graphite Note during model training. These insights include the following checks (a sketch of equivalent manual checks follows the list):

  • Target Column Imbalance Graphite Note analyzes the distribution of classes in your target variable to identify potential imbalance issues. For classification models, it shows the number and percentage of each class. If the dataset is heavily skewed toward one class, the model may perform well overall (especially in accuracy) but fail to capture the minority class — an issue especially relevant for binary and multiclass models. For regression and time series, this check is not applicable.

  • Dataset Volume We automatically evaluate the ratio between the number of rows and the number of features (columns). A healthy rows-to-columns ratio (e.g., 352:1) typically provides better generalization power. A low ratio might indicate overfitting risk or the need for more data.

  • Performance Stability This compares the model’s performance between the training and testing datasets. Large gaps between training and test scores may point to overfitting. A model that performs well on both is likely to generalize better to unseen data. Metrics such as F1, AUC, and Accuracy (for classification) or R² and MAPE (for regression and time series) are considered here.

  • Data Leakage Risk If the model achieves perfect scores across all metrics (100% accuracy, precision, recall), it may indicate data leakage — meaning the model might have access to target-related information it shouldn’t. This results in artificially inflated performance. Feature importance is also analyzed to detect irrelevant or suspiciously strong signals.
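
If you want to run similar sanity checks on your own data before uploading, here is a rough pandas sketch; every column name, score, and threshold is illustrative rather than Graphite Note's actual rule set:

    import numpy as np
    import pandas as pd

    # Hypothetical dataset standing in for your own table.
    rng = np.random.default_rng(0)
    df = pd.DataFrame({
        "churned": rng.choice([0, 1], size=1000, p=[0.9, 0.1]),   # skewed target
        "tenure_months": rng.integers(1, 60, size=1000),
        "monthly_spend": rng.normal(50, 15, size=1000),
    })

    # 1. Target column imbalance: share of each class.
    print(df["churned"].value_counts(normalize=True))

    # 2. Dataset volume: rows-to-columns ratio (higher generally generalizes better).
    rows, cols = df.shape
    print(f"rows-to-columns ratio: {rows // cols}:1")

    # 3. Performance stability: flag a large train/test gap (scores from your model run).
    train_f1, test_f1 = 0.91, 0.84   # hypothetical scores
    if train_f1 - test_f1 > 0.10:
        print("Possible overfitting: large train/test gap")

    # 4. Data leakage risk: near-perfect scores usually mean a feature leaks the target.
    if test_f1 >= 0.999:
        print("Suspiciously perfect performance: check features for leakage")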


Potential Dataset Improvements

Based on model diagnostics, Graphite Note suggests ways to improve your model through engineered or additional features (a hypothetical feature-engineering sketch follows this section). These include:

  • Engagement scores

  • Payment behavior

  • Usage trends

  • Support interaction history

  • Customer demographics or lifetime value proxies

These tips help guide users toward actionable ways to increase model value and robustness.
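
As an illustration of what such engineered features might look like, here is a short pandas sketch; every column name and weight is hypothetical and should be adapted to your own data:

    import pandas as pd

    # Hypothetical customer table; all columns here are illustrative.
    customers = pd.DataFrame({
        "customer_id":    [1, 2, 3],
        "logins_30d":     [25, 3, 12],
        "tickets_opened": [0, 4, 1],
        "late_payments":  [0, 2, 0],
        "monthly_spend":  [80.0, 20.0, 45.0],
    })

    # Engagement score: a simple weighted blend of usage signals.
    customers["engagement_score"] = (customers["logins_30d"] * 0.7
                                     - customers["tickets_opened"] * 2.0)

    # Payment behavior flag and a crude lifetime-value proxy.
    customers["risky_payer"] = customers["late_payments"] > 0
    customers["ltv_proxy"] = customers["monthly_spend"] * 12

    print(customers)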

[Screenshots: Model Overview on Regression; Performance KPIs on Binary Classification, Regression, and Timeseries models]