Improve your ML Models


Last updated 2 days ago


Even when a platform handles most of the heavy lifting, a little preparation and post-training care can unlock noticeably better predictions. Improving your machine learning models is an iterative process. With better data, clear structure, and a few practical tweaks, you’ll get results that are not only accurate but also actionable. Graphite Note is here to guide you every step of the way.

Start Simple: Use Default Settings

If you’re not familiar with advanced machine learning concepts, it’s best to leave the settings unchanged. The defaults in Graphite Note are optimized to give solid performance in most scenarios. You don’t need to tweak anything unless you’re confident about what each parameter does.


Always Review the Model Overview & Health Check

Once your model is trained, always review the Model Overview screen. This includes the Model Health Check, which will highlight potential issues like class imbalance or overfitting. The “Potential Dataset Improvements” section will also suggest ways to improve the dataset used for training. Don’t skip this part—it’s your shortcut to better model quality.


Low F1 score or R-Squared? Don’t Panic.

If your classification model shows a low F1 score or your regression model has a weak R-Squared value, that doesn’t always mean it’s a “bad” model. It could simply mean the model doesn’t have enough quality data to learn from, or the problem you’re solving is more complex and needs richer input.

Instead of judging only by the score, explore what your model is telling you:

  • Are there enough examples of each target value (Yes/No, or different classes)?

  • Do features (columns) have enough variation and filled values?

  • Are there hidden patterns that could become visible with more data?

Low score DOES NOT MEAN bad model.

A classifier with an F1 of 0.45 in a domain where random chance is 0.05 may be excellent; an R² of 0.30 on financial time series can be perfectly usable for ranking decisions. Context matters!
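To make “context matters” concrete, here is a minimal pure-Python sketch (not part of Graphite Note) comparing a model’s F1 score against a random-chance baseline on an imbalanced problem; the counts are made-up illustration values:

```python
def f1_score(tp, fp, fn):
    """F1 = harmonic mean of precision and recall."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Hypothetical imbalanced problem: 50 positives among 1,000 rows (5% base rate).
# The model finds 30 true positives, with 25 false alarms and 20 misses.
model_f1 = f1_score(tp=30, fp=25, fn=20)      # ~0.57

# A random guesser firing at the 5% base rate catches ~2 positives
# and raises ~48 false alarms over the same data.
baseline_f1 = f1_score(tp=2, fp=48, fn=48)    # 0.04

print(round(model_f1, 2), round(baseline_f1, 2))  # 0.57 0.04
```

A score of 0.57 looks modest in isolation, but it is an order of magnitude above what chance achieves on the same data.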


Tips to Improve Your Model

Here are practical, beginner-friendly tips to level up your ML models in Graphite Note:

Improve Data Quality and Consistency

  • Fill missing values if possible, or clean them using simple methods (like replacing with average or most frequent value).

  • Avoid empty or half-filled columns—features need data to be useful.

  • Ensure consistent formatting (e.g., “Yes” and “yes” should be the same).
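If you prepare data outside Graphite Note before uploading, simple imputation and case normalization can be sketched in pure Python like this (the `age` and `churned` column names are hypothetical):

```python
from statistics import mean
from collections import Counter

def impute(rows, numeric_col, categorical_col):
    """Fill numeric gaps with the column mean, categorical gaps with the
    most frequent value, and normalize inconsistent text case."""
    nums = [r[numeric_col] for r in rows if r[numeric_col] is not None]
    cats = [r[categorical_col].strip().lower()
            for r in rows if r[categorical_col] is not None]
    col_mean = mean(nums)
    col_mode = Counter(cats).most_common(1)[0][0]
    for r in rows:
        if r[numeric_col] is None:
            r[numeric_col] = col_mean
        r[categorical_col] = (r[categorical_col].strip().lower()
                              if r[categorical_col] is not None else col_mode)
    return rows

rows = [
    {"age": 30, "churned": "Yes"},
    {"age": None, "churned": "yes"},  # missing number, inconsistent case
    {"age": 40, "churned": None},     # missing category
]
impute(rows, "age", "churned")        # rows[1]["age"] becomes 35
```

The same cleanup can of course be done in a spreadsheet; the point is that every gap you fill and every “Yes”/“yes” you unify gives the model one more usable signal.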

Add More Data

  • More data = more signal. The model performs better when it sees more examples.

  • For time series models, more historical data is crucial. One or two months of data won’t give strong predictions. Ideally, use several years of data if available.

  • If you’re predicting daily or weekly events (e.g., sales), having frequent and consistent data points is key. If a product only appears once every 3 months, the model won’t have enough information to learn from it.

Enrich Your Dataset

  • More features help the model “see” more patterns.

  • Add meaningful attributes like customer type, region, channel, or weather—anything that might influence the outcome.

  • These additional features become Key Drivers in your analysis and help unlock deeper insights.

Use Derived Features for Time Context

  • If you want to simulate time series prediction using regression, create new columns from your date: extract Day, Month, Year, Weekday, IsWeekend, etc.

  • These features help regression models understand temporal patterns without needing a full time series setup.
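As an illustration, a date column can be expanded into such derived features with a few lines of standard-library Python (the column names are hypothetical):

```python
from datetime import date

def date_features(d: date) -> dict:
    """Derive columns a regression model can use in place of the raw date."""
    return {
        "Day": d.day,
        "Month": d.month,
        "Year": d.year,
        "Weekday": d.isoweekday(),        # 1 = Monday … 7 = Sunday
        "IsWeekend": d.isoweekday() >= 6,
    }

print(date_features(date(2024, 3, 16)))   # a Saturday, so IsWeekend is True
```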

Ensure Good Target Balance

  • For classification models, ensure your target column isn’t too skewed. A model trained with 95% “No” and 5% “Yes” will struggle to predict the “Yes” cases.

  • If imbalance exists, Graphite Note shows it in the Health Check—use techniques like resampling or gathering more examples of the underrepresented class.
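If you do resample before uploading, a naive upsampling pass can be sketched in pure Python as follows (the `y` target column is hypothetical; collecting genuine minority-class examples is always preferable to duplicating rows):

```python
import random
from collections import Counter

def upsample_minority(rows, target_key, seed=0):
    """Duplicate minority-class rows until every class matches the
    majority count. A fallback when more real data isn't available."""
    counts = Counter(r[target_key] for r in rows)
    majority = max(counts.values())
    rng = random.Random(seed)
    balanced = list(rows)
    for label, n in counts.items():
        pool = [r for r in rows if r[target_key] == label]
        balanced.extend(rng.choice(pool) for _ in range(majority - n))
    return balanced

rows = [{"y": "No"}] * 95 + [{"y": "Yes"}] * 5   # the 95/5 skew from above
balanced = upsample_minority(rows, "y")
print(Counter(r["y"] for r in balanced))         # both classes now at 95
```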

Avoid Redundant or Highly-Correlated Features

  • When building a regression model, skip predictor columns that are direct arithmetic combinations of each other.

  • For instance, if you’re forecasting Revenue and you already include Price and UnitsSold as inputs, do not also feed Revenue (or Price × UnitsSold) back in as a feature—the model would essentially be learning from its own answer, leading to multicollinearity and unreliable coefficients. Keep either the individual components or the combined metric, but never both.
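A quick redundancy check before training: compute pairwise correlations between candidate features and drop one of any near-perfectly correlated pair. A pure-Python sketch with made-up numbers:

```python
from math import sqrt

def pearson(xs, ys):
    """Plain Pearson correlation; values near ±1 flag a redundant pair."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

price = [10.0, 12.0, 9.0, 11.0, 13.0]
units = [100, 80, 120, 90, 70]
revenue = [p * u for p, u in zip(price, units)]  # exact arithmetic combination

# Revenue is fully determined by Price and UnitsSold, so including all
# three as predictors duplicates information (multicollinearity).
print(round(pearson(units, revenue), 2))         # near 1 → drop one column
```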

Keep It Relevant

  • Remove features that are irrelevant or only available after the event happens. These can lead to data leakage, giving your model false confidence.

  • Always think: “Would I have this information before making the prediction?”

Name Your Features Clearly

  • Use clear, intuitive names for your columns. This helps when reviewing Key Drivers, insights, and actionable recommendations.

Monitor Model Drift

  • If you’re using live data, monitor how your model performs over time. A model trained six months ago may become outdated if customer behavior or market conditions change.

  • Periodically retrain your model with fresh data.
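A drift check can be as simple as comparing live accuracy against the accuracy you recorded at training time. An illustrative sketch (the 5-point tolerance is an arbitrary example threshold, not a Graphite Note setting):

```python
def drift_alert(baseline_accuracy, recent_outcomes, tolerance=0.05):
    """Flag retraining when live accuracy falls more than `tolerance`
    below the accuracy measured at training time."""
    recent_accuracy = sum(recent_outcomes) / len(recent_outcomes)
    return recent_accuracy, recent_accuracy < baseline_accuracy - tolerance

# 1 = the prediction matched the real outcome, 0 = it did not.
last_month = [1, 1, 0, 1, 0, 1, 0, 0, 1, 0]   # only 50% correct lately
accuracy, needs_retrain = drift_alert(0.80, last_month)
print(accuracy, needs_retrain)                # 0.5 True
```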


Additional Levers for Improvement

  • Feature selection / dimensionality reduction – remove redundant or noisy variables to cut overfitting.

  • Automated hyper-parameter search – once you’re comfortable, use grid/random/Bayesian search systematically instead of one-off tweaks.

  • Cross-validation – prefer k-fold or rolling-window CV to a single split, especially for small or non-IID data.

  • Regular retraining & monitoring – schedule retrains when fresh data arrives and set up drift alerts on key metrics.

  • Human-in-the-loop review – domain experts can spot impossible predictions long before metrics change.

  • Ensembling – combine diverse algorithms (e.g., gradient-boosted trees + neural nets) for stability.

  • Explainability tools – SHAP values or partial-dependence plots reveal where the model is right and where it is fragile.
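To illustrate the cross-validation lever above: instead of trusting one train/test split, score the model on k folds and average the results. A minimal k-fold index generator in pure Python (real projects would typically use a library implementation, but the mechanics are just this):

```python
def kfold_indices(n_rows, k=5):
    """Yield (train, test) index lists so every row is tested exactly once."""
    fold_sizes = [n_rows // k + (1 if i < n_rows % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = [i for i in range(n_rows) if i < start or i >= start + size]
        yield train, test
        start += size

# Train on `train`, score on `test` for each fold, then average the k scores.
folds = list(kfold_indices(n_rows=10, k=5))
print(len(folds), folds[0][1])   # 5 [0, 1]
```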


Iterate Methodically

  1. Plan a single change (e.g., add holiday indicator).

  2. Retrain and document both the new metric values and qualitative observations.

  3. Compare to the previous run; keep the change only if it consistently helps.

  4. Repeat—small, controlled experiments beat guessing.
