1 of 100

Graphite Note Documentation

Welcome

Welcome to the Graphite Note documentation portal. These guides will show you how to predict, visualize, and analyze your data using machine learning with no code. Graphite Note is a powerful tool designed to democratize the power of data analysis and machine learning, making it accessible to individuals and teams of all skill levels. Whether you're a marketer looking to segment your audience, a sales team predicting lead conversions, or an operations manager forecasting product demand, Graphite Note is your go-to platform.

The platform is built with a user-friendly interface that allows you to connect your data, generate predictive models, and share your results with just a few clicks. It's not just about making predictions; it's about understanding them. With Graphite Note's built-in prescriptive analytics and data storytelling feature, you can transform complex data into meaningful narratives that drive strategic actions. This documentation will guide you through every step of the process, from setting up your account to making your first prediction. So let's dive in and start exploring the power of no-code machine learning with Graphite Note.

For every lexical terms you can check on the machine learning glossary.

Account and Team Setup

Sign up

Welcome to Graphite Note App! To get started, create an account or log in if you already have one. With an active account, you can upload datasets and create machine learning models in just minutes.

Refer to our documentation to learn more about machine learning and how to apply ML models to your data to boost your business, uncover insights, and make predictions.

Profile information

You can access the Profile information setup by clicking on the small user profile icon located in the top-right corner of the interface. This icon typically displays the first letters of your name and surname (e.g., “CP” for Chris Parker). Once clicked, a dropdown menu appears where you can select Profile to view or edit your profile settings.

This page is where users can manage their personal information and customize their profile settings:

User Code: A unique identifier for the user (e.g., a system-generated code like “e056ac433952”).
Name: Displays the user’s name. This can be updated to reflect the preferred name.
AI Generated Content Language: Lets the user select the language for AI-generated content (e.g., “English”). All AI generated content related to different models in Graphite note will be generate in selected language. When the language is changed, you need to rerun the model to generate content in the currently selected language.
Select Avatar: Allows users to choose a personalized avatar color to visually represent their profile.
Email: Shows the registered email address of the user.
Role: Indicates the user’s role in the system (e.g., “Administrator” or “Viewer”).
Password: Offers an option to change the user’s password with a Change Password button.

Account information

You can access the Account information page by clicking on the small wheel icon (⚙️) in the top-right corner of Graphite Note, and then selecting the Account info drop-down item. This page features your account information, including active plan information and plan usege statistics along with information on different fatures included in the plan.

Assistance

A toggle option to enable assistance from the Graphite Note Support Team. By enabling this feature, users grant the support team access to their datasets, models, and notebooks. This facilitates quicker issue resolution and support for content creation. Assistance option is enabled by default.

Current Plan

Displays the subscription plan the account is currently on. Click on the Contact Sales button if you wish to upgrade, downgrade, or discuss your subscription options.

API Token

A secure token is shown (on the account info page it is partially masked for privacy) that users can use to integrate Graphite Note with other applications or systems. It provides a unique, secure key for API access, enabling advanced integration with external systems in a different ways:

Dataset API - enables users to easily populate their datasets by sending data directly to Graphite Note, ensuring seamless data integration.
Prediction API - allows users to request predictions based on attributes they provide, leveraging Graphite Note’s machine learning models to generate accurate business forecasts.
Model Results API - lets users fetch the outputs of their trained models in a structured, paginated format. This is especially useful for viewing or processing large prediction result sets in batches.

Click the eye icon next to the token to view the full API token. More information about API Token usage you can find in the .

Plan usage

Gives a quick overview of your plan limits, usage, and enabled features:

Max Data Rows in Database: Shows the total data rows allowed (e.g., 10M), how many are used (e.g., 2.28M), and how many are left (e.g., 7.72M). A progress bar helps you track usage.
Max Number of Users: Displays the total users allowed (e.g., 10), how many are used (e.g., 1), and available slots (e.g., 9). You can invite new users with the Invite User link.
Available Dataset Plans: Lists supported dataset types like CSV, database integrations, model data, merged datasets, and BigQuery.
Additional Features: Confirms whether key features like AI insights, API access, white-labeled notebooks, and advanced model settings are enabled under your plan.

Roles

In Graphite Note, every user within a team is assigned a role that determines their access level and permissions. These roles help control what users can view or modify within the platform. Roles can be managed on the Roles Administration Page, accessed by clicking the wheel icon (⚙️) in the top-right corner and selecting Roles.

Role Types

By default in Graphite Note you will have two types of predefined roles that cover the most common access needs:

Administrator: Grants full access to read and modify all entities, including datasets, models, notebooks, users, and system settings. Administrator role cannot be deleted but serve as the foundation for managing basic access levels.
Viewer: Provides read-only access, allowing users to view entities but not edit or modify them.

Administrators can create custom roles tailored to the specific needs of their organization. Custom roles allow for detailed control over permissions. Custom roles are created by clicking New Role button, providing a name and description, and defining permissions (e.g., Read & Modify or No Access) for each module.

With both default and customizable roles, Graphite Note provides a robust system for managing user access and ensuring data security across your team.

Role Assignment

Roles are assigned to users on the. When inviting a new user to the team, administrators select a role for them, ensuring the appropriate level of access. Existing roles can also be updated later to adapt to changing team responsibilities.

Viewing Assigned Role

Users can see their currently assigned role on the page, accessible via the profile dropdown menu in the top-right corner of the platform. This transparency allows users to understand their permissions within the system.

Users

The Users Page in Graphite Note provides administrators with tools to manage team members, assign roles, and update user details. It can be accessed by clicking the wheel icon (⚙️) in the top-right corner and selecting Users.

User List

Displays all team members with details such as user code, name, AI content language, email, assigned role, and activation status. Administrators can manage users directly from this list.

Inviting New Users

New users can be invited by clicking on the Invite user button and entering their email address and assigning a role (e.g., Viewer, Administrator, or a custom role) from the Users Page. Once the invitation is sent, a user profile is automatically created within the system, even before the user accepts the invitation. This allows administrators to manage and edit the user’s details, such as their name, role, or preferences, immediately after the invitation is issued. The profile becomes fully active once the user accepts the invitation via email.

Editing Invited user

Once a user is invited, administrators can edit their details by clicking the gear icon under the Action column on the Users Page. This opens the Edit User Panel, where the following information can be found:

• User Code: A unique identifier for the user (non-editable).

• Name: Modify the user’s display name to ensure accuracy or reflect changes.

• AI Generated Content Language: Select the user’s preferred language for AI-generated content.

• Select Avatar: Customize the user’s profile avatar by choosing from a range of color options for better visual distinction.

• Email: Update the user’s registered email address.

• Role: Change the user’s assigned role (e.g., Viewer, Administrator, or a custom role) to adjust their access permissions.

Changes are saved by clicking the Save button, ensuring the user profile reflects updated details. This flexibility allows for seamless management of team members’ information and roles.

Logs

The Logs Page in Graphite Note lets you review a complete history of model-training runs—showing start / finish times, durations, statuses, and model codes for quick auditing and troubleshooting. Use it to confirm that a model finished successfully, inspect failures, or copy a model-code for API calls. You can open the Logs Page by clicking the wheel icon (⚙️) in the top-right corner and selecting Logs.

For a full breakdown of every column and filter, see the page.

UNDERSTANDING MACHINE LEARNING

What is Graphite Note?

Graphite Note is a no-code machine learning platform that enables data analysts to create predictive models in minutes. By connecting your data, the platform automatically preprocesses it, allowing you to train AI models without coding. It identifies key drivers affecting predictions and provides actionable insights through generative AI, facilitating data-driven decision-making. Graphite Note supports various industries with prebuilt model templates, making complex data analysis accessible and efficient.

Graphite Note Insights Lifecycle

No-code, Automated Machine Learning for Data Analytics Teams

Data to Insights Lifecycle

Dataset: Begin with a dataset containing historical data.
Feature Selection: Identify the most important variables (features) for the model.
Best Algorithm Search: Test different algorithms to find the best fit for your data.
Model Generation: Create a predictive model based on selected features and the best algorithm.
Model Tuning: Fine-tune the model’s parameters to improve accuracy.
Model Deployment: Deploy the final model for real-world usage.
Explore Key Drivers: Analyze the key factors influencing the model’s predictions.
Explore What-If Scenarios: Test different hypothetical situations to see their impact.
Predict Future Outcomes: Use the model to forecast future trends or outcomes.

Introduction to Machine Learning

In this section, we’ll explore the core machine learning concepts that underpin the Graphite Note solution. You’ll learn about the algorithms and techniques used to analyze data, make predictions, and uncover valuable insights. By understanding these foundational principles, you’ll gain a deeper appreciation of how Graphite Note leverages machine learning to deliver powerful analytical capabilities.

Data Analitycs Maturity

From Business Intelligence (BI) to Artificial Intelligence (AI)

Analytics maturity represents an organization’s progression in leveraging data to drive insights and decisions. This journey typically follows four levels:

1. Descriptive Analytics: The foundation of analytics maturity, focused on answering “What happened?” Descriptive analytics relies on reporting and data mining to summarize past events. Most organizations begin here, gaining basic insights by understanding historical data.

2. Diagnostic Analytics: Building on descriptive insights, diagnostic analytics answers “Why did it happen?” by drilling deeper into data patterns and trends. Using techniques such as query drill-downs, diagnostic analytics provides context and explanations, helping organizations understand the causes of past events. Traditional organizations often operate within this descriptive and diagnostic phase.

3. Predictive Analytics: Moving into more advanced analytics, predictive analytics addresses “What will happen?” by utilizing machine learning and AI to forecast future outcomes. Through statistical simulations and data models, predictive analytics enables organizations to anticipate trends, customer behavior, and potential risks. Elevating to this level empowers organizations to make more proactive, data-driven decisions and gain a competitive edge.

4. Prescriptive Analytics: At the highest level of analytics maturity, prescriptive analytics answers “What should I do?” It combines machine learning, AI, and mathematical optimization to recommend actions that lead to desired outcomes. By offering actionable guidance, prescriptive analytics not only predicts future scenarios but also prescribes the best course of action, allowing organizations to optimize decisions and drive strategic growth.

While many organizations remain in the descriptive and diagnostic phases, those aiming to stay competitive and drive innovation must elevate their analytics capabilities. Graphite Note is designed to accelerate this journey, helping organizations seamlessly transition into predictive and prescriptive analytics. By embracing machine learning and AI through Graphite Note, companies can transform their data into a strategic asset, enabling proactive decision-making and unlocking new avenues for operational efficiency and business growth.

Machine Learning concepts

The following sections cover some of the most important machine learning concepts.

Key Drivers

Introduction

In predictive modeling, key drivers (or influencers) are pivotal in discerning which features within a dataset most significantly impact the target variable. These influencers provide insights into the relative importance of each variable, enabling data scientists and analysts to understand and predict outcomes more accurately.

By highlighting the strongest predictors, key influencers inform the prioritization of features for model optimization, ensuring that models are precise and interpretable in real-world scenarios. This foundational understanding is crucial for refining models and aligning them closely with the underlying patterns and trends present in the data.

Reading Key Drivers

When examining the visualization of key influencers in Graphite Note Models, you'll find features arrayed according to their influence on the target variable, organized from most to least important on the left.

This ranking allows for a quick assessment of which factors are pivotal in the model's predictions.

By observing the length and direction of the bars associated with each feature, one can gauge the strength of influence they have on the target outcome.

Understanding Key driver influence

The image displays a visual breakdown of how different tenure values (the length of time a customer has stayed with a service) influence the likelihood of customer churn, specifically the outcome “Churn = Yes”.

The chart divides tenure into several ranges (or bins) and shows how each range increases or decreases the likelihood of churn compared to the average baseline. The churn likelihood is expressed as a multiplier:

Customers with short tenure (1.93 to 7.58 months) are 2.18 times more likely to churn, the highest risk group in this analysis.
Those in the 7.58 to 13.17 range also show an elevated risk, 1.46x more likely to churn.
Churn likelihood continues to decrease with tenure — between 13.17 and 18.75 months, customers are 1.27x more likely to churn than average.
In contrast, customers with tenure between 18.75 and 24.33 months are 1.04x less likely to churn — a marginal decrease, but moving into safer territory.
This trend improves further for tenure ranges of 24.33 to 29.92 and 29.92 to 35.5 months, both showing a 1.18x decrease in churn risk.

Key takeaway: Longer-tenured customers churn less. The risk of churn is significantly higher during the first 18 months of a customer’s lifecycle and begins to drop off afterward. This suggests that businesses should focus retention efforts early in the customer journey — especially within the first year — where churn risk is highest.

Statistical Methodology Used

Graphite Note uses a smart and transparent approach to show how each feature (or column) in your dataset influences the prediction outcome—like whether a customer is likely to churn. First, it calculates feature importance by checking how much the model relies on each column to make accurate predictions.

This is done using a method called permutation importance: the system shuffles the values of one feature and observes how much the model’s performance drops. The bigger the drop, the more important that feature is.

For numeric columns such as “tenure,” Graphite Note then goes one step further. It splits the values into logical ranges (called bins) and looks at how likely the predicted outcome (e.g., Churn = Yes) is for each range. It compares the average churn rate across the full dataset with the churn rate in each of those bins. If the churn rate in a bin is higher than average, it says the likelihood “increases” (e.g., by 2.18x); if it’s lower, it says it “decreases” (e.g., by 1.18x).

This lets you clearly see which value ranges increase or reduce the chances of a specific outcome. To keep insights relevant, the system automatically removes categories or ranges that don’t have enough data to be meaningful.

This makes the analysis not only accurate but also easy to act on—giving you clarity on what drives results and where to focus your strategy.

Confusion Matrix

Purpose

The Confusion Matrix is a powerful diagnostic tool in classification tasks within predictive analytics. It presents a clear and concise layout for evaluating the performance of a classification model by showing the actual versus predicted values in a tabular format. The matrix allows users, regardless of their coding expertise, to assess the accuracy and effectiveness of a predictive model, providing insights into not only the number of correct and incorrect predictions but also the type of errors made.

Components of the Confusion Matrix

A confusion matrix for a binary classification problem consists of four components:

True Positives (TP): The number of instances that were predicted as positive and are actually positive.
False Positives (FP): The number of instances that were predicted as positive but are actually negative.
True Negatives (TN): The number of instances that were predicted as negative and are actually negative.
False Negatives (FN): The number of instances that were predicted as negative but are actually positive.

Applications

In the context of Graphite Note, a no-code predictive analytics platform, the confusion matrix serves several key purposes:

Performance Measurement: It quantifies the performance of a classification model, offering a visual representation of the model's ability to correctly or incorrectly predict categories.
Error Analysis: By breaking down the types of errors (FP and FN), the matrix aids in understanding specific areas where the model may require improvement.
Decision Support: The confusion matrix supports decision-making by highlighting the balance between sensitivity (or recall) and precision, which can be crucial for business outcomes.
Model Tuning: Users can leverage the insights from the confusion matrix to adjust model parameters and thresholds to optimize for certain predictive behaviors.
Communication Tool: It acts as a straightforward communication tool for stakeholders to grasp the results of a classification model without delving into complex statistical jargon.

Interpretation

In the example confusion matrix (Model Performance -> Accuracy Overview)

There are 799 instances where the model correctly predicted the positive class (TP).
There are 15622 instances where the model incorrectly predicted the positive class (FP).
There are 348 instances where the model failed to identify the positive class (FN).
There are 18159 instances where the model correctly identified the negative class (TN).

The high number of FP and FN relative to TP suggests a potential imbalance or a need for model refinement to improve predictive accuracy.

Conclusion

The classification confusion matrix is an integral part of the model evaluation in Graphite Note, enabling users to make informed decisions about the deployment and iteration of their predictive models.

Supervised vs Unsupervised ML

In machine learning, supervised and unsupervised learning are two main types of approaches:

1. Supervised Learning

In supervised learning, the model is trained on a labeled dataset. This means we provide both input data and the corresponding output labels to the model. The goal is for the model to learn the relationship between inputs and outputs so it can predict new, unseen data. Common examples include classification (e.g., email spam detection) and regression (e.g., predicting house prices).

For example, if you have an image dataset labeled with “cat” or “dog,” the model learns to classify new images as either a cat or dog based on this training.

Supervised learning on Diamonds dataset example

In this example, we have a dataset containing information about diamonds. The supervised machine learning approach focuses on predicting a specific target column based on other features in the dataset.

• Target is a Number (Regression): If the target column is numerical (e.g., “Price”), the goal is to predict the diamond’s price based on features like cut, color, and clarity. This is called regression.

• Target is Text (Classification): If the target is categorical (e.g., “Cut” with values like Ideal, Very Good), the goal is to classify diamonds into categories based on their characteristics. This is known as classification.

2. Unsupervised Learning

In unsupervised learning, the model is given only input data without any labeled outputs. The goal is to find patterns or groupings within the data. A common task here is clustering (e.g., grouping customers by purchasing behavior) and dimensionality reduction (e.g., simplifying data visualization).

For example, if you have images without labels, the model could group similar images together (like cats in one group and dogs in another) based on visual similarities.

Unsupervised learning on Diamonds dataset example

In unsupervised learning, there is no target column or labeled output provided. Instead, the model analyzes patterns within the data to group or cluster similar items together.

In this diamond dataset example:

• We don’t specify a target column (like price or cut); instead, the goal is to find natural groupings of data points based on their features.

• Here, clustering is used to identify groups of diamonds with similar Carat Weight and Price characteristics.

• The scatter plot on the right shows how the diamonds are grouped into different clusters (e.g., cluster0, cluster1, etc.), revealing patterns in the data without needing predefined labels.

This approach is useful when you want the model to identify hidden structures or patterns within the data. Unsupervised learning is often used for customer segmentation, anomaly detection, and recommendation systems.

Demo datasets

Demo Datasets

We have prepared 12 distinct demo datasets that you can upload and use in your Graphite Note account. These demo datasets include dummy data from a variety of business scenarios and serve as an excellent starting point for building and running your initial Graphite Note machine learning models. Instructions on how to proceed with demo dataset will be provided in the following pages.

Churn

Create Binary Classification model on Demo Churn dataset.

Get an overview of Customer Churn demo dataset and how it can be used to create your new Graphite Note model in this video:

Or follow instructions below to get step by step guidance on how to use Customer Churn demo dataset:

1.If you want to use Graphite Note demo datasets click "Import DEMO Dataset"

2. Select the dataset you want to use to create a machine learning model. In this case we will select Churn dataset to create binary classification analysis on customer engagement data .

3. Once selected, demo dataset will load into your account. Dataset view will automatically open.

4. Adjust your dataset options on Settings tab. Click Columns tab to view list of available columns with their corresponding data types. Explore dataset details on Summary tab.

5. To create new model in the Graphite Note main menu click on "Models"

6. You will get list of available models. Click on "New Model" to create new one.

7. Select model type from our templates. In our case we will select "Binary Classification" by double clicking on its name.

8. Select dataset you want to use to produce model. We will use "Demo-Churn.csv."

9. Name your new model. We will call it "Binary Classification on Demo-Churn".

10. Write description of the model and select tag. If you want to, you can also create a new tag from pop-up "Tags" window that will appear on the screen.

11. Click "Create" to create your demo model environment.

12. To set up Binary Classification model first you need to define "Target Feature". That is binary column from your dataset that you'd like to make predictions about. In case of Binary Classification on Churn dataset, the target feature will be "Churn" column.

13. Click "Next" to get the list of model features that will be included in scenario. Model relies upon each column (feature) to make accurate predictions. When training model we will calculate which of the features are most important and behave as Key Drivers.

14. To start training model click "Run scenario". This will take a sample of 80% of your data and train several machine learning models.

15. Wait for a few moments and Voilà! Your Binary Classification model is trained. Click on "Performance" tab to get model insights and view Key Drivers.

16. Explore Binary Classification model by clicking on Impact Analysis and Training Results to get more insights on how model is trained.

17. If you want to turn your model into action click on "Predict" tab in the main model menu.

18. You can produce your own "What-If analysis" based on existing training results. You can also import a fresh CSV dataset with data model will use to make predictions on a target column. In our case that is "Churn". Keep in mind, the dataset you are uploading needs to contain same feature columns as your model.

19. Use your model often to predict future behaviour and to learn which key drivers are impacting the outcomes. The more you use and retrain your model, the smarter it becomes!

CO2 Emission

Create Regression model on Demo CO2 Emission dataset

1. If you want to use Graphite Note demo datasets click "Import DEMO Dataset".

2. Select the dataset you want to use to create the machine learning model. In this case, we will select CO2 Car Emissions dataset to create Regression Analysis on car emissions data.

3. Once selected, the demo dataset will load directly to your account. The dataset view will automatically open.

4. Adjust your dataset options on Settings tab. Click the Columns tab to view list of available columns with their corresponding data types. Explore dataset details on Summary tab.

5. To create a new model in the Graphite Note main menu click on "Models"

6. You will get list of available models. Click on "New Model" to create new one.

7. Select model type from our templates. In our case we will select "Regression" by double clicking on its name.

8. Select the dataset you want to use to produce the model. We will use "Demo-CO2-Car-Emissions-Canada.csv"

9. Name your new model. We will call it "Regression on Demo-CO2-Car-Emissions"

10. Write the model description and select tag. If you want you can also create new tag from pop-up "Tags" window that will appear on the screen.

11. Click "Create" to create your demo model environment.

12. To set up the Regression model first, you need to define "Target Feature". That is numeric column from your dataset that you'd like to make predictions about. In case of Regression on car emissions dataset target feature is "CO2 Emissions(g/km)" column.

13. Click "Next" to get the list of model features that will be included into scenario. Model relies upon each column (feature) to make accurate predictions. When training the model, we will calculate which of the features are most important and behave as Key Drivers.

14. To start training the model, click "Run scenario". This will take a sample of 80% of your data and train several machine learning models.

15. Wait for few moments and Voilà! Your Regression model is trained. Click on the "Performance" tab to get model insights and view the Key Drivers.

16. Explore the Regression model by clicking on Impact Analysis and Training Results, to get more insights on how model is trained.

17. If you want to take your model into action, click on the "Predict" tab in the main model menu.

18. You can produce your own What-If analysis based on existing training results. You can also import a fresh CSV dataset that the model will use to make predictions on the target column. In our case that is "CO2 Emissions (g/km)". Keep in mind, the dataset you are uploading needs to contain same feature columns as your model.

19. Use your model often to predict future behaviour and to learn which key drivers are impacting the outcomes. The more you use and retrain your model, the smarter it becomes!

Diamonds

Create Multi-Class Classification model on Demo Diamonds dataset

1. If you want to use Graphite Note demo datasets click "Import DEMO Dataset".

2. Select the dataset you want to use to create machine learning model. In this case we will select a Diamonds dataset to create Multi Class Analysis on diamond characteristics data.

3. Once selected, the demo dataset will load directly to your account. Dataset view will automatically open.

4. Adjust your dataset options on Settings tab. Click Columns tab to view list of available columns with their corresponding data types. Explore dataset details on Summary tab.

5. To create new model in the Graphite Note main menu click on "Models"

6. You will get list of available models. Click on "New Model" to create a new one.

7. Select a model type from our templates. In our case we will select "Multi-Class Classification" by double clicking on its name.

8. Select the dataset you want to use to produce model. We will use "Demo-Diamonds.csv"

9. Name your new model. We will call it "Multi-Class Classification on Demo-Diamonds"

10. Write description of the model and select tag. If you want you can also create new tag from pop-up "Tags" window that will appear on the screen.

11. Click "Create" to create your demo model environment.

12. To set up a Multi Class model first you need to define "Target Feature". That is text column from your dataset you'd like to make predictions about. In case of Multi Class on Diamonds dataset target feature is "Cut" column.

13. Click "Next" to get the list of model features that will be included under the scenario. Model relies on each column (feature) to make accurate predictions. When training the model, we will calculate which of the features are most important and behave as Key Drivers.

14. To start training model click "Run Scenario". This will take a sample of 80% of your data and train several machine learning models.

15. Wait for few moments and Voilà! Your Regression model is trained. Click on "Performance" tab to get model insights and view Key Drivers.

16. Explore Multi Class model by clicking on Impact Analysis, Model Fit, Accuracy Overview or Training Results to get more insights on how model is trained and set up.

17. If you want to take your model into action click on "Predict" tab in the main model menu.

18. You can produce your own What-If analysis based on existing training results. You can also import a fresh CSV dataset to make predictions on target column. In our case that is "Cut". Keep in mind, the dataset you are uploading needs to contain the same feature columns as your model.

19. Use your model often to predict future behaviour and to learn which key drivers are impacting the outcomes. The more you use and retrain your model, the smarter it becomes!

eCommerce Orders

Create RFM Customer Segmentation on Demo eCommerce Orders dataset

1. If you want to use Graphite Note demo datasets click "Import DEMO Dataset".

2. Select the dataset you want to use to create machine learning model. In this case we will select eCommerce Orders dataset to create RFM Customer Segmentation (Recency, Frequency, Monetary Value) analysis on ecommerce orders data.

3. Once selected, the demo dataset will load directly to your account. Dataset view will automatically open.

4. Adjust your dataset options on Settings tab. Click Columns tab to view list of available columns with their corresponding data types. Explore the dataset details on Summary tab.

5. To create new model in the Graphite Note main menu click on "Models"

6. You will get list of available models. Click on "New Model" to create new one.

7. Select a model type from our templates. In our case we will select "RFM Customer Segmentation" by double clicking on its name.

8. Select dataset you want to use to produce model. We will use "Demo-eCommerce-Orders.csv".

9. Name your new model. We will call it "RFM customer segmentation on Demo-eCommerce-Orders".

10. Write description of the model and select tag. If you want you can also create new tag from pop-up "Tags" window that will appear on the screen.

11. Click "Create" to create your demo model environment.

12. Click this text field.

13. To set up RFM model, you first need to identify and define few parameters. These are: "Time /Date Column", "Customer ID", "Customer Name" (optional) and "Monetary" (amount spent). In our case we will select "created_at" as date, "user_id" as customer and "total" as monetary parameter.

14. To start training model click "Run scenario".

15. Wait for a few moments and Voilà! Your RFM Customer Segmentation model is trained. Click on the "Results" tab to get model insights.

16. You can navigate over different tabs to get deep insights into RFM analysis from different perspectives: Recency, Frequency, Monetary.

17. Tab "RFM Scores" shows detailed explanation on different scores along with RFM segments and descriptions.

18. Tab "RFM Analysis" gives you more details on different segments

19. Tab "RFM Matrix" will show you number of customers belonging to different RFM segment. You can export matrix data to use Customer IDs for different business actions (e.g. exporting list of about to churn customers).

Housing Prices

Create a Regression model on Demo Housing Prices dataset

1. If you want to use Graphite Note demo datasets click "Import DEMO Dataset".

2. Select a dataset you want to use to create machine learning model. In this case we will select Housing-Prices dataset to create a "Regression Analysis" on house price historical data.

3. Once selected, the demo dataset will load directly to your account. The Dataset view will automatically open.

4. Adjust your dataset options on the Settings tab. Click Columns tab to view list of available columns with their corresponding data types. Explore the dataset details on the Summary tab.

5. To create a new model in the Graphite Note main menu click on "Models"

6. You will get list of available models. Click on a "New Model" to create new one.

7. Select the model type from our templates. In our case, we will select "Regression" by double clicking on its name.

8. Select the dataset you want to use to produce the model. We will use "Demo-Housing-Prices.csv"

9. Name your new model. We will call it "Regression on Demo-Housing-Prices"

10. Write the description of the model and select a tag. If you want to, you can also create new a tag from pop-up "Tags" window that will appear on the screen.

11. Click "Create" to create your demo model environment.

12. To set up a Regression Model, firstly, you need to define the "Target Feature". That is a numeric column from your dataset that you'd like to make predictions about. In the case of Regression on Demo Housing Prices, the dataset target feature is "Price" column.

13. Click "Next" to get the list of model features that will be included into scenario. Model relies upon each column (feature) to make accurate predictions. When training model we will calculate which of the features are most important and behave as Key Drivers.

14. To start training the model, click "Run scenario". This will take a sample of 80% of your data and train several machine learning models.

15. Wait for few moments and Voilà! Your Regression model is trained. Click on the "Performance" tab to get model insights and view the Key Drivers.

16. Explore the Regression Model by clicking on Impact Analysis, Model Fit and Training Results to get more insights on how model is trained and set up.

17. If you want to take your model into action. Click on "Predict" tab in the main model menu.

18. You can produce your own What-If analysis based on existing training results. You can also import fresh CSV dataset with data model will use to make predictions on the target column. In our case, that is "Price". Keep in mind, the dataset you are uploading needs to contain the same feature columns as your model.

19. Use your model often to predict future behaviour, and to learn which key drivers are impacting the outcomes. The more you use and retrain your model, the smarter it becomes!

Lead Scoring

Binary Classification Model on Demo Lead Scoring dataset.

Get an overview of Lead Scoring demo dataset and how it can be used to create your new Graphite Note model in this video:

Or follow instructions below to get step by step guidance on how to use Lead Scoring demo dataset:

1. If you want to use Graphite Note demo datasets click "Import DEMO Dataset".

2. Select the dataset you want to use to create the machine learning model. In this case, we will select "Lead Scoring dataset" to create binary classification analysis on potential customer interactions data.

3. Once selected, the demo dataset will load directly to your account. The Dataset view will automatically open.

4. Adjust your dataset options on the Settings tab. Click Columns tab to view the list of available columns with their corresponding data types. Then explore the dataset details on Summary tab.

5. Click "Models"

6. You will get list of available models. Click on "New Model" to create a new one.

7. Select the model type from our templates. In our case, we will select "Binary Classification" by double clicking on its name.

8. Select dataset you want to use to produce the model. We will use "Demo-Lead-Scoring.csv."

9. Name your new model. We will call it "Binary Classification on Demo-Lead-Scoring".

10. Write the description of the model and select a tag. If you want to, you can also create a new tag from pop-up "Tags" window that will appear on the screen.

11. Click "Create" to create your demo model environment.

12. To set up a Binary Classification model, firstly, you need to define the "Target Feature". That is a binary column from your dataset that you'd like to make predictions about. In the case of Binary Classification on a Lead Scoring dataset, the target feature will be the "Converted" column.

13. Click "Next" to get the list of model features that will be included in the model scenario. The model relies upon each column (feature) to make accurate predictions. When training the model, it will calculate which of the features are most important and behave as Key Drivers.

14. To start training model click "Run Scenario". This will take a sample of 80% of your data and train several machine learning models.

15. Wait for few moments and Voilà! Your Binary Classification model is trained. Click on the "Performance" tab to get model insights and to view the Key Drivers.

16. Explore the "Binary Classification" model by clicking on the Impact Analysis and Training Results to get more insights on how the model is trained.

17. If you want to turn your model into action, click on "Predict" tab in the main model menu.

18. You can produce your own What-If analysis based on existing training results. You can also import a fresh CSV dataset to make predictions on the target column. In our case that is "Converted". Keep in mind, dataset you are uploading needs to contain same feature columns as your model.

19. Use your model often to predict future behaviour, and to learn which key drivers are impacting outcomes. The more you use and retrain your model, the smarter it becomes!

Mall Customers

Create General Segmentation on Demo Mall Customers dataset

1. If you want to use Graphite Note demo datasets click "Import DEMO Dataset".

2. Select the dataset you want to use to create your machine learning model. In this case, we will select "Mall Customers dataset", to create General Segmentation analysis on customer engagement data.

3. Once selected, the demo dataset will load directly to your account. The dataset view will automatically open.

4. Adjust your dataset options on the Settings tab. Click the Columns tab to view the list of available columns, with their corresponding data types. Explore the dataset details on Summary tab.

5. To create a new model in the Graphite Note main menu, click on "Models"

6. You will get list of available models. Click on the "New Model" to create a new one.

7. Select the model type from our templates. In our case, we will select "General Segmentation" by double clicking on its name.

8. Select the dataset you want to use to produce model. We will use "Demo-Mall-Customers.csv"

9. Name your new model. We will call it "General Segmentation on Demo-Mall-Customers".

10. Write a description of the model and select a tag. If you want to, you can also create new tag from the pop-up "Tags" window that will appear on the screen.

11. Click "Create" to create your demo model environment.

12. To set up your General Segmentation model, firstly, you need to define "Feature" columns. That is numeric column (or columns) from your dataset on which segmentation would be based. In the case of General Segmentation on Mall Customers dataset, the numeric feature will be "Age", Annual Income" and "Spending Score" columns.

13. To start training the model click "Run Scenario". This will train your model based on the uploaded dataset.

14. Wait for few moments and Voilà! Your General Segmentation model is trained. Click on "Results" tab to get model insights and explore segmentation clusters.

15. Navigate over different tabs to get insights from high level "Cluster Summary" to "By Cluster" charts and tables.

16. Use "Cluster Visualisations" to view the scatter plot visualisations of cluster members.

Marketing Mix

Create a Regression model on Demo Marketing Mix dataset

1. If you want to use Graphite Note demo datasets click "Import DEMO Dataset".

2. Select the dataset you want to use to create your machine learning model. In this case, we will select MMM dataset to create Regression Analysis on marketing mix and sales data.

3. Once selected, the demo dataset will load directly to your account. The dataset view will automatically open.

4. Adjust your dataset options on the Settings Tab. Click the Columns tab to view list of available columns with their corresponding data types. Explore dataset details on the Summary tab.

5. To create a new model in the Graphite Note main menu click on "Models".

6. You will get list of available models. Click on "New Model" to create a new one.

7. Select your model type from our templates. In our case we will select "Regression" by double clicking on its name.

8. Select dataset you want to use to produce the model. We will use "Demo-MMM.csv"

9. Name your new model. We will call it "Regression on Demo-MMM".

10. Write description of the model and select tag. If you want you can also create new tag from pop-up "Tags" window that will appear on the screen.

11. Click "Create" to create your demo model environment.

12. To set up your "Regression Model", firstly, you need to define "Target Feature". That is numeric column from your dataset that you'd like to make predictions about. In the case of Regression on Marketing Mix and Sales Dataset, the target feature is "Sales" column.

13. Click "Next" to get the list of model features that will be included in scenario. The model relies upon each column (feature) to make accurate predictions. When training the model, we will calculate which of the features are most important and behave as Key Drivers.

14. To start training the model, click "Run Scenario". This will take a sample of 80% of your data and train several machine learning models.

15. Wait for few moments and Voilà! Your Regression model is trained. Click on "Performance" tab to get model insights and view the Key Drivers.

16. Explore Regression model by clicking on Impact Analysis, Model Fit and Training Results to get more insights on how the model is trained and set up.

17. If you want to take your model into action, click on "Predict" tab in the main model menu.

18. You can produce your own What-If analysis based on existing training results. You can also import fresh CSV dataset with data model will use to make predictions on target column. In our case, that is "Sales". Keep in mind, dataset you are uploading needs to contain same feature columns as your model.

19. Use your model often to predict future behaviour and to learn which key drivers are impacting the outcomes. The more you use and retrain your model, the smarter it becomes!

Car Sales

Create Timeseries on Demo monthly car sales dataset

1. If you want to use Graphite Note demo datasets click "Import DEMO Dataset".

2. Select a dataset you want to use to create your advanced analytics model. In this case, we will select Monthly Car Sales dataset to create a "Timeseries Forecast" analysis on car sales data.

3. Once selected, the demo dataset will load directly to your account. Dataset view will automatically open.

4. Adjust your dataset options on the Settings tab. Click the Columns tab to view the list of available columns with their corresponding data types. Explore the dataset details on the Summary tab.

5. To create a new model in the Graphite Note main menu, click on "Models".

6. You will get list of available models. Click on "New Model" to create new one.

7. Select the model type from our templates. In our case, we will select "Timeseries Forecast" by double clicking on its name.

8. Select the dataset you want to use to produce the model. We will use "Demo-Monthly-Car-Sales.csv".

9. Name your new model. We will call it "Timeseries forecast on Demo-Monthly-Car-Sales".

10. Write description of the model and select a tag. If you want to, you can also create new tag from pop-up "Tags" window, that will appear on the screen.

11. Click "Create" to create your demo model environment.

12. To set up Timeseries forecast analysis first you need to define the "Target Column". That is a numeric column from your dataset that you'd like to forecast. In the case of Timeseries on monthly car sales dataset target column is "Sales"

13. If dataset includes multiple time series sequences, you can select field that will be used to uniquely identify each sequence. In the case of our demo dataset, we will not apply Sequence Identifier field since we have only "Sales" target column.

14.

15. Click "Next" to open "Time/Date Column" selection. Choose "Month" as date column.

16. From additional options below, choose "Monthly" as time interval and define "Forecast Horizon". We will set up forecast horizon to 6 months in the future.

17. Click "Next" to activate "Seasonality" options step. Here, you can define seasonality specifics of your forecast. If time interval is set to daily on the next step you will also have "Advanced options" available.

18. Click "Run Scenario" to train your timeseries forecast.

19. Wait for a few moments and Voilà! Your Timeseries forecast is trained. Click on the "Performance" tab to get insights and view the graph with original(historical) and predicted model data.

20. Explore more details on "trend", "Seasonality" and "Details" tabs.

21. If you want to turn your model into action click on "Predict" tab in the main model menu.

22. You can produce your own Forecast analysis based on the existing training results by selecting Start and End date from drop down calendar and clicking on "Predict" button.

23. Use your model often to predict future sales results. The more you use and retrain your model, the smarter it becomes!

Store Item Demand

Create a Regression model on Demo Store Item Demand dataset

1. If you want to use Graphite Note demo datasets click "Import DEMO Dataset".

2. Select dataset you want to use to create your machine learning model. In this case, we will select Store Item Demand dataset to create Regression analysis on sales across store locations data.

3. Once selected, the demo dataset will load directly to your account. The dataset view will automatically open.

4. Adjust your dataset options on Settings tab. Click Columns tab to view list of available columns with their corresponding data types. Explore dataset details on Summary tab.

5. To create a new model in the Graphite Note main menu, click on "Models".

6. You will get list of available models. Click on "New Model" to create new one.

7. Select model type from our templates. In our case we will select "Regression" by double clicking on its name.

8. Select the dataset you want to use to produce model. We will use "Demo-Store-Item-Demand.csv".

9. Name your new model. We will call it "Regression on Demo-Store-Item-Demand".

10. Write a description of the model and select tag. If you want to, you can also create new tag from pop-up "Tags" window that will appear on the screen.

11. Click "Create" to create your demo model environment.

12. To set up Regression Model, firstly, you will need to define "Target Feature". That is the numeric column from your dataset that you'd like to make predictions about. In case of Regression on Store Item Demand dataset target feature is "Sales" column.

13. Click "Next" to get the list of model features that will be included in model scenario. The model relies on each column (feature) to make accurate predictions. When training a model, we will calculate which of the features are most important and behave as the Key Drivers.

14. To start training your model click "Run scenario". This will take a sample of 80% of your data and train several machine learning models.

15. Wait for few moments and Voilà! Your Regression model is trained. Click on "Performance" tab to get model insights and view Key Drivers.

16. Explore Regression model by clicking on Impact Analysis, Model Fit and Training Results to get more insights on how model is trained and set up.

17. If you want to take your model into action click on "Predict" tab in the main model menu.

18. You can produce your own What-If analysis based on existing training results. You can also import a fresh CSV dataset to make predictions on target column. In our case that is "Sales". Keep in mind, dataset you are uploading needs to contain same feature columns as your model.

19. Use your model often to predict future behaviour and to learn which key drivers are impacting the outcomes. The more you use and retrain your model, the smarter it becomes!

Upsell

Create Binary Classification model on Demo Upsell dataset

1. If you want to use Graphite Note demo datasets click "Import DEMO Dataset".

2. Select dataset you want to use to create your machine learning model. In this case we will select Upsell dataset to create binary classification analysis on additional purchases by customer data.

After selection, the demo dataset will automatically load into your account and the dataset view will open immediately.

4. Adjust your dataset options on the Settings tab. Click Columns tab to view the list of available columns with their corresponding data types. Explore dataset details on Summary tab.

5. To create a new model in the Graphite Note main menu, click on "Models"

6. You will get a list of available models. Click on "New Model" to create a new one.

7. Select the model type from our templates. In our case we will select "Binary Classification" by double clicking on its name.

8. Select the dataset you want to use to produce model. We will use "Demo-Upsell.csv".

9. Name your new model. We will call it "Binary Classification on Demo-Upsell".

10. Write a description of the model and select a tag. If you want you can also create new tag from pop-up "Tags" window that will appear on the screen.

11. Click "Create" to create your demo model environment.

12. To set up a Binary Classification model, firstly, you need to define "Target Feature". That is binary column from your dataset that you'd like to make predictions about. In case of Binary Classification on Upsell dataset target feature will be "Applied" column.

13. Click "Next" to get the list of model features that will be included in the model scenario. Your model relies upon each column (feature) to make accurate predictions. When training the model we will calculate which of the features are most important and behave as Key Drivers.

14. To start training your model click "Run scenario". This will take a sample of 80% of your data and train several machine learning models.

15. Wait for few moments and Voilà! Your Binary Classification model is trained. Click on "Performance" tab to get model insights and view the Key Drivers.

16. Explore Binary Classification model by clicking on Impact Analysis and Training Results to get more insights on how model is trained.

17. If you want to take your model into action click on "Predict" tab in the main model menu.

18. You can produce your own What-If analysis based on existing training results. You can also import a fresh CSV dataset to make predictions on the target column. In our case that is "Applied". Keep in mind, the dataset you are uploading needs to contain the same feature columns as your model.

19. Use your model often to predict future behaviour and to learn which key drivers are impacting the outcomes. The more you use and retrain your model, the smarter it becomes!

What Dataset do I need for my use case?

The "What Dataset Do I Need?" section of Graphite Note is a comprehensive resource designed to guide users through the intricacies of dataset selection and preparation for various machine learning models. This section is crucial for users, especially those without extensive AI expertise, as it provides clear, step-by-step instructions and examples on how to curate and structure data for different predictive analytics scenarios.

Key Features of the Section

Model-Specific Guidance: Each page within this section is tailored to a specific predictive model, such as cross-selling prediction, churn prediction, or customer segmentation. It outlines the type of data required, the format, and how to interpret and use the data effectively.
Sample Datasets and Templates: To make the process more user-friendly, the section includes sample datasets and templates. These examples showcase the necessary columns and data types, along with a brief explanation of each, helping users to model their datasets accurately.
Target Column Identification: A crucial aspect of preparing a dataset for machine learning is identifying the target column. This section provides clear guidance on selecting the appropriate target for different types of analyses, whether it's for classification, regression, or clustering.
Data Cleaning and Preparation Tips: Recognizing that data rarely comes in a ready-to-use format, this section offers valuable tips on cleaning and preparing data, ensuring that users start their predictive analytics journey on the right foot.
Real-World Applications and Use Cases: To bridge the gap between theory and practice, the section includes examples of real-world applications and use cases. This approach helps users understand how their data preparation efforts translate into actionable insights in various business contexts.

Predict Customer Churn: Dataset

Predicting customer churn is a critical challenge for businesses aiming to retain their customers and reduce turnover. This problem typically involves a binary classification model, where the goal is to predict whether a customer is likely to leave or discontinue their use of a service or product in the near future.

Dataset Essentials for Customer Churn Prediction

A well-structured dataset is key to accurately predicting customer churn. Essential data elements include:

Customer Demographics: Age, gender, and other demographic factors that might influence customer loyalty.
Usage Patterns: Data on how frequently and in what manner customers use the product or service.
Customer Service Interactions: Records of customer support interactions, complaints, and resolutions.
Transaction History: Details of customer purchases, payment methods, and transaction frequency.
Engagement Metrics: Measures of customer engagement, such as email opens, website visits, or app usage.

A typical dataset for churn prediction might look like this:

CustomerID

Age

Gender

AnnualIncome

MonthlyUsage

SupportCalls

LastPurchase

Churned

Target Column: The Churned column is the target variable, indicating whether the customer has churned (Yes) or not (No).

Steps to Success with Graphite Note

Data Gathering: Collect comprehensive and relevant customer data.
Feature Engineering: Identify and create features that are most indicative of churn.
Model Training: Use Graphite Note to train a binary classification model on your dataset.
Model Evaluation: Test the model's performance and refine it for better accuracy.

Benefits of Predicting Customer Churn

Proactive Customer Retention: Identifying at-risk customers allows businesses to take proactive steps to retain them.
Improved Customer Experience: Insights from churn prediction can guide improvements in products and services.
Cost Efficiency: Retaining existing customers is often more cost-effective than acquiring new ones.
Accessible Analytics: Graphite Note's no-code platform makes predictive analytics accessible, enabling businesses of all sizes to leverage AI for customer retention.

In summary, the Predict Customer Churn model is an invaluable tool for businesses focused on customer retention. Through Graphite Note, this advanced predictive capability becomes accessible to businesses without the need for extensive technical expertise, allowing them to make informed, data-driven decisions for customer retention strategies.

Predict Revenue : Dataset

Predict Revenue is a critical task for businesses aiming to forecast future revenue streams accurately. This challenge is typically addressed using a time series forecasting model, which analyzes historical revenue data to predict future trends and patterns.

Dataset Essentials for Predict Revenue

A suitable dataset for Predict Revenue using time series forecasting should include:

Date/Time: The timestamp of revenue data, usually in daily, weekly, or monthly intervals.
Revenue: The total revenue recorded in each time period.
Seasonal Factors: Data on seasonal variations or events that might affect revenue.
Economic Indicators: Relevant economic factors that could influence revenue trends.
Marketing Spend: Information on marketing and advertising expenditures, if applicable.

An example dataset for Predict Revenue with time series forecasting might look like this:

Date

Total Revenue

Seasonal Event

Economic Indicator

Marketing Spend

Target Column: The Total Revenue column is the primary focus, as the model aims to forecast future values in this series.

Steps to Success with Graphite Note

Data Collection: Compile historical revenue data along with any relevant external factors.
Time Series Analysis: Utilize Graphite Note to analyze the time series data and identify patterns.
Model Training: Train a time series forecasting model using the platform.
Model Evaluation: Continuously evaluate and adjust the model based on new data and changing market conditions.

Benefits of Predict Revenue with Time Series Forecasting

Accurate Financial Planning: Enables more precise budgeting and financial planning.
Strategic Decision Making: Informs strategic decisions with insights into future revenue trends.
Adaptability to Market Changes: Helps businesses adapt strategies in response to predicted market changes.
User-Friendly Analytics: Graphite Note's no-code approach makes sophisticated time series forecasting accessible to users without specialized statistical knowledge.

In summary, Predict Revenue with time series forecasting is an essential tool for businesses to anticipate future revenue trends. Graphite Note simplifies this complex task, allowing businesses to leverage their historical data for insightful and actionable revenue predictions.

Product Demand Forecast: Dataset

Product Demand Forecast is a crucial process for businesses to predict future demand for their products. This task typically involves time series forecasting models, which analyze historical sales data to forecast future demand patterns.

Dataset Essentials for Product Demand Forecast

An effective dataset for Product Demand Forecast using time series forecasting should include:

Date/Time: The timestamp for each data point, typically daily, weekly, or monthly.
Product Sales: The number of units sold or the sales volume of each product.
Product Features: Characteristics of the product, such as category, price, or any special features.
Promotional Activities: Data on any marketing or promotional activities that might affect sales.
External Factors: Information on external factors like market trends, economic conditions, or seasonal events.

An example dataset for Product Demand Forecast might look like this:

Date

ProductID

Sales Volume

Price

Promotion

Market Trend

Seasonal Event

Target Column: The Sales Volume column is the primary focus, as the model aims to forecast future sales volumes for each product.

Steps to Success with Graphite Note

Data Collection: Gather detailed sales data along with product features and external factors.
Time Series Analysis: Use Graphite Note to analyze the sales data over time, identifying trends and patterns.
Model Training: Train a time series forecasting model on the platform.
Model Evaluation: Regularly evaluate the model's performance and adjust it based on new data and market changes.

Benefits of Product Demand Forecast

Inventory Management: Helps in planning inventory levels to meet future demand, avoiding stockouts or overstock situations.
Strategic Marketing: Informs marketing strategies by predicting when demand for certain products will increase.
Resource Allocation: Assists in allocating resources efficiently based on predicted product demand.
Accessible Forecasting: Graphite Note's no-code platform makes advanced forecasting techniques accessible to a wider range of users.

In summary, Product Demand Forecast is vital for businesses to anticipate market demand and plan accordingly. With Graphite Note, this complex analytical task becomes manageable, enabling businesses to leverage their data for effective demand planning and strategic decision-making.

Media Mix Modeling (MMM): Dataset

Media Mix Modeling (MMM) is a statistical analysis technique used to quantify the impact of various marketing channels on sales and other key performance indicators (KPIs). It helps businesses allocate their marketing budget more effectively by understanding the contribution of each channel to overall performance.

Dataset Essentials for Media Mix Modeling

A robust dataset for Media Mix Modeling should include:

Time Period: The specific dates or periods for which the data is collected.
Marketing Spend: The amount spent on each marketing channel during the period.
Sales Data: The total sales achieved in the same time period.
Channel Performance Metrics: Metrics like impressions, clicks, conversions, etc., for each channel.
External Factors: Information on external factors like economic conditions, competitor activities, or seasonal events.
Market Dynamics: Changes in market conditions, customer preferences, or product availability.

An example dataset for Media Mix Modeling might look like this:

Time Period

TV Spend

Digital Spend

Radio Spend

Print Spend

Total Sales

Economic Condition

Seasonal Event

Target Column: Totall Sales

Steps to Success with Graphite Note

Data Compilation: Gather comprehensive data across all marketing channels and corresponding sales data.
Model Development: Use Graphite Note, Regression Model, to develop a statistical model that correlates marketing spend across various channels with sales outcomes.
Analysis and Insights: Analyze the model's output to understand the effectiveness of each marketing channel.
Strategic Decision Making: Apply these insights to optimize future marketing spends and strategies.

Benefits of Media Mix Modeling

Optimized Marketing Budget: Allocate marketing budgets more effectively across channels.
ROI Analysis: Understand the return on investment for each marketing channel.
Strategic Planning: Plan marketing strategies based on data-driven insights.
Adaptability: Adjust marketing strategies in response to changing market conditions and consumer behaviors.
Accessible Advanced Analytics: Graphite Note's no-code platform makes complex MMM accessible to teams without specialized statistical knowledge.

In summary, Media Mix Modeling is a powerful tool for businesses to optimize their marketing strategies based on comprehensive data analysis. With Graphite Note, this advanced capability becomes accessible, allowing for more informed and effective marketing budget allocation.

Datasets

Data sources

Import data from CSV file

Overview

Graphite Note’s CSV integration allows you to import your data from CSV files. This guide will walk you through the steps to import your CSV data into Graphite Note.

Steps to Import Data

Create a New Dataset

Option 1: Go to the homepage and click on "Create" under Datasets.
Option 2: From the datasets list, click on "New Dataset."

Select CSV file

Choose "CSV" as your data source and click "Next".

Enter Dataset Information

Name: Provide a name for the dataset.
Description: Add a short description of the data.
Tags: Add tags for better organization.
Click "Next" to proceed.

Choose Parsing Options

Configure your parsing options to properly interpret the CSV file.
- Delimiter: Choose the character that separates values in your CSV (e.g., comma, semicolon).
- Header: Indicate whether your CSV file has a header row.
- Convert Empty Values to Null: Convert 'empty string' values to null. By default, this is turned on.
- File Encoding: Select the file character encoding. The default is Unicode (UTF-8).
- Skip Empty Rows: Skip empty rows at the beginning or end of the file. By default, Graphite Note will ignore those lines.
Click on "Parse file" to process the file.

Review parsed data
Review the parsed data and make any necessary adjustments:
- Rename Columns: Change column names if needed.
- Change Data Types: Adjust the data types of your columns as required.
- Click on the "Create" button to finalize and create your dataset.

CSV filesize upload limits

Graphite Note limits the file size for CSV uploads to 50 MB. This restriction ensures that the platform maintains optimal performance, prevents excessive resource consumption, and ensures fast processing times for data uploads. If users need to work with larger datasets, alternative methods such as connecting directly to a database (Postgres, MySQL, BigQuery) can be used.

Next Steps

Now that your data is imported and prepared, you can proceed to create a model without writing any code. Simply go to the and follow the steps to build and deploy your model using Graphite Note's intuitive interface.

Re-upload or append CSV

If you've collected additional data related to your previously uploaded CSV file or there have been changes to the existing data, you can use the re-upload option. This allows you to update or append new data to your existing dataset. Expanding your dataset with more records can benefit your machine learning model, but the impact depends on the quality and relevance of the new data. Learn more about the benefits of dataset expansion .

How to Re-upload a CSV File:

Go to the Datasets List:
- Navigate to the list of datasets.

Select the Dataset:

Choose the dataset you wish to re-upload.

Select Re-upload:

Click on the 'Re-upload' option.

Choose Options and upload:

Decide on data append:
- Depending on your needs, you can choose 'Append data' to add new records to the existing dataset. If 'Append data' is turned off, the new dataset will overwrite the existing
Upload Your File:
- Select or drag and drop your CSV file. Ensure the file has the same column structure as the previously uploaded file.
Click Update to complete the re-upload process.

Oracle Connector

Oracle Connector for Graphite Note will soon be available for public use. This connector will enable seamless integration with Oracle databases, enhancing data accessibility and supporting advanced analysis within the Graphite Note platform. Stay tuned for its release to unlock efficient Oracle data connectivity.

Prepare your Data

data prep

Expanding datasets

Expanding your dataset with more records can be beneficial for your machine learning model, but the impact depends on the quality and relevance of the new data. To learn how to re-upload or append data to your existing CSV dataset go here.

Here’s how adding more records might help:

1. Improved Generalization

More data generally helps the model to learn better and generalize to unseen data. If your initial dataset was limited, your model might have overfitted to the specific patterns in that data. Adding more data helps the model capture a wider range of patterns, leading to better performance on new data.

2. Reducing Overfitting

When your dataset is small, the model may learn noise or irrelevant patterns (overfitting). Expanding the dataset introduces more variety, making it harder for the model to memorize specific samples, thereby helping to reduce overfitting.

3. Better Representing the Data Distribution

A larger dataset often better represents the underlying data distribution, especially if the new records cover more edge cases, outliers, or scenarios that were underrepresented in the original dataset. This helps the model become more robust and perform well across a wider range of inputs.

4. Enhanced Model Accuracy

In most cases, expanding your dataset improves the accuracy of the model, especially if the model is data-hungry (like deep learning models). More data means more examples for the model to learn from, allowing it to better predict future outcomes.

5. Handling Class Imbalance

If your dataset suffers from class imbalance (e.g., if one class has far more records than another in a classification problem), adding more records from the minority class can make your dataset more balanced, improving the model’s ability to predict minority classes correctly.

Considerations:

• Quality over Quantity: Simply adding more data isn’t always beneficial if the additional data is noisy, irrelevant, or incorrectly labeled. High-quality, representative data is more important than just increasing the size of the dataset.

• Data Diversity: Adding data that captures a wider variety of features or scenarios is more helpful than adding redundant or very similar data points. If the new data points are too similar to the existing ones, the impact on model performance might be minimal.

• Graphite Note plan limits: Consider that expanding the dataset will increase both the computational requirements for training the model and the total number of dataset rows used in your current plan. More about Graphite Note plans finde here.

CSV File creating and formatting

This page outlines best practices for preparing CSV files that can be easily parsed and imported into Graphite Note.

Formatting Recommendations

1. Include a Header Row

• The first line should clearly list column names (e.g., customer_id, date, purchase_amount).

• Avoid using spaces or special characters in header names; if needed, use underscores or camelCase.

2. Check Delimiters

• Ensure the file uses a consistent delimiter (most commonly a comma , or semicolon ;).

• If your data contains commas within fields, enclose those fields in quotes (e.g., "123, Main Street").

3. Use UTF-8 Encoding

• Always save or export your file in UTF-8 format to avoid character encoding issues.

• This helps ensure Graphite Note can accurately parse all characters, including accented letters.

4. Date and Time Formats

• Use consistent date formats (e.g., YYYY-MM-DD) or a well-defined datetime format (e.g., YYYY-MM-DD HH:MM:SS).

• This helps Graphite Note accurately recognize date/time fields.

5. Handling Special Characters

• If your data contains quotes, commas, or other punctuation, ensure those fields are properly escaped or wrapped in quotes to prevent parsing errors.

6. Remove Unused Columns

• If possible, eliminate columns that you do not plan to analyze. This helps keep the CSV file size smaller and easier to handle.

7. Adhere to the 50 MB File Size Limit

• Graphite Note limits CSV uploads to 50 MB to ensure optimal performance and resource usage.

• If your file exceeds 50 MB, try splitting it into multiple CSVs or removing unnecessary columns and rows.

• For significantly larger datasets, consider connecting directly to a database (e.g., PostgreSQL, MySQL, BigQuery) to feed data into Graphite Note.

How to Create a CSV File from Excel?

1. Open Your Spreadsheet in Excel

Make sure you review and clean up any unnecessary columns or rows before exporting.

2. Save As CSV

Click on File → Save As (or Save a Copy in newer versions).
Select CSV (Comma delimited) or CSV UTF-8 (Comma delimited) in the Save as type dropdown menu.
Choose a file name and location, then click Save.

3. Verify Encoding

If possible, choose CSV UTF-8 when saving.
If CSV UTF-8 is not available, you may need to convert the file encoding separately (e.g., using a text editor like VSCode or Notepad++).

4. Confirm Data Integrity

Open the resulting CSV file in a plain text editor to verify that the column names and any special characters appear correctly.

Models

Machine Learning Models

In the next section, you’ll learn how to define a scenario, train a model, and leverage the results to make predictions, take strategic actions, and make data-driven decisions that directly impact your business.

Each model will be introduced on a dedicated page with step-by-step instructions and a video tutorial, guiding you through the process from setup to actionable insights.

Notebooks

What is Notebook?

The main idea behind Graphite Notebook is to do your own ; create various visualization with detailed descriptions, plot model results for better understanding, etc.

REST API

Complete

Description

The /dataset-complete endpoint is designed to signal the end of the dataset insertion process. Once all batches have been inserted via the /dataset-fill endpoint, this method should be called to trigger the final dataset shape calculation and any other necessary post-processing steps.

complete_url = "https://app.graphite-note.com/api/dataset-complete"

Parameters

user-code (string): Unique code identifying the user.
dataset-code (string): A unique code for the dataset, if pre-defined.

Indicate end of dataset insertion

To populate a dataset, follow these steps:

Make a POST request to /dataset-complete with the required parameters in the request body.
Include the user-code and dataset-code to identify the dataset and the user making the request.

Example Usage

For example, making a POST with following header included into request with the provided JSON body would result in the response below:

Header

POST /dataset-complete
Authorization: Bearer YOUR-TENANT-TOKEN

Request

{ 
"user-code": "0f02b4d4f9ae", 
"dataset-code": " a49932c0f135"
 }

Response

{
    "data": {
        "status": "success",
        "details": {
            "dataset-code": "a49932c0f135",
            "rows-count": 542289
        }
    }
}

Prediction API

Prediction API allows users to request predictions by sending data to a trained model in Graphite Note.

To interact with the Graphite Note API and perform predictions using a specific model, you need to make a POST request to the API endpoint. Depending on the model type and use case, you can now use two different versions of the API, each with its own request body format:

v1 Endpoint – for backward-compatible, alias-based input
v2 Endpoint – for simplified, column-name-based input (recommended for new users)

In the following sections, you will find more details about the Prediction API.

Headers

The request requires the following headers to be included:

Authorization: This header should be set to "Bearer [token]". Replace [token] with your unique token. The token can be found by accessing the account info page in the Graphite Note app, under the section displaying your current plan information.

Content-Type: This header should be set to "application/json" to indicate that the request payload is in JSON format.

Customer Lifetime Value

Introduction

Detecting early signs of reduced customer engagement is pivotal for businesses aiming to maintain loyalty. A notable signal of this disengagement is when a customer's once regular purchasing pattern starts to taper off, leading to a significant decrease in activity. Early detection of such trends allows marketing teams to take swift, proactive measures. By deploying effective retention strategies, such as offering tailored promotions or engaging in personalized communication, businesses can reinvigorate customer interest and mitigate the risk of losing them to competitors.

Our objective is to utilize a model that not only alerts us to customers with an increased likelihood of churn but also forecasts their potential purchasing activity and, importantly, estimates the total value they are likely to bring to the business over time.

These analytical needs are served by what is known in data science as Buy 'Til You Die (BTYD) models. These models track the lifecycle of a customer's interaction with a business, from the initial purchase to the last.

While customer churn models are well-established within contractual business settings, where customers are bound by the terms of service agreements, and churn risk can be anticipated as contracts draw to a close, non-contractual environments present a different challenge. In such settings, there are no defined end points to signal churn risk, making traditional classification models insufficient.

To address this complexity, our model adopts a probabilistic approach to customer behavior analysis, which does not rely on fixed contract terms but on behavioral patterns and statistical assumptions. By doing so, we can discern the likelihood of future transactions for every customer, providing a comprehensive and predictive understanding of customer engagement and value.

Customer Lifetime Value - How it Works?

The Customer Lifetime Value (CLV) model is a robust tool employed to ascertain the projected revenue a customer will contribute over their entire relationship with a business. The model employs historical data to inform predictive assessments, offering valuable foresight for strategic decision-making. This insight assists companies in prioritizing resources and tailoring customer engagement strategies to maximize long-term profitability.

The CLV model executes a series of sophisticated calculations. Yet, its operations can be conceptualized in a straightforward manner:

Historical Analysis: The model comprehensively evaluates past customer transaction data, noting the frequency and monetary value of purchases alongside the tenure of the customer relationship.
Engagement Probability: It assesses the likelihood of a customer’s future engagement based on their past activities, effectively estimating the chances of a customer continuing to transact with the business.
Forecasting: With the accumulated data, the model projects the customer’s future transaction behavior, predicting how often they will make purchases and the potential value of these purchases.
Lifetime Value Calculation: Integrating these elements, the model calculates an aggregate figure representing the total expected revenue from a customer for a designated future period.

Model Scenario

The Customer Lifetime Value model uses historical customer data to predict the future value a customer will generate for a business. It leverages algorithms and statistical techniques to analyze customer behavior, purchase patterns, and other relevant factors to estimate the potential revenue a customer will bring over their lifetime.

The dataset on which you will run your model must contain a time-related column.

We need to distinguish all customers, so we need an identifier variable like Customer ID. If you might have data about Customer Names, great, if not, don't worry, just select the same column as in the Customer ID field.

We need to choose the numeric variable regard to which we will observe customer behavior, called Monetary (amount spent).

Finally, you need to choose the Starting Date from which you'd like to calculate this model for your dataset.

When you're looking at this option for calculating Customer Lifetime Value (CLV), think of it as setting a starting line for a race. The "race" in this case is the journey you're tracking: how much your customers will spend over time.

The "Starting Date for Customer Lifetime Value Calculation" is basically asking you when you want to start watching the race. You have a couple of choices:

Max Date: This is like saying, "I want to start watching the race from the last time we recorded someone crossing the line." It sets the starting point at the most recent date in your records where a customer made a purchase.
Today: Choosing this means you want to start tracking from right now, today. So any purchases made after today will count towards the CLV.
-- select date --: This would be an option if you want to pick a specific date to start from, other than today or the most recent date in your data.

Model Results

Let's see how to interpret the results after we have run our model.

And then, the results consist of 2 tabs: and Tabs.

On the summary of repeat customers, we have:

the Total Repeat Customers: the customers came that keep returning (the loyal customers)
the Total Historical Amount: the past earnings from loyal customers
the Average Spend per Repeat Customer
the Average no. of Repeat Purchases: shows the customers' loyalty with the average number of repeat purchases
the Average Probability Alive Next 90 days: estimate the likelihood that a customer stays alive or active for their business in the next 90 days
the Predicted no. of Purchases next 90 days: the number of purchases you can expect the next 90 days based on our analysis
Predicted Amount Next 90 days: the revenue you can expect the next 90 days with our predicted amount feature
CLV Customer Lifetime Value: average revenue that one customer generated in the past and will generate in the future

CLV Insights

The CLV Insights Tab shows some charts on the lifetime of customers.

The forecasted number of purchases chart estimates the number of purchases that are expected to be made by returning customers over a specific period.

The forecasted amount chart is a graphical representation of the projected value of purchases to be made by returning customers over a certain period.

Finally, the average alive probability chart illustrates the average probability of a customer remaining active for a business over time, assuming no repeat purchases.

Details

Last but not least, on the Details Tab, you can find a detailed table where you can see all relevant values which were used for the above results.

You have all the information in each column if you click on the link on the details tab.

Understanding the Data Columns

The Details Tab within the Customer Lifetime Value Model offers an extensive breakdown of metrics for in-depth analysis. Each column represents a specific aspect of customer data that is pivotal to understanding and predicting customer behavior and value to your business. Below are the descriptions of the available columns:

`amount_sum`

Description: This column showcases the total historical revenue generated by an individual customer. By analyzing this data, businesses can identify high-value customers and allocate marketing resources efficiently.

`amount_count`

Description: Reflects the total number of purchases by a customer. This frequency metric is invaluable for loyalty assessments and can inform retention strategies.

`repeated_frequency`

Description: Indicates the frequency of repeated purchases, highlighting customer loyalty. This metric can be leveraged for targeted engagement campaigns.

`customer_age`

Description: The duration of the customer's relationship with the business, measured in days since their first purchase. It helps in segmenting customers based on the length of the relationship.

`average_monetary`

Description: Average monetary value per purchase, providing insight into customer spending habits. Businesses can use this to predict future revenue from a customer segment.

`probability_alive`

Description: Displays the current probability of a customer being active. A score of 1 means 100%, the customer is likely active, aiding in prioritizing engagement efforts.

`probability_alive_7_30_60_90_365`

Description: This column shows the probability of customers remaining active over various time frames without repeat purchases. It's critical for developing tailored customer retention plans.

`predicted_no_purchases_7_30_60_90_365`

Description: Predicts the number of future purchases within specific time frames. This forecast is essential for inventory planning and sales forecasting.

`CVL_30_60_90_365`

Description: Estimates potential customer value over different time frames, aiding in strategic financial planning and budget allocation for customer acquisition and retention.

Example - customer 1

In this given example, we have a snapshot of customer data from the CLV model. The model considers various unique aspects of customer behavior to predict future engagement and value. Let's analyze the key data points and what they signify in a non-technical way, while emphasizing the model’s ability to tailor predictions to individual customer behavior:

amount_sum: This customer has brought in a total revenue of $4,584.14 to your business.
amount_count: They have made 108 purchases, which shows a high level of engagement with your store.
repeated_frequency: Out of these purchases, 106 are repeat purchases, suggesting a strong customer loyalty.
customer_age: They have been a customer for 364 days, indicating a relatively long-term relationship with your business.
average_monetary: On average, they spend about $42.73 per transaction.
probability_alive: There’s an 85% to 86% chance that they are still actively engaging with your business, which is quite high.
probability_alive_7: Specifically, the probability that this customer will remain active in the next 7 days is about 44.48%.

Alex, with a remarkable 106 repeated purchases and a customer_age of 364 days, has shown a pattern of strong and consistent engagement. The average monetary value of their purchases is $42.73, contributing significantly to the revenue with a total amount_sum of $4,584.14. The current probability_alive is high, indicating Alex is likely still shopping.

However, even with this consistent past behavior, the probability_alive_7 drops to about 44.48%. It highlights a nuanced understanding of Alex's habits; a sudden change in their routine is notable, which is why the model predicts a more significant impact if Alex were to alter their shopping pattern even slightly.

Example - customer 2

On the other hand, we have Casey, who has made 2 purchases, with only 1 being a repeated transaction. Casey’s amount_sum is $185.93, with an average_monetary value of $84.44, and a customer_age of 135 days. Despite a high current probability_alive, the model shows a minimal decline to 83.73% in the probability_alive_7.

This slight decrease tells us that Casey's engagement is inherently more sporadic. The business doesn't expect Casey to make purchases with the same regularity as Alex. If Casey doesn't return for a week, it isn't alarming or out of character, as reflected in the gentle decline in their seven-day active probability.

The Story of Two Shopping Journeys

The contrast in these profiles, painted by the CLV model, enables the business to craft distinct customer journeys for Alex and Casey. For Alex, it's about ensuring consistency and rewarding loyalty to maintain that habitual engagement. Perhaps an automated alert for engagement opportunities could be set up if they don't make their usual purchases.

For Casey, the strategy may involve creating moments that encourage repeat engagement, possibly through sporadic yet impactful touchpoints. Since Casey's behavior suggests openness to larger purchases, albeit less frequently, the focus could be on highlighting high-value items or exclusive offers that align with their sporadic engagement pattern.

The CLV model's behavioral predictions allow the business to personalize customer experiences, maximize the potential of each interaction, and strategically allocate resources to maintain and grow the value of each customer relationship over time. This bespoke approach is the essence of modern customer relationship management, as it aligns perfectly with the individualized tendencies of customers like Alex and Casey.

Utilizing Column Data

This detailed data is a treasure trove for businesses keen on data-driven decision-making. Here’s how to utilize the information effectively:

Custom Segmentation: Use customer_age, amount_sum, and average_monetary to segment your customers into meaningful groups.
Detect Churners: Use probability_alive to segment customers currently being active for non contractual business like eCommerce and Retail. A score of 0.1 means 10% probability the customer is active ("alive") for your business.
Targeted Marketing Campaigns: Leverage repeated_frequency and probability_alive columns to identify customers for loyalty programs or re-engagement campaigns.
Revenue Projections: The CVL_30_60_90_365 column helps in projecting future revenue and understanding the long-term value of customer segments.
Strategic Planning: Use predicted_no_purchases_7_30_60_90_365 to plan for demand, stock management, and to set realistic sales targets.

By engaging with the columns in the Details Tab, users can extract actionable insights that can drive strategies aimed at optimizing customer lifetime value. Each metric can serve as a building block for a more nuanced, data-driven approach to customer relationship management.