To sign up, you must create an account; if you already have an account, you can log in.
Welcome to the Graphite Note documentation portal. These guides will show you how to predict, visualize, and analyze your data using machine learning with no code. Graphite Note is a powerful tool designed to democratize the power of data analysis and machine learning, making it accessible to individuals and teams of all skill levels. Whether you're a marketer looking to segment your audience, a sales team predicting lead conversions, or an operations manager forecasting product demand, Graphite Note is your go-to platform.
The platform is built with a user-friendly interface that allows you to connect your data, generate predictive models, and share your results with just a few clicks. It's not just about making predictions; it's about understanding them. With Graphite Note's built-in prescriptive analytics and data storytelling feature, you can transform complex data into meaningful narratives that drive strategic actions. This documentation will guide you through every step of the process, from setting up your account to making your first prediction. So let's dive in and start exploring the power of no-code machine learning with Graphite Note.
For every lexical term, you can check the machine learning glossary.
Each user in a team can have a role, which you can see on the “Users page” (more information under Team setup). There are two types of roles:
Administrator: can read and modify entities
Viewer: can only read entities, no editing allowed
The details are accessible through the Account tab in the top-right corner of Graphite Note, followed by selecting the Roles option from the drop-down menu.
If you want to change the properties of the viewer mode, you can click on the gear wheel to edit it.
From here you can deny access to certain modules or grant editing rights for them. You can also delete the role.
Create a Regression model on Demo Ads dataset
1. If you want to use Graphite Note demo datasets click "Import DEMO Dataset".
2. Select the dataset you want to use to create a machine learning model. In this case we will select the "Ads" dataset to create a Regression analysis on marketing ads data.
3. Once selected, the demo dataset will load directly to your account. The dataset view will automatically open.
4. Adjust your dataset options on the Settings tab. Click the Columns tab to view the list of available columns with their corresponding data types. Explore the dataset details on the Summary tab.
5. To create a new model, click "Models" in the Graphite Note main menu.
6. You will get a list of available models. Click "New Model" to create a new one.
7. Select model type from our templates. In our case, we will select "Regression" by double clicking on its name.
8. Select the dataset you want to use to produce a model. We will use "Demo-Ads.csv".
9. Name your new model. We will call it "Regression on Demo-Ads".
10. Write a description of the model and select a tag. If you want, you can also create a new tag from the pop-up "Tags" window that will appear on the screen.
11. Click "Create" to create your demo model environment.
12. To set up a Regression model, first define the "Target Feature". That is the numeric column from your dataset that you'd like to make predictions about. In the case of Regression on the Ads dataset, the target feature is the "Clicks" column.
13. Click "Next" to get the list of model features that will be included in the scenario. The model relies upon each column (feature) to make accurate predictions. When training the model, we will calculate which of the features are most important and behave as Key Drivers.
14. To start training the model, click "Run scenario". This will take a sample of 80% of your data and train several machine learning models (a conceptual sketch of this split follows this walkthrough).
15. Wait a few moments and voilà! Your Regression model is trained. Click the "Performance" tab to get model insights and view the Key Drivers.
16. Explore the Regression Model by clicking on Impact Analysis, Model Fit and Training Results to get more insights on how the model is trained and set up.
17. If you want to turn your model into action, click the "Predict" tab in the main model menu.
18. You can produce your own "What-If" analysis based on existing training results. You can also import a fresh CSV dataset into the data model to make predictions on the target column. In our case that is "Clicks". Keep in mind, the dataset you are uploading needs to contain the same feature columns as your model.
19. Use your model often to predict future behaviour, and to learn which key drivers are impacting the outcomes. The more you use and retrain your model, the smarter it becomes!
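Step 14 mentioned that "Run scenario" trains on an 80% sample of your data. For readers curious what that split looks like under the hood, here is a minimal Python sketch of the same idea using scikit-learn on an invented ads table (the column names and figures are assumptions for illustration, not the exact demo schema):

```python
# A minimal sketch of an 80/20 train/test split, the idea behind
# "Run scenario". All data and column names here are invented.
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

ads = pd.DataFrame({
    "ad_spend":    [100, 200, 300, 400, 500, 600, 700, 800, 900, 1000],
    "impressions": [10, 19, 31, 42, 48, 61, 69, 82, 88, 101],
    "clicks":      [1, 2, 3, 5, 5, 6, 7, 8, 9, 10],
})

X, y = ads[["ad_spend", "impressions"]], ads["clicks"]
# 80% of the rows train the model; the held-out 20% measures accuracy.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

model = LinearRegression().fit(X_train, y_train)
print("R^2 on the held-out 20%:", model.score(X_test, y_test))
```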
There are three plans for Graphite Note, suited to different audiences: individuals, teams, or enterprises.
To help you choose which plan matches your project, you can start with a free trial, schedule a demo, or book a meeting with an expert.
Create Regression model on Demo CO2 Emission dataset
1. If you want to use Graphite Note demo datasets click "Import DEMO Dataset".
2. Select the dataset you want to use to create the machine learning model. In this case, we will select CO2 Car Emissions dataset to create Regression Analysis on car emissions data.
3. Once selected, the demo dataset will load directly to your account. The dataset view will automatically open.
4. Adjust your dataset options on the Settings tab. Click the Columns tab to view the list of available columns with their corresponding data types. Explore the dataset details on the Summary tab.
5. To create a new model, click "Models" in the Graphite Note main menu.
6. You will get a list of available models. Click "New Model" to create a new one.
7. Select model type from our templates. In our case we will select "Regression" by double clicking on its name.
8. Select the dataset you want to use to produce the model. We will use "Demo-CO2-Car-Emissions-Canada.csv"
9. Name your new model. We will call it "Regression on Demo-CO2-Car-Emissions"
10. Write the model description and select a tag. If you want, you can also create a new tag from the pop-up "Tags" window that will appear on the screen.
11. Click "Create" to create your demo model environment.
12. To set up the Regression model, first define the "Target Feature". That is the numeric column from your dataset that you'd like to make predictions about. In the case of Regression on the car emissions dataset, the target feature is the "CO2 Emissions(g/km)" column.
13. Click "Next" to get the list of model features that will be included in the scenario. The model relies upon each column (feature) to make accurate predictions. When training the model, we will calculate which of the features are most important and behave as Key Drivers.
14. To start training the model, click "Run scenario". This will take a sample of 80% of your data and train several machine learning models.
15. Wait a few moments and voilà! Your Regression model is trained. Click the "Performance" tab to get model insights and view the Key Drivers.
16. Explore the Regression model by clicking on Impact Analysis and Training Results to get more insights on how the model is trained.
17. If you want to take your model into action, click on the "Predict" tab in the main model menu.
18. You can produce your own What-If analysis based on existing training results. You can also import a fresh CSV dataset that the model will use to make predictions on the target column. In our case that is "CO2 Emissions(g/km)". Keep in mind, the dataset you are uploading needs to contain the same feature columns as your model.
19. Use your model often to predict future behaviour and to learn which key drivers are impacting the outcomes. The more you use and retrain your model, the smarter it becomes!
Create Multi-Class Classification model on Demo Diamonds dataset
1. If you want to use Graphite Note demo datasets click "Import DEMO Dataset".
2. Select the dataset you want to use to create a machine learning model. In this case we will select the Diamonds dataset to create a Multi-Class Classification analysis on diamond characteristics data.
3. Once selected, the demo dataset will load directly to your account. The dataset view will automatically open.
4. Adjust your dataset options on the Settings tab. Click the Columns tab to view the list of available columns with their corresponding data types. Explore the dataset details on the Summary tab.
5. To create a new model, click "Models" in the Graphite Note main menu.
6. You will get a list of available models. Click "New Model" to create a new one.
7. Select a model type from our templates. In our case we will select "Multi-Class Classification" by double clicking on its name.
8. Select the dataset you want to use to produce the model. We will use "Demo-Diamonds.csv".
9. Name your new model. We will call it "Multi-Class Classification on Demo-Diamonds"
10. Write a description of the model and select a tag. If you want, you can also create a new tag from the pop-up "Tags" window that will appear on the screen.
11. Click "Create" to create your demo model environment.
12. To set up a Multi-Class Classification model, first define the "Target Feature". That is the text column from your dataset you'd like to make predictions about. In the case of Multi-Class Classification on the Diamonds dataset, the target feature is the "Cut" column.
13. Click "Next" to get the list of model features that will be included in the scenario. The model relies on each column (feature) to make accurate predictions. When training the model, we will calculate which of the features are most important and behave as Key Drivers.
14. To start training the model, click "Run Scenario". This will take a sample of 80% of your data and train several machine learning models.
15. Wait a few moments and voilà! Your Multi-Class Classification model is trained. Click the "Performance" tab to get model insights and view the Key Drivers.
16. Explore the Multi-Class Classification model by clicking on Impact Analysis, Model Fit, Accuracy Overview, or Training Results to get more insights on how the model is trained and set up.
17. If you want to turn your model into action, click the "Predict" tab in the main model menu.
18. You can produce your own What-If analysis based on existing training results. You can also import a fresh CSV dataset to make predictions on the target column. In our case that is "Cut". Keep in mind, the dataset you are uploading needs to contain the same feature columns as your model.
19. Use your model often to predict future behaviour and to learn which key drivers are impacting the outcomes. The more you use and retrain your model, the smarter it becomes!
You can access the Account information page by clicking the Account tab in the top-right of Graphite Note, and then the Account info drop-down item.
This page features your personal information, including your team name, your current plan, and its usage (the data allowance still available and the number of users). You can rename your team and contact sales to change your plan.
Create RFM Customer Segmentation on Demo eCommerce Orders dataset
1. If you want to use Graphite Note demo datasets click "Import DEMO Dataset".
2. Select the dataset you want to use to create a machine learning model. In this case we will select the eCommerce Orders dataset to create an RFM Customer Segmentation (Recency, Frequency, Monetary Value) analysis on ecommerce orders data.
3. Once selected, the demo dataset will load directly to your account. The dataset view will automatically open.
4. Adjust your dataset options on the Settings tab. Click the Columns tab to view the list of available columns with their corresponding data types. Explore the dataset details on the Summary tab.
5. To create a new model, click "Models" in the Graphite Note main menu.
6. You will get a list of available models. Click "New Model" to create a new one.
7. Select a model type from our templates. In our case we will select "RFM Customer Segmentation" by double clicking on its name.
8. Select the dataset you want to use to produce the model. We will use "Demo-eCommerce-Orders.csv".
9. Name your new model. We will call it "RFM customer segmentation on Demo-eCommerce-Orders".
10. Write a description of the model and select a tag. If you want, you can also create a new tag from the pop-up "Tags" window that will appear on the screen.
11. Click "Create" to create your demo model environment.
12. To set up the RFM model, you first need to identify and define a few parameters: "Time/Date Column", "Customer ID", "Customer Name" (optional), and "Monetary" (amount spent). In our case we will select "created_at" as the date, "user_id" as the customer, and "total" as the monetary parameter.
13. To start training the model, click "Run scenario".
14. Wait a few moments and voilà! Your RFM Customer Segmentation model is trained. Click the "Results" tab to get model insights.
15. You can navigate over the different tabs to get deep insights into the RFM analysis from different perspectives: Recency, Frequency, Monetary.
16. The "RFM Scores" tab shows a detailed explanation of the different scores, along with RFM segments and their descriptions.
17. The "RFM Analysis" tab gives you more details on the different segments.
18. The "RFM Matrix" tab shows the number of customers belonging to each RFM segment. You can export the matrix data to use the Customer IDs for different business actions (e.g. exporting a list of customers about to churn).
The "What Dataset Do I Need?" section of Graphite Note is a comprehensive resource designed to guide users through the intricacies of dataset selection and preparation for various machine learning models. This section is crucial for users, especially those without extensive AI expertise, as it provides clear, step-by-step instructions and examples on how to curate and structure data for different predictive analytics scenarios.
Key Features of the Section
Model-Specific Guidance: Each page within this section is tailored to a specific predictive model, such as cross-selling prediction, churn prediction, or customer segmentation. It outlines the type of data required, the format, and how to interpret and use the data effectively.
Sample Datasets and Templates: To make the process more user-friendly, the section includes sample datasets and templates. These examples showcase the necessary columns and data types, along with a brief explanation of each, helping users to model their datasets accurately.
Target Column Identification: A crucial aspect of preparing a dataset for machine learning is identifying the target column. This section provides clear guidance on selecting the appropriate target for different types of analyses, whether it's for classification, regression, or clustering.
Data Cleaning and Preparation Tips: Recognizing that data rarely comes in a ready-to-use format, this section offers valuable tips on cleaning and preparing data, ensuring that users start their predictive analytics journey on the right foot.
Real-World Applications and Use Cases: To bridge the gap between theory and practice, the section includes examples of real-world applications and use cases. This approach helps users understand how their data preparation efforts translate into actionable insights in various business contexts.
This page contains the most frequently asked questions.
Predictive analytics is a form of advanced analytics that uses both new and historical data to forecast future activity, behavior, and trends. It involves applying statistical analysis techniques, analytical queries, and automated machine learning algorithms to data sets to create predictive models that place a numerical value — or score — on the likelihood of a particular event happening.
Prescriptive analytics is a form of advanced analytics that examines data or content to answer the question "What should be done?" or "What can we do to make 'X' happen?". It is characterized by techniques such as graph analysis, simulation, complex event processing, neural networks, recommendation engines, heuristics, and machine learning.
At Graphite Note, we take data security very seriously. We employ robust security measures to ensure your data is protected at all times. This includes encryption of data at rest and in transit, regular security audits, and strict access controls. Read more here.
Graphite Note is designed to work with a wide range of data types. You can import data from various sources such as CSV files, databases, and data warehouses. The platform can handle structured, tabular data (like numerical and categorical data).
Importing data into Graphite Note is a straightforward process. You can upload data directly from your computer, or connect to a database or data warehouse. Our platform supports a variety of data formats, including CSV files and SQL databases.
We offer a range of support options to help you get the most out of Graphite Note. This includes a comprehensive knowledge base, video tutorials, and email support. Our dedicated support team is always ready to assist you with any questions or issues you may have.
Absolutely! Graphite Note is designed to be user-friendly and accessible to everyone, regardless of their technical background. Our no-code platform allows you to generate predictive and prescriptive analytics without needing to write a single line of code.
Graphite Note is versatile and can be beneficial to a wide range of industries. This includes but is not limited to retail, e-commerce, marketing, sales, finance, healthcare, and manufacturing. Any industry that relies on data to make informed decisions can benefit from our platform.
Graphite Note is a flexible platform that can be tailored to meet your specific business needs. Whether you're looking to improve customer retention, optimize your marketing campaigns, forecast sales, or identify trends, our platform can provide the insights you need to drive growth.
We offer a variety of resources to help new users get started with Graphite Note. This includes step-by-step tutorials, webinars, and a comprehensive knowledge base. We're committed to helping you get the most out of our platform and will work with you during onboarding.
You can easily re-upload a CSV file with Graphite Note here.
A tag is a keyword associated with a model or dataset. It is a tool to group your models to easily find them, as you can filter your list by tags.
Also, you can create tags and manage them by clicking on the Account tab in the top-right of Graphite Note, and then the Tags drop-down item. You can change its name, its color, and its description or delete it.
You can also create a tag directly when you are importing a dataset or a model by clicking on Select tag and then Create & Apply or you can choose an existing one and Apply it.
In programming, parsing refers to the process of analyzing and interpreting the structure of a sequence of characters or symbols according to a specific grammar or syntax. It is used in our application to understand and extract meaningful information from input data.
During parsing, a parser takes the input, which can be a program's source code or any other form of textual data, and breaks it down into a hierarchical structure that conforms to a predefined set of rules or grammar. This hierarchical structure is typically represented using a data structure such as an abstract syntax tree (AST) or a parse tree.
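As a small, general illustration of parsing (not Graphite Note's internal parser), Python's standard-library ast module turns source text into an abstract syntax tree:

```python
import ast

# Parse a small expression into an abstract syntax tree (AST).
tree = ast.parse("price * quantity + tax", mode="eval")

# The hierarchical structure reflects the grammar: the '+' node is the
# root, with the 'price * quantity' subtree nested beneath it.
print(ast.dump(tree, indent=2))
```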
Users can start their free trial of the SPROUT plan immediately. The trial is valid for 7 days. After seven days, if you want to continue using our service, you must subscribe to a plan by contacting our sales team.
The starter plan is primarily designed for individual users who want to upload CSV files and create machine learning models.
The starter plan has the same core functionality as higher plans, but with the following limitations:
Only one user in the workspace
Only the CSV connector
Total data source rows limited to 50k
APIs enable you to pull your predictions into your ERP, CRM, internal app, or website. It is a way to process or display predictions outside your Graphite Note account.
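As a hypothetical sketch of what that looks like from the caller's side, the snippet below posts a row of feature values to a prediction endpoint over HTTP. The URL, authentication scheme, and payload shape are placeholders invented for illustration; consult the API documentation in your account for the real details.

```python
import requests

# Hypothetical endpoint, key, and payload -- not the actual API schema.
API_URL = "https://example.com/api/v1/models/<model-id>/predict"
API_KEY = "<your-api-key>"

response = requests.post(
    API_URL,
    headers={"Authorization": f"Bearer {API_KEY}"},
    json={"rows": [{"ad_spend": 500, "impressions": 48}]},
    timeout=30,
)
response.raise_for_status()
# Predictions are returned as JSON, ready for your ERP, CRM, or website.
print(response.json())
```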
With Graphite Note, you have the option to add a Dedicated Data Scientist to your team. This is an expert in machine learning and data science who can assist you and your team with any questions or concerns you may have. They can also provide hands-on support with tasks such as data cleaning and improving the performance of your models.
We can extend your trial beyond the one-week default period in certain circumstances. Don't hesitate to get in touch with us before the end of your trial if you'd like to discuss this further.
Graphite Note runs on platforms belonging to reputable leading service providers and vendors that uphold the highest security standards, specifically Amazon Web Services (AWS).
Create Binary Classification model on Demo Churn dataset
Get an overview of the Customer Churn demo dataset and how it can be used to create your new Graphite Note model in this video:
Or follow the instructions below for step-by-step guidance on how to use the Customer Churn demo dataset:
1. If you want to use Graphite Note demo datasets, click "Import DEMO Dataset".
2. Select the dataset you want to use to create a machine learning model. In this case we will select the Churn dataset to create a binary classification analysis on customer engagement data.
3. Once selected, the demo dataset will load into your account. The dataset view will automatically open.
4. Adjust your dataset options on the Settings tab. Click the Columns tab to view the list of available columns with their corresponding data types. Explore the dataset details on the Summary tab.
5. To create a new model, click "Models" in the Graphite Note main menu.
6. You will get a list of available models. Click "New Model" to create a new one.
7. Select model type from our templates. In our case we will select "Binary Classification" by double clicking on its name.
8. Select the dataset you want to use to produce the model. We will use "Demo-Churn.csv".
9. Name your new model. We will call it "Binary Classification on Demo-Churn".
10. Write a description of the model and select a tag. If you want to, you can also create a new tag from the pop-up "Tags" window that will appear on the screen.
11. Click "Create" to create your demo model environment.
12. To set up a Binary Classification model, first define the "Target Feature". That is the binary column from your dataset that you'd like to make predictions about. In the case of Binary Classification on the Churn dataset, the target feature will be the "Churn" column.
13. Click "Next" to get the list of model features that will be included in the scenario. The model relies upon each column (feature) to make accurate predictions. When training the model, we will calculate which of the features are most important and behave as Key Drivers.
14. To start training the model, click "Run scenario". This will take a sample of 80% of your data and train several machine learning models.
15. Wait a few moments and voilà! Your Binary Classification model is trained. Click the "Performance" tab to get model insights and view the Key Drivers.
16. Explore the Binary Classification model by clicking on Impact Analysis and Training Results to get more insights on how the model is trained.
17. If you want to turn your model into action, click the "Predict" tab in the main model menu.
18. You can produce your own "What-If" analysis based on existing training results. You can also import a fresh CSV dataset with data the model will use to make predictions on the target column. In our case that is "Churn". Keep in mind, the dataset you are uploading needs to contain the same feature columns as your model.
19. Use your model often to predict future behaviour and to learn which key drivers are impacting the outcomes. The more you use and retrain your model, the smarter it becomes!
Create Timeseries Forecast on Demo Monthly Car Sales dataset
1. If you want to use Graphite Note demo datasets click "Import DEMO Dataset".
2. Select the dataset you want to use to create your advanced analytics model. In this case, we will select the Monthly Car Sales dataset to create a "Timeseries Forecast" analysis on car sales data.
3. Once selected, the demo dataset will load directly to your account. The dataset view will automatically open.
4. Adjust your dataset options on the Settings tab. Click the Columns tab to view the list of available columns with their corresponding data types. Explore the dataset details on the Summary tab.
5. To create a new model in the Graphite Note main menu, click on "Models".
6. You will get a list of available models. Click "New Model" to create a new one.
7. Select the model type from our templates. In our case, we will select "Timeseries Forecast" by double clicking on its name.
8. Select the dataset you want to use to produce the model. We will use "Demo-Monthly-Car-Sales.csv".
9. Name your new model. We will call it "Timeseries forecast on Demo-Monthly-Car-Sales".
10. Write a description of the model and select a tag. If you want to, you can also create a new tag from the pop-up "Tags" window that will appear on the screen.
11. Click "Create" to create your demo model environment.
12. To set up the Timeseries Forecast analysis, first define the "Target Column". That is a numeric column from your dataset that you'd like to forecast. In the case of Timeseries on the monthly car sales dataset, the target column is "Sales".
13. If the dataset includes multiple time series sequences, you can select the field that will be used to uniquely identify each sequence. In the case of our demo dataset, we will not apply the Sequence Identifier field since we have only the "Sales" target column.
14. Click "Next" to open the "Time/Date Column" selection. Choose "Month" as the date column.
15. From the additional options below, choose "Monthly" as the time interval and define the "Forecast Horizon". We will set the forecast horizon to 6 months in the future.
16. Click "Next" to activate the "Seasonality" options step. Here, you can define the seasonality specifics of your forecast. If the time interval is set to daily, you will also have "Advanced options" available on the next step.
17. Click "Run Scenario" to train your timeseries forecast.
18. Wait a few moments and voilà! Your Timeseries forecast is trained. Click the "Performance" tab to get insights and view the graph with the original (historical) and predicted model data.
19. Explore more details on the "Trend", "Seasonality", and "Details" tabs.
20. If you want to turn your model into action, click the "Predict" tab in the main model menu.
21. You can produce your own forecast analysis based on the existing training results by selecting a Start and End date from the drop-down calendar and clicking the "Predict" button.
22. Use your model often to predict future sales results. The more you use and retrain your model, the smarter it becomes!
In your Account information, you can view the number of users allowed based on your plan. To invite users to your team, ensure you have available user slots and click on Invite user.
You will then be redirected to the “Users page” accessible through the Account tab in the top-right corner of Graphite Note, followed by selecting the Users option from the drop-down menu. On this page, you can see all the information on each user of your team.
From there, click on Invite user in the top-right corner of the page, add the email address, select the desired role, and then select Invite.
Create a Regression model on Demo Marketing Mix dataset
1. If you want to use Graphite Note demo datasets click "Import DEMO Dataset".
2. Select the dataset you want to use to create your machine learning model. In this case, we will select the MMM dataset to create a Regression analysis on marketing mix and sales data.
3. Once selected, the demo dataset will load directly to your account. The dataset view will automatically open.
4. Adjust your dataset options on the Settings tab. Click the Columns tab to view the list of available columns with their corresponding data types. Explore the dataset details on the Summary tab.
5. To create a new model, click "Models" in the Graphite Note main menu.
6. You will get a list of available models. Click "New Model" to create a new one.
7. Select your model type from our templates. In our case we will select "Regression" by double clicking on its name.
8. Select the dataset you want to use to produce the model. We will use "Demo-MMM.csv".
9. Name your new model. We will call it "Regression on Demo-MMM".
10. Write a description of the model and select a tag. If you want, you can also create a new tag from the pop-up "Tags" window that will appear on the screen.
11. Click "Create" to create your demo model environment.
12. To set up your Regression model, first define the "Target Feature". That is the numeric column from your dataset that you'd like to make predictions about. In the case of Regression on the Marketing Mix and Sales dataset, the target feature is the "Sales" column.
13. Click "Next" to get the list of model features that will be included in the scenario. The model relies upon each column (feature) to make accurate predictions. When training the model, we will calculate which of the features are most important and behave as Key Drivers.
14. To start training the model, click "Run Scenario". This will take a sample of 80% of your data and train several machine learning models.
15. Wait a few moments and voilà! Your Regression model is trained. Click the "Performance" tab to get model insights and view the Key Drivers.
16. Explore the Regression model by clicking on Impact Analysis, Model Fit, and Training Results to get more insights on how the model is trained and set up.
17. If you want to turn your model into action, click the "Predict" tab in the main model menu.
18. You can produce your own What-If analysis based on existing training results. You can also import a fresh CSV dataset with data the model will use to make predictions on the target column. In our case, that is "Sales". Keep in mind, the dataset you are uploading needs to contain the same feature columns as your model.
19. Use your model often to predict future behaviour and to learn which key drivers are impacting the outcomes. The more you use and retrain your model, the smarter it becomes!
Create Binary Classification model on Demo Upsell dataset
1. If you want to use Graphite Note demo datasets click "Import DEMO Dataset".
2. Select the dataset you want to use to create your machine learning model. In this case we will select the Upsell dataset to create a binary classification analysis on data about customers' additional purchases.
3. Once selected, the demo dataset will automatically load into your account and the dataset view will open immediately.
4. Adjust your dataset options on the Settings tab. Click Columns tab to view the list of available columns with their corresponding data types. Explore dataset details on Summary tab.
5. To create a new model in the Graphite Note main menu, click on "Models"
6. You will get a list of available models. Click on "New Model" to create a new one.
7. Select the model type from our templates. In our case we will select "Binary Classification" by double clicking on its name.
8. Select the dataset you want to use to produce the model. We will use "Demo-Upsell.csv".
9. Name your new model. We will call it "Binary Classification on Demo-Upsell".
10. Write a description of the model and select a tag. If you want, you can also create a new tag from the pop-up "Tags" window that will appear on the screen.
11. Click "Create" to create your demo model environment.
12. To set up a Binary Classification model, first define the "Target Feature". That is the binary column from your dataset that you'd like to make predictions about. In the case of Binary Classification on the Upsell dataset, the target feature will be the "Applied" column.
13. Click "Next" to get the list of model features that will be included in the model scenario. Your model relies upon each column (feature) to make accurate predictions. When training the model we will calculate which of the features are most important and behave as Key Drivers.
14. To start training your model click "Run scenario". This will take a sample of 80% of your data and train several machine learning models.
15. Wait a few moments and voilà! Your Binary Classification model is trained. Click the "Performance" tab to get model insights and view the Key Drivers.
16. Explore the Binary Classification model by clicking on Impact Analysis and Training Results to get more insights on how the model is trained.
17. If you want to turn your model into action, click the "Predict" tab in the main model menu.
18. You can produce your own What-If analysis based on existing training results. You can also import a fresh CSV dataset to make predictions on the target column. In our case that is "Applied". Keep in mind, the dataset you are uploading needs to contain the same feature columns as your model.
19. Use your model often to predict future behaviour and to learn which key drivers are impacting the outcomes. The more you use and retrain your model, the smarter it becomes!
Create Binary Classification model on Demo Lead Scoring dataset
Get an overview of the Lead Scoring demo dataset and how it can be used to create your new Graphite Note model in this video:
Or follow the instructions below for step-by-step guidance on how to use the Lead Scoring demo dataset:
1. If you want to use Graphite Note demo datasets click "Import DEMO Dataset".
2. Select the dataset you want to use to create the machine learning model. In this case, we will select the Lead Scoring dataset to create a binary classification analysis on potential customer interaction data.
3. Once selected, the demo dataset will load directly to your account. The Dataset view will automatically open.
4. Adjust your dataset options on the Settings tab. Click Columns tab to view the list of available columns with their corresponding data types. Then explore the dataset details on Summary tab.
5. To create a new model, click "Models" in the Graphite Note main menu.
6. You will get a list of available models. Click "New Model" to create a new one.
7. Select the model type from our templates. In our case, we will select "Binary Classification" by double clicking on its name.
8. Select the dataset you want to use to produce the model. We will use "Demo-Lead-Scoring.csv".
9. Name your new model. We will call it "Binary Classification on Demo-Lead-Scoring".
10. Write a description of the model and select a tag. If you want to, you can also create a new tag from the pop-up "Tags" window that will appear on the screen.
11. Click "Create" to create your demo model environment.
12. To set up a Binary Classification model, firstly, you need to define the "Target Feature". That is a binary column from your dataset that you'd like to make predictions about. In the case of Binary Classification on a Lead Scoring dataset, the target feature will be the "Converted" column.
13. Click "Next" to get the list of model features that will be included in the model scenario. The model relies upon each column (feature) to make accurate predictions. When training the model, it will calculate which of the features are most important and behave as Key Drivers.
14. To start training the model, click "Run Scenario". This will take a sample of 80% of your data and train several machine learning models.
15. Wait a few moments and voilà! Your Binary Classification model is trained. Click the "Performance" tab to get model insights and to view the Key Drivers.
16. Explore the "Binary Classification" model by clicking on the Impact Analysis and Training Results to get more insights on how the model is trained.
17. If you want to turn your model into action, click on "Predict" tab in the main model menu.
18. You can produce your own What-If analysis based on existing training results. You can also import a fresh CSV dataset to make predictions on the target column. In our case that is "Converted". Keep in mind, the dataset you are uploading needs to contain the same feature columns as your model.
19. Use your model often to predict future behaviour, and to learn which key drivers are impacting outcomes. The more you use and retrain your model, the smarter it becomes!
Create a Regression model on Demo Store Item Demand dataset
1. If you want to use Graphite Note demo datasets click "Import DEMO Dataset".
2. Select the dataset you want to use to create your machine learning model. In this case, we will select the Store Item Demand dataset to create a Regression analysis on sales data across store locations.
3. Once selected, the demo dataset will load directly to your account. The dataset view will automatically open.
4. Adjust your dataset options on the Settings tab. Click the Columns tab to view the list of available columns with their corresponding data types. Explore the dataset details on the Summary tab.
5. To create a new model in the Graphite Note main menu, click on "Models".
6. You will get a list of available models. Click "New Model" to create a new one.
7. Select model type from our templates. In our case we will select "Regression" by double clicking on its name.
8. Select the dataset you want to use to produce the model. We will use "Demo-Store-Item-Demand.csv".
9. Name your new model. We will call it "Regression on Demo-Store-Item-Demand".
10. Write a description of the model and select a tag. If you want to, you can also create a new tag from the pop-up "Tags" window that will appear on the screen.
11. Click "Create" to create your demo model environment.
12. To set up the Regression model, first define the "Target Feature". That is the numeric column from your dataset that you'd like to make predictions about. In the case of Regression on the Store Item Demand dataset, the target feature is the "Sales" column.
13. Click "Next" to get the list of model features that will be included in the model scenario. The model relies on each column (feature) to make accurate predictions. When training the model, we will calculate which of the features are most important and behave as the Key Drivers.
14. To start training your model click "Run scenario". This will take a sample of 80% of your data and train several machine learning models.
15. Wait a few moments and voilà! Your Regression model is trained. Click the "Performance" tab to get model insights and view the Key Drivers.
16. Explore the Regression model by clicking on Impact Analysis, Model Fit, and Training Results to get more insights on how the model is trained and set up.
17. If you want to turn your model into action, click the "Predict" tab in the main model menu.
18. You can produce your own What-If analysis based on existing training results. You can also import a fresh CSV dataset to make predictions on the target column. In our case that is "Sales". Keep in mind, the dataset you are uploading needs to contain the same feature columns as your model.
19. Use your model often to predict future behaviour and to learn which key drivers are impacting the outcomes. The more you use and retrain your model, the smarter it becomes!
Predict Revenue is a critical task for businesses aiming to forecast future revenue streams accurately. This challenge is typically addressed using a time series forecasting model, which analyzes historical revenue data to predict future trends and patterns.
Dataset Essentials for Predict Revenue
A suitable dataset for Predict Revenue using time series forecasting should include:
Date/Time: The timestamp of revenue data, usually in daily, weekly, or monthly intervals.
Revenue: The total revenue recorded in each time period.
Seasonal Factors: Data on seasonal variations or events that might affect revenue.
Economic Indicators: Relevant economic factors that could influence revenue trends.
Marketing Spend: Information on marketing and advertising expenditures, if applicable.
An example dataset for Predict Revenue with time series forecasting might look like this:
Target Column: The Total Revenue column is the primary focus, as the model aims to forecast future values in this series.
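To make the forecasting step concrete, here is a hedged Python sketch that fits a trend-aware Holt-Winters model to an invented monthly revenue series and forecasts six months ahead (the figures and the choice of model are illustrative assumptions, not what Graphite Note runs internally):

```python
import pandas as pd
from statsmodels.tsa.holtwinters import ExponentialSmoothing

# Invented monthly revenue with a mild upward trend (illustrative only).
revenue = pd.Series(
    [100, 104, 110, 108, 115, 121, 126, 124, 131, 138, 142, 147,
     151, 155, 162, 160, 168, 175],
    index=pd.date_range("2022-01-01", periods=18, freq="MS"),
)

# Fit an additive-trend exponential smoothing model and forecast the
# next 6 months -- a 6-period forecast horizon.
model = ExponentialSmoothing(revenue, trend="add").fit()
print(model.forecast(6))
```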
Steps to Success with Graphite Note
Data Collection: Compile historical revenue data along with any relevant external factors.
Time Series Analysis: Utilize Graphite Note to analyze the time series data and identify patterns.
Model Training: Train a time series forecasting model using the platform.
Model Evaluation: Continuously evaluate and adjust the model based on new data and changing market conditions.
Benefits of Predict Revenue with Time Series Forecasting
Accurate Financial Planning: Enables more precise budgeting and financial planning.
Strategic Decision Making: Informs strategic decisions with insights into future revenue trends.
Adaptability to Market Changes: Helps businesses adapt strategies in response to predicted market changes.
User-Friendly Analytics: Graphite Note's no-code approach makes sophisticated time series forecasting accessible to users without specialized statistical knowledge.
In summary, Predict Revenue with time series forecasting is an essential tool for businesses to anticipate future revenue trends. Graphite Note simplifies this complex task, allowing businesses to leverage their historical data for insightful and actionable revenue predictions.
Predictive Lead Scoring is a technique used to rank leads in terms of their likelihood to convert into customers. This approach typically employs a binary classification model, where each lead is classified as 'high potential' or 'low potential' based on various attributes and behaviors.
Dataset Essentials for Predictive Lead Scoring
To effectively implement Predictive Lead Scoring, a dataset with the following elements is essential:
Lead Demographics: Information such as age, location, and job title.
Engagement Metrics: Data on how the lead interacts with your business, like website visits, email opens, and download history.
Lead Source: The origin of the lead, such as organic search, referrals, or marketing campaigns.
Previous Interactions: History of past interactions, including calls, emails, or meetings.
Purchase History: If applicable, details of past purchases or subscriptions.
An example dataset for Predictive Lead Scoring might look like this:
Target Column: The Converted column is the target variable. It indicates whether the lead converted to a customer.
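As a rough illustration of the underlying technique, the sketch below trains a logistic-regression classifier on an invented leads table and scores each lead with a conversion probability (all column names and values are assumptions for illustration):

```python
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Invented leads table -- not a real schema.
leads = pd.DataFrame({
    "website_visits": [1, 8, 2, 12, 0, 9, 3, 15, 1, 11],
    "email_opens":    [0, 5, 1, 7, 0, 6, 2, 9, 0, 8],
    "converted":      [0, 1, 0, 1, 0, 1, 0, 1, 0, 1],
})

X, y = leads[["website_visits", "email_opens"]], leads["converted"]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0, stratify=y
)

clf = LogisticRegression().fit(X_train, y_train)
# predict_proba yields a conversion probability per lead -- the "score".
print(clf.predict_proba(X_test)[:, 1])
```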
Steps to Success with Graphite Note
Data Collection: Gather detailed and relevant data on leads.
Feature Selection: Choose the most predictive features for lead scoring.
Model Training: Utilize Graphite Note to train a binary classification model.
Model Evaluation: Test and refine the model for optimal performance.
Benefits of Predictive Lead Scoring
Efficient Lead Management: Prioritize leads with the highest conversion potential, optimizing sales efforts.
Personalized Engagement: Tailor interactions based on the lead's predicted preferences and potential.
Resource Optimization: Allocate marketing and sales resources more effectively.
Accessible Analytics: Graphite Note's no-code platform makes predictive lead scoring accessible to teams without deep technical expertise.
In summary, Predictive Lead Scoring is a powerful tool for optimizing sales and marketing strategies. With Graphite Note, businesses can leverage advanced analytics to score leads effectively, enhancing their conversion rates and overall efficiency.
Product Demand Forecast is a crucial process for businesses to predict future demand for their products. This task typically involves time series forecasting models, which analyze historical sales data to forecast future demand patterns.
Dataset Essentials for Product Demand Forecast
An effective dataset for Product Demand Forecast using time series forecasting should include:
Date/Time: The timestamp for each data point, typically daily, weekly, or monthly.
Product Sales: The number of units sold or the sales volume of each product.
Product Features: Characteristics of the product, such as category, price, or any special features.
Promotional Activities: Data on any marketing or promotional activities that might affect sales.
External Factors: Information on external factors like market trends, economic conditions, or seasonal events.
An example dataset for Product Demand Forecast might look like this:
Target Column: The Sales Volume column is the primary focus, as the model aims to forecast future sales volumes for each product.
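As a toy illustration of the forecasting idea, the sketch below computes a naive per-product baseline (the mean of the last three observed weeks) on invented sales rows; real demand models additionally capture trend, seasonality, and promotions:

```python
import pandas as pd

# Invented weekly sales for two products (illustrative only).
sales = pd.DataFrame({
    "week": pd.to_datetime(
        ["2024-01-01", "2024-01-08", "2024-01-15", "2024-01-22"] * 2
    ),
    "product": ["A"] * 4 + ["B"] * 4,
    "volume":  [120, 130, 125, 140, 60, 58, 65, 70],
})

# Naive baseline: forecast each product's next week as the mean of its
# last three observed weeks.
forecast = (
    sales.sort_values("week")
         .groupby("product")["volume"]
         .apply(lambda s: s.tail(3).mean())
)
print(forecast)
```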
Steps to Success with Graphite Note
Data Collection: Gather detailed sales data along with product features and external factors.
Time Series Analysis: Use Graphite Note to analyze the sales data over time, identifying trends and patterns.
Model Training: Train a time series forecasting model on the platform.
Model Evaluation: Regularly evaluate the model's performance and adjust it based on new data and market changes.
Benefits of Product Demand Forecast
Inventory Management: Helps in planning inventory levels to meet future demand, avoiding stockouts or overstock situations.
Strategic Marketing: Informs marketing strategies by predicting when demand for certain products will increase.
Resource Allocation: Assists in allocating resources efficiently based on predicted product demand.
Accessible Forecasting: Graphite Note's no-code platform makes advanced forecasting techniques accessible to a wider range of users.
In summary, Product Demand Forecast is vital for businesses to anticipate market demand and plan accordingly. With Graphite Note, this complex analytical task becomes manageable, enabling businesses to leverage their data for effective demand planning and strategic decision-making.
Media Mix Modeling (MMM) is a statistical analysis technique used to quantify the impact of various marketing channels on sales and other key performance indicators (KPIs). It helps businesses allocate their marketing budget more effectively by understanding the contribution of each channel to overall performance.
Dataset Essentials for Media Mix Modeling
A robust dataset for Media Mix Modeling should include:
Time Period: The specific dates or periods for which the data is collected.
Marketing Spend: The amount spent on each marketing channel during the period.
Sales Data: The total sales achieved in the same time period.
Channel Performance Metrics: Metrics like impressions, clicks, conversions, etc., for each channel.
External Factors: Information on external factors like economic conditions, competitor activities, or seasonal events.
Market Dynamics: Changes in market conditions, customer preferences, or product availability.
An example dataset for Media Mix Modeling might look like this:
Target Column: Total Sales
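To show the regression idea behind MMM in miniature, the sketch below fits a linear model that relates invented weekly channel spend to total sales; each coefficient approximates the incremental sales per unit of spend in that channel (data and column names are assumptions for illustration):

```python
import pandas as pd
from sklearn.linear_model import LinearRegression

# Invented weekly spend per channel and total sales (illustrative only).
mmm = pd.DataFrame({
    "tv_spend":     [20, 25, 30, 35, 40, 45, 50, 55],
    "search_spend": [10, 12, 9, 14, 16, 15, 18, 20],
    "social_spend": [5, 7, 6, 8, 9, 11, 10, 12],
    "total_sales":  [200, 230, 235, 270, 300, 315, 345, 375],
})

X, y = mmm[["tv_spend", "search_spend", "social_spend"]], mmm["total_sales"]
model = LinearRegression().fit(X, y)

# Each coefficient estimates incremental sales per unit of channel spend.
print(dict(zip(X.columns, model.coef_.round(2))))
```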
Steps to Success with Graphite Note
Data Compilation: Gather comprehensive data across all marketing channels and corresponding sales data.
Model Development: Use Graphite Note, Regression Model, to develop a statistical model that correlates marketing spend across various channels with sales outcomes.
Analysis and Insights: Analyze the model's output to understand the effectiveness of each marketing channel.
Strategic Decision Making: Apply these insights to optimize future marketing spends and strategies.
Benefits of Media Mix Modeling
Optimized Marketing Budget: Allocate marketing budgets more effectively across channels.
ROI Analysis: Understand the return on investment for each marketing channel.
Strategic Planning: Plan marketing strategies based on data-driven insights.
Adaptability: Adjust marketing strategies in response to changing market conditions and consumer behaviors.
Accessible Advanced Analytics: Graphite Note's no-code platform makes complex MMM accessible to teams without specialized statistical knowledge.
In summary, Media Mix Modeling is a powerful tool for businesses to optimize their marketing strategies based on comprehensive data analysis. With Graphite Note, this advanced capability becomes accessible, allowing for more informed and effective marketing budget allocation.
Predicting customer churn is a critical challenge for businesses aiming to retain their customers and reduce turnover. This problem typically involves a binary classification model, where the goal is to predict whether a customer is likely to leave or discontinue their use of a service or product in the near future.
Dataset Essentials for Customer Churn Prediction
A well-structured dataset is key to accurately predicting customer churn. Essential data elements include:
Customer Demographics: Age, gender, and other demographic factors that might influence customer loyalty.
Usage Patterns: Data on how frequently and in what manner customers use the product or service.
Customer Service Interactions: Records of customer support interactions, complaints, and resolutions.
Transaction History: Details of customer purchases, payment methods, and transaction frequency.
Engagement Metrics: Measures of customer engagement, such as email opens, website visits, or app usage.
A typical dataset for churn prediction might look like this:
Target Column: The Churned column is the target variable, indicating whether the customer has churned (Yes) or not (No).
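For intuition, here is a hedged sketch of a binary churn classifier on an invented customer table; the feature importances it reports are loosely analogous to the Key Drivers surfaced in the platform (schema and values are assumptions):

```python
import pandas as pd
from sklearn.ensemble import RandomForestClassifier

# Invented customer rows -- not a real schema.
customers = pd.DataFrame({
    "tenure_months":   [2, 36, 5, 48, 1, 24, 3, 60, 6, 30],
    "support_tickets": [4, 0, 3, 1, 5, 1, 4, 0, 2, 1],
    "churned":         [1, 0, 1, 0, 1, 0, 1, 0, 1, 0],
})

X, y = customers[["tenure_months", "support_tickets"]], customers["churned"]
clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)

# Importances hint at which features drive churn predictions.
print(dict(zip(X.columns, clf.feature_importances_.round(2))))
```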
Steps to Success with Graphite Note
Data Gathering: Collect comprehensive and relevant customer data.
Feature Engineering: Identify and create features that are most indicative of churn.
Model Training: Use Graphite Note to train a binary classification model on your dataset.
Model Evaluation: Test the model's performance and refine it for better accuracy.
Benefits of Predicting Customer Churn
Proactive Customer Retention: Identifying at-risk customers allows businesses to take proactive steps to retain them.
Improved Customer Experience: Insights from churn prediction can guide improvements in products and services.
Cost Efficiency: Retaining existing customers is often more cost-effective than acquiring new ones.
Accessible Analytics: Graphite Note's no-code platform makes predictive analytics accessible, enabling businesses of all sizes to leverage AI for customer retention.
In summary, the Predict Customer Churn model is an invaluable tool for businesses focused on customer retention. Through Graphite Note, this advanced predictive capability becomes accessible to businesses without the need for extensive technical expertise, allowing them to make informed, data-driven decisions for customer retention strategies.
RFM Customer Segmentation: An Overview
RFM (Recency, Frequency, Monetary) customer segmentation is a method businesses use to categorize customers based on their purchasing behavior. This approach helps personalize marketing strategies, improve customer engagement, and increase sales.
The segmentation is based on three criteria:
Recency: How recently a customer made a purchase.
Frequency: How often they make purchases.
Monetary Value: How much money they spend.
Essential Dataset Components for RFM Segmentation
A robust dataset for effective RFM segmentation includes the following key elements:
Date (Recency): The date of each customer's last transaction, essential for assessing the 'Recency' aspect of RFM.
Customer ID: A unique identifier for each customer, crucial for tracking individual purchasing behaviors.
Monetary Spent (Monetary Value): The total amount spent by the customer in each transaction, to evaluate the 'Monetary' component of RFM.
Example Dataset for RFM Customer Segmentation
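As a stand-in for the example table, the sketch below builds a tiny invented orders table and derives the three RFM quantities per customer with pandas (all values are illustrative assumptions):

```python
import pandas as pd

# Invented transactions: one row per order (illustrative only).
orders = pd.DataFrame({
    "customer_id": [1, 1, 2, 2, 2, 3],
    "created_at":  pd.to_datetime(["2024-05-01", "2024-06-20", "2024-03-15",
                                   "2024-04-02", "2024-06-25", "2024-01-10"]),
    "total":       [50.0, 80.0, 20.0, 35.0, 25.0, 200.0],
})

today = pd.Timestamp("2024-07-01")
rfm = orders.groupby("customer_id").agg(
    recency=("created_at", lambda d: (today - d.max()).days),  # days since last order
    frequency=("created_at", "count"),                         # number of orders
    monetary=("total", "sum"),                                 # total spent
)
print(rfm)
```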
Steps to Success with Graphite Note for RFM Segmentation
Data Collection: Gather comprehensive data including customer IDs, transaction dates, and amounts spent.
Data Analysis: Utilize Graphite Note to dissect the data, focusing on recency, frequency, and monetary values of customer transactions.
Segmentation Modeling: Employ models to segment customers based on RFM criteria, facilitating targeted marketing strategies.
Benefits of RFM Segmentation Using Graphite Note
Enhanced Marketing Strategies: Tailor marketing campaigns based on customer segments.
Improved Customer Engagement: Customize interactions based on individual customer behaviors.
Efficient Resource Allocation: Focus efforts on the most profitable customer segments.
Strategic Business Decisions: Make informed choices regarding customer relationship management and retention strategies.
In conclusion, RFM Customer Segmentation is a powerful approach for businesses seeking to understand and cater to their customers more effectively. Graphite Note offers a no-code platform that simplifies the analysis of customer data for RFM segmentation, enabling businesses to leverage their data for strategic advantage in customer engagement and retention.
When you open a dataset, you have five different tabs: Settings, Columns, View Data, Summary, and Association.
First, on the Settings tab, you can re-upload the dataset, rename it, and change the description and the tag. You also have the information on the type, the ID, the creation date, and the updated date.
The Columns tab provides the original name, the column name, the data type, and the data format of each column, all of which you can modify.
On the View Data tab, you have all the data with the number of columns and rows.
The Summary tab gives a simple analysis of each column with a graph.
For numerical columns, it counts the number of null values and calculates the sum, the mean, the standard deviation, the min, the max, the lower and upper quartiles, and the median.
For categorical columns, it counts the number of null values and unique values, and the min and max length.
The last part is the Association tab, which measures the relationship between two variables. The association between numerical variables is the correlation:
a zero correlation indicates no relationship between the variables
a correlation of +1 indicates a perfect positive correlation, meaning that as one variable goes up, the other goes up as well
a correlation of –1 indicates a perfect negative correlation, meaning that as one variable goes up, the other goes down
If needed, you can use the More details button to better understand the associations.
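For readers who want to see the correlation computed directly, here is a minimal pandas sketch on two invented numerical columns (data is illustrative only):

```python
import pandas as pd

# Two invented numerical columns (illustrative only).
df = pd.DataFrame({
    "ad_spend": [100, 200, 300, 400, 500],
    "clicks":   [12, 25, 33, 48, 55],
})

# Pearson correlation: close to +1 here because clicks rise with spend.
print(df["ad_spend"].corr(df["clicks"]))
```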
Data is an essential component of any data modeling and analysis process. The kind of data you need for modeling depends on the specific problem you are trying to solve. In general, the data should be relevant, accurate, and consistent, and it should cover a significant period. In some cases, you may also need to preprocess or transform the data to make it suitable for modeling.
If you are new to using Graphite Note or are looking for some examples to practice with, there are several popular datasets available that you can explore. Some examples include weather data, financial data, social media data, and sensor data. These datasets are often available in open-source repositories or can be downloaded from public sources, such as government websites, social media platforms, or financial databases.
Graphite Note is a powerful tool that allows you to predict, visualize and analyze data in real-time. With the right dataset, you can use Graphite Note to gain valuable insights and make informed decisions about your business or research. Whether you are analyzing financial data to predict market trends or monitoring sensor data to optimize your production processes, our platform can help you make sense of your data and identify patterns that would be difficult to detect otherwise.
While the kind of data you need may vary depending on your specific needs, there are several popular datasets that you can use to practice and explore the capabilities of Graphite Note. With the right dataset and a solid understanding of data modeling and analysis, you can unlock the full potential of Graphite Note and gain insights that will drive your business or research forward.
We have highlighted a few popular datasets so you can get to know our platform better. After that, it's all up to you - collect your data and start having insights and fun!
An education company named “X Education” sells online courses to industry professionals. Many professionals interested in the courses land on their website and browse for courses on any given day. This makes an excellent dataset for Binary Classification, with the target column "Converted" (YES/NO).
Use Graphite Note to gain valuable insights into your sales pipeline by identifying which leads are converting to customers and the factors that contribute to their success. With this information, you can optimize your sales strategy and improve your overall conversion rates.
In addition, our tool can also help you predict which new leads are most likely to convert to customers and provide a probability score for each lead. This can enable you to prioritize your sales efforts and focus on the leads with the highest conversion potential.
By leveraging our tool, you can gain a deeper understanding of your sales funnel and take proactive steps to improve your conversion rates, reduce churn, and increase revenue.
To get started, download the provided dataset and upload it to Graphite Note. Once uploaded, create a new Binary Classification model in Graphite Note with the 'Converted' variable as the Target Variable. This will allow you to predict which leads are most likely to convert to customers.
After training the model, explore the insights that it provides, such as the most important features for predicting conversion and the distribution of conversion probabilities. This can help you to gain a better understanding of the factors that contribute to lead conversion and make informed decisions about your sales strategy.
Finally, you can use the model to run a "what-if" scenario by predicting the conversion probability for new leads based on different scenarios or assumptions. This can help you to forecast the impact of changes in your sales approach or marketing efforts and make data-driven decisions.
By following these steps, you can leverage Graphite Note and the provided dataset to gain valuable insights into your sales pipeline, predict lead conversion, and optimize your sales strategy for better results.
A Telco company customer dataset. Each row represents a customer and each column contains the customer’s attributes. The dataset includes information about:
Customers who left the company – this will be our target column ("Churn").
Services that each customer has signed up for – phone, multiple lines, internet, online security, online backup, device protection, tech support, and streaming TV and movies.
Customer account information – how long they’ve been a customer, contract, payment method, paperless billing, monthly charges, and total charges.
Demographic info about customers – gender, age range, and if they have partners and dependents.
Use Graphite Note to gain valuable insights into your customer base and identify which customers are most likely to churn. By analyzing the factors that contribute to churn, you can optimize your retention strategy and reduce customer churn rates.
In addition, our tool can also help you predict which customers are at high risk of churning, and provide a probability score for each customer. This can enable you to take proactive steps to retain those customers with the highest churn risk, such as offering personalized promotions or improving their overall experience.
By leveraging our tool, you can gain a deeper understanding of your customer base and identify opportunities to reduce churn, increase retention rates, and ultimately drive revenue growth. With our predictive churn model, you can make data-driven decisions that lead to more satisfied customers and a stronger business.
To get started, download the provided dataset and upload it to Graphite Note. Once uploaded, create a new Binary Classification model in Graphite Note with the 'Churn' variable as the Target Variable.
This will allow you to predict which customers are most likely to churn.
After training the model, explore the insights that it provides, such as the most important features for predicting churn and the distribution of churn probabilities. This can help you to gain a better understanding of the factors that contribute to customer churn and make informed decisions about your retention strategy.
Finally, you can use the model to run a "what-if" scenario by predicting the churn probability for different groups of customers based on different scenarios or assumptions. This can help you to forecast the impact of changes in your retention approach or customer experience efforts and make data-driven decisions.
By following these steps, you can leverage Graphite Note and the provided dataset to gain valuable insights into your customer base, predict customer churn, and optimize your retention strategy for better results.
The dataset contains monthly data on car sales from 1960 to 1968. It is great for our time series forecast model with which you can predict sales for the upcoming months.
Use Graphite Note to gain valuable insights into your business operations and forecast future trends by analyzing time series data. With our advanced forecasting models, you can make informed decisions about your business and optimize your operations for better results.
Our tool enables you to analyze historical data and identify patterns and trends, such as seasonality or cyclical trends. This can help you to forecast future demand or performance and make data-driven decisions about resource allocation, capacity planning, or inventory management.
To get started, download the provided dataset and upload it to Graphite Note. Once uploaded, create a new Timeseries Forecast model in Graphite Note with:
Target Variable: Sales
Time/Date Column: Month
Time Interval: Monthly
After training the model, explore the insights that it provides, such as identifying patterns, seasonality, and trends. This can help you to forecast future performance, plan resources effectively, and optimize your operations.
Finally, you can use the model to run a "what-if" scenario by predicting future values.
This can help you to forecast the impact of changes in your business operations, such as changes in demand, capacity planning, or inventory management.
By following these steps, you can leverage Graphite Note to gain valuable insights into your business trends, forecast future performance, and optimize your operations for better results. With our advanced time series forecasting models, you can stay ahead of the competition and take advantage of new opportunities as they arise.
This is a demo CSV with orders for an imaginary eCommerce shop. You can use it for Timeseries forecasting, RFM model, Customer Lifetime Value Model, General Segmentation, or New vs Returning Customers model in Graphite.
A demo Mall Customers dataset from Kaggle. Ideal for General customer segmentation in Graphite.
First things first: you have to upload your CSV file(s) into Graphite.
To upload a CSV file:
Go to Create New on Datasets, or New Dataset when you are in the datasets list
Choose CSV file as the source type
Select or drop your CSV file
Select Parse file (you can rename columns or change their data types)
Choose your options: if you want, name the dataset, write a short description of the data, and add a tag
Select Create
Customer Lifetime Value (CLV) prediction is a process used by businesses to estimate the total value a customer will bring to the company over their entire relationship. This prediction helps in making informed decisions about marketing, sales, and customer service strategies.
Dataset Essentials for Customer Lifetime Value Prediction
A suitable dataset for CLV prediction should include:
Date: The date of each transaction or interaction with the customer.
Customer ID: A unique identifier for each customer.
Monetary Spent: The amount of money spent by the customer on each transaction.
An example dataset for Customer Lifetime Value prediction might look like this:

| Date       | Customer ID | Monetary Spent |
|------------|-------------|----------------|
| 2021-01-01 | C001        | $150           |
| 2021-01-15 | C002        | $200           |
| 2021-02-01 | C001        | $100           |
| 2021-02-15 | C003        | $250           |
| 2021-03-01 | C002        | $300           |
Steps to Success with Graphite Note
Data Collection: Compile transactional data including customer IDs and the amount spent.
Data Analysis: Use Graphite Note to analyze the data, focusing on customer purchase patterns and frequency.
Model Training: Train a model to predict the lifetime value of a customer based on their transaction history.
Benefits of Predicting Customer Lifetime Value
Targeted Marketing: Focus marketing efforts on high-value customers.
Customer Segmentation: Segment customers based on their predicted lifetime value.
Resource Allocation: Allocate resources more effectively by focusing on retaining high-value customers.
Personalized Customer Experience: Tailor customer experiences based on their predicted value to the business.
Strategic Decision-Making: Make informed decisions about customer acquisition and retention strategies.
In summary, predicting Customer Lifetime Value is crucial for businesses to understand the long-term value of their customers. Graphite Note facilitates this process by providing a no-code platform for analyzing customer data and predicting their lifetime value, enabling businesses to make data-driven decisions in customer relationship management.
If you are using SQL for your database, you can connect to it and write your own SQL.
To connect to your database:
Go to Create New on Datasets, or New Dataset when you are in the datasets list
Choose MySQL/MariaDB or PostgreSQL
Establish a connection
Enter your server hostname/IP address, database port, database user, database password, and database name
Select Check Connection
Ensure that your firewall accepts incoming requests from the following two IP addresses: 99.81.63.220 and 68.183.64.54
Write the desired SQL (for example, a simple SELECT over the tables you need) and click the Run SQL button to get your data
You should see all the columns from the selected dataset appearing. If necessary, you can change column names, data types, or data formats
Click on the Create button to create your dataset
Getting data from databases using SQL is often easier - you can tailor the dataset to your needs. By repeating the steps above, you can quickly pull your data and start running various models without writing a single line of machine learning code.
If you have collected more data related to your uploaded CSV, or the data has changed, you can use the re-upload option.
In other words, if you have a new file with new data, you can re-upload it and append it to the previous dataset. Just keep in mind that the file you select must have the same column structure as the previously uploaded file.
To re-upload a CSV file:
Go to Datasets list
Select the dataset you want to re-upload
Select Re-upload
Depending on your needs, you can select Append data
Select or drop your CSV file
Select Update
For example, this is useful for monthly data. If you receive new data in a CSV file every month and need to merge all months into one dataset, you can simply re-upload each month's file instead of repeating copy-and-paste steps - easier and faster.
Explore all Graphite no-code machine learning Models.
Explore the most popular Use Cases.
Data labeling is the process of tagging data with meaningful and informative labels to train machine learning models. In predictive analytics, labeled data is crucial as it provides the model with examples of correct behavior. This document will guide you through the process of preparing and labeling data for three predictive models:
Lead Scoring,
Churn Prediction,
and MQL to SQL Conversion.
Objective: Predict if a lead will convert into a customer.
Dataset Example:
| Lead ID | Industry | Company Size | Interactions | Converted |
|---------|----------|--------------|--------------|-----------|
| 001     | Tech     | 50-100       | 5            | Yes       |
| 002     | Finance  | 100-500      | 2            | No        |
Steps:
Data Collection: Gather data on leads, including their industry, company size, and interactions with your platform.
Labeling: For each lead, label them as 'Yes' if they converted into a customer and 'No' if they didn't.
Reasoning: Labeling helps the model understand patterns of conversion based on the features provided.
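If your conversion outcomes live in a separate system, labeling can also be done programmatically. Here is a small pandas sketch, assuming hypothetical files and a LeadID column:

```python
# Hypothetical sketch: label each lead 'Yes' if its ID appears in the
# converted-customers file, 'No' otherwise. File and column names are assumptions.
import pandas as pd

leads = pd.read_csv("leads.csv")          # LeadID, Industry, Company Size, ...
converted = pd.read_csv("customers.csv")  # LeadIDs of leads that became customers

leads["Converted"] = leads["LeadID"].isin(converted["LeadID"]).map(
    {True: "Yes", False: "No"}
)
leads.to_csv("labeled_leads.csv", index=False)
```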
Objective: Predict if a customer will churn or leave your service.
Dataset Example:
| Customer ID | Usage  | Support Tickets | Feedback Score | Churned |
|-------------|--------|-----------------|----------------|---------|
| A1          | 50 hrs | 2               | 4.5            | No      |
| B2          | 10 hrs | 5               | 2.8            | Yes     |
Steps:
Data Collection: Gather data on customer usage patterns, support interactions, and feedback scores.
Labeling: For each customer, label them as 'Yes' if they churned and 'No' if they continued using your service.
Reasoning: Labeling helps the model identify signs of customer dissatisfaction or reduced engagement, which might lead to churn.
Objective: Predict if a Marketing Qualified Lead (MQL) will become a Sales Qualified Lead (SQL).
Dataset Example:
| MQL ID | Webinars Attended | Content Downloaded | Email Engagement | Became SQL |
|--------|-------------------|--------------------|------------------|------------|
| M1     | 2                 | Yes                | 15%              | Yes        |
| M2     | 0                 | No                 | 5%               | No         |
Steps:
Data Collection: Gather data on MQLs, including their engagement with webinars, content downloads, and email interactions.
Labeling: For each MQL, label them as 'Yes' if they became an SQL and 'No' if they didn't.
Reasoning: Labeling helps the model recognize patterns of engagement that indicate a lead's readiness to move to the sales stage.
Data labeling is a foundational step in predictive analytics. By providing clear, accurate labels, you enable your predictive models to learn from past data and make accurate future predictions. Ensure your labels are consistent and based on well-defined criteria to achieve the best results with Graphite Note's no-code predictive analytics platform.
In this section, we will give you a few practical tips on how to improve the performance of your model, starting with how performance is measured.
The Confusion Matrix is a powerful diagnostic tool in classification tasks within predictive analytics. It presents a clear and concise layout for evaluating the performance of a classification model by showing the actual versus predicted values in a tabular format. The matrix allows users, regardless of their coding expertise, to assess the accuracy and effectiveness of a predictive model, providing insights into not only the number of correct and incorrect predictions but also the type of errors made.
A confusion matrix for a binary classification problem consists of four components:
True Positives (TP): The number of instances that were predicted as positive and are actually positive.
False Positives (FP): The number of instances that were predicted as positive but are actually negative.
True Negatives (TN): The number of instances that were predicted as negative and are actually negative.
False Negatives (FN): The number of instances that were predicted as negative but are actually positive.
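For readers who want to see these four counts computed directly, here is a minimal scikit-learn sketch (the labels below are toy values):

```python
# Compute TP, FP, TN, FN for a binary classifier with scikit-learn.
from sklearn.metrics import confusion_matrix

y_true = [1, 0, 1, 1, 0, 0, 1, 0]  # actual classes
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]  # model predictions
tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
print(f"TP={tp}, FP={fp}, TN={tn}, FN={fn}")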
In the context of Graphite Note, a no-code predictive analytics platform, the confusion matrix serves several key purposes:
Performance Measurement: It quantifies the performance of a classification model, offering a visual representation of the model's ability to correctly or incorrectly predict categories.
Error Analysis: By breaking down the types of errors (FP and FN), the matrix aids in understanding specific areas where the model may require improvement.
Decision Support: The confusion matrix supports decision-making by highlighting the balance between sensitivity (or recall) and precision, which can be crucial for business outcomes.
Model Tuning: Users can leverage the insights from the confusion matrix to adjust model parameters and thresholds to optimize for certain predictive behaviors.
Communication Tool: It acts as a straightforward communication tool for stakeholders to grasp the results of a classification model without delving into complex statistical jargon.
In the example confusion matrix (Model Performance -> Accuracy Overview):
There are 799 instances where the model correctly predicted the positive class (TP).
There are 15622 instances where the model incorrectly predicted the positive class (FP).
There are 348 instances where the model failed to identify the positive class (FN).
There are 18159 instances where the model correctly identified the negative class (TN).
The high number of FP and FN relative to TP suggests a potential imbalance or a need for model refinement to improve predictive accuracy.
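To make that imbalance concrete, the standard metrics can be derived from these four counts (a quick sketch using the numbers above):

```python
# Derive standard metrics from the example counts above.
tp, fp, fn, tn = 799, 15622, 348, 18159

precision = tp / (tp + fp)                   # ~0.049: few positive predictions are correct
recall    = tp / (tp + fn)                   # ~0.697: most actual positives are found
accuracy  = (tp + tn) / (tp + fp + fn + tn)  # ~0.543
print(precision, recall, accuracy)
```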
The classification confusion matrix is an integral part of the model evaluation in Graphite Note, enabling users to make informed decisions about the deployment and iteration of their predictive models.
Example weekly sales dataset for time-series forecasting (column names are indicative):

| Date       | Sales   | Event    | Trend  | Spend  |
|------------|---------|----------|--------|--------|
| 2021-01-01 | $10,000 | New Year | Stable | $2,000 |
| 2021-01-08 | $12,000 | None     | Stable | $2,500 |
| 2021-01-15 | $15,000 | None     | Growth | $3,000 |
| 2021-01-22 | $13,000 | None     | Growth | $2,800 |
| 2021-01-29 | $11,000 | None     | Stable | $2,200 |
Example lead-scoring dataset (column names are indicative):

| Lead ID | Age | State | Role      | Visits | Interactions | Source   | Contacts | Converted |
|---------|-----|-------|-----------|--------|--------------|----------|----------|-----------|
| L1001   | 30  | NY    | Manager   | 10     | 5            | Organic  | 0        | Yes       |
| L1002   | 42  | CA    | Analyst   | 3      | 2            | Referral | 1        | No        |
| L1003   | 35  | TX    | Developer | 8      | 7            | Campaign | 2        | Yes       |
| L1004   | 28  | FL    | Designer  | 5      | 3            | Organic  | 0        | No        |
| L1005   | 45  | WA    | Executive | 12     | 10           |          | 3        | Yes       |
Example product demand dataset (column names are indicative):

| Date       | Product | Units | Price | Promotion   | Trend     | Holiday  |
|------------|---------|-------|-------|-------------|-----------|----------|
| 2021-01-01 | ProdA   | 150   | $20   | None        | Stable    | New Year |
| 2021-01-08 | ProdB   | 200   | $25   | Discount    | Growing   | None     |
| 2021-01-15 | ProdC   | 180   | $30   | Ad Campaign | Declining | None     |
| 2021-01-22 | ProdA   | 170   | $20   | None        | Stable    | None     |
| 2021-01-29 | ProdB   | 220   | $25   | Email Blast | Growing   | None     |
Example monthly revenue dataset (column names are indicative):

| Month    | Revenue | Expenses | Marketing | Other Costs | Pipeline | Trend     | Event        |
|----------|---------|----------|-----------|-------------|----------|-----------|--------------|
| Jan 2021 | $20,000 | $15,000  | $5,000    | $3,000      | $100,000 | Stable    | New Year     |
| Feb 2021 | $25,000 | $18,000  | $4,000    | $3,500      | $120,000 | Growth    | Valentine's  |
| Mar 2021 | $22,000 | $20,000  | $6,000    | $4,000      | $110,000 | Stable    | None         |
| Apr 2021 | $18,000 | $17,000  | $5,500    | $4,500      | $105,000 | Declining | Easter       |
| May 2021 | $20,000 | $19,000  | $7,000    | $4,000      | $115,000 | Growth    | Memorial Day |
Example customer churn dataset (column names are indicative):

| Customer ID | Age | Gender | Income | Usage    | Support Tickets | Last Activity | Churn |
|-------------|-----|--------|--------|----------|-----------------|---------------|-------|
| 2001        | 32  | F      | 58000  | 20 hours | 2               | 30 days ago   | No    |
| 2002        | 40  | M      | 72000  | 15 hours | 0               | 60 days ago   | Yes   |
| 2003        | 25  | F      | 45000  | 35 hours | 3               | 10 days ago   | No    |
| 2004        | 29  | M      | 50000  | 25 hours | 1               | 45 days ago   | No    |
| 2005        | 47  | F      | 65000  | 10 hours | 4               | 90 days ago   | Yes   |
When you first create your model, you have to choose among several model types.
Before running your model's scenario, it helps to understand how the model is trained. First, 80% of the dataset is used to train the model; the remaining 20% is then used to test it and calculate the model score. A high model score means the trained model is accurate on data it has not seen during training.
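The 80/20 split Graphite Note performs is the standard train/test split. For reference, here is how it looks in scikit-learn (the file name and target column are hypothetical):

```python
# Standard 80/20 train/test split, shown with scikit-learn for illustration.
# "leads.csv" and the "Converted" target column are hypothetical.
import pandas as pd
from sklearn.model_selection import train_test_split

df = pd.read_csv("leads.csv")
X, y = df.drop(columns=["Converted"]), df["Converted"]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42  # 20% held out for scoring
)
```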
Data preprocessing is a crucial step in machine learning. It enhances model accuracy and performance by transforming and cleaning the raw data to remove inconsistencies, handle missing values, scale features, and ensure compatibility with the chosen algorithm.
During preprocessing, Graphite Note deals with the following (a rough code sketch follows the list):
null values: if a column is 50% null or more, it will not be included in model training
missing values: in a numerical column they are replaced by the column average; in a categorical column they become "not_available"
One Hot Encoding: categorical data is transformed into numeric values before training, to make it suitable for machine learning algorithms
class imbalance: fixing an unequal distribution of the target classes, which is not ideal for training
normalization: rescaling the values of numerical columns for better training results
constants: if a column has only one unique value (a constant), it will not be included in model training
cardinality: if a column has a very high number of unique values, it will not be included in model training.
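Here is a rough pandas sketch of a few of these steps, purely for illustration (Graphite Note performs them automatically; the file name is hypothetical):

```python
# Illustrative preprocessing steps, roughly matching the list above.
import pandas as pd

df = pd.read_csv("data.csv")                       # hypothetical dataset

df = df.loc[:, df.isna().mean() < 0.5]             # drop columns that are >= 50% null
for col in df.columns:
    if pd.api.types.is_numeric_dtype(df[col]):
        df[col] = df[col].fillna(df[col].mean())   # numeric: impute with the average
    else:
        df[col] = df[col].fillna("not_available")  # categorical: placeholder value
df = df.loc[:, df.nunique() > 1]                   # drop constant columns
df = pd.get_dummies(df)                            # one-hot encode categorical features
```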
Introduction
In predictive modeling, key drivers (or influencers) are pivotal in discerning which features within a dataset most significantly impact the target variable. These influencers provide insights into the relative importance of each variable, enabling data scientists and analysts to understand and predict outcomes more accurately.
By highlighting the strongest predictors, key influencers inform the prioritization of features for model optimization, ensuring that models are precise and interpretable in real-world scenarios. This foundational understanding is crucial for refining models and aligning them closely with the underlying patterns and trends present in the data.
Reading Key Drivers
When examining the visualization of key influencers in Graphite Note models, you'll find features ranked by their influence on the target variable, from most to least important.
This ranking allows for a quick assessment of which factors are pivotal in the model's predictions.
By observing the length and direction of the bars associated with each feature, one can gauge the strength of influence they have on the target outcome.
The image shows a data visualization explaining how different amounts of interaction with website pages (measured in page visits) influence whether someone will take a specific action, labeled "Applied," with "YES" being the action taken.
For a high number of page visits, between 29.33 and 35, the likelihood of taking the action increases significantly—by more than double (2.26 times more likely).
For a moderate number of page visits, between 12.33 and 18, the action is still more likely but less so than the higher range—1.65 times more likely.
At a lower number of page visits, between 6.67 and 12.33, the action becomes less likely than the baseline by a factor of 1.37.
For very few page visits, less than 6.67, the likelihood of action drops drastically to less than half (2.36 times less likely).
The percentages and observations indicate how many cases fall within each range and how many of those cases resulted in the action "Applied" being taken. The visualization communicates that more engagement with the website (as measured by page visits) generally increases the likelihood of the desired action occurring.
Statistical Methodology Used
Graphite Note uses advanced statistical functions designed to calculate the influence of features on a target variable.
It employs a method of grouping the data by the feature and target columns and then counting occurrences. The calculations performed within this function aim to determine the proportion of each feature's categories contributing to a specific target value. The influence is quantified by comparing the observed proportion of the target value within each feature category against a weighted average, yielding an 'index value' that indicates the relative influence of each category on the target outcome. The function is robust, allowing for different data types in the target column, and ensures that only relevant categories with sufficient data are included in the final analysis.
Graphite Note applies a quantitative analysis here, in which numeric features (like 'Website Pages') are divided into bins or ranges.
The function then calculates the change in the likelihood of the target outcome (e.g., 'Applied' being 'YES') when the feature values fall within those bins. This calculation is done by comparing the base likelihood of the target outcome with the likelihood when the feature is within a specific bin, hence the multipliers like "increases by 2.26x" for certain ranges.
The analysis would remove any non-relevant categories (based on minimum percentage and row thresholds) and sort the results to clearly show which ranges of the feature increase or decrease the likelihood of the target outcome.
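To illustrate the idea, here is a simplified sketch of the likelihood-multiplier calculation (this is not Graphite Note's internal code; the dataset and column names are hypothetical):

```python
# Simplified sketch of the binning + likelihood-multiplier idea described above.
import pandas as pd

df = pd.read_csv("leads.csv")                # hypothetical dataset
base_rate = (df["Applied"] == "YES").mean()  # baseline likelihood of the target

df["visits_bin"] = pd.cut(df["Website Pages"], bins=6)  # bin the numeric feature
rate_per_bin = df.groupby("visits_bin", observed=True)["Applied"].apply(
    lambda s: (s == "YES").mean()
)
multipliers = rate_per_bin / base_rate       # e.g. 2.26 => "2.26x more likely"
print(multipliers.sort_values(ascending=False))
```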
Now that you have imported your CSV file, or connected to your database, you can create your model.
To create your model:
Select a model type from our templates.
In this example, we are going with a Timeseries Model to predict our future sales!
Select a source Dataset for this Model. Pay attention to the required columns which are different from model to model.
If you don't have a time-related column, such as dates, in your dataset, you cannot perform time-series forecasting.
Define the name and description of the Model and put a tag on it if you like
In the Model Scenario, identify all parameters and relevant columns from your dataset; these depend on the model you chose (which column contains dates, which one you want to predict, and so on).
Run the model and enjoy insights.
Don't forget to explore the Prediction, Trend, Seasonality, and Details Tab for additional insights.
And that’s it, your first Model is created and ready for exploration.
A Timeseries Forecast Model is designed to predict future values by analyzing historical time-related data. To utilize this model, your dataset must include both time-based and numerical columns. In this tutorial, we'll cover the fundamentals of the Model Scenario to help you achieve optimal results. Within the Model Scenario, you'll select parameters related to your dataset and the model itself.
For the Target Column, select a numeric value you want to predict. It's crucial to have values by day, week, or year. If some dates are repeated, you can aggregate them by taking their sum, average, etc.
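For example, if a date appears more than once, you might aggregate it beforehand. A small pandas sketch (file and column names are hypothetical):

```python
# Aggregate repeated dates so there is one value per day before forecasting.
# "orders.csv", "date", and "sales" are hypothetical names.
import pandas as pd

df = pd.read_csv("orders.csv", parse_dates=["date"])
daily = df.groupby("date", as_index=False)["sales"].sum()  # or .mean(), etc.
```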
Next, choose a Sequence Identifier Field to group certain fields and generate an independent forecast for each time series. These values shouldn't be unique; they must form a series.
Then, select the Time/Date Column, specifying the column containing time-related values. The Time Interval represents the data frequency—choose daily for daily data, yearly for annual data, etc. With Forecast Horizon, decide how many days, weeks, or years you want to predict from the last date in your dataset.
The model performs well with seasonal data patterns. If your data shows a linear growth trend, select "additive" for Seasonality Mode; for exponential growth, select "multiplicative." For example, if you see annual patterns, set Yearly Seasonality to True. (TIP: Plotting your data beforehand can help you understand these patterns.) If you're unsure, the model will attempt to detect seasonality automatically.
For daily or hourly intervals, you can access Advanced Parameters to add special dates, weekends, holidays, or limit the target value.
We are constantly enhancing our platform with new features and improving existing models. For your daily data, we've introduced some new capabilities that can significantly boost forecast accuracy. Now, you can limit your target predictions, remove outliers, and include country holidays and special events.
To set prediction limits, enter the minimum and maximum values for your target variable. For example, if you're predicting daily temperatures and know the maximum is 40°C, enter that value to prevent the model from predicting higher temperatures. This helps the model recognize the appropriate range of the Target Column. Additionally, you can use the Remove Days of the Week feature to exclude certain days from your predictions.
We added parameters for country holidays and special dates to improve model accuracy. Large deviations can occur around holidays, where stores see more customers than usual. By informing the model about these holidays, you can achieve more balanced and accurate predictions. To add holidays in Graphite Note, navigate to the advanced section of the Model Scenario and select the relevant country or countries.
Similarly, you can add promotions or events that affect your data. Enter the promotion name, start date, duration, and future dates. This ensures the model accounts for these events in future predictions.
Combining these parameters provides more accurate results. The more information the model receives, the better the predictions.
In addition to adding holidays and special events, you can delete specific data points from your dataset. In Graphite Note, enter the start and end dates of the period you want to remove. For single-day periods, enter the same start and end date. You can remove multiple periods if necessary. Understanding your data and identifying outliers or irrelevant periods is crucial for accurate predictions. Removing these dates can help eliminate biases and improve model accuracy.
By following these steps, you can harness the full potential of your Timeseries Forecast Model, providing valuable insights and more accurate predictions for your business. Now it's your turn to do some modeling and explore your results!
After setting all parameters, it is time to Run Scenario and train the machine learning model.
After running your model, review your results. The Model Performance section provides visual and numerical summaries of key metrics. You'll see values for four evaluation metrics, crucial for assessing your machine learning algorithm's performance.
R-squared measures the proportion of variance in the dependent variable that can be explained by the independent variables. MAPE (mean absolute percentage error), MAE (mean absolute error), and RMSE (root mean squared error) describe the average difference between actual and predicted values.
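For reference, the same four metrics can be computed with scikit-learn (toy values shown):

```python
# Compute the four evaluation metrics named above with scikit-learn.
from sklearn.metrics import (mean_absolute_error, mean_absolute_percentage_error,
                             mean_squared_error, r2_score)

actual    = [100, 120, 130, 110]
predicted = [ 98, 125, 128, 115]
print("R2:  ", r2_score(actual, predicted))
print("MAE: ", mean_absolute_error(actual, predicted))
print("MAPE:", mean_absolute_percentage_error(actual, predicted))
print("RMSE:", mean_squared_error(actual, predicted) ** 0.5)
```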
The results are organized into five tabs: Model Fit, Trend, Seasonality, Special Dates, and Details.
The Model Fit Tab features a graph displaying actual and predicted values. Besides the primary target value prediction (yellow line), the model shows a range of values, known as the uncertainty interval (yellow shaded area). This visualization helps you gauge your model's performance.
If you used the Sequence Identifier Field, you can choose which value to analyze in each Model Result.
Trends and seasonality are key characteristics of time-series data that should be analyzed. The Trend Tab displays a graph illustrating the global trend that Graphite Note has detected from your historical data.
Seasonality represents the repeating patterns or cycles of behavior over time. Depending on your Time Interval, you can find one or two graphs in the Seasonality Tab. For daily data, one graph shows weekly patterns, while the other shows yearly patterns. For weekly and monthly data, the graph highlights recurring patterns throughout the year.
The Special Dates graph shows the percentage effect of special dates and holidays in historical and future data.
The Details Tab contains a comprehensive table with all the values related to the Model Fit Tab, along with additional information.
Once your model is trained, you can use it to fulfill its real function: predicting future values for the Target Column and trend. Use the Predict tab to set the Start Date and End Date of the time interval for which the prediction will be calculated.
After triggering the Predict button, a table with prediction results and trends becomes available. Predictions have the same frequency as the trained model: if the model is trained on daily data, predictions are calculated for every day in the prediction interval; if it is trained on monthly data, predictions are created for every month.
Besides the Predict option, where you manually enter prediction parameters (for time series, the Start and End dates), Graphite Note offers an API connection that enables two-way communication between a Graphite Note model and third-party external applications. You can use the API to programmatically make predictions by passing data to a model and retrieve the results in real time. Details on how to use the API can be found in the REST API section.
To interact with the Graphite Note API and perform predictions, you need to make a POST request to the API endpoint. On the API tab, you will find a generated code snippet containing the request to be sent using cURL.
The request must include the Authorization header, set to "Bearer [token]". Replace [token] with your unique token, which can be found on the account info page in the Graphite Note app, under the section displaying your current plan information.
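For illustration, here is the equivalent request in Python. The endpoint URL and payload below are placeholders - copy the real ones from your model's API tab:

```python
# Illustrative only: the URL and payload are placeholders, not the real API shape.
import requests

API_URL = "https://app.graphite-note.com/api/predict/<model-id>"  # placeholder URL
TOKEN = "your-token-here"  # from the account info page

response = requests.post(
    API_URL,
    headers={"Authorization": f"Bearer {TOKEN}"},
    json={"startDate": "2024-01-01", "endDate": "2024-03-31"},    # placeholder payload
)
print(response.json())  # prediction results returned in real time
```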
Once your prediction model is prepared and you are using it to predict future outcomes of your time series, you can let end users run their own predictions with the Notebook feature. Notebooks let you easily and intuitively do your own data storytelling: create visualizations with detailed descriptions, plot model results for better understanding, and enable users to make their own predictions. You can find more about notebooks in the Notebooks - Data Storytelling section.
Click the New Notebook button to open the notebook creation wizard. Choose a notebook name, description, and attributes, and click Create.
You can use a notebook as a single place to present any of the following:
Text - descriptions and written insights about your data
New Visualization - choose between different visualizations (such as bar charts and line charts) to show data from your dataset
Model Result - present model results containing predictions based on data fed to the model through the Predict tab or the API
Model Actionable Insight - present recommended actionable insights prepared by generative AI
In our case, we want to expose the Predict option to the final user, similar to how it appears on the Predict tab. Select the Model Result option and you will be guided to the Model Result visualization creator.
Choose the model you want to use; in the next step, choose the model result you want to include in the notebook.
Choose Predict as the tab you would like to include. The Predict tab lets users make their own prediction interval selections and generate predictions from the model.
Once the notebook options are saved, you will see the notebook frontend screen. You can always expand your notebook with additional text, visualizations, and model results.
You can share the notebook with other users by copying the URL link.
With the Multiclass Classification model, you can analyze feature importance for a target with 2-25 distinct values. Unlike binary classification, which deals with only two classes, multiclass classification handles multiple classes simultaneously.
To achieve the best results, we will cover the basics of the Model Scenario. In this scenario, you choose parameters related to the dataset and the model.
To run the model, you need to select a Target Feature first. This target is the variable or outcome that the model aims to predict or estimate. The Target Feature should be a text-type column (not a numerical or binary column).
You will be taken to the next step where you can choose all the Model Features you want to analyze. You can select which features the model will analyze. Graphite Note will automatically exclude some features that are not suitable for the model and will provide reasons for each exclusion.
Moving forward, you'll see a comprehensive list of preprocessing steps that Graphite Note will apply to prepare your data for training. This enhances data quality, ensuring your model produces accurate results. Typically, these steps are performed by data scientists, but with our no-code machine learning platform, Graphite Note handles it for you. After reviewing the preprocessing steps, you can finish and Run Scenario.
To interpret the results after running your model, go to the Performance tab. Here, you can see the overall model performance post-training. Model evaluation metrics such as F1 Score, Accuracy, AUC, Precision, and Recall are displayed to assess the performance of classification models. Details on model metrics can also be found on the Accuracy Overview tab.
On the performance tab, you can explore six different views that provide insights related to model training and results: Key Drivers, Impact Analysis, Model Fit, Accuracy Overview, Training Results and Details.
Key Drivers indicate the importance of each column (feature) for the Model's predictions. The higher the reliance of the model on a feature, the more critical it is. Graphite uses permutation feature importance to determine these values.
The Impact Analysis tab allows you to select various features and analyze, using a bar chart, how changes in each feature affect the target feature. You can switch between Count and Percentage views.
The Model Fit Tab displays the performance of the trained model. It includes a stacked bar chart with percentages showing correct and incorrect predictions for the multiclass target feature.
The Accuracy Overview tab features a Confusion Matrix to highlight classification errors, making it simple to identify if the model is confusing classes. For each class, it summarizes the number of correct and incorrect predictions. Find out more about Classification Confusion Matrix in our Understanding ML section.
On the Accuracy Overview tab, you'll find detailed information on correct and incorrect predictions (True positives and negatives / False positives and negatives). Model metrics are explained at the bottom of the section.
In the Training Results Tab, you will find information about all the models automatically considered during the training process. Graphite ran several machine learning algorithms suitable for multiclass classification problems, using 80% of the data for training and 20% for testing. The best model, based on the F1 score, is chosen and marked in green in the models list.
The Details tab shows the results of the predictive model, presented in a table format. Each record includes the predicted label, predicted probability, and predicted correctness, offering insights into the model's predictions, confidence, and accuracy for each data point. Dataset test results can be exported to Excel by clicking the XLSX button in the right corner.
Once the model is trained, you can use it to predict future values, solve multi-class classification problems, and drive business decisions. Here are ways to take action with your Multiclass Classification model:
In Graphite Note, you can generate Actionable Insights using the Actionable Insights Input Form. Here, you can provide specific details about your business and objectives. This data is then combined with model training results (e.g., Multiclass Classification with Key Drivers) to produce a tailored analytics narrative aligned with your goals.
Actionable Insights leverage generative AI models to deliver these results. These insights are conclusions drawn from data that can be directly turned into actions or responses. You can access Actionable Insights from the main navigation menu, provided you are subscribed to a Graphite Note plan that includes actionable insights queries.
After building and analyzing a predictive model using Graphite Note, the "Predict" function allows you to apply the model to new data. This enables you to forecast outcomes or target variables based on different feature combinations, providing actionable insights for decision-making.
You can share your prediction results with your team using the Notebook feature. With Notebooks, users can also run their own predictions on your Multiclass Classification model.
Notebooks allow you to create various visualizations with detailed descriptions. You can plot model results for better understanding and enable users to make their own predictions. For more information, refer to the Data Storytelling section.
Detecting early signs of reduced customer engagement is pivotal for businesses aiming to maintain loyalty. A notable signal of this disengagement is when a customer's once regular purchasing pattern starts to taper off, leading to a significant decrease in activity. Early detection of such trends allows marketing teams to take swift, proactive measures. By deploying effective retention strategies, such as offering tailored promotions or engaging in personalized communication, businesses can reinvigorate customer interest and mitigate the risk of losing them to competitors.
Our objective is to utilize a model that not only alerts us to customers with an increased likelihood of churn but also forecasts their potential purchasing activity and, importantly, estimates the total value they are likely to bring to the business over time.
These analytical needs are served by what is known in data science as Buy 'Til You Die (BTYD) models. These models track the lifecycle of a customer's interaction with a business, from the initial purchase to the last.
While customer churn models are well-established within contractual business settings, where customers are bound by the terms of service agreements, and churn risk can be anticipated as contracts draw to a close, non-contractual environments present a different challenge. In such settings, there are no defined end points to signal churn risk, making traditional classification models insufficient.
To address this complexity, our model adopts a probabilistic approach to customer behavior analysis, which does not rely on fixed contract terms but on behavioral patterns and statistical assumptions. By doing so, we can discern the likelihood of future transactions for every customer, providing a comprehensive and predictive understanding of customer engagement and value.
The Customer Lifetime Value (CLV) model is a robust tool employed to ascertain the projected revenue a customer will contribute over their entire relationship with a business. The model employs historical data to inform predictive assessments, offering valuable foresight for strategic decision-making. This insight assists companies in prioritizing resources and tailoring customer engagement strategies to maximize long-term profitability.
The CLV model executes a series of sophisticated calculations. Yet, its operations can be conceptualized in a straightforward manner:
Historical Analysis: The model comprehensively evaluates past customer transaction data, noting the frequency and monetary value of purchases alongside the tenure of the customer relationship.
Engagement Probability: It assesses the likelihood of a customer’s future engagement based on their past activities, effectively estimating the chances of a customer continuing to transact with the business.
Forecasting: With the accumulated data, the model projects the customer’s future transaction behavior, predicting how often they will make purchases and the potential value of these purchases.
Lifetime Value Calculation: Integrating these elements, the model calculates an aggregate figure representing the total expected revenue from a customer for a designated future period.
The Customer Lifetime Value model uses historical customer data to predict the future value a customer will generate for a business. It leverages algorithms and statistical techniques to analyze customer behavior, purchase patterns, and other relevant factors to estimate the potential revenue a customer will bring over their lifetime.
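Conceptually, these steps mirror what open-source BTYD implementations do. Here is a minimal sketch using the Python lifetimes package, offered purely for illustration - it is not necessarily Graphite Note's internal implementation, and the file and column names are assumptions:

```python
# Sketch of a BTYD-style CLV calculation with the open-source `lifetimes` package.
# Illustrative only -- not Graphite Note's internal implementation.
import pandas as pd
from lifetimes import BetaGeoFitter, GammaGammaFitter
from lifetimes.utils import summary_data_from_transaction_data

df = pd.read_csv("transactions.csv")  # hypothetical: Date, CustomerID, Amount
summary = summary_data_from_transaction_data(
    df, customer_id_col="CustomerID", datetime_col="Date", monetary_value_col="Amount"
)

bgf = BetaGeoFitter(penalizer_coef=0.001)
bgf.fit(summary["frequency"], summary["recency"], summary["T"])

# Probability a customer is still "alive" and expected purchases in the next 90 days
summary["p_alive"] = bgf.conditional_probability_alive(
    summary["frequency"], summary["recency"], summary["T"]
)
summary["purchases_90d"] = bgf.conditional_expected_number_of_purchases_up_to_time(
    90, summary["frequency"], summary["recency"], summary["T"]
)

# Monetary model on repeat customers, then CLV over the next 3 months
repeat = summary[summary["frequency"] > 0]
ggf = GammaGammaFitter(penalizer_coef=0.001)
ggf.fit(repeat["frequency"], repeat["monetary_value"])
clv = ggf.customer_lifetime_value(
    bgf, repeat["frequency"], repeat["recency"], repeat["T"],
    repeat["monetary_value"], time=3, discount_rate=0.01,  # time in months
)
```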
The dataset on which you will run your model must contain a time-related column.
We need to distinguish all customers, so we need an identifier variable like Customer ID. If you have data about customer names, great; if not, don't worry - just select the same column as in the Customer ID field.
We also need to choose the numeric variable with respect to which we will observe customer behavior, called Monetary (amount spent).
Finally, you need to choose the Starting Date from which you'd like to calculate this model for your dataset.
When you're looking at this option for calculating Customer Lifetime Value (CLV), think of it as setting a starting line for a race. The "race" in this case is the journey you're tracking: how much your customers will spend over time.
The "Starting Date for Customer Lifetime Value Calculation" is basically asking you when you want to start watching the race. You have a couple of choices:
Max Date: This is like saying, "I want to start watching the race from the last time we recorded someone crossing the line." It sets the starting point at the most recent date in your records where a customer made a purchase.
Today: Choosing this means you want to start tracking from right now, today. So any purchases made after today will count towards the CLV.
-- select date --: This would be an option if you want to pick a specific date to start from, other than today or the most recent date in your data.
Let's see how to interpret the results after we have run our model.
The results consist of two tabs: the CLV Insights and Details Tabs.
On the summary of repeat customers, we have:
Total Repeat Customers: the customers that keep returning (the loyal customers)
Total Historical Amount: the past earnings from loyal customers
Average Spend per Repeat Customer
Average no. of Repeat Purchases: shows customer loyalty through the average number of repeat purchases
Average Probability Alive Next 90 Days: the estimated likelihood that a customer stays active with your business in the next 90 days
Predicted no. of Purchases Next 90 Days: the number of purchases you can expect in the next 90 days based on our analysis
Predicted Amount Next 90 Days: the revenue you can expect in the next 90 days
CLV (Customer Lifetime Value): the average revenue that one customer generated in the past and will generate in the future
The CLV Insights Tab shows some charts on the lifetime of customers.
The forecasted number of purchases chart estimates the number of purchases that are expected to be made by returning customers over a specific period.
The forecasted amount chart is a graphical representation of the projected value of purchases to be made by returning customers over a certain period.
Finally, the average alive probability chart illustrates the average probability of a customer remaining active for a business over time, assuming no repeat purchases.
Last but not least, on the Details Tab, you can find a detailed table where you can see all relevant values which were used for the above results.
You can find details about each column by clicking the link on the Details Tab.
The Details Tab within the Customer Lifetime Value Model offers an extensive breakdown of metrics for in-depth analysis. Each column represents a specific aspect of customer data that is pivotal to understanding and predicting customer behavior and value to your business. Below are the descriptions of the available columns:
amount_sum
Description: This column showcases the total historical revenue generated by an individual customer. By analyzing this data, businesses can identify high-value customers and allocate marketing resources efficiently.
amount_count
Description: Reflects the total number of purchases by a customer. This frequency metric is invaluable for loyalty assessments and can inform retention strategies.
repeated_frequency
Description: Indicates the frequency of repeated purchases, highlighting customer loyalty. This metric can be leveraged for targeted engagement campaigns.
customer_age
Description: The duration of the customer's relationship with the business, measured in days since their first purchase. It helps in segmenting customers based on the length of the relationship.
average_monetary
Description: Average monetary value per purchase, providing insight into customer spending habits. Businesses can use this to predict future revenue from a customer segment.
probability_alive
Description: Displays the current probability of a customer being active. A score of 1 means a 100% probability that the customer is active, aiding in prioritizing engagement efforts.
probability_alive_7_30_60_90_365
Description: This column shows the probability of customers remaining active over various time frames without repeat purchases. It's critical for developing tailored customer retention plans.
predicted_no_purchases_7_30_60_90_365
Description: Predicts the number of future purchases within specific time frames. This forecast is essential for inventory planning and sales forecasting.
CVL_30_60_90_365
Description: Estimates potential customer value over different time frames, aiding in strategic financial planning and budget allocation for customer acquisition and retention.
In this given example, we have a snapshot of customer data from the CLV model. The model considers various unique aspects of customer behavior to predict future engagement and value. Let's analyze the key data points and what they signify in a non-technical way, while emphasizing the model’s ability to tailor predictions to individual customer behavior:
amount_sum: This customer has brought in a total revenue of $4,584.14 to your business.
amount_count: They have made 108 purchases, which shows a high level of engagement with your store.
repeated_frequency: Out of these purchases, 106 are repeat purchases, suggesting a strong customer loyalty.
customer_age: They have been a customer for 364 days, indicating a relatively long-term relationship with your business.
average_monetary: On average, they spend about $42.73 per transaction.
probability_alive: There’s an 85% to 86% chance that they are still actively engaging with your business, which is quite high.
probability_alive_7: Specifically, the probability that this customer will remain active in the next 7 days is about 44.48%.
Alex, with a remarkable 106 repeated purchases and a customer_age of 364 days, has shown a pattern of strong and consistent engagement. The average monetary value of their purchases is $42.73, contributing significantly to the revenue with a total amount_sum of $4,584.14. The current probability_alive is high, indicating Alex is likely still shopping.
However, even with this consistent past behavior, the probability_alive_7 drops to about 44.48%. It highlights a nuanced understanding of Alex's habits; a sudden change in their routine is notable, which is why the model predicts a more significant impact if Alex were to alter their shopping pattern even slightly.
On the other hand, we have Casey, who has made 2 purchases, with only 1 being a repeated transaction. Casey’s amount_sum is $185.93, with an average_monetary value of $84.44, and a customer_age of 135 days. Despite a high current probability_alive, the model shows a minimal decline to 83.73% in the probability_alive_7.
This slight decrease tells us that Casey's engagement is inherently more sporadic. The business doesn't expect Casey to make purchases with the same regularity as Alex. If Casey doesn't return for a week, it isn't alarming or out of character, as reflected in the gentle decline in their seven-day active probability.
The contrast in these profiles, painted by the CLV model, enables the business to craft distinct customer journeys for Alex and Casey. For Alex, it's about ensuring consistency and rewarding loyalty to maintain that habitual engagement. Perhaps an automated alert for engagement opportunities could be set up if they don't make their usual purchases.
For Casey, the strategy may involve creating moments that encourage repeat engagement, possibly through sporadic yet impactful touchpoints. Since Casey's behavior suggests openness to larger purchases, albeit less frequently, the focus could be on highlighting high-value items or exclusive offers that align with their sporadic engagement pattern.
The CLV model's behavioral predictions allow the business to personalize customer experiences, maximize the potential of each interaction, and strategically allocate resources to maintain and grow the value of each customer relationship over time. This bespoke approach is the essence of modern customer relationship management, as it aligns perfectly with the individualized tendencies of customers like Alex and Casey.
This detailed data is a treasure trove for businesses keen on data-driven decision-making. Here’s how to utilize the information effectively:
Custom Segmentation: Use customer_age, amount_sum, and average_monetary to segment your customers into meaningful groups.
Detect Churners: Use probability_alive to segment customers currently active, for non-contractual businesses like eCommerce and retail. A score of 0.1 means a 10% probability that the customer is active ("alive") for your business.
Targeted Marketing Campaigns: Leverage the repeated_frequency and probability_alive columns to identify customers for loyalty programs or re-engagement campaigns.
Revenue Projections: The CVL_30_60_90_365 column helps in projecting future revenue and understanding the long-term value of customer segments.
Strategic Planning: Use predicted_no_purchases_7_30_60_90_365 to plan for demand, stock management, and to set realistic sales targets.
By engaging with the columns in the Details Tab, users can extract actionable insights that can drive strategies aimed at optimizing customer lifetime value. Each metric can serve as a building block for a more nuanced, data-driven approach to customer relationship management.
With the Binary Classification model, you can analyze feature importance in a binary column with two distinct values. This model also predicts likely outcomes based on various parameters. To achieve optimal results, we'll cover the basics of the Model Scenario, where you will select parameters related to your dataset and the model itself.
To run the scenario, you need to have a Target Feature, which must be a binary column. This means it should contain only two distinct values, such as Yes/No or 1/0.
In the next step, select the Model Features you wish to analyze. All features that fit into the model are selected by default, but you may deselect any features you do not want to use. Graphite Note automatically preprocesses your data for model training, excluding features that are unsuitable. You can view the list of excluded features and the reasons for their exclusion on the right side of the screen.
Moving forward, you'll see a comprehensive list of preprocessing steps that Graphite Note will apply to prepare your data for training. This enhances data quality, ensuring your model produces accurate results. Typically, these steps are performed by data scientists, but with our no-code machine learning platform, Graphite Note handles it for you. After reviewing the preprocessing steps, you can finish and Run Scenario.
The training duration may vary depending on the data volume, typically ranging from 1 to 10 minutes. The training will utilize 80% of the data to train various machine learning models and the remaining 20% to test these models and calculate relevant scores. Once completed, you will receive information about the best model based on the F1 value and details about training time.
To interpret the results after running your model, go to the Performance tab. Here, you can see the overall model performance post-training. Model evaluation metrics such as F1 Score, Accuracy, AUC, Precision, and Recall are displayed to assess the performance of classification models. Details on model metrics can also be found on the Accuracy Overview tab.
On the performance tab, you can explore six different views that provide insights related to model training and results: Key Drivers, Impact Analysis, Model Fit, Accuracy Overview, Training Results and Details.
Key Drivers indicate the importance of each column (feature) for the Model's predictions. The higher the reliance of the model on a feature, the more critical it is. Graphite uses permutation feature importance to determine these values.
The Impact Analysis tab allows you to select various features and analyze, using a bar chart, how changes in each feature affect the target feature. You can switch between Count and Percentage views.
The Model Fit Tab displays the performance of the trained model. It includes a stacked bar chart with percentages showing correct and incorrect predictions for binary values (1 or 0, Yes or No).
The Accuracy Overview tab features a Confusion Matrix to highlight classification errors, making it simple to identify if the model is confusing two classes. For each class, it summarizes the number of correct and incorrect predictions. Find out more about Classification Confusion Matrix in our Understanding ML section.
On the Accuracy Overview tab, you'll find detailed information on correct and incorrect predictions (True positives and negatives / False positives and negatives). Model metrics are explained at the bottom of the section.
In the Training Results Tab, you will find information about all the models automatically considered during the training process. Graphite ran several machine learning algorithms suitable for binary classification problems, using 80% of the data for training and 20% for testing. The best model, based on the F1 score, is chosen and marked in green in the models list.
The Details tab shows the results of the predictive model, presented in a table format. Each record includes the predicted label, predicted probability, and predicted correctness, offering insights into the model's predictions, confidence, and accuracy for each data point. Dataset test results can be exported to Excel by clicking the XLSX button in the right corner.
Once the model is trained, you can use it to predict future values, solve binary classification problems, and drive business decisions. Here are ways to take action with your Binary Classification model:
In Graphite Note, you can generate Actionable Insights using the Actionable Insights Input Form. Here, you can provide specific details about your business and objectives. This data is then combined with model training results (e.g., Binary Classification with Key Drivers) to produce a tailored analytics narrative aligned with your goals.
Actionable Insights leverage generative AI models to deliver these results. These insights are conclusions drawn from data that can be directly turned into actions or responses. You can access Actionable Insights from the main navigation menu, provided you are subscribed to a Graphite Note plan that includes actionable insights queries.
After building and analyzing a predictive model using Graphite Note, the "Predict" function allows you to apply the model to new data. This enables you to forecast outcomes or target variables based on different feature combinations, providing actionable insights for decision-making.
Create Notebook
You can share your prediction results with your team using the Notebook feature. With Notebooks, users can also run their own predictions on your Binary Classification model.
Notebooks allow you to create various visualizations with detailed descriptions. You can plot model results for better understanding and enable users to make their own predictions. For more information, refer to the Data Storytelling section.
With the Regression model, you can see which regression algorithm best fits your dataset. To get the best possible results, we will go through the basics of the Model Scenario, where you select parameters related to the dataset and model.
To run the model, you have to choose a Target Feature first. The target refers to the variable or outcome that the model aims to predict or estimate. In this case, it should be a numerical column.
The next step is to choose the Model Features that you want to analyze. You can choose which features the model will analyze; some columns may not be suitable for the model, and Graphite Note shows the reason for each one.
Now you can finish the process and run the scenario.
Once training finishes, you will see the status, the best model used, and the training time.
Let's see how to interpret the results after we have run our model.
First, you can see the overall performance, based on the best model and its accuracy.
The results consist of 5 tabs: Feature Importance, Feature Impact, Model Fit, Training Results, and Details.
To see which features have the most impact on the target, use the Feature Importance Tab. It shows how much each feature impacts the target, with more details on the right.
The Feature Impact Tab presents a chart showing how a chosen feature impacts the target; you can select whichever feature you want to analyze.
The Model Fit Tab contains a graph with actual and predicted values. You can see which one is correct and incorrect. With visualization, you can see how well or poorly your model is performing.
In the Training Results Tab, you have information about all the models that were trained on 80% of the dataset and tested on the remaining 20%, so you can see which one performed best.
Finally, a table with all the values related to the Model Fit Tab, and much more, can be found on the Details Tab.
After building and analyzing a predictive model using Graphite Note, the "Predict" function allows you to apply the model to new data. This enables you to forecast outcomes or target variables based on different feature combinations, providing actionable insights for decision-making.
With General segmentation, you can find out hidden similarities between the data, such as the similarity between the price of the product or services provided to the purchasing history of the customers. It's an unsupervised algorithm that segments the data into groups, based on some kind of similarity between the numerical variables.
So let's see how you can run this model in Graphite. Firstly, you have to identify an ID column - that way you can identify the customer or product within the groups. After that, you have to select the numeric columns (features) from your dataset on which the segmentation will be based.
In the end, you need to determine the number of groups you want to get. In case you are not sure, Graphite will try to determine the best number of groups. But what about the model result? More about that in the next post!
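Before moving on, here is a minimal sketch of this kind of numeric segmentation using k-means; Graphite Note's exact algorithm and defaults may differ, and the file and column names below are hypothetical:

```python
# Minimal k-means segmentation sketch (hypothetical file and columns).
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

df = pd.read_csv("mall_customers.csv")
features = ["Age", "Annual Income", "Spending Score"]

X = StandardScaler().fit_transform(df[features])   # scale so no feature dominates
df["Cluster"] = KMeans(n_clusters=4, n_init=10).fit_predict(X)

# Average feature values per cluster, as shown in the Cluster Summary Tab
print(df.groupby("Cluster")[features].mean())
```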
The model divides your data into clusters: groups of objects where objects in the same cluster are more similar to each other than to those in other clusters. It is therefore essential to compare the average values of the variables across all clusters, which is why the Cluster Summary Tab shows the differences between clusters in a graph.
For example, in the picture above, you can see that customers in Cluster0 have the highest average value of the Spending Score, unlike the customers in Cluster3.
Wouldn't it be interesting to explore each cluster by a numeric value, or each numeric value by cluster? That's why we have the By Cluster and By Numeric Value Tabs: each variable and cluster is analyzed by its minimum and maximum, first and third quartiles, etc.
There is also a Cluster Visualization Tab that shows the relationship between two selected measures and how the clusters are distributed. You can change the measures to see different clusters and their distribution.
The devil is in the details - details are important, so be conscientious and pay attention to the small things. Last but not least, on the Details Tab, you can find a detailed table where you can see all relevant values which were used for the above results.
With the right dataset and a few clicks, you will get results that will considerably help you in your business - general segmentation helps you in creating marketing and business strategies for each detected group. It's all up to you now, collect your data and start modeling.
After you have created your notebook, we will go through some basic visualization tools (in case you missed how to create one, see the Create Notebook section).
Data visualization gives us a clear idea of what the information means by giving it visual context through maps or graphs. This makes the data more natural for the human mind to comprehend, making it easier to identify trends, patterns, and outliers within large data sets.
Once you have created a notebook, to visualize we have to:
Select New visualization
Select a dataset; a CSV file you uploaded or a dataset obtained from a model you ran.
Select Visualization Type. Depending on the type you choose, you will configure it with some combination of the following:
Add category, which represents the abscissa (x-axis) of the coordinate system.
Add series, which represents the ordinate (y-axis). With a wide range of colors, you can choose different types of chart lines.
Add column, which creates a table from the selected columns.
Add for the Primary Measure.
You can create visualizations with different datasets - there is no restriction that all visualizations within a Notebook must be created from the same dataset.
Do you wonder whether the changes you've made in your business have impacted new customers, or do you want to understand the needs of your user base or identify trends? All that and much more you can do with our new model, Customer Cohort Analysis.
A cohort is a subset of users or customers grouped by common characteristics or by their first purchase date. Cohort analysis is a type of behavioral analytics that allows you to track and compare the performance of cohorts over time.
With Graphite, you are only a few steps away from your Cohort model. Once you have selected your dataset, it is time to enter the parameters into the model. The Time/Date Column represents a time-related column.
After that, you have to select the Aggregation level.
For example, if monthly aggregation is selected, Graphite will generate Cohort Analysis with a monthly frequency.
Also, your dataset must contain Customer ID and Order ID/ Trx ID columns as required model parameters.
Last but not least, you have to select the Monetary (amount spent) variable, which represents the main parameter for your Cohort Analysis.
Additionally, you can break down and filter Cohorts by a business dimension (variable) which you select after you enable the checkbox.
That's it, your first Customer Cohort Analysis model is ready.
After you run your model, the first tab that appears is the Cohorts Tab.
Depending on the metric (the default is No of Customers), the results are presented through a graphical heatmap representation.
In the example above, groups of customers are grouped by year when they made their first purchase. Column 0 represents the number of customers per cohort (i.e. 4255 customers made their first purchase in 2018). Now we can see their activity year to year: 799 customers came back in 2019, 685 in 2020, and 118 in 2021.
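To make the mechanics concrete, here is a hedged pandas sketch that builds the same kind of yearly cohort table (the file and column names are hypothetical, not Graphite Note's internals):

```python
# Sketch of a yearly cohort table (hypothetical "orders.csv").
import pandas as pd

df = pd.read_csv("orders.csv", parse_dates=["OrderDate"])
df["OrderYear"]  = df["OrderDate"].dt.year
df["CohortYear"] = df.groupby("CustomerID")["OrderDate"].transform("min").dt.year
df["Period"]     = df["OrderYear"] - df["CohortYear"]   # years since first purchase

# Rows are cohorts, columns are years since first purchase (column 0 is the
# cohort size), values are distinct customers active in that period.
cohorts = (df.groupby(["CohortYear", "Period"])["CustomerID"]
             .nunique().unstack(fill_value=0))
print(cohorts)
```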
If you switch your metric to Percentage, you will get results in percentages.
Let's track our Monetary column (in our case total amount spent per customer) and switch metric to Amount to see how much money our customers spend through the years.
As you can see above, customers that made their first order in 2018 have spent 46.25M, and the 799 customers that came back in 2019 have spent 12.38M.
In case you want to track the total amount spent through the years, switch metric to Amount (Cumulative).
Basically, we tracked the long-term relationships that we have with our given groups (cohorts). On the other hand, we can compare different cohorts at the same stage in their lifetime. For example, for all the cohorts, we can see the average revenue per customer two years after their first purchase: the average revenue per customer in the 2019 cohort (12.02K) is almost half that of the 2018 cohort (21.05K). Here is an opportunity to see what went wrong and devise a new business strategy.
In case you broke down and filtered cohorts by a variable with less than 20 distinct values (parameter Repeat by in Model Scenario), for each value you will get a separate Cohort Analysis in the Repeat by Tab.
All the values related to the Cohorts and Repeat by Tabs, with much more, can be found on the Details Tab, in the form of a table.
Now it's your turn to track your customer's behavior, see when is the best time for remarketing, and how to improve customer retention.
Predict Dataset: Applying Model to a Specific Dataset
The "Predict Dataset" functionality in Graphite Note allows users to apply a successfully trained and deployed model to a specific dataset for generating predictions. This section provides a comprehensive guide on how to utilize this feature to make informed decisions.
Once a model has been trained and deployed within Graphite Note, it can be used to make predictions on a specific dataset. The dataset must have the same structure (columns) as the one the model was trained on. Graphite Note will add new columns to the dataset containing the model's predictions and scores for each row.
Select Dataset to Predict:
Navigate to the "Predict Dataset" section.
Choose the specific dataset you want to apply the model to. This dataset must have the same columns that the model was trained on.
Verify Dataset Structure:
Ensure that the selected dataset has the same structure (columns) as the trained model. This alignment is crucial for accurate predictions.
Apply Model to Dataset:
Select the dataset by double-clicking it.
Graphite Note will add new columns to the selected dataset containing the model's predictions and scores for each row.
Analyze Predictions:
View the results within Graphite Note's user-friendly interface.
Analyze the predictions to understand trends, patterns, and insights that can guide decision-making.
Make Decisions:
Utilize the predictions to make informed decisions aligned with your business goals and strategies.
Export to Excel:
If desired, you can export the dataset with the added prediction columns to Excel for further analysis or sharing with stakeholders.
Column Alignment: The selected dataset must have the same columns as the ones the model was trained on. Mismatched columns may lead to incorrect predictions.
Real-time Application: Graphite Note's Predict Dataset feature provides real-time application of models to datasets, enabling quick insights.
Customizable Analysis: Tailor the analysis of predictions within Graphite Note to suit your specific needs and preferences.
The "Predict Dataset" feature in Graphite Note streamlines the process of applying trained models to specific datasets. By ensuring alignment between the model and dataset structure, users can generate accurate predictions that drive actionable insights.
The main idea behind Graphite Notebook is to do your own data storytelling; create various visualizations with detailed descriptions, plot model results for better understanding, etc.
To create your notebook:
Go to Create New on Notebooks, or New Notebook when you are in the notebook list
Our intelligent system observes customers' shopping behavior without getting into the nitty-gritty technical details. It watches how recently each customer made a purchase, how often they come back, and how much they spend. The system notices patterns and groups customers accordingly.
This smart system doesn't need you to say, "Anyone who spends over $1000 is a champion." It figures out on its own who the champions are by comparing all the customers to one another.
When we talk about 'champion' customers in the context of RFM analysis, we're referring to those who are the most engaged, recent, and valuable. The system's approach to finding these champions is quite intuitive yet sophisticated.
Here's how it operates:
Observation: Just like a keen observer at a social event, the system starts by watching—collecting data on when each customer last made a purchase (Recency), how often they've made purchases over a certain period (Frequency), and how much they've spent in total (Monetary).
Comparison: Next, the system compares each customer to every other customer. It looks for natural groupings—clusters of customers who exhibit similar purchasing patterns. For example, it might notice a group of customers who shop frequently, no matter the amount they spend, and another group that makes less frequent but more high-value purchases.
Group Formation: Without being told what criteria to use, the system uses the data to form groups. Customers with the most recent purchases, highest frequency, and highest monetary value start to emerge as one group—these are your potential 'champions.' The system does this by measuring the 'distance' between customers in terms of RFM factors, grouping those who are closest together in their purchasing behavior.
Adjustment: The system then iterates, refining the groups by moving customers until the groups are as distinct and cohesive as possible. It's a process of adjustment and readjustment, seeking out the pattern that best fits the natural divisions in the data.
Finalization: Once the system settles on the best grouping, it has effectively ranked customers, identifying those who are the most valuable across all three RFM dimensions. These are your 'champions,' but the system also recognizes other groups, like new customers who've made a big initial purchase or long-time customers who buy less frequently but consistently.
By using this method, the system takes on the complex task of understanding the many ways customers can be valuable to a business. It provides a nuanced view that goes beyond simple categorizations, recognizing the diversity of customer value. The result is a highly tailored strategy for customer engagement that aligns perfectly with the actual behaviors observed, allowing businesses to interact more effectively with each segment, especially the 'champions' who drive a significant portion of revenue.
Here’s why this machine learning approach is more powerful than manual labeling:
Adaptive Learning: The system continuously learns and adapts based on actual behavior, not on pre-set rules that might miss the nuances of how customers are interacting right now.
Time Efficiency: It saves you a mountain of time. No more going through lists of transactions manually to score each customer. The system does it instantly.
Personalized Grouping: Because it’s based on actual behavior, the system creates groups that are tailor-made for your specific customer base and business model, rather than relying on broad, one-size-fits-all categories.
Scalability: Whether you have a hundred customers or a million, this smart system can handle the job. Manual scoring becomes impractical as your customer base grows.
Unbiased Decisions: The system is objective, based purely on data. There’s no risk of human bias that might categorize customers based on assumptions or incomplete information.
In essence, this smart approach to customer grouping helps businesses focus their energy where it counts, creating a personalized experience for each customer, just like a thoughtful host at a party who knows exactly who likes what. It’s about making everyone feel special without having to ask them a single question.
In the RFM model in Graphite Note, the intelligent system categorizes customers into segments based on their Recency (R), Frequency (F), and Monetary (M) values, assigning scores from 0 to 4 for each of these three dimensions. With five scoring options for each RFM category (including the '0' score), this creates a comprehensive grid of potential combinations—resulting in a total of 125 unique segments (5 options for R x 5 options for F x 5 options for M = 125 segments).
This segmentation allows for a high degree of specificity. Each customer falls into a segment that accurately reflects their interaction with the business. For example, a customer who recently made a purchase (high Recency), buys often (high Frequency), and spends a lot (high Monetary) could fall into a segment scored as 4-4-4. This would indicate a highly valuable 'champion' customer.
On the other hand, a customer who made a purchase a long time ago (low Recency), buys infrequently (low Frequency), but when they do buy, they spend a significant amount (high Monetary), might be scored as 0-0-4, placing them in a different segment that suggests a different engagement strategy.
By scoring customers on a scale from 0 to 4 across all three dimensions, the business can pinpoint exact customer profiles. This precision allows for highly tailored marketing strategies. For example, those in the highest scoring segments might receive exclusive offers as a reward for their loyalty, while those in segments with room for growth might be targeted with re-engagement campaigns.
The use of 125 segments ensures that the business can differentiate not just between generally good and poor customers, but between various shades of customer behavior, tailoring approaches to nurture the potential value of each unique segment. This granularity facilitates nuanced understanding and actionability for marketing, sales, and customer relationship management.
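To make the scoring concrete, here is a simplified, quintile-based sketch of 0-4 RFM scoring. Note that Graphite Note itself derives the groups by clustering, as described above, and the file and column names here are hypothetical:

```python
# Simplified quintile-based RFM scoring sketch (hypothetical "orders.csv").
import pandas as pd

orders = pd.read_csv("orders.csv", parse_dates=["OrderDate"])
now = orders["OrderDate"].max()

rfm = orders.groupby("CustomerID").agg(
    recency=("OrderDate", lambda d: (now - d.max()).days),
    frequency=("OrderDate", "count"),
    monetary=("Amount", "sum"),
)

# Quintile scores 0-4; recency is inverted, since more recent is better.
rfm["R"] = pd.qcut(rfm["recency"].rank(method="first"), 5, labels=[4, 3, 2, 1, 0])
rfm["F"] = pd.qcut(rfm["frequency"].rank(method="first"), 5, labels=[0, 1, 2, 3, 4])
rfm["M"] = pd.qcut(rfm["monetary"].rank(method="first"), 5, labels=[0, 1, 2, 3, 4])

# Concatenating the three digits yields one of the 125 possible segments.
rfm["RFM_score"] = rfm["R"].astype(str) + rfm["F"].astype(str) + rfm["M"].astype(str)
print(rfm.head())
```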
Wouldn't it be great to tailor your marketing strategy around identified groups of customers? That way, you can target each group with personalized offers, increase profit, improve unit economics, etc.
Recency - how long it’s been since a customer bought something from you or visited your website
Frequency - how often a customer buys from you, or how often they visit your website
Monetary - the average spend of a customer per visit, or the overall transaction value in a given period
Let's go through the RFM analysis inside Graphite Note. The dataset on which you will run your RFM Model must contain a time-related column, given that this report studies customer behavior over some time.
We need to distinguish all customers, so we need an identifier variable like Customer ID.
If you have data about Customer Names, great; if not, don't worry, just select the same column as in the Customer ID field.
Finally, we need to choose the numeric variable with regard to which we will observe customer behavior, called Monetary (amount spent).
That's it, you are ready to run your first RFM Model.
On the RFM Scores Tab, we have an overview of the customers and their scores:
Then you have a ranking of each RFM segment (125 of them) represented in a table.
And finally, a chart showing the number of customers per RFM score.
lost customer
hibernating customer
can-not-lose customer
at-risk customer
about-to-sleep customer
need-attention customer
promising customer
new customer
potential loyal customer
loyal customer
champion customer.
All information related to these groups of customers, such as the number of customers, average monetary, average frequency, and average recency per group, can be found in the RFM Analysis Tab.
There is also a table at the end to summarize everything.
According to the Recency factor, which is defined as the number of days since the last purchase, we divide customers into 5 groups:
lost
lapsing
average activity
active
very active.
In the Recency Tab, we observe the behavior of the above groups, such as the number of customers, average monetary, average frequency, and average recency per group.
As Frequency is defined as the total number of purchases, customers can buy:
very rarely
rarely
regularly
frequently
very frequently.
Monetary is defined as the amount of money the customer spent, so the customer can be a:
very low spender
low spender
medium spender
high spender
very high spender.
All the values related to the first five tabs, with much more, can be found on the Details Tab, in the form of a table.
The RFM model columns outlined in your system provide a structured way to understand and leverage customer purchase behavior. Here’s how each column benefits the end user of the model:
Monetary: Indicates the total revenue a customer has generated. This helps prioritize customers who have contributed most to your revenue.
Avg_monetary: Shows the average spend per transaction. This can be used to gauge the spending level of different customer segments and tailor offers to match their spending habits.
Frequency: Reflects how often a customer purchases. This can inform retention strategies and indicate who might be receptive to more frequent communication.
Recency: Measures the time since the last purchase. This can help target re-engagement campaigns at customers who have not interacted with your business recently.
Date_of_last_purchase & Date_of_first_purchase: These dates help track the customer lifecycle and can trigger communications at critical milestones.
Customer_age_days: The duration of the customer relationship. Long-standing customers might benefit from loyalty programs, while newer customers might be encouraged with welcome offers.
Recency_cluster, Frequency_cluster, and Monetary_cluster: These categorizations allow for segmentation at a granular level, helping customize strategies for groups of customers who share similar characteristics.
Rfm_cluster: This overall grouping combines recency, frequency, and monetary values, offering a holistic view of a customer's value and engagement, essential for creating differentiated customer journeys.
Recency_segment_name, Frequency_segment_name, and Monetary_segment_name: These descriptive labels provide intuitive insights into customer behavior and make it easier to understand the significance of each cluster for strategic planning.
Fm_cluster_sum: This score is a combined metric of frequency and monetary clusters, useful in prioritizing customers who are both frequent shoppers and high spenders.
Fm_segment_name and Rfm_segment_name: These labels offer a quick reference to the type of customer segment, simplifying the task of identifying and applying targeted marketing actions.
Seeking assurance about the model's accuracy and effectiveness? Here's how you can address these concerns:
Validation with Historical Data: Show how the model’s predictions align with actual customer behaviors observed historically. For instance, demonstrate how high RFM scores correlate with customers who have proven to be valuable.
Segmentation Analysis: Analyze the characteristics of customers within each RFM segment to validate that they make sense. For example, your top-tier RFM segment should clearly consist of customers who are recent, frequent, and high-spending.
Control Groups: Create control groups to test marketing strategies on different RFM segments and compare the outcomes. This can validate the effectiveness of segment-specific strategies suggested by the model.
A/B Testing: Implement A/B testing where different marketing approaches are applied to similar customer segments to see which performs better, thereby showcasing the model's utility in identifying the right targets for different strategies.
Benchmarking: Compare the RFM model’s performance against other segmentation models or against industry benchmarks to establish its effectiveness.
In this report, we want to divide customers into returning and new customers (this is the most fundamental type of customer segmentation). New customers have made only one purchase from your business, while returning ones have made more than one.
Let’s go through their basic characteristics.
New customers are:
forming the foundation of your customer base
telling you whether your marketing campaigns are working (and informing what to improve in current offerings or add to your repertoire of products or services)
while returning customers are:
giving you feedback on your business (if you have a high number of returning customers it suggests that customers are finding value in your products or service)
saving you a lot of time, effort, and money.
Let's go through the New vs returning customer analysis inside Graphite. The dataset on which you will run your model must contain a time-related column.
Since the dataset contains data for a certain period, it's important to choose the aggregation level.
For example, if weekly aggregation is selected, Graphite will generate a new vs returning customers dataset with a weekly frequency.
Your dataset must also contain a Customer ID column.
Additionally, if you want, you can choose the Monetary (amount spent) variable.
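As a rough illustration of the underlying computation (not Graphite Note's internal code; the file and column names are hypothetical), the split can be derived from each customer's first purchase date:

```python
# Sketch of a monthly new-vs-returning split (hypothetical "orders.csv").
import pandas as pd

df = pd.read_csv("orders.csv", parse_dates=["OrderDate"])
df["Month"] = df["OrderDate"].dt.to_period("M")
first_month = df.groupby("CustomerID")["OrderDate"].transform("min").dt.to_period("M")

# A customer is "new" in the period of their first purchase, "returning" after that.
df["Type"] = (df["Month"] == first_month).map({True: "New", False: "Returning"})

summary = (df.groupby(["Month", "Type"])["CustomerID"]
             .nunique().unstack(fill_value=0))
print(summary)
```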
With Graphite, compare absolute figures and percentages, and learn how many customers you are currently retaining on a daily, weekly, or monthly basis.
Depending on the aggregation level, you can see the number of new and returning customers detected in each period on the New vs Returning Tab.
For example, in December 2020, there were a total of 2.88k customers, of which 1.84K were new and 1.05K returning. You can also choose a daily representation that is more precise.
If you are interested in retention, the percentage of your returning customers, through a period, use the Retention % Tab.
Last but not least, on the Details Tab, you can find a detailed table where you can see all relevant values which were used for the above results.
Getting started guide
This section offers a comprehensive overview of testing and command usage, including detailed instructions. If you're new to the API, we recommend starting with the quick start guide. It's a straightforward way to validate your setup and ensure you begin on the right track. We value your feedback and strive to provide the best experience possible. If you encounter any challenges with commands that should be included in the API or its documentation, our dedicated support team is ready to assist you. Feel free to reach out via our in-app chat or by email, and we'll be more than happy to guide you in the right direction or incorporate any necessary updates.
Start your interaction with Graphite API
To interact with the Graphite Note API and perform predictions using a specific model, you need to make a POST request to the API endpoint. The following example demonstrates how to make such a request using cURL - a command-line tool and library that allows you to transfer data using various protocols, including HTTP and HTTPS.
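As a hedged sketch of the same request in Python's requests library (the endpoint path is an assumption, and the model code, token, and payload values are placeholders to substitute with your own):

```python
# Hypothetical sketch of the prediction request described above.
import requests

MODEL_CODE = "your-model-code"   # from the ID section of the model's settings tab
TOKEN = "your-api-token"         # from the account info page

response = requests.post(
    f"https://app.graphite-note.com/api/model/{MODEL_CODE}/predict",  # assumed URL
    headers={
        "Authorization": f"Bearer {TOKEN}",
        "Content-Type": "application/json",
    },
    json={"data": {"predict_values": [[
        {"alias": "Lead Source", "selectedValue": "bing"},      # illustrative values
        {"alias": "Lead Origin", "selectedValue": "API"},
    ]]}},
)
print(response.status_code, response.json())
```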
Companies often spend a lot of time managing items/entities that contribute little to the profit margin. Not every item/entity in your shop has equal value - some cost more, some are used more frequently, and some are both. This is where ABC analysis steps in, helping companies focus on the right items/entities.
ABC analysis is a classification method in which items/entities are divided into three categories, A, B, and C.
Category A is typically the smallest category and consists of the most important items/entities ('the vital few'),
while category C is the largest category and consists of least valuable items/entities ('the trivial many').
So far, this is the simplest model: only 2 columns are needed in your dataset.
You have to identify:
the ID column
the numeric column
An ID column in your dataset is usually a Product ID or name, SKU, etc. Based on the selected values, the data will be grouped by that column.
After that, you have to select the numeric column (feature) which represents the value of the ID column (for example, product/customer revenue or the number of sold units,...).
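As a rough sketch of the computation (the 70%/90% cut-offs are illustrative, not Graphite Note's documented thresholds, and the file and column names are hypothetical):

```python
# ABC split by cumulative value share (illustrative thresholds).
import pandas as pd

df = pd.read_csv("products.csv")                 # columns: ProductID, Revenue
df = df.sort_values("Revenue", ascending=False)
share = df["Revenue"].cumsum() / df["Revenue"].sum()

# Items covering the first ~70% of value -> A, next ~20% -> B, rest -> C.
df["Category"] = pd.cut(share, bins=[0, 0.7, 0.9, 1.0], labels=["A", "B", "C"])
print(df.groupby("Category")["Revenue"].agg(["count", "sum"]))
```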
If we take a look at the ABC Summary Tab, we can see two pie charts - on the first one we can see the percentage of items in each category, while on the other one, we can see the total value (revenue) of each category.
In the picture above, we can see that 31.99% of the items belong to category A and they represent 68.88% of the total value, meaning the biggest profit comes from the items in category A!
Finally, you have all the information on each entity in a table on the Details Tab.
There is a long list of benefits from including ABC analysis in your business, such as improved inventory optimization and forecasting, reduced storage expenses, strategic pricing of the products, etc. With Graphite, all you have to do is upload your data, create the desired model, and explore the results.
JSON response structures for various models
Binary classification, Regression, Multiclass classification
The following models have similar JSON structure: Binary classification, Logistic regression, Multiclass classification
The JSON structure consists of a root object with a key-value pair, where the key is "data" and the value is an object containing two keys: "columns" and "data".
"data"
: This key maps to an array of data objects. Each data object within the array represents a specific entry or prediction result. In this example, there are two data objects.
Each data object contains key-value pairs representing the column names and their corresponding values. For example, the first data object has the values "NO", "API", "bing", 0.935, 0.065, and "1" for the keys "Label", "Lead Origin", "Lead Source", "Score_NO", "Score_YES", and "Total Time Spent on Website", respectively.
The second data object follows a similar pattern, with different values for each key.
"columns"
: This key maps to an array of column names. In this example, the array contains three column names: "Total Time Spent on Website", "Lead Origin", and "Lead Source". These column names define the fields or attributes associated with each data entry.
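Assembled from that description, the response looks roughly like this (only the first data object is shown; the second follows the same pattern with different values):

```json
{
  "data": {
    "columns": ["Total Time Spent on Website", "Lead Origin", "Lead Source"],
    "data": [
      {
        "Label": "NO",
        "Lead Origin": "API",
        "Lead Source": "bing",
        "Score_NO": 0.935,
        "Score_YES": 0.065,
        "Total Time Spent on Website": "1"
      }
    ]
  }
}
```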
Timeseries model
The JSON structure consists of a root object with a key-value pair, where the key is "data" and the value is an array. The array contains three elements representing different pieces of information. The last element in the data array (sequenceID) is directly related to the "sequenceID" sent in the request, and the number of other elements depends on the date range sent in the request.
"date"
and "predicted"
: The first two elements within the "data" array represent specific dates and their corresponding predicted values. Each element is an object with two key-value pairs: "date" and "predicted". The "date" represents a specific date and time in ISO 8601 format, and the "predicted" holds the corresponding predicted value for that date. In the given example, the predicted values for the dates "2023-04-17T00:00:00.000Z" and "2023-04-18T00:00:00.000Z" are 40.4385672276 and 41.1831442568, respectively.
"sequenceID"
: The third element within the "data" array represents the sequence ID. It is an object with a single key-value pair: "sequenceID" and its corresponding value. In this example, the sequence ID is represented as "A". If your dataset includes multiple time series sequences, you should choose a field that uniquely identifies each sequence (e.g., product ID, store ID, etc.). This will allow Graphite Note to generate independent forecasts for each individual time series. We don't allow fields with too many sequences (unique values) here.
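Putting the pieces from the description together, the response looks roughly like this:

```json
{
  "data": [
    { "date": "2023-04-17T00:00:00.000Z", "predicted": 40.4385672276 },
    { "date": "2023-04-18T00:00:00.000Z", "predicted": 41.1831442568 },
    { "sequenceID": "A" }
  ]
}
```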
This JSON structure is used to convey the predicted values for different dates in a Timeseries model. Each date is associated with its predicted value, and the sequence ID provides additional context or identification for the timeseries data.
Now we move to the tricky part: data preprocessing! We will rarely come across high-quality data - for the model to give the best possible results, we must do some data cleaning and transformation. What to do with the missing values? You can either remove them or replace them with a corresponding value, such as the mean value or a prediction. For example, suppose you have chosen Age and Height as numeric columns. The values of the variable Age range between 10 and 80, while Height is between 100 and 210. The algorithm can give more importance to the Height variable because it has higher values than Age - in case you decide to transform/scale your data, you can either standardize or normalize it.
Let's see how to interpret the results after we have run our model. The results consist of 5 tabs: Cluster Summary, By Cluster, By Numeric Value, Cluster Visualization, and Details Tabs.
Now we will go through the Model Results, which consist of 3 tabs: Cohorts, Repeat by, and Details.
We are going to see different metrics such as the No of Customers, the Percentage, the Amount, and the Amount (Cumulative), but there are 3 more metrics: the Average Order Value, the Cumulative Average Order Value, and the Average Revenue per Customer.
Name your notebook. Additionally, you can add a description to your notebook. Also, you can select an existing tag (to connect your notebook with datasets and models) or create a new one.
You can now easily add and delete different text and visualization blocks. If you are not satisfied with a block's position, you can easily move it. To speed things up, you can even clone each block. Your first Notebook is created and ready for exploring.
The RFM model identifies customers based on three key factors:
As we now know how to run the RFM model in Graphite Note, let's go through the Model Results. The results consist of 7 tabs: RFM Scores, RFM Analysis, Recency, Frequency, Monetary, RFM Matrix, and Details. All results are visualized, because a visual summary of information makes it easier to identify patterns than looking through thousands of rows.
RFM model analysis ranks every customer in each of these three categories on a scale of 0 (worst) to 4 (best). After that, we assign an RFM score to each customer by concatenating their numbers for Recency, Frequency, and Monetary value. Depending upon their RFM score, customers can be segregated into the following categories:
In the Frequency Tab, you can track the same behavior of the related groups as in the Recency Tab.
In the Monetary Tab, you can track the same behavior of the related groups as in the Recency Tab.
The RFM Matrix Tab represents a matrix showing the number of customers, monetary sum and average, average frequency, and average recency (with a breakdown by Recency, Frequency, and Monetary segments).
The model results consist of 4 tabs: New vs Returning, Revenue New vs Returning, Retention %, and Details.
The results in the Revenue New vs Returning Tab depend on the Model Scenario: if you have selected a monetary variable in the Model Scenario, you can observe its behavior for new and returning customers.
Since ABC inventory analysis divides items into 3 categories, let's analyze these categories by checking the Model Results. The results consist of 3 tabs: ABC Summary, Pareto Chart, and Details Tabs.
ABC analysis, also called Pareto analysis, is based on the Pareto principle, which says that 80% of the results (output) come from 20% of the efforts (input). The Pareto Chart is a combination of a bar and a line graph: each bar represents an item/entity in descending order, the height of the bar represents the value of the item/entity, and the curved orange line represents the cumulative percentage of the items/entities.
Graphite enforces rate limits for API requests to ensure fair usage and prevent abuse. The system utilizes two levels of rate limiting: global and tenant-specific.
The system monitors overall API traffic to count the number of requests made within the last minute. If this count exceeds the configured global rate limit, further API requests are denied.
Additionally, the system tracks API usage on a per-tenant basis. If the count of API requests made by the current tenant within the last minute surpasses the specified rate limit, further API requests are denied.
The rate limits are configured as follows:
Note: Rate Limit Exceeded
When the API rate limit is reached, the system will deny further requests, and the API response should include the HTTP status code 429 (Too Many Requests). This status code indicates that the client has made too many requests within a specified time frame. It is essential for clients to handle this response code gracefully by adjusting their request frequency or implementing backoff strategies.
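A minimal sketch of such a backoff strategy, assuming Python's requests library (the function name and retry parameters are illustrative, not part of the Graphite Note API):

```python
# Exponential backoff for HTTP 429 responses (illustrative helper).
import time
import requests

def post_with_backoff(url, max_retries=5, **kwargs):
    delay = 1
    response = None
    for _ in range(max_retries):
        response = requests.post(url, **kwargs)
        if response.status_code != 429:   # not rate-limited: return immediately
            return response
        time.sleep(delay)                 # wait before retrying
        delay *= 2                        # double the wait on each attempt
    return response                       # still rate-limited after all retries
```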
JSON request structures for various models
Binary classification, Regression, Multiclass classification
The following models have similar JSON structure: Binary classification, Logistic regression, Multiclass classification
The JSON structure represents a data object with a key "data" mapping to an array called "predict_values". The "predict_values" array contains multiple elements, each representing a set of data. Each set of data is represented as an array of objects. Each object within the array represents a key-value pair, where the required "alias" is the key and the required "selectedValue" is the corresponding value. The keys "Lead Source", "Lead Origin", and "Converted" are common in each object, but their values differ, representing different attributes or properties of the data.
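As a sketch assembled from that description (the selected values are illustrative), a request body might look like this:

```json
{
  "data": {
    "predict_values": [
      [
        { "alias": "Lead Source", "selectedValue": "bing" },
        { "alias": "Lead Origin", "selectedValue": "API" },
        { "alias": "Converted", "selectedValue": "NO" }
      ]
    ]
  }
}
```

Timeseries model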
JSON structure represents a data object with a key "data" mapping to an object called "predict_values". The "predict_values" object contains three key-value pairs. The required keys are "startDate", "endDate", and "sequenceID", and their corresponding values are "2023-04-17", "2023-04-18", and "A" respectively.
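Using the example values above, that request body looks like this:

```json
{
  "data": {
    "predict_values": {
      "startDate": "2023-04-17",
      "endDate": "2023-04-18",
      "sequenceID": "A"
    }
  }
}
```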
Receiving a response
Upon successful execution of the API request, you will receive a response containing the prediction results or any relevant information based on the model and data provided. The format and structure of the response will vary depending on the specific model and endpoint used.
This is a Timeseries period prediction example:
The response contains a single key-value pair:
"data"
: The key "data"
maps to an array of objects that represents the prediction results or relevant information.
Make sure to handle the response appropriately in your code to process the prediction results or handle any potential errors returned by the API. More about the structure in the next section.
Necessary information to make a request to an API endpoint for the Graphite Note application
The base URL for the API endpoint is:
Replace [model-code] in the URL with the code of the specific model you want to use for predictions.
The easiest way to get the [model-code] is to edit the particular model on the model list page. The model code is located in the ID section of the model's Settings tab.
The request requires the following headers to be included:
Authorization: This header should be set to "Bearer [token]". Replace [token] with your unique token. The token can be found by accessing the account info page in the Graphite Note app, under the section displaying your current plan information.
Content-Type: This header should be set to "application/json" to indicate that the request payload is in JSON format.
We have prepared 12 distinct demo datasets that you can upload and use in your Graphite Note account. These demo datasets include dummy data from a variety of business scenarios and serve as an excellent starting point for building and running your initial Graphite Note machine learning models. Instructions on how to proceed with the demo datasets are provided on the following pages.
Create a Regression model on Demo Housing Prices dataset
1. If you want to use Graphite Note demo datasets click "Import DEMO Dataset".
2. Select the dataset you want to use to create a machine learning model. In this case, we will select the Housing-Prices dataset to create a Regression Analysis on house price historical data.
3. Once selected, the demo dataset will load directly to your account. The Dataset view will automatically open.
4. Adjust your dataset options on the Settings tab. Click Columns tab to view list of available columns with their corresponding data types. Explore the dataset details on the Summary tab.
5. To create a new model in the Graphite Note main menu click on "Models"
6. You will get a list of available models. Click on "New Model" to create a new one.
7. Select the model type from our templates. In our case, we will select "Regression" by double clicking on its name.
8. Select the dataset you want to use to produce the model. We will use "Demo-Housing-Prices.csv"
9. Name your new model. We will call it "Regression on Demo-Housing-Prices"
10. Write the description of the model and select a tag. If you want to, you can also create a new tag from the pop-up "Tags" window that will appear on the screen.
11. Click "Create" to create your demo model environment.
12. To set up a Regression Model, firstly, you need to define the "Target Feature". That is a numeric column from your dataset that you'd like to make predictions about. In the case of Regression on Demo Housing Prices, the dataset target feature is "Price" column.
13. Click "Next" to get the list of model features that will be included into scenario. Model relies upon each column (feature) to make accurate predictions. When training model we will calculate which of the features are most important and behave as Key Drivers.
14. To start training the model, click "Run scenario". This will take a sample of 80% of your data and train several machine learning models.
15. Wait a few moments and voilà! Your Regression model is trained. Click on the "Performance" tab to get model insights and view the Key Drivers.
16. Explore the Regression Model by clicking on Impact Analysis, Model Fit, and Training Results to get more insight into how the model is trained and set up.
17. If you want to take your model into action, click on the "Predict" tab in the main model menu.
18. You can produce your own What-If analysis based on existing training results. You can also import a fresh CSV dataset with data the model will use to make predictions on the target column. In our case, that is "Price". Keep in mind, the dataset you are uploading needs to contain the same feature columns as your model.
19. Use your model often to predict future behaviour, and to learn which key drivers are impacting the outcomes. The more you use and retrain your model, the smarter it becomes!
Create General Segmentation on Demo Mall Customers dataset
1. If you want to use Graphite Note demo datasets click "Import DEMO Dataset".
2. Select the dataset you want to use to create your machine learning model. In this case, we will select the "Mall Customers" dataset to create a General Segmentation analysis on customer engagement data.
3. Once selected, the demo dataset will load directly to your account. The dataset view will automatically open.
4. Adjust your dataset options on the Settings tab. Click the Columns tab to view the list of available columns, with their corresponding data types. Explore the dataset details on Summary tab.
5. To create a new model in the Graphite Note main menu, click on "Models"
6. You will get a list of available models. Click on "New Model" to create a new one.
7. Select the model type from our templates. In our case, we will select "General Segmentation" by double clicking on its name.
8. Select the dataset you want to use to produce the model. We will use "Demo-Mall-Customers.csv"
9. Name your new model. We will call it "General Segmentation on Demo-Mall-Customers".
10. Write a description of the model and select a tag. If you want to, you can also create new tag from the pop-up "Tags" window that will appear on the screen.
11. Click "Create" to create your demo model environment.
12. To set up your General Segmentation model, you first need to define the "Feature" columns. These are the numeric column (or columns) from your dataset on which segmentation will be based. In the case of General Segmentation on the Mall Customers dataset, the numeric features will be the "Age", "Annual Income", and "Spending Score" columns.
13. To start training the model click "Run Scenario". This will train your model based on the uploaded dataset.
14. Wait a few moments and voilà! Your General Segmentation model is trained. Click on the "Results" tab to get model insights and explore segmentation clusters.
15. Navigate over different tabs to get insights from high level "Cluster Summary" to "By Cluster" charts and tables.
16. Use "Cluster Visualisations" to view the scatter plot visualisations of cluster members.
Predictive Ads Performance is a process where businesses forecast the effectiveness of their advertising campaigns, particularly focusing on metrics like clicks, conversions, or engagement. This task typically involves regression or classification models, depending on the specific goals of the prediction.
Dataset Essentials for Predictive Ads Performance
A comprehensive dataset for Predictive Ads Performance focusing on predicting clicks should include:
Date/Time: The timestamp for when the ad was run.
Ad Characteristics: Details about the ad, such as format, content, placement, and duration.
Target Audience: Information about the audience targeted by the ad, like demographics, interests, or behaviors.
Spending: The amount spent on each ad campaign.
External Factors: Any external factors that might influence ad performance, such as market trends or seasonal events.
Historical Performance Data: Past performance metrics of similar ads.
An example dataset for Predictive Ads Performance with the target column being clicks might look like this:
| Date | Ad ID | Ad Format | Target Audience | Spending | Market Trend | Seasonal Event | Clicks |
| --- | --- | --- | --- | --- | --- | --- | --- |
| 2021-01-01 | A101 | Video | 18-25 | $500 | Stable | New Year | 300 |
| 2021-01-08 | A102 | Image | 26-35 | $750 | Growing | None | 450 |
| 2021-01-15 | A103 | Banner | 36-45 | $600 | Declining | None | 350 |
| 2021-01-22 | A104 | Video | 46-55 | $800 | Stable | None | 500 |
| 2021-01-29 | A105 | Image | 18-25 | $700 | Growing | None | 600 |
Target Column: The Clicks column is the primary focus, as the model aims to forecast the number of clicks each ad will receive.
Steps to Success with Graphite Note
Data Collection: Compile detailed data on past ad campaigns, including spending, audience, and performance metrics.
Feature Engineering: Identify and create features that are most indicative of ad performance.
Model Training: Use Graphite Note's Regression Model to train a model that can predict the number of clicks based on the ad characteristics and other factors (a rough sketch of this task appears after this list).
Model Evaluation: Test the model's accuracy and refine it for better performance.
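For readers who want to see the shape of the task outside the platform, here is a hedged sketch; the model choice, file, and column names are illustrative, not a description of Graphite Note's internals:

```python
# Minimal click-prediction sketch (hypothetical "ads_performance.csv").
import pandas as pd
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import train_test_split

df = pd.read_csv("ads_performance.csv")
X = pd.get_dummies(df.drop(columns=["Clicks"]))   # one-hot encode categoricals
y = df["Clicks"]                                  # the target column

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)
model = GradientBoostingRegressor().fit(X_train, y_train)
print("R^2 on the held-out 20%:", model.score(X_test, y_test))
```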
Benefits of Predictive Ads Performance
Optimized Ad Spending: Predict which ads are likely to perform best and allocate budget accordingly.
Targeted Campaigns: Tailor ads to the audience segments most likely to engage.
Performance Insights: Gain insights into what makes an ad successful and apply these learnings to future campaigns.
Accessible Analytics: Graphite Note's no-code platform makes predictive analytics accessible, enabling businesses to leverage AI for ad performance prediction without needing deep technical expertise.
In summary, Predictive Ads Performance is a valuable tool for businesses looking to maximize the impact of their advertising efforts. With Graphite Note, this advanced capability becomes accessible, allowing for data-driven decisions in ad campaign management.
The Predict Cross Selling problem is a common challenge faced by businesses looking to maximize their sales opportunities by identifying additional products or services that a customer is likely to purchase. This predictive model falls under the multi-class classification category, where the objective is to predict the likelihood of a customer buying various products, based on their past purchasing behavior and other relevant data.
Dataset Essentials for Cross Selling
To effectively train a machine learning model for cross selling, you need a well-structured dataset that includes:
Customer Demographics: Information like age, gender, and income, which can influence purchasing decisions.
Purchase History: Detailed records of past purchases, indicating which products a customer has bought.
Engagement Metrics: Data on customer interactions with marketing campaigns, website visits, and other engagement indicators.
Product Details: Information about the products, such as category, price, and any special features.
A typical dataset might look like this:
| Customer ID | Age | Gender | Income | Purchased Product1 | Purchased Product2 | Purchased Product3 | ... | Target_Product |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 1001 | 28 | M | 50000 | Yes | No | Yes | ... | Product2 |
| 1002 | 34 | F | 65000 | No | Yes | No | ... | Product3 |
| 1003 | 45 | M | 80000 | Yes | Yes | Yes | ... | Product4 |
| 1004 | 30 | F | 54000 | No | No | Yes | ... | Product1 |
| 1005 | 50 | M | 62000 | Yes | No | No | ... | Product2 |
Target Column: The Target_Product column is crucial as it represents the product that the model will predict the customer is most likely to purchase next.
Steps to Success with Graphite Note
Data Collection: Gather comprehensive, clean, and well-structured data.
Feature Selection: Identify the most relevant features that could influence the model's predictions.
Model Training: Utilize Graphite Note's intuitive platform to train your multi-class classification model (a rough sketch of the task appears after this list).
Evaluation and Iteration: Continuously assess and refine the model for better accuracy and relevance.
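As a hedged sketch of the multi-class setup outside the platform (file and feature names are illustrative; the target is the Target_Product column from the table above):

```python
# Minimal multi-class cross-sell sketch (hypothetical "cross_sell.csv").
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

df = pd.read_csv("cross_sell.csv")
X = pd.get_dummies(df.drop(columns=["Target_Product"]))
y = df["Target_Product"]                    # one class per candidate product

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)
clf = RandomForestClassifier().fit(X_train, y_train)
print(clf.predict(X_test[:3]))              # most likely next product for 3 customers
```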
The Advantage of Predict Cross Selling
Enhanced Customer Experience: By understanding customer preferences, businesses can offer more personalized recommendations.
Increased Sales Opportunities: Identifying potential cross-sell products can significantly boost sales.
Data-Driven Decision Making: Removes guesswork from marketing and sales strategies, relying on data-driven insights.
Accessibility: With Graphite Note, even non-technical users can build and deploy these models, making advanced analytics accessible to all.
In conclusion, the Predict Cross Selling model is a powerful tool in the arsenal of any business looking to enhance its sales strategy. With Graphite Note, this complex task becomes manageable, allowing businesses to leverage their data for maximum impact.
There are plenty of free sources to find free datasets for machine learning.
Here is a list of some of the most popular ones.
For each dataset, it is necessary to determine its quality. Several characteristics describe high-quality data, but it is essential to point out accuracy, reliability, and completeness. High-quality data should be precise and error-free; otherwise, your data is misleading and inefficient. If your data is not complete, it is harder to use because of the lack of information. And what if your data is ambiguous or vague? Then you cannot trust it; it's unreliable.
By googling terms like free datasets for machine learning, time-series dataset, classification dataset, etc., you will see many links to different sources. But which of them include high-quality data? We will list a few sources, but it is essential to know that some of them include data with drawbacks. Therefore, you have to be familiar with the characteristics of a good dataset.
Kaggle is a big data science competition platform for predictive modeling and analytics. There are plenty of datasets you can use to learn artificial intelligence and machine learning. Most of the data is accurate and referenced, so you can test or improve your skills or even work on projects that could help people.
Each dataset has its usability score and description. Within the dataset, there are various tabs such as Tasks, Code, Discussions, etc. Most datasets are related to different projects, so you can find other trained and tested models on the same datasets. On Kaggle, you can find a big community of data analysts, data scientists, and machine learning engineers who can evaluate your work and give you valuable tips for further development.
The UCI Machine Learning Repository is a database of high-quality and real-world datasets for machine learning algorithms. Datasets are well known in terms of exciting properties and expected good results; they can be an example of valuable baselines for comparisons. On the other hand, the datasets are small and already pre-processed.
GitHub is one of the world's largest communities of developers. The primary purpose of GitHub is to be a code repository service. In most cases, a project includes the datasets its code is applied to; you will need to spend a little more time to find the dataset you want, but it will be worth it.
data.world is a large data community where people discover data and share analysis. Inside almost every project, there are some available datasets. When searching, you must be very precise to get the desired results.
Of course, there are many more sources, depending on your need. For example, if you need economic and financial datasets, you can visit World Bank Open Data, Yahoo Finance, EU Open Data Portal, etc.
Once you have found your dataset, it’s Graphite time; run several models and create various reports using visualizations and tables. With Graphite, it's easier to make business decisions. Maybe you are just a few clicks away from the turning point of your career.
This section of the Graphite Note user documentation will guide you through the process of merging multiple datasets into one.
Merging datasets allows you to combine data from different sources or related data for more comprehensive analysis.
To begin the process, navigate to the main menu and select the "Merge Dataset" option. This will open a new window where you can start the merging process.
In the new window, you will see fields to enter the name and description of your new merged dataset. This helps you identify the purpose of the merged dataset for future reference. You can also add optional tags to further categorize your dataset.
Next, you will select the first dataset you want to merge from the dropdown menu. Repeat this step to select the second dataset.
After selecting your datasets, choose the type of join you want to perform: inner, left, right, or outer. The type of join determines how the datasets are combined based on the values in the key columns.
Then, select the key columns on which to merge the datasets. These are the columns that the datasets have in common and will be used to align the data.
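For intuition about how the four join types behave, here is a small pandas sketch as a stand-in for the merge Graphite Note performs (toy data, hypothetical column names):

```python
# How inner/left/right/outer joins differ on a shared key column.
import pandas as pd

customers = pd.DataFrame({"CustomerID": [1, 2, 3], "Name": ["Ana", "Ben", "Cara"]})
orders    = pd.DataFrame({"CustomerID": [2, 3, 4], "Amount": [100, 250, 80]})

# inner keeps only matching keys; left/right keep all rows from one side;
# outer keeps every row from both sides, filling gaps with NaN.
for how in ["inner", "left", "right", "outer"]:
    merged = customers.merge(orders, on="CustomerID", how=how)
    print(how, "->", len(merged), "rows")
```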
Now, you will choose which columns you want to include in your new merged dataset. You can select columns from either or both of the original datasets.
Once you've selected your columns, you can use the "Test This Merge" button to preview the merged rows. This allows you to check that the datasets are merging as expected before finalizing the process.
If you're happy with the preview of the merged dataset, click the "Create" button to finalize the merge. Your new merged dataset will now be available for use in your Graphite Note projects.
Remember, merging datasets is a powerful tool for combining and analyzing data in Graphite Note. By following these steps, you can easily merge datasets to gain new insights from your data.