Binary Classification
With the Binary Classification model, you can analyze feature importance for a binary target column, that is, a column with two distinct values. The model also predicts likely outcomes based on the feature values you provide. This page covers the basics of the Model Scenario, where you select parameters related to your dataset and the model itself.
To run the scenario, you need to have a Target Feature, which must be a binary column. This means it should contain only two distinct values, such as Yes/No or 1/0.
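If you want to check a column before selecting it as the Target Feature, the rule above can be sketched in a few lines of Python. This is only an illustration, not Graphite Note's own validation: the function name `is_binary_target` and the choice to ignore empty values are assumptions.

```python
def is_binary_target(values):
    """Return True if the column has exactly two distinct non-missing values.

    Assumption: None and empty strings count as missing and are ignored.
    """
    distinct = {v for v in values if v is not None and v != ""}
    return len(distinct) == 2


# Hypothetical example columns
churn = ["Yes", "No", "No", "Yes", "No"]              # two distinct values
plan = ["Basic", "Pro", "Enterprise", "Pro", "Basic"]  # three distinct values

print(is_binary_target(churn))  # suitable as a binary target
print(is_binary_target(plan))   # not suitable
```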
In the next step, select the Model Features you wish to analyze. All features that fit into the model are selected by default, but you may deselect any features you do not want to use. Graphite Note automatically preprocesses your data for model training, excluding features that are unsuitable. You can view the list of excluded features and the reasons for their exclusion on the right side of the screen.
The Advanced Parameters step in model creation allows users to fine-tune their model settings, enabling behavior similar to how a data scientist would approach the task. These parameters are designed for advanced customization, but for most users, it is recommended to leave the default settings as they are to ensure optimal performance.
Users can explore and adjust these parameters to tailor the model to specific needs. For detailed explanations of the different advanced parameter settings, refer to the Advanced Parameters section.
The Generate Actionable Insights section allows users to enable the automatic generation of actionable insights based on model predictions, enhanced with the capabilities of generative AI. The insights are generated in the language specified on the User profile information page under the AI Generated Content Language settings.
You can activate this feature by checking the Generate Actionable Insights box. Once enabled, the system will use model predictions to create insights tailored to your needs.
Specify the primary objective of the analytics by completing the Goal field. This includes choosing an action (e.g., “Increase” or “Decrease”) and the specific metric or outcome (e.g., “Churn”). These inputs guide the insights generation process.
Additional Context is an optional field to provide extra details about your business, target audience, or specific focus areas. Examples might include demographics (e.g., focusing on age group 25-35) or market focus (e.g., targeting the European market). This helps align the generated insights with your business narrative.
Moving forward, you'll see a comprehensive list of preprocessing steps that Graphite Note will apply to prepare your data for training. This enhances data quality, helping your model produce accurate results. Typically, these steps are performed by data scientists, but with our no-code machine learning platform, Graphite Note handles them for you. After reviewing the preprocessing steps, click Run Scenario to finish the setup and start model training.
The training duration may vary depending on the data volume, typically ranging from 1 to 10 minutes. The training will utilize 80% of the data to train various machine learning models and the remaining 20% to test these models and calculate relevant scores. Once completed, you will receive information about the best model based on the F1 value and details about training time.
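To make the 80/20 split concrete, here is a minimal sketch in plain Python. It assumes a simple random shuffle with a fixed seed; how Graphite Note actually shuffles or stratifies the data is not described here, and the function name `train_test_split` is only illustrative.

```python
import random


def train_test_split(rows, test_fraction=0.2, seed=42):
    """Shuffle the rows and hold out a fraction of them for testing."""
    shuffled = list(rows)                  # copy so the input is not modified
    random.Random(seed).shuffle(shuffled)  # seeded for reproducibility
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


rows = list(range(100))  # stand-in for 100 dataset rows
train_rows, test_rows = train_test_split(rows)
print(len(train_rows), len(test_rows))
```

The models are fitted on the training rows only, and the scores are calculated on the held-out test rows, so the metrics reflect data the models have not seen.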
To interpret the results after running your model, go to the Performance tab. Here, you can see the overall model performance post-training. Model evaluation metrics such as F1 Score, Accuracy, AUC, Precision, and Recall are displayed to assess the performance of classification models. Details on model metrics can also be found on the Accuracy Overview tab.
On the Performance tab, you can explore seven different views that provide insights related to model training and results: Overview, Key Drivers, Impact Analysis, Model Fit, Accuracy Overview, Training Results, and Details.
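As a reference for how these metrics relate to each other, the sketch below computes Accuracy, Precision, Recall, and F1 Score from the four prediction counts using their standard definitions. The counts are made-up illustrative numbers, and AUC is left out because it requires predicted probabilities rather than counts.

```python
def classification_metrics(tp, fp, tn, fn):
    """Standard binary classification metrics from confusion-matrix counts."""
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical test-set counts, for illustration only
metrics = classification_metrics(tp=40, fp=10, tn=45, fn=5)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```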
Key Drivers indicate the importance of each column (feature) for the model's predictions. The more the model relies on a feature, the more important that feature is. Graphite Note uses permutation feature importance to determine these values.
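Permutation feature importance works by shuffling the values of one feature at a time and measuring how much the model's score drops. The sketch below shows the general technique under stated assumptions: it uses accuracy as the score, a hypothetical rule-based `toy_model` in place of a trained model, and invented feature names. It is not Graphite Note's implementation.

```python
import random


def accuracy(model, rows, labels):
    """Share of rows where the model's prediction matches the label."""
    correct = sum(model(r) == y for r, y in zip(rows, labels))
    return correct / len(rows)


def permutation_importance(model, rows, labels, n_repeats=10, seed=0):
    """Average drop in accuracy after shuffling each feature column."""
    rng = random.Random(seed)
    baseline = accuracy(model, rows, labels)
    importances = {}
    for feature in rows[0]:
        drops = []
        for _ in range(n_repeats):
            column = [r[feature] for r in rows]
            rng.shuffle(column)  # break the link between feature and label
            permuted = [{**r, feature: v} for r, v in zip(rows, column)]
            drops.append(baseline - accuracy(model, permuted, labels))
        importances[feature] = sum(drops) / n_repeats
    return importances


# Hypothetical stand-in model: predicts churn (1) for short-tenure customers
def toy_model(row):
    return 1 if row["tenure"] < 12 else 0


pairs = [(1, 5), (3, 2), (24, 7), (36, 1), (6, 9), (48, 4), (2, 3), (60, 8)]
rows = [{"tenure": t, "noise": n} for t, n in pairs]
labels = [toy_model(r) for r in rows]

importances = permutation_importance(toy_model, rows, labels)
print(importances)
```

Because the stand-in model ignores the `noise` column, shuffling it changes nothing and its importance is zero, while shuffling `tenure` lowers accuracy.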
The Impact Analysis tab allows you to select various features and analyze, using a bar chart, how changes in each feature affect the target feature. You can switch between Count and Percentage views.
The Model Fit tab displays the performance of the trained model. It includes a stacked bar chart with percentages showing correct and incorrect predictions for each of the two binary values (for example, 1 or 0, Yes or No).
The Accuracy Overview tab features a Confusion Matrix that highlights classification errors, making it simple to identify whether the model is confusing the two classes. For each class, it summarizes the number of correct and incorrect predictions. Find out more about the Classification Confusion Matrix in our Understanding ML section.
On the Accuracy Overview tab, you'll find detailed information on correct and incorrect predictions (True positives and negatives / False positives and negatives). Model metrics are explained at the bottom of the section.
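The four cells of a binary confusion matrix can be counted directly from actual and predicted labels. The sketch below assumes "Yes" is the positive class and uses invented labels; the function name `confusion_counts` is only illustrative.

```python
from collections import Counter


def confusion_counts(actual, predicted, positive="Yes"):
    """Count true/false positives and negatives for a binary target."""
    counts = Counter()
    for a, p in zip(actual, predicted):
        if p == positive:
            counts["TP" if a == positive else "FP"] += 1
        else:
            counts["FN" if a == positive else "TN"] += 1
    return counts


# Hypothetical labels for six test rows
actual = ["Yes", "No", "Yes", "No", "No", "Yes"]
predicted = ["Yes", "No", "No", "Yes", "No", "Yes"]

counts = confusion_counts(actual, predicted)
print(dict(counts))
```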
In the Training Results tab, you will find information about all the models automatically considered during the training process. Graphite Note runs several machine learning algorithms suitable for binary classification problems, using 80% of the data for training and 20% for testing. The best model, based on the F1 score, is chosen and marked in green in the models list.
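Choosing the best model by F1 score amounts to taking the maximum over the candidates' test scores. In the sketch below, the model names and scores are placeholders invented for illustration; they are not actual Graphite Note results or algorithm names.

```python
# Placeholder candidates and test-set F1 scores (illustrative only)
results = [
    {"model": "candidate_a", "f1": 0.81},
    {"model": "candidate_b", "f1": 0.86},
    {"model": "candidate_c", "f1": 0.78},
]

# The candidate with the highest F1 score is selected
best = max(results, key=lambda r: r["f1"])
print(best["model"])
```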
The Details tab shows the results of the predictive model in a table format. Each record includes the predicted label, predicted probability, and predicted correctness, offering insights into the model's predictions, confidence, and accuracy for each data point. Dataset test results can be exported to Excel by clicking the XLSX button in the right corner.
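To illustrate how the three columns relate, the sketch below builds one such record from an actual label and a predicted probability. It assumes a 0.5 decision threshold and that the reported probability is the one for the predicted label; both are assumptions, and the field names are hypothetical rather than the exact column names in the export.

```python
def detail_row(actual, probability_yes, threshold=0.5):
    """Build one record: predicted label, its probability, and correctness."""
    predicted = "Yes" if probability_yes >= threshold else "No"
    # Assumption: report the probability of the predicted label
    probability = probability_yes if predicted == "Yes" else 1 - probability_yes
    return {
        "actual": actual,
        "predicted_label": predicted,
        "predicted_probability": round(probability, 2),
        "correct": predicted == actual,
    }


print(detail_row("Yes", 0.8))
print(detail_row("Yes", 0.3))
```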
Once the model is trained, you can use it to predict future values, solve binary classification problems, and drive business decisions. Here are ways to take action with your Binary Classification model:
The Actionable Insights Screen displays the results of the inputs provided earlier in the Generate Actionable Insights section. If actionable insights were enabled, this screen delivers targeted recommendations based on the model’s predictions and analysis. This screen empowers users to act on data-driven insights, fostering strategic decision-making tailored to the specific factors influencing the target outcome.
Overview of Insights: Provides a summary of key drivers influencing the target outcome (e.g., churn reduction). These insights help businesses understand factors contributing to customer behavior and retention.
Insights for Main Attributes: Detailed actionable insights are provided for the most impactful attributes (key drivers), such as tenure or TotalCharges. Each attribute includes: Feature Importance, Distribution Analysis and Actionable Recommendations suggesting specific actions to address identified trends.
Strategic Recommendations: Insights are framed to guide business strategies, focusing on targeted engagement, customer retention initiatives, and the optimization of key processes.
The Predict Screen allows users to generate predictions based on their trained model by providing input values for specific features. This screen bridges the gap between model insights and actionable application, enabling users to explore hypothetical situations or process large-scale predictions efficiently.
What-If Scenario Predictions: You can manually input values for relevant features (e.g., tenure, TotalCharges, InternetService) to simulate specific scenarios. Once the values are entered, clicking the Predict button provides the predicted outcome along with a probability score for each possible result (e.g., Churn is Yes or Churn is No). Results are displayed as probability scores, giving users insights into the likelihood of different outcomes based on the input features.
CSV File or Dataset Predictions: You can upload a CSV file containing data for multiple observations to generate predictions in bulk, or use an existing dataset from Graphite Note for batch predictions.
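Conceptually, both the what-if and the bulk options apply the same trained model, once for a single set of inputs or once per CSV row. The sketch below shows that idea with a hypothetical scoring function: the logistic formula and its coefficients are invented stand-ins for a trained model, and the column names follow the churn example used on this page.

```python
import csv
import io
import math


def churn_probability(tenure, total_charges):
    """Hypothetical stand-in for a trained model (made-up coefficients)."""
    score = 1.0 - 0.1 * tenure + 0.0002 * total_charges
    return 1 / (1 + math.exp(-score))


# What-if prediction for a single set of input values
p_yes = churn_probability(tenure=2, total_charges=180)
print({"Churn is Yes": round(p_yes, 3), "Churn is No": round(1 - p_yes, 3)})

# Bulk predictions: one result per CSV row (inline text stands in for a file)
csv_text = "customer_id,tenure,TotalCharges\nA1,2,180\nA2,60,2400\n"
predictions = []
for row in csv.DictReader(io.StringIO(csv_text)):
    p = churn_probability(float(row["tenure"]), float(row["TotalCharges"]))
    predictions.append({
        "customer_id": row["customer_id"],
        "Churn is Yes": round(p, 3),
        "Churn is No": round(1 - p, 3),
    })

for prediction in predictions:
    print(prediction)
```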
You can share your prediction results with your team using the Notebook feature. With Notebooks, users can also run their own predictions on your Binary Classification model.
Notebooks allow you to create various visualizations with detailed descriptions. You can plot model results for better understanding and enable users to make their own predictions. For more information, refer to the Data Storytelling section.