A Timeseries Forecast Model is designed to predict future values by analyzing historical time-related data. To utilize this model, your dataset must include both time-based and numerical columns. In this tutorial, we'll cover the fundamentals of the Model Scenario to help you achieve optimal results.
For the Target Column, select a numeric value you want to predict. It's crucial to have values by day, week, or year. If some dates are repeated, you can aggregate them by taking their sum, average, etc.
Next, you can choose a Sequence Identifier Field to group fields and generate an independent time series and forecast forecast for each group. Keep in mind, these values shouldn't be unique; they must form a series and there is maximum of 500 unique values allowed as sequence identifier. If you don't want to generate independent time series for each group, you can leave this option empty.
Then, select the Time/Date Column, specifying the column containing time-related values. The Time Interval represents the data frequency—choose daily for daily data, yearly for annual data, etc. With Forecast Horizon, decide how many days, weeks, or years you want to predict from the last date in your dataset.
The model performs well with seasonal data patterns. If your data shows a linear growth trend, select "additive" for Seasonality Mode; for exponential growth, select "multiplicative." For example, if you see annual patterns, set Yearly Seasonality to True. (TIP: Plotting your data beforehand can help you understand these patterns.) If you're unsure, the model will attempt to detect seasonality automatically.
For daily or hourly intervals, you can access Advanced Parameters to add special dates, weekends, holidays, or limit the target value.
We are constantly enhancing our platform with new features and improving existing models. For your daily data, we've introduced some new capabilities that can significantly boost forecast accuracy. Now, you can limit your target predictions, remove outliers, and include country holidays and special events.
To set prediction limits, enter the minimum and maximum values for your target variable. For example, if you're predicting daily temperatures and know the maximum is 40°C, enter that value to prevent the model from predicting higher temperatures. This helps the model recognize the appropriate range of the Target Column. Additionally, you can use the Remove Days of the Week feature to exclude certain days from your predictions.
We added parameters for country holidays and special dates to improve model accuracy. Large deviations can occur around holidays, where stores see more customers than usual. By informing the model about these holidays, you can achieve more balanced and accurate predictions. To add holidays in Graphite Note, navigate to the advanced section of the Model Scenario and select the relevant country or countries.
Similarly, you can add promotions or events that affect your data by enabling Add special dates option. Enter the promotion name, start date, duration, and future dates. This ensures the model accounts for these events in future predictions.
Combining these parameters provides more accurate results. The more information the model receives, the better the predictions.
In addition to adding holidays and special events, you can delete specific data points from your dataset. In Graphite Note, enter the start and end dates of the period you want to remove. For single-day periods, enter the same start and end date. You can remove multiple periods if necessary. Understanding your data and identifying outliers or irrelevant periods is crucial for accurate predictions. Removing these dates can help eliminate biases and improve model accuracy.
By following these steps, you can harness the full potential of your Timeseries Forecast Model, providing valuable insights and more accurate predictions for your business. Now it's your turn to do some modeling and explore your results!
After setting all parameters it is time to Run Scenario and train Machine Learning model.
The training duration may vary depending on the data volume, typically ranging from 1 to 10 minutes. The training will utilize 80% of the data to train various machine learning models and the remaining 20% to test these models and calculate relevant scores. Once completed, you will receive information about the best model based on the F1 value and details about training time.
To interpret the results after running your model, go to the Performance tab. Here, you can see the overall model performance post-training. Model evaluation metrics such as F1 Score, Accuracy, AUC, Precision, and Recall are displayed to assess the performance of classification models. details on Model metrics can also be found on Accuracy Overview tab.
On the performance tab, you can explore five different views that provide insights related to model training and results: Model Fit, Trend, Seasonality, Special Dates, and Details.
The Model Fit Tab displays a graph with actual and predicted values. The primary prediction is shown with a yellow line, and the uncertainty interval is illustrated with a yellow shaded area. This visualization helps assess the model's performance.
If you used the Sequence Identifier Field, you can choose which value to analyze in each Model Result.
Trends and seasonality are key characteristics of time-series data that should be analyzed. The Trend Tab displays a graph illustrating the global trend that Graphite Note has detected from your historical data.
Seasonality represents the repeating patterns or cycles of behavior over time. Depending on your Time Interval, you can find one or two graphs in the Seasonality Tab. For daily data, one graph shows weekly patterns, while the other shows yearly patterns. For weekly and monthly data, the graph highlights recurring patterns throughout the year.
The Special Dates graph shows the percentage effects of the special dates and holidays in historical and future data.
Details tab shows the results of the predictive model, presented in a table format. Each record includes the predicted label, predicted probability, and predicted correctness, offering insights into the model's predictions, confidence, and accuracy for each data point. Dataset test results can be exporetd into Excel by clicking on the XLSX button in the right corner.
Once the model is trained, you can use it to predict future values, solve binary classification problems, and drive business decisions. Here are ways to take action with your Timeseries forecast model:
After building and analyzing a predictive model using Graphite Note, the "Predict" function allows you to apply the model to new data. This enables you to forecast outcomes or target variables based on different feature combinations, providing actionable insights for decision-making.
You can share your prediction results with your team using the Notebook feature. With Notebooks, users can also run their own predictions on your Timerseries Forecast model.
Notebooks allow you to create various visualizations with detailed descriptions. You can plot model results for better understanding and enable users to make their own predictions. For more information, refer to the Data Storytelling section.