
How to Improve the Accuracy of Facebook Prophet Forecasts

Prophet is an open-source Python library for time series forecasting, developed by Facebook's Core Data Science Team and released in 2017. Compared to models like ARIMA, which assume a linear relationship between past and future values, Prophet's strength is that it automatically detects "changepoints" and adjusts its forecasts accordingly, making it robust to outliers and flexible in handling seasonality and non-linear trends.


Prophet is a statistical model, not a machine learning model. It uses an additive model to capture the key components of a time series: trend, seasonality, and holiday effects. The approach relies on curve fitting rather than iteratively learning parameters. Prophet doesn't inherently use separate training and test datasets, but you can manually split your data: fit the model on the historical portion and evaluate its performance on the held-out portion.
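
As a quick illustration, here is a minimal sketch of that workflow. The file name and the 13-week holdout are placeholder assumptions; Prophet only requires a dataframe with a 'ds' (date) column and a 'y' (value) column.

import pandas as pd
from prophet import Prophet

# Hypothetical file of weekly revenue with 'ds' and 'y' columns.
df = pd.read_csv('weekly_revenue.csv', parse_dates=['ds'])

# Hold out the most recent 13 weeks as a manual "test set".
train, test = df.iloc[:-13], df.iloc[-13:]

model = Prophet()
model.fit(train)

# Forecast 13 weeks past the end of the training data.
future = model.make_future_dataframe(periods=13, freq='W')
forecast = model.predict(future)  # includes yhat, yhat_lower, yhat_upper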


How to Improve Forecasting Accuracy

Oddly enough, one of Prophet's main draws is also one of its core weaknesses: its changepoint handling can result in either underfitting or overfitting.


Here are the methods I used when implementing Prophet for weekly revenue forecasting.


Plot the Historical Data

This isn't specific to Prophet, but it's best practice to plot your historical data before jumping into the forecast. The visual can help you spot unexpected trends or gaps that could skew the forecast.
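
With the same dataframe as above, a quick pandas plot is enough for this sanity check:

import matplotlib.pyplot as plt

# Visual check for trends, gaps, and outliers (column names 'ds'/'y' as before).
df.plot(x='ds', y='y', figsize=(10, 4), title='Weekly revenue')
plt.show()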


Remove Known Noise Data

Although Prophet is inherently robust to outliers, you should still exclude data that is obviously "wrong" or unhelpful, for example, data from internal test campaigns or an unwanted spike in reseller activity. It's usually easier to do this cleansing before loading the data into Prophet.
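
Here is a sketch of two common options, using hypothetical dates. A useful Prophet detail: rows with a missing y are skipped during fitting, but Prophet will still produce forecasts for those dates.

import numpy as np
import pandas as pd

# Hypothetical weeks affected by internal test campaigns.
noise_dates = pd.to_datetime(['2019-03-10', '2019-07-21'])

# Option 1: drop the rows entirely.
df_clean = df[~df['ds'].isin(noise_dates)]

# Option 2: keep the rows but blank out y; Prophet ignores
# missing values when fitting yet still forecasts those dates.
df.loc[df['ds'].isin(noise_dates), 'y'] = np.nan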


External Regressors

Prophet is designed for univariate time series analysis, but additional explanatory variables can be added as external regressors.


The business I was analyzing ran sales at seemingly random times throughout the year, so I couldn't just use holiday or yearly seasonality. Instead, I relied almost entirely on variables ("flags") indicating whether a sale occurred that week, and whether a sale started or ended that week.

model.add_regressor('sale_y_n', prior_scale=40.0)
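
One caveat worth showing: Prophet needs the regressor's value for every row it forecasts, so future sale weeks must be known (or assumed) in advance. A sketch in context, with a hypothetical planned sale date:

from prophet import Prophet

model = Prophet()
model.add_regressor('sale_y_n', prior_scale=40.0)
model.fit(train)  # train must contain a 'sale_y_n' column

# The regressor is required for every forecast row too.
future = model.make_future_dataframe(periods=13, freq='W')
future['sale_y_n'] = 0  # default: no sale
future.loc[future['ds'] >= '2020-11-22', 'sale_y_n'] = 1  # hypothetical planned sale
forecast = model.predict(future)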

For multiple related output variables, model each one separately as a univariate time series, or use the forecast values of one variable as inputs to the next. You can also consider more advanced models like VAR or LSTM.


Hyperparameter Tuning

In the previous code snippet, I set prior_scale=40.0, indicating that this regressor should be weighted heavily.


In Prophet, the hyperparameters below are set to these defaults, so you may need to tune them to balance underfitting and overfitting; a tuning sketch follows the list.

changepoint_range=0.8
changepoint_prior_scale=0.05
seasonality_prior_scale=10
holidays_prior_scale=10
fourier_order=10 (the default for yearly seasonality)
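
Here is a tuning sketch along the lines of Prophet's own diagnostics utilities, with an illustrative grid and cutoff windows; it cross-validates each combination and keeps the one with the lowest RMSE. It assumes several years of weekly history in the train dataframe from earlier.

import itertools
import numpy as np
from prophet import Prophet
from prophet.diagnostics import cross_validation, performance_metrics

# Illustrative grid; widen or narrow it based on your data.
param_grid = {
    'changepoint_prior_scale': [0.01, 0.05, 0.1, 0.5],
    'seasonality_prior_scale': [1.0, 10.0],
}
all_params = [dict(zip(param_grid, v))
              for v in itertools.product(*param_grid.values())]
rmses = []

for params in all_params:
    m = Prophet(**params).fit(train)
    # Two years of initial history, then rolling 13-week test windows.
    df_cv = cross_validation(m, initial='730 days', period='91 days',
                             horizon='91 days')
    df_p = performance_metrics(df_cv, rolling_window=1)
    rmses.append(df_p['rmse'].values[0])

best_params = all_params[int(np.argmin(rmses))]
print(best_params)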

Data Normalization

Since Prophet is a statistical model, it does not require the same level of feature scaling as machine learning models. However, some data normalization may be beneficial depending on what you're working with. In my original dataset, the sale_y_n variable was an integer between 0 and 7, indicating how many days of that week fell within a sale period. The raw scale of 0 to 7 may introduce bias to the forecast, so I first normalized the sale_y_n variable using sklearn MinMaxScaler.
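
A minimal sketch of that scaling step:

from sklearn.preprocessing import MinMaxScaler

# Rescale the 0-7 "days on sale" count to the 0-1 range
# before adding it as a regressor.
scaler = MinMaxScaler()
df['sale_y_n'] = scaler.fit_transform(df[['sale_y_n']])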


How to Evaluate the Prediction Accuracy of Facebook Prophet Forecasts

By default, Prophet uses an 80% uncertainty interval (interval_width=0.8), meaning there's an 80% chance that the actual value will fall between yhat_lower and yhat_upper. However, Prophet's performance can be hit or miss depending on the use case. If accuracy is still poor, try multiple approaches and keep the model that performs best on cross-validation, using the following metrics to evaluate accuracy.

  • Mean absolute error (MAE): measures the average magnitude of errors, using the same unit as the data

  • Root mean squared error (RMSE): measures the square root of the average of the squared differences. Squaring puts more weight on larger errors, so RMSE is a useful metric when larger errors are especially costly. What constitutes a "good" RMSE depends on the scale of your data, so compare it against a simple baseline rather than an absolute threshold.

  • Mean absolute percentage error (MAPE): measures the average of the absolute percentage errors, expressed as a percentage. Generally, a MAPE under 10% is considered very good, 10-20% is considered good, and up to 50% can still be acceptable in some use cases.
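
Assuming the manual train/test split and the forecast from the earlier sketches, all three metrics can be computed with scikit-learn:

import numpy as np
from sklearn.metrics import (mean_absolute_error, mean_squared_error,
                             mean_absolute_percentage_error)

# Compare the last 13 forecast rows against the held-out weeks.
y_true = test['y'].to_numpy()
y_pred = forecast['yhat'].iloc[-13:].to_numpy()

mae = mean_absolute_error(y_true, y_pred)
rmse = np.sqrt(mean_squared_error(y_true, y_pred))
mape = mean_absolute_percentage_error(y_true, y_pred) * 100  # as a percentage

print(f'MAE={mae:.2f}  RMSE={rmse:.2f}  MAPE={mape:.1f}%')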


Follow the parsimony gradient: start with the simplest model and add complexity only as needed. The simplest option is a naive model, or a seasonal naive model if you have strong seasonality or repetitive sales patterns (both sketched below). Another straightforward method is a simple average or exponential smoothing with seasonal adjustments. For greater flexibility, look into machine learning models such as XGBoost, LightGBM, or Random Forest Regressor.
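
For reference, both baselines are a couple of lines each, again assuming the earlier weekly train/test split:

import pandas as pd

# Naive baseline: repeat the last observed training value.
naive = pd.Series(train['y'].iloc[-1], index=test.index)

# Seasonal naive baseline: repeat the value from 52 weeks earlier;
# the lagged values for the test weeks all fall inside the training data.
seasonal_naive = df['y'].shift(52).loc[test.index]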

