[BUG] Zero Division in diagnostics.performance_metrics() causing failed assertion #2577

ThomasChia · 2024-05-01T16:17:36Z

Issue

When the SMAPE metric is calculated in diagnostics.performance_metrics(), a zero division occurs whenever y and yhat are both zero.

def smape(df, w):
    """Symmetric mean absolute percentage error
    based on Chen and Yang (2004) formula

    Parameters
    ----------
    df: Cross-validation results dataframe.
    w: Aggregation window size.

    Returns
    -------
    Dataframe with columns horizon and smape.
    """
    sape = np.abs(df['y'] - df['yhat']) / ((np.abs(df['y']) + np.abs(df['yhat'])) / 2)  # <-- possible zero division
    if w < 0:
        return pd.DataFrame({'horizon': df['horizon'], 'smape': sape})
    return rolling_mean_by_h(
        x=sape.values, h=df['horizon'].values, w=w, name='smape'
    )

This does not raise an error directly; instead, it produces np.nan values wherever the zero division occurs. When rolling_mean_by_h() is then called, it performs a groupby() that removes any np.nan values. This becomes a problem in the main performance_metrics() function because of the following assert:

assert np.array_equal(res['horizon'].values, res_m['horizon'].values)

This assert sits inside a loop that checks that every metric returns the same horizon values. In the scenario above it fails, because the np.nan values have been removed and smape returns fewer values than the other metrics.
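
The np.nan behaviour described above can be seen in isolation. The following is a minimal sketch on a made-up three-row frame (the values are illustrative assumptions, not real cross-validation output); it applies the same expression used in smape() and shows that 0 / 0 on a pandas Series yields NaN rather than raising.

```python
import numpy as np
import pandas as pd

# Toy frame (illustrative values only): the second row has y == yhat == 0.
df = pd.DataFrame({"y": [1.0, 0.0, 2.0], "yhat": [0.5, 0.0, 2.0]})

# Same expression as in smape(): 0 / 0 becomes NaN, no exception is raised.
sape = np.abs(df["y"] - df["yhat"]) / ((np.abs(df["y"]) + np.abs(df["yhat"])) / 2)

print(sape.isna().tolist())  # [False, True, False]
print(sape.count())          # 2, because count() skips NaN
```

Any aggregation that skips NaN therefore sees fewer observations for smape than for the other metrics.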

Replication

Here is how you can replicate this issue:

import pandas as pd
from prophet import Prophet
from prophet.diagnostics import cross_validation, performance_metrics

df = pd.read_csv('https://raw.githubusercontent.com/facebook/prophet/main/examples/example_wp_log_peyton_manning.csv')

df['ds'] = pd.to_datetime(df['ds'])
# Force y to zero on every Sunday (dayofweek == 6)
df.loc[df['ds'].dt.dayofweek == 6, 'y'] = 0

m = Prophet()
m.fit(df)

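df_cv = cross_validation(m, '365 days', initial='1825 days', period='365 days')
# Clip negative forecasts to zero so that some rows have y == yhat == 0
df_cv['yhat'] = df_cv['yhat'].clip(lower=0)
metrics = performance_metrics(df_cv)  # fails on the horizon assert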

We set certain values in the training data to zero and clip negative values to create a scenario where y and yhat are both zero.
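
One possible guard is sketched below. It is an assumption for illustration and not necessarily the change merged in #2578: the helper name safe_sape is hypothetical, and it treats the 0 / 0 case as zero error on the grounds that y == yhat == 0 is a perfect forecast.

```python
import numpy as np
import pandas as pd


def safe_sape(df):
    """Hypothetical helper: symmetric absolute percentage error per row,
    with the 0 / 0 case treated as zero error instead of NaN."""
    num = np.abs(df["y"] - df["yhat"])
    denom = (np.abs(df["y"]) + np.abs(df["yhat"])) / 2
    sape = num / denom
    # denom is zero only when y and yhat are both zero, i.e. a perfect forecast.
    return sape.where(denom != 0, 0.0)


# Illustrative values only.
demo = pd.DataFrame({"y": [1.0, 0.0, 2.0], "yhat": [0.5, 0.0, 2.0]})
result = safe_sape(demo)
print(result.isna().any())  # False
```

With a guard of this kind, smape keeps one value per row, so its horizon values would stay aligned with the other metrics.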


ThomasChia mentioned this issue May 1, 2024

Fix Issue #2577 - Zero Division Error in diagnostics.performance_metrics() causing failed assertion #2578

Merged

tcuongd pushed a commit that referenced this issue May 18, 2024

Fix Issue #2577 - Zero Division Error in diagnostics.performance_metr…

36421b7

…ics() causing failed assertion (#2578)

ThomasChia closed this as completed May 18, 2024
