Numerical issues in L96 data generation

When generating very long time series with the generate_lorenz_data.py script, significant numerical issues appear after roughly 500K steps. To reproduce, run for example:

python3.13 scripts/generate_lorenz_data.py --output-dir ./data --seed 42 --num-workers 1 --spin-up-steps 10000 --num-trajectories 1 --num-steps 3000000 --dt 0.01 --forcing 8.0 --batch-size 1 --grid-size 40

Then plot the discrete derivatives (differences between consecutive time steps, divided by dt) of the generated time series over a window of its duration, and compare them with a fresh integration of the L96 equations initialized from the state at the start of the plot window.

Before 500K steps, the two plots roughly match.

Beyond that point, discrepancies appear.

By the end of the time series, the derivatives show spurious spikes every few time steps.
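
The comparison described above can be sketched roughly as follows. This is a minimal NumPy sketch, not the code used to produce the plots: it assumes the window of generated data has already been loaded as an array of shape (steps, grid_size), uses a plain RK4 integrator in float64 as the reference instead of torchdiffeq, and all function names are hypothetical. Since L96 with forcing 8.0 is chaotic, the window should be kept short so that the reference does not diverge from the stored data for unrelated reasons.

```python
import numpy as np


def l96_rhs(x, forcing=8.0):
    """Lorenz-96 tendency: dx_i/dt = (x_{i+1} - x_{i-2}) * x_{i-1} - x_i + F."""
    return (np.roll(x, -1, axis=-1) - np.roll(x, 2, axis=-1)) * np.roll(x, 1, axis=-1) - x + forcing


def rk4_step(x, dt, forcing=8.0):
    """One classical fourth-order Runge-Kutta step."""
    k1 = l96_rhs(x, forcing)
    k2 = l96_rhs(x + 0.5 * dt * k1, forcing)
    k3 = l96_rhs(x + 0.5 * dt * k2, forcing)
    k4 = l96_rhs(x + dt * k3, forcing)
    return x + dt / 6.0 * (k1 + 2.0 * k2 + 2.0 * k3 + k4)


def integrate_reference(x0, num_steps, dt=0.01, forcing=8.0):
    """Integrate from x0 in float64; returns an array of shape (num_steps + 1, grid_size)."""
    x0 = np.asarray(x0, dtype=np.float64)
    out = np.empty((num_steps + 1, x0.size), dtype=np.float64)
    out[0] = x0
    for n in range(num_steps):
        out[n + 1] = rk4_step(out[n], dt, forcing)
    return out


def discrete_derivative(series, dt=0.01):
    """Forward differences between consecutive stored steps."""
    return np.diff(series, axis=0) / dt


def derivative_gap(window, dt=0.01, forcing=8.0):
    """Per-step max absolute gap between the derivatives of the stored window
    and those of a reference integration restarted from window[0]."""
    window = np.asarray(window, dtype=np.float64)
    reference = integrate_reference(window[0], len(window) - 1, dt, forcing)
    gap = discrete_derivative(window, dt) - discrete_derivative(reference, dt)
    return np.max(np.abs(gap), axis=-1)
```

Calling derivative_gap on a short window taken early in the series and on another taken near the end gives a numeric counterpart to the plots: isolated large values in the returned array correspond to the spikes.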

The same issue is likely to affect the L63 data generation as well. One possible fix would be to enforce higher precision on the integration time steps used by torchdiffeq.
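
To illustrate why time-step precision could matter here, the sketch below computes how coarsely single precision resolves the time values reached in this run. It assumes, without having checked the script, that the time grid ends up stored in float32 (the PyTorch default dtype); the helper name is hypothetical. This is only a possible cause, not a confirmed diagnosis.

```python
import numpy as np


def float32_time_spacing(step, dt=0.01):
    """Gap between adjacent float32 values near t = step * dt."""
    t = np.float32(step * dt)
    return float(np.spacing(t))


if __name__ == "__main__":
    dt = 0.01
    for step in (500_000, 3_000_000):
        spacing = float32_time_spacing(step, dt)
        # Report the spacing both in absolute terms and relative to dt.
        print(f"step {step}: spacing {spacing:.3g}, {spacing / dt:.1%} of dt")
```

Near step 500K (t = 5000) the float32 spacing is about 4.9e-4, roughly 5% of dt, and near step 3M (t = 30000) it is about 2.0e-3, roughly 20% of dt. If the time grid is indeed held in float32, consecutive output times could no longer be evenly spaced by 0.01, which would be consistent with irregular spikes in the discrete derivatives. Building the time grid in float64 would be one thing to try under that assumption.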

Edited Sep 01, 2025 by Anthony Frion