Model checkpointing

Marcus Hardt requested to merge model-checkpointing into main

Created by: lehr-fa

Adds checkpointing and chunking to the computation of `TiedAxialColumnAttention`. This reduces memory pressure in two ways: the dot products are computed sequentially, chunk by chunk, rather than all at once, and intermediate results are no longer cached.
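To illustrate the chunking half of this change, here is a minimal, framework-free sketch of attention computed one query chunk at a time. It is not the code in this merge request: `attend`, `chunked_attend`, and `chunk_size` are hypothetical names, and it uses plain softmax attention rather than the tied axial variant. The point it shows is that only a `chunk_size x num_keys` score matrix exists at any moment, while the output matches the unchunked computation. The checkpointing half (recomputing intermediates in the backward pass instead of storing them) needs an autograd framework and is not shown; in PyTorch, for example, that is usually done with `torch.utils.checkpoint.checkpoint`.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    peak = max(row)
    exps = [math.exp(x - peak) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def attend(queries, keys, values):
    """Plain softmax attention for one block of queries.

    All arguments are lists of equal-length float vectors. The full
    len(queries) x len(keys) score matrix is materialized here, which is
    the intermediate that chunking keeps small.
    """
    scale = 1.0 / math.sqrt(len(keys[0]))
    # Scaled dot products between every query and every key in this block.
    scores = [
        [scale * sum(qi * ki for qi, ki in zip(q, k)) for k in keys]
        for q in queries
    ]
    weights = [softmax(row) for row in scores]
    dim = len(values[0])
    # Weighted sum of the value vectors for each query.
    return [
        [sum(w * v[d] for w, v in zip(row, values)) for d in range(dim)]
        for row in weights
    ]


def chunked_attend(queries, keys, values, chunk_size):
    """Same result as attend(), computed one query chunk at a time."""
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")
    out = []
    for start in range(0, len(queries), chunk_size):
        # Only a chunk_size x len(keys) score matrix is alive per iteration.
        out.extend(attend(queries[start:start + chunk_size], keys, values))
    return out
```

A smaller `chunk_size` lowers peak memory at the cost of more sequential steps, so in practice it would be tuned to the largest value that still fits on the device.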
