A practical guide to efficient neural networks: model compression techniques including pruning, quantization, and knowledge distillation, plus training optimizations such as gradient checkpointing and gradient accumulation.
November 24, 2021
By Mats Uytterhoeven