Pruning

Pruning is a technique for reducing the number of parameters in a neural network by removing the weights of the least important connections, for example those with the smallest magnitudes.
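As a minimal sketch, assuming a dense weight matrix and simple magnitude-based pruning with NumPy (the function name and the sparsity level are illustrative, not part of any particular library):

```python
import numpy as np

def magnitude_prune(weights: np.ndarray, sparsity: float) -> np.ndarray:
    """Zero out the fraction `sparsity` of weights with the smallest magnitudes."""
    k = int(round(sparsity * weights.size))
    if k == 0:
        return weights.copy()
    flat = np.abs(weights).ravel()
    # Indices of the k smallest-magnitude weights.
    prune_idx = np.argpartition(flat, k - 1)[:k]
    mask = np.ones(weights.size, dtype=bool)
    mask[prune_idx] = False
    return weights * mask.reshape(weights.shape)

# Example: prune half of the entries of a random 4x4 weight matrix.
w = np.random.randn(4, 4)
print(magnitude_prune(w, sparsity=0.5))
```

In practice the pruned network is usually fine-tuned afterwards so the remaining weights can compensate for the removed connections.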

Quantization

Quantization is a technique for reducing the numerical precision of the weights and activations in a neural network, for example from 32-bit floating point to 8-bit integers, which shrinks model size and can speed up inference.
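A minimal sketch of symmetric uniform quantization to int8 with NumPy; the function names and the 127-level range are illustrative assumptions, not a specific framework's API:

```python
import numpy as np

def quantize_int8(x: np.ndarray):
    """Symmetric uniform quantization of a float tensor to int8 plus its scale."""
    scale = np.abs(x).max() / 127.0
    q = np.clip(np.round(x / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    """Map int8 codes back to approximate float values."""
    return q.astype(np.float32) * scale

x = np.random.randn(3, 3).astype(np.float32)
q, scale = quantize_int8(x)
print("max abs error:", np.abs(x - dequantize(q, scale)).max())
```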

Neural Architecture Search

Neural architecture search (NAS) is a technique for automatically searching a space of candidate architectures to find one that performs well on a given task, rather than designing the network by hand.
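A toy sketch of the simplest NAS strategy, random search over a small hypothetical search space; the space, the `evaluate` stand-in, and all names are illustrative (a real run would train each candidate and score it on validation data):

```python
import random

# Hypothetical search space: each architecture is a choice of depth, width, activation.
SEARCH_SPACE = {
    "num_layers": [2, 4, 6],
    "hidden_units": [64, 128, 256],
    "activation": ["relu", "tanh"],
}

def sample_architecture(rng: random.Random) -> dict:
    return {name: rng.choice(options) for name, options in SEARCH_SPACE.items()}

def evaluate(arch: dict) -> float:
    # Stand-in objective so the example runs end to end. In a real NAS run this
    # would train the candidate network and return its validation accuracy.
    return 1.0 / (arch["num_layers"] * arch["hidden_units"])

def random_search(num_trials: int = 20, seed: int = 0) -> dict:
    rng = random.Random(seed)
    best_arch, best_score = None, float("-inf")
    for _ in range(num_trials):
        arch = sample_architecture(rng)
        score = evaluate(arch)
        if score > best_score:
            best_arch, best_score = arch, score
    return best_arch

print(random_search())
```

More sophisticated NAS methods replace random sampling with reinforcement learning, evolutionary search, or gradient-based relaxations, but the sample-evaluate-select loop is the same.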

Knowledge Distillation

Knowledge distillation is a technique for training a smaller student network to reproduce the behavior of a larger, more accurate teacher network, typically by matching the teacher's softened output distribution in addition to the ground-truth labels.
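A minimal sketch of the standard distillation loss with NumPy, assuming per-example logits from a student and a teacher plus hard labels; the temperature and mixing weight are illustrative defaults:

```python
import numpy as np

def softmax(logits: np.ndarray, temperature: float = 1.0) -> np.ndarray:
    z = logits / temperature
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def distillation_loss(student_logits, teacher_logits, labels,
                      temperature: float = 4.0, alpha: float = 0.5) -> float:
    """Blend of KL divergence to the teacher's soft targets and cross-entropy on hard labels."""
    # Soft-target term: KL(teacher || student) at temperature T, scaled by T^2.
    p_teacher = softmax(teacher_logits, temperature)
    log_p_student = np.log(softmax(student_logits, temperature) + 1e-12)
    soft_loss = np.mean(np.sum(
        p_teacher * (np.log(p_teacher + 1e-12) - log_p_student), axis=-1)) * temperature ** 2
    # Hard-label term: standard cross-entropy against ground-truth class indices.
    log_probs = np.log(softmax(student_logits) + 1e-12)
    hard_loss = -np.mean(log_probs[np.arange(len(labels)), labels])
    return alpha * soft_loss + (1.0 - alpha) * hard_loss

rng = np.random.default_rng(0)
student = rng.normal(size=(8, 10))
teacher = rng.normal(size=(8, 10))
labels = rng.integers(0, 10, size=8)
print(distillation_loss(student, teacher, labels))
```

The temperature softens both distributions so the student also learns from the teacher's relative probabilities across incorrect classes, not just its top prediction.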

Other Resources

This page contains a list of additional resources related to optimization methods for neural networks.