Pruning CNNs for Resource-Efficient Inference

Approach

The proposed scheme for pruning consists of the following steps (a minimal loop sketch follows the list):

  • Fine-tune the network until convergence on the target task;
  • Alternate iterations of pruning and fine-tuning;
  • Stop pruning when the required trade-off between accuracy and pruning objective is reached.
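A minimal sketch of this loop, assuming hypothetical helpers fine_tune, rank_neurons, prune_lowest, and flops (illustrative names, not the authors' code):

```python
# Skeleton of the alternating prune/fine-tune procedure. fine_tune,
# rank_neurons, prune_lowest, and flops are hypothetical helpers for a
# training loop, an importance criterion (e.g. the Taylor criterion
# below), removal of the lowest-ranked feature maps, and a cost
# measure such as FLOPs.

def iterative_pruning(model, data, target_cost, neurons_per_iter=1):
    fine_tune(model, data)                    # 1. train to convergence
    while flops(model) > target_cost:         # 3. stop at the desired trade-off
        scores = rank_neurons(model, data)    # 2a. rank neurons by importance
        prune_lowest(model, scores, n=neurons_per_iter)
        fine_tune(model, data, epochs=1)      # 2b. recover accuracy
    return model
```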

C(·) is the cost function (e.g. the negative log-likelihood on the training dataset D), and W are the network's parameters. The goal of pruning is to find a subset of parameters W' that preserves the accuracy of the full model while meeting a sparsity budget:

    min_{W'} |C(D|W') − C(D|W)|   s.t.   ||W'||_0 ≤ B,

where ||W'||_0 counts the non-zero parameters and B is the pruning budget.

There are several criteria for ranking the importance of neurons (code sketches of the criteria follow this list):

  • ORACLE pruning
    The best estimate of a neuron's importance is the change in the network's cost once that particular neuron is pruned. This can be implemented by setting the pruning gate of each neuron to 0 in turn and re-evaluating C(D|W). It is the most accurate criterion, but it requires one full evaluation of the cost per neuron, which motivates the cheaper approximations below.
  • Minimum weight
    Rank neurons by the magnitude of their kernel weights, e.g. the mean squared weight of each convolutional filter. A filter with small weights is assumed to produce weak activations; this criterion needs no evaluation data.
  • Activation-based criteria
    One reason for ReLU's popularity is that convolutional layers with this activation act as feature detectors. It is therefore reasonable to assume that if an activation value (the output of a neuron) is small, then this feature detector is not important for predicting the output of the network. The mean or the standard deviation of the activation over the dataset can serve as the ranking score.
  • Taylor Expansion Approximation
    Intuitively, this criterion prunes neurons whose removal has an almost flat effect on the cost function. A first-order Taylor expansion approximates the change in cost as the absolute value of the product of the activation and the gradient of the cost w.r.t. that activation, averaged over the feature map and the minibatch. Both quantities are already computed during back-propagation, so the criterion adds almost no overhead to training.
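A direct (and expensive) implementation of the oracle could look like the following sketch, where cost is a hypothetical function evaluating C(D|W) and gates are per-neuron multiplicative gates:

```python
# Oracle ranking: zero each pruning gate in turn and measure the change
# in the cost. This needs one full evaluation of C(D|W) per neuron,
# which is why the cheaper criteria are used in practice. `cost` and
# `gates` are hypothetical placeholders.

def oracle_scores(model, data, gates, cost):
    base = cost(model, data)
    scores = []
    for gate in gates:
        gate.fill_(0.0)                       # temporarily prune this neuron
        scores.append(abs(cost(model, data) - base))
        gate.fill_(1.0)                       # restore it
    return scores
```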
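The remaining criteria are cheap to compute. Below is a self-contained PyTorch sketch (the framework is my assumption; the paper does not prescribe one) that scores the channels of a single toy convolutional layer; the layer, input, and loss are placeholders:

```python
import torch
import torch.nn as nn

# Toy layer and minibatch; in practice these come from a trained network.
conv = nn.Conv2d(3, 8, kernel_size=3, padding=1)
x = torch.randn(4, 3, 16, 16)

# Minimum weight: mean squared kernel weight per filter (needs no data).
min_weight_score = conv.weight.detach().pow(2).mean(dim=(1, 2, 3))

# Forward pass; keep the activation's gradient for the Taylor criterion.
act = torch.relu(conv(x))                     # (batch, out_ch, H, W)
act.retain_grad()

# Activation-based: mean activation per channel over batch and space.
activation_score = act.detach().mean(dim=(0, 2, 3))

# Taylor expansion: |mean over the feature map of activation * gradient|,
# averaged over the minibatch. The gradient is a by-product of ordinary
# back-propagation; the loss here is only a stand-in for C(D|W).
loss = act.sum()
loss.backward()
taylor_score = (act.detach() * act.grad).mean(dim=(2, 3)).abs().mean(dim=0)

print(min_weight_score)
print(activation_score)
print(taylor_score)
```

In the paper, the Taylor scores are accumulated over many minibatches, and each layer's scores are ℓ2-normalized before neurons are compared across layers.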

Experiment

References:
Pavlo Molchanov, Stephen Tyree, Tero Karras, Timo Aila, Jan Kautz. Pruning Convolutional Neural Networks for Resource Efficient Inference. ICLR 2017.
