[RNN training without backtracking] Training recurrent networks online without backtracking

    Here G(t) := ∂h(t)/∂θ is the Jacobian of the current network state h(t) with respect to the parameters θ, the quantity maintained by real-time recurrent learning. Its sheer size prevents computing or even storing G(t) for moderately large-dimensional dynamical systems, such as recurrent neural networks.
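    For context, the quantities involved (my summary of the paper's setup, with the standard cost estimate for real-time recurrent learning; not text from this excerpt): the state evolves as

        h(t+1) = F(h(t), x(t+1), \theta), \qquad G(t) := \frac{\partial h(t)}{\partial \theta},

    and G(t) satisfies the forward recursion

        G(t+1) = \frac{\partial F}{\partial h}\, G(t) + \frac{\partial F}{\partial \theta}.

    Maintaining G(t) exactly therefore takes \dim(h) \times \dim(\theta) memory; for a fully connected recurrent network with n units, \dim(\theta) \approx n^2, so G(t) alone has on the order of n^3 entries.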

1 The NoBackTrack algorithm

1.1 The rank-one trick: an expectation-preserving reduction

    We propose to build an approximation G̃(t) of G(t). The construction of an unbiased G̃(t) is based on the following “rank-one trick”.
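    For reference, the trick itself (as stated in the paper, paraphrased here): decompose A as a sum of rank-one terms,

        A = \sum_{i} v_i w_i^{\top},

    draw independent uniform random signs \varepsilon_i \in \{-1, +1\}, and set

        \tilde{A} := \Big( \sum_{i} \varepsilon_i v_i \Big) \Big( \sum_{i} \varepsilon_i w_i \Big)^{\top}.

    Then \mathbb{E}[\tilde{A}] = A, since \mathbb{E}[\varepsilon_i \varepsilon_j] = \delta_{ij} makes all cross terms vanish in expectation, while \tilde{A} itself is rank one and hence cheap to store and update.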

    The rank-one reduction Ã depends not only on the value of A, but also on the way A is decomposed as a sum of rank-one terms. In the applications to recurrent networks below, there is a natural such choice.
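    To make the unbiasedness concrete, here is a minimal NumPy sketch (my own illustration, not code from the paper); the generic column-times-row decomposition used here is just one convenient choice, precisely the degree of freedom the paragraph above refers to:

```python
import numpy as np

rng = np.random.default_rng(0)

def rank_one_reduction(vs, ws, rng):
    """One sample of the rank-one trick:
    Ã = (Σ_i ε_i v_i)(Σ_i ε_i w_i)^T with independent uniform signs ε_i."""
    eps = rng.choice([-1.0, 1.0], size=len(vs))
    v = sum(e * v for e, v in zip(eps, vs))
    w = sum(e * w for e, w in zip(eps, ws))
    return np.outer(v, w)

# Generic column-times-row decomposition: A = Σ_i A[:, i] e_i^T.
n = 4
A = rng.standard_normal((n, n))
vs = [A[:, i] for i in range(n)]
ws = [np.eye(n)[i] for i in range(n)]

# Each sample is rank one, but the Monte Carlo average converges to A.
samples = [rank_one_reduction(vs, ws, rng) for _ in range(100_000)]
err = np.abs(np.mean(samples, axis=0) - A).max()
print(err)  # small; on the order of 1e-2 at this sample size
```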
