
# structural attention

## motivation

Attention works as a soft-selection module; it can model structural dependencies implicitly.

## definition

Let $x = [x_1, \ldots, x_n]$ represent a sequence of inputs, let $q$ be a query, and let $z$ be a categorical latent variable with sample space $\{1, \ldots, n\}$. The input is accessed through an attention distribution $z \sim p(z \mid x, q)$. The context over the sequence is defined as the expectation $c = \mathbb{E}_{z \sim p(z \mid x, q)}[f(x, z)]$, where $f(x, z)$ is an *annotation function*. In this definition, the annotation function plays the role of the selection function in conventional attention, $f(x, z) = x_z$. The context vector can then be computed as a simple weighted sum: $\textbf{c} = \mathbb{E}_{z \sim p(z \mid x, q)}[f(x, z)] = \sum_{i=1}^{n} p(z = i \mid x, q)\,\textbf{x}_i$.

## method
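As a concrete instance of the definition above, here is a minimal NumPy sketch. It assumes a dot-product score $q^\top x_i$ normalized by a softmax (one common choice; the definition itself leaves the scoring function unspecified), and the helper name `soft_attention_context` is hypothetical:

```python
import numpy as np

def soft_attention_context(x, q):
    """Context vector c = E_{z ~ p(z|x,q)}[x_z] = sum_i p(z=i|x,q) * x_i.

    x: (n, d) array of inputs x_1..x_n
    q: (d,) query vector
    Assumes a dot-product score q . x_i turned into p(z|x,q) by a
    softmax; the definition above leaves the scoring choice open.
    """
    scores = x @ q                             # (n,) unnormalized scores
    scores = scores - scores.max()             # shift for numerical stability
    p = np.exp(scores) / np.exp(scores).sum()  # attention distribution p(z=i|x,q)
    return p @ x                               # expectation: weighted sum of inputs

# usage: five 4-dimensional inputs and one query
rng = np.random.default_rng(0)
x = rng.normal(size=(5, 4))
q = rng.normal(size=4)
c = soft_attention_context(x, q)               # context vector, shape (4,)
```

Because $f(x, z) = x_z$, the expectation reduces to the familiar softmax-weighted average over the inputs.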
