Paper Reading | Reasoning with Latent Structure Refinement for Document-Level Relation Extraction

An ACL 2020 paper on document-level relation extraction.

Paper: https://arxiv.org/pdf/2005.06312.pdf

Code: https://github.com/nanguoshun/LSR

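The authors propose a dynamic latent structure refinement strategy that captures non-local context in order to extract document-level relations. Their model supports relational reasoning across sentences by automatically inducing a latent document-level graph. Unlike earlier approaches that rely on a static structure, this method treats the document-level graph as a latent variable and induces it end-to-end.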

The model consists of three modules:

Node Constructor

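Each word of the input document is first encoded (using BERT, GloVe, or a Bi-LSTM) to obtain a vector representation per word. spaCy is then used to extract the shortest dependency paths in each sentence, and the nodes on these paths serve as the model input. The authors distinguish three node types: entity nodes, mention nodes, and MDP (meta dependency path) nodes. An entity node aggregates the representations of the mention nodes that refer to it.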
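
The aggregation of mention nodes into entity nodes can be illustrated with a minimal NumPy sketch. It assumes that aggregation means averaging the mention vectors of each entity; the function name `build_entity_nodes` and the toy vectors are hypothetical and not taken from the released code.

```python
import numpy as np

def build_entity_nodes(mention_vecs, mention_to_entity, num_entities):
    """Average the mention vectors that refer to the same entity.

    Averaging is an assumed aggregation choice for this sketch.
    """
    dim = mention_vecs.shape[1]
    entity_vecs = np.zeros((num_entities, dim))
    counts = np.zeros(num_entities)
    for vec, ent in zip(mention_vecs, mention_to_entity):
        entity_vecs[ent] += vec
        counts[ent] += 1
    # Guard against entities without mentions to avoid division by zero.
    return entity_vecs / np.maximum(counts, 1)[:, None]

# Three mentions: the first two refer to entity 0, the third to entity 1.
mentions = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
entities = build_entity_nodes(mentions, [0, 0, 1], num_entities=2)
```

Here `entities` has one row per entity, so the graph ends up with entity, mention, and MDP nodes that all share the same vector dimension.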

f1.png

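Dynamic Reasoner

The authors design a dynamic structure induction method: a neural network generates a non-static structure, which is then used as the input to a GCN.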

f2.png

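First, for nodes $i$ and $j$ with representations $u_i,u_j$, two feed-forward layers followed by a bilinear transformation give the pair-wise score $s_{ij}$:
$s_{ij}=(\tanh(W_pu_i))^TW_b(\tanh(W_cu_j))$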
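
As a rough illustration, the pair-wise score can be computed for all node pairs at once. The sketch below is a NumPy version under assumed shapes ($W_p, W_c \in \mathbb{R}^{k \times d}$, $W_b \in \mathbb{R}^{k \times k}$); the function name `pairwise_scores` and the random weights are placeholders, not the authors' implementation.

```python
import numpy as np

def pairwise_scores(U, W_p, W_c, W_b):
    """Return S with S[i, j] = tanh(W_p u_i)^T W_b tanh(W_c u_j)."""
    heads = np.tanh(U @ W_p.T)    # (n, k): row i is tanh(W_p u_i)
    deps = np.tanh(U @ W_c.T)     # (n, k): row j is tanh(W_c u_j)
    return heads @ W_b @ deps.T   # (n, n) matrix of pair-wise scores

# Toy setup: n nodes of dimension d, projected to k dimensions (all assumed).
rng = np.random.default_rng(0)
n, d, k = 4, 6, 5
U = rng.normal(size=(n, d))
W_p, W_c = rng.normal(size=(k, d)), rng.normal(size=(k, d))
W_b = rng.normal(size=(k, k))
S = pairwise_scores(U, W_p, W_c, W_b)
```

Note that `S` is generally not symmetric, because the two arguments pass through different projections.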

The root score of node $i$ is $s_i^r=W_ru_i$. The matrix $P$ is then built as
$P_{ij}= \begin{cases} 0, & i=j \\ \exp(s_{ij}), & \text{otherwise} \end{cases}$
Inspired by the Laplacian matrix, $P$ is converted into $L$:
$L_{ij}= \begin{cases} \sum_{i'=1}^{n}P_{i'j}, & i=j \\ -P_{ij}, & \text{otherwise} \end{cases}$
The root scores are then substituted into the first row to obtain $\hat{L}$:
$\hat{L}_{ij}= \begin{cases} \exp(s_j^r), & i=1 \\ L_{ij}, & \text{otherwise} \end{cases}$
Finally, the weighted adjacency matrix $A$ is obtained, where $\delta$ is the Kronecker delta:
$A_{ij}=(1-\delta_{1,j})P_{ij}[\hat{L}^{-1}]_{jj}-(1-\delta_{i,1})P_{ij}[\hat{L}^{-1}]_{ji}$
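
The whole induction step, from scores to the weighted adjacency matrix, can be sketched in NumPy. This is an illustration under the standard matrix-tree construction, in which the first row of the Laplacian is replaced by the exponentiated root scores and edge $(i, j)$ is weighted using the entries $[\hat{L}^{-1}]_{jj}$ and $[\hat{L}^{-1}]_{ji}$. The function name `induce_structure` and the extra `root` output are assumptions added for checking, not part of the notes.

```python
import numpy as np

def induce_structure(S, s_root):
    """Map pair scores S (n, n) and root scores s_root (n,) to edge weights A."""
    n = S.shape[0]
    P = np.exp(S)
    np.fill_diagonal(P, 0.0)               # P_ij = 0 when i == j
    L = -P
    np.fill_diagonal(L, P.sum(axis=0))     # L_jj = sum over i' of P_i'j
    L_hat = L.copy()
    L_hat[0, :] = np.exp(s_root)           # first row holds the root scores
    L_inv = np.linalg.inv(L_hat)

    not_first = np.ones(n)
    not_first[0] = 0.0                     # 1 - delta for the first index
    diag = np.diag(L_inv)                  # [L_hat^-1]_jj
    # L_inv.T[i, j] equals [L_hat^-1]_ji.
    A = (not_first[None, :] * P * diag[None, :]
         - not_first[:, None] * P * L_inv.T)
    # Assumed extra output: weight of node j being attached to the root.
    root = np.exp(s_root) * L_inv[:, 0]
    return A, root

rng = np.random.default_rng(0)
S = rng.normal(size=(5, 5))
s_root = rng.normal(size=5)
A, root = induce_structure(S, s_root)
```

Under this construction each column of `A` plus the matching entry of `root` sums to one, so `A[i, j]` can be read as a soft weight for node `i` being the head of node `j`.
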
Feeding $A$ into the GCN gives the node representations
$u_i^l=\sigma(\sum_{j=1}^{n}A_{ij}W^lu_j^{l-1}+b^l)$
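
A single GCN layer over the induced adjacency matrix can be written in a few lines. The sketch below assumes that $\sigma$ is ReLU and uses toy values; `gcn_layer` is a hypothetical name for illustration.

```python
import numpy as np

def gcn_layer(A, U_prev, W, b):
    """u_i^l = sigma(sum_j A_ij W u_j^{l-1} + b), with sigma assumed to be ReLU."""
    messages = U_prev @ W.T              # row j is W u_j^{l-1}
    return np.maximum(A @ messages + b, 0.0)

# Toy example with two nodes and an identity weight matrix.
A = np.array([[0.0, 1.0], [0.5, 0.5]])
U_prev = np.array([[1.0, 0.0], [0.0, 2.0]])
W = np.eye(2)
b = np.array([-1.0, 0.0])
H = gcn_layer(A, U_prev, W, b)
```

Because `A` is dense and real-valued, every node receives a weighted message from every other node, rather than only from neighbors in a fixed parse tree.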

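Classifier

A bilinear function computes the probability of each relation type:
$P(r|e_i,e_j)=\sigma(e_i^TW_ee_j+b_e)_r$
Here $e_i$ and $e_j$ are the representations of entities $i$ and $j$, obtained from the steps above.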
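
The bilinear scoring can be sketched as follows, assuming that $W_e$ holds one $d \times d$ matrix per relation type and that $\sigma$ is the element-wise sigmoid. Both are assumptions of this sketch, as are the name `relation_probs` and the random inputs.

```python
import numpy as np

def relation_probs(e_i, e_j, W_e, b_e):
    """Return one probability per relation type for the entity pair (e_i, e_j)."""
    # logits[r] = e_i^T W_e[r] e_j + b_e[r]
    logits = np.einsum("a,rab,b->r", e_i, W_e, e_j) + b_e
    return 1.0 / (1.0 + np.exp(-logits))   # element-wise sigmoid (assumed)

rng = np.random.default_rng(0)
d, num_relations = 4, 3
e_i, e_j = rng.normal(size=d), rng.normal(size=d)
W_e = rng.normal(size=(num_relations, d, d))
b_e = rng.normal(size=num_relations)
probs = relation_probs(e_i, e_j, W_e, b_e)
```

With an independent sigmoid per relation, an entity pair can be assigned several relation types at once, which suits a multi-label setting.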

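Summary

The authors build a new input graph from shortest dependency paths and propose a dynamic graph structure induced by a neural network. Compared with AGGCN, which also uses a dynamic graph structure, the structure induction method proposed in this paper performs better.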

Writing patterns worth borrowing

Unlike previous work that only ... , we ...

Useful when contrasting with previous methods to highlight the advantages of our own.

Eg Unlike previous work (Liu and Lapata, 2018) that only induces the latent structure once, we repeatedly refine the document-level graph based on the updated representations, allowing the model to infer a more informative structure that goes beyond simple parent-child relations.

We follow ... to ...

Following .... , we

Useful when following the setup of earlier work.

Eg We follow (Christopoulou et al., 2019) to split training set of GDA into an 80/20 split for training and development.

... depict the comparisons with ... on ...

When describing experimental results, this pattern is an alternative to demonstrate, verify, describe, indicate, and show.

Eg Table 3 depicts the comparisons with state-of-the-art models on the CDR dataset.

Intuitively (for introducing an intuitive explanation)

Eg: Intuitively, the reasoner induces a shallow structure at early iterations since the information propagates mostly between neighboring nodes.

Connectives