http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.377.5365&rep=rep1&type=pdf
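Use "because" to extract candidate causal pairs; both ends of each pair are verbs
There are k*(r-k) candidate causal pairs; pick the most likely one among them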
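A minimal sketch of where the k*(r-k) count comes from. It assumes the k verbs before "because" are candidate effects and the r-k verbs after it are candidate causes (the usual reading of "A because B"). The function name and the pre-extracted verb lists are hypothetical, not taken from the paper.

```python
from itertools import product

def candidate_pairs(verbs_before, verbs_after):
    """Return every (cause, effect) candidate that spans the cue word.

    Assumption: in "A because B", verbs in B are candidate causes (v_i)
    and verbs in A are candidate effects (v_j).
    """
    return list(product(verbs_after, verbs_before))

# "He stayed and rested because he felt sick": k = 2 verbs before, r - k = 1 after
pairs = candidate_pairs(["stay", "rest"], ["feel"])
print(len(pairs), pairs)
```

Scoring each candidate and keeping the best one is a separate step that this sketch does not cover.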
v_i denotes the cause, v_j denotes the effect
PS_I is a penalty term
C(.) denotes a count
pos denotes the distance between a verb and the cue word "because"
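A small sketch of the pos value, assuming distance is measured in tokens from the first occurrence of the cue word. These notes do not record which unit the paper uses, so both the unit and the function name are assumptions.

```python
def verb_distances(tokens, verb_indices, cue="because"):
    """Map each verb's token index to its distance from the cue word.

    Assumption: distance is the absolute difference in token positions,
    measured from the first occurrence of the cue.
    """
    cue_index = tokens.index(cue)
    return {i: abs(i - cue_index) for i in verb_indices}

tokens = "he stayed home because he felt sick".split()
# verbs: "stayed" at index 1, "felt" at index 5; the cue is at index 3
print(verb_distances(tokens, [1, 5]))
```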
Corpus used:
English Gigaword corpus
The patterns "because" and "but" are used to generate positive and negative samples, respectively
10,455 verb pairs with frequency above 50
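The two steps above (labeling by cue word, then keeping frequent verb pairs) can be sketched as follows. A plain word match stands in for the paper's patterns, and skipping sentences that contain both cues is an assumption of this sketch. Both function names are hypothetical.

```python
import re
from collections import Counter

def label_sentence(sentence):
    """Return 1 for a "because" sentence, 0 for a "but" sentence, else None.

    Assumption: sentences with neither cue, or with both, are skipped.
    """
    words = set(re.findall(r"[a-z]+", sentence.lower()))
    has_because = "because" in words
    has_but = "but" in words
    if has_because == has_but:
        return None
    return 1 if has_because else 0

def frequent_pairs(pairs, min_count=50):
    """Keep verb pairs whose frequency is strictly above min_count."""
    counts = Counter(pairs)
    return {pair: n for pair, n in counts.items() if n > min_count}

print(label_sentence("He fell because he slipped."))
print(label_sentence("He slipped but did not fall."))
```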
Explicit Causal Association (ECA)
CD determines the causal dependency of the verb pair in unsupervised fashion
CI finds the tendency of instance I of (v_i, v_j) to belong to the cause class as compared to the non-cause class using training corpus of event pairs
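The CD term acts as a prior and can reduce false positives, because causation is a kind of correlation
f_k are features of the verb pair; there are five types of features in total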
Implicit Causal Association (ICA)
a novel metric ICA to avoid the problem of training data sparsity
ERM determines the likelihood of roles of the events in the cause relation
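CD indicates whether there is a correlation between the two words
CI indicates whether this pair is more likely to have a causal relation than a non-causal one
ERM indicates whether v_i is more likely to be the cause and v_j more likely to be the effect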
the high value of ERM of an event pair can have one of the following two interpretations: (A) it is a non-causal event pair, or (B) it is a causal event pair but this pair and the pairs which are semantically closer to it hardly appear in explicit causal contexts.