KDD 2018, short-text matching: MIX

1. Summary

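The first family of methods builds a representation of each text, then computes a matching (similarity) score between the representations
These methods differ mainly in how they obtain the text representation and how they compute the similarity between representations
Ways to obtain the text representation: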
- CNN-based
  - A convolutional neural network for modelling sentences, ACL 2014
  - Convolutional neural networks for sentence classification, EMNLP 2014
- RNN-based
  - When are tree structures necessary for deep learning of representations?, EMNLP 2015
  - Recurrent neural network for text classification with multi-task learning, IJCAI 2016
- Tree-based RNN
  - Deep recursive neural networks for compositionality in language, NIPS 2014
  - Parsing natural scenes and natural language with recursive neural networks, ICML 2011
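DSSM: uses an MLP to obtain the text representation; similarity is computed as cosine similarity
- Learning deep structured semantic models for web search using clickthrough data, CIKM 2013
- DSSM's MLP has too many parameters, which makes the model complex and prone to overfitting, and it ignores word order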
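A minimal sketch of the representation-then-similarity scheme, assuming each text is first encoded into a fixed-length vector and the two vectors are compared with cosine similarity. The `TOY_VECTORS` table and the averaging `encode` function are hypothetical stand-ins for a learned encoder such as DSSM's MLP, not the paper's model. Averaging is also order-insensitive, which illustrates the word-order limitation noted above.

```python
import math

# Hypothetical toy word vectors; a real model would learn these.
TOY_VECTORS = {
    "cheap": [0.9, 0.1, 0.0],
    "flights": [0.1, 0.8, 0.2],
    "low": [0.8, 0.2, 0.1],
    "cost": [0.7, 0.1, 0.2],
    "airfare": [0.2, 0.9, 0.1],
}

def encode(tokens):
    """Stand-in encoder: average the word vectors (ignores word order)."""
    vecs = [TOY_VECTORS[t] for t in tokens]
    dim = len(vecs[0])
    return [sum(v[i] for v in vecs) / len(vecs) for i in range(dim)]

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def match_score(tokens_a, tokens_b):
    """Encode both texts independently, then compare the two vectors."""
    return cosine(encode(tokens_a), encode(tokens_b))
```

For example, `match_score(["cheap", "flights"], ["low", "cost", "airfare"])` compares the two texts only through their final vectors; no word-to-word interaction is visible to the scorer.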
CDSSM: replaces the MLP with a CNN
- A latent semantic model with convolutional-pooling structure for information retrieval, CIKM 2014
CNTN: uses tensor-based matching and performs well on CQA (community question answering) tasks
- Reasoning with neural tensor networks for knowledge base completion, NIPS 2013

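The second family computes matching features directly
This is more intuitive and natural
Keyword matches matter first, then the relative positions of the matches
Both the degree of matching and the structure of the matches are taken into account
Recent work shows this approach performs better on many text matching tasks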
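ARC-II: performs the matching with a CNN; unlike the methods above, it takes word order into account and performs better. (See the original paper for the details.)
- Convolutional neural network architectures for matching natural language sentences, NIPS 2014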
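MatchPyramid: turns the two texts into a 2-D matching matrix whose elements are the matching degree (cosine) of each word pair, then applies a CNN to obtain the overall matching score
- Text Matching as Image Recognition, AAAI 2016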
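The matching matrix can be sketched as follows, assuming the word embeddings of both texts are already available as lists of vectors. The function names are illustrative, and the CNN that reads the matrix is omitted.

```python
import math

def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    denom = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / denom if denom else 0.0

def matching_matrix(emb_a, emb_b):
    """M[i][j] is the cosine similarity between word i of text A and word j of text B."""
    return [[cosine(u, v) for v in emb_b] for u in emb_a]
```

The result has one row per word of text A and one column per word of text B, so it can be treated like a single-channel image whose size depends on the two text lengths.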
DRMM: while most NLP tasks focus on semantic matching, the ad-hoc retrieval task is mainly about relevance matching. It maps variable-length local interactions to a fixed-length matching histogram. (See the original paper for the details.)
- A deep relevance matching model for ad-hoc retrieval, CIKM 2016
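A rough sketch of the histogram mapping, under the assumption that the local interactions for one query term are its cosine similarities with every document term, bucketed into equal-width bins over [-1, 1) plus one extra bin for exact matches. The bin layout and the plain counts used here are one possible choice; the function name is illustrative.

```python
def matching_histogram(similarities, num_bins=4):
    """Map any number of similarities in [-1, 1] to a list of num_bins + 1 counts.

    The first num_bins bins split [-1, 1) evenly; the last bin counts
    exact matches (similarity equal to 1).
    """
    counts = [0] * (num_bins + 1)
    width = 2.0 / num_bins
    for s in similarities:
        if s >= 1.0:
            counts[-1] += 1
        else:
            idx = int((s + 1.0) / width)
            # Clamp to guard against values slightly outside [-1, 1).
            counts[max(0, min(idx, num_bins - 1))] += 1
    return counts
```

Whatever the document length, the output length is always `num_bins + 1`, which is what lets a fixed-size network consume it.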
KNRM [23] and Conv-KNRM [3] directly model interactions between the embeddings of the two texts (word embeddings in KNRM, n-gram embeddings in Conv-KNRM) and use a kernel pooling layer to combine the cross-match layers into a matching score.
- End-to-end neural ad-hoc ranking with kernel pooling, SIGIR 2017
- Convolutional Neural Networks for Soft-Matching N-Grams in Ad-hoc Search, WSDM 2018
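Kernel pooling can be sketched as below, assuming a matrix of similarities (rows for one text's terms, columns for the other's) and a set of Gaussian (RBF) kernels with means `mus` and widths `sigmas`. The kernel parameters, the `eps` clamp before the logarithm, and the function name are assumptions for illustration; the learned layer that turns the features into a final score is left out.

```python
import math

def kernel_pooling(match_matrix, mus, sigmas, eps=1e-10):
    """Return one soft-match feature per kernel.

    For each kernel, every row's similarities are passed through a Gaussian
    centred at mu and summed (a soft count of matches near mu); the logs of
    these row sums are then added up over rows.
    """
    features = []
    for mu, sigma in zip(mus, sigmas):
        total = 0.0
        for row in match_matrix:
            soft_count = sum(
                math.exp(-((m - mu) ** 2) / (2.0 * sigma ** 2)) for m in row
            )
            total += math.log(max(soft_count, eps))  # clamp avoids log(0)
        features.append(total)
    return features
```

Each kernel acts as a soft histogram bin: a kernel centred near 1 counts near-exact matches, while kernels centred at lower values count weaker matches.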

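Overall, the models above rely too heavily on the generalization ability of deep learning models and on the quality of the training data.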

Overall model architecture (figure)

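Matching on word embeddings alone works poorly, because a word's meaning changes with its context
So the model uses unigrams, bigrams, and trigrams, implemented with convolutions whose kernel sizes are 1, 2, and 3.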

unigrams, bigrams, and trigrams
The outputs of all kernel sizes are kept, and each one is used for matching
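A minimal sketch of the multi-granularity idea, with several assumptions: window averaging stands in for learned convolution kernels of width 1, 2, and 3; no padding is used, so a text of length L yields L - n + 1 n-grams; and every pair of n-gram sizes is compared. These notes do not say whether only same-size n-grams or all size pairs are matched, so the all-pairs choice is illustrative. All function names are hypothetical.

```python
import math

def ngram_vectors(embeddings, n):
    """Slide a window of n words over the text and average the embeddings
    in each window (a fixed stand-in for a learned kernel of width n)."""
    dim = len(embeddings[0])
    out = []
    for start in range(len(embeddings) - n + 1):
        window = embeddings[start:start + n]
        out.append([sum(vec[d] for vec in window) / n for d in range(dim)])
    return out

def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    denom = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / denom if denom else 0.0

def multi_granularity_matrices(emb_a, emb_b, sizes=(1, 2, 3)):
    """Build one matching matrix per (n-gram size in A, n-gram size in B) pair."""
    matrices = {}
    for na in sizes:
        grams_a = ngram_vectors(emb_a, na)
        for nb in sizes:
            grams_b = ngram_vectors(emb_b, nb)
            matrices[(na, nb)] = [
                [cosine(u, v) for v in grams_b] for u in grams_a
            ]
    return matrices
```

With three sizes this yields nine matrices, one per size pair; keeping them separate lets a later layer weigh, for example, unigram-to-bigram matches differently from trigram-to-trigram matches.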