《Evaluation of sentence embeddings in downstream and linguistic probing tasks》

Paper title: Evaluation of sentence embeddings in downstream and linguistic probing tasks

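Abstract summary:

The number of sentence embedding methods keeps growing, but how to judge which of them is better remains an open question. This paper therefore applies existing sentence embedding methods to a range of NLP tasks in order to evaluate their strengths and weaknesses.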

Models evaluated in the paper:

【1】ELMo (BoW, all layers, 5.5B) https://allennlp.org/elmo

【2】ELMo (BoW, all layers, original) https://allennlp.org/elmo

【3】ELMo (BoW, top layer, original)

【4】FastText (BoW, Common Crawl) https://fasttext.cc/docs/en/english-vectors.html

【5】GloVe (BoW, Common Crawl) https://nlp.stanford.edu/projects/glove/

【6】Word2Vec (BoW, Google News) https://code.google.com/archive/p/word2vec/

【7】p-mean (monolingual) https://github.com/UKPLab/arxiv2018-xling-sentence-embeddings

【8】Skip-Thought https://github.com/ryankiros/skip-thoughts

【9】InferSent (AllNLI) https://github.com/facebookresearch/InferSent

【10】USE (DAN) https://tfhub.dev/google/universal-sentence-encoder/1

【11】USE (Transformer) https://www.tensorflow.org/hub/modules/google/universal-sentence-encoder-large/1
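
Several entries in the list are marked "BoW" (bag of words). A minimal sketch of that idea, assuming BoW here means averaging the word vectors of a sentence into a single sentence vector: the toy vectors and the helper name `bow_sentence_embedding` are made up for illustration and are not taken from the paper or from any of the pretrained models listed.

```python
import numpy as np

# Toy 3-dimensional word vectors. The values are invented for illustration
# and do not come from any of the pretrained models in the list.
TOY_VECTORS = {
    "the": np.array([0.1, 0.0, 0.2]),
    "cat": np.array([0.9, 0.3, 0.1]),
    "sat": np.array([0.2, 0.8, 0.4]),
}


def bow_sentence_embedding(tokens, vectors, dim=3):
    """Average the vectors of known tokens; return zeros if none are known."""
    known = [vectors[t] for t in tokens if t in vectors]
    if not known:
        return np.zeros(dim)
    return np.mean(known, axis=0)


# "quietly" has no vector in the toy table, so it is simply skipped.
embedding = bow_sentence_embedding(["the", "cat", "sat", "quietly"], TOY_VECTORS)
print(embedding)
```

Averaging ignores word order, so "the cat sat" and "sat the cat" get the same vector in this sketch. That is the usual trade-off of a BoW representation: cheap to compute, but blind to syntax.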

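Introduction:

Today, word embeddings are widely used in natural language processing (NLP) and natural language understanding (NLU). They have improved results on major tasks in many areas, such as machine translation, semantic parsing, text classification, and machine reading. Many methods for learning word embeddings already exist, for example the Neural Probabilistic Language Model, Word2Vec, GloVe, and ELMo. [Note: in 2018 Google also open-sourced BERT, a new word-representation model.]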

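Although most word embedding techniques rely on the distributional hypothesis [Note: the distributional hypothesis states that words occurring in similar contexts tend to have similar meanings], they differ in how they use context to produce the embeddings. Because of these differences, each technique can be expected to do well on particular downstream tasks or to capture particular linguistic properties. For now, choosing a word embedding technique for a specific task still requires a lot of experimentation and evaluation.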
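
The distributional hypothesis can be illustrated with a toy example. The sketch below is not how any of the models above are trained; it only counts neighbouring words in a tiny invented corpus and compares the resulting count vectors with cosine similarity. The corpus, the window size, and the function names are all assumptions made for illustration.

```python
from collections import Counter
import math

# Tiny invented corpus: "cat" and "dog" appear in the same kinds of contexts.
corpus = [
    "the cat drinks milk",
    "the dog drinks water",
    "the cat chases mice",
    "the dog chases birds",
]


def context_counts(target, sentences, window=1):
    """Count the words that appear within `window` positions of `target`."""
    counts = Counter()
    for sentence in sentences:
        tokens = sentence.split()
        for i, tok in enumerate(tokens):
            if tok != target:
                continue
            lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
            counts.update(
                t for j, t in enumerate(tokens[lo:hi], start=lo) if j != i
            )
    return counts


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


cat = context_counts("cat", corpus)
dog = context_counts("dog", corpus)
milk = context_counts("milk", corpus)

print("cat vs dog: ", cosine(cat, dog))
print("cat vs milk:", cosine(cat, milk))
```

In this corpus "cat" and "dog" share exactly the same neighbours, so their similarity is higher than that of "cat" and "milk". Real embedding methods differ in how they define and weight the context, which is the point the paragraph above makes.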

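Although current word embedding techniques provide high-quality representations of words, representing larger units of text such as sentences, paragraphs, and documents remains an open research problem.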
