Feature-based vs. fine-tuning

Similarities

Both approaches reuse an existing pretrained NLP model to solve your own task.

Differences

As shown in figure 2 of {1}, in the fine-tuning strategy all weights are changed when training on the new task (except for the weights of the last layers for the original task), whereas in the feature extraction strategy only the weights of the newly added last layers change during the training phase:

[Figure: feature-based and fine-tuning]

feature-based

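Only the parameters of the newly added last layer(s) change.
A feature-based approach typically has two steps:

First, train a language model without supervision on a large corpus A. The trained language model is then used to produce embeddings.
Then build a task-specific model, for example a sequence labeling model, and train it with supervision on a labeled corpus B. The language model's parameters are frozen: the training data from corpus B is passed through the language model to obtain LM (language model) embeddings, which are given to the task-specific model as additional features.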

ELMo is a typical example of this approach.
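
The two steps above can be sketched with a toy example in plain Python. Everything here is an assumption made for illustration and is not taken from any real library or from ELMo itself: `pretrained_lm` is a hypothetical stand-in for a language model already trained on corpus A, `corpus_b` is made-up labeled data, and the task-specific model is a simple logistic regression. For brevity the LM embedding is the only input feature, whereas the text describes it as an additional feature.

```python
import math

# Hypothetical stand-in for a language model pretrained on corpus A:
# a fixed table mapping each token to a 2-dimensional vector.
pretrained_lm = {
    "good": [0.9, 0.1],
    "great": [0.8, 0.2],
    "bad": [0.1, 0.9],
    "awful": [0.2, 0.8],
}

# Made-up labeled corpus B: 1 = positive, 0 = negative.
corpus_b = [
    (["good"], 1),
    (["great", "good"], 1),
    (["bad"], 0),
    (["awful", "bad"], 0),
]


def lm_embedding(tokens):
    """Average the frozen LM vectors of the tokens (the "LM embedding")."""
    n = len(tokens)
    return [sum(pretrained_lm[t][i] for t in tokens) / n for i in range(2)]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_head(data, epochs=200, lr=0.5):
    """Train only the task-specific layer; the LM table is read, never written."""
    w, b = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for tokens, label in data:
            x = lm_embedding(tokens)
            p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
            err = p - label  # gradient of the log loss w.r.t. the logit
            w = [wi - lr * err * xi for wi, xi in zip(w, x)]
            b -= lr * err
    return w, b


def predict(w, b, tokens):
    x = lm_embedding(tokens)
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)


# Snapshot the LM parameters so we can check that they stay frozen.
lm_before = {t: list(v) for t, v in pretrained_lm.items()}
w, b = train_head(corpus_b)
```

After training, `pretrained_lm` is identical to `lm_before`: only `w` and `b`, the parameters of the new layer, have changed.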

fine-tuning

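All parameters are updated, except those of the last layer for the original task, which is replaced by a new task-specific layer.
Fine-tuning means adding a small number of task-specific parameters on top of an already trained language model, for example a softmax layer for a classification problem, and then retraining on the new corpus to fine-tune the model.

Build a language model and train it on a large corpus A.
Add a small number of neural network layers on top of the language model to handle the specific task, for example sequence labeling or classification, and then train the model with supervision on a labeled corpus B. During this process the language model's parameters are not frozen; they remain trainable variables.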
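
The fine-tuning steps above can be sketched with a toy example in plain Python. As before, every name is an assumption made for illustration: `lm` is a hypothetical stand-in for a language model pretrained on corpus A, `corpus_b` is made-up labeled data, and the added layer is a single logistic unit rather than a full softmax layer. The point of the sketch is that the gradient is propagated into the language model's parameters as well as into the new layer.

```python
import math

# Hypothetical stand-in for a language model pretrained on corpus A.
lm = {
    "good": [0.9, 0.1],
    "great": [0.8, 0.2],
    "bad": [0.1, 0.9],
    "awful": [0.2, 0.8],
}

# Made-up labeled corpus B: 1 = positive, 0 = negative.
corpus_b = [
    (["good"], 1),
    (["great", "good"], 1),
    (["bad"], 0),
    (["awful", "bad"], 0),
]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def embed(lm, tokens):
    """Average the current LM vectors of the tokens."""
    n = len(tokens)
    return [sum(lm[t][i] for t in tokens) / n for i in range(2)]


def fine_tune(lm, data, epochs=200, lr=0.5):
    """Update the new layer AND the LM vectors from the same loss."""
    w, b = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for tokens, label in data:
            n = len(tokens)
            x = embed(lm, tokens)
            p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
            err = p - label  # gradient of the log loss w.r.t. the logit
            # Backpropagate into the LM parameters (uses w before its update).
            grad_x = [err * wi for wi in w]
            for t in tokens:
                lm[t] = [v - lr * g / n for v, g in zip(lm[t], grad_x)]
            # Update the task-specific parameters.
            w = [wi - lr * err * xi for wi, xi in zip(w, x)]
            b -= lr * err
    return w, b


def predict(lm, w, b, tokens):
    x = embed(lm, tokens)
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)


# Snapshot the LM parameters so we can check that they were updated.
lm_before = {t: list(v) for t, v in lm.items()}
w, b = fine_tune(lm, corpus_b)
```

Here `lm` differs from `lm_before` after training, which is the difference from the feature-based setup: the pretrained parameters stay trainable rather than frozen.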

References: Zhihu, "fine-tuning"