References: https://blog.csdn.net/leon_winter/article/details/104314441 Mainly covers multi-task learning (Multi-Task Le...

References: https://guyuecanhui.github.io/2019/11/09/paper-2018-ali-esmm/ https://blog.csdn.n...
References: https://www.zhihu.com/question/34878706?sort=created The claim "LSTM can solve vanishing/exploding gradients" is, with respect to LSTM, a...
Reference link: https://github.com/DA-southampton/NLP_ability/blob/master/%E6%B7%B1%E5%BA%A6%E5%A...
References: https://github.com/DA-southampton/NLP_ability/blob/master/%E6%B7%B1%E5%BA%A6%E5%A...
References: https://www.jianshu.com/p/63943ffe2bab https://zhuanlan.zhihu.com/p/49271699 bert-...
References: https://zhuanlan.zhihu.com/p/87562926 https://blog.csdn.net/weixin_37947156/artic...
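References: https://www.jianshu.com/p/63943ffe2bab MLM: add a classification layer on top of the encoder output, multiply the output vectors by the embedding matrix, and convert them into...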
Reference link: https://www.zhihu.com/question/322034410/answer/794201004 ELMo achieves bidirectionality through a bidirectional LSTM; GPT uses...
GELU activation function. Loss functions explained: https://mp.weixin.qq.com/s/pA9JW75p9J5e5KHe3ifcBQ Reference link: https://blog.cs...