[Knowledge Graph Series] Graph Pre-Training with Randomly Perturbed Subgraphs

This post introduces GraphCL (NIPS 2020), a graph pre-training model based on maximizing the mutual information between randomly perturbed subgraphs. It covers the model's core ideas and overall approach. For the full slide deck (PPT), follow the WeChat public account 【AI机器学习与知识图谱】 and reply with the keyword: GraphCL.
1. Background

Why pre-train on graphs: Graph neural networks (GNNs) have proven to be powerful tools for modeling graph-structured data. However, training a GNN model usually requires large amounts of task-specific labeled data, which is often expensive to obtain. Pre-training a self-supervised GNN on unlabeled graph data is an effective way to reduce the labeling effort; the pre-trained model can then be applied to downstream tasks where only a small amount of labeled graph data is available.
Large-scale graph pre-training: Large-scale knowledge-graph pre-training schemes generally follow the same recipe: first, sample subgraphs and train the model on those subgraphs; second, use a self-supervised setup that masks nodes or edges in the graph and trains the model to recover them; third, apply negative sampling when computing the loss, since the graph is too large to score all negative examples. A minimal sketch of this recipe is shown below.
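An illustrative Python sketch of that generic recipe (the helper names, the single-edge masking scheme, and the adjacency-dict format are assumptions for illustration, not any specific paper's pipeline):

```python
import random

def sample_subgraph(adj, seed, hops=2):
    """BFS out to `hops` hops from `seed`; returns the subgraph's node set."""
    frontier, nodes = {seed}, {seed}
    for _ in range(hops):
        frontier = {v for u in frontier for v in adj[u]} - nodes
        nodes |= frontier
    return nodes

def training_example(adj, seed, num_negatives=5):
    """One self-supervised example: "mask" one real edge in the sampled
    subgraph (positive) and draw random non-edges as negatives."""
    nodes = list(sample_subgraph(adj, seed))
    u = random.choice([n for n in nodes if adj[n]])
    v = random.choice(sorted(adj[u]))            # positive: a held-out real edge
    negatives = []
    while len(negatives) < num_negatives:        # negatives: sampled non-edges
        a, b = random.sample(nodes, 2)
        if b not in adj[a]:
            negatives.append((a, b))
    return (u, v), negatives

adj = {0: {1, 2}, 1: {0, 3}, 2: {0}, 3: {1}}
print(training_example(adj, seed=0))
```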
Contrastive vs. generative learning: see the previous post in this series for a detailed explanation.
2. The GraphCL Model

GraphCL is a self-supervised graph pre-training model based on contrastive learning. For each node, GraphCL generates two randomly perturbed L-hop subgraphs and trains by maximizing the similarity between the two subgraphs' representations. The design revolves around the following three questions.
Question 1: A Stochastic Perturbation. How do we obtain two L-hop subgraphs for one node? Starting from the node's complete L-hop subgraph, the paper generates different subgraph structures by randomly dropping each edge with probability p, as sketched below.
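A minimal sketch of this edge-dropping perturbation (the function name and edge-list format are illustrative assumptions; the paper describes the idea, not this exact code):

```python
import random

def drop_edges(edges, p=0.2, rng=random):
    """Keep each edge independently with probability 1 - p."""
    return [e for e in edges if rng.random() >= p]

# Two independently perturbed views of the same L-hop subgraph:
edges = [(0, 1), (0, 2), (1, 3), (2, 3)]
view1 = drop_edges(edges, p=0.2)
view2 = drop_edges(edges, p=0.2)
```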
Question 2: A GNN-based Encoder. Which graph neural network encodes the two L-hop subgraphs? A simple GCN-style model with the mean-pooling propagation rule (Hamilton et al., 2017) is used as the aggregation function, with different variants for transductive and inductive learning. In the transductive setting the aggregation is:

$$h_v^{(l)} = \sigma\left(W^{(l)} \cdot \mathrm{mean}\left\{h_u^{(l-1)} : u \in \mathcal{N}(v) \cup \{v\}\right\}\right)$$

In the inductive setting the aggregation is:

$$h_{\mathcal{N}(v)}^{(l)} = \mathrm{mean}\left\{h_u^{(l-1)} : u \in \mathcal{N}(v)\right\}, \qquad h_v^{(l)} = \sigma\left(W^{(l)} \cdot \left[h_v^{(l-1)} \,\Vert\, h_{\mathcal{N}(v)}^{(l)}\right]\right)$$
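A minimal PyTorch sketch of such an encoder, assuming a dense adjacency matrix and the inductive concat-then-transform variant above (the class and layer names are illustrative, not the authors' code):

```python
import torch
import torch.nn as nn

class MeanAggLayer(nn.Module):
    """One mean-pooling propagation layer: concat(self, mean of neighbors)."""
    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.lin = nn.Linear(2 * in_dim, out_dim)

    def forward(self, h, adj):
        # adj: dense (N, N) 0/1 adjacency; divide by degree to average neighbors
        deg = adj.sum(dim=1, keepdim=True).clamp(min=1)
        h_nbr = (adj @ h) / deg
        return torch.relu(self.lin(torch.cat([h, h_nbr], dim=1)))

class Encoder(nn.Module):
    """L stacked layers: the encoder f mapping a subgraph view to node embeddings."""
    def __init__(self, in_dim, hid_dim, num_layers=2):
        super().__init__()
        dims = [in_dim] + [hid_dim] * num_layers
        self.layers = nn.ModuleList(MeanAggLayer(a, b) for a, b in zip(dims, dims[1:]))

    def forward(self, x, adj):
        for layer in self.layers:
            x = layer(x, adj)
        return x
```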
Question 3: A Contrastive Loss Function. How is the loss defined? The similarity between the representations of the two L-hop subgraphs is measured with cosine similarity:

$$\mathrm{sim}(z_i, z_j) = \frac{z_i^{\top} z_j}{\lVert z_i \rVert \, \lVert z_j \rVert}$$

The loss is based on a normalized temperature-scaled cross entropy (NT-Xent). With a mini-batch of N nodes and two views per node, the per-node loss takes the form:

$$\ell(u) = -\log \frac{\exp\left(\mathrm{sim}(z_u^{(1)}, z_u^{(2)})/\tau\right)}{\sum_{v=1}^{2N} \mathbb{1}_{[u \neq v]} \exp\left(\mathrm{sim}(z_u^{(1)}, z_v)/\tau\right)}$$

where the indicator 1_{[u≠v]} equals 1 when u ≠ v and 0 otherwise, and τ is a temperature parameter. A sketch of this loss appears below.
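A compact PyTorch sketch of that NT-Xent loss, assuming z1[i] and z2[i] are the two views of node i (a standard NT-Xent formulation; the exact batch construction is an assumption):

```python
import torch
import torch.nn.functional as F

def nt_xent(z1, z2, tau=0.5):
    """NT-Xent over a batch of N positive pairs (z1[i], z2[i])."""
    n = z1.size(0)
    z = F.normalize(torch.cat([z1, z2], dim=0), dim=1)  # 2N unit vectors -> dot = cosine
    sim = z @ z.t() / tau                               # (2N, 2N) temperature-scaled sims
    sim.fill_diagonal_(float('-inf'))                   # implements the 1_[u != v] indicator
    targets = torch.cat([torch.arange(n, 2 * n), torch.arange(0, n)])
    return F.cross_entropy(sim, targets)                # -log softmax at each positive pair
```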
GraphCL execution steps
For a sampled mini-batch B, GraphCL proceeds as follows (a minimal end-to-end sketch of one training step follows the figure note below):

1. For each node u in B, define (X_u, A_u) as the L-hop subgraph of u, containing all nodes and edges within L hops of u together with their feature information.

2. Apply the perturbation strategy described above to obtain two perturbed L-hop subgraphs t_1 and t_2 of node u:

$$t_1 \sim \mathcal{T}(X_u, A_u), \qquad t_2 \sim \mathcal{T}(X_u, A_u)$$

3. Apply the graph encoder f to t_1 and t_2 to obtain their representations:

$$z_1 = f(t_1), \qquad z_2 = f(t_2)$$

4. Train the parameters of the graph encoder f with the loss:

$$\mathcal{L} = \frac{1}{|B|} \sum_{u \in B} \ell(u)$$

5. The overall GraphCL architecture is shown in the figure below.

[Figure: GraphCL model architecture]
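Putting the pieces together, a minimal sketch of one training step under the assumptions above (it reuses the illustrative Encoder and nt_xent sketches from earlier; the mean readout from node embeddings to a single subgraph vector is likewise an assumption):

```python
import torch

def perturb(adj, p=0.2):
    """Randomly drop edges of a dense adjacency matrix with probability p."""
    keep = (torch.rand_like(adj) >= p).float()
    keep = torch.maximum(keep, torch.eye(adj.size(0)))  # keep self-loops
    return adj * keep * keep.t()                        # symmetric edge drop

def train_step(encoder, optimizer, batch, p=0.2, tau=0.5):
    """One GraphCL-style step: perturb each subgraph twice, encode, contrast."""
    z1, z2 = [], []
    for x_u, adj_u in batch:                  # (features, adjacency) of each node's L-hop subgraph
        h1 = encoder(x_u, perturb(adj_u, p))  # two independently perturbed views
        h2 = encoder(x_u, perturb(adj_u, p))
        z1.append(h1.mean(dim=0))             # mean readout -> subgraph vector (assumption)
        z2.append(h2.mean(dim=0))
    loss = nt_xent(torch.stack(z1), torch.stack(z2), tau)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```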
3. Conclusions
In both the transductive and inductive settings, the experiments show that GraphCL significantly outperforms state-of-the-art unsupervised methods on many node-classification benchmarks.

[Tables: node-classification results in the transductive and inductive settings]
Previous highlights

[Knowledge Graph Series] A 2020 Survey of Over-Smoothing
[Knowledge Graph Series] Knowledge Graph Embedding with 2D Convolutions
[Knowledge Graph Series] Adaptive Depth and Breadth Graph Neural Network Models
[Knowledge Graph Series] Neural-Symbolic Logical Reasoning over Knowledge Graphs
[Knowledge Graph Series] The Multi-Relational Graph Network CompGCN
[Knowledge Graph Series] A Survey of Knowledge Graph Representation Learning: Nearly 30 Papers Reviewed
[Interview Series] Eight MS/PhD Students' ByteDance Interview Journeys
