AGNN: Attention-based Graph Neural Network for Semi-supervised learning (Mar 2018)
Background
Graph neural network architectures alternate between a propagation layer, which aggregates the hidden states of the local neighborhood, and a fully connected layer.
AGNN:
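- Removing the intermediate fully connected layers reduces the number of model parameters. In semi-supervised learning, where labeled data is scarce, this leaves more room for designing innovative propagation layers;
- Attention mechanisms in the propagation layer dynamically adjust how local information is weighted, which yields higher accuracy.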
Contribution
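- Using a linear classifier (multinomial logistic regression) and removing the intermediate non-linear activation layers keeps only the linear neighborhood propagation of the graph neural network. This model achieves results comparable to the best graph models, which also shows how important aggregating neighborhood information on the graph is;
- Based on attention mechanisms:
  - Reduces model complexity: each intermediate layer has only one scalar parameter
  - Discovers dynamically and adaptively which nodes are relevant to the target node for classification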
Model
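Node feature vectors: $X = [X_1, \dots, X_n]$, $X_i \in \mathbb{R}^{d_x}$
Labels: $Y_i$
Labeled subset: $L \subset V$, where $V$ is the set of $n$ nodes
Assuming that neighboring nodes are more likely to share the same label, the loss is: $\mathcal{L}(X, Y_L) = \mathcal{L}_{label}(X_L, Y_L) + \lambda \mathcal{L}_G(X)$
- Supervised loss: $\mathcal{L}_{label} = \sum_{i \in L} l(Y_i, f(X_i))$, with $f$ the classifier
- Laplacian regularization: $\mathcal{L}_G = \sum_{(i,j) \in E} \| f(X_i) - f(X_j) \|^2$, where $E$ is the edge set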
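
A minimal sketch of a loss with these two terms, for illustration only. It assumes squared error as the per-node loss, and the names `preds`, `labels`, `edges`, and `lam` are hypothetical, not taken from the paper.

```python
# Toy sketch: supervised loss plus graph Laplacian regularization.
# Assumption: squared error is used for both terms.

def squared_distance(u, v):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))

def laplacian_regularized_loss(preds, labels, edges, lam):
    """preds: node -> prediction vector; labels: node -> target vector
    (labeled nodes only); edges: list of (i, j); lam: regularization weight."""
    # Supervised term: computed on the labeled subset only.
    supervised = sum(squared_distance(preds[i], y) for i, y in labels.items())
    # Smoothness term: penalizes connected nodes whose predictions differ.
    smoothness = sum(squared_distance(preds[i], preds[j]) for i, j in edges)
    return supervised + lam * smoothness

preds = {0: [1.0, 0.0], 1: [0.5, 0.5], 2: [0.0, 1.0]}
labels = {0: [1.0, 0.0]}           # only node 0 is labeled
edges = [(0, 1), (1, 2)]
loss = laplacian_regularized_loss(preds, labels, edges, lam=0.1)
```

Here the labeled node is predicted exactly, so the whole loss comes from the smoothness term over the two edges.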
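Adjacency matrix: $A \in \{0, 1\}^{n \times n}$
Target function: $Z = F(X, A) \in \mathbb{R}^{n \times d_y}$, which predicts the label of each node, where:
$Z_{ic}$: the probability that node $i$ belongs to class $c$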
Propagation Layer
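Hidden states at layer $t$: $H^{(t)} \in \mathbb{R}^{n \times d_h}$
Propagation matrix: $P \in \mathbb{R}^{n \times n}$
Propagation layer: $\tilde{H}^{(t+1)} = P H^{(t)}$, where $P$ can be a local average or a random walk:
- Random walk: $P = D^{-1} A$, with $D = \mathrm{diag}(A \mathbf{1})$ the degree matrix
Single-layer propagation (propagation followed by a single-layer perceptron): $H^{(t+1)} = \sigma(P H^{(t)} W^{(t)})$, where $W^{(t)}$ is the weight matrix and $\sigma$ an entry-wise activation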
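
A minimal sketch of one propagation step, assuming the random-walk choice $P = D^{-1} A$ (a row-normalized adjacency matrix). The helper names and the toy graph are illustrative.

```python
def random_walk_matrix(adj):
    """Row-normalize an adjacency matrix: P = D^{-1} A."""
    P = []
    for row in adj:
        degree = sum(row)
        # Assumption of this sketch: isolated nodes keep an all-zero row.
        P.append([a / degree if degree else 0.0 for a in row])
    return P

def matmul(A, B):
    """Plain nested-list matrix product."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

# Star graph: node 0 is connected to nodes 1 and 2.
adj = [[0, 1, 1],
       [1, 0, 0],
       [1, 0, 0]]
H = [[1.0, 0.0],
     [0.0, 2.0],
     [4.0, 0.0]]

P = random_walk_matrix(adj)
H_next = matmul(P, H)   # each row becomes the average of its neighbors' rows
```

With this choice of $P$, node 0 receives the mean of the hidden states of nodes 1 and 2, and each leaf simply copies node 0.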
GCN
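GCN is a special case of GNN that stacks two layers of a specific propagation and perceptron: $Z = \mathrm{softmax}\left(P \, \mathrm{ReLU}(P X W^{(0)}) \, W^{(1)}\right)$
where: $P = \tilde{D}^{-1/2} \tilde{A} \tilde{D}^{-1/2}$, $\tilde{A} = A + I$, $\tilde{D} = \mathrm{diag}(\tilde{A} \mathbf{1})$
Loss: $\mathcal{L} = -\sum_{i \in L} \sum_{c=1}^{d_y} Y_{ic} \log Z_{ic}$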
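
A minimal sketch of a two-layer GCN forward pass. It assumes the standard symmetric normalization with self-loops, $P = \tilde{D}^{-1/2}(A + I)\tilde{D}^{-1/2}$, and a cross-entropy loss over the labeled nodes. The weights are fixed toy values rather than trained ones.

```python
import math

def matmul(A, B):
    """Plain nested-list matrix product."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

def gcn_propagation_matrix(adj):
    """Symmetrically normalized adjacency with self-loops."""
    n = len(adj)
    a_tilde = [[adj[i][j] + (1 if i == j else 0) for j in range(n)]
               for i in range(n)]
    deg = [sum(row) for row in a_tilde]   # always >= 1 thanks to self-loops
    return [[a_tilde[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
            for i in range(n)]

def relu(M):
    return [[max(0.0, x) for x in row] for row in M]

def softmax_rows(M):
    out = []
    for row in M:
        m = max(row)                       # subtract the max for stability
        exps = [math.exp(x - m) for x in row]
        total = sum(exps)
        out.append([e / total for e in exps])
    return out

def gcn_forward(P, X, W0, W1):
    """Z = softmax(P ReLU(P X W0) W1)."""
    hidden = relu(matmul(P, matmul(X, W0)))
    return softmax_rows(matmul(P, matmul(hidden, W1)))

def cross_entropy(Z, labels):
    """labels: node index -> class index, for labeled nodes only."""
    return -sum(math.log(Z[i][c]) for i, c in labels.items())

# Path graph 0 - 1 - 2 with toy features and weights.
adj = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
X = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
W0 = [[1.0, -1.0], [0.5, 2.0]]
W1 = [[1.0, 0.0], [-1.0, 1.0]]

P = gcn_propagation_matrix(adj)
Z = gcn_forward(P, X, W0, W1)
loss = cross_entropy(Z, {0: 0, 2: 1})      # nodes 0 and 2 are labeled
```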
GLN
Removing the intermediate non-linear activation from GCN gives GLN: $Z = \mathrm{softmax}\left(P^2 X W^{(0)} W^{(1)}\right)$
The two propagation layers simply take a linear local average of the raw features weighted by their degrees, and at the output layer a simple linear classifier (multinomial logistic regression) is applied.
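
A small numerical check of why this model is linear, as a sketch on hypothetical toy matrices: once the ReLU is removed, applying the two layers one after the other gives the same result as propagating twice with $P^2$ and applying a single weight matrix $W = W^{(0)} W^{(1)}$.

```python
import math

def matmul(A, B):
    """Plain nested-list matrix product."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

def softmax_rows(M):
    out = []
    for row in M:
        m = max(row)
        exps = [math.exp(x - m) for x in row]
        total = sum(exps)
        out.append([e / total for e in exps])
    return out

# Hypothetical toy inputs; P is a row-normalized propagation matrix.
P = [[0.5, 0.5, 0.0],
     [0.25, 0.5, 0.25],
     [0.0, 0.5, 0.5]]
X = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
W0 = [[1.0, -1.0], [0.5, 2.0]]
W1 = [[1.0, 0.0], [-1.0, 1.0]]

# Layer by layer, with no activation in between.
layered = softmax_rows(matmul(P, matmul(matmul(P, matmul(X, W0)), W1)))

# Collapsed form: softmax(P^2 X W) with W = W0 W1.
collapsed = softmax_rows(matmul(matmul(matmul(P, P), X), matmul(W0, W1)))
```

The two results agree up to floating-point error, so the only learned part of GLN is effectively one linear classifier on top of twice-averaged features.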
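AGNN
GCN's layer-to-layer propagation is static: it does not take the node states into account, i.e. it is not an adaptive propagation.
For example, with $P_{ij} = 1 / \sqrt{|N(i)| \, |N(j)|}$, there is no way to know which neighboring nodes are more relevant to the node being classified.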
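Embedding layer: $H^{(1)} = \mathrm{ReLU}(X W^{(0)})$
Attention-guided propagation layers: $H^{(t+1)} = P^{(t)} H^{(t)}$
Output row-vector of node $i$: $H_i^{(t+1)} = \sum_{j \in N(i) \cup \{i\}} P_{ij}^{(t)} H_j^{(t)}$
where: $P_i^{(t)} = \mathrm{softmax}\left(\left[\beta^{(t)} \cos(H_i^{(t)}, H_j^{(t)})\right]_{j \in N(i) \cup \{i\}}\right)$, with $\cos(x, y) = x^T y / (\|x\| \, \|y\|)$ and $\beta^{(t)}$ the single scalar parameter of layer $t$
Attention from node $j$ to node $i$ is: $P_{ij}^{(t)} = \frac{1}{C} e^{\beta^{(t)} \cos(H_i^{(t)}, H_j^{(t)})}$, with $C = \sum_{j \in N(i) \cup \{i\}} e^{\beta^{(t)} \cos(H_i^{(t)}, H_j^{(t)})}$
Self-loops are added in propagation: each sum runs over $N(i) \cup \{i\}$, so a node also attends to itself.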
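
A minimal sketch of one attention-guided propagation layer. It assumes attention scores are a cosine similarity scaled by a single scalar `beta`, normalized by a softmax over each node's neighbors plus the node itself. The function names and toy data are illustrative, and hidden vectors are assumed to be non-zero.

```python
import math

def cosine(u, v):
    """Cosine similarity; assumes neither vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def attention_propagation(H, neighbors, beta):
    """H: list of hidden row-vectors; neighbors: list of neighbor index
    lists (without self); beta: the layer's scalar parameter.
    Returns the new hidden states and the attention weights per node."""
    H_next, P = [], []
    for i, h_i in enumerate(H):
        support = list(neighbors[i]) + [i]          # add the self-loop
        scores = [beta * cosine(h_i, H[j]) for j in support]
        m = max(scores)                              # for numerical stability
        exps = [math.exp(s - m) for s in scores]
        C = sum(exps)
        weights = [e / C for e in exps]
        P.append(dict(zip(support, weights)))
        # Weighted sum of the neighbors' hidden states.
        H_next.append([sum(w * H[j][k] for w, j in zip(weights, support))
                       for k in range(len(h_i))])
    return H_next, P

# Node 0 has two neighbors: node 1 is similar to it, node 2 is orthogonal.
H = [[1.0, 0.0], [1.0, 0.1], [0.0, 1.0]]
neighbors = [[1, 2], [0], [0]]

H_next, P = attention_propagation(H, neighbors, beta=5.0)
_, P_uniform = attention_propagation(H, neighbors, beta=0.0)
```

With a positive `beta`, node 0 puts more weight on the similar neighbor (node 1) than on the orthogonal one (node 2). With `beta = 0` the weights are uniform and the layer reduces to a plain local average.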