Notes on Reading About Knowledge Graphs
TransE
A knowledge graph is built from three ingredients: $\bm{h}$, the head entity; $\bm{r}$, the relation; and $\bm{t}$, the tail entity. By convention, a single fact is written as the triple
$$G = \{\bm{h}, \bm{r}, \bm{t}\}$$
Such a triple expresses a single relation between two entities. For example, with $\bm{h} := \text{Italy}$, $\bm{r} := \text{Capital}$, and $\bm{t} := \text{Rome}$, the triple states that the capital of Italy is Rome.
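As a small illustration of how such triples are usually prepared for an embedding model, the sketch below maps entity and relation names to integer ids. The names `Triple` and `build_vocab` and the second fact are assumptions made up for this example; they are not part of the original notes.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    head: str
    relation: str
    tail: str


def build_vocab(items):
    """Assign each distinct name an integer id in order of first appearance."""
    vocab = {}
    for item in items:
        vocab.setdefault(item, len(vocab))
    return vocab


# Illustrative facts; only the first one comes from the text above.
triples = [
    Triple("Italy", "Capital", "Rome"),
    Triple("France", "Capital", "Paris"),
]

entity2id = build_vocab(e for t in triples for e in (t.head, t.tail))
relation2id = build_vocab(t.relation for t in triples)

# Each fact becomes a tuple of ids (h, r, t) that can index embedding tables.
encoded = [
    (entity2id[t.head], relation2id[t.relation], entity2id[t.tail])
    for t in triples
]
print(encoded)  # [(0, 0, 1), (2, 0, 3)]
```

Entities and relations get separate vocabularies because they are normally stored in separate embedding tables.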
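No discussion of knowledge graphs is complete without the TransE model. Its score function is
$$s = ||\bm{h} + \bm{r} - \bm{t}||_2^2$$
TransE treats the relation as a translation from head to tail, so a smaller $s$ means the triple fits better.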
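To make the score concrete, here is a minimal sketch in plain Python (no torch, so it runs on its own). The function name `transe_score` and the 2-dimensional embeddings are hypothetical values chosen so that the arithmetic is easy to follow by hand.

```python
def transe_score(h, r, t):
    """Squared L2 distance ||h + r - t||_2^2 for plain-Python vectors."""
    return sum((hi + ri - ti) ** 2 for hi, ri, ti in zip(h, r, t))


# Hypothetical embeddings chosen so that h + r lands exactly on t_good.
h = [1.0, 2.0]
r = [0.5, -1.0]
t_good = [1.5, 1.0]
t_bad = [3.0, 3.0]

print(transe_score(h, r, t_good))  # 0.0
print(transe_score(h, r, t_bad))   # 6.25
```

The tail that equals $\bm{h} + \bm{r}$ gets a distance of zero, while the unrelated tail is penalized by its squared distance from $\bm{h} + \bm{r}$.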
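Training typically uses a pairwise strategy: take one positive sample and one negative sample (here, the same head and relation with a wrong tail $t_-$) and minimize the margin loss
$$L = [m + s(h,r,t_+) - s(h,r,t_-)]_+$$
where $m$ is the margin and $[x]_+ = \max(0, x)$.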
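The loss can be traced by hand on a toy case. The sketch below is an assumption-laden illustration: `margin_loss`, the margin of 1.0, and the embeddings are invented for the example, and the score is the squared L2 distance, where lower is better.

```python
def transe_score(h, r, t):
    """Squared L2 distance ||h + r - t||_2^2 (lower means a better fit)."""
    return sum((hi + ri - ti) ** 2 for hi, ri, ti in zip(h, r, t))


def margin_loss(pos_score, neg_score, margin=1.0):
    """Hinge loss [m + s_pos - s_neg]_+ for distance-style scores."""
    return max(0.0, margin + pos_score - neg_score)


# Hypothetical embeddings: t_pos is the true tail, the others are corrupted.
h, r = [1.0, 2.0], [0.5, -1.0]
t_pos = [1.5, 1.0]
t_near = [1.5, 1.5]   # negative tail that is still close to h + r
t_far = [3.0, 3.0]    # negative tail that is already far away

s_pos = transe_score(h, r, t_pos)
loss_near = margin_loss(s_pos, transe_score(h, r, t_near))
loss_far = margin_loss(s_pos, transe_score(h, r, t_far))

print(loss_near)  # 0.75
print(loss_far)   # 0.0
```

Only negatives that fall inside the margin contribute to the loss; a negative that is already far enough away yields zero and therefore no gradient.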
Let's try writing it in code:
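import torch

def TransE(self, head, relation, tail, mode):
    # Translation residual h + r - t; `mode` is unused in this simplified version
    score = (head + relation) - tail
    # L1 norm over dim 2 (not the squared L2 norm of the formula); gamma - distance, so higher is better
    score = self.gamma.item() - torch.norm(score, p=1, dim=2)
    return score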
RotatE
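Suppose the relations between pieces of knowledge could be expressed as rotations. What would that look like? By Euler's formula, multiplying a complex number $c$ by $e^{i\theta}$ rotates it by the angle $\theta$. With that in mind, we represent the embeddings as complex numbers: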
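The rotation property is easy to check numerically with Python's standard `cmath` module. The values of `c` and `theta` below are arbitrary choices for illustration.

```python
import cmath
import math

c = 1 + 1j             # arbitrary complex number, argument pi/4
theta = math.pi / 2    # rotate by 90 degrees

rotated = c * cmath.exp(1j * theta)

# The modulus is preserved and the argument grows by theta.
print(abs(c), abs(rotated))
print(cmath.phase(rotated) - cmath.phase(c))
```

Up to floating-point rounding, both moduli equal $\sqrt{2}$ and the phase difference equals $\pi/2$, which is exactly what "rotating by $\theta$" means.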
$$\begin{aligned}
\bm{h} &= \bm{x}_h + i \bm{y}_h\\
\bm{t} &= \bm{x}_t + i \bm{y}_t\\
\bm{r} &= \bm{x}_r + i \bm{y}_r
\end{aligned}$$
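Note that every component of $\bm{r}$ has unit modulus, $|r_i| = 1$, so multiplying by it is a pure rotation with no scaling. Now let's work out the element-wise product: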
$$\begin{aligned}
\bm{h} \cdot \bm{r} &= (\bm{x}_h + i \bm{y}_h)(\bm{x}_r + i \bm{y}_r)\\
&= (\bm{x}_h\bm{x}_r - \bm{y}_h\bm{y}_r) + i(\bm{x}_h\bm{y}_r + \bm{x}_r\bm{y}_h)
\end{aligned}$$
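The expansion of the product can be checked against Python's built-in complex arithmetic. In the sketch below, the helper name `complex_mul_parts` and the sample values are assumptions for illustration; the real part is $x_h x_r - y_h y_r$ and the imaginary part is $x_h y_r + x_r y_h$.

```python
import math


def complex_mul_parts(x_h, y_h, x_r, y_r):
    """Return (real, imag) of (x_h + i*y_h) * (x_r + i*y_r)."""
    real = x_h * x_r - y_h * y_r
    imag = x_h * y_r + x_r * y_h
    return real, imag


# Sample values; (0.6, 0.8) has unit modulus since 0.36 + 0.64 = 1.
x_h, y_h = 2.0, 3.0
x_r, y_r = 0.6, 0.8

real, imag = complex_mul_parts(x_h, y_h, x_r, y_r)
reference = complex(x_h, y_h) * complex(x_r, y_r)

print(math.isclose(real, reference.real), math.isclose(imag, reference.imag))
```

Both comparisons agree, which confirms that the imaginary part mixes the real part of one factor with the imaginary part of the other.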
Subtracting the tail then gives
$$\bm{h} \cdot \bm{r} - \bm{t} = (\bm{x}_h\bm{x}_r - \bm{y}_h\bm{y}_r - \bm{x}_t) + i(\bm{x}_h\bm{y}_r + \bm{x}_r\bm{y}_h - \bm{y}_t)$$
The actual computation is:
import torch

def RotatE(self, head, relation, tail, mode):
    pi = 3.14159265358979323846
    # Split each embedding into real and imaginary halves along dim 2
    re_head, im_head = torch.chunk(head, 2, dim=2)
    re_tail, im_tail = torch.chunk(tail, 2, dim=2)
    # Make phases of relations uniformly distributed in [-pi, pi]
    phase_relation = relation / (self.embedding_range.item() / pi)
    re_relation = torch.cos(phase_relation)
    im_relation = torch.sin(phase_relation)
    # Complex product h * r, expanded into real and imaginary parts
    re_score = re_head * re_relation - im_head * im_relation
    im_score = re_head * im_relation + im_head * re_relation
    # Subtract the tail: h * r - t
    re_score = re_score - re_tail
    im_score = im_score - im_tail
    # Modulus of each complex component, then summed over the embedding axis
    score = torch.stack([re_score, im_score], dim=0)
    score = score.norm(dim=0)
    score = self.gamma.item() - score.sum(dim=2)
    return score
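For intuition, the same scoring rule can be sketched with Python's built-in complex numbers instead of separate real and imaginary tensors. This is a simplified illustration, not the implementation above: `rotate_score`, the value of `gamma`, and the embeddings are all hypothetical, and the relation is given directly as a list of phases.

```python
import cmath
import math


def rotate_score(head, phases, tail, gamma):
    """gamma minus the summed moduli |h_i * e^{i*theta_i} - t_i|."""
    distance = sum(
        abs(h * cmath.exp(1j * theta) - t)
        for h, theta, t in zip(head, phases, tail)
    )
    return gamma - distance


# Hypothetical 2-component complex embeddings and relation phases.
head = [1 + 0j, 0 + 2j]
phases = [math.pi / 2, math.pi]
gamma = 12.0

# A tail that is exactly the rotated head, and one that is the unrotated head.
tail_match = [h * cmath.exp(1j * th) for h, th in zip(head, phases)]
tail_other = list(head)

score_match = rotate_score(head, phases, tail_match, gamma)
score_other = rotate_score(head, phases, tail_other, gamma)

print(score_match)
print(score_other)
```

When the tail equals the rotated head the distance vanishes and the score reaches `gamma`; for the unrotated tail the distance is $\sqrt{2} + 4$, so the score is lower.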
Source: RotatE: Knowledge Graph Embedding by Relational Rotation in Complex Space.