Problem 1: Selecting the intersections of several rows and columns of a 2-D array
For example:
import numpy as np
a = np.arange(20).reshape((5,4))
# array([[ 0,  1,  2,  3],
#        [ 4,  5,  6,  7],
#        [ 8,  9, 10, 11],
#        [12, 13, 14, 15],
#        [16, 17, 18, 19]])
# select certain rows(0, 1, 3) AND certain columns(0, 2)
For the answer, see https://stackoverflow.com/questions/22927181/selecting-specific-rows-and-columns-from-numpy-array
Using ix_ one can quickly construct index arrays that will index the cross product. a[np.ix_([1,3],[2,5])] returns the array [[a[1,2] a[1,5]], [a[3,2] a[3,5]]].
>>> a = np.arange(20).reshape((5,4))
>>> a[np.ix_([0,1,3], [0,2])]
array([[ 0,  2],
       [ 4,  6],
       [12, 14]])
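As a rough sketch of why np.ix_ is needed here: passing the two index lists directly makes NumPy pair them element by element, which fails when their lengths differ. The names rows, cols, picked and so on are illustrative only and do not come from the linked answer.

```python
import numpy as np

a = np.arange(20).reshape((5, 4))
rows, cols = [0, 1, 3], [0, 2]

# Passing both lists directly pairs them element-wise; lengths 3 and 2
# cannot be broadcast together, so NumPy raises IndexError.
try:
    a[rows, cols]
    direct_indexing_failed = False
except IndexError:
    direct_indexing_failed = True

# np.ix_ returns one index array per axis, shaped so that they
# broadcast into the full rows x cols grid.
row_idx, col_idx = np.ix_(rows, cols)
print(row_idx.shape, col_idx.shape)  # (3, 1) (1, 2)

picked = a[row_idx, col_idx]

# Same selection written with explicit broadcasting.
picked_manual = a[np.array(rows)[:, None], cols]

# Same values again, selecting rows first and then columns
# (this creates an intermediate copy of the selected rows).
picked_two_step = a[rows][:, cols]

print(picked)
```

All three spellings select the same 3x2 block; np.ix_ is mostly a readable shorthand for the explicit broadcasting form.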
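Problem 2: One-hot encoding
One-hot encoding turns the label vector of a multi-class problem into a 0-1 matrix: each label becomes a vector that is 1 at the position of its class and 0 everywhere else.
Defining the following function is enough: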
import numpy as np
def convert_to_one_hot(Y, C):
    """
    Y is a numpy.array of integer class labels; C is the number of classes.
    """
    Y = np.eye(C)[Y.reshape(-1)].T
    return Y
y = np.array([[1, 2, 3, 0, 2, 1]])
print(y.shape)
print(y.reshape(-1).shape)
C = 4
print(convert_to_one_hot(y, C))
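np.eye(C) builds the C x C identity matrix (ones on the diagonal, zeros elsewhere). Y.reshape(-1) flattens Y into a 1-D vector (in NumPy a 1-D vector has shape (n,), while a one-row matrix has shape (1, n)). np.eye(C)[Y.reshape(-1)] picks, for each label, the corresponding row of the identity matrix, and the final .T transposes the result, which gives the following output: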
(1, 6)
(6,)
[[ 0.  0.  0.  1.  0.  0.]
 [ 1.  0.  0.  0.  0.  1.]
 [ 0.  1.  0.  0.  1.  0.]
 [ 0.  0.  1.  0.  0.  0.]]
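To make the intermediate shapes concrete, here is a minimal sketch that splits the one-liner into its steps and then checks the result by decoding it again. The names labels, rows_per_sample, one_hot and recovered are illustrative, and the argmax check is an addition that is not part of the original function.

```python
import numpy as np

y = np.array([[1, 2, 3, 0, 2, 1]])
C = 4

labels = y.reshape(-1)               # shape (6,): flat vector of class labels
rows_per_sample = np.eye(C)[labels]  # shape (6, 4): one identity row per label
one_hot = rows_per_sample.T          # shape (4, 6): one column per sample

# Taking argmax down each column recovers the original labels.
recovered = np.argmax(one_hot, axis=0)

print(rows_per_sample.shape, one_hot.shape)  # (6, 4) (4, 6)
print(recovered)                             # [1 2 3 0 2 1]
```

np.eye returns floats by default; if an integer matrix is preferred, np.eye(C, dtype=int) can be used instead.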
References
[1] https://stackoverflow.com/questions/22927181/selecting-specific-rows-and-columns-from-numpy-array
[2] https://docs.scipy.org/doc/numpy-1.13.0/user/basics.indexing.html