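The array function creates one-dimensional or multidimensional arrays. Its first argument is the data, which can be a list, a range, or a tuple; optional arguments such as dtype may follow.
Whether the result is one-dimensional or multidimensional, its type is numpy.ndarray (N-dimensional array):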
In [11]: a = np.array(list(range(9)))
In [12]: a
Out[12]: array([0, 1, 2, 3, 4, 5, 6, 7, 8])
In [13]: a = np.array(range(9))
In [14]: a
Out[14]: array([0, 1, 2, 3, 4, 5, 6, 7, 8])
In [15]: a = np.array(tuple(range(9)))
In [16]: a
Out[16]: array([0, 1, 2, 3, 4, 5, 6, 7, 8])
In [17]: a = np.array([list('abcd'), list(range(1,5)), list('MNOP')])
In [18]: a
Out[18]:
array([['a', 'b', 'c', 'd'],
['1', '2', '3', '4'],
['M', 'N', 'O', 'P']], dtype='<U1')
In [19]: type(a)
Out[19]: numpy.ndarray
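The string dtype in the output above appears because an ndarray stores a single element type, so mixing strings and integers promotes every element to a string. A minimal sketch of that behavior, plus the optional dtype argument; the variable names are illustrative only:

```python
import numpy as np

# Strings mixed with integers are promoted to one common string dtype.
mixed = np.array([list("abcd"), list(range(1, 5))])

# The optional dtype argument sets the element type explicitly.
floats = np.array([1, 2, 3], dtype=float)

print(mixed.dtype.kind)  # 'U' stands for a Unicode string dtype
print(floats.dtype)      # float64
```

The exact string width (the number after U) can differ between NumPy versions, which is why the sketch only inspects dtype.kind.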
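np.ones creates an array whose elements are all 1.
np.zeros creates an array whose elements are all 0.
np.empty creates an uninitialized array: it only allocates memory and fills in no values, so its contents are arbitrary.
The first argument of each function is the shape (an integer or a tuple); an optional dtype may follow. Each returns a numpy.ndarray: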
In [63]: np.ones(8)
Out[63]: array([1., 1., 1., 1., 1., 1., 1., 1.])
In [64]: np.zeros(8)
Out[64]: array([0., 0., 0., 0., 0., 0., 0., 0.])
In [65]: np.empty(8)  # uninitialized memory; the zeros below are incidental
Out[65]: array([0., 0., 0., 0., 0., 0., 0., 0.])
In [66]: np.empty(8) == np.zeros(8)  # happens to be True here; not guaranteed
Out[66]: array([ True, True, True, True, True, True, True, True])
In [74]: np.ones((3,4))
Out[74]:
array([[1., 1., 1., 1.],
[1., 1., 1., 1.],
[1., 1., 1., 1.]])
In [75]: type(np.ones((3,4)))
Out[75]: numpy.ndarray
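Because np.empty does not initialize its memory, a safe pattern is to assign every element before reading any of them. The sketch below illustrates that, along with the optional dtype argument; the names buf and zeros_int are made up for the example:

```python
import numpy as np

# np.empty only allocates memory; assign every element before reading it.
buf = np.empty((2, 3))
buf[:] = 7.0

# np.zeros guarantees zeros; dtype selects the element type.
zeros_int = np.zeros((2, 3), dtype=int)

print(buf)
print(zeros_int.dtype)
```

When the values must be zero, np.zeros is the safer choice; np.empty only saves the cost of filling the buffer.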
arange works like Python's range, but returns an ndarray:
In [85]: a = np.arange(3,9)
In [86]: a
Out[86]: array([3, 4, 5, 6, 7, 8])
In [87]: a = np.arange(9)
In [88]: a
Out[88]: array([0, 1, 2, 3, 4, 5, 6, 7, 8])
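Like range, arange also accepts a step as its third argument, and unlike range it accepts non-integer steps. A short sketch with illustrative names:

```python
import numpy as np

evens = np.arange(0, 10, 2)     # start, stop (exclusive), step
halves = np.arange(0, 2, 0.5)   # a float step, which range() would reject

print(evens.tolist())   # [0, 2, 4, 6, 8]
print(halves.tolist())  # [0.0, 0.5, 1.0, 1.5]
```

With float steps the number of elements can be affected by rounding, so linspace is often preferred when the element count matters.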
Some attributes and methods of a multidimensional array:
In [89]: a = np.ones((2,3,4))
In [90]: a
Out[90]:
array([[[1., 1., 1., 1.],
[1., 1., 1., 1.],
[1., 1., 1., 1.]],
[[1., 1., 1., 1.],
[1., 1., 1., 1.],
[1., 1., 1., 1.]]])
In [91]: a.shape # shape: 2 blocks of 3 rows by 4 columns
Out[91]: (2, 3, 4)
In [92]: a.ndim # number of dimensions
Out[92]: 3
In [93]: a.size # total number of elements
Out[93]: 24
In [94]: a.mean() # mean value
Out[94]: 1.0
In [95]: a.dtype # data type of the elements
Out[95]: dtype('float64')
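A numpy.ndarray can be reshaped with the reshape method, which returns a new array object and leaves the shape of the original unchanged: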
In [96]: a = np.arange(12)
In [97]: a
Out[97]: array([ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11])
In [98]: a.reshape(3,4)
Out[98]:
array([[ 0, 1, 2, 3],
[ 4, 5, 6, 7],
[ 8, 9, 10, 11]])
In [99]: a
Out[99]: array([ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11])
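Two details of reshape are worth a quick sketch: passing -1 lets NumPy infer one dimension, and for a contiguous array such as this one the result is a view that shares memory with the original. The names below are illustrative:

```python
import numpy as np

a = np.arange(12)
b = a.reshape(3, -1)   # -1 asks NumPy to infer the missing dimension (4 here)

print(a.shape, b.shape)  # (12,) (3, 4)

# b is a view of a in this case, so writing through b also changes a.
b[0, 0] = 100
print(a[0])  # 100
```

If an independent array is needed, calling copy() on the reshaped result avoids this sharing.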
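The * operator multiplies arrays element by element; the dot method multiplies them according to the rules of matrix multiplication: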
In [115]: a = np.arange(1,5).reshape(2,2)
In [116]: b = np.arange(5,9).reshape(2,2)
In [117]: a
Out[117]:
array([[1, 2],
[3, 4]])
In [118]: b
Out[118]:
array([[5, 6],
[7, 8]])
In [119]: a * b
Out[119]:
array([[ 5, 12],
[21, 32]])
In [120]: a.dot(b)
Out[120]:
array([[19, 22],
[43, 50]])
# 19 = 1*5 + 2*7
# 22 = 1*6 + 2*8
# 43 = 3*5 + 4*7
# 50 = 3*6 + 4*8
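The same matrix product can also be written with the @ operator (Python 3.5 and later). A small sketch that reuses the arrays from the session above and compares the three forms:

```python
import numpy as np

a = np.arange(1, 5).reshape(2, 2)
b = np.arange(5, 9).reshape(2, 2)

elementwise = a * b   # multiplies matching positions
matrix = a @ b        # matrix product, same as a.dot(b) for 2-D arrays

print(elementwise.tolist())  # [[5, 12], [21, 32]]
print(matrix.tolist())       # [[19, 22], [43, 50]]
print(np.array_equal(matrix, a.dot(b)))  # True
```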
The random module creates arrays whose elements are random numbers. Common uses:
# randint returns an integer in [0, 3) (with a single argument, the lower bound defaults to 0)
In [131]: for i in range(9):
     ...:     print(np.random.randint(3))
     ...:
0
1
1
1
0
1
2
1
2
# randint returns an integer in [-2, 4)
In [132]: for i in range(9):
     ...:     print(np.random.randint(-2, 4))
     ...:
-1
-1
-2
3
3
2
-2
-1
1
In [12]: np.random.randint(1, 3, (3, 4)) # the third argument sets the shape
Out[12]:
array([[2, 1, 1, 1],
[1, 1, 1, 2],
[1, 1, 1, 2]])
In [18]: np.random.randint(0, 3, 5)
Out[18]: array([2, 1, 2, 1, 0])
# rand returns values in [0, 1) with the given shape
In [19]: np.random.rand()
Out[19]: 0.7651785443614785
In [20]: np.random.rand(2)
Out[20]: array([0.10671066, 0.79280396])
In [21]: np.random.rand(2, 3)
Out[21]:
array([[0.47940471, 0.11098761, 0.85766535],
[0.33916207, 0.3491621 , 0.33013207]])
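# Normal distribution
# The normal distribution, also called the Gaussian distribution, is a continuous probability distribution with two parameters: mean and standard deviation
# Its density curve is symmetric about the mean, and the area under the curve is 1
# normal creates an array of samples drawn from a normal distribution; randn below is a special case of it
# A one-dimensional array of 15 samples with mean 1 and standard deviation 2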
In [134]: np.random.normal(1, 2, 15)
Out[134]:
array([ 0.80615746, -0.26899861, -0.74888346, 4.53490444, 1.37697208,
-0.55673608, 0.40559978, 0.7658986 , 0.24568999, -0.2934424 ,
2.26390635, -1.27112171, 3.3586831 , 1.41351671, 0.17504285])
# A two-dimensional array with 3 rows and 5 columns, mean 1 and standard deviation 2
In [135]: np.random.normal(1, 2, (3, 5))
Out[135]:
array([[-2.27388178, 1.96749003, 1.15758907, 1.6039605 , 3.43645324],
[ 2.20140634, -1.0816135 , 1.73925819, 3.18562444, 1.31227349],
[-1.20275991, 1.14247008, -2.03774236, -1.25580262, 1.68002098]])
# Standard normal distribution
# The standard normal distribution is the special case with mean 0 and standard deviation 1, written N(0, 1)
# randn returns samples drawn from the standard normal distribution
In [136]: np.random.randn(2, 3)
Out[136]:
array([[ 2.28233846, 0.88622313, -0.21362108],
[ 0.16395558, 0.0607688 , 1.14653666]])
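The relationship between normal and randn can be checked empirically: scaling standard normal samples by the standard deviation and shifting them by the mean gives the same distribution as calling normal directly. A rough sketch, assuming a sample size of 100,000 is large enough for the statistics to settle; the seed value is arbitrary:

```python
import numpy as np

np.random.seed(0)  # arbitrary seed so the sketch is repeatable

direct = np.random.normal(1, 2, 100_000)      # mean 1, standard deviation 2
shifted = 1 + 2 * np.random.randn(100_000)    # same distribution built from N(0, 1)

print(direct.mean(), direct.std())    # close to 1 and 2
print(shifted.mean(), shifted.std())  # close to 1 and 2
```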
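# seed fixes the starting state of the random number generator
# After seeding, the whole sequence of subsequent draws is reproducible; calling seed again restarts or changes it
# Note the argument limit: Seed must be between 0 and 2**32 - 1
In [137]: np.random.seed(1) # every draw after this line is determined by the seed
In [138]: np.random.randn(2,3) # first draw after seeding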
Out[138]:
array([[ 1.62434536, -0.61175641, -0.52817175],
[-1.07296862, 0.86540763, -2.3015387 ]])
In [139]: np.random.randn(2,3) # second draw: different values, but still determined by the seed
Out[139]:
array([[ 1.74481176, -0.7612069 , 0.3190391 ],
[-0.24937038, 1.46210794, -2.06014071]])
In [140]: np.random.randn(2,3) # third draw, likewise
Out[140]:
array([[-0.3224172 , -0.38405435, 1.13376944],
[-1.09989127, -0.17242821, -0.87785842]])
In [141]: np.random.seed(1) # re-seed with the same value to restart the sequence
In [142]: np.random.randn(2,3) # same values as the first draw above
Out[142]:
array([[ 1.62434536, -0.61175641, -0.52817175],
[-1.07296862, 0.86540763, -2.3015387 ]])
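Seeding makes the whole sequence of draws repeatable, not just the first one. The sketch below checks this for two consecutive draws, and also shows the Generator interface available in NumPy 1.17 and later, which keeps the state in a local object rather than globally. Variable names are illustrative:

```python
import numpy as np

np.random.seed(1)
first = np.random.randn(2, 3)
second = np.random.randn(2, 3)

np.random.seed(1)                      # re-seeding restarts the same sequence
first_again = np.random.randn(2, 3)
second_again = np.random.randn(2, 3)

print(np.array_equal(first, first_again))    # True
print(np.array_equal(second, second_again))  # True

# Generator objects carry their own state (NumPy 1.17 and later).
rng_a = np.random.default_rng(1)
rng_b = np.random.default_rng(1)
print(np.array_equal(rng_a.standard_normal(3), rng_b.standard_normal(3)))  # True
```

A Generator produces a different stream from the legacy seed function even when given the same seed value, so the two styles should not be mixed when exact values matter.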
around and round handle decimals
The two functions behave identically, since one is implemented in terms of the other:
In [104]: a = np.random.rand(9) * 10
In [105]: a
Out[105]:
array([5.61645392, 6.08375342, 6.54287893, 9.11241186, 7.98373947,
3.99255352, 8.40064373, 0.52359344, 1.2900161 ])
In [106]: np.around(a) # round to the nearest integer (exact halves go to the nearest even value); the result is still float
Out[106]: array([6., 6., 7., 9., 8., 4., 8., 1., 1.])
In [107]: type(np.around(a)[1])
Out[107]: numpy.float64
In [108]: np.around(a).tolist() # convert to a list
Out[108]: [6.0, 6.0, 7.0, 9.0, 8.0, 4.0, 8.0, 1.0, 1.0]
# round keeps a given number of decimal places, set by the second argument; trailing zeros are not displayed
In [109]: np.round(a, 3)
Out[109]: array([5.616, 6.084, 6.543, 9.112, 7.984,
3.993, 8.401, 0.524, 1.29 ])
In [110]: np.around(3.33)
Out[110]: 3.0
In [111]: np.round(3.33, 3)
Out[111]: 3.33
In [112]: np.round(3.33333, 3)
Out[112]: 3.333
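One detail the session above does not exercise is how exact halves are rounded: NumPy rounds them to the nearest even value rather than always up. The sketch below shows this, along with a negative second argument, which rounds to the left of the decimal point:

```python
import numpy as np

halves = np.array([0.5, 1.5, 2.5, 3.5])
rounded = np.round(halves)          # exact halves go to the nearest even value
hundreds = np.round(1234.567, -2)   # negative decimals round to tens, hundreds, ...

print(rounded.tolist())   # [0.0, 2.0, 2.0, 4.0]
print(float(hundreds))    # 1200.0
```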
cumsum computes the cumulative sum and works on arrays of any dimension:
In [3]: a = np.random.rand(10)
In [4]: a
Out[4]:
array([0.7520204 , 0.2567527 , 0.9816239 , 0.33018876, 0.78744841,
0.54783023, 0.68264491, 0.49180885, 0.77609804, 0.69531601])
In [5]: a.cumsum() # element n is the sum of the first n elements
Out[5]:
array([0.7520204 , 1.00877309, 1.99039699, 2.32058575, 3.10803416,
3.6558644 , 4.3385093 , 4.83031816, 5.6064162 , 6.30173221])
In [6]: np.cumsum(a) # same as above
Out[6]:
array([0.7520204 , 1.00877309, 1.99039699, 2.32058575, 3.10803416,
3.6558644 , 4.3385093 , 4.83031816, 5.6064162 , 6.30173221])
In [9]: a = np.random.rand(3,4)
In [10]: a
Out[10]:
array([[0.4454985 , 0.56523565, 0.79940169, 0.65121062],
[0.17480848, 0.96010804, 0.95316188, 0.95932027],
[0.20944227, 0.66267168, 0.71068392, 0.31114727]])
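In [11]: a.cumsum() # for a multidimensional array, the default flattens it row by row (left to right, then top to bottom) and accumulates without grouping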
Out[11]:
array([0.4454985 , 1.01073415, 1.81013584, 2.46134646, 2.63615495,
3.59626299, 4.54942487, 5.50874514, 5.7181874 , 6.38085908,
7.091543 , 7.40269027])
In [12]: a.cumsum(axis=0) # accumulate down each column, top to bottom
Out[12]:
array([[0.4454985 , 0.56523565, 0.79940169, 0.65121062],
[0.62030699, 1.52534369, 1.75256357, 1.61053089],
[0.82974925, 2.18801537, 2.46324749, 1.92167816]])
In [13]: a.cumsum(axis=1) # accumulate along each row, left to right
Out[13]:
array([[0.4454985 , 1.01073415, 1.81013584, 2.46134646],
[0.17480848, 1.13491653, 2.08807841, 3.04739867],
[0.20944227, 0.87211395, 1.58279787, 1.89394514]])
In [14]: np.cumsum(a, axis=1) # same as above
Out[14]:
array([[0.4454985 , 1.01073415, 1.81013584, 2.46134646],
[0.17480848, 1.13491653, 2.08807841, 3.04739867],
[0.20944227, 0.87211395, 1.58279787, 1.89394514]])
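The axis behavior is easier to verify by hand with small integers than with random floats. A minimal sketch using a 2 by 3 array chosen for the example:

```python
import numpy as np

a = np.arange(1, 7).reshape(2, 3)   # [[1, 2, 3], [4, 5, 6]]

flat = a.cumsum()           # flattened row by row, then accumulated
down = a.cumsum(axis=0)     # running total down each column
across = a.cumsum(axis=1)   # running total along each row

print(flat.tolist())    # [1, 3, 6, 10, 15, 21]
print(down.tolist())    # [[1, 2, 3], [5, 7, 9]]
print(across.tolist())  # [[1, 3, 6], [4, 9, 15]]
```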
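linspace creates a one-dimensional array of evenly spaced values (an arithmetic sequence):
In [21]: np.linspace(1, 20) # the first two arguments are the start and stop values, both inclusive; 50 elements by default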
Out[21]:
array([ 1. , 1.3877551 , 1.7755102 , 2.16326531, 2.55102041,
2.93877551, 3.32653061, 3.71428571, 4.10204082, 4.48979592,
4.87755102, 5.26530612, 5.65306122, 6.04081633, 6.42857143,
6.81632653, 7.20408163, 7.59183673, 7.97959184, 8.36734694,
8.75510204, 9.14285714, 9.53061224, 9.91836735, 10.30612245,
10.69387755, 11.08163265, 11.46938776, 11.85714286, 12.24489796,
12.63265306, 13.02040816, 13.40816327, 13.79591837, 14.18367347,
14.57142857, 14.95918367, 15.34693878, 15.73469388, 16.12244898,
16.51020408, 16.89795918, 17.28571429, 17.67346939, 18.06122449,
18.44897959, 18.83673469, 19.2244898 , 19.6122449 , 20. ])
In [22]: np.linspace(1, 20, 20) # the third argument sets the number of elements
Out[22]:
array([ 1., 2., 3., 4., 5., 6., 7., 8., 9., 10., 11., 12., 13.,
14., 15., 16., 17., 18., 19., 20.])
In [23]: np.linspace(1, 20, 30)
Out[23]:
array([ 1. , 1.65517241, 2.31034483, 2.96551724, 3.62068966,
4.27586207, 4.93103448, 5.5862069 , 6.24137931, 6.89655172,
7.55172414, 8.20689655, 8.86206897, 9.51724138, 10.17241379,
10.82758621, 11.48275862, 12.13793103, 12.79310345, 13.44827586,
14.10344828, 14.75862069, 15.4137931 , 16.06896552, 16.72413793,
17.37931034, 18.03448276, 18.68965517, 19.34482759, 20. ])
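Two optional parameters of linspace complement the examples above: endpoint=False excludes the stop value, and retstep=True also returns the spacing between elements. A short sketch with illustrative names:

```python
import numpy as np

closed = np.linspace(0, 1, 5)                      # includes the stop value
half_open = np.linspace(0, 1, 5, endpoint=False)   # excludes the stop value
values, step = np.linspace(1, 20, 20, retstep=True)

print(closed)      # 0, 0.25, 0.5, 0.75, 1
print(half_open)   # 0, 0.2, 0.4, 0.6, 0.8
print(step)        # 1.0
```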
concatenate joins arrays:
In [4]: a = np.random.rand(5) * 100
In [5]: b = np.ones(3) * 10
In [6]: c = np.arange(1, 4) * -5
In [7]: a
Out[7]: array([41.59832138, 3.51696663, 4.36743242,
22.06651079, 47.40743768])
In [8]: b
Out[8]: array([10., 10., 10.])
In [9]: c
Out[9]: array([ -5, -10, -15])
# Join one-dimensional arrays; the default axis=0 can be omitted
In [10]: np.concatenate((a, b, c))
Out[10]:
array([ 41.59832138, 3.51696663, 4.36743242, 22.06651079,
47.40743768, 10. , 10. , 10. ,
-5. , -10. , -15. ])
In [11]: d = np.arange(1,7).reshape(2,3)
In [12]: e = np.arange(7, 13).reshape(2,3)
In [13]: d
Out[13]:
array([[1, 2, 3],
[4, 5, 6]])
In [14]: e
Out[14]:
array([[ 7, 8, 9],
[10, 11, 12]])
# Join two-dimensional arrays; the default axis=0 stacks them vertically
In [15]: np.concatenate((d, e))
Out[15]:
array([[ 1, 2, 3],
[ 4, 5, 6],
[ 7, 8, 9],
[10, 11, 12]])
# axis=1 joins them horizontally
In [16]: np.concatenate((d, e), axis=1)
Out[16]:
array([[ 1, 2, 3, 7, 8, 9],
[ 4, 5, 6, 10, 11, 12]])
In [18]: f
Out[18]:
array([[ 7, 8, 9],
[10, 11, 12]])
In [19]: g
Out[19]:
array([[13, 14],
[15, 16]])
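# f and g above have the same number of rows but different numbers of columns, so they can only be joined horizontally; joining vertically raises an error: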
In [20]: np.concatenate((f, g), axis=1)
Out[20]:
array([[ 7, 8, 9, 13, 14],
[10, 11, 12, 15, 16]])
In [21]: np.concatenate((f, g))
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-21-d793fa5e247b> in <module>()
----> 1 np.concatenate((f, g))
ValueError: all the input array dimensions except for the concatenation...
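The rule behind the error is that every dimension except the one being joined must match. The session does not show how f and g were created, so the definitions below are assumptions that merely reproduce the values printed above:

```python
import numpy as np

# Assumed definitions; they only reproduce the f and g values shown above.
f = np.arange(7, 13).reshape(2, 3)
g = np.arange(13, 17).reshape(2, 2)

# Row counts match (2 and 2), so joining along axis=1 works.
side_by_side = np.concatenate((f, g), axis=1)
print(side_by_side.shape)  # (2, 5)

# Column counts differ (3 and 2), so joining along axis=0 fails.
try:
    np.concatenate((f, g), axis=0)
    failed = False
except ValueError:
    failed = True
print(failed)  # True
```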
.T transposes an array:
In [113]: a = np.arange(12).reshape(3, 4)
In [114]: a
Out[114]:
array([[ 0, 1, 2, 3],
[ 4, 5, 6, 7],
[ 8, 9, 10, 11]])
In [115]: a.T # transpose: rows and columns are swapped
Out[115]:
array([[ 0, 4, 8],
[ 1, 5, 9],
[ 2, 6, 10],
[ 3, 7, 11]])
In [117]: a.reshape(4, 3) # reshape instead lays all the data out in one row, then regroups it
Out[117]:
array([[ 0, 1, 2],
[ 3, 4, 5],
[ 6, 7, 8],
[ 9, 10, 11]])
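Although a.T and a.reshape(4, 3) have the same shape, they arrange the elements differently: the transpose satisfies t[i, j] == a[j, i], while reshape keeps the row-by-row reading order. A small sketch comparing the two:

```python
import numpy as np

a = np.arange(12).reshape(3, 4)
t = a.T                 # element (i, j) of t is element (j, i) of a
r = a.reshape(4, 3)     # same reading order as a, regrouped into 4 rows

print(t.shape == r.shape)      # True
print(np.array_equal(t, r))    # False
print(t[1, 0], a[0, 1])        # 1 1
print(r[1, 0])                 # 3
print(np.shares_memory(a, t))  # True: the transpose is a view, not a copy
```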