Official help documentation

>>> help(numpy.random.choice)
Help on built-in function choice:

choice(...) method of mtrand.RandomState instance
    choice(a, size=None, replace=True, p=None)

    Generates a random sample from a given 1-D array

    .. versionadded:: 1.7.0

    Parameters
    -----------
    a : 1-D array-like or int
        If an ndarray, a random sample is generated from its elements.
        If an int, the random sample is generated as if a were np.arange(a)
    size : int or tuple of ints, optional
        Output shape. If the given shape is, e.g., ``(m, n, k)``, then
        ``m * n * k`` samples are drawn. Default is None, in which case a
        single value is returned.
    replace : boolean, optional
        Whether the sample is with or without replacement
    p : 1-D array-like, optional
        The probabilities associated with each entry in a.
        If not given the sample assumes a uniform distribution over all
        entries in a.

    Returns
    --------
    samples : single item or ndarray
        The generated random samples

    Raises
    -------
    ValueError
        If a is an int and less than zero, if a or p are not 1-dimensional,
        if a is an array-like of size 0, if p is not a vector of
        probabilities, if a and p have different lengths, or if
        replace=False and the sample size is greater than the population
        size

Personal understanding

Function signature

choice(a, size=None, replace=True, p=None)

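Parameter a: the population to sample from. It accepts either a 1-D array-like value (a 1-D list, a tuple, or a numpy ndarray) or an integer. If a 1-D array is passed, samples are drawn at random from its elements; if an integer n is passed, samples are drawn at random from numpy.arange(n), that is, array([0, 1, ..., n-1]).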
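
The two forms of a can be compared side by side. The following is a minimal sketch; the list `fruits` is hypothetical sample data introduced only for illustration.

```python
import numpy as np

# Hypothetical sample data, used only for illustration.
fruits = ["apple", "banana", "cherry"]

# Array-like a: every draw is one of the elements of the sequence.
picked = np.random.choice(fruits, size=5)

# Integer a: every draw comes from np.arange(4), i.e. 0, 1, 2, 3.
indices = np.random.choice(4, size=5)

print(picked)   # five fruit names, possibly repeated
print(indices)  # five integers between 0 and 3
```

The integer form is convenient when the result will be used as indices into another array.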

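Parameter size: the number of samples, or more precisely the output shape. It can be an integer, a tuple of integers, or omitted. An integer n draws n samples from a; a tuple (n, m, k) draws n*m*k samples and returns them as an ndarray of shape (n, m, k); if size is omitted, a single sample is drawn and returned as a single value rather than an array.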

```python
import numpy as np

# With no size argument, only one sample is drawn by default:
np.random.choice(8)

# With a tuple as size:
np.random.choice(8, (2, 2))
# array([[6, 1],
#        [2, 4]])

# With an integer as size:
np.random.choice(8, 4)
# array([7, 1, 1, 7])
```
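
Because the drawn values are random, it can be easier to check the shape of the result than the values themselves. A small sketch, with variable names chosen here for illustration:

```python
import numpy as np

single = np.random.choice(8)                # no size: a single value
flat = np.random.choice(8, size=4)          # int size: 1-D array of length 4
cube = np.random.choice(8, size=(2, 3, 4))  # tuple size: 2 * 3 * 4 = 24 samples

print(np.ndim(single))  # 0
print(flat.shape)       # (4,)
print(cube.shape)       # (2, 3, 4)
print(cube.size)        # 24
```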

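Parameter p: the sampling probabilities. The default is None, which means every element is equally likely to be drawn (uniform sampling). Otherwise, pass a 1-D array-like value with the same length as a whose elements sum to 1.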

```python
import numpy as np

# Without p, sampling is uniform:
np.random.choice(8, 5)
# array([5, 6, 1, 1, 0])

# With p, samples follow the given probabilities. Here the fifth element
# (the value 4) has the highest probability, so it is drawn most often:
np.random.choice(8, 4, p=[0, 0.1, 0.2, 0.3, 0.4, 0, 0, 0])
# array([4, 4, 4, 4], dtype=int64)

# If the elements of p do not sum to 1, an error is raised:
np.random.choice(8, 4, p=[0, 0.1, 0.2, 0.3, 0.3, 0, 0, 0])
# Traceback (most recent call last):
#   File "<stdin>", line 1, in <module>
#   File "mtrand.pyx", line 1148, in mtrand.RandomState.choice
# ValueError: probabilities do not sum to 1
```
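
Four draws are too few to see the effect of p clearly. The sketch below draws many samples and compares the observed frequencies with p; it also shows one way to turn raw weights into a valid p by dividing by their sum. The `weights` array, the seed, and the sample count are assumptions made for illustration.

```python
import numpy as np

np.random.seed(0)  # fixed seed so the run is repeatable

# Hypothetical raw weights that do not sum to 1.
weights = np.array([0, 1, 2, 3, 4, 0, 0, 0], dtype=float)
p = weights / weights.sum()  # normalized so the entries sum to 1

draws = np.random.choice(8, size=100_000, p=p)

# Fraction of draws that landed on each value 0..7.
freq = np.bincount(draws, minlength=8) / draws.size
print(np.round(freq, 2))  # values close to p
```

Entries with probability 0 never appear in the output, and the remaining frequencies approach p as the number of draws grows.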

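Parameter replace: whether sampling is done with replacement. The default, True, allows the same element to be drawn more than once (like putting a ball back into the box after drawing it). With False, each element can be drawn at most once (the ball is not put back), so size cannot exceed the length of a. If p is also passed, the number of non-zero entries in p must be greater than or equal to size, because an element with probability 0 is never drawn.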

```python
import numpy as np

# With replace=False, the drawn samples never repeat:
np.random.choice(8, 5, replace=False)
# array([7, 3, 2, 6, 5])

# With replace=True, or with replace omitted (True is the default), samples can repeat:
np.random.choice(8, 5)
# array([0, 6, 7, 3, 6])

# With replace=False, size must not exceed the size of a, otherwise an error is raised:
np.random.choice(8, 9, replace=False)
# Traceback (most recent call last):
#   File "<stdin>", line 1, in <module>
#   File "mtrand.pyx", line 1168, in mtrand.RandomState.choice
# ValueError: Cannot take a larger sample than population when 'replace=False'

# With replace left at its default, size can be any value:
np.random.choice(8, 10)
# array([5, 1, 1, 3, 4, 0, 1, 3, 4, 2])

# Likewise, with replace=False the number of non-zero entries in p must be
# at least size, otherwise an error is raised:
np.random.choice(4, 3, replace=False, p=[0, 0.8, 0.2, 0])
# ValueError: Fewer non-zero entries in p than size
```
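
The interaction between replace=False and p can also be checked without stopping the script on the error. This is a minimal sketch; the probability vector is an assumption chosen so that exactly three entries are non-zero.

```python
import numpy as np

# Three non-zero entries in p, so up to three samples can be drawn
# without replacement.
p = [0.5, 0.3, 0.2, 0]
sample = np.random.choice(4, size=3, replace=False, p=p)
print(sample)  # some ordering of 0, 1, 2

# Asking for four samples exceeds the number of non-zero entries in p.
try:
    np.random.choice(4, size=4, replace=False, p=p)
except ValueError as err:
    print("ValueError:", err)
```

The value 3 can never be drawn here because its probability is 0, which is why only three distinct samples are available.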

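Summary

Try the function out with different arguments yourself; experimenting is the fastest way to learn the parameters and understand them thoroughly.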

2020-05-17 Usage of numpy.random.choice()
