NumPy用户指南 > 快速入门教程
Before reading this tutorial you should know a bit of Python. If you would like to refresh your memory, take a look at the Python tutorial.
If you wish to work the examples in this tutorial, you must also have some software installed on your computer. Please see https://scipy.org/install.html for instructions.
NumPy’s main object is the homogeneous multidimensional array. It is a table of elements (usually numbers), all of the same type, indexed by a tuple of non-negative integers. In NumPy dimensions are called axes.
For example, the coordinates of a point in 3D space [1, 2, 1]
has
one axis. That axis has 3 elements in it, so we say it has a length
of 3. In the example pictured below, the array has 2 axes. The first
axis has a length of 2, the second axis has a length of 3.
[[ 1., 0., 0.],
[ 0., 1., 2.]]
NumPy’s array class is called ndarray
. It is also known by the alias
array
. Note that numpy.array
is not the same as the Standard
Python Library class array.array
, which only handles one-dimensional
arrays and offers less functionality. The more important attributes of
an ndarray
object are:
the number of axes (dimensions) of the array.
the dimensions of the array. This is a tuple of integers indicating
the size of the array in each dimension. For a matrix with n rows
and m columns, shape
will be (n,m)
. The length of the
shape
tuple is therefore the number of axes, ndim
.
the total number of elements of the array. This is equal to the
product of the elements of shape
.
an object describing the type of the elements in the array. One can create or specify dtype’s using standard Python types. Additionally NumPy provides types of its own. numpy.int32, numpy.int16, and numpy.float64 are some examples.
数组中每个元素的大小(以字节为单位)。例如,类型为元素的数组float64
具有itemsize
8(= 64/8),而类型complex32
中的一个具有itemsize
4(= 32/8)。等同于ndarray.dtype.itemsize
。
包含数组实际元素的缓冲区。通常,我们不需要使用此属性,因为我们将使用索引工具访问数组中的元素。
>>> import numpy as np
>>> a = np.arange(15).reshape(3, 5)
>>> a
array([[ 0, 1, 2, 3, 4],
[ 5, 6, 7, 8, 9],
[10, 11, 12, 13, 14]])
>>> a.shape
(3, 5)
>>> a.ndim
2
>>> a.dtype.name
'int64'
>>> a.itemsize
8
>>> a.size
15
>>> type(a)
<type 'numpy.ndarray'>
>>> b = np.array([6, 7, 8])
>>> b
array([6, 7, 8])
>>> type(b)
<type 'numpy.ndarray'>
有几种创建数组的方法。
例如,您可以使用array
函数从常规Python列表或元组创建数组。根据序列中元素的类型推导所得数组的类型。
>>> import numpy as np
>>> a = np.array([2,3,4])
>>> a
array([2, 3, 4])
>>> a.dtype
dtype('int64')
>>> b = np.array([1.2, 3.5, 5.1])
>>> b.dtype
dtype('float64')
常见的错误在于调用array
多个数字参数,而不是提供一个数字列表作为参数。
>>> a = np.array(1,2,3,4) # WRONG
>>> a = np.array([1,2,3,4]) # RIGHT
array
将序列序列转换为二维数组,将序列序列转换为三维数组,依此类推。
>>> b = np.array([(1.5,2,3), (4,5,6)])
>>> b
array([[ 1.5, 2. , 3. ],
[ 4. , 5. , 6. ]])
数组的类型也可以在创建时明确指定:
>>> c = np.array( [ [1,2], [3,4] ], dtype=complex )
>>> c
array([[ 1.+0.j, 2.+0.j],
[ 3.+0.j, 4.+0.j]])
通常,数组的元素最初是未知的,但是其大小是已知的。因此,NumPy提供了几个函数来创建具有初始占位符内容的数组。这些将增长阵列的必要性降至最低,这是一项昂贵的操作。
该函数zeros
创建一个全零的数组,该函数ones
创建全零
的数组,该函数empty
创建其初始内容是随机的并取决于内存状态的数组。默认情况下,创建的数组的dtype为
float64
。
>>> np.zeros( (3,4) )
array([[ 0., 0., 0., 0.],
[ 0., 0., 0., 0.],
[ 0., 0., 0., 0.]])
>>> np.ones( (2,3,4), dtype=np.int16 ) # dtype can also be specified
array([[[ 1, 1, 1, 1],
[ 1, 1, 1, 1],
[ 1, 1, 1, 1]],
[[ 1, 1, 1, 1],
[ 1, 1, 1, 1],
[ 1, 1, 1, 1]]], dtype=int16)
>>> np.empty( (2,3) ) # uninitialized, output may vary
array([[ 3.73603959e-262, 6.02658058e-154, 6.55490914e-260],
[ 5.30498948e-313, 3.14673309e-307, 1.00000000e+000]])
为了创建数字序列,NumPy提供了类似于range
返回数组而不是列表的函数
。
>>> np.arange( 10, 30, 5 )
array([10, 15, 20, 25])
>>> np.arange( 0, 2, 0.3 ) # it accepts float arguments
array([ 0. , 0.3, 0.6, 0.9, 1.2, 1.5, 1.8])
当arange
用浮点参数使用时,它通常是不可能预测得到的元件的数量,由于有限浮点精度。因此,通常最好使用将linspace
所需元素数量而不是步骤数量作为参数的函数:
>>> from numpy import pi
>>> np.linspace( 0, 2, 9 ) # 9 numbers from 0 to 2
array([ 0. , 0.25, 0.5 , 0.75, 1. , 1.25, 1.5 , 1.75, 2. ])
>>> x = np.linspace( 0, 2*pi, 100 ) # useful to evaluate function at lots of points
>>> f = np.sin(x)
当您打印数组时,NumPy以类似于嵌套列表的方式显示它,但是具有以下布局:
最后一个轴从左到右打印,
倒数第二个从上到下打印,
其余的也从上到下打印,每个切片之间用空行隔开。
然后将一维数组打印为行,将二维打印为矩阵,将三维打印为矩阵列表。
>>> a = np.arange(6) # 1d array
>>> print(a)
[0 1 2 3 4 5]
>>>
>>> b = np.arange(12).reshape(4,3) # 2d array
>>> print(b)
[[ 0 1 2]
[ 3 4 5]
[ 6 7 8]
[ 9 10 11]]
>>>
>>> c = np.arange(24).reshape(2,3,4) # 3d array
>>> print(c)
[[[ 0 1 2 3]
[ 4 5 6 7]
[ 8 9 10 11]]
[[12 13 14 15]
[16 17 18 19]
[20 21 22 23]]]
请参阅下文以获取有关的更多详细信息reshape
。
如果数组太大而无法打印,则NumPy会自动跳过数组的中心部分,仅打印角点:
>>> print(np.arange(10000))
[ 0 1 2 ..., 9997 9998 9999]
>>>
>>> print(np.arange(10000).reshape(100,100))
[[ 0 1 2 ..., 97 98 99]
[ 100 101 102 ..., 197 198 199]
[ 200 201 202 ..., 297 298 299]
...,
[9700 9701 9702 ..., 9797 9798 9799]
[9800 9801 9802 ..., 9897 9898 9899]
[9900 9901 9902 ..., 9997 9998 9999]]
要禁用此行为并强制NumPy打印整个数组,可以使用更改打印选项set_printoptions
。
>>> np.set_printoptions(threshold=sys.maxsize) # sys module should be imported
数组上的算术运算符适用于elementwise。创建一个新数组,并用结果填充。
>>> a = np.array( [20,30,40,50] )
>>> b = np.arange( 4 )
>>> b
array([0, 1, 2, 3])
>>> c = a-b
>>> c
array([20, 29, 38, 47])
>>> b**2
array([0, 1, 4, 9])
>>> 10*np.sin(a)
array([ 9.12945251, -9.88031624, 7.4511316 , -2.62374854])
>>> a<35
array([ True, True, False, False])
与许多矩阵语言不同,乘积运算符*
在NumPy数组中按元素进行操作。可以使用@
运算符(在python> = 3.5中)或dot
函数或方法执行矩阵乘积:
>>> A = np.array( [[1,1],
... [0,1]] )
>>> B = np.array( [[2,0],
... [3,4]] )
>>> A * B # elementwise product
array([[2, 0],
[0, 4]])
>>> A @ B # matrix product
array([[5, 4],
[3, 4]])
>>> A.dot(B) # another matrix product
array([[5, 4],
[3, 4]])
某些操作(例如+=
和)*=
就位以修改现有数组,而不是创建一个新数组。
>>> a = np.ones((2,3), dtype=int)
>>> b = np.random.random((2,3))
>>> a *= 3
>>> a
array([[3, 3, 3],
[3, 3, 3]])
>>> b += a
>>> b
array([[ 3.417022 , 3.72032449, 3.00011437],
[ 3.30233257, 3.14675589, 3.09233859]])
>>> a += b # b is not automatically converted to integer type
Traceback (most recent call last):
...
TypeError: Cannot cast ufunc add output from dtype('float64') to dtype('int64') with casting rule 'same_kind'
当使用不同类型的数组进行操作时,结果数组的类型对应于更通用或更精确的数组(这种行为称为向上转换)。
>>> a = np.ones(3, dtype=np.int32)
>>> b = np.linspace(0,pi,3)
>>> b.dtype.name
'float64'
>>> c = a+b
>>> c
array([ 1. , 2.57079633, 4.14159265])
>>> c.dtype.name
'float64'
>>> d = np.exp(c*1j)
>>> d
array([ 0.54030231+0.84147098j, -0.84147098+0.54030231j,
-0.54030231-0.84147098j])
>>> d.dtype.name
'complex128'
Many unary operations, such as computing the sum of all the elements in
the array, are implemented as methods of the ndarray
class.
>>> a = np.random.random((2,3))
>>> a
array([[ 0.18626021, 0.34556073, 0.39676747],
[ 0.53881673, 0.41919451, 0.6852195 ]])
>>> a.sum()
2.5718191614547998
>>> a.min()
0.1862602113776709
>>> a.max()
0.6852195003967595
By default, these operations apply to the array as though it were a list
of numbers, regardless of its shape. However, by specifying the axis
parameter you can apply an operation along the specified axis of an
array:
>>> b = np.arange(12).reshape(3,4)
>>> b
array([[ 0, 1, 2, 3],
[ 4, 5, 6, 7],
[ 8, 9, 10, 11]])
>>>
>>> b.sum(axis=0) # sum of each column
array([12, 15, 18, 21])
>>>
>>> b.min(axis=1) # min of each row
array([0, 4, 8])
>>>
>>> b.cumsum(axis=1) # cumulative sum along each row
array([[ 0, 1, 3, 6],
[ 4, 9, 15, 22],
[ 8, 17, 27, 38]])
NumPy provides familiar mathematical functions such as sin, cos, and
exp. In NumPy, these are called “universal
functions”(ufunc
). Within NumPy, these functions
operate elementwise on an array, producing an array as output.
>>> B = np.arange(3)
>>> B
array([0, 1, 2])
>>> np.exp(B)
array([ 1. , 2.71828183, 7.3890561 ])
>>> np.sqrt(B)
array([ 0. , 1. , 1.41421356])
>>> C = np.array([2., -1., 4.])
>>> np.add(B, C)
array([ 2., 0., 6.])
See also
all
,
any
,
apply_along_axis
,
argmax
,
argmin
,
argsort
,
average
,
bincount
,
ceil
,
clip
,
conj
,
corrcoef
,
cov
,
cross
,
cumprod
,
cumsum
,
diff
,
dot
,
floor
,
inner
,
inv,
lexsort
,
max
,
maximum
,
mean
,
median
,
min
,
minimum
,
nonzero
,
outer
,
prod
,
re
,
round
,
sort
,
std
,
sum
,
trace
,
transpose
,
var
,
vdot
,
vectorize
,
where
One-dimensional arrays can be indexed, sliced and iterated over, much like lists and other Python sequences.
>>> a = np.arange(10)**3
>>> a
array([ 0, 1, 8, 27, 64, 125, 216, 343, 512, 729])
>>> a[2]
8
>>> a[2:5]
array([ 8, 27, 64])
>>> a[:6:2] = -1000 # equivalent to a[0:6:2] = -1000; from start to position 6, exclusive, set every 2nd element to -1000
>>> a
array([-1000, 1, -1000, 27, -1000, 125, 216, 343, 512, 729])
>>> a[ : :-1] # reversed a
array([ 729, 512, 343, 216, 125, -1000, 27, -1000, 1, -1000])
>>> for i in a:
... print(i**(1/3.))
...
nan
1.0
nan
3.0
nan
5.0
6.0
7.0
8.0
9.0
Multidimensional arrays can have one index per axis. These indices are given in a tuple separated by commas:
>>> def f(x,y):
... return 10*x+y
...
>>> b = np.fromfunction(f,(5,4),dtype=int)
>>> b
array([[ 0, 1, 2, 3],
[10, 11, 12, 13],
[20, 21, 22, 23],
[30, 31, 32, 33],
[40, 41, 42, 43]])
>>> b[2,3]
23
>>> b[0:5, 1] # each row in the second column of b
array([ 1, 11, 21, 31, 41])
>>> b[ : ,1] # equivalent to the previous example
array([ 1, 11, 21, 31, 41])
>>> b[1:3, : ] # each column in the second and third row of b
array([[10, 11, 12, 13],
[20, 21, 22, 23]])
When fewer indices are provided than the number of axes, the missing
indices are considered complete slices:
>>> b[-1] # the last row. Equivalent to b[-1,:]
array([40, 41, 42, 43])
The expression within brackets in b[i]
is treated as an i
followed by as many instances of :
as needed to represent the
remaining axes. NumPy also allows you to write this using dots as
b[i,...]
.
The dots (...
) represent as many colons as needed to produce a
complete indexing tuple. For example, if x
is an array with 5
axes, then
x[1,2,...]
is equivalent to x[1,2,:,:,:]
,
x[...,3]
to x[:,:,:,:,3]
and
x[4,...,5,:]
to x[4,:,:,5,:]
.
>>> c = np.array( [[[ 0, 1, 2], # a 3D array (two stacked 2D arrays)
... [ 10, 12, 13]],
... [[100,101,102],
... [110,112,113]]])
>>> c.shape
(2, 2, 3)
>>> c[1,...] # same as c[1,:,:] or c[1]
array([[100, 101, 102],
[110, 112, 113]])
>>> c[...,2] # same as c[:,:,2]
array([[ 2, 13],
[102, 113]])
Iterating over multidimensional arrays is done with respect to the first axis:
>>> for row in b:
... print(row)
...
[0 1 2 3]
[10 11 12 13]
[20 21 22 23]
[30 31 32 33]
[40 41 42 43]
However, if one wants to perform an operation on each element in the
array, one can use the flat
attribute which is an
iterator
over all the elements of the array:
>>> for element in b.flat:
... print(element)
...
0
1
2
3
10
11
12
13
20
21
22
23
30
31
32
33
40
41
42
43
See also
Indexing,
Indexing (reference),
newaxis
,
ndenumerate
,
indices
An array has a shape given by the number of elements along each axis:
>>> a = np.floor(10*np.random.random((3,4)))
>>> a
array([[ 2., 8., 0., 6.],
[ 4., 5., 1., 1.],
[ 8., 9., 3., 6.]])
>>> a.shape
(3, 4)
The shape of an array can be changed with various commands. Note that the following three commands all return a modified array, but do not change the original array:
>>> a.ravel() # returns the array, flattened
array([ 2., 8., 0., 6., 4., 5., 1., 1., 8., 9., 3., 6.])
>>> a.reshape(6,2) # returns the array with a modified shape
array([[ 2., 8.],
[ 0., 6.],
[ 4., 5.],
[ 1., 1.],
[ 8., 9.],
[ 3., 6.]])
>>> a.T # returns the array, transposed
array([[ 2., 4., 8.],
[ 8., 5., 9.],
[ 0., 1., 3.],
[ 6., 1., 6.]])
>>> a.T.shape
(4, 3)
>>>