Notes on Common NumPy Operations

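The environment is Python 3.7 with Anaconda. In these notes np refers to numpy (import numpy as np); snippets that spell out numpy assume a plain import numpy.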

Reading data from a file into NumPy

# Split each line on commas; the result is a 2-D array (matrix)
world_alcohol = numpy.genfromtxt("world_alcohol.txt", delimiter=",")
print(world_alcohol)
result:
[[      nan       nan       nan       nan       nan]
 [1.986e+03       nan       nan       nan 0.000e+00]
 [1.986e+03       nan       nan       nan 5.000e-01]
 ...
 [1.987e+03       nan       nan       nan 7.500e-01]
 [1.989e+03       nan       nan       nan 1.500e+00]
 [1.985e+03       nan       nan       nan 3.100e-01]]

Creating an array from a list or a list of lists

#The numpy.array() function can take a list or list of lists as input. When we input a list, we get a one-dimensional array as a result:
vector = numpy.array([5, 10, 15, 20])
#When we input a list of lists, we get a matrix as a result:
matrix = numpy.array([[5, 10, 15], [20, 25, 30], [35, 40, 45]])
print(vector)
print(matrix)

result:
[ 5 10 15 20]
[[ 5 10 15]
 [20 25 30]
 [35 40 45]]

Checking the shape of an array or matrix

#We can use the ndarray.shape property to figure out how many elements are in the array
vector = numpy.array([1, 2, 3, 4])
print(vector.shape)
#For matrices, the shape property contains a tuple with 2 elements.
matrix = numpy.array([[5, 10, 15], [20, 25, 30]])
print(matrix.shape)

result:
(4,)
(2, 3)

Checking the element type of an array

#Each value in a NumPy array has to have the same data type
#NumPy will automatically figure out an appropriate data type when reading in data or converting lists to arrays. 
#You can check the data type of a NumPy array using the dtype property.
numbers = numpy.array([1, 2, 3, 4])
numbers.dtype

result:
dtype('int32')

Reading data from a file, skipping the first line and specifying the element type

world_alcohol = numpy.genfromtxt("world_alcohol.txt", delimiter=",", dtype="U75", skip_header=1)
print(world_alcohol)
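
The nan values in the first read appear because genfromtxt parses every field as a float by default, so text fields and the header row cannot be converted. Here is a minimal sketch of the difference, using a small in-memory sample in place of world_alcohol.txt (the sample rows and column names are invented for illustration and are not the real file contents):

```python
import io
import numpy as np

# Hypothetical stand-in for world_alcohol.txt; these rows are made up for the example.
sample = (
    "Year,Region,Country,Beverage,Liters\n"
    "1986,Americas,Uruguay,Other,0.5\n"
    "1985,Africa,Chad,Beer,0.31\n"
)

# Default dtype is float: text fields (and the whole header row) become nan.
as_float = np.genfromtxt(io.StringIO(sample), delimiter=",")

# Read every field as a string and skip the header line.
as_text = np.genfromtxt(io.StringIO(sample), delimiter=",", dtype="U75", skip_header=1)

print(as_float)
print(as_text)
```

Reading as strings keeps every column intact; numeric columns can be converted afterwards with astype.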

Accessing an element of a 2-D array

uruguay_other_1986 = world_alcohol[1,4]
third_country = world_alcohol[2,2]
print(uruguay_other_1986)
print(third_country)

Slicing an array (the start index is included, the end index is excluded)

vector = numpy.array([5, 10, 15, 20])
print(vector[0:3]) 

result:
[ 5 10 15]

Slicing a matrix

matrix = numpy.array([
                    [5, 10, 15], 
                    [20, 25, 30],
                    [35, 40, 45]
                 ])
# The colon selects every row; this takes column 1 of all rows
print(matrix[:,1])
# Columns 0 and 1 of all rows
print(matrix[:,0:2])
# Rows 1 and 2, columns 0 and 1
print(matrix[1:3,0:2])

Comparing an array with a scalar to get a boolean array

# The scalar on the right is compared with each element of the vector
# Where the values are equal the result is True; otherwise it is False
vector = numpy.array([5, 10, 15, 20])
vector == 10

result:
array([False,  True, False, False])

matrix = numpy.array([
                    [5, 10, 15], 
                    [20, 25, 30],
                    [35, 40, 45]
                 ])
matrix == 25

result:
array([[False, False, False],
       [False,  True, False],
       [False, False, False]])

Selecting elements with a boolean array

#Compares vector to the value 10, which generates a new Boolean vector [False, True, False, False]. It assigns this result to equal_to_ten
vector = numpy.array([5, 10, 15, 20])
equal_to_ten = (vector == 10)
print(equal_to_ten)
print(vector[equal_to_ten])

result:
[False  True False False]
[10]

matrix = numpy.array([
                [5, 10, 15], 
                [20, 25, 30],
                [35, 40, 45]
             ])
second_column_25 = (matrix[:,1] == 25)
print(second_column_25)
print(matrix[second_column_25, :])

result:
[False  True False]
[[20 25 30]]
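
Boolean arrays can also be combined with each other, or used on the left side of an assignment. A short sketch reusing the vector from above (the names between and either are made up for this example):

```python
import numpy as np

vector = np.array([5, 10, 15, 20])

# Combine element-wise conditions with & (and) and | (or); the parentheses are required.
between = (vector > 5) & (vector < 20)
either = (vector == 5) | (vector == 20)
print(vector[between])  # [10 15]
print(vector[either])   # [ 5 20]

# A mask on the left-hand side assigns only to the selected elements.
vector[either] = 0
print(vector)           # [ 0 10 15  0]
```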

Changing the element type of an array

#We can convert the data type of an array with the ndarray.astype() method.
vector = numpy.array(["1", "2", "3"])
print(vector.dtype)
print(vector)
vector = vector.astype(float)
print(vector.dtype)
print(vector)

result:
<U1
['1' '2' '3']
float64
[1. 2. 3.]

Summing all elements of an array

vector = numpy.array([5, 10, 15, 20])
vector.sum()

result:
50

Summing each row of a 2-D array

# The axis dictates which dimension we perform the operation on
#1 means that we want to perform the operation on each row, and 0 means on each column
matrix = numpy.array([
                [5, 10, 15], 
                [20, 25, 30],
                [35, 40, 45]
             ])
matrix.sum(axis=1)

result:
array([ 30,  75, 120])

Summing each column of a 2-D array

matrix = numpy.array([
                [5, 10, 15], 
                [20, 25, 30],
                [35, 40, 45]
             ])
matrix.sum(axis=0)

result:
array([60, 75, 90])
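
One way to remember which axis is which: summing over an axis removes that axis from the shape. A small sketch with a non-square matrix, so the two results have different lengths:

```python
import numpy as np

matrix = np.array([[5, 10, 15],
                   [20, 25, 30]])   # shape (2, 3)

# Summing over an axis collapses it and leaves the other one.
col_sums = matrix.sum(axis=0)   # collapses the 2 rows    -> shape (3,)
row_sums = matrix.sum(axis=1)   # collapses the 3 columns -> shape (2,)

print(col_sums, col_sums.shape)
print(row_sums, row_sums.shape)
```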

Reshaping an array

a = np.arange(15).reshape(3, 5)
print(a)

result:
[[ 0  1  2  3  4]
 [ 5  6  7  8  9]
 [10 11 12 13 14]]
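
Two details of reshape that are easy to trip over, shown as a sketch: one dimension may be given as -1 so that NumPy infers it, and the total number of elements has to match (the flag reshape_failed is only there for illustration):

```python
import numpy as np

a = np.arange(15)

# -1 asks NumPy to infer that dimension from the total size (15 / 3 = 5).
b = a.reshape(3, -1)
print(b.shape)

# The element count must match: 15 elements cannot fill a 4 x 4 array.
try:
    a.reshape(4, 4)
    reshape_failed = False
except ValueError as err:
    reshape_failed = True
    print("cannot reshape:", err)
```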

Checking the number of dimensions of an array

#the number of axes (dimensions) of the array

a = np.arange(15).reshape(3, 5)

print(a.ndim)

result:
2

Checking the number of elements in an array

#the total number of elements of the array

a = np.arange(15).reshape(3, 5)

print(a.size)

result:
15

Creating a matrix of zeros; the argument is a tuple of (rows, columns)

np.zeros((3, 4))

result:
array([[ 0.,  0.,  0.,  0.],
       [ 0.,  0.,  0.,  0.],
       [ 0.,  0.,  0.,  0.]])

Creating an array of ones with a specified element type

np.ones( (2,3,4), dtype=np.int32 )

result:
array([[[1, 1, 1, 1],
        [1, 1, 1, 1],
        [1, 1, 1, 1]],

       [[1, 1, 1, 1],
        [1, 1, 1, 1],
        [1, 1, 1, 1]]])

Creating an evenly stepped integer array from start, end, and step; start is included, end is excluded

#To create sequences of numbers
np.arange( 10, 30, 5 )

result:
array([10, 15, 20, 25])

Creating a matrix of random values

np.random.random((2,3))

result:
array([[ 0.40130659,  0.45452825,  0.79776512],
       [ 0.63220592,  0.74591134,  0.64130737]])

Creating an array of evenly spaced values between start and end (both included); the first argument is start, the second is end, and the third is the total number of elements

np.linspace( 0, 2*np.pi, 100 )

result:

array([0.        , 0.06346652, 0.12693304, 0.19039955, 0.25386607,
       0.31733259, 0.38079911, 0.44426563, 0.50773215, 0.57119866,
       0.63466518, 0.6981317 , 0.76159822, 0.82506474, 0.88853126,
       0.95199777, 1.01546429, 1.07893081, 1.14239733, 1.20586385,
       1.26933037, 1.33279688, 1.3962634 , 1.45972992, 1.52319644,
       1.58666296, 1.65012947, 1.71359599, 1.77706251, 1.84052903,
       1.90399555, 1.96746207, 2.03092858, 2.0943951 , 2.15786162,
       2.22132814, 2.28479466, 2.34826118, 2.41172769, 2.47519421,
       2.53866073, 2.60212725, 2.66559377, 2.72906028, 2.7925268 ,
       2.85599332, 2.91945984, 2.98292636, 3.04639288, 3.10985939,
       3.17332591, 3.23679243, 3.30025895, 3.36372547, 3.42719199,
       3.4906585 , 3.55412502, 3.61759154, 3.68105806, 3.74452458,
       3.8079911 , 3.87145761, 3.93492413, 3.99839065, 4.06185717,
       4.12532369, 4.1887902 , 4.25225672, 4.31572324, 4.37918976,
       4.44265628, 4.5061228 , 4.56958931, 4.63305583, 4.69652235,
       4.75998887, 4.82345539, 4.88692191, 4.95038842, 5.01385494,
       5.07732146, 5.14078798, 5.2042545 , 5.26772102, 5.33118753,
       5.39465405, 5.45812057, 5.52158709, 5.58505361, 5.64852012,
       5.71198664, 5.77545316, 5.83891968, 5.9023862 , 5.96585272,
       6.02931923, 6.09278575, 6.15625227, 6.21971879, 6.28318531])
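
linspace and arange are easy to confuse. A small sketch of the difference, with values chosen to be short:

```python
import numpy as np

# linspace takes the number of points and includes the end value.
points = np.linspace(0, 1, 5)
print(points)   # [0.   0.25 0.5  0.75 1.  ]

# arange takes the step size and excludes the end value.
steps = np.arange(0, 1, 0.25)
print(steps)    # [0.   0.25 0.5  0.75]
```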

Taking the sine of an array; the result is an array holding the sine of each element

np.sin(np.linspace( 0, 2*np.pi, 100 ))

result:
array([ 0.00000000e+00,  6.34239197e-02,  1.26592454e-01,  1.89251244e-01,
        2.51147987e-01,  3.12033446e-01,  3.71662456e-01,  4.29794912e-01,
        4.86196736e-01,  5.40640817e-01,  5.92907929e-01,  6.42787610e-01,
        6.90079011e-01,  7.34591709e-01,  7.76146464e-01,  8.14575952e-01,
        8.49725430e-01,  8.81453363e-01,  9.09631995e-01,  9.34147860e-01,
        9.54902241e-01,  9.71811568e-01,  9.84807753e-01,  9.93838464e-01,
        9.98867339e-01,  9.99874128e-01,  9.96854776e-01,  9.89821442e-01,
        9.78802446e-01,  9.63842159e-01,  9.45000819e-01,  9.22354294e-01,
        8.95993774e-01,  8.66025404e-01,  8.32569855e-01,  7.95761841e-01,
        7.55749574e-01,  7.12694171e-01,  6.66769001e-01,  6.18158986e-01,
        5.67059864e-01,  5.13677392e-01,  4.58226522e-01,  4.00930535e-01,
        3.42020143e-01,  2.81732557e-01,  2.20310533e-01,  1.58001396e-01,
        9.50560433e-02,  3.17279335e-02, -3.17279335e-02, -9.50560433e-02,
       -1.58001396e-01, -2.20310533e-01, -2.81732557e-01, -3.42020143e-01,
       -4.00930535e-01, -4.58226522e-01, -5.13677392e-01, -5.67059864e-01,
       -6.18158986e-01, -6.66769001e-01, -7.12694171e-01, -7.55749574e-01,
       -7.95761841e-01, -8.32569855e-01, -8.66025404e-01, -8.95993774e-01,
       -9.22354294e-01, -9.45000819e-01, -9.63842159e-01, -9.78802446e-01,
       -9.89821442e-01, -9.96854776e-01, -9.99874128e-01, -9.98867339e-01,
       -9.93838464e-01, -9.84807753e-01, -9.71811568e-01, -9.54902241e-01,
       -9.34147860e-01, -9.09631995e-01, -8.81453363e-01, -8.49725430e-01,
       -8.14575952e-01, -7.76146464e-01, -7.34591709e-01, -6.90079011e-01,
       -6.42787610e-01, -5.92907929e-01, -5.40640817e-01, -4.86196736e-01,
       -4.29794912e-01, -3.71662456e-01, -3.12033446e-01, -2.51147987e-01,
       -1.89251244e-01, -1.26592454e-01, -6.34239197e-02, -2.44929360e-16])

Applying +, -, *, / between two arrays operates on corresponding elements

#the product operator * operates elementwise in NumPy arrays
a = np.array( [20,30,40,50] )
b = np.arange( 1,5 )
print(a)
print(b)
print("-----------------")
print(a+b)
print(a-b)
print(a*b)
print(a/b)
print(b**2)

result:
[20 30 40 50]
[1 2 3 4]
-----------------
[21 32 43 54]
[19 28 37 46]
[ 20  60 120 200]
[20.         15.         13.33333333 12.5       ]
[ 1  4  9 16]

Matrix multiplication: either A.dot(B) or numpy.dot(A, B) works

#The matrix product can be performed using the dot function or method
A = np.array( [[1,1],
               [0,1]] )
B = np.array( [[2,0],
               [3,4]] )
print(A)
print(B)

print(A.dot(B))
print(np.dot(A, B))

result:
[[1 1]
 [0 1]]
[[2 0]
 [3 4]]
[[5 4]
 [3 4]]
[[5 4]
 [3 4]]
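
To make the contrast with the element-wise * explicit, here is a sketch using the same A and B. The @ operator (available since Python 3.5) is another spelling of the matrix product for 2-D arrays:

```python
import numpy as np

A = np.array([[1, 1],
              [0, 1]])
B = np.array([[2, 0],
              [3, 4]])

elementwise = A * B      # multiplies matching positions
matrix_product = A @ B   # same result as A.dot(B) for 2-D arrays

print(elementwise)
print(matrix_product)
```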

Taking the exponential and square root of an array

B = np.arange(3)
print(B)
print(np.exp(B))
print(np.sqrt(B))

result:
[0 1 2]
[1.         2.71828183 7.3890561 ]
[0.         1.         1.41421356]

Flattening a matrix to one dimension

#Return the floor of the input
a = np.floor(10*np.random.random((3,4)))
print(a)

print(a.ravel())

result:
[[6. 5. 1. 5.]
 [3. 9. 3. 4.]
 [4. 1. 3. 2.]]
[6. 5. 1. 5. 3. 9. 3. 4. 4. 1. 3. 2.]

Transposing a matrix

a = np.floor(10*np.random.random((3,4)))
print(a)
print(a.T)

result:
[[0. 9. 7. 6.]
 [7. 3. 7. 5.]
 [2. 0. 7. 0.]]
[[0. 7. 2.]
 [9. 3. 0.]
 [7. 7. 7.]
 [6. 5. 0.]]

Joining two matrices: hstack places them side by side (horizontally), vstack stacks one on top of the other (vertically)

a = np.floor(10*np.random.random((2,2)))
b = np.floor(10*np.random.random((2,2)))
print(a)
print('---')
print(b)
print('---')
print(np.hstack((a,b)))

result:
[[5. 4.]
 [8. 5.]]
---
[[4. 9.]
 [3. 0.]]
---
[[5. 4. 4. 9.]
 [8. 5. 3. 0.]]

a = np.floor(10*np.random.random((2,2)))
b = np.floor(10*np.random.random((2,2)))
print(a)
print('---')
print(b)
print('---')
print(np.vstack((a,b)))

result:
[[4. 9.]
 [1. 0.]]
---
[[5. 3.]
 [8. 7.]]
---
[[4. 9.]
 [1. 0.]
 [5. 3.]
 [8. 7.]]
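
With fixed values instead of random ones, the effect on the shape is easier to see. This sketch also checks that, for 2-D inputs, the two functions match np.concatenate along axis 1 and axis 0:

```python
import numpy as np

a = np.array([[1, 2],
              [3, 4]])
b = np.array([[5, 6],
              [7, 8]])

side_by_side = np.hstack((a, b))   # shape (2, 4): columns are appended
stacked = np.vstack((a, b))        # shape (4, 2): rows are appended

print(side_by_side.shape, stacked.shape)
print(np.array_equal(side_by_side, np.concatenate((a, b), axis=1)))
print(np.array_equal(stacked, np.concatenate((a, b), axis=0)))
```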

Splitting a matrix into several: hsplit cuts along columns, vsplit cuts along rows

a = np.floor(10*np.random.random((2,12)))
print(a)
print(np.hsplit(a,3))  # Split a into 3 equal matrices
print(np.hsplit(a,(3,4)))  # Split a after the third and the fourth column

result:
[[6. 6. 0. 5. 1. 5. 4. 1. 7. 7. 2. 7.]
 [8. 1. 5. 7. 0. 9. 5. 7. 3. 2. 2. 8.]]
[array([[6., 6., 0., 5.],
       [8., 1., 5., 7.]]),
 array([[1., 5., 4., 1.],
       [0., 9., 5., 7.]]), 
array([[7., 7., 2., 7.],
       [3., 2., 2., 8.]])]
[array([[6., 6., 0.],
       [8., 1., 5.]]), 
array([[5.],
       [7.]]), 
array([[1., 5., 4., 1., 7., 7., 2., 7.],
       [0., 9., 5., 7., 3., 2., 2., 8.]])]

a = np.floor(10*np.random.random((12,2)))
print(a)
print(np.vsplit(a,3))

result:
[[9. 7.]
 [1. 8.]
 [6. 1.]
 [4. 7.]
 [4. 3.]
 [1. 8.]
 [4. 3.]
 [0. 7.]
 [4. 0.]
 [5. 2.]
 [6. 7.]
 [3. 7.]]
[
array([[9., 7.],
        [1., 8.],
        [6., 1.],
        [4., 7.]]), 
array([[4., 3.],
        [1., 8.],
        [4., 3.],
        [0., 7.]]),
 array([[4., 0.],
        [5., 2.],
        [6., 7.],
        [3., 7.]])
]

Several ways to copy an array a

#Simple assignments make no copy of array objects or of their data.
a = np.arange(12)
b = a
# a and b are two names for the same ndarray object
b is a
b.shape = 3,4
print(a.shape)
# Reshaping b also reshaped a: they are the same array, so changing one changes the other

result:
(3, 4)

#The view method creates a new array object that looks at the same data.
a = np.arange(12)
c = a.view()
c.shape = 2,6
print(a.shape)
c[0,4] = 1234
print(a)
# Reshaping c did not affect a, but changing a value in c does change a


result:
(12,)
[   0    1    2    3 1234    5    6    7    8    9   10   11]

#The copy method makes a complete copy of the array and its data.
a = np.arange(12)
d = a.copy() 
d[0] = 9999
print(d)
print(a)
# This is a full copy: d starts with the same values and shape as a, and neither affects the other


result:
[9999    1    2    3    4    5    6    7    8    9   10   11]
[ 0  1  2  3  4  5  6  7  8  9 10 11]
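
The three cases can also be told apart without modifying anything. A sketch using the is operator and np.shares_memory (the names alias, view and full_copy are made up for this example):

```python
import numpy as np

a = np.arange(12)
alias = a              # plain assignment: same object
view = a.view()        # new array object, same underlying data
full_copy = a.copy()   # new array object, new data

print(alias is a, np.shares_memory(alias, a))           # True True
print(view is a, np.shares_memory(view, a))             # False True
print(full_copy is a, np.shares_memory(full_copy, a))   # False False
```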

Finding the index of the maximum along each row or column of a 2-D array

import numpy as np
data = np.sin(np.arange(20)).reshape(5,4)
print(data)
ind = data.argmax(axis=0)  # axis=0: one index per column
print(ind)

result:
[[ 0.          0.84147098  0.90929743  0.14112001]
 [-0.7568025  -0.95892427 -0.2794155   0.6569866 ]
 [ 0.98935825  0.41211849 -0.54402111 -0.99999021]
 [-0.53657292  0.42016704  0.99060736  0.65028784]
 [-0.28790332 -0.96139749 -0.75098725  0.14987721]]
[2 0 3 1]
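
The indices can then be used to pull out the maxima themselves. A sketch that pairs each row index with its column number, then compares against data.max(axis=0) (the name data_max is made up for this example):

```python
import numpy as np

data = np.sin(np.arange(20)).reshape(5, 4)
ind = data.argmax(axis=0)   # row index of the maximum in each column

# Pair each row index with its column number to fetch the maxima.
data_max = data[ind, np.arange(data.shape[1])]

print(data_max)
print(np.array_equal(data_max, data.max(axis=0)))
```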

Tiling (repeating) an array

a = np.arange(0, 40, 10)
# Tile a into 3 rows, repeating it 5 times across each row (result shape: (3, 20))
b = np.tile(a, (3, 5)) 
print(b)

result:
[[ 0 10 20 30  0 10 20 30  0 10 20 30  0 10 20 30  0 10 20 30]
 [ 0 10 20 30  0 10 20 30  0 10 20 30  0 10 20 30  0 10 20 30]
 [ 0 10 20 30  0 10 20 30  0 10 20 30  0 10 20 30  0 10 20 30]]

Sorting an array

a = np.array([[4, 3, 5], [1, 2, 1]])
print(a)
# Sort each row of a
b = np.sort(a, axis=1)
print(b)

# A second way with the same result: sort a in place

a.sort(axis=1)
print(a)

result:
[[4 3 5]
 [1 2 1]]
[[3 4 5]
 [1 1 2]]

# A third way: argsort returns the indices that would sort the array
a = np.array([4, 3, 1, 2])
print(a)
j = np.argsort(a)
print(j)
print(a[j])
# Each element of j is an index into a, listed in ascending order of a's values

result:
[4 3 1 2]
[2 3 1 0]
[1 2 3 4]
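
Because argsort returns indices rather than values, the same ordering can be applied to a second array or reversed for descending order. A sketch (the names array is hypothetical, one label per score):

```python
import numpy as np

scores = np.array([4, 3, 1, 2])
names = np.array(["d", "c", "a", "b"])   # hypothetical labels, one per score

order = np.argsort(scores)    # indices for ascending order
print(names[order])           # ['a' 'b' 'c' 'd']
print(scores[order[::-1]])    # [4 3 2 1]
```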

Inverting a matrix

a=np.array([
    [1,2],
    [3,4]
])

print(a)
b=np.linalg.inv(a)
print(b)

result:
[[1 2]
 [3 4]]
[[-2.   1. ]
 [ 1.5 -0.5]]
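
A quick sanity check on an inverse is to multiply it back and compare with the identity matrix. The sketch below also shows what happens with a singular matrix, which has no inverse (the singular example and the raised flag are made up for illustration):

```python
import numpy as np

a = np.array([[1, 2],
              [3, 4]])
b = np.linalg.inv(a)

# Compare with allclose rather than ==, since the inverse is computed in floating point.
print(np.allclose(a @ b, np.eye(2)))

# The second row is twice the first, so this matrix is singular.
singular = np.array([[1, 2],
                     [2, 4]])
raised = False
try:
    np.linalg.inv(singular)
except np.linalg.LinAlgError as err:
    raised = True
    print("not invertible:", err)
```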

That is all for now; I will add more as I come across other operations.
