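Environment: Python 3.7 + Anaconda. In the code below, np refers to numpy (import numpy as np); some snippets use the full name instead (import numpy).
Reading data from a file into numpy
# Split on commas to get a matrix. The default dtype is float, so text fields (including the header row) are read as nan.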
world_alcohol = numpy.genfromtxt("world_alcohol.txt", delimiter=",")
print(world_alcohol)
result:
[[ nan nan nan nan nan]
[1.986e+03 nan nan nan 0.000e+00]
[1.986e+03 nan nan nan 5.000e-01]
...
[1.987e+03 nan nan nan 7.500e-01]
[1.989e+03 nan nan nan 1.500e+00]
[1.985e+03 nan nan nan 3.100e-01]]
Creating an array from a list or a list of lists
#The numpy.array() function can take a list or list of lists as input. When we input a list, we get a one-dimensional array as a result:
vector = numpy.array([5, 10, 15, 20])
#When we input a list of lists, we get a matrix as a result:
matrix = numpy.array([[5, 10, 15], [20, 25, 30], [35, 40, 45]])
print(vector)
print(matrix)
result:
[ 5 10 15 20]
[[ 5 10 15]
[20 25 30]
[35 40 45]]
Checking the shape of a matrix or array
#We can use the ndarray.shape property to figure out how many elements are in the array
vector = numpy.array([1, 2, 3, 4])
print(vector.shape)
#For matrices, the shape property contains a tuple with 2 elements.
matrix = numpy.array([[5, 10, 15], [20, 25, 30]])
print(matrix.shape)
result:
(4,)
(2, 3)
Checking the element type of an array
#Each value in a NumPy array has to have the same data type
#NumPy will automatically figure out an appropriate data type when reading in data or converting lists to arrays.
#You can check the data type of a NumPy array using the dtype property.
numbers = numpy.array([1, 2, 3, 4])
numbers.dtype
result (the default integer type depends on the platform; int64 is common on Linux and macOS):
dtype('int32')
Reading data from a file, skipping the first line and specifying the element type
world_alcohol = numpy.genfromtxt("world_alcohol.txt", delimiter=",", dtype="U75", skip_header=1)
print(world_alcohol)
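To see the difference between the two genfromtxt calls without the real file, here is a minimal sketch that reads a small in-memory sample. The sample text and its column names are made up for illustration (they only imitate the layout of world_alcohol.txt), and the variable names are my own.

```python
import io
import numpy as np

# Hypothetical sample in a layout similar to world_alcohol.txt (values are made up).
sample = (
    "Year,Region,Country,Type,Liters\n"
    "1986,Americas,Uruguay,Other,0.5\n"
    "1985,Europe,Norway,Wine,1.2\n"
)

# Default dtype is float: text fields and the whole header row become nan.
as_float = np.genfromtxt(io.StringIO(sample), delimiter=",")

# Reading everything as strings and skipping the header keeps the text intact.
as_text = np.genfromtxt(
    io.StringIO(sample), delimiter=",", dtype="U75", skip_header=1
)

print(as_float)
print(as_text)
```

Since every element is then a string, numeric columns need an astype(float) conversion before doing arithmetic on them.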
Getting elements of a two-dimensional array
uruguay_other_1986 = world_alcohol[1,4]
third_country = world_alcohol[2,2]
print (uruguay_other_1986)
print (third_country)
Array slicing: the start index is included and the end index is excluded
vector = numpy.array([5, 10, 15, 20])
print(vector[0:3])
result:
[ 5 10 15]
Matrix slicing
matrix = numpy.array([
[5, 10, 15],
[20, 25, 30],
[35, 40, 45]
])
# ":" selects every row; this gets column 1 of all rows
print(matrix[:,1])
# Get columns 0 and 1 of all rows
print(matrix[:,0:2])
# Get the elements in rows 1 and 2, columns 0 and 1
print(matrix[1:3,0:2])
Comparing a matrix with a number yields a Boolean matrix
# The value on the right is compared with each element of the vector.
# Where they are equal the result is True; otherwise it is False.
vector = numpy.array([5, 10, 15, 20])
vector == 10
result:
array([False, True, False, False])
matrix = numpy.array([
[5, 10, 15],
[20, 25, 30],
[35, 40, 45]
])
matrix == 25
result:
array([[False, False, False],
[False, True, False],
[False, False, False]])
Using a Boolean array to select elements
#Compares vector to the value 10, which generates a new Boolean vector [False, True, False, False]. It assigns this result to equal_to_ten
vector = numpy.array([5, 10, 15, 20])
equal_to_ten = (vector == 10)
print(equal_to_ten)
print(vector[equal_to_ten])
result:
[False True False False]
[10]
matrix = numpy.array([
[5, 10, 15],
[20, 25, 30],
[35, 40, 45]
])
second_column_25 = (matrix[:,1] == 25)
print(second_column_25)
print(matrix[second_column_25, :])
result:
[False True False]
[[20 25 30]]
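Boolean arrays can also be combined and used for assignment. The following is only a small sketch of that idea with variable names of my own choosing; note that the element-wise operators are & and |, and each comparison needs its own parentheses.

```python
import numpy as np

vector = np.array([5, 10, 15, 20])

# Combine two conditions element-wise; "and" / "or" do not work on arrays.
mask = (vector > 5) & (vector < 20)

# Select the matching elements.
selected = vector[mask]

# Assign through the mask on a copy, leaving the original untouched.
updated = vector.copy()
updated[mask] = 0

print(mask)
print(selected)
print(updated)
```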
Changing the element type of an array
#We can convert the data type of an array with the ndarray.astype() method.
vector = numpy.array(["1", "2", "3"])
print(vector.dtype)
print(vector)
vector = vector.astype(float)
print(vector.dtype)
print(vector)
result:
<U1
['1' '2' '3']
float64
[1. 2. 3.]
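One thing worth knowing about astype is that converting floats to integers drops the fractional part rather than rounding. A short sketch (the values are arbitrary examples of mine):

```python
import numpy as np

floats = np.array([1.7, -1.7, 2.0])

# astype(int) truncates toward zero and returns a new array.
ints = floats.astype(int)

# Round to the nearest integer first if that is what you want.
rounded = np.rint(floats).astype(int)

print(ints)
print(rounded)
```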
Summing the whole array
vector = numpy.array([5, 10, 15, 20])
vector.sum()
result:
50
Summing each row of a two-dimensional array
# The axis dictates which dimension we perform the operation on
#1 means that we want to perform the operation on each row, and 0 means on each column
matrix = numpy.array([
[5, 10, 15],
[20, 25, 30],
[35, 40, 45]
])
matrix.sum(axis=1)
result:
array([ 30, 75, 120])
Summing each column of a two-dimensional array
matrix = numpy.array([
[5, 10, 15],
[20, 25, 30],
[35, 40, 45]
])
matrix.sum(axis=0)
result:
array([60, 75, 90])
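A way to remember the axis argument is that the axis you pass is the one that disappears from the shape. The sketch below uses a non-square matrix of my own so the two cases are easy to tell apart; keepdims is an optional extra that keeps the summed axis with length 1.

```python
import numpy as np

matrix = np.arange(6).reshape(2, 3)   # [[0 1 2], [3 4 5]]

row_sums = matrix.sum(axis=1)         # collapses the columns: one value per row
col_sums = matrix.sum(axis=0)         # collapses the rows: one value per column

# keepdims=True keeps the collapsed axis, giving shape (2, 1) instead of (2,).
row_sums_2d = matrix.sum(axis=1, keepdims=True)

print(row_sums, row_sums.shape)
print(col_sums, col_sums.shape)
print(row_sums_2d.shape)
```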
Changing the shape of an array
a = np.arange(15).reshape(3, 5)
print(a)
result:
[[ 0 1 2 3 4]
[ 5 6 7 8 9]
[10 11 12 13 14]]
Checking the number of dimensions of an array
#the number of axes (dimensions) of the array
a = np.arange(15).reshape(3, 5)
print(a.ndim)
result:
2
Checking the number of elements in an array
#the total number of elements of the array
a = np.arange(15).reshape(3, 5)
print(a.size)
result:
15
Initializing an all-zeros matrix; the argument is a tuple of (rows, columns)
np.zeros((3,4))
result:
array([[ 0., 0., 0., 0.],
[ 0., 0., 0., 0.],
[ 0., 0., 0., 0.]])
Initializing an all-ones array and specifying the element type
np.ones( (2,3,4), dtype=np.int32 )
result:
array([[[1, 1, 1, 1],
[1, 1, 1, 1],
[1, 1, 1, 1]],
[[1, 1, 1, 1],
[1, 1, 1, 1],
[1, 1, 1, 1]]])
Creating a sequence of evenly stepped numbers from start, end and step; start is included, end is excluded
#To create sequences of numbers
np.arange( 10, 30, 5 )
result:
array([10, 15, 20, 25])
Initializing a matrix with random values
np.random.random((2,3))
result:
array([[ 0.40130659, 0.45452825, 0.79776512],
[ 0.63220592, 0.74591134, 0.64130737]])
Creating an array of evenly spaced values between start and end (both included); the first argument is start, the second is end, and the third is the total number of elements
np.linspace( 0, 2*np.pi, 100 )
result:
array([0. , 0.06346652, 0.12693304, 0.19039955, 0.25386607,
0.31733259, 0.38079911, 0.44426563, 0.50773215, 0.57119866,
0.63466518, 0.6981317 , 0.76159822, 0.82506474, 0.88853126,
0.95199777, 1.01546429, 1.07893081, 1.14239733, 1.20586385,
1.26933037, 1.33279688, 1.3962634 , 1.45972992, 1.52319644,
1.58666296, 1.65012947, 1.71359599, 1.77706251, 1.84052903,
1.90399555, 1.96746207, 2.03092858, 2.0943951 , 2.15786162,
2.22132814, 2.28479466, 2.34826118, 2.41172769, 2.47519421,
2.53866073, 2.60212725, 2.66559377, 2.72906028, 2.7925268 ,
2.85599332, 2.91945984, 2.98292636, 3.04639288, 3.10985939,
3.17332591, 3.23679243, 3.30025895, 3.36372547, 3.42719199,
3.4906585 , 3.55412502, 3.61759154, 3.68105806, 3.74452458,
3.8079911 , 3.87145761, 3.93492413, 3.99839065, 4.06185717,
4.12532369, 4.1887902 , 4.25225672, 4.31572324, 4.37918976,
4.44265628, 4.5061228 , 4.56958931, 4.63305583, 4.69652235,
4.75998887, 4.82345539, 4.88692191, 4.95038842, 5.01385494,
5.07732146, 5.14078798, 5.2042545 , 5.26772102, 5.33118753,
5.39465405, 5.45812057, 5.52158709, 5.58505361, 5.64852012,
5.71198664, 5.77545316, 5.83891968, 5.9023862 , 5.96585272,
6.02931923, 6.09278575, 6.15625227, 6.21971879, 6.28318531])
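The main difference between arange and linspace is how the end point is treated. A small sketch comparing the two on an interval of my own choosing:

```python
import numpy as np

# arange takes a step and never includes the end value.
stepped = np.arange(0, 1, 0.25)

# linspace takes a count and includes the end value by default.
spaced = np.linspace(0, 1, 5)

# endpoint=False makes linspace leave out the end value as well.
open_ended = np.linspace(0, 1, 4, endpoint=False)

print(stepped)
print(spaced)
print(open_ended)
```

When the step is a float, linspace is usually the safer choice because the number of elements is fixed up front.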
Taking the sine of an array; the result is an array holding the sine of each element
np.sin(np.linspace( 0, 2*np.pi, 100 ))
result:
array([ 0.00000000e+00, 6.34239197e-02, 1.26592454e-01, 1.89251244e-01,
2.51147987e-01, 3.12033446e-01, 3.71662456e-01, 4.29794912e-01,
4.86196736e-01, 5.40640817e-01, 5.92907929e-01, 6.42787610e-01,
6.90079011e-01, 7.34591709e-01, 7.76146464e-01, 8.14575952e-01,
8.49725430e-01, 8.81453363e-01, 9.09631995e-01, 9.34147860e-01,
9.54902241e-01, 9.71811568e-01, 9.84807753e-01, 9.93838464e-01,
9.98867339e-01, 9.99874128e-01, 9.96854776e-01, 9.89821442e-01,
9.78802446e-01, 9.63842159e-01, 9.45000819e-01, 9.22354294e-01,
8.95993774e-01, 8.66025404e-01, 8.32569855e-01, 7.95761841e-01,
7.55749574e-01, 7.12694171e-01, 6.66769001e-01, 6.18158986e-01,
5.67059864e-01, 5.13677392e-01, 4.58226522e-01, 4.00930535e-01,
3.42020143e-01, 2.81732557e-01, 2.20310533e-01, 1.58001396e-01,
9.50560433e-02, 3.17279335e-02, -3.17279335e-02, -9.50560433e-02,
-1.58001396e-01, -2.20310533e-01, -2.81732557e-01, -3.42020143e-01,
-4.00930535e-01, -4.58226522e-01, -5.13677392e-01, -5.67059864e-01,
-6.18158986e-01, -6.66769001e-01, -7.12694171e-01, -7.55749574e-01,
-7.95761841e-01, -8.32569855e-01, -8.66025404e-01, -8.95993774e-01,
-9.22354294e-01, -9.45000819e-01, -9.63842159e-01, -9.78802446e-01,
-9.89821442e-01, -9.96854776e-01, -9.99874128e-01, -9.98867339e-01,
-9.93838464e-01, -9.84807753e-01, -9.71811568e-01, -9.54902241e-01,
-9.34147860e-01, -9.09631995e-01, -8.81453363e-01, -8.49725430e-01,
-8.14575952e-01, -7.76146464e-01, -7.34591709e-01, -6.90079011e-01,
-6.42787610e-01, -5.92907929e-01, -5.40640817e-01, -4.86196736e-01,
-4.29794912e-01, -3.71662456e-01, -3.12033446e-01, -2.51147987e-01,
-1.89251244e-01, -1.26592454e-01, -6.34239197e-02, -2.44929360e-16])
+, -, *, / between two arrays operate on corresponding elements
#the product operator * operates elementwise in NumPy arrays
a = np.array( [20,30,40,50] )
b = np.arange( 1,5 )
print(a)
print(b)
print("-----------------")
print(a+b)
print(a-b)
print(a*b)
print(a/b)
print(b**2)
result:
[20 30 40 50]
[1 2 3 4]
-----------------
[21 32 43 54]
[19 28 37 46]
[ 20 60 120 200]
[20. 15. 13.33333333 12.5 ]
[ 1 4 9 16]
Matrix multiplication: either A.dot(B) or numpy.dot(A, B) works
#The matrix product can be performed using the dot function or method
A = np.array( [[1,1],
[0,1]] )
B = np.array( [[2,0],
[3,4]] )
print(A)
print(B)
print(A.dot(B))
print(np.dot(A, B))
result:
[[1 1]
[0 1]]
[[2 0]
[3 4]]
[[5 4]
[3 4]]
[[5 4]
[3 4]]
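Because * is element-wise, it is easy to mix it up with the matrix product. The sketch below puts the two next to each other using the same A and B; the @ operator (available since Python 3.5) is another spelling of the matrix product for 2-D arrays.

```python
import numpy as np

A = np.array([[1, 1],
              [0, 1]])
B = np.array([[2, 0],
              [3, 4]])

elementwise = A * B        # multiplies matching positions
matrix_product = A @ B     # same result as A.dot(B) for 2-D arrays

print(elementwise)
print(matrix_product)
```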
Element-wise exponential and square root of an array
B = np.arange(3)
print(B)
print(np.exp(B))
print(np.sqrt(B))
result:
[0 1 2]
[1. 2.71828183 7.3890561 ]
[0. 1. 1.41421356]
Flattening a matrix to one dimension
#Return the floor of the input
a = np.floor(10*np.random.random((3,4)))
print(a)
print(a.ravel())
result:
[[6. 5. 1. 5.]
[3. 9. 3. 4.]
[4. 1. 3. 2.]]
[6. 5. 1. 5. 3. 9. 3. 4. 4. 1. 3. 2.]
Transposing a matrix
a = np.floor(10*np.random.random((3,4)))
print(a)
print(a.T)
result:
[[0. 9. 7. 6.]
[7. 3. 7. 5.]
[2. 0. 7. 0.]]
[[0. 7. 2.]
[9. 3. 0.]
[7. 7. 7.]
[6. 5. 0.]]
Joining two matrices: hstack places them side by side (horizontally), vstack stacks them on top of each other (vertically)
a = np.floor(10*np.random.random((2,2)))
b = np.floor(10*np.random.random((2,2)))
print(a)
print('---')
print(b)
print('---')
print(np.hstack((a,b)))
result:
[[5. 4.]
[8. 5.]]
---
[[4. 9.]
[3. 0.]]
---
[[5. 4. 4. 9.]
[8. 5. 3. 0.]]
a = np.floor(10*np.random.random((2,2)))
b = np.floor(10*np.random.random((2,2)))
print(a)
print('---')
print(b)
print('---')
print(np.vstack((a,b)))
result:
[[4. 9.]
[1. 0.]]
---
[[5. 3.]
[8. 7.]]
---
[[4. 9.]
[1. 0.]
[5. 3.]
[8. 7.]]
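With random values it can be hard to see which direction the arrays were joined in, so here is a sketch with fixed inputs of my own (all ones and all zeros) that checks the resulting shapes. np.concatenate with an explicit axis gives the same results.

```python
import numpy as np

a = np.ones((2, 2))
b = np.zeros((2, 2))

side_by_side = np.hstack((a, b))   # more columns: shape (2, 4)
stacked = np.vstack((a, b))        # more rows: shape (4, 2)

# The same joins written with an explicit axis.
same_as_hstack = np.concatenate((a, b), axis=1)
same_as_vstack = np.concatenate((a, b), axis=0)

print(side_by_side)
print(stacked)
```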
Splitting a matrix into several matrices: hsplit splits along the columns, vsplit splits along the rows
a = np.floor(10*np.random.random((2,12)))
print(a)
print(np.hsplit(a,3)) # Split a into 3 matrices
print(np.hsplit(a,(3,4)) ) # Split a after the third and the fourth column
result:
[[6. 6. 0. 5. 1. 5. 4. 1. 7. 7. 2. 7.]
[8. 1. 5. 7. 0. 9. 5. 7. 3. 2. 2. 8.]]
[array([[6., 6., 0., 5.],
[8., 1., 5., 7.]]),
array([[1., 5., 4., 1.],
[0., 9., 5., 7.]]),
array([[7., 7., 2., 7.],
[3., 2., 2., 8.]])]
[array([[6., 6., 0.],
[8., 1., 5.]]),
array([[5.],
[7.]]),
array([[1., 5., 4., 1., 7., 7., 2., 7.],
[0., 9., 5., 7., 3., 2., 2., 8.]])]
a = np.floor(10*np.random.random((12,2)))
print(a)
print(np.vsplit(a,3))
result:
[[9. 7.]
[1. 8.]
[6. 1.]
[4. 7.]
[4. 3.]
[1. 8.]
[4. 3.]
[0. 7.]
[4. 0.]
[5. 2.]
[6. 7.]
[3. 7.]]
[
array([[9., 7.],
[1., 8.],
[6., 1.],
[4., 7.]]),
array([[4., 3.],
[1., 8.],
[4., 3.],
[0., 7.]]),
array([[4., 0.],
[5., 2.],
[6., 7.],
[3., 7.]])
]
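The two forms of the second argument (a count versus a tuple of split positions) are easier to follow with predictable values. This sketch uses np.arange instead of random numbers, which is my own substitution, and shows that joining the pieces again gives back the original.

```python
import numpy as np

a = np.arange(24).reshape(2, 12)

# An integer means "this many equal parts".
thirds = np.hsplit(a, 3)

# A tuple gives the column indices where the cuts are made.
pieces = np.hsplit(a, (3, 4))

# hstack undoes hsplit.
rejoined = np.hstack(thirds)

print([part.shape for part in thirds])
print([part.shape for part in pieces])
```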
Several ways of copying array a
#Simple assignments make no copy of array objects or of their data.
a = np.arange(12)
b = a
# a and b are two names for the same ndarray object
b is a
b.shape = 3,4
print(a.shape)
# This shows that a and b are the same array: changing one changes the other
result:
(3, 4)
#The view method creates a new array object that looks at the same data.
a = np.arange(12)
c = a.view()
c.shape = 2,6
print(a.shape)
c[0,4] = 1234
print(a)
# Reshaping c did not affect the shape of a, but changing a value in c does change a
result:
(12,)
[ 0 1 2 3 1234 5 6 7 8 9 10 11]
#The copy method makes a complete copy of the array and its data.
a = np.arange(12)
d = a.copy()
d[0] = 9999
print(d)
print(a)
# This is a full copy: a and d start with the same values and shape but do not affect each other
result:
[9999 1 2 3 4 5 6 7 8 9 10 11]
[ 0 1 2 3 4 5 6 7 8 9 10 11]
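If you are unsure which of the three cases you are dealing with, NumPy can tell you. The sketch below is one way to check, using np.shares_memory and the base attribute; the variable names follow the snippets above.

```python
import numpy as np

a = np.arange(12)
b = a            # plain assignment: the same object
c = a.view()     # new array object, same underlying data
d = a.copy()     # new array object, new data

print(b is a)                    # same object
print(c is a, c.base is a)       # different object, but built on a
print(np.shares_memory(a, c))    # data is shared with the view
print(np.shares_memory(a, d))    # data is not shared with the copy
```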
Getting the index of the maximum value along each row or column of a two-dimensional array
import numpy as np
data = np.sin(np.arange(20)).reshape(5,4)
print(data)
ind = data.argmax(axis=0)  # axis=0 gives one index per column
print(ind)
result:
[[ 0. 0.84147098 0.90929743 0.14112001]
[-0.7568025 -0.95892427 -0.2794155 0.6569866 ]
[ 0.98935825 0.41211849 -0.54402111 -0.99999021]
[-0.53657292 0.42016704 0.99060736 0.65028784]
[-0.28790332 -0.96139749 -0.75098725 0.14987721]]
[2 0 3 1]
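The indices from argmax can be used to pull out the maximum values themselves. This is a sketch of one way to do it, pairing each row index in ind with its column number; col_max is a name I made up.

```python
import numpy as np

data = np.sin(np.arange(20)).reshape(5, 4)
ind = data.argmax(axis=0)

# Pair each row index with its column number to pick one element per column.
col_max = data[ind, np.arange(data.shape[1])]

print(col_max)
print(data.max(axis=0))   # the same values, computed directly
```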
Tiling (repeating) an array
a = np.arange(0, 40, 10)
# Repeat a 3 times along the rows and 5 times along the columns
b = np.tile(a, (3, 5))
print(b)
result:
[[ 0 10 20 30 0 10 20 30 0 10 20 30 0 10 20 30 0 10 20 30]
[ 0 10 20 30 0 10 20 30 0 10 20 30 0 10 20 30 0 10 20 30]
[ 0 10 20 30 0 10 20 30 0 10 20 30 0 10 20 30 0 10 20 30]]
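To make the meaning of the second argument clearer, this sketch tries a few different repetition settings on the same a and looks at the resulting shapes. The variable names are mine.

```python
import numpy as np

a = np.arange(0, 40, 10)       # [ 0 10 20 30], shape (4,)

twice = np.tile(a, 2)          # repeated end to end: shape (8,)
rows = np.tile(a, (3, 1))      # 3 rows, 1 copy per row: shape (3, 4)
grid = np.tile(a, (3, 5))      # 3 rows, 5 copies per row: shape (3, 20)

print(twice)
print(rows)
print(grid.shape)
```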
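Sorting an array
a = np.array([[4, 3, 5], [1, 2, 1]])
print(a)
# Sort each row of a; np.sort returns a sorted copy
b = np.sort(a, axis=1)
print(b)
# A second way with the same result; a.sort sorts a in place
a.sort(axis=1)
print(a)
result:
[[4 3 5]
[1 2 1]]
[[3 4 5]
[1 1 2]]
[[3 4 5]
[1 1 2]]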
# A third way
a = np.array([4, 3, 1, 2])
print(a)
j = np.argsort(a)
print(j)
print(a[j])
# Each element of j is the index in a of the next value in ascending order
result:
[4 3 1 2]
[2 3 1 0]
[1 2 3 4]
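argsort is mostly useful when a second array has to be reordered the same way, or when you want descending order. A small sketch; the labels array is hypothetical data I added for illustration.

```python
import numpy as np

scores = np.array([4, 3, 1, 2])
labels = np.array(["d", "c", "a", "b"])   # hypothetical labels, one per score

order = np.argsort(scores)                # indices that sort scores ascending

labels_sorted = labels[order]             # reorder labels by their scores
descending = scores[order[::-1]]          # reverse the indices for descending order

print(labels_sorted)
print(descending)
```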
Matrix inverse
a = np.array([[1, 2], [3, 4]])
print(a)
b = np.linalg.inv(a)
print(b)
result:
[[1 2]
[3 4]]
[[-2. 1. ]
[ 1.5 -0.5]]
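A quick way to check an inverse is to multiply it back with the original matrix and compare against the identity. Because of floating-point rounding the comparison should use np.allclose rather than ==. A minimal sketch:

```python
import numpy as np

a = np.array([[1.0, 2.0],
              [3.0, 4.0]])
a_inv = np.linalg.inv(a)

# a times its inverse should be the identity, up to rounding error.
product = a @ a_inv
is_identity = np.allclose(product, np.eye(2))

print(product)
print(is_identity)
```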
That's all for now; I'll add more as I come to use it.