5.16作业（numpy练习）

Exercise 0

题意

题解

randn(d0, d1, ..., dn) 返回一个n维数组，数组是标准正态分布的样本。用randn函数得到二维数组A。

import numpy
from numpy import random

n = 200
m = 500
A = random.randn(n, m)

numpy库中没有toeplitz函数，scipy.linalg库有。其实可以自己写一个函数生成Toeplitz数组。什么是Toeplitz矩阵？

def toeplitz(row_vector, column_vector):
    '''
    row_vector是 1×m 的二维数组
    column_vector是 m×1 的二维数组
    '''
    res = row_vector
    for i in range(1, column_vector.size):
        row_vector = numpy.roll(row_vector, 1)
        row_vector[0,0] = column_vector[i, 0]
        res = numpy.vstack((res, row_vector))
    return res

测试 toeplitz 函数

vec1 = numpy.array([[1, 2, 3, 4, 5]])
vec2 = numpy.array([[1], [6], [7], [8], [9]])
print(toeplitz(vec1, vec2))

结果

得到二维数组B

B = toeplitz(random.randn(1, m), random.randn(m, 1))

Exercise 1: Matrix operations

题意

题解

算术运算符对数组逐个元素地作运算，因此矩阵（二维数组）乘法不能用 A*B ，要用 dot 函数实现。两个二维数组相加可以直接用 A+B 。

数组的转置用 transpose 函数。

eye 函数生成一个单位矩阵。

print(A + A)
At = numpy.transpose(A)
print(numpy.dot(A, At))
print(numpy.dot(At, A))
print(numpy.dot(A, B))

def compute(lambda_):
    I = numpy.eye(m)
    #print(lambda_ * I)
    return numpy.dot(A, B - lambda_ * I)

Exercise 2: Solving a linear system

题意

题解

numpy.linalg 库有解线性方程组的函数 solve(a, b) 。 a 一定要是可逆矩阵，否则抛出异常。

这里以 b = numpy.ones(m) 为例，可以取其他值。

from numpy import linalg
b = numpy.ones(m)
x = linalg.solve(B, b)
print(x)

Exercise 3: Norms

题意

题解

使用 numpy.linalg.norm 函数得到范数。给参数 ord 传入不同的值能够得到不同的范数，如传入 numpy.inf 得到无穷范数。

'''Frobenius norm'''
print(linalg.norm(A, 'fro'))
'''infinity norm'''
print(linalg.norm(B, numpy.inf))
'''largest singular value'''
print(linalg.norm(B, 2))
'''smallest singular value'''
print(linalg.norm(B, -2))

某一次的运行结果

Exercise 4: Power iteration

题意

题解

power iteration 的流程如下所示（证明略）：

代码给出了两种选择X0的方法：元素都是1的向量，或随机产生的向量。

随机产生X0能以很大的几率避免X0是Z的特征向量这种情况。但这里Z也是随机产生的，X0的元素全是1问题也不是很大。

import time


def find_max(y):
    '''找到n×1数组y中绝对值最大的数'''
    largest = abs(y[0, 0])
    ans = y[0, 0]
    for i in range(1, y.size):
        if abs(y[i, 0]) > largest:
            largest = abs(y[i, 0])
            ans = y[i, 0]
    return ans


def power_iteration(Z, eps):
    time_stamp = time.clock()
    #x = random.rand(n, 1)
    x = numpy.ones((n, 1))
    '''迭代次数'''
    count = 1
    y = numpy.dot(Z, x)
    c1 = find_max(y)
    x = y / c1

    '''c0取足够大'''
    c0 = c1 + eps + 1e7

    while abs(c1 - c0) >= eps:
        count += 1
        y = numpy.dot(Z, x)
        c0 = c1
        c1 = find_max(y)
        x = y / c1

    '''过程花费时间'''
    duration = time.clock() - time_stamp
    return x, c1, duration, count


Z = random.randn(n, n)
eigen_vector, eigen_value, duration, count = power_iteration(Z, 1e-3)
print(eigen_vector)
print(eigen_value)
print('time cost: ', duration)
print('iteration count: ', count)

经过若干次测试，x = random.rand(n, 1) 的收敛速度明显慢于 x = numpy.ones((n, 1)) 。下面贴出的测试结果是使用后者的情况。迭代的收敛速度是线性的，效率本身较低，所以随机生成向量的收敛速度特别低（通过数值试验得出的结论，其理论支撑还有待考证）。可以据此提出猜想：收敛速度和初始向量X0有关。

n = 20 的几次结果：

n = 100 的几次结果：

n = 500 的几次结果：

Exercise 5: Singular values

题意

题解

numpy.linalg 库也提供了计算奇异值的函数 svd。

以 n = 500 作测试。

prob = 0.05
C = numpy.zeros((n, n))
while prob < 0.99:
    for i in range(0, n):
        for j in range(0, n):
            p = random.rand(1)[0]
            #print(p)
            C[i][j] = 1 if p < prob else 0

    u, s, vh = linalg.svd(C)
    max_singular_value = numpy.max(s)
    print('n:', n, '  p:', prob, '  s:', max_singular_value, '\n')
    prob += 0.05

测试结果：

最大奇异值 ≈ n × p

numpy.random.binomial 函数也可以生成满足要求的矩阵 C，可以不用自己写代码。

Exercise 6: Nearest neighbor

题意

题解

题目提示说用 argmin 函数，用 argmin 函数找出 A - z 中绝对值最小的元素的位置，返回 A 中相应位置的元素。

def find_closest(A, z):
    C = numpy.abs(A - z)
    idx = numpy.argmin(C)
    return A[idx // A.shape[0]][idx % A.shape[0]]

A = random.rand(n, n)
print(A)
print(find_closest(A, 1))

n = 3 时，测试结果如下