This post contains reading notes on *Linear Algebra and Its Applications*.
Orthogonal sets
A set of vectors $\{\boldsymbol u_1,...,\boldsymbol u_p\}$ in $\mathbb R^n$ is said to be an orthogonal set if each pair of distinct vectors from the set is orthogonal, that is, if $\boldsymbol u_i \cdot \boldsymbol u_j = 0$ whenever $i \neq j$.
THEOREM 4
If $S = \{\boldsymbol u_1,...,\boldsymbol u_p\}$ is an orthogonal set of nonzero vectors in $\mathbb R^n$, then $S$ is linearly independent and hence is a basis for the subspace spanned by $S$.

PROOF
If $\boldsymbol 0 = c_1\boldsymbol u_1 + \cdots + c_p\boldsymbol u_p$ for some scalars $c_1,...,c_p$, then
$$0 = \boldsymbol 0 \cdot \boldsymbol u_1 = (c_1\boldsymbol u_1 + c_2\boldsymbol u_2 + \cdots + c_p\boldsymbol u_p)\cdot\boldsymbol u_1 = c_1(\boldsymbol u_1\cdot\boldsymbol u_1)$$
because $\boldsymbol u_1$ is orthogonal to $\boldsymbol u_2,...,\boldsymbol u_p$. Since $\boldsymbol u_1$ is nonzero, $\boldsymbol u_1 \cdot \boldsymbol u_1$ is not zero and so $c_1 = 0$. Similarly, $c_2,...,c_p$ must be zero. Thus $S$ is linearly independent.
The next theorem suggests why an orthogonal basis is much nicer than other bases: the weights in a linear combination can be computed easily.

THEOREM 5
Let $\{\boldsymbol u_1,...,\boldsymbol u_p\}$ be an orthogonal basis for a subspace $W$ of $\mathbb R^n$. For each $\boldsymbol y$ in $W$, the weights in the linear combination
$$\boldsymbol y = c_1\boldsymbol u_1 + \cdots + c_p\boldsymbol u_p$$
are given by
$$c_j = \frac{\boldsymbol y \cdot \boldsymbol u_j}{\boldsymbol u_j \cdot \boldsymbol u_j}\qquad (j = 1,...,p)$$
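As a quick numerical sanity check, the weights of a vector relative to an orthogonal basis can be read off with dot products alone, with no linear system to solve. The basis vectors and weights below are made-up example values:

```python
import numpy as np

# A made-up orthogonal basis for a subspace of R^3 (u1 . u2 = 0).
u1 = np.array([3.0, 1.0, 1.0])
u2 = np.array([-1.0, 2.0, 1.0])
assert np.isclose(u1 @ u2, 0.0)  # the set is orthogonal

# A vector constructed to lie in Span{u1, u2}, with known weights 2 and -3.
y = 2 * u1 - 3 * u2

# Each weight is (y . uj) / (uj . uj); cross terms vanish by orthogonality.
c1 = (y @ u1) / (u1 @ u1)
c2 = (y @ u2) / (u2 @ u2)
print(c1, c2)  # → 2.0 -3.0
```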
An Orthogonal Projection
Given a nonzero vector $\boldsymbol u$ in $\mathbb R^n$, consider the problem of decomposing a vector $\boldsymbol y$ in $\mathbb R^n$ into the sum of two vectors, one a multiple of $\boldsymbol u$ and the other orthogonal to $\boldsymbol u$. We wish to write
$$\boldsymbol y = \hat{\boldsymbol y} + \boldsymbol z \qquad (1)$$
where $\hat{\boldsymbol y} = \alpha\boldsymbol u$ for some scalar $\alpha$ and $\boldsymbol z$ is some vector orthogonal to $\boldsymbol u$. See Figure 2.
Given any scalar $\alpha$, let $\boldsymbol z = \boldsymbol y - \alpha\boldsymbol u$, so that (1) is satisfied. Then $\boldsymbol y - \hat{\boldsymbol y}$ is orthogonal to $\boldsymbol u$ if and only if
$$0 = (\boldsymbol y - \alpha\boldsymbol u)\cdot\boldsymbol u = \boldsymbol y\cdot\boldsymbol u - (\alpha\boldsymbol u)\cdot\boldsymbol u = \boldsymbol y\cdot\boldsymbol u - \alpha(\boldsymbol u\cdot\boldsymbol u)$$
That is, (1) is satisfied with $\boldsymbol z$ orthogonal to $\boldsymbol u$ if and only if $\alpha = \frac{\boldsymbol y\cdot\boldsymbol u}{\boldsymbol u\cdot\boldsymbol u}$ and $\hat{\boldsymbol y} = \frac{\boldsymbol y\cdot\boldsymbol u}{\boldsymbol u\cdot\boldsymbol u}\boldsymbol u$. The vector $\hat{\boldsymbol y}$ is called the orthogonal projection of $\boldsymbol y$ onto $\boldsymbol u$, and the vector $\boldsymbol z$ is called the component of $\boldsymbol y$ orthogonal to $\boldsymbol u$.
This projection is determined by the subspace $L$ spanned by $\boldsymbol u$ (the line through $\boldsymbol u$ and $\boldsymbol 0$). Sometimes $\hat{\boldsymbol y}$ is denoted by $\mathrm{proj}_L\,\boldsymbol y$ and is called the orthogonal projection of $\boldsymbol y$ onto $L$. That is,
$$\hat{\boldsymbol y} = \mathrm{proj}_L\,\boldsymbol y = \frac{\boldsymbol y\cdot\boldsymbol u}{\boldsymbol u\cdot\boldsymbol u}\boldsymbol u$$
The orthogonal projection also gives the distance from a point to a line: the distance from $\boldsymbol y$ to $L$ is $\left\|\boldsymbol y - \hat{\boldsymbol y}\right\|$.
$\boldsymbol x \mapsto \mathrm{proj}_L\,\boldsymbol x$ is a linear transformation.
A Geometric Interpretation of Theorem 5
The formula for the orthogonal projection $\hat{\boldsymbol y}$ above has the same appearance as each of the terms in Theorem 5. Thus Theorem 5 decomposes each $\boldsymbol y$ in $\mathrm{Span}\{\boldsymbol u_1,...,\boldsymbol u_p\}$ into the sum of $p$ projections onto one-dimensional subspaces that are mutually orthogonal.
Orthonormal Sets
A set $\{\boldsymbol u_1,...,\boldsymbol u_p\}$ is an orthonormal set if it is an orthogonal set of unit vectors. If $W$ is the subspace spanned by such a set, then $\{\boldsymbol u_1,...,\boldsymbol u_p\}$ is an orthonormal basis for $W$, since the set is automatically linearly independent.
The simplest example of an orthonormal set is the standard basis $\{\boldsymbol e_1,...,\boldsymbol e_n\}$ for $\mathbb R^n$. Any nonempty subset of $\{\boldsymbol e_1,...,\boldsymbol e_n\}$ is orthonormal, too.
When the vectors in an orthogonal set of nonzero vectors are normalized to have unit length, the new vectors will still be orthogonal, and hence the new set will be an orthonormal set.
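A small sketch of this normalization step, using a made-up orthogonal set in $\mathbb R^3$:

```python
import numpy as np

# A made-up orthogonal set of nonzero vectors in R^3
# (each pair of distinct vectors has dot product 0).
vs = [np.array([3.0, 1.0, 1.0]),
      np.array([-1.0, 2.0, 1.0]),
      np.array([-1.0, -4.0, 7.0])]

# Dividing each vector by its length preserves pairwise orthogonality,
# so the normalized set is orthonormal.
us = [v / np.linalg.norm(v) for v in vs]

for i in range(3):
    for j in range(3):
        expected = 1.0 if i == j else 0.0  # unit length and orthogonality
        assert np.isclose(us[i] @ us[j], expected)
```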
Matrices whose columns form an orthonormal set are important in applications and in computer algorithms for matrix computations. Their main properties are given in Theorems 6 and 7.

THEOREM 6
An $m \times n$ matrix $U$ has orthonormal columns if and only if $U^TU = I$.

THEOREM 7
Let $U$ be an $m \times n$ matrix with orthonormal columns, and let $\boldsymbol x$ and $\boldsymbol y$ be in $\mathbb R^n$. Then
(a) $\left\|U\boldsymbol x\right\| = \left\|\boldsymbol x\right\|$
(b) $(U\boldsymbol x)\cdot(U\boldsymbol y) = \boldsymbol x\cdot\boldsymbol y$
(c) $(U\boldsymbol x)\cdot(U\boldsymbol y) = 0$ if and only if $\boldsymbol x\cdot\boldsymbol y = 0$
Properties (a) and (c) say that the linear mapping $\boldsymbol x \mapsto U\boldsymbol x$ preserves lengths and orthogonality.
Theorems 6 and 7 are particularly useful when applied to square matrices. An orthogonal matrix is a square invertible matrix $U$ such that $U^{-1} = U^T$. It is easy to see that any square matrix with orthonormal columns is an orthogonal matrix. Surprisingly, such a matrix must have orthonormal rows, too, since $U^T$ is itself orthogonal: $(U^T)^{-1} = (U^T)^T$.
Also, $\det U = \pm 1$.
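These properties of an orthogonal matrix can be checked on a standard example, a rotation matrix (the angle below is arbitrary):

```python
import numpy as np

# A 2x2 rotation matrix: the textbook example of an orthogonal matrix.
theta = 0.3  # arbitrary angle
U = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])

# Orthonormal columns: U^T U = I, i.e. U^{-1} = U^T ...
assert np.allclose(U.T @ U, np.eye(2))
# ... and the rows are orthonormal too: U U^T = I.
assert np.allclose(U @ U.T, np.eye(2))
# det U = +1 or -1 (here +1, since rotations preserve orientation).
assert np.isclose(np.linalg.det(U), 1.0)
```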
EXERCISES
Show that if an $n \times n$ matrix $U$ satisfies $(U\boldsymbol x)\cdot(U\boldsymbol y) = \boldsymbol x\cdot\boldsymbol y$ for all $\boldsymbol x$ and $\boldsymbol y$ in $\mathbb R^n$, then $U$ is an orthogonal matrix.
SOLUTION
$U\boldsymbol e_j$ is the $j$-th column of $U$. Since $\left\|U\boldsymbol e_j\right\|^2 = (U\boldsymbol e_j)\cdot(U\boldsymbol e_j) = \boldsymbol e_j\cdot\boldsymbol e_j = 1$, the columns of $U$ are unit vectors. For $j \neq k$, $(U\boldsymbol e_j)\cdot(U\boldsymbol e_k) = \boldsymbol e_j\cdot\boldsymbol e_k = 0$. Thus each pair of distinct columns of $U$ is orthogonal, so the columns of $U$ are orthonormal and $U$ is an orthogonal matrix.
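The column-by-column argument in the solution can be traced numerically. The matrix below is a made-up example: a random orthogonal $U$ obtained from a QR factorization, which preserves dot products and so satisfies the exercise's hypothesis:

```python
import numpy as np

rng = np.random.default_rng(0)
# A random orthogonal matrix from a QR factorization (example choice of U
# satisfying (Ux).(Uy) = x.y for all x, y).
U, _ = np.linalg.qr(rng.standard_normal((4, 4)))

# U e_j is the j-th column of U, so preserving dot products on the
# standard basis forces the columns of U to be orthonormal.
I = np.eye(4)
for j in range(4):
    for k in range(4):
        ej, ek = I[:, j], I[:, k]
        assert np.isclose((U @ ej) @ (U @ ek), ej @ ek)

# Orthonormal columns of a square matrix: U^T U = I, i.e. U is orthogonal.
assert np.allclose(U.T @ U, I)
```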