These pages are a collection of my personal review notes on Matrix Analysis, mainly about matrices and the concepts surrounding them, such as spaces, norms, etc. These are the things that matter in data science and in almost all machine learning algorithms. Hence, I collected them in this form for the convenience of anyone who wants a quick desktop or mobile reference.
For x,y∈Rn, we say that x is majorized by y, denoted by x≺y, if
$$\sum_{j=1}^{k} x_j^{\downarrow} \le \sum_{j=1}^{k} y_j^{\downarrow} \ \text{ for } k \in [1:n-1], \qquad \sum_{j=1}^{n} x_j^{\downarrow} = \sum_{j=1}^{n} y_j^{\downarrow}$$
2) Weak Majorization
For x,y∈Rn, we say that x is weakly majorized by y, denoted by x≺_w y, if

$$\sum_{j=1}^{k} x_j^{\downarrow} \le \sum_{j=1}^{k} y_j^{\downarrow} \ \text{ for } k \in [1:n-1], \qquad \sum_{j=1}^{n} x_j^{\downarrow} \le \sum_{j=1}^{n} y_j^{\downarrow}$$
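As a quick numerical check of the two definitions (a sketch of my own; the helper names `majorizes` and `weakly_majorizes` are mine, and NumPy is assumed), sorting in decreasing order gives the x↓ terms and cumulative sums give the partial sums:

```python
import numpy as np

def majorizes(y, x):
    """Return True if x ≺ y (x is majorized by y)."""
    # Sort both vectors in decreasing order (the ↓ in the definition).
    xs = np.sort(x)[::-1]
    ys = np.sort(y)[::-1]
    cx, cy = np.cumsum(xs), np.cumsum(ys)
    # Partial sums for k in [1:n-1] must satisfy the inequality ...
    partial_ok = np.all(cx[:-1] <= cy[:-1])
    # ... and the full sums must be equal.
    return bool(partial_ok and np.isclose(cx[-1], cy[-1]))

def weakly_majorizes(y, x):
    """Return True if x ≺_w y: every partial sum satisfies the inequality."""
    xs = np.sort(x)[::-1]
    ys = np.sort(y)[::-1]
    return bool(np.all(np.cumsum(xs) <= np.cumsum(ys)))

# The uniform vector is majorized by every probability vector:
print(majorizes(np.array([0.5, 0.3, 0.2]), np.array([1/3, 1/3, 1/3])))  # True
```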
1.7 Supremum and Infimum
Let T be a subset of a poset (S,⪯). An element a is said to be a supremum of T, denoted by sup T, if

1) a∈S
2) b⪯a for all b∈T
3) a⪯c for any other upper bound c
a is said to be an infimum of T, denoted by inf T, if

1) a∈S
2) a⪯b for all b∈T
3) c⪯a for any other lower bound c
1.8 Lattice
Let a,b∈S; then inf{a,b} is also denoted by a∧b, called the meet of a and b, and sup{a,b} is denoted by a∨b, called the join of a and b. A poset (S,⪯) is called a lattice if a∧b and a∨b exist for all a,b∈S.
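A classic illustration (my own example, not from the text above): the positive integers ordered by divisibility form a lattice, with meet given by the gcd and join by the lcm:

```python
from math import gcd

def meet(a, b):
    # In the divisibility poset, inf{a, b} is the greatest common divisor.
    return gcd(a, b)

def join(a, b):
    # sup{a, b} is the least common multiple.
    return a * b // gcd(a, b)

print(meet(12, 18), join(12, 18))  # 6 36
```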
2 Linear Spaces
2.1 Linear Space
A set χ is said to be a linear space (or vector space) over a field F, if
The characteristic polynomial of A is defined to be

$$C_A(z) = \det(zI - A)$$

A complex number λ satisfying C_A(λ)=0 is called an eigenvalue of A, and a nonzero vector x∈Cn such that Ax=λx is called a right eigenvector of A corresponding to the eigenvalue λ.
3.2 Spectrum
The spectrum of A is the set of eigenvalues of A. The spectral radius ρ(A) is the maximum modulus of the eigenvalues of A, i.e., ρ(A)=max_i |λ_i|.
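A small NumPy sketch (assumed environment, my own example matrix) tying these two sections together: `np.linalg.eig` returns the eigenvalues and right eigenvectors, and the spectral radius is the largest modulus among the eigenvalues:

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])

# Eigenvalues are the roots of C_A(z) = det(zI - A); here (z-1)(z-3).
eigvals, eigvecs = np.linalg.eig(A)

# Each column of eigvecs is a right eigenvector: A x = lambda x.
for lam, x in zip(eigvals, eigvecs.T):
    assert np.allclose(A @ x, lam * x)

# Spectral radius: maximum modulus of the eigenvalues.
rho = max(abs(eigvals))
print(sorted(eigvals), rho)  # eigenvalues 1 and 3, so rho = 3
```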
3.3 Diagonalization
$$C_A(z) = (z-\lambda_1)^{n_1}(z-\lambda_2)^{n_2}\cdots(z-\lambda_l)^{n_l}$$

where n_i≥1 and ∑_{i=1}^{l} n_i=n; n_i is the algebraic multiplicity of λ_i.
Choosing an arbitrary basis from each ε̃_i to form P, and transforming A by P−1AP, we obtain a Jordan canonical form. We can also get the columns μ_1,μ_2,… of P from the Jordan chain equations:

$$A\mu_1 = \lambda\mu_1,\qquad A\mu_2 = \lambda\mu_2 + \mu_1,\qquad A\mu_3 = \lambda\mu_3 + \mu_2,\qquad \ldots$$
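A minimal numerical sanity check of the chain equations Aμ1=λμ1, Aμ2=λμ2+μ1 (my own construction: a 2×2 Jordan block J conjugated by a hand-picked invertible P):

```python
import numpy as np

lam = 2.0
# A single 2x2 Jordan block with eigenvalue lam.
J = np.array([[lam, 1.0],
              [0.0, lam]])
# An arbitrary invertible change of basis, chosen by hand for the example.
P = np.array([[1.0, 1.0],
              [0.0, 1.0]])
A = P @ J @ np.linalg.inv(P)

# The columns of P form the Jordan chain.
mu1, mu2 = P[:, 0], P[:, 1]
print(np.allclose(A @ mu1, lam * mu1))        # True: A mu1 = lam mu1
print(np.allclose(A @ mu2, lam * mu2 + mu1))  # True: A mu2 = lam mu2 + mu1
```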
3.5 QR Factorization
$$A_{n\times m} = QR$$

$$Q = [q_1\ q_2\ \cdots\ q_m],\qquad R = Q^* A$$
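In NumPy (assumed available; the matrix is my own example), `np.linalg.qr` computes the reduced factorization, and we can confirm that R = Q∗A:

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [3.0, 4.0],
              [5.0, 6.0]])

# Reduced QR: Q has orthonormal columns, R is upper triangular.
Q, R = np.linalg.qr(A)

print(np.allclose(Q.T @ Q, np.eye(2)))  # columns of Q are orthonormal
print(np.allclose(Q @ R, A))            # A = QR
print(np.allclose(R, np.triu(R)))       # R is upper triangular
print(np.allclose(R, Q.T @ A))          # R = Q* A (real case: Q* = Q^T)
```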
3.6 Schur Factorization
$$T_{n\times n} = U^* A_{n\times n} U$$

U: a unitary matrix
A: with eigenvalues λ1,...,λn
T: an upper triangular matrix (its diagonal entries are λ1,...,λn)
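NumPy has no Schur routine; assuming SciPy is available, `scipy.linalg.schur` with `output='complex'` returns an upper triangular T and unitary U as above (the matrix is my own example):

```python
import numpy as np
from scipy.linalg import schur  # SciPy assumed available

A = np.array([[4.0, 1.0],
              [2.0, 3.0]])

# Complex Schur form: A = U T U*, with U unitary and T upper triangular.
T, U = schur(A, output='complex')

print(np.allclose(U @ T @ U.conj().T, A))     # A = U T U*
print(np.allclose(U.conj().T @ U, np.eye(2))) # U is unitary
print(np.allclose(T, np.triu(T)))             # T is upper triangular
# The diagonal of T carries the eigenvalues of A (here 5 and 2).
print(np.allclose(np.sort(np.diag(T).real), np.sort(np.linalg.eigvals(A).real)))
```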
3.7 Singular Value Decomposition (SVD)

$$A_{m\times n} = U S_{m\times n} V^*$$

The left-singular vectors of A (columns of U) are a set of orthonormal eigenvectors of AA∗. The right-singular vectors of A (columns of V) are a set of orthonormal eigenvectors of A∗A. The diagonal entries of S are the square roots of the non-negative eigenvalues of both A∗A and AA∗, known as the singular values.
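A NumPy check (assumed environment, my own example matrix) of the relation between the singular values and the eigenvalues of A∗A:

```python
import numpy as np

A = np.array([[3.0, 0.0],
              [4.0, 5.0]])

# U has the left-singular vectors, Vh = V*, s holds the singular values.
U, s, Vh = np.linalg.svd(A)

# A = U S V*, with S the diagonal matrix of singular values.
print(np.allclose(U @ np.diag(s) @ Vh, A))

# Singular values are the square roots of the eigenvalues of A*A.
eig_AhA = np.sort(np.linalg.eigvalsh(A.conj().T @ A))[::-1]
print(np.allclose(s, np.sqrt(eig_AhA)))
```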
e.g. For a square matrix T
For a Hermitian matrix A, let λmin=λ1≤λ2≤...≤λn=λmax, let 1≤i1<i2<...<ik≤n be integers, and let xi1,xi2,...,xik be orthonormal vectors such that Axip=λipxip. Set S=span{xi1,xi2,...,xik}. Then for unit vectors x (x∗x=1) we have

a) λi1≤x∗Ax≤λik for x∈S
b) λmin≤x∗Ax≤λmax for x∈Cn
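A quick randomized check of the bound λmin≤x∗Ax≤λmax for unit vectors (my own sketch; the matrix is an arbitrary real symmetric, hence Hermitian, example):

```python
import numpy as np

rng = np.random.default_rng(0)

# A real symmetric (hence Hermitian) matrix with eigenvalues 1, 2, 4.
A = np.array([[2.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 2.0]])
eigs = np.linalg.eigvalsh(A)  # sorted ascending: lambda_min ... lambda_max
lam_min, lam_max = eigs[0], eigs[-1]

# For every unit vector x, lambda_min <= x* A x <= lambda_max.
for _ in range(100):
    x = rng.standard_normal(3)
    x /= np.linalg.norm(x)     # normalize so that x* x = 1
    r = x @ A @ x
    assert lam_min - 1e-12 <= r <= lam_max + 1e-12
print("x*Ax stays within [lambda_min, lambda_max] for all unit x")
```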
4.3 Hermitian Matrix
Hermitian Matrix: A=A∗
Skew-Hermitian Matrix: A=−A∗
Theorem: If A is a Hermitian matrix, then
a) x∗Ax is real for all x∈Cn.
b) The eigenvalues λ(A) are real.
c) S∗AS is Hermitian for any S.
5 Special Topics
5.1 Stochastic Matrix
A nonnegative matrix Sn×n is said to be a stochastic matrix if each of its row sums equals one. Equivalently, S satisfies Se=e with e=[1...1]T, so 1 is an eigenvalue of S with eigenvector e. Obviously, if S and T are stochastic, so is ST.
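A small check (my own example matrices) that Se=e holds and that the product of stochastic matrices is again stochastic:

```python
import numpy as np

S = np.array([[0.2, 0.8],
              [0.5, 0.5]])
T = np.array([[0.9, 0.1],
              [0.3, 0.7]])
e = np.ones(2)

# Row sums equal one <=> S e = e, so 1 is an eigenvalue with eigenvector e.
print(np.allclose(S @ e, e))

# The product is stochastic: (ST)e = S(Te) = Se = e, and ST is nonnegative.
print(np.allclose((S @ T) @ e, e))
print(np.all(S @ T >= 0))
```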