# Matrix norm

In mathematics, a matrix norm is a vector norm on a vector space whose elements (vectors) are matrices of given dimensions.

## Definition

Given a field ${\displaystyle K}$ of either real or complex numbers, and the vector space ${\displaystyle K^{m\times n}}$ of all matrices of size ${\displaystyle m\times n}$ (with ${\displaystyle m}$ rows and ${\displaystyle n}$ columns) with entries in the field ${\displaystyle K}$, a matrix norm is a norm on the vector space ${\displaystyle K^{m\times n}}$ (with individual norms denoted using double vertical bars such as ${\displaystyle \|A\|}$ [1]). Thus, a matrix norm is a function ${\displaystyle \|\cdot \|:K^{m\times n}\to \mathbb {R} }$ that must satisfy the following properties: [2] [3]

For all scalars ${\displaystyle \alpha \in K}$ and for all matrices ${\displaystyle A,B\in K^{m\times n}}$,

• ${\displaystyle \|\alpha A\|=|\alpha |\|A\|}$ (being absolutely homogeneous)
• ${\displaystyle \|A+B\|\leq \|A\|+\|B\|}$ (being sub-additive or satisfying the triangle inequality)
• ${\displaystyle \|A\|\geq 0}$ (being positive-valued)
• ${\displaystyle \|A\|=0\iff A=0_{m,n}}$ (being definite)

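Additionally, in the case of square matrices (matrices with m = n), some (but not all) matrix norms satisfy the following condition, which is related to the fact that matrices are more than just vectors: [2]

• ${\displaystyle \|AB\|\leq \|A\|\|B\|}$ for all matrices ${\displaystyle A}$ and ${\displaystyle B}$ in ${\displaystyle K^{n\times n}.}$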

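A matrix norm that satisfies this additional property is called a submultiplicative norm [4] [3] (in some books, the term matrix norm is used only for norms that are submultiplicative [5]). The set of all ${\displaystyle n\times n}$ matrices, together with such a submultiplicative norm, is an example of a Banach algebra.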

The definition of submultiplicativity is sometimes extended to non-square matrices, as in the case of the induced p-norm, where for ${\displaystyle A\in {K}^{m\times n}}$ and ${\displaystyle B\in {K}^{n\times k}}$ it holds that ${\displaystyle \|AB\|_{q}\leq \|A\|_{p}\|B\|_{q}}$. Here, ${\displaystyle \|\cdot \|_{p}}$ and ${\displaystyle \|\cdot \|_{q}}$ are the norms induced from ${\displaystyle K^{p}}$ and ${\displaystyle K^{q}}$, respectively, where p, q ≥ 1.

There are three types of matrix norms, which are discussed below:

• Matrix norms induced by vector norms,
• Entrywise matrix norms, and
• Schatten norms.

## Matrix norms induced by vector norms

Suppose a vector norm ${\displaystyle \|\cdot \|}$ is given on ${\displaystyle K^{n}}$ and on ${\displaystyle K^{m}}$. Any ${\displaystyle m\times n}$ matrix A induces a linear operator from ${\displaystyle K^{n}}$ to ${\displaystyle K^{m}}$ with respect to the standard basis, and one defines the corresponding induced norm or operator norm on the space ${\displaystyle K^{m\times n}}$ of all ${\displaystyle m\times n}$ matrices as follows:

${\displaystyle {\begin{aligned}\|A\|&=\sup\{\|Ax\|:x\in K^{n}{\text{ with }}\|x\|=1\}\\&=\sup \left\{{\frac {\|Ax\|}{\|x\|}}:x\in K^{n}{\text{ with }}x\neq 0\right\}.\end{aligned}}}$

In particular, if the p-norm for vectors (1 ≤ p ≤ ∞) is used for both spaces ${\displaystyle K^{n}}$ and ${\displaystyle K^{m}}$, then the corresponding induced operator norm is: [3]

${\displaystyle \|A\|_{p}=\sup _{x\neq 0}{\frac {\|Ax\|_{p}}{\|x\|_{p}}}.}$
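The supremum in this definition can be probed numerically. The following is a minimal sketch in plain Python, not a recommended algorithm: it samples random nonzero vectors and keeps the largest ratio seen, so it can only give a lower bound on the induced norm. The helper names (`vector_p_norm`, `mat_vec`, `induced_norm_lower_bound`), the sampling range, and the number of trials are illustrative assumptions; the matrix is the worked example used elsewhere in this article.

```python
import random

def vector_p_norm(x, p):
    """p-norm of a vector given as a list of numbers (p >= 1 or float('inf'))."""
    if p == float("inf"):
        return max(abs(v) for v in x)
    return sum(abs(v) ** p for v in x) ** (1.0 / p)

def mat_vec(A, x):
    """Product Ax for a matrix stored as a list of rows."""
    return [sum(a * v for a, v in zip(row, x)) for row in A]

def induced_norm_lower_bound(A, p, trials=2000, seed=0):
    """Largest ratio ||Ax||_p / ||x||_p seen over random nonzero x.

    Sampling can only under-estimate the supremum, so this is a lower bound.
    """
    rng = random.Random(seed)
    n = len(A[0])
    best = 0.0
    for _ in range(trials):
        x = [rng.uniform(-1.0, 1.0) for _ in range(n)]
        nx = vector_p_norm(x, p)
        if nx == 0.0:
            continue  # the definition excludes x = 0
        best = max(best, vector_p_norm(mat_vec(A, x), p) / nx)
    return best

A = [[-3, 5, 7], [2, 6, 4], [0, 2, 8]]
estimate_1 = induced_norm_lower_bound(A, 1)
estimate_inf = induced_norm_lower_bound(A, float("inf"))
```

For this matrix the exact induced norms are 19 (p = 1) and 15 (p = ∞), so both estimates should stay at or below those values.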

These induced norms are different from the "entrywise" p-norms and the Schatten p-norms for matrices treated below, which are also usually denoted by ${\displaystyle \|A\|_{p}.}$

Note: The above description pertains to the induced operator norm when the same vector norm was used in the "departure space" ${\displaystyle K^{n}}$ and the "arrival space" ${\displaystyle K^{m}}$ of the operator ${\displaystyle A\in K^{m\times n}}$. This is not a necessary restriction. More generally, given a norm ${\displaystyle \|\cdot \|_{\alpha }}$ on ${\displaystyle K^{n}}$ and a norm ${\displaystyle \|\cdot \|_{\beta }}$ on ${\displaystyle K^{m}}$, one can define a matrix norm on ${\displaystyle K^{m\times n}}$ induced by these norms:
${\displaystyle \|A\|_{\alpha ,\beta }=\max _{x\neq 0}{\frac {\|Ax\|_{\beta }}{\|x\|_{\alpha }}}.}$
The matrix norm ${\displaystyle \|A\|_{\alpha ,\beta }}$ is sometimes called a subordinate norm. Subordinate norms are consistent with the norms that induce them, giving
${\displaystyle \|Ax\|_{\beta }\leq \|A\|_{\alpha ,\beta }\|x\|_{\alpha }.}$

Any induced operator norm is a submultiplicative matrix norm: ${\displaystyle \|AB\|\leq \|A\|\|B\|;}$ this follows from

${\displaystyle \|ABx\|\leq \|A\|\|Bx\|\leq \|A\|\|B\|\|x\|}$

and

${\displaystyle \max _{\|x\|=1}\|ABx\|=\|AB\|.}$

Moreover, any induced norm satisfies the inequality

${\displaystyle \|A^{r}\|^{1/r}\geq \rho (A)\quad }$ (1)

where ρ(A) is the spectral radius of A. For symmetric or Hermitian A, we have equality in (1) for the 2-norm, since in this case the 2-norm is precisely the spectral radius of A. For an arbitrary matrix, we may not have equality for any norm; a counterexample would be

${\displaystyle A={\begin{bmatrix}0&1\\0&0\end{bmatrix}},}$

which has vanishing spectral radius. In any case, for square matrices we have the spectral radius formula:

${\displaystyle \lim _{r\to \infty }\|A^{r}\|^{1/r}=\rho (A).}$
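Both the counterexample and the limit can be observed numerically. The sketch below uses plain Python and the induced 1-norm (maximum absolute column sum); the helper names and the triangular matrix `T`, chosen because both of its eigenvalues are 0.5, are illustrative assumptions rather than part of the source.

```python
def mat_mul(A, B):
    """Product of two matrices stored as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def norm_1(A):
    """Induced 1-norm: maximum absolute column sum."""
    return max(sum(abs(row[j]) for row in A) for j in range(len(A[0])))

def gelfand_sequence(A, r_max):
    """Return the list of ||A^r||_1^(1/r) for r = 1, ..., r_max."""
    values, power = [], A
    for r in range(1, r_max + 1):
        values.append(norm_1(power) ** (1.0 / r))
        power = mat_mul(power, A)
    return values

# Nilpotent counterexample from the text: rho(N) = 0 but ||N||_1 = 1.
N = [[0.0, 1.0], [0.0, 0.0]]
nilpotent_values = gelfand_sequence(N, 3)  # N^2 = 0, so later terms are 0

# Upper triangular matrix with both eigenvalues 0.5, so rho(T) = 0.5.
T = [[0.5, 1.0], [0.0, 0.5]]
triangular_values = gelfand_sequence(T, 200)
```

For `T` the sequence starts at 1.5 and decreases slowly toward 0.5, staying above the spectral radius as inequality (1) requires.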

### Special cases

In the special cases of ${\displaystyle p=1,2,\infty ,}$ the induced matrix norms can be computed or estimated by

${\displaystyle \|A\|_{1}=\max _{1\leq j\leq n}\sum _{i=1}^{m}|a_{ij}|,}$

which is simply the maximum absolute column sum of the matrix;

${\displaystyle \|A\|_{\infty }=\max _{1\leq i\leq m}\sum _{j=1}^{n}|a_{ij}|,}$

which is simply the maximum absolute row sum of the matrix;

${\displaystyle \|A\|_{2}=\sigma _{\max }(A),}$

where ${\displaystyle \sigma _{\max }(A)}$ denotes the largest singular value of the matrix ${\displaystyle A}$. There is an important inequality for the case ${\displaystyle p=2}$:

${\displaystyle \|A\|_{2}=\sigma _{\max }(A)\leq \|A\|_{\rm {F}}=\left(\sum _{i=1}^{m}\sum _{j=1}^{n}|a_{ij}|^{2}\right)^{\frac {1}{2}},}$

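where ${\displaystyle \|A\|_{\rm {F}}}$ is the Frobenius norm. Equality holds if and only if the matrix ${\displaystyle A}$ is a rank-one matrix or a zero matrix. This inequality can be derived from the fact that the trace of a matrix is equal to the sum of its eigenvalues.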

When ${\displaystyle p=2}$ we have an equivalent definition for ${\displaystyle \|A\|_{2}}$ as ${\displaystyle \sup\{x^{T}Ay:x\in K^{m},y\in K^{n}{\text{ with }}\|x\|_{2}=\|y\|_{2}=1\}}$. It can be shown to be equivalent to the above definitions using the Cauchy–Schwarz inequality.

For example, for

${\displaystyle A={\begin{bmatrix}-3&5&7\\2&6&4\\0&2&8\\\end{bmatrix}},}$

we have that

${\displaystyle \|A\|_{1}=\max(|{-3}|+2+0;5+6+2;7+4+8)=\max(5,13,19)=19,}$
${\displaystyle \|A\|_{\infty }=\max(|{-3}|+5+7;2+6+4;0+2+8)=\max(15,12,10)=15.}$
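The same arithmetic can be reproduced with a few lines of plain Python. This is a minimal sketch with the matrix stored as a list of rows; the function names `norm_1` and `norm_inf` are illustrative.

```python
def norm_1(A):
    """Induced 1-norm: maximum absolute column sum."""
    return max(sum(abs(row[j]) for row in A) for j in range(len(A[0])))

def norm_inf(A):
    """Induced infinity-norm: maximum absolute row sum."""
    return max(sum(abs(v) for v in row) for row in A)

A = [[-3, 5, 7],
     [2, 6, 4],
     [0, 2, 8]]

print(norm_1(A))    # 19
print(norm_inf(A))  # 15
```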

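In the special case of ${\displaystyle p=2}$ (the Euclidean norm or ${\displaystyle \ell _{2}}$-norm for vectors), the induced matrix norm is the spectral norm. The spectral norm of a matrix ${\displaystyle A}$ is the largest singular value of ${\displaystyle A}$ (i.e., the square root of the largest eigenvalue of the matrix ${\displaystyle A^{*}A}$, where ${\displaystyle A^{*}}$ denotes the conjugate transpose of ${\displaystyle A}$): [6]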

${\displaystyle \|A\|_{2}={\sqrt {\lambda _{\max }\left(A^{*}A\right)}}=\sigma _{\max }(A).}$

In this case, ${\displaystyle \|A^{*}A\|_{2}=\|AA^{*}\|_{2}=\|A\|_{2}^{2}}$, since ${\displaystyle \|A^{*}A\|_{2}=\sigma _{\max }(A^{*}A)=\sigma _{\max }(A)^{2}=\|A\|_{2}^{2}}$ and similarly ${\displaystyle \|AA^{*}\|_{2}=\|A\|_{2}^{2}}$ by the singular value decomposition (SVD).
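These identities for the spectral norm can be checked numerically. The sketch below assumes NumPy is available and reuses the example matrix from above; it compares the largest singular value, the square root of the largest eigenvalue of A*A, and NumPy's built-in induced 2-norm, and also evaluates the Frobenius bound. Variable names are illustrative.

```python
import numpy as np

A = np.array([[-3.0, 5.0, 7.0],
              [2.0, 6.0, 4.0],
              [0.0, 2.0, 8.0]])

# Largest singular value (svd returns singular values in descending order).
sigma_max = np.linalg.svd(A, compute_uv=False)[0]

# Square root of the largest eigenvalue of the Hermitian matrix A* A.
gram = A.conj().T @ A
lambda_max = np.linalg.eigvalsh(gram)[-1]  # eigvalsh returns ascending order
spectral_from_eig = np.sqrt(lambda_max)

# NumPy's built-in induced 2-norm, for comparison.
spectral_builtin = np.linalg.norm(A, 2)

# ||A* A||_2 should equal ||A||_2 squared.
gram_norm = np.linalg.norm(gram, 2)

# The Frobenius norm bounds the spectral norm from above.
frobenius = np.linalg.norm(A, "fro")
```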

## "Entrywise" matrix norms

These norms treat an ${\displaystyle m\times n}$ matrix as a vector of size ${\displaystyle m\cdot n}$, and use one of the familiar vector norms. For example, using the p-norm for vectors, p ≥ 1, we get:

${\displaystyle \|A\|_{p,p}=\|\mathrm {vec} (A)\|_{p}=\left(\sum _{i=1}^{m}\sum _{j=1}^{n}|a_{ij}|^{p}\right)^{1/p}}$

This is a different norm from the induced p-norm (see above) and the Schatten p-norm (see below), but the notation is the same.

The special case p = 2 is the Frobenius norm, and p = ∞ yields the maximum norm.

### L2,1 and Lp,q norms

Let ${\displaystyle (a_{1},\ldots ,a_{n})}$ be the columns of matrix ${\displaystyle A}$. The ${\displaystyle L_{2,1}}$ norm[7] is the sum of the Euclidean norms of the columns of the matrix:

${\displaystyle \|A\|_{2,1}=\sum _{j=1}^{n}\|a_{j}\|_{2}=\sum _{j=1}^{n}\left(\sum _{i=1}^{m}|a_{ij}|^{2}\right)^{\frac {1}{2}}}$

The ${\displaystyle L_{2,1}}$ norm as an error function is more robust, since the error for each data point (a column) is not squared. It is used in robust data analysis and sparse coding.

For p, q ≥ 1, the ${\displaystyle L_{2,1}}$ norm can be generalized to the ${\displaystyle L_{p,q}}$ norm as follows:

${\displaystyle \|A\|_{p,q}=\left(\sum _{j=1}^{n}\left(\sum _{i=1}^{m}|a_{ij}|^{p}\right)^{\frac {q}{p}}\right)^{\frac {1}{q}}.}$
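As an illustration of the formula, the following plain-Python sketch first takes the p-norm of every column and then the q-norm of the resulting vector of column norms. The function name `lpq_norm` and the small test matrix are illustrative assumptions, and only finite p and q are handled.

```python
def lpq_norm(A, p, q):
    """L_{p,q} norm: q-norm of the vector of column-wise p-norms (finite p, q >= 1)."""
    n_cols = len(A[0])
    column_norms = [
        sum(abs(row[j]) ** p for row in A) ** (1.0 / p) for j in range(n_cols)
    ]
    return sum(c ** q for c in column_norms) ** (1.0 / q)

A = [[3, 0],
     [4, 5]]  # columns (3, 4) and (0, 5), each of Euclidean length 5

l21 = lpq_norm(A, 2, 1)        # 5 + 5
frobenius = lpq_norm(A, 2, 2)  # sqrt(9 + 16 + 25)
```

With p = q the result reduces to the entrywise p-norm, so `lpq_norm(A, 2, 2)` gives the Frobenius norm.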

### Frobenius norm

When p = q = 2 for the ${\displaystyle L_{p,q}}$ norm, it is called the Frobenius norm or the Hilbert–Schmidt norm, though the latter term is used more frequently in the context of operators on (possibly infinite-dimensional) Hilbert space. This norm can be defined in various ways:

${\displaystyle \|A\|_{\text{F}}={\sqrt {\sum _{i=1}^{m}\sum _{j=1}^{n}|a_{ij}|^{2}}}={\sqrt {\operatorname {trace} \left(A^{*}A\right)}}={\sqrt {\sum _{i=1}^{\min\{m,n\}}\sigma _{i}^{2}(A)}},}$

where ${\displaystyle \sigma _{i}(A)}$ are the singular values of ${\displaystyle A}$. Recall that the trace function returns the sum of diagonal entries of a square matrix.
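The three expressions can be compared numerically. This sketch assumes NumPy is available; the complex test matrix is an arbitrary illustrative choice whose squared absolute entries sum to 20.

```python
import numpy as np

A = np.array([[1.0 + 2.0j, 0.0, -1.0],
              [3.0, 1.0j, 2.0]])

# Definition 1: square root of the sum of squared absolute entries.
from_entries = np.sqrt(np.sum(np.abs(A) ** 2))

# Definition 2: square root of trace(A* A); the trace is real up to rounding.
from_trace = np.sqrt(np.trace(A.conj().T @ A).real)

# Definition 3: square root of the sum of squared singular values.
singular_values = np.linalg.svd(A, compute_uv=False)
from_singular_values = np.sqrt(np.sum(singular_values ** 2))
```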

The Frobenius norm is an extension of the Euclidean norm to ${\displaystyle K^{n\times n}}$ and comes from the Frobenius inner product on the space of all matrices.

The Frobenius norm is submultiplicative and is very useful for numerical linear algebra. The submultiplicativity of the Frobenius norm can be proved using the Cauchy–Schwarz inequality.
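A sketch of that argument, written for ${\displaystyle A\in K^{m\times n}}$ and ${\displaystyle B\in K^{n\times l}}$ (the index names are chosen here for illustration): applying the Cauchy–Schwarz inequality to each entry of the product gives

${\displaystyle |(AB)_{ij}|^{2}=\left|\sum _{k=1}^{n}a_{ik}b_{kj}\right|^{2}\leq \left(\sum _{k=1}^{n}|a_{ik}|^{2}\right)\left(\sum _{k=1}^{n}|b_{kj}|^{2}\right),}$

and summing over all ${\displaystyle i}$ and ${\displaystyle j}$ yields

${\displaystyle \|AB\|_{\text{F}}^{2}=\sum _{i=1}^{m}\sum _{j=1}^{l}|(AB)_{ij}|^{2}\leq \left(\sum _{i=1}^{m}\sum _{k=1}^{n}|a_{ik}|^{2}\right)\left(\sum _{k=1}^{n}\sum _{j=1}^{l}|b_{kj}|^{2}\right)=\|A\|_{\text{F}}^{2}\|B\|_{\text{F}}^{2}.}$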

The Frobenius norm is often easier to compute than induced norms, and has the useful property of being invariant under rotations (and unitary operations in general). That is, ${\displaystyle \|A\|_{\text{F}}=\|AU\|_{\text{F}}=\|UA\|_{\text{F}}}$ for any unitary matrix ${\displaystyle U}$. This property follows from the cyclic nature of the trace (${\displaystyle \operatorname {trace} (XYZ)=\operatorname {trace} (ZXY)}$):

${\displaystyle \|AU\|_{\text{F}}^{2}=\operatorname {trace} \left((AU)^{*}AU\right)=\operatorname {trace} \left(U^{*}A^{*}AU\right)=\operatorname {trace} \left(UU^{*}A^{*}A\right)=\operatorname {trace} \left(A^{*}A\right)=\|A\|_{\text{F}}^{2},}$

and analogously:

${\displaystyle \|UA\|_{\text{F}}^{2}=\operatorname {trace} \left((UA)^{*}UA\right)=\operatorname {trace} \left(A^{*}U^{*}UA\right)=\operatorname {trace} \left(A^{*}A\right)=\|A\|_{\text{F}}^{2},}$

where we have used the unitary nature of ${\displaystyle U}$ (that is, ${\displaystyle U^{*}U=UU^{*}=\mathbf {I} }$).
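The invariance can be checked on a small case. The sketch below uses plain Python and a real 2×2 rotation matrix, which is a special case of a unitary matrix; the angle, the test matrix, and the helper names are illustrative assumptions.

```python
import math

def mat_mul(A, B):
    """Product of two matrices stored as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def frobenius(A):
    """Square root of the sum of squared absolute entries."""
    return math.sqrt(sum(abs(v) ** 2 for row in A for v in row))

theta = 0.7  # arbitrary rotation angle in radians
U = [[math.cos(theta), -math.sin(theta)],
     [math.sin(theta), math.cos(theta)]]
A = [[1.0, 2.0],
     [3.0, 4.0]]

norm_A = frobenius(A)               # sqrt(1 + 4 + 9 + 16) = sqrt(30)
norm_AU = frobenius(mat_mul(A, U))  # right multiplication by the rotation
norm_UA = frobenius(mat_mul(U, A))  # left multiplication by the rotation
```

All three values agree up to floating-point rounding.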

It also satisfies

${\displaystyle \|A^{*}A\|_{\text{F}}=\|AA^{*}\|_{\text{F}}\leq \|A\|_{\text{F}}^{2}}$

and

${\displaystyle \|A+B\|_{\text{F}}^{2}=\|A\|_{\text{F}}^{2}+\|B\|_{\text{F}}^{2}+2\langle A,B\rangle _{\text{F}},}$

where ${\displaystyle \langle A,B\rangle _{\text{F}}}$ is the Frobenius inner product.

### Max norm

The max norm is the elementwise norm with p = q = ∞:

${\displaystyle \|A\|_{\max }=\max _{ij}|a_{ij}|.}$

This norm is not submultiplicative.
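A small counterexample makes this concrete. In the plain-Python sketch below, the all-ones 2×2 matrix `J` (an illustrative choice) has max norm 1, while its square has max norm 2, so the inequality ‖AB‖ ≤ ‖A‖‖B‖ fails.

```python
def mat_mul(A, B):
    """Product of two matrices stored as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def max_norm(A):
    """Largest absolute entry of the matrix."""
    return max(abs(v) for row in A for v in row)

J = [[1, 1],
     [1, 1]]
JJ = mat_mul(J, J)               # [[2, 2], [2, 2]]

lhs = max_norm(JJ)               # 2
rhs = max_norm(J) * max_norm(J)  # 1
print(lhs > rhs)                 # True: submultiplicativity is violated
```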

Note that in some literature (such as communication complexity), an alternative definition of max-norm, also called the ${\displaystyle \gamma _{2}}$-norm, refers to the factorization norm:

${\displaystyle \gamma _{2}(A)=\min _{U,V:A=UV^{T}}\|U\|_{2,\infty }\|V\|_{2,\infty }=\min _{U,V:A=UV^{T}}\max _{i,j}\|U_{i,:}\|_{2}\|V_{j,:}\|_{2}}$

## Schatten norms

The Schatten p-norms arise when applying the p-norm to the vector of singular values of a matrix.[3] If the singular values of the ${\displaystyle m\times n}$ matrix ${\displaystyle A}$ are denoted by σi, then the Schatten p-norm is defined by

${\displaystyle \|A\|_{p}=\left(\sum _{i=1}^{\min\{m,n\}}\sigma _{i}^{p}(A)\right)^{\frac {1}{p}}.}$

These norms again share the notation with the induced and entrywise p-norms, but they are different.

All Schatten norms are submultiplicative. They are also unitarily invariant, which means that ${\displaystyle \|A\|=\|UAV\|}$ for all matrices ${\displaystyle A}$ and all unitary matrices ${\displaystyle U}$ and ${\displaystyle V}$.

The most familiar cases are p = 1, 2, ∞. The case p = 2 yields the Frobenius norm, introduced before. The case p = ∞ yields the spectral norm, which is the operator norm induced by the vector 2-norm (see above). Finally, p = 1 yields the nuclear norm (also known as the trace norm, or the Ky Fan n-norm[8]), defined as

${\displaystyle \|A\|_{*}=\operatorname {trace} \left({\sqrt {A^{*}A}}\right)=\sum _{i=1}^{\min\{m,n\}}\sigma _{i}(A),}$

where ${\displaystyle {\sqrt {A^{*}A}}}$ denotes a positive semidefinite matrix ${\displaystyle B}$ such that ${\displaystyle BB=A^{*}A}$. More precisely, since ${\displaystyle A^{*}A}$ is a positive semidefinite matrix, its square root is well-defined. The nuclear norm ${\displaystyle \|A\|_{*}}$ is a convex envelope of the rank function ${\displaystyle {\text{rank}}(A)}$, so it is often used in mathematical optimization to search for low rank matrices.
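The three familiar cases can be computed directly from the singular values. The sketch below assumes NumPy is available; `schatten_norm` is a hypothetical helper written for this illustration, and the results are compared against NumPy's own `"nuc"`, `"fro"`, and 2-norm options.

```python
import numpy as np

def schatten_norm(A, p):
    """Schatten p-norm: the vector p-norm of the singular values of A."""
    sigma = np.linalg.svd(A, compute_uv=False)
    if np.isinf(p):
        return sigma.max()
    return np.sum(sigma ** p) ** (1.0 / p)

A = np.array([[-3.0, 5.0, 7.0],
              [2.0, 6.0, 4.0],
              [0.0, 2.0, 8.0]])

nuclear = schatten_norm(A, 1)        # sum of singular values
frobenius = schatten_norm(A, 2)      # matches the entrywise Frobenius norm
spectral = schatten_norm(A, np.inf)  # largest singular value
```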

## Consistent norms

A matrix norm ${\displaystyle \|\cdot \|}$ on ${\displaystyle K^{m\times n}}$ is called consistent with a vector norm ${\displaystyle \|\cdot \|_{a}}$ on ${\displaystyle K^{n}}$ and a vector norm ${\displaystyle \|\cdot \|_{b}}$ on ${\displaystyle K^{m}}$, if:

${\displaystyle \|Ax\|_{b}\leq \|A\|\|x\|_{a}}$

for all ${\displaystyle A\in K^{m\times n},x\in K^{n}}$. All induced norms are consistent by definition.

## Compatible norms

A matrix norm ${\displaystyle \|\cdot \|}$ on ${\displaystyle K^{n\times n}}$ is called compatible with a vector norm ${\displaystyle \|\cdot \|_{a}}$ on ${\displaystyle K^{n}}$, if:

${\displaystyle \|Ax\|_{a}\leq \|A\|\|x\|_{a}}$

for all ${\displaystyle A\in K^{n\times n},x\in K^{n}}$. Induced norms are compatible with the inducing vector norm by definition.

## Equivalence of norms

For any two matrix norms ${\displaystyle \|\cdot \|_{\alpha }}$ and ${\displaystyle \|\cdot \|_{\beta }}$, we have that:

${\displaystyle r\|A\|_{\alpha }\leq \|A\|_{\beta }\leq s\|A\|_{\alpha }}$

for some positive numbers r and s, for all matrices ${\displaystyle A\in K^{m\times n}}$. In other words, all norms on ${\displaystyle K^{m\times n}}$ are equivalent; they induce the same topology on ${\displaystyle K^{m\times n}}$. This is true because the vector space ${\displaystyle K^{m\times n}}$ has the finite dimension ${\displaystyle m\times n}$.

Moreover, for every vector norm ${\displaystyle \|\cdot \|}$ on ${\displaystyle \mathbb {R} ^{n\times n}}$, there exists a unique positive real number ${\displaystyle k}$ such that ${\displaystyle l\|\cdot \|}$ is a submultiplicative matrix norm for every ${\displaystyle l\geq k}$.

A submultiplicative matrix norm ${\displaystyle \|\cdot \|_{\alpha }}$ is said to be minimal, if there exists no other submultiplicative matrix norm ${\displaystyle \|\cdot \|_{\beta }}$ satisfying ${\displaystyle \|\cdot \|_{\beta }<\|\cdot \|_{\alpha }}$.

### Examples of norm equivalence

Let ${\displaystyle \|A\|_{p}}$ once again refer to the norm induced by the vector p-norm (as above in the Induced Norm section).

For matrix ${\displaystyle A\in \mathbb {R} ^{m\times n}}$ of rank ${\displaystyle r}$, the following inequalities hold:[9][10]

• ${\displaystyle \|A\|_{2}\leq \|A\|_{F}\leq {\sqrt {r}}\|A\|_{2}}$
• ${\displaystyle \|A\|_{F}\leq \|A\|_{*}\leq {\sqrt {r}}\|A\|_{F}}$
• ${\displaystyle \|A\|_{\max }\leq \|A\|_{2}\leq {\sqrt {mn}}\|A\|_{\max }}$
• ${\displaystyle {\frac {1}{\sqrt {n}}}\|A\|_{\infty }\leq \|A\|_{2}\leq {\sqrt {m}}\|A\|_{\infty }}$
• ${\displaystyle {\frac {1}{\sqrt {m}}}\|A\|_{1}\leq \|A\|_{2}\leq {\sqrt {n}}\|A\|_{1}.}$

Another useful inequality between matrix norms is

${\displaystyle \|A\|_{2}\leq {\sqrt {\|A\|_{1}\|A\|_{\infty }}},}$

which is a special case of Hölder's inequality.
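The inequalities in this section can be spot-checked on a random matrix. The sketch below assumes NumPy is available; the matrix size, the seed, and the small tolerance `tol` that absorbs rounding error are illustrative choices. A passing check on one matrix is only a sanity test, not a proof.

```python
import numpy as np

rng = np.random.default_rng(1)
m, n = 4, 6
A = rng.standard_normal((m, n))
r = np.linalg.matrix_rank(A)

n1 = np.linalg.norm(A, 1)         # induced 1-norm
n2 = np.linalg.norm(A, 2)         # induced 2-norm (spectral norm)
ninf = np.linalg.norm(A, np.inf)  # induced infinity-norm
nfro = np.linalg.norm(A, "fro")   # Frobenius norm
nnuc = np.linalg.norm(A, "nuc")   # nuclear norm
nmax = np.max(np.abs(A))          # max norm

tol = 1e-12  # slack for floating-point rounding
checks = {
    "2 <= F <= sqrt(r) * 2": n2 <= nfro + tol and nfro <= np.sqrt(r) * n2 + tol,
    "F <= nuc <= sqrt(r) * F": nfro <= nnuc + tol and nnuc <= np.sqrt(r) * nfro + tol,
    "max <= 2 <= sqrt(mn) * max": nmax <= n2 + tol and n2 <= np.sqrt(m * n) * nmax + tol,
    "inf / sqrt(n) <= 2 <= sqrt(m) * inf": ninf / np.sqrt(n) <= n2 + tol and n2 <= np.sqrt(m) * ninf + tol,
    "1 / sqrt(m) <= 2 <= sqrt(n) * 1": n1 / np.sqrt(m) <= n2 + tol and n2 <= np.sqrt(n) * n1 + tol,
    "2 <= sqrt(1 * inf)": n2 <= np.sqrt(n1 * ninf) + tol,
}

for name, ok in checks.items():
    print(name, bool(ok))
```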

## References

1. ^ "Comprehensive List of Algebra Symbols". Math Vault. 2020-03-25. Retrieved 2020-08-24.
2. ^ a b Weisstein, Eric W. "Matrix Norm". mathworld.wolfram.com. Retrieved 2020-08-24.
3. ^ a b c d "Matrix norms". fourier.eng.hmc.edu. Retrieved 2020-08-24.
4. ^ Malek-Shahmirzadi, Massoud (1983). "A characterization of certain classes of matrix norms". Linear and Multilinear Algebra. 13 (2): 97–99. doi:10.1080/03081088308817508. ISSN 0308-1087.
5. ^ Horn, Roger A. (2012). Matrix analysis. Johnson, Charles R. (2nd ed.). Cambridge: Cambridge University Press. pp. 340–341. ISBN 978-1-139-77600-4. OCLC 817236655.
6. ^ Carl D. Meyer, Matrix Analysis and Applied Linear Algebra, §5.2, p.281, Society for Industrial & Applied Mathematics, June 2000.
7. ^ Ding, Chris; Zhou, Ding; He, Xiaofeng; Zha, Hongyuan (June 2006). "R1-PCA: Rotational Invariant L1-norm Principal Component Analysis for Robust Subspace Factorization". Proceedings of the 23rd International Conference on Machine Learning. ICML '06. Pittsburgh, Pennsylvania, USA: ACM. pp. 281–288. doi:10.1145/1143844.1143880. ISBN 1-59593-383-2.
8. ^ Fan, Ky. (1951). "Maximum properties and inequalities for the eigenvalues of completely continuous operators". Proceedings of the National Academy of Sciences of the United States of America. 37 (11): 760–766. Bibcode:1951PNAS...37..760F. doi:10.1073/pnas.37.11.760. PMC 1063464. PMID 16578416.
9. ^ Golub, Gene; Charles F. Van Loan (1996). Matrix Computations – Third Edition. Baltimore: The Johns Hopkins University Press, 56–57. ISBN 0-8018-5413-X.
10. ^ Roger Horn and Charles Johnson. Matrix Analysis, Chapter 5, Cambridge University Press, 1985. ISBN 0-521-38632-2.

## Bibliography

• James W. Demmel, Applied Numerical Linear Algebra, section 1.7, published by SIAM, 1997.
• Carl D. Meyer, Matrix Analysis and Applied Linear Algebra, published by SIAM, 2000.
• John Watrous, Theory of Quantum Information, 2.3 Norms of operators, lecture notes, University of Waterloo, 2011.
• Kendall Atkinson, An Introduction to Numerical Analysis, published by John Wiley & Sons, 1989.