This page is divided into the following sections.
\begin{equation}{\label{a}}\tag{A}\mbox{}\end{equation}
Eigenspaces.
Definition. Let \(V\) be a vector space over the scalar field \({\cal F}\), and let \(T:V\rightarrow V\) be a linear operator on \(V\). An element \(v\in V\) is called an eigenvector of \(T\) when there exists \(\lambda\in {\cal F}\) satisfying \(T(v)=\lambda v\). \(\sharp\)
Suppose that \(v\neq\theta\) and \(v\) is an eigenvector of a linear operator \(T\). Then, there exists \(\lambda\in {\cal F}\) such that \(T(v)=\lambda v\). If there exists another \(\lambda^{*}\in {\cal F}\) satisfying \(T(v)=\lambda^{*}v\), then we have \(\lambda v-\lambda^{*}v=\theta\). This says that \(\lambda =\lambda^{*}\), since \(v\neq\theta\). In this case, the scalar \(\lambda\) is unique and is called the eigenvalue of \(T\) with respect to the eigenvector \(v\). We can also say that \(v\) is an eigenvector with eigenvalue \(\lambda\).
If \(A\) is an \(n\times n\) square matrix with coefficients in \({\cal F}\), then an eigenvector of \(A\) is defined as an eigenvector of the linear operator \(T:{\cal F}^{n}\rightarrow {\cal F}^{n}\) represented by this matrix relative to the standard basis. Therefore, an eigenvector \(X\) of \(A\) is a column vector in \({\cal F}^{n}\) for which there exists \(\lambda\in {\cal F}\) satisfying \(AX=\lambda X\).
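For instance, consider the matrix
\[A=\left [\begin{array}{cc}
2 & 0\\ 0 & 3
\end{array}\right ]\]
over \(\mathbb{R}\). The standard basis vectors are eigenvectors of \(A\), since
\[A\left [\begin{array}{c}
1 \\ 0
\end{array}\right ]=2\left [\begin{array}{c}
1 \\ 0
\end{array}\right ]\mbox{ and }A\left [\begin{array}{c}
0 \\ 1
\end{array}\right ]=3\left [\begin{array}{c}
0 \\ 1
\end{array}\right ].\]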
Example. Let \(V\) be a vector space over the scalar field \({\cal F}\), and let \(T\) be a linear operator on \(V\). If \(v\) is an eigenvector with eigenvalue \(\lambda\), then, for any \(\alpha\in {\cal F}\), we have
\begin{align*} T(\alpha v) & =\alpha T(v)\\ & =\alpha\lambda v\\ & =\lambda (\alpha v),\end{align*}
which says that \(\alpha v\) is also an eigenvector with the same eigenvalue \(\lambda\). \(\sharp\)
\begin{equation}{\label{lap30}}\mbox{}\tag{1}\end{equation}
Proposition \ref{lap30}. Let \(V\) be a vector space over the scalar field \({\cal F}\), and let \(T\) be a linear operator on \(V\) with eigenvalue \(\lambda\). We define the set
\[V_{\lambda}=\left\{v\in V:T(v)=\lambda v\right\},\]
which consists of all eigenvectors of \(T\) having the same eigenvalue \(\lambda\). Then \(V_{\lambda}\) is a subspace of \(V\).
Proof. Let \(v_{1},v_{2}\in V_{\lambda}\) and \(\alpha\in {\cal F}\). Then, we have
\begin{align*} T(v_{1}+v_{2}) & =T(v_{1})+T(v_{2})\\ & =\lambda v_{1}+\lambda v_{2}\\ & =\lambda (v_{1}+v_{2})\end{align*}
and
\begin{align*} T(\alpha v_{1}) & =\alpha T(v_{1})\\ & =\alpha\lambda v_{1}\\ & =\lambda (\alpha v_{1}).\end{align*}
This completes the proof. \(\blacksquare\)
The subspace \(V_{\lambda}\) in Proposition \ref{lap30} is also called the eigenspace of \(T\) with eigenvalue \(\lambda\).
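Note that \(V_{\lambda}\) can also be described as a kernel:
\[V_{\lambda}=\left\{v\in V:(T-\lambda I)(v)=\theta\right\}=\mbox{Ker}(T-\lambda I).\]
This description gives another way to see that \(V_{\lambda}\) is a subspace of \(V\), and it will reappear later in the form \(W_{i}=\mbox{Ker}(T-\lambda_{i}I)\).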
\begin{equation}{\label{lap217}}\tag{2}\mbox{}\end{equation}
Proposition \ref{lap217}. Let \(V\) be a vector space over the scalar field \({\cal F}\), and let \(T\) be a linear operator on \(V\). Let \(v_{1},\cdots ,v_{m}\) be non-zero eigenvectors of \(T\) with the corresponding eigenvalues \(\lambda_{1},\cdots ,\lambda_{m}\). If \(\lambda_{i}\neq\lambda_{j}\) for \(i\neq j\), then \(v_{1},\cdots ,v_{m}\) are linearly independent.
Proof. We are going to prove it by induction on \(m\). For \(m=1\), an element \(v_{1}\in V\) with \(v_{1}\neq\theta\) is linearly independent. For the induction step, suppose that any \(m-1\) non-zero eigenvectors corresponding to distinct eigenvalues are linearly independent; in particular, \(\{v_{2},\cdots ,v_{m}\}\) is linearly independent. If
\begin{equation}{\label{laeq31}}\tag{3}
\alpha_{1}v_{1}+\cdots +\alpha_{m}v_{m}=\theta ,
\end{equation}
then we want to prove \(\alpha_{i}=0\) for all \(i=1,\cdots ,m\). We multiply the equality (\ref{laeq31}) by \(\lambda_{1}\) on both sides to obtain
\begin{equation}{\label{laeq32}}\tag{4}
\lambda_{1}\alpha_{1}v_{1}+\cdots +\lambda_{1}\alpha_{m}v_{m}=\theta .
\end{equation}
We also apply \(T\) to the equality (\ref{laeq31}) to obtain
\begin{equation}{\label{laeq33}}\tag{5}
\lambda_{1}\alpha_{1}v_{1}+\cdots +\lambda_{m}\alpha_{m}v_{m}=\theta .
\end{equation}
Subtracting (\ref{laeq32}) from (\ref{laeq33}), we obtain
\[\alpha_{2}(\lambda_{2}-\lambda_{1})v_{2}+\cdots +\alpha_{m}(\lambda_{m}-\lambda_{1})v_{m}=\theta .\]
Since \(\lambda_{i}-\lambda_{1}\neq 0\) for \(i=2,\cdots ,m\) and \(\{v_{2},\cdots ,v_{m}\}\) are linearly independent, we must have \(\alpha_{i}=0\) for \(i=2,\cdots ,m\). Therefore, from (\ref{laeq31}), we obtain \(\alpha_{1}v_{1}=\theta\), which also implies \(\alpha_{1}=0\). This completes the proof. \(\blacksquare\)
\begin{equation}{\label{lac34}}\tag{6}\mbox{}\end{equation}
Lemma \ref{lac34}. Let \(V\) and \(W\) be vector spaces over the same scalar field \({\cal F}\), and let \(T:V\rightarrow W\) be a linear mapping that is surjective and satisfies \(\mbox{Ker}(T)=\{\theta_{V}\}\). Then \(T\) has an inverse linear mapping.
Proof. By Proposition \ref{lap13}, we see that \(T\) is injective and surjective. This says that \(T\) has an inverse. By Proposition \ref{lap14}, this inverse mapping is linear, and the proof is complete. \(\blacksquare\)
Propositions \ref{lap13} and \ref{lap14} used in the proof of Lemma \ref{lac34} can be found on the page Linear Mappings.
\begin{equation}{\label{lap35}}\tag{7}\mbox{}\end{equation}
Proposition \ref{lap35}. Let \(V\) be a finite-dimensional vector space over the scalar field \({\cal F}\), and let \(T\) be a linear operator on \(V\). Then \(\lambda\) is an eigenvalue of \(T\) if and only if \(T-\lambda I\) is not invertible.
Proof. Let \(\lambda\) be an eigenvalue of \(T\). Then there exists an element \(v\in V\) with \(v\neq\theta\) such that \(T(v)=\lambda v\), i.e., \((T-\lambda I)(v)=\theta\). This shows that the kernel of \(T-\lambda I\) is non-zero, so \(T-\lambda I\) cannot be invertible. Conversely, we assume that \(T-\lambda I\) is not invertible. If the kernel of \(T-\lambda I\) were \(\{\theta\}\), then, since \(V\) is finite-dimensional, \(T-\lambda I\) would also be surjective, and Lemma \ref{lac34} would say that \(T-\lambda I\) is invertible, a contradiction. Therefore \(T-\lambda I\) must have a non-zero kernel, which means that there exists \(v\neq\theta\) satisfying \((T-\lambda I)(v)=\theta\). Therefore, we obtain \(T(v)=\lambda v\). This completes the proof. \(\blacksquare\)
\begin{equation}{\label{lap257}}\tag{8}\mbox{}\end{equation}
Proposition \ref{lap257}. Let \(V\) be a finite-dimensional vector space over the complex numbers with \(\dim (V)\geq 1\), and let \(T\) be a linear operator on \(V\). Then, there exists a non-zero eigenvector of \(T\). \(\sharp\)
Proposition \ref{lap257} is false if the vector space \(V\) is assumed to be over \(\mathbb{R}\) instead of being over \(\mathbb{C}\).
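For example, the linear operator on \(\mathbb{R}^{2}\) represented by the rotation matrix
\[A=\left [\begin{array}{rr}
0 & -1\\ 1 & 0
\end{array}\right ]\]
has no non-zero eigenvectors over \(\mathbb{R}\); this example is worked out in detail in the section on characteristic polynomials below.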
\begin{equation}{\label{b}}\tag{B}\mbox{}\end{equation}
Diagonalization.
Definition. Let \(V\) be a finite-dimensional vector space over the scalar field \({\cal F}\), and let \(T\) be a linear operator on \(V\). We say that \(T\) is diagonalizable when there is a basis \(\mathfrak{B}\) for \(V\) such that \([T]_{\mathfrak{B}}\) is a diagonal matrix. \(\sharp\)
If \(A\) is an \(n\times n\) square matrix, then we say that \(A\) is diagonalizable if and only if the linear operator \(T_{A}\) represented by \(A\) relative to the standard basis is diagonalizable.
\begin{equation}{\label{lap106}}\tag{9}\mbox{}\end{equation}
Proposition \ref{lap106}. Let \(V\) be an \(n\)-dimensional vector space over the scalar field \({\cal F}\), and let \(T\) be a linear operator on \(V\). Suppose that the set of eigenvectors \(\{v_{1},\cdots ,v_{n}\}\) with the corresponding eigenvalues \(\lambda_{1},\cdots ,\lambda_{n}\) forms an ordered basis \(\mathfrak{B}\) for \(V\). Then, the matrix associated with \(T\) with respect to \(\mathfrak{B}\) is the diagonal matrix given by
\[[T]_{\mathfrak{B}}=\left [\begin{array}{rrrr}
\lambda_{1} & 0 & \cdots & 0\\
0 & \lambda_{2} & \cdots & 0\\
\vdots & \vdots & \cdots & \vdots\\
0 & 0 & \cdots & \lambda_{n}
\end{array}\right ].\]
Proof. It follows immediately from the fact that \(T(v_{i})=\lambda_{i}v_{i}\) for all \(i=1,\cdots ,n\). \(\blacksquare\)
Corollary. Let \(V\) be an \(n\)-dimensional vector space over the scalar field \({\cal F}\), and let \(T\) be a linear operator on \(V\). Suppose that \(T\) has \(n\) distinct eigenvalues. Then \(V\) has a basis consisting of eigenvectors of \(T\), and \(T\) is diagonalizable.
\begin{equation}{\label{lac105}}\tag{10}\mbox{}\end{equation}
Corollary \ref{lac105}. Let \(A\) be an \(n\times n\) matrix over the scalar field \({\cal F}\). Suppose that \(A\) has \(n\) distinct eigenvalues in \({\cal F}\). Then, there exists a non-singular matrix \(B\) over \({\cal F}\) such that \(B^{-1}AB\) is a diagonal matrix in which the diagonal entries consist of the corresponding eigenvalues.
\begin{equation}{\label{lap214}}\tag{11}\mbox{}\end{equation}
Proposition \ref{lap214}. Let \(V\) be a finite-dimensional vector space over the scalar field \({\cal F}\), and let \(T\) be a linear operator on \(V\). Then \(T\) is diagonalizable if and only if there is a basis \(\mathfrak{B}\) for \(V\) each vector of which is an eigenvector of \(T\). Moreover, the diagonal entries of the diagonal matrix \([T]_{\mathfrak{B}}\) are the eigenvalues corresponding to the eigenvectors in \(\mathfrak{B}\).
Proof. If \(T\) is diagonalizable, by definition, there exists a basis \(\mathfrak{B}=\{v_{1},\cdots ,v_{n}\}\) for \(V\) such that \([T]_{\mathfrak{B}}\equiv D\) is a diagonal matrix. Therefore, we have
\begin{align*} T(v_{j}) & =\sum_{i=1}^{n}d_{ij}v_{i}\\ & =d_{jj}v_{j}\equiv\lambda_{j}v_{j},\end{align*}
which says that each vector of \(\mathfrak{B}\) is an eigenvector of \(T\). The converse follows from Proposition \ref{lap106} immediately, and the proof is complete. \(\blacksquare\)
Example. We consider the following matrix
\[A=\left [\begin{array}{cc}
1 & 3\\ 4 & 2
\end{array}\right ]\]
and the vectors
\[v_{1}=\left [\begin{array}{r}
1 \\ -1
\end{array}\right ]\mbox{ and }
v_{2}=\left [\begin{array}{r}
3 \\ 4
\end{array}\right ].\]
Then, we have
\[T_{A}(v_{1})=Av_{1}=\left [\begin{array}{rr}
1 & 3\\ 4 & 2
\end{array}\right ]\left [\begin{array}{r}
1 \\ -1
\end{array}\right ]=\left [\begin{array}{r}
-2 \\ 2
\end{array}\right ]=-2\left [\begin{array}{r}
1 \\ -1
\end{array}\right ]=-2v_{1},\]
which says that \(v_{1}\) is an eigenvector of \(A\) with eigenvalue \(-2\). We also have
\begin{align*} T_{A}(v_{2}) & =Av_{2}\\ & =\left [\begin{array}{rr}
1 & 3\\ 4 & 2
\end{array}\right ]\left [\begin{array}{r}
3 \\ 4
\end{array}\right ]\\ & =\left [\begin{array}{r}
15 \\ 20
\end{array}\right ]\\ & =5\left [\begin{array}{r}
3 \\ 4
\end{array}\right ]\\ & =5v_{2},\end{align*}
which says that \(v_{2}\) is an eigenvector of \(A\) with eigenvalue \(5\). Since \(\mathfrak{B}=\{v_{1},v_{2}\}\) is an ordered basis for \(\mathbb{R}^{2}\), Proposition~ ref{lap214} says that \(A\) is diagonalizable satisfying
\[[T_{A}]_{\mathfrak{B}}=\left [\begin{array}{rr}
-2 & 0\\ 0 & 5
\end{array}\right ].\]
Suppose that we take
\[B=\left [\begin{array}{rr}
1 & 3\\ -1 & 4
\end{array}\right ],\]
where the columns are the eigenvectors for the different eigenvalues \(-2\) and \(5\). Then, we have
\[B^{-1}AB=\left [\begin{array}{rrr}
-2 & 0\\ 0 & 5
\end{array}\right ],\]
where the diagonal consists of all eigenvalues of \(A\). \(\sharp\)
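As a quick check, since \(\det (B)=1\cdot 4-3\cdot (-1)=7\), we have
\[B^{-1}=\frac{1}{7}\left [\begin{array}{rr}
4 & -3\\ 1 & 1
\end{array}\right ],\]
and a direct computation gives
\[B^{-1}AB=\frac{1}{7}\left [\begin{array}{rr}
4 & -3\\ 1 & 1
\end{array}\right ]\left [\begin{array}{rr}
-2 & 15\\ 2 & 20
\end{array}\right ]=\left [\begin{array}{rr}
-2 & 0\\ 0 & 5
\end{array}\right ],\]
where \(AB\) has columns \(Av_{1}=-2v_{1}\) and \(Av_{2}=5v_{2}\).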
\begin{equation}{\label{c}}\tag{C}\mbox{}\end{equation}
Characteristic Polynomials.
Let \(A\) be an \(n\times n\) matrix over a scalar field \({\cal F}\). The characteristic polynomial \(P_{A}\) of \(A\) is defined by
\[P_{A}(\lambda )=\det (\lambda I-A).\]
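For example, for a general \(2\times 2\) matrix we have
\[P_{A}(\lambda )=\left |\begin{array}{cc}
\lambda -a & -b\\ -c & \lambda -d
\end{array}\right |=\lambda^{2}-(a+d)\lambda +(ad-bc),\]
so the characteristic polynomial of a \(2\times 2\) matrix is determined by its trace and determinant.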
\begin{equation}{\label{lat36}}\tag{12}\mbox{}\end{equation}
Lemma \ref{lat36}. Let \(A\) be an \(n\times n\) matrix such that \(\det (A)\neq 0\). Then \(A\) is invertible. The \(ij\)th entry \(a_{ij}^{*}\) of \(A^{-1}\) is given by
\[a_{ij}^{*}=\frac{(-1)^{i+j}\cdot\det (A^{(ji)})}{\det (A)}.\]
For the proof of the above Lemma \ref{lat36}, refer to the page Arithmetic Operations of Matrices.
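As a quick check of the formula in Lemma \ref{lat36}, consider a \(2\times 2\) matrix with entries \(a,b,c,d\) and \(ad-bc\neq 0\). The formula gives
\[A^{-1}=\frac{1}{ad-bc}\left [\begin{array}{rr}
d & -b\\ -c & a
\end{array}\right ],\]
which agrees with the familiar inverse of a \(2\times 2\) matrix.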
Theorem. Let \(A\) be an \(n\times n\) matrix over a scalar field \({\cal F}\). A scalar \(\lambda\in {\cal F}\) is an eigenvalue of \(A\) if and only if \(\lambda\) is a root of the characteristic polynomial \(P_{A}\) of \(A\). In other words, a scalar \(\lambda\in {\cal F}\) is an eigenvalue of \(A\) if and only if \(\det (\lambda I_{n}-A)=0\).
Proof. Assume that \(\lambda\) is an eigenvalue of \(A\). Then, using Proposition \ref{lap35}, \(\lambda I-A\) is not invertible. According to Lemma \ref{lat36}, we have \(\det (\lambda I-A)=0\), which says that \(\lambda\) is a root of the characteristic polynomial \(P_{A}\) of \(A\). Conversely, if \(\lambda\) is a root of \(P_{A}\), i.e., \(\det (\lambda I-A)=0\), then \(\lambda I-A\) is not invertible. Using Proposition \ref{lap35} again, we see that \(\lambda\) is an eigenvalue of \(A\). This completes the proof. \(\blacksquare\)
\begin{equation}{\label{lap37}}\tag{13}\mbox{}\end{equation}
Proposition \ref{lap37}. Let \(A\) and \(B\) be two \(n\times n\) matrices. Suppose that \(B\) is invertible. Then the characteristic polynomial of \(A\) is equal to the characteristic polynomial of \(B^{-1}AB\).
Proof. Using the definition and properties of determinant, we have
\begin{align*} \det (\lambda I-A) & =\det (B^{-1}(\lambda I-A)B)\\ & =\det (\lambda B^{-1}B-B^{-1}AB)\\ & =\det (\lambda I-B^{-1}AB).\end{align*}
This completes the proof. \(\blacksquare\)
Let \(V\) be a finite-dimensional vector space over the scalar field \({\cal F}\), and let \(T\) be a linear operator on \(V\). Now, we shall define the characteristic polynomial of a linear operator \(T\). Let \(\mathfrak{B}\) be an ordered basis for \(V\), and let \(A\) be the matrix representing \(T\) with respect to \(\mathfrak{B}\). If \(\mathfrak{B}^{\prime}\) is another ordered basis of \(V\), then there exists an invertible matrix \(B\) over \({\cal F}\) such that the matrix \(A^{\prime}\) representing \(T\) with respect to \(\mathfrak{B}^{\prime}\) is given by \(A^{\prime}=B^{-1}AB\). Proposition \ref{lap37} says that the characteristic polynomial of \(A\) is the same as that of \(A^{\prime}\). Therefore, the formal definition of characteristic polynomial for \(T\) is given below.
Definition. Let \(V\) be a finite-dimensional vector space over the scalar field \({\cal F}\), and let \(T\) be a linear operator on \(V\). Let \(A\) be a matrix representing \(T\) with respect to some ordered basis for \(V\). The characteristic polynomial, denoted by \(P_{T}\), of the linear operator \(T\) is defined to be
the characteristic polynomial \(P_{A}\) of \(A\). \(\sharp\)
Example. Let \(T\) be a linear operator on \(\mathbb{C}^{3}\) which is represented in the standard ordered basis \(\mathfrak{B}\) by the following \(3\times 3\) matrix
\[[T]_{\mathfrak{B}}=A=\left [\begin{array}{rrr}
1 & 1 & -1\\ 0 & 1 & 0\\ 1 & 0 & 1
\end{array}\right ].\]
The characteristic polynomial \(P_{T}\) for \(T\) (or \(P_{A}\) for \(A\)) is given by
\begin{align*} \det (\lambda I-A) & =\left |\begin{array}{ccc}
\lambda -1 & -1 & 1\\ 0 & \lambda -1 & 0\\ -1 & 0 & \lambda -1
\end{array}\right |\\ & =(\lambda -1)(\lambda^{2}-2\lambda +2).\end{align*}
Therefore, the eigenvalues of \(T\) (or \(A\)) are \(1\), \(1+i\) and \(1-i\). \(\sharp\)
Example. Let \(T\) be a linear operator on \(\mathbb{R}^{2}\) which is represented in the standard ordered basis \(\mathfrak{B}\) by the following \(2\times 2\) matrix
\[[T]_{\mathfrak{B}}=A=\left [\begin{array}{rrr}
0 & -1\\ 1 & 0
\end{array}\right ].\]
The characteristic polynomial \(P_{T}\) for \(T\) (or \(P_{A}\) for \(A\)) is given by
\begin{align*} \det (\lambda I-A) & =\left |\begin{array}{ccc}
\lambda & 1\\ -1 & \lambda
\end{array}\right |\\ & =\lambda^{2}+1.\end{align*}
Since this polynomial has no roots in \(\mathbb{R}\), it says that \(T\) has no eigenvalues. However, if we consider \(T\) as a linear operator on \(\mathbb{C}^{2}\) instead of \(\mathbb{R}^{2}\), then the above polynomial has two roots \(i\) and \(-i\). In this case, the eigenvalues of \(T\) (or \(A\)) are \(i\) and \(-i\). \(\sharp\)
Example. We consider the following matrix
\[A=\left [\begin{array}{rrr}
1 & 1\\ 4 & 1
\end{array}\right ].\]
The characteristic polynomial \(P_{A}\) for \(A\) is given by
\begin{align*} \det (\lambda I-A) & =\left |\begin{array}{cc}
\lambda -1 & -1\\ -4 & \lambda -1
\end{array}\right |\\ & =(\lambda +1)(\lambda -3).\end{align*}
Therefore, the eigenvalues of \(A\) are \(-1\) and \(3\). Next, we want to find all the eigenvectors of \(A\). Given any eigenvalue \(\lambda\), an eigenvector \(X\) satisfies \(AX=\lambda X\), i.e., \((\lambda I_{2}-A)X={\bf 0}\). We first find all the eigenvectors corresponding to the eigenvalue \(3\). This says that we need to find \(X\in\mathbb{R}^{2}\) satisfying
\begin{align*}
\left [\begin{array}{c}
0\\ 0
\end{array}\right ] & =(3I_{2}-A)X\\ & =\left (\left [\begin{array}{rrr}
3 & 0\\ 0 & 3
\end{array}\right ]-\left [\begin{array}{rrr}
1 & 1\\ 4 & 1
\end{array}\right ]\right )\left [\begin{array}{c}
x\\ y
\end{array}\right ]\\
& =\left [\begin{array}{rrr}
2 & -1\\ -4 & 2
\end{array}\right ]\left [\begin{array}{c}
x\\ y
\end{array}\right ]\\ & =\left [\begin{array}{c}
2x-y\\ -4x+2y
\end{array}\right ].
\end{align*}
The set of all solutions is given by
\[\left\{t\left [\begin{array}{c}
1\\ 2
\end{array}\right ]:t\in\mathbb{R}\right\},\]
which is also the set of all eigenvectors corresponding to eigenvalue \(3\). Using the same arguments, we can find all eigenvectors corresponding to eigenvalue \(-1\), which is given by
\[\left\{t\left [\begin{array}{r}
1\\ -2
\end{array}\right ]:t\in\mathbb{R}\right\}.\]
We also observe that
\[\mathfrak{B}=\left\{\left [\begin{array}{c}
1\\ 2
\end{array}\right ],\left [\begin{array}{r}
1\\ -2
\end{array}\right ]\right\}\]
is a basis for \(\mathbb{R}^{2}\). Therefore, the matrix \(A\) is diagonalizable. Suppose that we take
\[B=\left [\begin{array}{rrr}
1 & 1\\ 2 & -2
\end{array}\right ],\]
where the columns are the eigenvectors for the corresponding eigenvalues \(3\) and \(-1\). Then, we have
\[B^{-1}AB=\left [\begin{array}{rrr}
3 & 0\\ 0 & -1
\end{array}\right ],\]
where the diagonal consists of all eigenvalues of \(A\). \(\sharp\)
\begin{equation}{\label{laex216}}\tag{14}\mbox{}\end{equation}
Example \ref{laex216}. Let \(T\) be a linear operator on \(\mathbb{R}^{3}\) defined by
\[T(x,y,z)=\left (4x+z,2x+3y+2z,x+4z\right ).\]
Let \(\mathfrak{B}\) be the standard basis for \(\mathbb{R}^{3}\). Then, we have
\[A=[T]_{\mathfrak{B}}=\left [\begin{array}{rrr}
4 & 0 & 1\\ 2 & 3 & 2\\ 1 & 0 & 4
\end{array}\right ].\]
The characteristic polynomial of \(T\) is given by
\begin{align*} \det (\lambda I-A) & =\left |\begin{array}{ccc}
\lambda -4 & 0 & -1\\ -2 & \lambda -3 & -2\\ -1 & 0 & \lambda -4
\end{array}\right |\\ & =(\lambda -5)(\lambda -3)^{2}.\end{align*}
Therefore, the eigenvalues of \(T\) are \(5\) and \(3\).
- For eigenvalue \(5\), we consider
\begin{align*}
\left [\begin{array}{c}
0\\ 0 \\ 0
\end{array}\right ] & =(5I_{3}-A)X\\ & =\left (\left [\begin{array}{rrr}
5 & 0 & 0\\ 0 & 5 & 0\\ 0 & 0 & 5
\end{array}\right ]-\left [\begin{array}{rrr}
4 & 0 & 1\\ 2 & 3 & 2\\ 1 & 0 & 4
\end{array}\right ]\right )\left [\begin{array}{c}
x\\ y\\ z
\end{array}\right ]\\
& =\left [\begin{array}{rrr}
1 & 0 & -1\\ -2 & 2 & -2\\ -1 & 0 & 1
\end{array}\right ]\left [\begin{array}{c}
x\\ y\\ z
\end{array}\right ]\\ & =\left [\begin{array}{c}
x-z\\ -2x+2y-2z\\ -x+z
\end{array}\right ].
\end{align*}
The set of all solutions is given by
\[V_{5}=\left\{t\left [\begin{array}{c}
1\\ 2\\ 1
\end{array}\right ]:t\in\mathbb{R}\right\},\]
which is the eigenspace with eigenvalue \(5\) and basis
\[\left\{\left [\begin{array}{c}
1\\ 2\\ 1
\end{array}\right ]\right\}.\]
- For eigenvalue \(3\), we consider
\begin{align*}
\left [\begin{array}{c}
0\\ 0 \\ 0
\end{array}\right ] & =(3I_{3}-A)X\\ & =\left (\left [\begin{array}{rrr}
3 & 0 & 0\\ 0 & 3 & 0\\ 0 & 0 & 3
\end{array}\right ]-\left [\begin{array}{rrr}
4 & 0 & 1\\ 2 & 3 & 2\\ 1 & 0 & 4
\end{array}\right ]\right )\left [\begin{array}{c}
x\\ y\\ z
\end{array}\right ]\\
& =\left [\begin{array}{rrr}
-1 & 0 & -1\\ -2 & 0 & -2\\ -1 & 0 & -1
\end{array}\right ]\left [\begin{array}{c}
x\\ y\\ z
\end{array}\right ]\\ & =\left [\begin{array}{c}
-x-z\\ -2x-2z\\ -x-z
\end{array}\right ].
\end{align*}
Let \(y=s\) be a parameter, and solve the system for \(x\) and \(z\) by introducing another parameter \(t\), say \(x=-t\) and \(z=t\). The set of all solutions is given by
\[V_{3}=\left\{s\left [\begin{array}{c}
0\\ 1\\ 0
\end{array}\right ]+t\left [\begin{array}{c}
-1\\ 0\\ 1
\end{array}\right ]:s,t\in\mathbb{R}\right\},\]
which is the eigenspace with eigenvalue \(3\) and basis
\[\left\{\left [\begin{array}{c}
0\\ 1\\ 0
\end{array}\right ],\left [\begin{array}{c}
-1\\ 0\\ 1
\end{array}\right ]\right\}.\]
The union of the bases for \(V_{5}\) and \(V_{3}\), given by
\[\mathfrak{B}=\left\{\left [\begin{array}{c}
1\\ 2\\ 1
\end{array}\right ],\left [\begin{array}{c}
0\\ 1\\ 0
\end{array}\right ],\left [\begin{array}{c}
-1\\ 0\\ 1
\end{array}\right ]\right\}\]
forms a basis for \(\mathbb{R}^{3}\). Since the basis \(\mathfrak{B}\) consists of eigenvectors of \(T\), Proposition \ref{lap214} says that \(T\) is diagonalizable. Suppose that we take
\[B=\left [\begin{array}{rrr}
1 & 0 & -1\\ 2 & 1 & 0\\ 1 & 0 & 1
\end{array}\right ],\]
where the columns are the eigenvectors corresponding to the eigenvalues \(5\), \(3\) and \(3\), respectively. Then, we have
\[B^{-1}AB=\left [\begin{array}{rrr}
5 & 0 & 0\\ 0 & 3 & 0\\ 0 & 0 & 3
\end{array}\right ],\]
where the diagonal consists of all eigenvalues of \(T\). \(\sharp\)
In the sequel, we shall present the general results illustrated in Example \ref{laex216}. The multiplicity of an eigenvalue \(\lambda_{i}\) of \(T\) is defined to be the multiplicity of \(\lambda_{i}\) as a root of the characteristic polynomial of \(T\). For example, the eigenvalue \(3\) in Example \ref{laex216} has multiplicity \(2\). Let \({\cal F}\) be a scalar field. We say that a polynomial \(f\) with degree \(n\) can be factored over \({\cal F}\) if and only if there exist \(c,\lambda_{1},\cdots ,\lambda_{n}\in {\cal F}\) satisfying
\[f(t)=c(t-\lambda_{1})(t-\lambda_{2})\cdots (t-\lambda_{n}).\]
It is obvious that \((t^{2}+1)(t-3)\) cannot be factored over \(\mathbb{R}\). However, it can be factored over \(\mathbb{C}\).
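Explicitly, over \(\mathbb{C}\) we have
\[(t^{2}+1)(t-3)=(t-i)(t+i)(t-3),\]
while \(t^{2}+1\) has no real roots, so no such factorization exists over \(\mathbb{R}\).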
\begin{equation}{\label{lal218}}\tag{15}\mbox{}\end{equation}
Proposition \ref{lal218}. Let \(V\) be a finite-dimensional vector space over the scalar field \({\cal F}\), and let \(T\) be a linear operator on \(V\). Let \(\lambda\) be an eigenvalue of \(T\) having multiplicity \(m\), and let \(V_{\lambda}\) be the eigenspace for \(\lambda\). Then \(1\leq\dim (V_{\lambda})\leq m\).
Proof. Suppose that \(\dim (V)=n\). Let \(\{v_{1},\cdots ,v_{p}\}\) be an ordered basis for \(V_{\lambda}\). We can extend it to an ordered basis
\[\mathfrak{B}=\left\{v_{1},\cdots ,v_{p},v_{p+1},\cdots ,v_{n}\right\}\]
for \(V\). Since each \(v_{i}\) for \(i=1,\cdots ,p\) is an eigenvector of \(T\) corresponding to \(\lambda\), we have
\[A=[T]_{\mathfrak{B}}=\left [\begin{array}{cc}
\lambda I_{p} & B\\
{\bf 0} & C
\end{array}\right ].\]
Therefore, the characteristic polynomial of \(T\) corresponding to \(\lambda\) is given by
\begin{align*}
P_{T}(t) & =\det\left (tI_{n}-A\right )\\ & =\det\left [\begin{array}{cc}(t-\lambda )I_{p} & -B\\
{\bf 0} & tI_{n-p}-C
\end{array}\right ]\\
& =\det\left ((t-\lambda )I_{p}\right )\cdot\det\left (tI_{n-p}-C\right )
\\ & =(t-\lambda )^{p}\cdot f(t),
\end{align*}
where \(f\) is a polynomial. We can see that the multiplicity of \(\lambda\) is at least \(p=\dim (V_{\lambda})\). This says that \(1\leq p\leq m\), and the proof is complete. \(\blacksquare\)
\begin{equation}{\label{lal219}}\tag{16}\mbox{}\end{equation}
Proposition \ref{lal219}. Let \(V\) be a vector space over the scalar field \({\cal F}\), and let \(T\) be a linear operator on \(V\). Let \(\lambda_{1},\cdots ,\lambda_{k}\) be the distinct eigenvalues of \(T\) with the corresponding eigenspaces \(V_{\lambda_{1}},V_{\lambda_{2}},\cdots ,V_{\lambda_{k}}\). Then, we have the following properties.
(i) For each \(v_{i}\in V_{\lambda_{i}}\), \(i=1,\cdots ,k\), if
\[v_{1}+v_{2}+\cdots +v_{k}=\theta ,\]
then \(v_{i}=\theta\) for all \(i=1,\cdots ,k\).
(ii) Let \(S_{i}\) be a finite and linearly independent subset of \(V_{\lambda_{i}}\). Then, the union
\[S=S_{1}\cup S_{2}\cup\cdots\cup S_{k}\]
is a linearly independent subset of \(V\).
Proof. Part (i) follows immediately from Proposition \ref{lap217}. To prove part (ii), for each \(i=1,\cdots ,k\), we define
\[S_{i}=\left\{v_{i1},v_{i2},\cdots ,v_{in_{i}}\right\}.\]
Given any scalars \(\{\alpha_{ij}\}\) satisfying
\[\sum_{i=1}^{k}\sum_{j=1}^{n_{i}}\alpha_{ij}v_{ij}=\theta ,\]
for each \(i=1,\cdots , k\), let
\[w_{i}=\sum_{j=1}^{n_{i}}\alpha_{ij}v_{ij}.\]
Then \(w_{i}\in V_{\lambda_{i}}\) for each \(i=1,\cdots ,k\) and
\[w_{1}+w_{2}+\cdots +w_{k}=\theta .\]
Using part (i), we must have \(w_{i}=\theta\) for all \(i=1,\cdots ,k\). Since \(S_{i}\) is linearly independent, we must have \(\alpha_{ij}=0\) for all \(j=1,\cdots ,n_{i}\). This shows that \(S\) is linearly independent, and the proof is complete. \(\blacksquare\)
\begin{equation}{\label{lat230}}\tag{17}\mbox{}\end{equation}
Theorem \ref{lat230}. Let \(V\) be a finite-dimensional vector space over \({\cal F}\), and let \(T\) be a linear operator on \(V\) such that its characteristic polynomial can be factored over \({\cal F}\), i.e., the characteristic polynomial \(P_{T}\) of \(T\) is given by
\[P_{T}(\lambda )=(\lambda -\lambda_{1})^{m_{1}}\cdots (\lambda -\lambda_{r})^{m_{r}}.\]
Let \(\lambda_{1},\cdots ,\lambda_{r}\) be the distinct eigenvalues of \(T\) with the corresponding eigenspaces \(V_{\lambda_{1}},V_{\lambda_{2}},\cdots ,V_{\lambda_{r}}\). Then, we have the following properties.
(i) The linear operator \(T\) is diagonalizable if and only if the multiplicity \(m_{i}\) of \(\lambda_{i}\) is equal to the dimension of the eigenspace \(V_{\lambda_{i}}\) for \(i=1,\cdots ,r\).
(ii) If \(T\) is diagonalizable and \(\mathfrak{B}_{i}\) is an ordered basis for \(V_{\lambda_{i}}\) for each \(i=1,\cdots ,r\), then the union
\[\mathfrak{B}=\mathfrak{B}_{1}\cup\mathfrak{B}_{2}\cup\cdots\cup\mathfrak{B}_{r}\]
is an ordered basis for \(V\) consisting of eigenvectors of \(T\).
Proof. Suppose that \(\dim (V)=n\). Let \(m_{i}\) denote the multiplicity of \(\lambda_{i}\) and \(d_{i}\) denote the dimension of \(V_{\lambda_{i}}\) for \(i=1,\cdots ,r\). To prove part (i), we assume that \(T\) is diagonalizable. Let \(\mathfrak{B}\) be a basis for \(V\) consisting of eigenvectors of \(T\). Let \(\mathfrak{B}_{i}=\mathfrak{B}\cap V_{\lambda_{i}}\), and let \(n_{i}\) denote the number of eigenvectors in \(\mathfrak{B}_{i}\) for \(i=1,\cdots ,r\). Using Proposition \ref{lal218}, we have \(d_{i}\leq m_{i}\) for \(i=1,\cdots ,r\). Since \(\mathfrak{B}_{i}\) is a linearly independent subset of \(V_{\lambda_{i}}\) and \(d_{i}=\dim (V_{\lambda_{i}})\), we must have \(n_{i}\leq d_{i}\) for \(i=1,\cdots ,r\). Since \(\mathfrak{B}\) consists of \(n\) vectors, we have \(n=\sum_{i=1}^{r}n_{i}\). On the other hand, since the degree of the characteristic polynomial of \(T\) is equal to the sum of the multiplicities of the eigenvalues, we also have \(\sum_{i=1}^{r}m_{i}=n\). Therefore, we obtain
\[n=\sum_{i=1}^{r}n_{i}\leq\sum_{i=1}^{r}d_{i}\leq\sum_{i=1}^{r}m_{i}=n,\]
which implies
\[\sum_{i=1}^{r}\left (m_{i}-d_{i}\right )=0.\]
Since \(m_{i}-d_{i}\geq 0\) for \(i=1,\cdots ,r\), we must have \(m_{i}=d_{i}\) for \(i=1,\cdots ,r\).
For the converse, suppose that \(m_{i}=d_{i}\) for \(i=1,\cdots ,r\). Let \(\mathfrak{B}_{i}\) be an ordered basis for \(V_{\lambda_{i}}\) for each \(i=1,\cdots ,r\), and take the union
\[\mathfrak{B}=\mathfrak{B}_{1}\cup\mathfrak{B}_{2}\cup\cdots\cup\mathfrak{B}_{r}.\]
According to Proposition \ref{lal219}, \(\mathfrak{B}\) is linearly independent. Since \(m_{i}=d_{i}\) for all \(i\) and \(\sum_{i=1}^{r}m_{i}=n\), we have \(\sum_{i=1}^{r}d_{i}=n\), which says that \(\mathfrak{B}\) contains \(n\) vectors. Since \(n=\dim (V)\), we conclude that \(\mathfrak{B}\) is an ordered basis for \(V\) consisting of eigenvectors of \(T\), which proves part (ii) and also says that \(T\) is diagonalizable. The proof is complete. \(\blacksquare\)
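A standard example where part (i) fails is the matrix
\[A=\left [\begin{array}{cc}
1 & 1\\ 0 & 1
\end{array}\right ]\]
over \(\mathbb{R}\). Its characteristic polynomial is \((\lambda -1)^{2}\), so the eigenvalue \(1\) has multiplicity \(2\), but the eigenspace \(V_{1}=\mbox{Ker}(I_{2}-A)\) is spanned by \((1,0)\) and has dimension \(1\). Therefore \(A\) is not diagonalizable.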
Let \(\mathfrak{B}\) be an ordered basis for an \(n\)-dimensional vector space \(V\) over the scalar field \({\cal F}\). Given any \(v\in V\), the coordinate vector \([v]_{\mathfrak{B}}\) is a column vector in \({\cal F}^{n}\). If we define the mapping \(\phi_{\mathfrak{B}}:V\rightarrow {\cal F}^{n}\) by \(\phi_{\mathfrak{B}}(v)=[v]_{\mathfrak{B}}\), then we can show that \(\phi_{\mathfrak{B}}\) is an isomorphism, which is left as an exercise.
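For example, if \(V\) is the space of polynomials with degree less than or equal to \(2\) and \(\mathfrak{B}=\{1,x,x^{2}\}\) is the standard ordered basis, then
\[\phi_{\mathfrak{B}}\left (a+bx+cx^{2}\right )=\left [\begin{array}{c}
a\\ b\\ c
\end{array}\right ],\]
and this mapping is clearly linear and bijective.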
\begin{equation}{\label{lap215}}\tag{18}\mbox{}\end{equation}
Proposition \ref{lap215}. Let \(V\) be a finite-dimensional vector space over the scalar field \({\cal F}\) with an ordered basis \(\mathfrak{B}\), and let \(T\) be a linear operator on \(V\). Let \(A=[T]_{\mathfrak{B}}\), and let \(\lambda\) be an eigenvalue of \(A\), i.e., an eigenvalue of \(T\). Then, we have the following properties.
(i) Given \(v\in V\), \(\phi_{\mathfrak{B}}(v)\) is an eigenvector of \(A\) corresponding to the eigenvalue \(\lambda\) if and only if \(v\) is an eigenvector of \(T\)
corresponding to the eigenvalue \(\lambda\).
(ii) A vector \(y\in {\cal F}^{n}\) is an eigenvector of \(A\) corresponding to the eigenvalue \(\lambda\) if and only if \(\phi_{\mathfrak{B}}^{-1}(y)\) is an eigenvector of \(T\) corresponding to \(\lambda\).
Proof. Recall that, by the definition of \([T]_{\mathfrak{B}}\), we have \(\phi_{\mathfrak{B}}(T(w))=A\phi_{\mathfrak{B}}(w)\) for all \(w\in V\). Suppose that \(v\) is an eigenvector of \(T\) corresponding to \(\lambda\). Then \(T(v)=\lambda v\). Therefore, using the linearity of \(\phi_{\mathfrak{B}}\), we have
\begin{align*} A\phi_{\mathfrak{B}}(v) & =\phi_{\mathfrak{B}}(T(v))\\ & =\phi_{\mathfrak{B}}(\lambda v)\\ & =\lambda\phi_{\mathfrak{B}}(v).\end{align*}
Since \(v\neq\theta\), we also have \(\phi_{\mathfrak{B}}(v)\neq {\bf 0}\), which says that \(\phi_{\mathfrak{B}}(v)\) is an eigenvector of \(A\). Conversely, suppose that \(\phi_{\mathfrak{B}}(v)\) is an eigenvector of \(A\) corresponding to \(\lambda\). Then, we have
\begin{align*} \phi_{\mathfrak{B}}(T(v)) & =A\phi_{\mathfrak{B}}(v)\\ & =\lambda\phi_{\mathfrak{B}}(v)\\ & =\phi_{\mathfrak{B}}(\lambda v).\end{align*}
Since \(\phi_{\mathfrak{B}}\) is injective, we have \(T(v)=\lambda v\), which shows that \(v\) is an eigenvector of \(T\) corresponding to \(\lambda\). Since \(\phi_{\mathfrak{B}}\) is bijective, part (ii) is the equivalent formulation of part (i), and the proof is complete. \(\blacksquare\)
Example. Let \(V\) be the vector space consisting of all polynomials with degree less than or equal to \(2\) and coefficients in \(\mathbb{R}\), and let \(T\) be a linear operator on \(V\) defined by
\[T(f(x))=f(x)+(x+1)f'(x).\]
Let \(\mathfrak{B}=\{1,x,x^{2}\}\) be the standard ordered basis for \(V\). Then
\[[T]_{\mathfrak{B}}=A=\left [\begin{array}{rrr}
1 & 1 & 0\\ 0 & 2 & 2\\ 0 & 0 & 3
\end{array}\right ].\]
The characteristic polynomial \(P_{T}\) for \(T\) (or \(P_{A}\) for \(A\)) is given by
\begin{align*} \det (\lambda I-A) & =\left |\begin{array}{ccc}
\lambda -1 & -1 & 0\\ 0 & \lambda -2 & -2\\ 0 & 0 & \lambda -3
\end{array}\right |\\ & =(\lambda -1)(\lambda -2)(\lambda -3).\end{align*}
Therefore, the eigenvalues of \(T\) (or \(A\)) are \(1\), \(2\) and \(3\). We can apply Proposition \ref{lap215} to find all eigenvectors of \(T\).
- The eigenvectors of \(A\) corresponding to the eigenvalue \(1\) have the form
\[\left\{t\left [\begin{array}{c}
1\\ 0\\ 0
\end{array}\right ]:t\in\mathbb{R}\mbox{ and }t\neq 0\right\}.\]
According to Proposition \ref{lap215}, the eigenvectors of \(T\) corresponding to the eigenvalue \(1\) have the form
\[\phi_{\mathfrak{B}}^{-1}\left (t\left [\begin{array}{c}
1\\ 0\\ 0
\end{array}\right ]\right )=t\phi_{\mathfrak{B}}^{-1}\left (\left [\begin{array}{c}
1\\ 0\\ 0
\end{array}\right ]\right )=t\cdot 1=t.\]
This says that the non-zero constant polynomials are the eigenvectors of \(T\) corresponding to the eigenvalue \(1\).
- The eigenvectors of \(A\) corresponding to the eigenvalue \(2\) have the form
\[\left\{t\left [\begin{array}{c}
1\\ 1\\ 0
\end{array}\right ]:t\in\mathbb{R}\mbox{ and }t\neq 0\right\}.\]
Since \(\phi_{\mathfrak{B}}^{-1}\) is an isomorphism, the eigenvectors of \(T\) corresponding to the eigenvalue \(2\) have the form
\begin{align*}
\phi_{\mathfrak{B}}^{-1}\left (t\left [\begin{array}{c}
1\\ 1\\ 0
\end{array}\right ]\right ) & =t\phi_{\mathfrak{B}}^{-1}\left (\left [\begin{array}{c}
1\\ 0\\ 0
\end{array}\right ]+\left [\begin{array}{c}
0\\ 1\\ 0
\end{array}\right ]\right )\\
& =t\cdot\left (\phi_{\mathfrak{B}}^{-1}\left (
\left [\begin{array}{c}
1\\ 0\\ 0
\end{array}\right ]\right )+\phi_{\mathfrak{B}}^{-1}\left (\left [\begin{array}{c}
0\\ 1\\ 0
\end{array}\right ]\right )\right )\\
& =t\cdot (1+x).
\end{align*}
This says that the polynomials of the form \(t\cdot (1+x)\) are the eigenvectors of \(T\) corresponding to the eigenvalue \(2\).
- The eigenvectors of \(A\) corresponding to the eigenvalue \(3\) have the form
\[\left\{t\left [\begin{array}{c}
1\\ 2\\ 1
\end{array}\right ]:t\in\mathbb{R}\mbox{ and }t\neq 0\right\}.\]
Therefore, the eigenvectors of \(T\) corresponding to the eigenvalue \(3\) have the form
\begin{align*}
\phi_{\mathfrak{B}}^{-1}\left (t\left [\begin{array}{c}
1\\ 2\\ 1
\end{array}\right ]\right ) & =t\phi_{\mathfrak{B}}^{-1}\left (\left [\begin{array}{c}
1\\ 0\\ 0
\end{array}\right ]+\left [\begin{array}{c}
0\\ 2\\ 0
\end{array}\right ]+\left [\begin{array}{c}
0\\ 0\\ 1
\end{array}\right ]\right )\\
& =t\cdot\left (\phi_{\mathfrak{B}}^{-1}\left (
\left [\begin{array}{c}
1\\ 0\\ 0
\end{array}\right ]\right )+\phi_{\mathfrak{B}}^{-1}\left (\left [\begin{array}{c}
0\\ 2\\ 0
\end{array}\right ]\right )+\phi_{\mathfrak{B}}^{-1}\left (\left [\begin{array}{c}
0\\ 0\\ 1
\end{array}\right ]\right )\right )\\
& =t\cdot (1+2x+x^{2}).
\end{align*}
This says that the polynomials of the form \(t\cdot (1+2x+x^{2})\) are the eigenvectors of \(T\) corresponding to the eigenvalue \(3\).
By taking \(t=1\) in the above cases, it is easy to see that
\[\widehat{\mathfrak{B}}=\left\{1,1+x,1+2x+x^{2}\right\}\]
is an ordered basis for \(V\). Therefore, \(T\) is diagonalizable with
\[[T]_{\widehat{\mathfrak{B}}}=\left [\begin{array}{rrr}
1 & 0 & 0\\ 0 & 2 & 0\\ 0 & 0 & 3
\end{array}\right ].\]
Example. Let \(V\) be the vector space consisting of all polynomials with degree less than or equal to \(2\) and coefficients in \(\mathbb{R}\), and let \(T\) be a linear operator on \(V\) defined by
\[T(f(x))=f(1)+f'(0)x+\left [f'(0)+f''(0)\right ]x^{2}.\]
Let \(\mathfrak{B}=\{1,x,x^{2}\}\) be the standard ordered basis for \(V\). Then
\[[T]_{\mathfrak{B}}=A=\left [\begin{array}{rrr}
1 & 1 & 1\\ 0 & 1 & 0\\ 0 & 1 & 2
\end{array}\right ].\]
The characteristic polynomial \(P_{T}\) for \(T\) (or \(P_{A}\) for \(A\)) is given by
\begin{align*} \det (\lambda I-A) & =\left |\begin{array}{ccc}
\lambda -1 & -1 & -1\\ 0 & \lambda -1 & 0\\ 0 & -1 & \lambda -2
\end{array}\right |\\ & =(\lambda -1)^{2}(\lambda -2).\end{align*}
Therefore, the eigenvalues of \(T\) (or \(A\)) are \(1\) and \(2\) with multiplicities \(2\) and \(1\), respectively.
- For eigenvalue \(1\), we consider
\begin{align*}
\left [\begin{array}{c}
0\\ 0 \\ 0
\end{array}\right ] & =(I_{3}-A)X\\ & =\left (\left [\begin{array}{rrr}
1 & 0 & 0\\ 0 & 1 & 0\\ 0 & 0 & 1
\end{array}\right ]-\left [\begin{array}{rrr}
1 & 1 & 1\\ 0 & 1 & 0\\ 0 & 1 & 2
\end{array}\right ]\right )\left [\begin{array}{c}
x\\ y\\ z
\end{array}\right ]\\
& =\left [\begin{array}{rrr}
0 & -1 & -1\\ 0 & 0 & 0\\ 0 & -1 & -1
\end{array}\right ]\left [\begin{array}{c}
x\\ y\\ z
\end{array}\right ]\\ & =\left [\begin{array}{c}
-y-z\\ 0\\ -y-z
\end{array}\right ].
\end{align*}
The set of all solutions is given by
\[V_{1}=\left\{s\left [\begin{array}{c}
1\\ 0\\ 0
\end{array}\right ]+t\left [\begin{array}{r}
0\\ -1\\ 1
\end{array}\right ]:s,t\in\mathbb{R}\right\},\]
which is the eigenspace with eigenvalue \(1\) and basis
\[\left\{\left [\begin{array}{c}
1\\ 0\\ 0
\end{array}\right ],\left [\begin{array}{r}
0\\ -1\\ 1
\end{array}\right ]\right\}.\]
- For eigenvalue \(2\), we consider
\begin{align*}
\left [\begin{array}{c}
0\\ 0 \\ 0
\end{array}\right ] & =(2I_{3}-A)X\\ & =\left (\left [\begin{array}{rrr}
2 & 0 & 0\\ 0 & 2 & 0\\ 0 & 0 & 2
\end{array}\right ]-\left [\begin{array}{rrr}
1 & 1 & 1\\ 0 & 1 & 0\\ 0 & 1 & 2
\end{array}\right ]\right )\left [\begin{array}{c}
x\\ y\\ z
\end{array}\right ]\\
& =\left [\begin{array}{rrr}
1 & -1 & -1\\ 0 & 1 & 0\\ 0 & -1 & 0
\end{array}\right ]\left [\begin{array}{c}
x\\ y\\ z
\end{array}\right ]\\ & =\left [\begin{array}{c}
x-y-z\\ y\\ -y
\end{array}\right ].
\end{align*}
The set of all solutions is given by
\[V_{2}=\left\{t\left [\begin{array}{r}
1\\ 0\\ 1
\end{array}\right ]:t\in\mathbb{R}\right\},\]
which is the eigenspace with eigenvalue \(2\) and basis
\[\left\{\left [\begin{array}{c}
1\\ 0\\ 1
\end{array}\right ]\right\}.\]
Therefore, the union of these two bases given by
\[\left\{\left [\begin{array}{c}
1\\ 0\\ 0
\end{array}\right ],\left [\begin{array}{r}
0\\ -1\\ 1
\end{array}\right ],\left [\begin{array}{c}
1\\ 0\\ 1
\end{array}\right ]\right\}\]
is an ordered basis for \(\mathbb{R}^{3}\) consisting of eigenvectors of \(A\), which are also the coordinate vectors, relative to the standard basis \(\mathfrak{B}=\{1,x,x^{2}\}\), of the vectors in the ordered basis
\[\widehat{\mathfrak{B}}=\left\{1,-x+x^{2},1+x^{2}\right\}\]
for \(V\) consisting of eigenvectors of \(T\). Therefore, \(T\) is diagonalizable with
\[[T]_{\widehat{\mathfrak{B}}}=\left [\begin{array}{rrr}
1 & 0 & 0\\ 0 & 1 & 0\\ 0 & 0 & 2
\end{array}\right ].\]
Definition. Let \(V\) be a vector space over the scalar field \({\cal F}\), and let \(T\) be a linear operator on \(V\). Let \(W\) be a subspace of \(V\). We say that \(W\) is invariant under \(T\) or \(T\)-invariant when \(T(W)\subseteq W\). \(\sharp\)
Example. Let \(V\) be a vector space over the scalar field \({\cal F}\), and let \(T\) be a linear operator on \(V\). Then we have the following trivial \(T\)-invariant subspaces.
- The zero subspace and the whole space \(V\) are \(T\)-invariant.
- \(\mbox{Im}(T)\) and \(\mbox{Ker}(T)\) are \(T\)-invariant.
- The eigenspace \(V_{\lambda}\) for any eigenvalue of \(T\) is \(T\)-invariant. \(\sharp\)
Proposition. Let \(V\) be a vector space over the scalar field \({\cal F}\), and let \(T\) be a linear operator on \(V\). Let \(f\) be a polynomial with coefficients in \({\cal F}\). If \(W\) is the kernel of \(f(T)\), then \(W\) is an invariant subspace of \(V\) under \(T\).
Proof. Given \(w\in W\), i.e., \((f(T))(w)=\theta\), we want to show that \((f(T))(T(w))=\theta\). Since \(tf(t)=f(t)t\), i.e., \(Tf(T)=f(T)T\), we have
\begin{align*} (f(T))(T(w)) & =(f(T)T)(w)=(Tf(T))(w)\\ & =T((f(T))(w))=T(\theta )=\theta .\end{align*}
This completes the proof. \(\blacksquare\)
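For example, taking \(f(t)=t-\lambda\) for an eigenvalue \(\lambda\) of \(T\), the kernel of \(f(T)=T-\lambda I\) is exactly the eigenspace \(V_{\lambda}\), so this proposition recovers the fact that every eigenspace is \(T\)-invariant.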
\begin{equation}{\label{lad227}}\tag{19}\mbox{}\end{equation}
Definition \ref{lad227}. Let \(V\) be a vector space over the scalar field \({\cal F}\), and let \(T\) be a linear operator on \(V\). Given any \(v\in V\), the subspace
\[W=\mbox{span}\left (\left\{v,T(v),T^{2}(v),\cdots\right\}\right )\]
is called the \(T\)-cyclic subspace of \(V\) generated by \(v\), and it is denoted by \(\langle v\rangle_{T}\). \(\sharp\)
\begin{equation}{\label{laex139}}\tag{20}\mbox{}\end{equation}
Example \ref{laex139}. We provide some simple \(T\)-cyclic subspaces.
- For any linear operator \(T\) on \(V\), we have \(\langle\theta\rangle_{T}=\{\theta\}\), the zero subspace of \(V\).
- We have that \(\dim\langle v\rangle_{T}=1\) if and only if \(v\) is a non-zero eigenvector of \(T\).
- We have \(\dim\langle v\rangle_{I}=1\), where \(I\) is the identity operator and \(v\neq\theta\). This also says that if \(\dim (V)>1\), then the identity operator \(I\) has no cyclic vector, i.e., no vector \(v\) such that \(\langle v\rangle_{I}=V\). \(\sharp\)
\begin{equation}{\label{lap223}}\tag{21}\mbox{}\end{equation}
Remark \ref{lap223}. Let \(V\) be a vector space over the scalar field \({\cal F}\), and let \(T\) be a linear operator on \(V\). Let \(W\) be the \(T\)-cyclic subspace of \(V\) generated by \(v\in V\). Then, we have the following observations.
- \(W\) is \(T\)-invariant.
- Any \(T\)-invariant subspace of \(V\) containing \(v\) also contains \(W\). In other words, \(W\) is the smallest \(T\)-invariant subspace of \(V\) containing \(v\).
\begin{equation}{\label{lae221}}\tag{22}\mbox{}\end{equation}
Example \ref{lae221}. Let \(T\) be a linear operator on \(\mathbb{R}^{3}\) defined by
\[T(x,y,z)=\left (-y+z,x+z,3z\right ).\]
We shall find the \(T\)-cyclic subspace generated by \({\bf e}_{1}=(1,0,0)\). Now, we have
\begin{align*} T({\bf e}_{1}) & =T(1,0,0)\\ & =(0,1,0)={\bf e}_{2}\end{align*}
and
\begin{align*} T^{2}({\bf e}_{1}) & =T({\bf e}_{2})\\ & =(-1,0,0)=-{\bf e}_{1}.\end{align*}
Therefore, we can also obtain
\[T^{3}({\bf e}_{1})=T(-{\bf e}_{1})=-{\bf e}_{2}\]
and
\[T^{4}({\bf e}_{1})=T(-{\bf e}_{2})={\bf e}_{1}.\]
This says
\begin{align*}
\mbox{span}\left (\left\{{\bf e}_{1},T({\bf e}_{1}),T^{2}({\bf e}_{1}),
\cdots\right\}\right ) & =\mbox{span}\left (\left\{{\bf e}_{1},{\bf e}_{2},
-{\bf e}_{1},-{\bf e}_{2}\right\}\right )\\
& =\mbox{span}\left (\left\{{\bf e}_{1},{\bf e}_{2}\right\}\right )\\
& =\left\{(x,y,0):x,y\in\mathbb{R}\right\}.
\end{align*}
Example. Let \(V\) be the space of all polynomial functions from \(\mathbb{R}\) into \(\mathbb{R}\) which have degree less than or equal to \(2\). Let \(T\) be a linear operator on \(V\) defined by \(T(f)=f'\). Then, the \(T\)-cyclic subspace generated by \(x^{2}\in V\) is
\[\mbox{span}\left (\left\{x^{2},2x,2\right\}\right )=V.\]
\begin{equation}{\label{lap222}}\tag{23}\mbox{}\end{equation}
Proposition \ref{lap222}. Let \(V\) be a finite-dimensional vector space over the scalar field \({\cal F}\), and let \(T\) be a linear operator on \(V\). Let \(W\) be a \(T\)-cyclic subspace of \(V\) generated by a non-zero vector \(v\in V\) such that \(\dim (W)=k\). Then, we have the following properties.
(i) The set \(\{v,T(v),T^{2}(v),\cdots ,T^{k-1}(v)\}\) is a basis for \(W\).
(ii) For \(a_{0},a_{1},a_{2},\cdots ,a_{k-1}\in {\cal F}\), if
\[a_{0}v+a_{1}T(v)+a_{2}T^{2}(v)+\cdots +a_{k-1}T^{k-1}(v)+T^{k}(v)=\theta ,\]
then the characteristic polynomial of \(T_{W}\) is
\[P_{T_{W}}(\lambda )=a_{0}+a_{1}\lambda +a_{2}\lambda^{2}+\cdots
+a_{k-1}\lambda^{k-1}+\lambda^{k}.\]
Proof. To prove part (i), since \(v\neq\theta\), the set \(\{v\}\) is linearly independent. Let \(r\) be the largest positive integer such that
\[\mathfrak{B}=\left\{v,T(v),T^{2}(v),\cdots ,T^{r-1}(v)\right\}\]
is linearly independent. Such an integer \(r\) must exist, since \(V\) is finite-dimensional. Let \(Z=\mbox{span}(\mathfrak{B})\). Then \(\mathfrak{B}\) is a basis for \(Z\). We claim that \(T^{r}(v)\in Z\). Suppose that \(T^{r}(v)\not\in Z\), i.e., \(T^{r}(v)\not\in\mbox{span}(\mathfrak{B})\). According to Proposition \ref{lap156}, the set \(\mathfrak{B}\cup\{T^{r}(v)\}\) must be linearly independent, which contradicts the fact that \(r\) is the largest positive integer such that \(\mathfrak{B}\) is linearly independent. This shows that \(T^{r}(v)\in Z\), which will be used to claim that \(Z\) is a \(T\)-invariant subspace of \(V\). Let \(w\in Z\). Then there exist \(b_{0},b_{1},\cdots ,b_{r-1}\in {\cal F}\) satisfying
\[w=b_{0}v+b_{1}T(v)+b_{2}T^{2}(v)+\cdots +b_{r-1}T^{r-1}(v).\]
Therefore, we have
\[T(w)=b_{0}T(v)+b_{1}T^{2}(v)+b_{2}T^{3}(v)+\cdots +b_{r-1}T^{r}(v),\]
which says that \(T(w)\) is a linear combination of vectors in \(Z\), using \(T^{r}(v)\in Z\). This shows \(T(w)\in Z\), i.e., \(Z\) is \(T\)-invariant. Since \(v\in Z\), using part (ii) of Remark \ref{lap223}, we see that \(W\) is the smallest \(T\)-invariant subspace of \(V\) containing \(v\). Therefore, we must have \(W\subseteq Z\). Since \(Z\subseteq W\) is obvious, we conclude that \(Z=W\). This shows that \(\mathfrak{B}\) is a basis for \(W\) such that \(\dim (W)=r\). Therefore, we must have \(r=k\).
To prove part (ii), we can regard \(\mathfrak{B}\) as an ordered basis for \(W\). Since
\[a_{0}v+a_{1}T(v)+a_{2}T^{2}(v)+\cdots +a_{k-1}T^{k-1}(v)+T^{k}(v)=\theta ,\]
we can have
\[[T_{W}]_{\mathfrak{B}}=\left [\begin{array}{ccccc}
0 & 0 & \cdots & 0 & -a_{0}\\
1 & 0 & \cdots & 0 & -a_{1}\\
\vdots & \vdots & \cdots & \vdots & \vdots\\
0 & 0 & \cdots & 1 & -a_{k-1}
\end{array}\right ],\]
which has the characteristic polynomial
\[f(\lambda )=a_{0}+a_{1}\lambda +a_{2}\lambda^{2}+\cdots
+a_{k-1}\lambda^{k-1}+\lambda^{k}.\]
Therefore, we must have \(P_{T_{W}}(\lambda )=f(\lambda )\). This completes the proof. \(\blacksquare\)
Example. Continued from Example \ref{lae221}, let \(W\) be the \(T\)-cyclic subspace generated by \({\bf e}_{1}\). Then, we have obtained
\begin{align*} W & =\mbox{span}\left (\left\{{\bf e}_{1},{\bf e}_{2}\right\}\right )\\ & =\left\{(x,y,0):x,y\in\mathbb{R}\right\}.\end{align*}
Let \(\mathfrak{B}_{W}=\{{\bf e}_{1},{\bf e}_{2}\}\) be an ordered basis for \(W\). Then, we have
\[[T_{W}]_{\mathfrak{B}_{W}}=\left [\begin{array}{rr}
0 & -1\\ 1 & 0
\end{array}\right ].\]
Therefore, the characteristic polynomial of \(T_{W}\) is given by
\[P_{T_{W}}(\lambda )=\det\left [\begin{array}{rr}
\lambda & 1\\ -1 & \lambda
\end{array}\right ]=\lambda^{2}+1.\]
We have that
\[\{{\bf e}_{1},T({\bf e}_{1})\}=\{{\bf e}_{1},{\bf e}_{2}\}\]
is a basis for \(W\). Since \(T^{2}({\bf e}_{1})=-{\bf e}_{1}\), we have
\[1{\bf e}_{1}+0T({\bf e}_{1})+T^{2}({\bf e}_{1})=\theta .\]
By part (ii) of Proposition \ref{lap222}, we obtain
\begin{align*} P_{T_{W}}(\lambda ) & =1+0\cdot\lambda +\lambda^{2}\\ & =\lambda^{2}+1.\end{align*}
If the subspace \(W\) is \(T\)-invariant, then \(T\) induces a linear operator \(T_{W}\) on \(W\) defined by \(T_{W}(v)=T(v)\) for \(v\in W\). Suppose that \(\dim (V)=n\) and \(\dim (W)=r\). Now, we choose an ordered basis \(\mathfrak{B}_{V}=\{v_{1},\cdots ,v_{n}\}\) for \(V\) such that \(\mathfrak{B}_{W}=\{v_{1},\cdots ,v_{r}\}\) is an ordered basis for \(W\). Let \(A=[T]_{\mathfrak{B}_{V}}\) and \(B=[T_{W}]_{\mathfrak{B}_{W}}\). By definition, we have
\[T(v_{j})=\sum_{i=1}^{n}a_{ij}v_{i}.\]
Since \(W\) is invariant under \(T\), it follows that \(T(v_{j})\in W\) for \(j=1,\cdots ,r\), which also means
\[T(v_{j})=\sum_{i=1}^{r}a_{ij}v_{i}.\]
In other words, we have \(a_{ij}=0\) for \(j=1,\cdots ,r\) and \(i=r+1,\cdots ,n\). Therefore, the matrix \(A\) has the following block form
\begin{equation}{\label{laeq114}}\tag{24}
A=\left [\begin{array}{cc}
B & C\\ {\bf 0} & D
\end{array}\right ],
\end{equation}
where \(B\) is an \(r\times r\) matrix, \(C\) is an \(r\times (n-r)\) matrix, and \(D\) is an \((n-r)\times (n-r)\) matrix.
\begin{equation}{\label{lap224}}\tag{25}\mbox{}\end{equation}
Proposition \ref{lap224}. Let \(V\) be a finite-dimensional vector space over the scalar field \({\cal F}\), and let \(T\) be a linear operator on \(V\). Let \(W\) be a \(T\)-invariant subspace of \(V\). Then the characteristic polynomial of the restriction operator \(T_{W}\) divides the characteristic polynomial of \(T\).
Proof. Let \(\dim (V)=n\) and \(\dim (W)=r\). We have \(A=[T]_{\mathfrak{B}_{V}}\) and \(B=[T_{W}]_{\mathfrak{B}_{W}}\) as shown in (\ref{laeq114}). Since \(A\) is in the block form, we have
\[\det (\lambda I_{n}-A)=\det (\lambda I_{r}-B)\cdot\det (\lambda I_{n-r}-D),\]
where the first factor on the right-hand side is the characteristic polynomial of \(T_{W}\). This completes the proof. \(\blacksquare\)
Suppose that \(T\) is a diagonalizable operator with distinct eigenvalues \(\lambda_{1},\cdots ,\lambda_{r}\), where \(r\leq n\). Since \(\lambda_{i}\) is a root of the characteristic polynomial \(P_{T}\), we also assume that the root \(\lambda_{i}\) has multiplicity \(d_{i}\). Then \(d_{1}+\cdots +d_{r}=n\) and
\[P_{T}(\lambda )=(\lambda -\lambda_{1})^{d_{1}}\cdots (\lambda -\lambda_{r})^{d_{r}}.\]
Let \(I_{i}\) be the \(d_{i}\times d_{i}\) identity matrix. Then, there is an ordered basis \(\mathfrak{B}\) for \(V\) such that \([T]_{\mathfrak{B}}\) is the block form shown below
\[[T]_{\mathfrak{B}}=\left [\begin{array}{cccc}
\lambda_{1}I_{1} & {\bf 0} & \cdots & {\bf 0}\\
{\bf 0} & \lambda_{2}I_{2} & \cdots & {\bf 0}\\
\vdots & \vdots & \cdots & \vdots\\
{\bf 0} & {\bf 0} & \cdots & \lambda_{r}I_{r}
\end{array}\right ].\]
\begin{equation}{\label{lap107}}\tag{26}\mbox{}\end{equation}
Proposition \ref{lap107}. Let \(V\) be a vector space over the scalar field \({\cal F}\), and let \(T\) be a linear operator on \(V\). Suppose that \(T(v)=\lambda v\). If \(f\) is any polynomial, then \((f(T))(v)=f(\lambda )v\).
Proof. We have
\begin{align*} T^{2}(v) & =T(T(v))=T(\lambda v)\\ & =\lambda T(v)=\lambda^{2}v.\end{align*}
In general, we can obtain \(T^{n}(v)=\lambda^{n}v\). Using the linearity of \(T\), we can show that \((f(T))(v)=f(\lambda )v\), and the proof is complete. \(\blacksquare\)
For example, if \(f(x)=x^{2}+2x+3\), then \(f(T)=T^{2}+2T+3I\) is a linear operator on \(V\). Therefore, we have
\begin{align*}
(f(T))(v) & =(T^{2}+2T+3I)(v)=T^{2}(v)+2T(v)+3I(v)\\
& =\lambda^{2}v+2\lambda v+3v=(\lambda^{2}+2\lambda +3)v=f(\lambda )v.
\end{align*}
Let \(V\) be a finite-dimensional vector space over the scalar field \({\cal F}\), and let \(T\) be a linear operator on \(V\). Suppose that \(V\) has a basis \(\{v_{1},\cdots ,v_{n}\}\) consisting of eigenvectors of \(T\) with the corresponding eigenvalues \(\{\lambda_{1},\cdots ,\lambda_{n}\}\). Then, the characteristic polynomial of \(T\) is
\[P_{T}(\lambda )=(\lambda -\lambda_{1})\cdots (\lambda -\lambda_{n}).\]
In this case, we define
\[P_{T}(T)=(T-\lambda_{1} I)\cdots (T-\lambda_{n}I).\]
Then, we have \((P_{T}(T))(v_{i})=\theta\) for all \(i=1,\cdots ,n\), since the factors \(T-\lambda_{j}I\) commute with each other and \((T-\lambda_{i}I)(v_{i})=\theta\). This also says that \(P_{T}(T)\) is a zero mapping, i.e., \((P_{T}(T))(v)=\theta\) for all \(v\in V\).
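For example, if
\[A=\left [\begin{array}{cc}
1 & 0\\ 0 & 2
\end{array}\right ],\]
then \(P_{A}(\lambda )=(\lambda -1)(\lambda -2)\) and
\[P_{A}(A)=(A-I_{2})(A-2I_{2})=\left [\begin{array}{cc}
0 & 0\\ 0 & 1
\end{array}\right ]\left [\begin{array}{rr}
-1 & 0\\ 0 & 0
\end{array}\right ]=\left [\begin{array}{cc}
0 & 0\\ 0 & 0
\end{array}\right ].\]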
\begin{equation}{\label{lat38}}\tag{27}\mbox{}\end{equation}
Theorem \ref{lat38}. (Cayley-Hamilton) Let \(V\) be a finite-dimensional vector space over the scalar field \({\cal F}\), and let \(T\) be a linear operator on \(V\). If \(P_{T}\) is the characteristic polynomial for \(T\), then \(P_{T}(T)\) is a zero mapping, i.e., \((P_{T}(T))(v)=\theta\) for all \(v\in V\).
Proof. If \(v=\theta\), then it is obvious that \((P_{T}(T))(v)=\theta\), since \(P_{T}(T)\) is linear. Therefore, we assume that \(v\neq\theta\). Let \(W\) be the \(T\)-cyclic subspace generated by \(v\) with \(\dim (W)=k\). From part (i) of Proposition \ref{lap222}, the set \(\{v,T(v),T^{2}(v),\cdots ,T^{k-1}(v)\}\) is a basis for \(W\). Therefore, there exist \(a_{0},a_{1},a_{2},\cdots ,a_{k-1}\in {\cal F}\) satisfying
\begin{equation}{\label{laeq225}}\tag{28}
a_{0}v+a_{1}T(v)+a_{2}T^{2}(v)+\cdots +a_{k-1}T^{k-1}(v)+T^{k}(v)=\theta .
\end{equation}
By part (ii) of Proposition \ref{lap222}, the characteristic polynomial of \(T_{W}\) is
\begin{equation}{\label{laeq226}}\tag{29}
P_{T_{W}}(\lambda )=a_{0}+a_{1}\lambda +a_{2}\lambda^{2}+\cdots +a_{k-1}\lambda^{k-1}+\lambda^{k}.
\end{equation}
Combining (\ref{laeq225}) and (\ref{laeq226}), we obtain
\[(P_{T_{W}}(T))(v)=\left (a_{0}I+a_{1}T+a_{2}T^{2}+\cdots +a_{k-1}T^{k-1}+T^{k}\right )(v)=\theta .\]
Using Proposition \ref{lap224}, we see that \(P_{T_{W}}\) divides \(P_{T}\), which says that there exists a polynomial \(q\) such that \(P_{T}=q\cdot P_{T_{W}}\). Therefore, we have
\begin{align*} (P_{T}(T))(v) & =(q(T)\circ P_{T_{W}}(T))(v)\\ & =(q(T))((P_{T_{W}}(T))(v))\\ & =(q(T))(\theta )=\theta .\end{align*}
This completes the proof. \(\blacksquare\)
Example. Let \(T\) be a linear operator on \(\mathbb{R}^{2}\) defined by
\[T(x,y)=(x+2y,-2x+y)\]
and let \(\mathfrak{B}\) be the standard basis for \(\mathbb{R}^{2}\). Then, we have
\[[T]_{\mathfrak{B}}=A=\left [\begin{array}{rr}
1 & 2\\ -2 & 1
\end{array}\right ].\]
The characteristic polynomial is given by
\begin{align*} P_{T}(\lambda ) & =\det\left (\lambda I_{2}-A\right )\\ & =\lambda^{2}-2\lambda +5.\end{align*}
The Cayley-Hamilton theorem says that
\[P_{T}(T)=T^{2}-2T+5I\]
is a zero mapping. We can also show
\[P_{A}(A)=P_{T}(A)=A^{2}-2A+5I_{2}\]
is a zero matrix. \(\sharp\)
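Indeed, in the above example, a direct computation gives
\[A^{2}=\left [\begin{array}{rr}
-3 & 4\\ -4 & -3
\end{array}\right ],\]
so that
\[A^{2}-2A+5I_{2}=\left [\begin{array}{rr}
-3-2+5 & 4-4\\ -4+4 & -3-2+5
\end{array}\right ]=\left [\begin{array}{cc}
0 & 0\\ 0 & 0
\end{array}\right ].\]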
Corollary. Let \(A\) be an \(n\times n\) matrix of complex numbers and let \(P_{A}\) be its characteristic polynomial. Then \(P_{A}(A)\) is a zero matrix.
Proof. We can regard \(A\) as a linear mapping from \(\mathbb{C}^{n}\) into itself. Then, the result follows immediately from Theorem \ref{lat38}. \(\blacksquare\)
\begin{equation}{\label{lap109}}\tag{30}\mbox{}\end{equation}
Proposition \ref{lap109}. Let \(V\) be a finite-dimensional vector space over the scalar field \({\cal F}\), and let \(T\) be a linear operator on \(V\). Let \(\lambda_{1},\cdots ,\lambda_{r}\) be the distinct eigenvalues of \(T\), and let \(W_{i}\) be the eigenspace associated with the eigenvalue \(\lambda_{i}\). If \(W=W_{1}+W_{2}+\cdots +W_{r}\), then
\[\dim (W)=\dim (W_{1})+\dim (W_{2})+\cdots +\dim (W_{r}).\]
Moreover, if \(\mathfrak{B}_{i}\) is an ordered basis for \(W_{i}\), then \(\mathfrak{B}=\bigcup_{i=1}^{r}\mathfrak{B}_{i}\) arranged in the order \(\mathfrak{B}_{1},\cdots ,\mathfrak{B}_{r}\) is an ordered basis for \(W\).
Proof. For any \(w_{i}\in W_{i}\), we have \(T(w_{i})=\lambda_{i}w_{i}\). This says that \(W\) is the space spanned by all of the eigenvectors of \(T\), and that \(\mathfrak{B}\) spans \(W\). Therefore, it remains to prove that \(\mathfrak{B}\) is linearly independent. Suppose that \(w_{1}+\cdots +w_{r}=\theta\). We shall show that \(w_{i}=\theta\). Let \(f\) be any polynomial. Proposition \ref{lap107} says
\begin{align*}
\theta & =(f(T))(\theta )=(f(T))\left (w_{1}+\cdots +w_{r}\right )\\
& =(f(T))(w_{1})+\cdots +(f(T))(w_{r})=f(\lambda_{1})w_{1}+\cdots +
f(\lambda_{r})w_{r}.
\end{align*}
In particular, we take polynomials \(f_{1},\cdots ,f_{r}\) satisfying
\[f_{i}(\lambda_{j})=\delta_{ij}=\left\{\begin{array}{ll}
1 & i=j\\ 0 & i\neq j.
\end{array}\right .\]
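Such polynomials exist by Lagrange interpolation; explicitly, we can take
\[f_{i}(\lambda )=\prod_{j\neq i}\frac{\lambda -\lambda_{j}}{\lambda_{i}-\lambda_{j}},\]
which is well-defined since the eigenvalues \(\lambda_{1},\cdots ,\lambda_{r}\) are distinct.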
Then, we obtain
\begin{align*} \theta & =(f_{i}(T))(\theta )=(f_{i}(T))\left (w_{1}+\cdots +w_{r}\right )\\ & =f_{i}(\lambda_{1})w_{1}+\cdots +f_{i}(\lambda_{r})w_{r}
\\ & =\sum_{j=1}^{r}\delta_{ij}w_{j}=w_{i}\end{align*}
for each \(i\). Any linear combination of vectors in \(\mathfrak{B}\) has the form \(b_{1}+\cdots +b_{r}\), where \(b_{i}\) is a linear combination of the vectors in \(\mathfrak{B}_{i}\). Therefore, if \(b_{1}+\cdots +b_{r}=\theta\), then \(b_{i}=\theta\) for all \(i\). Since \(\mathfrak{B}_{i}\) is linearly independent, the scalars (coefficients) appearing in the linear combination \(b_{i}\) must be zero. Therefore, the scalars (coefficients) appearing in the linear combination \(b_{1}+\cdots +b_{r}\) must also be zero. This shows the independence, and the proof is complete. \(\blacksquare\)
\begin{equation}{\label{lat110}}\tag{31}\mbox{}\end{equation}
Theorem \ref{lat110}. Let \(V\) be a finite-dimensional vector space over the scalar field \({\cal F}\), and let \(T\) be a linear operator on \(V\). Let \(\lambda_{1},\cdots ,\lambda_{r}\) be the distinct eigenvalues of \(T\), and let \(W_{i}=\mbox{Ker}(T-\lambda_{i}I)\). Then, the following statements are equivalent.
(a) \(T\) is diagonalizable.
(b) The characteristic polynomial of \(T\) is
\begin{equation}{\label{laeq108}}\tag{32}
P_{T}(\lambda )=(\lambda -\lambda_{1})^{d_{1}}\cdots (\lambda -\lambda_{r})^{d_{r}},
\end{equation}
where \(d_{i}=\dim (W_{i})\) for \(i=1,\cdots ,r\).
(c) We have
\[\dim (V)=\dim (W_{1})+\dim (W_{2})+\cdots +\dim (W_{r}).\]
Proof. We have observed that (a) implies (b). If (\ref{laeq108}) holds, then \(\dim (V)=d_{1}+\cdots +d_{r}\), since the degree of the characteristic polynomial \(P_{T}\) is the dimension of \(V\). This shows that (b) implies (c). Suppose that (c) holds. Proposition \ref{lap109} says that \(\dim (W_{1}+\cdots +W_{r})=\dim (V)\), so \(V=W_{1}+W_{2}+\cdots +W_{r}\), i.e., the eigenvectors of \(T\) span \(V\). Therefore \(T\) is diagonalizable, which shows that (c) implies (a). This completes the proof. \(\blacksquare\)
\begin{equation}{\label{laex137}}\tag{33}\mbox{}\end{equation}
Example \ref{laex137}. Let \(T\) be a linear operator on \(\mathbb{R}^{3}\) which is represented in the standard ordered basis by the matrix
\[A=\left [\begin{array}{rrr}
5 & -6 & -6\\ -1 & 4 & 2\\ 3 & -6 & -4
\end{array}\right ].\]
The characteristic polynomial is given by
\[P_{T}(\lambda )=\left |\begin{array}{ccc}
\lambda -5 & 6 & 6\\ 1 & \lambda -4 & -2\\ -3 & 6 & \lambda +4
\end{array}\right |=(\lambda -2)^{2}(\lambda -1).\]
This says that the eigenvalues are \(\lambda_{1}=1\) and \(\lambda_{2}=2\). Then, we have
\[\lambda_{1}I_{3}-A=I_{3}-A=\left [\begin{array}{rrr}
-4 & 6 & 6\\ 1 & -3 & -2\\ -3 & 6 & 5
\end{array}\right ]\]
and
\[\lambda_{2}I_{3}-A=2I_{3}-A=\left [\begin{array}{rrr}
-3 & 6 & 6\\ 1 & -2 & -2\\ -3 & 6 & 6
\end{array}\right ].\]
It is not difficult to show \(\mbox{rank}(\lambda_{1}I_{3}-A)=2\) and \(\mbox{rank}(\lambda_{2}I_{3}-A)=1\), i.e., \(\mbox{rank}(\lambda_{1}I-T)=2\) and \(\mbox{rank}(\lambda_{2}I-T)=1\). Since
\[3=\dim\mathbb{R}^{3}=\mbox{rank}(\lambda I-T)+\mbox{nullity}(\lambda I-T),\]
we have \(\dim (W_{1})=1\) and \(\dim (W_{2})=2\). Using Theorem \ref{lat110}, it follows that \(T\) is diagonalizable. By solving
\begin{align*} (\lambda_{1}I_{3}-A)X & =\left [\begin{array}{rrr}
-4 & 6 & 6\\ 1 & -3 & -2\\ -3 & 6 & 5
\end{array}\right ]\left [\begin{array}{c}
x_{1}\\ x_{2}\\ x_{3}
\end{array}\right ]\\ & =\left [\begin{array}{c}
0\\ 0\\ 0
\end{array}\right ],\end{align*}
we can obtain the eigenvector \(v_{1}=(3,-1,3)\). Similarly, by solving
\begin{align*} (\lambda_{2}I_{3}-A)X & =\left [\begin{array}{rrr}
-3 & 6 & 6\\ 1 & -2 & -2\\ -3 & 6 & 6
\end{array}\right ]\left [\begin{array}{c}
x_{1}\\ x_{2}\\ x_{3}
\end{array}\right ]\\ & =\left [\begin{array}{c}
0\\ 0\\ 0
\end{array}\right ],\end{align*}
we can obtain the eigenvectors \(v_{2}=(2,1,0)\) and \(v_{3}=(2,0,1)\). If we take \(\mathfrak{B}=\{v_{1},v_{2},v_{3}\}\) as the ordered basis for \(\mathbb{R}^{3}\), then \([T]_{\mathfrak{B}}\) is a diagonal matrix
\[D=\left [\begin{array}{rrr}
1 & 0 & 0\\ 0 & 2 & 0\\ 0 & 0 & 2
\end{array}\right ].\]
If we take the eigenvectors as the columns of the matrix
\[P=\left [\begin{array}{rrr}
3 & 2 & 2\\ -1 & 1 & 0\\ 3 & 0 & 1
\end{array}\right ],\]
then \(P^{-1}AP=D\). \(\sharp\)
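As a quick check (a direct computation added here for verification), one can confirm \(AP=PD\) column by column:
\[AP=\left [\begin{array}{rrr}
3 & 4 & 4\\ -1 & 2 & 0\\ 3 & 0 & 2
\end{array}\right ]=PD,\]
so that \(P^{-1}AP=D\) follows without computing \(P^{-1}\) explicitly.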
Let \(V\) be a finite-dimensional vector space over the scalar field \({\cal F}\) with \(\dim (V)=n\), and let \(T\) be a linear operator on \(V\). Let \(W\) be the subspace spanned by all of the eigenvectors of \(T\). Let \(\lambda_{1},\cdots ,\lambda_{s}\) be the distinct eigenvalues of \(T\). For each \(i\), let \(W_{i}\) be the subspace of all eigenvectors associated with the eigenvalue \(\lambda_{i}\), and let \(\mathfrak{B}_{W_{i}}\) be an ordered basis for \(W_{i}\). Proposition \ref{lap109} says that \(\mathfrak{B}_{W}=\bigcup_{i=1}^{s}\mathfrak{B}_{W_{i}}\) is an ordered basis for \(W\) and
\[r=\dim (W)=\dim (W_{1})+\dim (W_{2})+\cdots +\dim (W_{s}).\]
Let \(\mathfrak{B}_{W}=\{v_{1},\cdots ,v_{r}\}\) such that the first few \(v\)’s form the basis \(\mathfrak{B}_{W_{1}}\), the next few \(v\)’s form the basis \(\mathfrak{B}_{W_{2}}\), and so on. Let \(r_{i}=\dim (W_{i})\). We also have
\begin{align}
T(v_{j}) & =\lambda_{1}v_{j}\mbox{ for }j=1,\cdots ,r_{1}\nonumber\\
T(v_{j+r_{1}}) & =\lambda_{2}v_{j+r_{1}}\mbox{ for }j=1,\cdots ,r_{2}\nonumber\\
T(v_{j+r_{1}+r_{2}}) & =\lambda_{3}v_{j+r_{1}+r_{2}}\mbox{ for }j=1,\cdots ,r_{3}\nonumber\\
& \ \ \vdots\nonumber\\
T(v_{j+r_{1}+\cdots +r_{s-1}}) & =\lambda_{s}v_{j+r_{1}+\cdots +r_{s-1}}\mbox{ for }j=1,\cdots ,r_{s}.\label{laeq116}\tag{34}
\end{align}
For any \(w\in W\), we have
\[w=\alpha_{1}v_{1}+\cdots +\alpha_{r}v_{r}, \]
which implies
\begin{align*} T(w) & =\sum_{j=1}^{r}\alpha_{j}T(v_{j})\\ & =\beta_{1}v_{1}+\cdots +\beta_{r}v_{r}\in W\end{align*}
for some \(\beta_{i}\in {\cal F}\) by (\ref{laeq116}). This says that the subspace \(W\) is \(T\)-invariant. Now, we take any other vectors \(v_{r+1},\cdots ,v_{n}\) such that \(\mathfrak{B}_{V}=\{v_{1},\cdots ,v_{n}\}\) is an ordered basis for \(V\). Then \([T]_{\mathfrak{B}_{V}}\) has the block form shown in (\ref{laeq114}) and \(B=[T_{W}]_{\mathfrak{B}_{W}}\) is a diagonal matrix given by
\begin{align*} B & =[T_{W}]_{\mathfrak{B}_{W}}\\ & =\left [\begin{array}{cccc}
B_{1} & {\bf 0} & \cdots & {\bf 0}\\
{\bf 0} & B_{2} & \cdots & {\bf 0}\\
\vdots & \vdots & \ddots & \vdots\\
{\bf 0} & {\bf 0} & \cdots & B_{s}
\end{array}\right ],\end{align*}
where \(B_{i}\) is the \(r_{i}\times r_{i}\) diagonal matrix whose diagonal entries are all equal to \(\lambda_{i}\). The characteristic polynomial for the restriction \(T_{W}\) is given by
\[P_{T_{W}}(\lambda )=(\lambda -\lambda_{1})^{r_{1}}\cdot (\lambda -\lambda_{2})^{r_{2}}\cdots (\lambda -\lambda_{s})^{r_{s}}.\]
From Theorem \ref{lat110}, it follows that \(T\) is diagonalizable if and only if \(r=n\), i.e., \(r_{1}+\cdots +r_{s}=n\).
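For instance (an illustrative case added here), with \(s=2\), \(r_{1}=2\) and \(r_{2}=1\), the matrix \(B\) reads
\[B=\left [\begin{array}{ccc}
\lambda_{1} & 0 & 0\\ 0 & \lambda_{1} & 0\\ 0 & 0 & \lambda_{2}
\end{array}\right ],\]
and \(P_{T_{W}}(\lambda )=(\lambda -\lambda_{1})^{2}(\lambda -\lambda_{2})\).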
\begin{equation}{\label{d}}\tag{D}\mbox{}\end{equation}
Minimal Polynomials.
The Cayley-Hamilton Theorem \ref{lat38} says that, for any linear operator \(T\) on a finite-dimensional vector space \(V\) over the scalar field \({\cal F}\), its characteristic polynomial \(P_{T}\) satisfies that \(P_{T}(T)\) is a zero mapping. It is possible that there exists another polynomial \(p\) with \(\deg p<\deg P_{T}\) such that \(p(T)\) is also a zero mapping. Therefore, we propose the following definition.
\begin{equation}{\label{lad111}}\tag{35}\mbox{}\end{equation}
Definition \ref{lad111}. Let \(V\) be a finite-dimensional vector space over the scalar field \({\cal F}\), and let \(T\) be a linear operator on \(V\). We say that \(p\) is a minimal polynomial for \(T\) when the following conditions are satisfied.
- \(p\) is a monic polynomial (i.e., the leading coefficient is \(1\)) over the scalar field \({\cal F}\).
- \(p(T)\) is a zero mapping.
- If \(f\) is another nonzero polynomial such that \(f(T)\) is a zero mapping, then \(\deg f\geq\deg p\). \(\sharp\)
We can show that if another monic polynomial \(\widehat{p}\) satisfies the above three conditions, then \(\widehat{p}=p\). This shows the uniqueness of the minimal polynomial, and says that Definition \ref{lad111} is well-defined. The minimal polynomial of a given matrix \(A\) is defined to be the minimal polynomial of its corresponding linear operator \(T_{A}\).
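A sketch of the uniqueness argument: applying the third condition in both directions gives \(\deg\widehat{p}=\deg p\). If \(\widehat{p}\neq p\), then \(g=p-\widehat{p}\) is a nonzero polynomial whose leading terms cancel, since both polynomials are monic of the same degree, so
\[\deg g<\deg p\quad\mbox{and}\quad g(T)=p(T)-\widehat{p}(T)\mbox{ is a zero mapping}.\]
Dividing \(g\) by its leading coefficient produces a monic polynomial of degree less than \(\deg p\) that annihilates \(T\), contradicting the minimality of \(\deg p\). Hence \(\widehat{p}=p\).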
\begin{equation}{\label{lar113}}\tag{36}\mbox{}\end{equation}
Remark \ref{lar113}. Let \(V\) be a finite-dimensional vector space over the scalar field \({\cal F}\), and let \(T\) be a linear operator on \(V\). The minimal polynomial \(p\) divides the characteristic polynomial \(P_{T}\) for \(T\). Indeed, the division algorithm gives \(P_{T}=qp+s\) with \(s=0\) or \(\deg s<\deg p\). Since \(P_{T}(T)\) is a zero mapping by the Cayley-Hamilton Theorem \ref{lat38} and \(p(T)\) is a zero mapping by definition, \(s(T)\) is also a zero mapping, and the minimality of \(\deg p\) forces \(s=0\).
\begin{equation}{\label{lap112}}\tag{37}\mbox{}\end{equation}
Proposition \ref{lap112}. Let \(V\) be a finite-dimensional vector space over the scalar field \({\cal F}\), and let \(T\) be a linear operator on \(V\). The characteristic and minimal polynomials for \(T\) have the same roots except for multiplicities.
Proof. Let \(p\) be a minimal polynomial for \(T\). We shall show that \(p(\lambda )=0\) if and only if \(\lambda\) is an eigenvalue of \(T\). Suppose that \(p(\lambda )=0\). Then \(p(x)=(x-\lambda )q(x)\), where \(q(x)\) is a polynomial with \(\deg q<\deg p\). The definition of minimal polynomial says that \(q(T)\) cannot be a zero mapping. Therefore there exists \(v_{0}\in V\) such that \((q(T))(v_{0})\neq\theta\). We write \(v_{1}=(q(T))(v_{0})\). Then, we have
\begin{align*} \theta & =(p(T))(v_{0})\\ & =((T-\lambda I)q(T))(v_{0})\\ & =(T-\lambda I)(v_{1}),\end{align*}
which says that \(\lambda\) is an eigenvalue of \(T\). For the converse, suppose that \(T(v)=\lambda v\) for some \(v\neq\theta\) and \(\lambda\in {\cal F}\). We have \((p(T))(v)=p(\lambda )v\). Since \(p(T)\) is a zero mapping and \(v\neq\theta\), we must have \(p(\lambda )=0\). This completes the proof. \(\blacksquare\)
Example. We consider the following matrix
\[A=\left [\begin{array}{rrr}
3 & -1 & 0\\ 0 & 2 & 0\\ 1 & -1 & 2
\end{array}\right ].\]
Its characteristic polynomial is given by
\begin{align*} P_{A}(\lambda ) & =\det (\lambda I_{3}-A)\\ & =\det\left [\begin{array}{rrr}
\lambda -3 & 1 & 0\\ 0 & \lambda -2 & 0\\ -1 & 1 & \lambda -2
\end{array}\right ]\\ & =(\lambda -2)^{2}\cdot (\lambda -3).\end{align*}
Using Proposition \ref{lap112}, the minimal polynomial must be either \((\lambda -2)\cdot (\lambda -3)\) or \((\lambda -2)^{2}\cdot (\lambda -3)\). Since \((A-2I_{3})(A-3I_{3})\) is the zero matrix, we conclude that the minimal polynomial is \(p(\lambda )=(\lambda -2)\cdot (\lambda -3)\). \(\sharp\)
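The verification is a direct computation:
\[(A-2I_{3})(A-3I_{3})=\left [\begin{array}{rrr}
1 & -1 & 0\\ 0 & 0 & 0\\ 1 & -1 & 0
\end{array}\right ]\left [\begin{array}{rrr}
0 & -1 & 0\\ 0 & -1 & 0\\ 1 & -1 & -1
\end{array}\right ]={\bf 0}.\]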
Example. Let \(T\) be a linear operator on \(\mathbb{R}^{2}\) defined by
\[T(x,y)=(2x+5y,6x+y).\]
Let \(\mathfrak{B}\) be the standard basis for \(\mathbb{R}^{2}\). Then, we have
\[A=[T]_{\mathfrak{B}}=\left [\begin{array}{cc}
2 & 5\\ 6 & 1\end{array}\right ].\]
Its characteristic polynomial is given by
\begin{align*} P_{T}(\lambda ) & =\det (\lambda I_{2}-A)\\ & =\det\left [\begin{array}{cc}
\lambda -2 & -5\\ -6 & \lambda -1\end{array}\right ]\\ & =(\lambda -7)\cdot (\lambda +4).\end{align*}
Using Proposition \ref{lap112}, the minimal polynomial has both \(7\) and \(-4\) as roots; being monic of degree at most \(2\), it must be \((\lambda -7)\cdot (\lambda +4)\). \(\sharp\)
\begin{equation}{\label{laex246}}\tag{38}\mbox{}\end{equation}
Example \ref{laex246}. Let \(V\) be the vector space consisting of all polynomials with degree less than or equal to \(2\) and coefficients in \(\mathbb{R}\), and let \(T\) be a linear operator on \(V\) defined by \(T(f(x))=f'(x)\). Let \(\mathfrak{B}\) be the standard basis for \(V\). Then, we have
\[[T]_{\mathfrak{B}}=\left [\begin{array}{rrr}
0 & 1 & 0\\ 0 & 0 & 2\\ 0 & 0 & 0
\end{array}\right ].\]
Its characteristic polynomial is given by \(P_{T}(\lambda )=\lambda^{3}\). Therefore, the minimal polynomial must be one of \(t\), \(t^{2}\) and \(t^{3}\). Since \(T^{2}(x^{2})=2\neq 0\), it follows that the minimal polynomial of \(T\) must be \(t^{3}\). \(\sharp\)
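In matrix form, writing \(M=[T]_{\mathfrak{B}}\), the same conclusion follows from the powers of \(M\):
\[M^{2}=\left [\begin{array}{rrr}
0 & 0 & 2\\ 0 & 0 & 0\\ 0 & 0 & 0
\end{array}\right ]\neq {\bf 0}\quad\mbox{and}\quad M^{3}={\bf 0}.\]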
Proposition. Let \(V\) be a finite-dimensional vector space over the scalar field \({\cal F}\), and let \(T\) be a linear operator on \(V\). Let \(W\) be a \(T\)-invariant subspace of \(V\). The minimal polynomial for the restriction operator \(T_{W}\) divides the minimal polynomial for \(T\).
Proof. Let \(\dim (V)=n\) and \(\dim (W)=r\). We consider the \(k\)th power of the block matrix \(A\) given in (\ref{laeq114}), which is
\[A^{k}=\left [\begin{array}{cc}
B^{k} & E_{k}\\ {\bf 0} & D^{k}
\end{array}\right ],\]
where \(E_{k}\) is some \(r\times (n-r)\) matrix depending on \(k\). Therefore, if a polynomial \(p\) satisfies \(p(A)={\bf 0}\), the zero matrix, then \(p(B)\) is also a zero matrix. In particular, the minimal polynomial for \(T\) annihilates \(T_{W}\), and hence the minimal polynomial for \(T_{W}\) divides it. This completes the proof. \(\blacksquare\)
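In more detail (an expansion of the last step), writing \(p(t)=\sum_{k}c_{k}t^{k}\) and summing the block formula above entrywise gives
\[p(A)=\left [\begin{array}{cc}
p(B) & E\\ {\bf 0} & p(D)
\end{array}\right ]\]
for some \(r\times (n-r)\) matrix \(E\), so \(p(A)={\bf 0}\) forces \(p(B)={\bf 0}\).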
Example. Let \(T\) be a linear operator on \(\mathbb{R}^{4}\) defined by
\[T\left (x_{1},x_{2},x_{3},x_{4}\right )=\left (x_{1}+x_{2}+2x_{3}-x_{4},x_{2}+x_{4},2x_{3}-x_{4},x_{3}+x_{4}\right ).\]
Let \(W\) be a subspace of \(\mathbb{R}^{4}\) defined by
\[W=\left\{\left (x_{1},x_{2},0,0\right ):x_{1},x_{2}\in\mathbb{R}\right\}.\]
Since
\[T\left (x_{1},x_{2},0,0\right )=\left (x_{1}+x_{2},x_{2},0,0\right )\in W,\]
it follows that \(W\) is \(T\)-invariant. We also see that
\[\mathfrak{B}_{W}=\left\{{\bf e}_{1},{\bf e}_{2}\right\}\]
is an ordered basis for \(W\). Let
\[\mathfrak{B}_{V}=\left\{{\bf e}_{1},{\bf e}_{2},{\bf e}_{3},{\bf e}_{4}\right\}\]
be an ordered basis for \(\mathbb{R}^{4}\). Then, we have
\[B=[T_{W}]_{\mathfrak{B}_{W}}=\left [\begin{array}{cc}
1 & 1\\ 0 & 1
\end{array}\right ]\]
and
\[A=[T]_{\mathfrak{B}_{V}}=\left [\begin{array}{rrrr}
1 & 1 & 2 & -1\\ 0 & 1 & 0 & 1\\ 0 & 0 & 2 & -1\\ 0 & 0 & 1 & 1
\end{array}\right ].\]
Finally, we can obtain
\begin{align*}
P_{T}(\lambda ) & =\det\left (A-\lambda I_{4}\right )\\
& =\det\left [\begin{array}{cccc}
1-\lambda & 1 & 2 & -1\\ 0 & 1-\lambda & 0 & 1\\ 0 & 0 & 2-\lambda & -1\\
0 & 0 & 1 & 1-\lambda
\end{array}\right ]\\
& =\det\left [\begin{array}{cc}
1-\lambda & 1\\ 0 & 1-\lambda
\end{array}\right ]\cdot\det\left [\begin{array}{cc}
2-\lambda & -1\\ 1 & 1-\lambda
\end{array}\right ]\\
& =P_{T_{W}}(\lambda )\cdot\det\left [\begin{array}{cc}
2-\lambda & -1\\ 1 & 1-\lambda
\end{array}\right ]\\
& =P_{T_{W}}(\lambda )\cdot\left (\lambda^{2}-3\lambda +3\right ).
\end{align*}
This shows explicitly that \(P_{T_{W}}\) divides \(P_{T}\). \(\sharp\)
\begin{equation}{\label{lat245}}\tag{39}\mbox{}\end{equation}
Theorem \ref{lat245}. Let \(V\) be a finite-dimensional vector space over the scalar field \({\cal F}\) with \(\dim (V)=n\), and let \(T\) be a linear operator on \(V\). Then \(T\) is diagonalizable if and only if the minimal polynomial of \(T\) is of the form
\[p(t)=\left (t-\lambda_{1}\right )\left (t-\lambda_{2}\right )\cdots\left (t-\lambda_{r}\right ),\]
where \(r\leq n\) and \(\lambda_{1},\cdots ,\lambda_{r}\) are distinct eigenvalues of \(T\).
Proof. Suppose that \(T\) is diagonalizable. Then, there is a basis \(\mathfrak{B}=\{v_{1},\cdots ,v_{n}\}\) for \(V\) which consists of eigenvectors of \(T\). Let \(\lambda_{1},\cdots ,\lambda_{r}\) be the distinct eigenvalues of \(T\), and define
\[p(t)=\left (t-\lambda_{1}\right )\left (t-\lambda_{2}\right )\cdots\left (t-\lambda_{r}\right ).\]
From Proposition \ref{lap112}, we see that \(p\) and the minimal polynomial have the same roots. Since the roots \(\lambda_{1},\cdots ,\lambda_{r}\) of \(p\) are distinct, it follows that \(p\) divides the minimal polynomial. In order to show that \(p\) is the minimal polynomial, it remains to show that \(p(T)\) is a zero mapping. For any \(v_{i}\in\mathfrak{B}\), we have \((T-\lambda_{j}I)(v_{i})=\theta\) for some \(\lambda_{j}\). Since \(p\) can be written as \(p(t)=q_{j}(t)\cdot (t-\lambda_{j})\) for some polynomial \(q_{j}\), we have
\begin{align*} (p(T))(v_{i}) & =(q_{j}(T))((T-\lambda_{j}I)(v_{i}))\\ & =(q_{j}(T))(\theta )=\theta .\end{align*}
This shows that \(p(T)\) is indeed a zero mapping. Since \(p\) is monic, annihilates \(T\), and divides the minimal polynomial, the minimality of the degree forces \(p\) to be the minimal polynomial of \(T\).
Conversely, suppose that there exist distinct scalars \(\lambda_{1},\cdots ,\lambda_{r}\) such that the minimal polynomial of \(T\) is
\[p(t)=\left (t-\lambda_{1}\right )\left (t-\lambda_{2}\right )\cdots\left (t-\lambda_{r}\right ).\]
Proposition \ref{lap112} says that \(\lambda_{1},\cdots ,\lambda_{r}\) are exactly the eigenvalues of \(T\). We are going to prove that \(T\) is diagonalizable by mathematical induction on \(n=\dim (V)\). It is obvious that \(T\) is diagonalizable for \(n=1\). For \(n>1\), we assume that the statement holds for every space of dimension less than \(n\). Now, we consider \(\dim (V)=n\) and \(W=\mbox{Im}(T-\lambda_{r}I)\). Since \(\lambda_{r}\) is an eigenvalue of \(T\), the operator \(T-\lambda_{r}I\) is not injective, and the rank-nullity theorem gives \(W\neq V\). If \(W=\{\theta\}\), then \(T=\lambda_{r}I\), which says that \(T\) is diagonalizable. Therefore, we assume \(0<\dim (W)<n\). It is obvious that \(W\) is \(T\)-invariant. Let \(T_{W}\) be the restriction of \(T\) to \(W\). Since \(p\) is the minimal polynomial, for any \(v\in V\), we have
\begin{align*}
\theta & =(p(T))(v)\\ & =((T-\lambda_{1}I)(T-\lambda_{2}I)\cdots (T-\lambda_{r-1}I)(T-\lambda_{r}I))(v)\\
& =((T-\lambda_{1}I)(T-\lambda_{2}I)\cdots (T-\lambda_{r-1}I))(x),
\end{align*}
where \(x=(T-\lambda_{r}I)(v)\in W\). Since every \(x\in W\) is of this form for some \(v\in V\), this shows that, for any \(x\in W\), we have
\[((T-\lambda_{1}I)(T-\lambda_{2}I)\cdots (T-\lambda_{r-1}I))(x)=\theta ,\]
which says that the minimal polynomial of \(T_{W}\) divides the following polynomial
\begin{equation}{\label{laeq258}}\tag{40}
\left (t-\lambda_{1}\right )\left (t-\lambda_{2}\right )\cdots\left (t-\lambda_{r-1}\right ).
\end{equation}
Therefore, the minimal polynomial of \(T_{W}\) is a product of distinct linear factors taken from (\ref{laeq258}). By the induction hypothesis, the linear operator \(T_{W}\) must be diagonalizable. Proposition \ref{lap112} also says that \(\lambda_{r}\) cannot be an eigenvalue of \(T_{W}\). Therefore, we obtain
\begin{equation}{\label{laeq259}}\tag{41}
W\cap\mbox{Ker}(T-\lambda_{r}I)=\{\theta\}.
\end{equation}
Let
\[\mathfrak{B}_{1}=\left\{w_{1},\cdots ,w_{m}\right\}\]
be a basis for \(W\) that consists of eigenvectors of \(T_{W}\), which are also eigenvectors of \(T\). Let
\[\mathfrak{B}_{2}=\left\{v_{1},\cdots ,v_{p}\right\}\]
be a basis for \(\mbox{Ker}(T-\lambda_{r}I)\). Then \(\mathfrak{B}_{1}\) and \(\mathfrak{B}_{2}\) are disjoint by (\ref{laeq259}). By considering \(T-\lambda_{r}I\) and using Theorem \ref{lat71}, we also have \(n=m+p\). Next, we shall claim that \(\mathfrak{B}=\mathfrak{B}_{1}\cup\mathfrak{B}_{2}\) is linearly independent. Suppose that the scalars \(\alpha_{1},\cdots ,\alpha_{m}\) and \(\beta_{1},\cdots ,\beta_{p}\) satisfy
\[\alpha_{1}w_{1}+\cdots +\alpha_{m}w_{m}+\beta_{1}v_{1}+\cdots +\beta_{p}v_{p}=\theta .\]
Let
\[x=\alpha_{1}w_{1}+\cdots +\alpha_{m}w_{m}\]
and
\[y=\beta_{1}v_{1}+\cdots +\beta_{p}v_{p}.\]
Then \(x\in W\) and \(y\in \mbox{Ker}(T-\lambda_{r}I)\) satisfy \(x+y=\theta\). It says that \(x=-y\in W\cap\mbox{Ker}(T-\lambda_{r}I)\). Therefore, we have \(x=\theta\), which implies \(\alpha_{i}=0\) for \(i=1,\cdots ,m\), since \(\mathfrak{B}_{1}\) is linearly independent. Similarly, \(y=-x=\theta\), which implies \(\beta_{j}=0\) for \(j=1,\cdots ,p\), since \(\mathfrak{B}_{2}\) is linearly independent. Therefore, we conclude that \(\mathfrak{B}\) is a linearly independent subset of \(V\) consisting of \(n=\dim (V)\) eigenvectors of \(T\). It means that \(\mathfrak{B}\) is a basis for \(V\) consisting of eigenvectors of \(T\). Therefore, the linear operator \(T\) is diagonalizable. This completes the proof. \(\blacksquare\)
Example. Continued from Example \ref{laex246}, since the minimal polynomial is \(p(t)=t^{3}\), Theorem \ref{lat245} says that the differential operator \(T\) is not diagonalizable. \(\sharp\)
Example. Let \(V\) be the vector space consisting of all \(2\times 2\) matrices over \(\mathbb{R}\). We shall determine all matrices \(A\in V\) satisfying \(A^{2}-3A+2I_{2}={\bf 0}\). Let
\[f(t)=t^{2}-3t+2=(t-1)(t-2).\]
Since \(f(A)={\bf 0}\), the minimal polynomial \(p\) of \(A\) divides \(f\). Therefore, the possible minimal polynomials of \(A\) are \(t-1\), \(t-2\) or \((t-1)(t-2)\). If \(p(t)=t-1\) or \(p(t)=t-2\), then \(A=I_{2}\) or \(A=2I_{2}\), respectively. If \(p(t)=(t-1)(t-2)\), then \(A\) is diagonalizable by Theorem \ref{lat245} with eigenvalues \(1\) and \(2\). Therefore, the matrix \(A\) must be similar to
\[D=\left [\begin{array}{cc}
1 & 0\\ 0 & 2
\end{array}\right ].\]
In other words, we have \(A=PDP^{-1}\) for some invertible \(2\times 2\) matrix \(P\). \(\sharp\)
Example. Let \(A\) be an \(n\times n\) matrix over \(\mathbb{R}\) satisfying \(A^{3}=A\). We are going to show that \(A\) is diagonalizable. Let
\[f(t)=t^{3}-t=t(t+1)(t-1).\]
Then \(f(A)={\bf 0}\). Therefore, the minimal polynomial \(p\) of \(A\) divides \(f\). Since \(f\) has no repeated roots, \(p\) must be a product of distinct linear factors. This shows that \(A\) is diagonalizable by Theorem \ref{lat245}. \(\sharp\)
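For a concrete instance (an added illustration), the matrix
\[A=\left [\begin{array}{cc}
0 & 1\\ 1 & 0
\end{array}\right ]\]
satisfies \(A^{2}=I_{2}\), and hence \(A^{3}=A\). Its minimal polynomial is \((t-1)(t+1)\), a proper divisor of \(f\), and \(A\) is indeed diagonalizable with eigenvalues \(1\) and \(-1\).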
\begin{equation}{\label{e}}\tag{E}\mbox{}\end{equation}
Triangulation.
Definition. Let \(V\) be an \(n\)-dimensional vector space over the scalar field \({\cal F}\), and let \(T\) be a linear operator on \(V\). A finite sequence \(\{V_{1},\cdots ,V_{n}\}\) of subspaces of \(V\) is called a fan of \(T\) when \(V_{i}\subseteq V_{i+1}\) for \(i=1,\cdots ,n-1\), \(\dim (V_{i})=i\) for \(i=1,\cdots ,n\), and each \(V_{i}\) is \(T\)-invariant. A finite sequence \(\{v_{1},\cdots ,v_{n}\}\) in \(V\) is called a fan basis for this fan if and only if \(\{v_{1},\cdots ,v_{i}\}\) is a basis of \(V_{i}\) for \(i=1,\cdots ,n\). \(\sharp\)
Given a fan, a fan basis always exists. For example, let \(v_{1}\) be a basis of \(V_{1}\). We can extend it to a basis \(\{v_{1},v_{2}\}\) of \(V_{2}\), and then to a basis \(\{v_{1},v_{2},v_{3}\}\) of \(V_{3}\). Proceeding inductively, we finally obtain a basis \(\{v_{1},\cdots ,v_{n}\}\) of \(V_{n}=V\) such that \(\{v_{1},\cdots ,v_{i}\}\) is a basis of \(V_{i}\) for each \(i\).
Proposition. Let \(\{v_{1},\cdots ,v_{n}\}\) be a fan basis for a fan \(\{V_{1},\cdots ,V_{n}\}\) of a linear operator \(T\). Then, the matrix associated with \(T\) with respect to this fan basis is an upper triangular matrix.
Proof. Since \(T(V_{i})\subseteq V_{i}\) for each \(i=1,\cdots ,n\), there exists \(a_{ij}\in {\cal F}\) satisfying
\begin{align*}
T(v_{1}) & =a_{11}v_{1}\\
T(v_{2}) & =a_{12}v_{1}+a_{22}v_{2}\\
& \vdots\\
T(v_{j}) & =a_{1j}v_{1}+a_{2j}v_{2}+\cdots +a_{jj}v_{j}\\
& \vdots\\
T(v_{n}) & =a_{1n}v_{1}+a_{2n}v_{2}+\cdots +a_{nn}v_{n},
\end{align*}
which means that the matrix associated with \(T\) with respect to this fan basis is the triangular matrix
\[\left [\begin{array}{cccc}
a_{11} & a_{12} & \cdots & a_{1n}\\
0 & a_{22} & \cdots & a_{2n}\\
\vdots & \vdots & \ddots & \vdots\\
0 & 0 & \cdots & a_{nn}
\end{array}\right ].\]
This completes the proof. \(\blacksquare\)
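As a small illustration (an added example), let \(T\) be the linear operator on \(\mathbb{R}^{2}\) defined by \(T(x,y)=(x+y,y)\). Then \(V_{1}=\mbox{span}\{{\bf e}_{1}\}\) and \(V_{2}=\mathbb{R}^{2}\) form a fan of \(T\), since \(T({\bf e}_{1})={\bf e}_{1}\in V_{1}\). The fan basis \(\{{\bf e}_{1},{\bf e}_{2}\}\) gives the upper triangular matrix
\[[T]_{\{{\bf e}_{1},{\bf e}_{2}\}}=\left [\begin{array}{cc}
1 & 1\\ 0 & 1
\end{array}\right ].\]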
Definition. The triangulation of linear mappings and matrices is defined below.
- Let \(V\) be a finite-dimensional vector space over the scalar field \({\cal F}\), and let \(T\) be a linear operator on \(V\). We say that \(T\) is triangulable when there exists an ordered basis \(\mathfrak{B}\) of \(V\) such that the associated matrix \([T]_{\mathfrak{B}}\) of \(T\) is a triangular matrix.
- Let \(A\) be an \(n\times n\) matrix over the scalar field \({\cal F}\). We say that \(A\) is triangulable when there exists a non-singular matrix \(B\) over \({\cal F}\) such that \(B^{-1}AB\) is a triangular matrix. \(\sharp\)
We see that if the matrix \(A\) is triangulable, then its corresponding linear mapping \(T_{A}:{\cal F}^{n}\rightarrow {\cal F}^{n}\) is triangulable.
Theorem. Let \(V\) be a finite-dimensional vector space over the complex numbers with \(\dim (V)\geq 1\). If \(T:V\rightarrow V\) is a linear mapping, then there exists a fan of \(T\) in \(V\).
Corollary. Let \(V\) be a finite-dimensional vector space over the complex numbers with \(\dim (V)\geq 1\). If \(T:V\rightarrow V\) is a linear mapping, then there exists a basis of \(V\) such that the associated matrix of \(T\) is a triangular matrix with respect to this basis.
Corollary. Let \(A\) be a matrix of complex numbers. Then there exists a non-singular matrix \(B\) such that \(B^{-1}AB\) is a triangular matrix.
Theorem. Let \(V\) be a finite-dimensional vector space over the complex numbers with \(\dim (V)\geq 1\). Suppose that \(V\) is endowed with a positive Hermitian product. Let \(T:V\rightarrow V\) be a unitary mapping. Then there exists an orthonormal basis of \(V\) consisting of eigenvectors of \(T\).
Corollary. Let \(A\) be a complex unitary matrix. Then there exists a unitary matrix \(U\) such that \(U^{-1}AU\) is a diagonal matrix.
Proposition. Let \(V\) be a finite-dimensional vector space over the scalar field \({\cal F}\), and let \(T\) be a linear operator on \(V\) such that the minimal polynomial for \(T\) has the form
\[p(x)=\left (x-c_{1}\right )^{d_{1}}\cdot\left (x-c_{2}\right )^{d_{2}}\cdots\left (x-c_{r}\right )^{d_{r}},\]
where \(c_{i}\in {\cal F}\) for \(i=1,\cdots ,r\). Let \(W\) be a \(T\)-invariant proper subspace of \(V\) \((W\neq V)\). Then, there exists a vector \(v\in V\) satisfying \(v\not\in W\) and \((T-\lambda I)(v)\in W\) for some eigenvalue \(\lambda\) of \(T\).
Theorem. Let \(V\) be a finite-dimensional vector space over the scalar field \({\cal F}\), and let \(T\) be a linear operator on \(V\). Then \(T\) is triangulable if and only if the minimal polynomial for \(T\) has the form
\[p(x)=\left (x-c_{1}\right )^{d_{1}}\cdot\left (x-c_{2}\right )^{d_{2}}\cdots\left (x-c_{r}\right )^{d_{r}},\]
where \(c_{i}\in {\cal F}\) for \(i=1,\cdots ,r\).
\begin{equation}{\label{lat124}}\tag{42}\mbox{}\end{equation}
Theorem \ref{lat124}. Let \(V\) be a finite-dimensional vector space over the scalar field \({\cal F}\), and let \(T\) be a linear operator on \(V\). Then \(T\) is diagonalizable if and only if the minimal polynomial for \(T\) has the form
\[p(x)=\left (x-c_{1}\right )\cdot\left (x-c_{2}\right )\cdots\left (x-c_{r}\right ),\]
where \(c_{1},\cdots ,c_{r}\) are distinct elements in \({\cal F}\).
Let \(V\) be a finite-dimensional vector space over the scalar field \({\cal F}\), and let \(T\) be a linear operator on \(V\). Suppose that the characteristic polynomial for \(T\) can be factored as
\[P_{T}(\lambda )=\left (\lambda -\lambda_{1}\right )^{d_{1}}\cdot\left (\lambda -\lambda_{2}\right )^{d_{2}}\cdots
\left (\lambda -\lambda_{r}\right )^{d_{r}}.\]
We have two different methods for determining whether or not \(T\) is diagonalizable.
- (Method 1). For each \(i\), we need to see whether we can find \(d_{i}\) independent eigenvectors associated with the eigenvalue \(\lambda_{i}\). If this is true, then these eigenvectors can form an ordered basis \(\mathfrak{B}\) such that \([T]_{\mathfrak{B}}\) is a diagonal matrix and \(\lambda_{i}\) will repeat \(d_{i}\) times in the diagonal of \([T]_{\mathfrak{B}}\).
- (Method 2). We need to check whether or not
\[(T-\lambda_{1}I)(T-\lambda_{2}I)\cdots (T-\lambda_{r}I)\]
is a zero mapping; a worked check follows this list.
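For example, applying Method 2 to the matrix \(A\) of Example \ref{laex137}, whose distinct eigenvalues are \(1\) and \(2\), a direct computation gives
\[(A-I_{3})(A-2I_{3})=\left [\begin{array}{rrr}
4 & -6 & -6\\ -1 & 3 & 2\\ 3 & -6 & -5
\end{array}\right ]\left [\begin{array}{rrr}
3 & -6 & -6\\ -1 & 2 & 2\\ 3 & -6 & -6
\end{array}\right ]={\bf 0},\]
which confirms again that \(T\) is diagonalizable.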
Definition. Let \(V\) be a vector space over a scalar field \({\cal F}\), and let \({\cal T}\) be a family of linear operators on \(V\). A subspace \(W\) of \(V\) is called \({\cal T}\)-invariant when \(T(W)\subseteq W\) for all \(T\in {\cal T}\). We say that \(V\) is a simple \({\cal T}\)-space when \(V\neq\{\theta\}\) and the only \({\cal T}\)-invariant subspaces of \(V\) are \(V\) and \(\{\theta\}\). \(\sharp\)
We see that the subspace \(W\) of \(V\) is \({\cal T}\)-invariant if and only if \(W\) is \(T\)-invariant for all \(T\in {\cal T}\).
\begin{equation}{\label{lap40}}\tag{43}\mbox{}\end{equation}
Proposition \ref{lap40}. Let \(T\in {\cal T}\) be such that \(ST=TS\) for all \(S\in {\cal T}\). Then, the image and kernel of \(T\) are \({\cal T}\)-invariant subspaces.
Proof. Let \(w\) be in the image of \(T\). This says that there exists \(v\in V\) satisfying \(w=T(v)\). Then, we have
\begin{align*} S(w) & =S(T(v))=(ST)(v)\\ & =(TS)(v)=T(S(v)),\end{align*}
which says that \(S(w)\) is also in the image of \(T\). Therefore, the image of \(T\) is \({\cal T}\)-invariant. Let \(u\) be in the kernel of \(T\), i.e., \(T(u)=\theta\). Then, we have
\begin{align*} T(S(u)) & =(TS)(u)=(ST)(u)\\ & =S(T(u))=S(\theta )=\theta ,\end{align*}
which says that \(S(u)\) is in the kernel of \(T\). Therefore, the kernel of \(T\) is \({\cal T}\)-invariant. This completes the proof. \(\blacksquare\)
Theorem. Let \(V\) be a vector space over a scalar field \({\cal F}\), and let \({\cal T}\) be a family of linear operators on \(V\). Assume that \(V\) is a simple \({\cal T}\)-space. If \(T\in {\cal T}\) satisfies \(ST=TS\) for all \(S\in {\cal T}\), then either \(T\) is invertible or \(T\) is a zero mapping.
Proof. Suppose that \(T\) is not a zero mapping. Proposition \ref{lap40} says that \(\ker T\) and the image of \(T\) are \({\cal T}\)-invariant subspaces. Since \(V\) is a simple \({\cal T}\)-space and \(\ker T\neq V\), we have \(\ker T=\{\theta\}\); since the image of \(T\) is not \(\{\theta\}\), it must be all of \(V\). Therefore, \(T\) is invertible. This completes the proof. \(\blacksquare\)
Theorem. Let \(V\) be a finite-dimensional vector space over \(\mathbb{C}\), and let \({\cal T}\) be a family of linear operators on \(V\). Assume that \(V\) is a simple \({\cal T}\)-space. If \(T\in {\cal T}\) satisfies \(ST=TS\) for all \(S\in {\cal T}\), then there exists \(\lambda\in\mathbb{C}\) satisfying \(T=\lambda I\). \(\sharp\)
Let \(V\) be a finite-dimensional vector space over the scalar field \({\cal F}\), and let \({\cal T}\) be a family of linear operators on \(V\). We may ask whether or not we can simultaneously triangulate or diagonalize the linear operators in \({\cal T}\). In other words, we try to find an ordered basis \(\mathfrak{B}\) for \(V\) such that the matrices \([T]_{\mathfrak{B}}\) are triangular (or diagonal) for all \(T\in {\cal T}\). The family \({\cal T}\) is said to be a commuting family when \(TU=UT\) for any \(T,U\in {\cal T}\).
Proposition. Let \(V\) be a finite-dimensional vector space over the scalar field \({\cal F}\), and let \({\cal T}\) be a commuting family of triangulable linear operators on \(V\). Let \(W\) be a proper subspace of \(V\) which is invariant under \({\cal T}\). Then, there exists a vector \(v\in V\) such that \(v\not\in W\) and the vector \(T(v)\) is in the subspace spanned by \(v\) and \(W\) for each \(T\in {\cal T}\).
Theorem. Let \(V\) be a finite-dimensional vector space over the scalar field \({\cal F}\), and let \({\cal T}\) be a commuting family of triangulable linear operators on \(V\). Then there exists an ordered basis \(\mathfrak{B}\) for \(V\) such that the matrices \([T]_{\mathfrak{B}}\) are triangular for all \(T\in {\cal T}\).
Corollary. Let \({\cal A}\) be a commuting family of \(n\times n\) matrices over \(\mathbb{C}\). Then, there exists a non-singular \(n\times n\) matrix \(P\) over \(\mathbb{C}\) such that \(P^{-1}AP\) is upper-triangular for every \(A\in {\cal A}\).
Theorem. Let \(V\) be a finite-dimensional vector space over the scalar field \({\cal F}\), and let \({\cal T}\) be a commuting family of diagonalizable linear operators on \(V\). Then, there exists an ordered basis \(\mathfrak{B}\) for \(V\) such that the matrices \([T]_{\mathfrak{B}}\) are diagonal for all \(T\in {\cal T}\).


