Orthogonal and symmetric matrices

Definition 7.36

A real square matrix \(A \in {\mathrm {Mat}}_{n \times n}\) is called orthogonal if

\[ AA^T = {\mathrm {id}}. \]

This is equivalent to saying that \(A\) is invertible and \(A^{-1} = A^T\). The following lemma explains the name “orthogonal”.

Lemma 7.37

For a real square matrix \(A \in {\mathrm {Mat}}_{n \times n}\), the following are equivalent:

1. \(A\) is orthogonal,
2. the \(n\) rows of \(A\) are an orthonormal basis of \({\bf R}^n\),
3. the \(n\) columns of \(A\) are an orthonormal basis of \({\bf R}^n\).

Proof. If \(e_i\) is the \(i\)-th standard basis vector, we know that \(Ae_i\) is the \(i\)-th column of \(A\). We compute

\[ {\left \langle Ae_i, Ae_j \right \rangle} = (Ae_i)^T(Ae_j) = e_i^T A^T A e_j. \]

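The vector \(A^T A e_j\) is the \(j\)-th column of \(A^T A\), and the number \(e_i^T A^T A e_j\) is the \(i\)-th entry of that vector. Thus, saying that the above expression equals 1 for \(i=j\) and 0 otherwise is equivalent to requiring \(A^T A = {\mathrm {id}}\). For a square matrix, \(A^T A = {\mathrm {id}}\) holds if and only if \(AA^T = {\mathrm {id}}\), since a one-sided inverse of a square matrix is automatically a two-sided inverse. Hence the columns of \(A\) are orthonormal precisely if \(A\) is orthogonal; being \(n\) linearly independent vectors in \({\bf R}^n\), they then form a basis. Applying the same argument to \(A^T\), whose columns are the rows of \(A\), gives the statement about the rows. ◻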
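The lemma can be checked numerically on a small example. The following is a minimal sketch in plain Python; the rotation matrix, the shear matrix and the helper names `transpose`, `matmul`, `is_identity` and `is_orthogonal` are assumptions introduced for illustration and do not come from the text.

```python
import math

def transpose(m):
    """Return the transpose of a matrix given as a list of rows."""
    return [list(col) for col in zip(*m)]

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def is_identity(m, tol=1e-12):
    """Check whether a square matrix equals the identity up to a tolerance."""
    n = len(m)
    return all(abs(m[i][j] - (1.0 if i == j else 0.0)) <= tol
               for i in range(n) for j in range(n))

def is_orthogonal(a, tol=1e-12):
    """Test A^T A = id, i.e. whether the columns of A are orthonormal."""
    return is_identity(matmul(transpose(a), a), tol)

# Illustrative example: rotation by an angle theta (not taken from the text).
theta = 0.7
rotation = [[math.cos(theta), -math.sin(theta)],
            [math.sin(theta),  math.cos(theta)]]

# A shear: its columns are not orthogonal to each other.
shear = [[1.0, 1.0],
         [0.0, 1.0]]

print(is_orthogonal(rotation))             # tests the columns of the rotation
print(is_orthogonal(transpose(rotation)))  # tests the rows of the rotation
print(is_orthogonal(shear))
```

The tolerance is needed because the entries are floating-point numbers, so \(A^T A\) agrees with the identity only up to rounding.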

Theorem 7.38 (Related exercises: Exercise 7.11)

The following conditions are equivalent for an \(n \times n\)-matrix \(A\):

1. \(A\) is symmetric,
2. \(A\) is orthogonally diagonalizable, i.e., there is an orthogonal matrix \(P\) such that \(P^{-1}AP\) is a diagonal matrix,
3. \(A\) has an orthonormal eigenbasis.

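If these equivalent conditions hold, then the columns of \(P\) form an orthonormal eigenbasis; conversely, any orthonormal eigenbasis gives the columns of such a matrix \(P\). (Note that since \(P\) is orthogonal, \(P^{-1} = P^T\), so the inverse is obtained by transposing, without carrying out an actual matrix inversion.)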

The implication 1. \(\Rightarrow\) 2. in particular says:

\[ A \text{ symmetric} \Rightarrow A \text{ diagonalizable}. \]

For a proof of this theorem, see, e.g., (Nicholson 1995, Theorem 8.2.2). The vectors of an orthonormal eigenbasis are also called the principal axes of \(A\), and the theorem is sometimes called the principal axes theorem. We only point out that the difficult direction is 1. \(\Rightarrow\) 2. A key step is to prove that a symmetric real matrix has only real eigenvalues (as opposed to complex ones). For \(2 \times 2\)-matrices, one can see this by direct computation (see also Exercise 6.2): the characteristic polynomial of a symmetric \(2 \times 2\)-matrix \(A = \left ( \begin{array}{cc} a & b \\ b & d \end{array} \right )\) is

\[ \chi_A(t) = \det (A-t{\mathrm {id}}) = (a-t)(d-t) - b^2 = t^2 + (-a-d) t + ad-b^2. \]

The zeroes of this polynomial are given by

\[ \begin{align*} \lambda_{1/2} & = \frac{a+d}2 \pm \sqrt{\frac{(a+d)^2}4-ad+b^2} \\ & = \frac{a+d}2 \pm \sqrt{\frac{a^2 + d^2}4 + \frac{ad}2 - ad + b^2} \\ & = \frac{a+d}2 \pm \sqrt{\frac{(a-d)^2}4 + b^2}. \end{align*} \]

The expression under the square root is a sum of squares and therefore always non-negative, so that \(\lambda_{1/2}\) are real numbers.
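The closed formula can be tried out directly. The following is a small sketch, assuming the illustrative values \(a = 2\), \(b = 1\), \(d = 2\) (not taken from the text); the function names are hypothetical.

```python
import math

def symmetric_2x2_eigenvalues(a, b, d):
    """Eigenvalues of the symmetric matrix [[a, b], [b, d]] via the closed formula."""
    mean = (a + d) / 2
    # The radicand (a-d)^2/4 + b^2 is a sum of squares, hence never negative.
    radicand = (a - d) ** 2 / 4 + b ** 2
    root = math.sqrt(radicand)
    return mean + root, mean - root

def char_poly(a, b, d, t):
    """chi_A(t) = t^2 - (a+d) t + (ad - b^2)."""
    return t * t - (a + d) * t + (a * d - b * b)

# Illustrative values (an assumption, not from the text).
a, b, d = 2.0, 1.0, 2.0
lam1, lam2 = symmetric_2x2_eigenvalues(a, b, d)
print(lam1, lam2)  # 3.0 1.0

# Both values are zeroes of the characteristic polynomial.
print(char_poly(a, b, d, lam1), char_poly(a, b, d, lam2))
```

Because `math.sqrt` raises an error for negative input, the fact that this function never fails for real \(a, b, d\) reflects the non-negativity of the radicand.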

As an example of a non-symmetric matrix with imaginary eigenvalues, we have seen in Example 6.21 that the matrix \(A = \left ( \begin{array}{cc} 0 & -1 \\ 1 & 0 \end{array} \right )\) has the eigenvalues \(\lambda_{1/2} = \pm i\).
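To see where symmetry enters, one can repeat the computation for a general matrix \(\left ( \begin{array}{cc} a & b \\ c & d \end{array} \right )\), whose characteristic polynomial is \(t^2 - (a+d)t + ad - bc\). The radicand then becomes \(\frac{(a-d)^2}4 + bc\), which may be negative. The sketch below uses a hypothetical helper `eigenvalues_2x2` and Python's `cmath` module to allow complex roots.

```python
import cmath

def eigenvalues_2x2(a, b, c, d):
    """Roots of t^2 - (a+d) t + (ad - bc) for the matrix [[a, b], [c, d]]."""
    mean = (a + d) / 2
    # For a general matrix the radicand is (a-d)^2/4 + bc, which can be negative.
    radicand = (a - d) ** 2 / 4 + b * c
    root = cmath.sqrt(radicand)  # complex square root, defined for negative input
    return mean + root, mean - root

# The non-symmetric matrix [[0, -1], [1, 0]] from the text.
lam1, lam2 = eigenvalues_2x2(0, -1, 1, 0)
print(lam1, lam2)
```

For \(b = c\) the term \(bc = b^2\) is non-negative, which recovers the symmetric case.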

Example 7.39

The matrix \(A = \left ( \begin{array}{ccc} 5 & -4 & 2 \\ -4 & 5 & 2 \\ 2 & 2 & -1 \end{array} \right )\) is symmetric. We compute an orthonormal eigenbasis by first computing the eigenvalues:

\[ \chi_A(t) = -t^3 + 9 t^2 + 9 t - 81. \]
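The characteristic polynomial can be double-checked by evaluating \(\det(A - t\,{\mathrm {id}})\) directly at several integers. This is a minimal sketch; the helper names `det3`, `chi_direct` and `chi_stated` are introduced here for illustration only.

```python
A = [[5, -4, 2],
     [-4, 5, 2],
     [2, 2, -1]]

def det3(m):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

def chi_direct(t):
    """det(A - t id), computed directly from the matrix."""
    shifted = [[A[r][s] - (t if r == s else 0) for s in range(3)] for r in range(3)]
    return det3(shifted)

def chi_stated(t):
    """The polynomial stated in the example."""
    return -t**3 + 9 * t**2 + 9 * t - 81

# Two cubic polynomials that agree at four or more points are equal.
assert all(chi_direct(t) == chi_stated(t) for t in range(-5, 6))

# Search for integer zeroes in a small window (the window is an arbitrary choice).
roots = [t for t in range(-10, 11) if chi_stated(t) == 0]
print(roots)  # [-3, 3, 9]
```

All arithmetic here is on integers, so the comparison is exact.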

The eigenvalues, each with a corresponding eigenvector, are as follows:

\(\lambda_1 = 9\), \(v_1 = (-1,1,0)\),
\(\lambda_2 = 3\), \(v_2 = (1,1,1)\),
\(\lambda_3 = -3\), \(v_3 = (-1,-1,2)\).

These three vectors are pairwise orthogonal; this is seen by direct computation. Alternatively, since \(A\) is symmetric and the eigenvalues are all distinct, the eigenvectors are automatically orthogonal (Exercise 7.12). They are, however, not normalized. Dividing each vector by its norm gives an orthonormal eigenbasis:

\[ \frac 1{\sqrt 2} \left ( \begin{array}{c} -1 \\ 1 \\ 0 \end{array} \right ), \frac1{\sqrt 3} \left ( \begin{array}{c} 1 \\ 1 \\ 1 \end{array} \right ), \frac1{\sqrt 6} \left ( \begin{array}{c} -1 \\ -1 \\ 2 \end{array} \right ). \]
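The example can be verified with a few lines of plain Python. The sketch below checks \(Av = \lambda v\) and the orthogonality of the eigenvectors exactly, and then computes \(P^T A P\) for the matrix \(P\) whose columns are the normalized eigenvectors; the helper names `matvec` and `dot` are assumptions made for illustration.

```python
import math

A = [[5, -4, 2],
     [-4, 5, 2],
     [2, 2, -1]]

# Eigenvalues and eigenvectors as stated in the example.
eigenpairs = [(9, [-1, 1, 0]),
              (3, [1, 1, 1]),
              (-3, [-1, -1, 2])]

def matvec(m, v):
    """Matrix-vector product for a matrix given as a list of rows."""
    return [sum(x * y for x, y in zip(row, v)) for row in m]

def dot(u, v):
    """Standard scalar product on R^n."""
    return sum(x * y for x, y in zip(u, v))

# Check A v = lambda v exactly (integer arithmetic).
for lam, v in eigenpairs:
    assert matvec(A, v) == [lam * x for x in v]

# Check pairwise orthogonality of the eigenvectors.
vectors = [v for _, v in eigenpairs]
for i in range(3):
    for j in range(i + 1, 3):
        assert dot(vectors[i], vectors[j]) == 0

# Normalize each eigenvector; these are the columns of P.
unit = [[x / math.sqrt(dot(v, v)) for x in v] for v in vectors]

# Entry (i, j) of P^T A P is the scalar product <u_i, A u_j>.
D = [[dot(unit[i], matvec(A, unit[j])) for j in range(3)] for i in range(3)]
for row in D:
    print([round(x, 10) for x in row])
```

Up to floating-point rounding, the printed matrix is diagonal with the eigenvalues 9, 3 and -3 on the diagonal, in the order in which the eigenvectors were listed.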

Nicholson, W. K. 1995. *Linear Algebra with Applications*. Mathematics Series. PWS Publishing Company.