7.1 Linear Algebra Concepts

Linear transformations and changes of basis are widely used in statistics; for this reason, I briefly describe these concepts and how they are related.

7.1.1 Linear Transformation

Letting \(V\) and \(W\) be vector spaces, a function \(f: V \rightarrow W\) is a linear transformation if the additivity and scalar multiplication properties hold for any two vectors \(\mathbf{u}, \mathbf{v} \in V\) and any scalar \(c\): \[f(\mathbf{u}+\mathbf{v}) = f(\mathbf{u}) + f(\mathbf{v})\] \[f(c\mathbf{v}) = cf(\mathbf{v}).\]

This concept is most commonly used when working with matrices. Considering the vector spaces \(V = \mathbb{R}^n\) and \(W = \mathbb{R}^m\), a matrix \(\mathbf{A}_{m \times n}\) and a vector \(\mathbf{x} \in V\), the function \[f(\mathbf{x}) = \mathbf{A}\mathbf{x}\] is a linear transformation from \(V = \mathbb{R}^n\) to \(W = \mathbb{R}^m\) because it satisfies the properties mentioned above. In this definition, although not stated explicitly, we assume that both \(V\) and \(W\) are expressed with respect to the standard bases of \(\mathbb{R}^n\) and \(\mathbb{R}^m\), respectively.
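The two linearity properties of \(f(\mathbf{x})=\mathbf{A}\mathbf{x}\) can be verified numerically. The following is a minimal sketch assuming NumPy; the matrix and vector sizes are arbitrary illustrative choices.

```python
# Minimal sketch (assumes NumPy): check additivity and scalar
# multiplication for f(x) = A x with an arbitrary m x n matrix A.
import numpy as np

rng = np.random.default_rng(0)
A = rng.normal(size=(3, 4))          # A is m x n, here 3 x 4
u, v = rng.normal(size=4), rng.normal(size=4)
c = 2.5

f = lambda x: A @ x

print(np.allclose(f(u + v), f(u) + f(v)))   # additivity
print(np.allclose(f(c * v), c * f(v)))      # scalar multiplication
```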

7.1.2 Change of Basis

Consider a vector \(\mathbf{u} \in \mathbb{R}^n\); it is implicitly defined using the standard basis \(\{\mathbf{e}_1,\dots,\mathbf{e}_n\}\) of \(\mathbb{R}^n\), such that \(\mathbf{u}=\sum_{i=1}^n u_i \mathbf{e}_i\). This vector \(\mathbf{u}\) can also be represented with respect to a different basis; this is called a change of basis. For example, consider the vector space \(V = \mathbb{R}^n\) with basis \(\{\mathbf{v}_1,\dots,\mathbf{v}_n\}\). Then, in order to make the change of basis, we need to find \(\mathbf{u}_v=(u_{v_1},\dots,u_{v_n})^\intercal\) such that \[\mathbf{u} = \sum_{i=1}^n u_{v_i} \mathbf{v}_i = \mathbf{V}\mathbf{u}_v,\] where \(\mathbf{V}=(\mathbf{v}_1,\dots,\mathbf{v}_n)\) is the \(n\times n\) matrix whose columns are the basis vectors. Hence, the change from the standard basis to the basis of \(V\) is \[\mathbf{u}_v = \mathbf{V}^{-1}\mathbf{u},\] while the change from the basis of \(V\) to the standard basis is \[\mathbf{u} = \mathbf{V}\mathbf{u}_v.\]
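A minimal numerical sketch of this back-and-forth, assuming NumPy and a randomly generated (hence almost surely invertible) basis matrix:

```python
# Minimal sketch (assumes NumPy): change of basis between the standard
# basis and a basis V = (v_1, ..., v_n) stored as columns of a matrix.
import numpy as np

rng = np.random.default_rng(1)
V = rng.normal(size=(3, 3))        # columns are the basis vectors of V
u = rng.normal(size=3)             # coordinates in the standard basis

u_v = np.linalg.solve(V, u)        # u_v = V^{-1} u  (standard basis -> V)
u_back = V @ u_v                   # u   = V u_v     (V -> standard basis)

print(np.allclose(u, u_back))
```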

Now, consider another vector space \(W = \mathbb{R}^n\) with basis \(\{\mathbf{w}_1,\dots,\mathbf{w}_n\}\). The vector \(\mathbf{u}_v\) defined on the basis of \(V\) can also be expressed on the basis of \(W\) as \[\mathbf{u}_w = \mathbf{W}^{-1}\mathbf{V}\mathbf{u}_v,\] where \(\mathbf{W}=(\mathbf{w}_1,\dots,\mathbf{w}_n)\) is the \(n\times n\) matrix of basis vectors; similarly, the vector \(\mathbf{u}_w\) can be expressed on the basis of \(V\) as \[\mathbf{u}_v = \mathbf{V}^{-1}\mathbf{W}\mathbf{u}_w.\] It can be seen that in both cases the original vector is first transformed to standard-basis coordinates (left-multiplying by the basis matrix) and then transformed to the desired basis (left-multiplying by the inverse of that basis matrix).
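The same two-step route through the standard basis can be checked numerically; this sketch assumes NumPy and random bases chosen only for illustration.

```python
# Minimal sketch (assumes NumPy): move the coordinates of a vector from
# basis V to basis W by passing through the standard basis.
import numpy as np

rng = np.random.default_rng(2)
V = rng.normal(size=(3, 3))              # columns: basis of V
W = rng.normal(size=(3, 3))              # columns: basis of W
u_v = rng.normal(size=3)                 # coordinates of u in basis V

u_w = np.linalg.solve(W, V @ u_v)        # u_w = W^{-1} V u_v
u_v_back = np.linalg.solve(V, W @ u_w)   # u_v = V^{-1} W u_w

print(np.allclose(u_v, u_v_back))
```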

7.1.3 Change of Basis for Linear Transformations

Previously, we presented a linear transformation \(f(\mathbf{x})=\mathbf{A}\mathbf{x}:\mathbb{R}^n\rightarrow\mathbb{R}^m\) using the standard bases. This transformation can also be represented from a vector space \(V\) with basis \(\{\mathbf{v}_1,\dots,\mathbf{v}_n\}\) to a vector space \(W\) with basis \(\{\mathbf{w}_1,\dots,\mathbf{w}_m\}\); then \(f': V \rightarrow W\) is defined as \[f'(\mathbf{x}_v) = \mathbf{W}^{-1}\mathbf{A}\mathbf{V}\mathbf{x}_v,\] where the matrices \(\mathbf{V}\) and \(\mathbf{W}\) are the basis matrices of the vector spaces \(V\) and \(W\), respectively. The matrix product \(\mathbf{W}^{-1}\mathbf{A}\mathbf{V}\) corresponds to a change from the basis of \(V\) to the standard basis, the linear transformation using the standard basis, and the change from the standard basis to the basis of \(W\). In the case \(V=W\), the linear transformation is defined as \[f'(\mathbf{x}_v) = \mathbf{V}^{-1}\mathbf{A}\mathbf{V}\mathbf{x}_v.\]
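A minimal sketch of this representation, assuming NumPy; the dimensions and the random bases are illustrative assumptions, and the check confirms that both representations send the same vector to the same image.

```python
# Minimal sketch (assumes NumPy): represent f(x) = A x with respect to
# bases V (domain) and W (codomain), then compare with the standard-basis
# computation.
import numpy as np

rng = np.random.default_rng(3)
A = rng.normal(size=(2, 3))        # standard-basis representation, m x n
V = rng.normal(size=(3, 3))        # basis of the domain (columns)
W = rng.normal(size=(2, 2))        # basis of the codomain (columns)

B = np.linalg.solve(W, A @ V)      # B = W^{-1} A V

x_v = rng.normal(size=3)           # coordinates of x in basis V
x = V @ x_v                        # same vector in the standard basis

# f(x) in standard coordinates vs. f'(x_v) mapped back to standard coordinates
print(np.allclose(A @ x, W @ (B @ x_v)))
```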

7.1.4 Eigenvalues and Eigenvectors

Eigenvalues and eigenvectors are used in several areas of statistical inference and modelling; they are useful for dimension reduction, the decomposition of variance-covariance matrices, and so on. For this reason, we provide basic details about eigenvectors and eigenvalues and their close relationship with linear transformations.

7.1.4.1 Definition

An eigenvector of a linear transformation \(\mathbf{A}_{n\times n}\) is a non-zero vector \(\mathbf{v}\) such that the linear transformation of this vector is proportional to itself: \[\mathbf{A}\mathbf{v} = \lambda \mathbf{v} \iff (\mathbf{A}-\lambda\mathbf{I})\mathbf{v} = \mathbf{0},\] where \(\lambda\) is the eigenvalue associated with the eigenvector \(\mathbf{v}\). The equation above has a non-zero solution if and only if \[\det(\mathbf{A}-\lambda\mathbf{I}) = 0.\] Hence, all the eigenvalues \(\lambda\) of \(\mathbf{A}\) satisfy the condition above.
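Both conditions can be verified numerically; the following sketch assumes NumPy, and the \(2\times 2\) matrix is just an illustrative choice.

```python
# Minimal sketch (assumes NumPy): compute eigenvalues/eigenvectors and
# verify A v = lambda v and det(A - lambda I) = 0 for each eigenvalue.
import numpy as np

A = np.array([[4.0, 1.0],
              [2.0, 3.0]])

lam, V = np.linalg.eig(A)          # eigenvalues and eigenvectors (columns)

for i in range(len(lam)):
    v = V[:, i]
    print(np.allclose(A @ v, lam[i] * v))                        # A v = lambda v
    print(np.isclose(np.linalg.det(A - lam[i] * np.eye(2)), 0))  # characteristic equation
```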

There is an equivalence between the linear transformation \(f(\mathbf{x}) = \mathbf{A}\mathbf{x}\) and its eigenvalues \(\lambda_1, \lambda_2, \dots, \lambda_n\) and eigenvectors \(\mathbf{v}_1, \mathbf{v}_2, \dots, \mathbf{v}_n\). This relationship provides a more useful interpretation of the eigenvalues and eigenvectors; we will use the change of basis concept to describe it.

7.1.4.2 Eigendecomposition and geometric interpretation

Consider a vector space \(V\) whose basis \(\{\mathbf{v}_1, \mathbf{v}_2, \dots, \mathbf{v}_n\}\) consists of the eigenvectors of \(\mathbf{A}\) (assuming they are linearly independent). Any vector \(\mathbf{x} \in \mathbb{R}^n\) can be represented as \(\mathbf{V}\mathbf{x}_v\), where \(\mathbf{x}_v\) is the representation of \(\mathbf{x}\) using the basis matrix \(\mathbf{V}=(\mathbf{v}_1, \dots, \mathbf{v}_n)\) of the vector space \(V\). Then, the linear transformation can be expressed as \[f(\mathbf{x}) = \mathbf{A}\mathbf{x} = \mathbf{A}\mathbf{V}\mathbf{x}_v = \mathbf{V}\mathbf{D}\mathbf{x}_v,\] where \(\mathbf{D}=\text{diag}(\lambda_1, \dots, \lambda_n)\) is a diagonal matrix and the last equality holds because \(\mathbf{A}\mathbf{v}_i=\lambda_i\mathbf{v}_i\). Finally, expressing \(\mathbf{x}_v\) in terms of the vector \(\mathbf{x}\) defined on the standard basis, we obtain \[f(\mathbf{x}) = \mathbf{V}\mathbf{D}\mathbf{V}^{-1}\mathbf{x};\] the equality \(\mathbf{A}=\mathbf{V}\mathbf{D}\mathbf{V}^{-1}\) is called the eigendecomposition. Hence, the linear transformation is equivalent to the following: change the basis of \(\mathbf{x}\) to the vector space \(V\), apply the diagonal linear transformation \(\mathbf{D}\), and return to the space with standard basis. Geometrically, you can think of \(\{\mathbf{v}_1, \mathbf{v}_2, \dots, \mathbf{v}_n\}\) as the basis of the vector space \(V\) where the transformation \(\mathbf{A}\) becomes only a scaling transformation \(\mathbf{D}\), and the eigenvalues \(\lambda_1, \lambda_2, \dots, \lambda_n\) are the scaling factors in the directions of the corresponding eigenvectors \(\mathbf{v}_1, \mathbf{v}_2, \dots, \mathbf{v}_n\).
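The decomposition and its interpretation as "change basis, scale, change back" can be checked numerically; this sketch assumes NumPy and a small diagonalizable matrix chosen for illustration.

```python
# Minimal sketch (assumes NumPy and a matrix with n linearly independent
# eigenvectors): reconstruct A from its eigendecomposition V D V^{-1}.
import numpy as np

A = np.array([[4.0, 1.0],
              [2.0, 3.0]])

lam, V = np.linalg.eig(A)
D = np.diag(lam)

print(np.allclose(A, V @ D @ np.linalg.inv(V)))   # A = V D V^{-1}

# In the eigenbasis, the transformation is a pure scaling:
x = np.array([1.0, -2.0])
x_v = np.linalg.solve(V, x)                       # change of basis to V
print(np.allclose(A @ x, V @ (D @ x_v)))          # scale, then return to standard basis
```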

7.1.4.3 Basis properties

There are certain properties that are useful for statistical modelling, such as:

  • The trace of \(\mathbf{A}\) equals the sum of the eigenvalues.
  • The determinant of \(\mathbf{A}\) equals the product of the eigenvalues.
  • If \(\mathbf{A}\) is symmetric, then all eigenvalues are real.
  • If \(\mathbf{A}\) is positive definite, then all eigenvalues are positive.

Note that some of these properties can be explained using the eigendecomposition \(\mathbf{A} = \mathbf{V}\mathbf{D}\mathbf{V}^{-1}\).
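A quick numerical check of these properties, assuming NumPy; the matrix is constructed to be symmetric positive definite purely for illustration.

```python
# Minimal sketch (assumes NumPy): check the trace, determinant, and
# positive-definiteness properties on a symmetric positive definite matrix.
import numpy as np

rng = np.random.default_rng(4)
B = rng.normal(size=(3, 3))
A = B @ B.T + 3 * np.eye(3)        # symmetric positive definite by construction

lam = np.linalg.eigvalsh(A)        # real eigenvalues of a symmetric matrix

print(np.isclose(np.trace(A), lam.sum()))        # trace = sum of eigenvalues
print(np.isclose(np.linalg.det(A), lam.prod()))  # determinant = product of eigenvalues
print(np.all(lam > 0))                           # positive definite -> positive eigenvalues
```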

7.1.5 Cauchy–Schwarz inequality

For any two vectors \(\mathbf{u}, \mathbf{v}\) in an inner product space, \[|\langle \mathbf{u}, \mathbf{v} \rangle| \leq \|\mathbf{u}\|\,\|\mathbf{v}\|.\]
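A minimal numerical check of the inequality with the usual dot product, assuming NumPy and a few random vector pairs:

```python
# Minimal sketch (assumes NumPy): check the Cauchy-Schwarz inequality
# |<u, v>| <= ||u|| ||v|| for random vector pairs.
import numpy as np

rng = np.random.default_rng(5)
for _ in range(5):
    u, v = rng.normal(size=4), rng.normal(size=4)
    print(abs(u @ v) <= np.linalg.norm(u) * np.linalg.norm(v))
```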