Vector spaces
Definition
Definition: a vector space \(V\) is a set on which the operations of addition and scalar multiplication are defined. Such that for all vectors \(\mathbf{u}\) and \(\mathbf{v}\) in \(V\) the vectors \(\mathbf{u} + \mathbf{v}\) are in \(V\) and for each scalar \(a\) the vector \(a\mathbf{v}\) is in \(V\). With the following axioms satisfied.
- Associativity of vector addition: \(\mathbf{u} + (\mathbf{v} + \mathbf{w}) = (\mathbf{u} + \mathbf{v}) + \mathbf{w}\) for any \(\mathbf{u},\mathbf{v}, \mathbf{w} \in V\).
- Commutativity of vector addition: \(\mathbf{u} + \mathbf{v} = \mathbf{v} + \mathbf{u}\) for any \(\mathbf{u},\mathbf{v} \in V\).
- Identity element of vector addition: \(\exists \mathbf{0} \in V\) such that \(\mathbf{v} + \mathbf{0}\) for all \(\mathbf{v} \in V\).
- Inverse element of vector addition: \(\forall \mathbf{v} \in V \exists (-\mathbf{v}) \in V\) such that \(\mathbf{v} + (-\mathbf{v}) = \mathbf{0}\).
- Distributivity of scalar multiplication with respect to vector addition: \(a(\mathbf{u} + \mathbf{v}) = a\mathbf{u} + a\mathbf{v}\) for any scalar \(a\) and any \(\mathbf{u}, \mathbf{v} \in V\).
- Distributivity of scalar multiplication with respect to field addition: \((a + b) \mathbf{v} = a \mathbf{v} + b \mathbf{v}\) for any scalars \(a\) and \(b\) and any \(\mathbf{v} \in V\).
- Compatibility of scalar multiplication with field multiplication: \(a(b\mathbf{v}) = (ab) \mathbf{v}\) for any scalars \(a\) and \(b\) and any \(\mathbf{v} \in V\).
- Identity element of scalar multiplication: \(1 \mathbf{v} = \mathbf{v}\) for all \(\mathbf{v} \in V\).
Some important properties of a vector space can be derived from this definition in the following proposition a few have been listed.
Proposition: if \(V\) is a vector space and \(\mathbf{u}\), \(\mathbf{v}\) is in \(V\), then
- \(0 \mathbf{v} = \mathbf{0}\).
- \(\mathbf{u} + \mathbf{v} = \mathbf{0} \implies \mathbf{u} = - \mathbf{v}\).
- \((-1)\mathbf{v} = - \mathbf{v}\).
Proof:
For 1, suppose \(\mathbf{v} \in V\) then it follows from axioms 3, 6 and 8
therefore
For 2, suppose for \(\mathbf{u}, \mathbf{v} \in V\) that \(\mathbf{u} + \mathbf{v} = \mathbf{0}\) then it follows from axioms 1, 3 and 4
therefore
For 3, suppose \(\mathbf{v} \in V\) then it follows from 1 and axioms 4 and 6
therefore
from 2 it follows then that
Euclidean spaces
Perhaps the most elementary vector spaces are the Euclidean vector spaces \(V = \mathbb{R}^n\) with \(n \in \mathbb{N}\). Given a nonzero vector \(\mathbf{u} \in \mathbb{R}^n\) defined by
it may be associated with the directed line segment from \((0, \dots, 0)\) to \((u_1, \dots, u_n)\). Or more generally line segments that have the same length and direction can be represented by any line segment from \((a_1, \dots, a_n)\) to \((a_1 + u_1, \dots, a_n + u_n)\). Vector addition and scalar multiplication in \(\mathbb{R}^n\) are respectively defined by
for any \(\mathbf{u}, \mathbf{v} \in \mathbb{R}^n\) and any scalar \(a\).
This can be extended to matrices with \(V = \mathbb{R}^{m \times n}\) with \(m,n \in \mathbb{N}\), the set of all matrices. Given a nonzero matrix \(A \in \mathbb{R}^{m \times n}\) defined by \(A = (a_{ij})\). Matrix addition and scalar multiplication in \(\mathbb{R}^{m \times n}\) are respectively defined by
for any \(A, B, C \in \mathbb{R}^{m \times n}\) and any scalar \(\alpha\).
Function spaces
Let \(V\) be a vector space over a field \(F\) and let \(X\) be any set. The functions \(X \to F\) can be given the structure of a vector space over \(F\) where the operations are defined by
for any \(f,g: X \to F\), any \(x \in X\) and any \(a \in F\).
Polynomial spaces
Let \(P_n\) denote the set of all polynomials of degree less than \(n \in \mathbb{N}\) where the operations are defined by
for any \(p,q: X \to P_n\), any \(x \in X\) and any \(a \in P_n\).
Vector subspaces
Definition: if \(S\) is a nonempty subset of a vector space \(V\) and \(S\) satisfies the conditions
- \(a \mathbf{u} \in S\) whenever \(\mathbf{u} \in S\) for any scalar \(a\).
- \(\mathbf{u} + \mathbf{v} \in S\) whenever \(\mathbf{u}, \mathbf{v} \in S\).
then \(S\) is said to be a subspace of \(V\).
In a vector space \(V\) it can be readily verified that \(\{\mathbf{0}\}\) and \(V\) are subspaces of \(V\). All other subspaces are referred to as proper subspaces and \(\{\mathbf{0}\}\) is referred to as the zero subspace.
Theorem: Every subspace of a vector space is a vector space.
Proof:
May be proved by testing if all axioms remain valid for the definition of a subspace.
The null space of a matrix
Definition: let \(A \in \mathbb{R}^{m \times n}\), \(\mathbf{x} \in \mathbb{R}^n\) and let \(N(A)\) denote the set of all solutions of the homogeneous system \(A\mathbf{x} = \mathbf{0}\). Therefore
\[ N(A) = \{\mathbf{x} \in \mathbb{R}^n \;|\; A \mathbf{x} = \mathbf{0}\}, \]referred to as the null space of \(A\).
Claiming that \(N(A)\) is a subspace of \(\mathbb{R}^n\). Clearly \(\mathbf{0} \in N(A)\) so \(N(A)\) is nonempty. If \(\mathbf{x} \in N(A)\) and \(\alpha\) is a scalar then
and hence \(\alpha \mathbf{x} \in N(A)\). If \(\mathbf{x}, \mathbf{y} \in N(A)\) then
therefore \(\mathbf{x} + \mathbf{y} \in N(A)\) and it follows that \(N(A)\) is a subspace of \(\mathbb{R}^n\).
The span of a set of vectors
Definition: let \(\mathbf{v}_1, \dots, \mathbf{v}_n\) be vectors in a vector space \(V\) with \(n \in \mathbb{N}\). A sum of the form
\[ a_1 \mathbf{v}_1 + \dots + a_n \mathbf{v}_n, \]with scalars \(a_1, \dots, a_n\) is called a linear combination of \(\mathbf{v}_1, \dots, \mathbf{v}_n\).
The set of all linear combinations of \(\mathbf{v}_1, \dots, \mathbf{v}_n\) is called the span of \(\mathbf{v}_1, \dots, \mathbf{v}_n\) which is denoted by \(\text{span}(\mathbf{v}_1, \dots, \mathbf{v}_n)\).
The nullspace can be for example defined by a span of vectors.
Theorem: if \(\mathbf{v}_1, \dots, \mathbf{v}_n\) are vectors in a vector space \(V\) with \(n \in \mathbb{N}\) then \(\text{span}(\mathbf{v}_1, \dots, \mathbf{v}_n)\) is a subspace of \(V\).
Proof:
Let \(b\) be a scalar and \(\mathbf{u} \in \text{span}(\mathbf{v}_1, \dots, \mathbf{v}_n)\) given by
with scalars \(a_1, \dots, a_n\). Since
it follows that \(b \mathbf{u} \in \text{span}(\mathbf{v}_1, \dots, \mathbf{v}_n)\).
If we also have \(\mathbf{w} \in \text{span}(\mathbf{v}_1, \dots, \mathbf{v}_n)\) given by
with scalars \(b_1, \dots, b_n\). Then
it follows that \(\mathbf{u} + \mathbf{w} \in \text{span}(\mathbf{v}_1, \dots, \mathbf{v}_n)\) is a subspace of \(V\).
For example, a vector \(\mathbf{x} \in \mathbb{R}^3\) is in \(\text{span}(\mathbf{e}_1, \mathbf{e}_2)\) if and only if it lies in the \(x_1 x_2\)-plane in 3-space. Thus we can think of the \(x_1 x_2\)-plane as the geometrical representation of the subspace \(\text{span}(\mathbf{e}_1, \mathbf{e}_2)\).
Definition: the set \(\{\mathbf{v}_1, \dots, \mathbf{v}_n\}\) with \(n \in \mathbb{N}\) is a spanning set for \(V\) if and only if every vector \(V\) can be written as a linear combination of \(\mathbf{v}_1, \dots, \mathbf{v}_n\).
Linear independence
We have the following observations.
Proposition: if \(\mathbf{v}_1, \dots, \mathbf{v}_n\) with \(n \in \mathbb{N}\) span a vector space \(V\) and one of these vectors can be written as a linear combination of the other \(n-1\) vectors then those \(n-1\) vectors span \(V\).
Proof:
suppose \(\mathbf{v}_n\) with \(n \in \mathbb{N}\) can be written as a linear combination of the vectors \(\mathbf{v}_1, \dots, \mathbf{v}_{n-1}\) given by
Let \(\mathbf{v}\) be any element of \(V\). Since we have
we can write any vector \(\mathbf{v} \in V\) as a linear combination of \(\mathbf{v}_1, \dots, \mathbf{v}_{n-1}\) and hence these vectors span \(V\).
Proposition: given \(n\) vectors \(\mathbf{v}_1, \dots, \mathbf{v}_n\) with \(n \in \mathbb{N}\), it is possible to write one of the vectors as a linear combination of the other \(n-1\) vectors if and only if there exist scalars \(a_1, \dots, a_n\) not all zero such that
\[ a_1 \mathbf{v}_1 + \dots + a_n \mathbf{v}_n = \mathbf{0}. \]
Proof:
Suppose that one of the vectors \(\mathbf{v}_1, \dots, \mathbf{v}_n\) with \(n \in \mathbb{N}\) can be written as a linear combination of the others
Subtracting \(\mathbf{v}_n\) from both sides obtains
we have \(a_n = -1\) and
We may use these oberservations to state the following definitions.
Definition: the vectors \(\mathbf{v}_1, \dots, \mathbf{v}_n\) in a vector space \(V\) with \(n \in \mathbb{N}\) are said to be linearly independent if
\[ a_1 \mathbf{v}_1 + \dots + a_n \mathbf{v}_n = \mathbf{0} \implies \forall i \in \{1, \dots, n\} [c_i = 0]. \]
It follows from the above propositions that if \(\{\mathbf{v}_1, \dots, \mathbf{v}_n\}\) is a minimal spanning set of a vector space \(V\) then \(\mathbf{v}_1, \dots, \mathbf{v}_n\) are linearly independent. A minimal spanning set is called a basis of the vector space.
Definition: the vectors \(\mathbf{v}_1, \dots, \mathbf{v}_n\) in a vector space \(V\) with \(n \in \mathbb{N}\) are said to be linearly dependent if there exists scalars \(a_1, \dots, a_n\) not all zero such that
\[ a_1 \mathbf{v}_1 + \dots + a_n \mathbf{v}_n = \mathbf{0}. \]
It follows from the above propositions that if a set of vectors is linearly dependent then at least one vector is a linear combination of the other vectors.
Theorem: let \(\mathbf{x}_1, \dots, \mathbf{x}_n\) be vectors in \(\mathbb{R}^n\) with \(n \in \mathbb{N}\) and let \(X = (\mathbf{x}_1, \dots, \mathbf{x}_n)\). The vectors \(\mathbf{x}_1, \dots, \mathbf{x}_n\) will be linearly dependent if and only if \(X\) is singular.
Proof:
Let \(\mathbf{x}_1, \dots, \mathbf{x}_n\) be vectors in \(\mathbb{R}^n\) with \(n \in \mathbb{N}\) and let \(X = (\mathbf{x}_1, \dots, \mathbf{x}_n)\). Suppose we have the linear combination given by
can be rewritten as a matrix equation by
with \(\mathbf{a} = (a_1, \dots, a_n)^T\). This equation will have a nontrivial solution if and only if \(X\) is singular. Therefore \(\mathbf{x}_1, \dots, \mathbf{x}_n\) will be linearly dependent if and only if \(X\) is singular.
This result can be used to test whether \(n\) vectors are linearly independent in \(\mathbb{R}^n\) for \(n \in \mathbb{N}\).
Theorem: let \(\mathbf{v}_1, \dots, \mathbf{v}_n\) be vectors in a vector space \(V\) with \(n \in \mathbb{N}\). A vector \(\mathbf{v} \in \text{span}(\mathbf{v}_1, \dots, \mathbf{v}_n)\) can be written uniquely as a linear combination of \(\mathbf{v}_1, \dots, \mathbf{v}_n\) if and only if \(\mathbf{v}_1, \dots, \mathbf{v}_n\) are linearly independent.
Proof:
If \(\mathbf{v} \in \text{span}(\mathbf{v}_1, \dots \mathbf{v}_n)\) with \(n \in \mathbb{N}\) then \(\mathbf{v}\) can be written as a linear combination
Suppose that \(\mathbf{v}\) can also be expressed as a linear combination
If \(\mathbf{v}_1, \dots \mathbf{v}_n\) are linearly independent then subtracting both expressions yields
By the linear independence of \(\mathbf{v}_1, \dots \mathbf{v}_n\), the coefficients must all be 0, hence
therefore the representation of \(\mathbf{v}\) is unique when \(\mathbf{v}_1, \dots \mathbf{v}_n\) are linearly independent.
On the other hand if \(\mathbf{v}_1, \dots \mathbf{v}_n\) are linearly dependent then the coefficients must not all be 0 and \(a_i \neq b_i\) for some \(i \in \{1, \dots, n\}\). Therefore the representation of \(\mathbf{v}\) is not unique when \(\mathbf{v}_1, \dots \mathbf{v}_n\) are linearly dependent.
Basis and dimension
Definition: the vectors \(\mathbf{v}_1,\dots,\mathbf{v}_n \in V\) form a basis if and only if
- \(\mathbf{v}_1,\dots,\mathbf{v}_n\) are linearly independent,
- \(\mathbf{v}_1,\dots,\mathbf{v}_n\) span \(V\).
Therefore, a basis may define a vector space, but it is not necessarily unique.
Theorem: if \(\{\mathbf{v}_1,\dots,\mathbf{v}_n\}\) is a spanning set for a vector space \(V\), then any collection of \(m\) vectors in \(V\) where \(m>n\), is linearly dependent.
Proof:
Let \(\mathbf{u}_1, \dots, \mathbf{u}_m \in V\), where \(m > n\). Then since \(\{\mathbf{v}_1,\dots,\mathbf{v}_n\}\) span \(V\) we have
for \(i,j \in \{1, \dots, n\}\) with \(a_{ij} \in \mathbb{R}\).
A linear combination \(c_1 \mathbf{u}_1 + \dots + c_m \mathbf{u}_m\) can be written in the form
obtaining
Considering the system of equations
for \(j \in \{1, \dots, n\}\), a homogeneous system with more unknowns than equations. Therefore the system must have a nontrivial solution \((\hat c_1, \dots, \hat c_m)^T\), but then
hence \(\mathbf{u}_1, \dots, \mathbf{u}_m\) are linearly dependent.
Corollary: if both \(\{\mathbf{v}_1,\dots,\mathbf{v}_n\}\) and \(\{\mathbf{u}_1,\dots,\mathbf{u}_m\}\) are bases for a vector space \(V\), then \(n = m\).
Proof:
Let both \(\{\mathbf{v}_1,\dots,\mathbf{v}_n\}\) and \(\{\mathbf{u}_1,\dots,\mathbf{u}_m\}\) be bases for \(V\). Since \(\mathbf{v}_1,\dots,\mathbf{v}_n\) span \(V\) and \(\mathbf{u}_1,\dots,\mathbf{u}_m\) are linearly independent then it follows that \(m \leq n\), similarly \(\mathbf{u}_1,\dots,\mathbf{u}_m\) span \(V\) and \(\mathbf{v}_1,\dots,\mathbf{v}_n\) are linearly independent so \(n \leq m\). Which must imply \(n=m\).
With this result we may now refer to the number of elements in any basis for a given vector space. Which leads to the following definition.
Definition: let \(V\) be a vector space. If \(V\) has a basis consisting of \(n \in \mathbb{N}\) vectors, then \(V\) has dimension \(n\). The subspace \(\{\mathbf{0}\}\) of \(V\) is said to have dimension \(0\). \(V\) is said to be finite dimensional if there is a finite set of vectors that spans \(V\), otherwise \(V\) is infinite dimensional.
So a single nonzero vector must span one-dimension exactly. For multiple vectors we have the following theorem.
Theorem: if \(V\) is a vector space of dimension \(n \in \mathbb{N}\ \backslash \{\mathbf{0}\}\), then
- any set of \(n\) linearly independent vectors spans \(V\),
- any \(n\) vectors that span \(V\) are linearly independent,
Proof:
To prove 1, suppose that \(\mathbf{v}_1,\dots,\mathbf{v}_n \in V\) are linearly independent and \(\mathbf{v} \in V\). Since \(V\) has dimension \(n\), it has a basis consisting of \(n\) vectors and these vectors span \(V\). It follows that \(\mathbf{v}_1,\dots,\mathbf{v}_n, \mathbf{v}\) must be linearly dependent. Thus there exist scalars \(c_1, \dots, c_n, c_{n+1}\) not all zero, such that
The scalar \(c_{n+1}\) cannot be zero, since that would imply that \(\mathbf{v}_1,\dots,\mathbf{v}_n\) are linearly dependent, hence
with
for \(i \in \{1, \dots, n\}\). Since \(\mathbf{v}\) was an arbitrary vector in \(V\) it follows that \(\mathbf{v}_1, \dots, \mathbf{v}_n\) span \(V\).
To prove 2, suppose that \(\mathbf{v}_1,\dots,\mathbf{v}_n\) span \(V\). If \(\mathbf{v}_1,\dots,\mathbf{v}_n\) are linearly dependent, then one vector \(\mathbf{v}_i\) can be written as a linear combination of the others, take \(i=n\) without loss of generality. It follows that \(\mathbf{v}_1,\dots,\mathbf{v}_{n-1}\) will still span \(V\), which contradicts with \(\dim V = n\), therefore \(\mathbf{v}_1, \dots, \mathbf{v}_n\) must be linearly independent.
Therefore no set fewer than \(n\) vectors can span \(V\), if \(\dim V = n\).
Change of basis
Definition: let \(V\) be a vector space and let \(E = \{\mathbf{e}_1, \dots \mathbf{e}_n\}\) be an ordered basis for \(V\). If \(\mathbf{v}\) is any element of \(V\), then \(\mathbf{v}\) can be written in the form
\[ \mathbf{v} = v_1 \mathbf{e}_1 + \dots + v_n \mathbf{e}_n, \]where \(v_1, \dots, v_n \in \mathbb{R}\) are the coordinates of \(\mathbf{v}\) relative to \(E\).
Row space and column space
Definition: if \(A\) is an \(m \times n\) matrix, the subspace of \(\mathbb{R}^{n}\) spanned by the row vectors of \(A\) is called the row space of \(A\). The subspace of \(\mathbb{R}^m\) spanned by the column vectors of \(A\) is called the column space of \(A\).
With the definition of a row space the following theorem may be posed.
Theorem: two row equivalent matrices have the same row space.
Proof:
Let \(A\) and \(B\) be two matrices, if \(B\) is row equivalent to \(A\) then \(B\) can be formed from \(A\) by a finite sequence of row operations. Thus the row vectors of \(B\) must be linear combinations of the row vectors of \(A\). Consequently, the row space of \(B\) must be a subspace of the row space of \(A\). Since \(A\) is row equivalent to \(B\), by the same reasoning, the row space of \(A\) is a subspace of the row space of \(B\).
With the definition of a column space a theorem posed in systems of linear equations may be restated as.
Theorem: a linear system \(A \mathbf{x} = \mathbf{b}\) is consistent if and only if \(\mathbf{b}\) is in the column space of \(A\).
Proof:
For the proof, see the initial proof in systems of linear equations.
With this restatement the following statements may be proposed.
Proposition: let \(A\) be an \(m \times n\) matrix. The linear system \(A \mathbf{x} = \mathbf{b}\) is consistent for every \(\mathbf{b} \in \mathbb{R}^m\) if and only if the column vectors of \(A\) span \(\mathbb{R}^m\).
The system \(A \mathbf{x} = \mathbf{b}\) has at most one solution for every \(\mathbf{b}\) if and only if the column vectors of \(A\) are linearly independent.
Proof:
Let \(A\) be an \(m \times n\) matrix. It follows that \(A \mathbf{x} = \mathbf{b}\) will be consistent for every \(\mathbf{b} \in \mathbb{R}^m\) if and only if the column vectors of \(A\) span \(\mathbb{R}^m\). To prove the second statement, the system \(A \mathbf{x} = \mathbf{0}\) can have only the trivial solution and hence the column vectors of \(A\) must be linearly independent. Conversely, if the column vectors of \(A\) are linearly independent, \(A \mathbf{x} = \mathbf{0}\) has only the trivial solution. If \(\mathbf{x}_1, \mathbf{x}_2\) were both solutions of \(A \mathbf{x} = \mathbf{b}\) then \(\mathbf{x}_1 - \mathbf{x}_2\) would be a solution of \(A \mathbf{x} = \mathbf{0}\)
It follows that \(\mathbf{x}_1 - \mathbf{x}_2 = \mathbf{0}\) and hence \(\mathbf{x}_1 = \mathbf{x}_2\).
From these propositions the following corollary emerges.
Corollary: an \(n \times n\) matrix \(A\) is nonsingular if and only if the column vectors of \(A\) form a basis for \(\mathbb{R}^n\).
Proof:
Let \(A\) be an \(m \times n\) matrix. If the column vectors of \(A\) span \(\mathbb{R}^m\), then \(n\) must be greater or equal to \(m\), since no set of fewer than \(m\) vectors could span \(\mathbb{R}^m\). If the columns of \(A\) are linearly independent, then \(n\) must be less than or equal to \(m\), since every set of more than \(m\) vectors in \(\mathbb{R}^m\) is linearly dependent. Thus, if the column vectors of \(A\) form a basis for \(\mathbb{R}^m\), then \(n = m\).
Theorem: if \(A\) is an \(m \times n\) matrix, the dimension of the row space of \(A\) equals the dimension of the column space of \(A\).
Proof:
Will be added later.
Rank and nullity
Definition: the rank of a matrix \(A\), denoted as \(\text{rank}(A)\), is the dimension of the row space of \(A\).
The rank of a matrix may be determined by reducing the matrix to row echelon form. The nonzero rows of the row echelon matrix will form a basis for the row space. The rank may be interpreted as a measure for singularity of the matrix.
Definition: the nullity of a matrix \(A\), denoted as \(\text{nullity}(A)\), is the dimension of the null space of \(A\).
The nullity of \(A\) is the number of columns without a pivot in the reduced echelon form.
Theorem: if \(A\) is an \(m \times n\) matrix, then
\[ \text{rank}(A) + \text{nullity}(A) = n. \]
Proof:
Let \(U\) be the reduced echelon form of \(A\). The system \(A \mathbf{x} = \mathbf{0}\) is equivalent to the system \(U \mathbf{x} = \mathbf{0}\). If \(A\) has rank \(r\), then \(U\) will have \(r\) nonzero rows and consequently the system \(U \mathbf{x} = \mathbf{0}\) will involve \(r\) pivots and \(n - r\) free variables. The dimension of the null space will equal the number of free variables.