Vector spaces

Definition

Definition: a vector space \(V\) is a set on which the operations of addition and scalar multiplication are defined. Such that for all vectors \(\mathbf{u}\) and \(\mathbf{v}\) in \(V\) the vectors \(\mathbf{u} + \mathbf{v}\) are in \(V\) and for each scalar \(a\) the vector \(a\mathbf{v}\) is in \(V\). With the following axioms satisfied.

Associativity of vector addition: \(\mathbf{u} + (\mathbf{v} + \mathbf{w}) = (\mathbf{u} + \mathbf{v}) + \mathbf{w}\) for any \(\mathbf{u},\mathbf{v}, \mathbf{w} \in V\).

Commutativity of vector addition: \(\mathbf{u} + \mathbf{v} = \mathbf{v} + \mathbf{u}\) for any \(\mathbf{u},\mathbf{v} \in V\).

Identity element of vector addition: \(\exists \mathbf{0} \in V\) such that \(\mathbf{v} + \mathbf{0}\) for all \(\mathbf{v} \in V\).

Inverse element of vector addition: \(\forall \mathbf{v} \in V \exists (-\mathbf{v}) \in V\) such that \(\mathbf{v} + (-\mathbf{v}) = \mathbf{0}\).

Distributivity of scalar multiplication with respect to vector addition: \(a(\mathbf{u} + \mathbf{v}) = a\mathbf{u} + a\mathbf{v}\) for any scalar \(a\) and any \(\mathbf{u}, \mathbf{v} \in V\).

Distributivity of scalar multiplication with respect to field addition: \((a + b) \mathbf{v} = a \mathbf{v} + b \mathbf{v}\) for any scalars \(a\) and \(b\) and any \(\mathbf{v} \in V\).

Compatibility of scalar multiplication with field multiplication: \(a(b\mathbf{v}) = (ab) \mathbf{v}\) for any scalars \(a\) and \(b\) and any \(\mathbf{v} \in V\).

Identity element of scalar multiplication: \(1 \mathbf{v} = \mathbf{v}\) for all \(\mathbf{v} \in V\).

Some important properties of a vector space can be derived from this definition in the following proposition a few have been listed.

Proposition: if \(V\) is a vector space and \(\mathbf{u}\), \(\mathbf{v}\) is in \(V\), then

\(0 \mathbf{v} = \mathbf{0}\).

\(\mathbf{u} + \mathbf{v} = \mathbf{0} \implies \mathbf{u} = - \mathbf{v}\).

\((-1)\mathbf{v} = - \mathbf{v}\).

Proof:

For 1, suppose \(\mathbf{v} \in V\) then it follows from axioms 3, 6 and 8

\[ \mathbf{v} = 1 \mathbf{v} = (1 + 0)\mathbf{v} = 1 \mathbf{v} + 0 \mathbf{v} = \mathbf{v} + 0\mathbf{v}, \]

therefore

\[ \begin{align*} -\mathbf{v} + \mathbf{v} &= - \mathbf{v} + (\mathbf{v} + 0\mathbf{v}) = (-\mathbf{v} + \mathbf{v}) + 0\mathbf{v}, \\ \mathbf{0} &= \mathbf{0} + 0\mathbf{v} = 0\mathbf{v}. \end{align*} \]

For 2, suppose for \(\mathbf{u}, \mathbf{v} \in V\) that \(\mathbf{u} + \mathbf{v} = \mathbf{0}\) then it follows from axioms 1, 3 and 4

\[ - \mathbf{v} = - \mathbf{v} + \mathbf{0} = - \mathbf{v} + (\mathbf{v} + \mathbf{u}), \]

therefore

\[ -\mathbf{v} = (-\mathbf{v} + \mathbf{v}) + \mathbf{u} = \mathbf{0} + \mathbf{u} = \mathbf{u}. \]

For 3, suppose \(\mathbf{v} \in V\) then it follows from 1 and axioms 4 and 6

\[ \mathbf{0} = 0 \mathbf{v} = (1 + (-1))\mathbf{v} = 1\mathbf{v} + (-1)\mathbf{v}, \]

therefore

\[ \mathbf{v} + (-1)\mathbf{v} = \mathbf{0}, \]

from 2 it follows then that

\[ (-1)\mathbf{v} = -\mathbf{v}. \]

Euclidean spaces

Perhaps the most elementary vector spaces are the Euclidean vector spaces \(V = \mathbb{R}^n\) with \(n \in \mathbb{N}\). Given a nonzero vector \(\mathbf{u} \in \mathbb{R}^n\) defined by

\[ \mathbf{u} = \begin{pmatrix}u_1 \\ \vdots \\ u_n\end{pmatrix}, \]

it may be associated with the directed line segment from \((0, \dots, 0)\) to \((u_1, \dots, u_n)\). Or more generally line segments that have the same length and direction can be represented by any line segment from \((a_1, \dots, a_n)\) to \((a_1 + u_1, \dots, a_n + u_n)\). Vector addition and scalar multiplication in \(\mathbb{R}^n\) are respectively defined by

\[ \mathbf{u} + \mathbf{v} = \begin{pmatrix} u_1 + v_1 \\ \vdots \\ u_n + v_n \end{pmatrix} \quad \text{ and } \quad a \mathbf{u} = \begin{pmatrix} a u_1 \\ \vdots \\ a u_n \end{pmatrix}, \]

for any \(\mathbf{u}, \mathbf{v} \in \mathbb{R}^n\) and any scalar \(a\).

This can be extended to matrices with \(V = \mathbb{R}^{m \times n}\) with \(m,n \in \mathbb{N}\), the set of all matrices. Given a nonzero matrix \(A \in \mathbb{R}^{m \times n}\) defined by \(A = (a_{ij})\). Matrix addition and scalar multiplication in \(\mathbb{R}^{m \times n}\) are respectively defined by

\[ A + B = C \iff a_{ij} + b_{ij} = c_{ij} \quad \text{ and } \quad \alpha A = C \iff \alpha a_{ij} = c_{ij}, \]

for any \(A, B, C \in \mathbb{R}^{m \times n}\) and any scalar \(\alpha\).

Function spaces

Let \(V\) be a vector space over a field \(F\) and let \(X\) be any set. The functions \(X \to F\) can be given the structure of a vector space over \(F\) where the operations are defined by

\[ \begin{align*} (f + g)(x) = f(x) + g(x), \\ (af)(x) = af(x), \end{align*} \]

for any \(f,g: X \to F\), any \(x \in X\) and any \(a \in F\).

Polynomial spaces

Let \(P_n\) denote the set of all polynomials of degree less than \(n \in \mathbb{N}\) where the operations are defined by

\[ \begin{align*} (p+q)(x) = p(x) + q(x), \\ (ap)(x) = ap(x), \end{align*} \]

for any \(p,q: X \to P_n\), any \(x \in X\) and any \(a \in P_n\).

Vector subspaces

Definition: if \(S\) is a nonempty subset of a vector space \(V\) and \(S\) satisfies the conditions

\(a \mathbf{u} \in S\) whenever \(\mathbf{u} \in S\) for any scalar \(a\).

\(\mathbf{u} + \mathbf{v} \in S\) whenever \(\mathbf{u}, \mathbf{v} \in S\).

then \(S\) is said to be a subspace of \(V\).

In a vector space \(V\) it can be readily verified that \(\{\mathbf{0}\}\) and \(V\) are subspaces of \(V\). All other subspaces are referred to as proper subspaces and \(\{\mathbf{0}\}\) is referred to as the zero subspace.

Theorem: Every subspace of a vector space is a vector space.

Proof:

May be proved by testing if all axioms remain valid for the definition of a subspace.

The null space of a matrix

Definition: let \(A \in \mathbb{R}^{m \times n}\), \(\mathbf{x} \in \mathbb{R}^n\) and let \(N(A)\) denote the set of all solutions of the homogeneous system \(A\mathbf{x} = \mathbf{0}\). Therefore

\[ N(A) = \{\mathbf{x} \in \mathbb{R}^n \;|\; A \mathbf{x} = \mathbf{0}\}, \]

referred to as the null space of \(A\).

Claiming that \(N(A)\) is a subspace of \(\mathbb{R}^n\). Clearly \(\mathbf{0} \in N(A)\) so \(N(A)\) is nonempty. If \(\mathbf{x} \in N(A)\) and \(\alpha\) is a scalar then

\[ A(\alpha \mathbf{x}) = \alpha A\mathbf{x} = \alpha \mathbf{0} = \mathbf{0} \]

and hence \(\alpha \mathbf{x} \in N(A)\). If \(\mathbf{x}, \mathbf{y} \in N(A)\) then

\[ A(\mathbf{x} + \mathbf{y}) = A\mathbf{x} + A\mathbf{y} = \mathbf{0} + \mathbf{0} = \mathbf{0} \]

therefore \(\mathbf{x} + \mathbf{y} \in N(A)\) and it follows that \(N(A)\) is a subspace of \(\mathbb{R}^n\).

The span of a set of vectors

Definition: let \(\mathbf{v}_1, \dots, \mathbf{v}_n\) be vectors in a vector space \(V\) with \(n \in \mathbb{N}\). A sum of the form

\[ a_1 \mathbf{v}_1 + \dots + a_n \mathbf{v}_n, \]

with scalars \(a_1, \dots, a_n\) is called a linear combination of \(\mathbf{v}_1, \dots, \mathbf{v}_n\).

The set of all linear combinations of \(\mathbf{v}_1, \dots, \mathbf{v}_n\) is called the span of \(\mathbf{v}_1, \dots, \mathbf{v}_n\) which is denoted by \(\text{span}(\mathbf{v}_1, \dots, \mathbf{v}_n)\).

The nullspace can be for example defined by a span of vectors.

Theorem: if \(\mathbf{v}_1, \dots, \mathbf{v}_n\) are vectors in a vector space \(V\) with \(n \in \mathbb{N}\) then \(\text{span}(\mathbf{v}_1, \dots, \mathbf{v}_n)\) is a subspace of \(V\).

Proof:

Let \(b\) be a scalar and \(\mathbf{u} \in \text{span}(\mathbf{v}_1, \dots, \mathbf{v}_n)\) given by

\[ a_1 \mathbf{v}_1 + \dots + a_n \mathbf{v}_n, \]

with scalars \(a_1, \dots, a_n\). Since

\[ b \mathbf{u} = (b a_1)\mathbf{v}_1 + \dots + (b a_n)\mathbf{v}_n, \]

it follows that \(b \mathbf{u} \in \text{span}(\mathbf{v}_1, \dots, \mathbf{v}_n)\).

If we also have \(\mathbf{w} \in \text{span}(\mathbf{v}_1, \dots, \mathbf{v}_n)\) given by

\[ b_1 \mathbf{v}_1 + \dots + b_n \mathbf{v}_n, \]

with scalars \(b_1, \dots, b_n\). Then

\[ \mathbf{u} + \mathbf{w} = (a_1 + b_1) \mathbf{v}_1 + \dots + (a_n + b_n)\mathbf{v}_n, \]

it follows that \(\mathbf{u} + \mathbf{w} \in \text{span}(\mathbf{v}_1, \dots, \mathbf{v}_n)\) is a subspace of \(V\).

For example, a vector \(\mathbf{x} \in \mathbb{R}^3\) is in \(\text{span}(\mathbf{e}_1, \mathbf{e}_2)\) if and only if it lies in the \(x_1 x_2\)-plane in 3-space. Thus we can think of the \(x_1 x_2\)-plane as the geometrical representation of the subspace \(\text{span}(\mathbf{e}_1, \mathbf{e}_2)\).

Definition: the set \(\{\mathbf{v}_1, \dots, \mathbf{v}_n\}\) with \(n \in \mathbb{N}\) is a spanning set for \(V\) if and only if every vector \(V\) can be written as a linear combination of \(\mathbf{v}_1, \dots, \mathbf{v}_n\).

Linear independence

We have the following observations.

Proposition: if \(\mathbf{v}_1, \dots, \mathbf{v}_n\) with \(n \in \mathbb{N}\) span a vector space \(V\) and one of these vectors can be written as a linear combination of the other \(n-1\) vectors then those \(n-1\) vectors span \(V\).

Proof:

suppose \(\mathbf{v}_n\) with \(n \in \mathbb{N}\) can be written as a linear combination of the vectors \(\mathbf{v}_1, \dots, \mathbf{v}_{n-1}\) given by

\[ \mathbf{v}_n = a_1 \mathbf{v}_1 + \dots + a_{n-1} \mathbf{v}_{n-1}. \]

Let \(\mathbf{v}\) be any element of \(V\). Since we have

\[ \begin{align*} \mathbf{v} &= b_1 \mathbf{v}_1 + \dots + b_{n-1} \mathbf{v}_{n-1} + b_n \mathbf{v}_n, \\ &= b_1 \mathbf{v}_1 + \dots + b_{n-1} \mathbf{v}_{n-1} + b_n (a_1 \mathbf{v}_1 + \dots + a_{n-1} \mathbf{v}_{n-1}), \\ &= (b_1 + b_n a_1)\mathbf{v}_1 + \dots + (b_{n-1} + b_n a_{n-1}) \mathbf{v}_{n-1}, \end{align*} \]

we can write any vector \(\mathbf{v} \in V\) as a linear combination of \(\mathbf{v}_1, \dots, \mathbf{v}_{n-1}\) and hence these vectors span \(V\).

Proposition: given \(n\) vectors \(\mathbf{v}_1, \dots, \mathbf{v}_n\) with \(n \in \mathbb{N}\), it is possible to write one of the vectors as a linear combination of the other \(n-1\) vectors if and only if there exist scalars \(a_1, \dots, a_n\) not all zero such that

\[ a_1 \mathbf{v}_1 + \dots + a_n \mathbf{v}_n = \mathbf{0}. \]

Proof:

Suppose that one of the vectors \(\mathbf{v}_1, \dots, \mathbf{v}_n\) with \(n \in \mathbb{N}\) can be written as a linear combination of the others

\[ \mathbf{v}_n = a_1 \mathbf{v}_1 + \dots + a_{n-1} \mathbf{v}_{n-1}. \]

Subtracting \(\mathbf{v}_n\) from both sides obtains

\[ a_1 \mathbf{v}_1 + \dots + a_{n-1} \mathbf{v}_{n-1} - \mathbf{v}_n = \mathbf{0}, \]

we have \(a_n = -1\) and

\[ a_1 \mathbf{v}_1 + \dots + a_n\mathbf{v}_n = \mathbf{0}. \]

We may use these oberservations to state the following definitions.

Definition: the vectors \(\mathbf{v}_1, \dots, \mathbf{v}_n\) in a vector space \(V\) with \(n \in \mathbb{N}\) are said to be linearly independent if

\[ a_1 \mathbf{v}_1 + \dots + a_n \mathbf{v}_n = \mathbf{0} \implies \forall i \in \{1, \dots, n\} [c_i = 0]. \]

It follows from the above propositions that if \(\{\mathbf{v}_1, \dots, \mathbf{v}_n\}\) is a minimal spanning set of a vector space \(V\) then \(\mathbf{v}_1, \dots, \mathbf{v}_n\) are linearly independent. A minimal spanning set is called a basis of the vector space.

Definition: the vectors \(\mathbf{v}_1, \dots, \mathbf{v}_n\) in a vector space \(V\) with \(n \in \mathbb{N}\) are said to be linearly dependent if there exists scalars \(a_1, \dots, a_n\) not all zero such that

\[ a_1 \mathbf{v}_1 + \dots + a_n \mathbf{v}_n = \mathbf{0}. \]

It follows from the above propositions that if a set of vectors is linearly dependent then at least one vector is a linear combination of the other vectors.

Theorem: let \(\mathbf{x}_1, \dots, \mathbf{x}_n\) be vectors in \(\mathbb{R}^n\) with \(n \in \mathbb{N}\) and let \(X = (\mathbf{x}_1, \dots, \mathbf{x}_n)\). The vectors \(\mathbf{x}_1, \dots, \mathbf{x}_n\) will be linearly dependent if and only if \(X\) is singular.

Proof:

Let \(\mathbf{x}_1, \dots, \mathbf{x}_n\) be vectors in \(\mathbb{R}^n\) with \(n \in \mathbb{N}\) and let \(X = (\mathbf{x}_1, \dots, \mathbf{x}_n)\). Suppose we have the linear combination given by

\[ a_1 \mathbf{x}_1 + \dots + a_n \mathbf{x}_n = \mathbf{0}, \]

can be rewritten as a matrix equation by

\[ X\mathbf{a} = \mathbf{0}, \]

with \(\mathbf{a} = (a_1, \dots, a_n)^T\). This equation will have a nontrivial solution if and only if \(X\) is singular. Therefore \(\mathbf{x}_1, \dots, \mathbf{x}_n\) will be linearly dependent if and only if \(X\) is singular.

This result can be used to test whether \(n\) vectors are linearly independent in \(\mathbb{R}^n\) for \(n \in \mathbb{N}\).

Theorem: let \(\mathbf{v}_1, \dots, \mathbf{v}_n\) be vectors in a vector space \(V\) with \(n \in \mathbb{N}\). A vector \(\mathbf{v} \in \text{span}(\mathbf{v}_1, \dots, \mathbf{v}_n)\) can be written uniquely as a linear combination of \(\mathbf{v}_1, \dots, \mathbf{v}_n\) if and only if \(\mathbf{v}_1, \dots, \mathbf{v}_n\) are linearly independent.

Proof:

If \(\mathbf{v} \in \text{span}(\mathbf{v}_1, \dots \mathbf{v}_n)\) with \(n \in \mathbb{N}\) then \(\mathbf{v}\) can be written as a linear combination

\[ \mathbf{v} = a_1 \mathbf{v}_1 + \dots + a_n \mathbf{v}_n. \]

Suppose that \(\mathbf{v}\) can also be expressed as a linear combination

\[ \mathbf{v} = b_1 \mathbf{v}_1 + \dots + b_n \mathbf{v}_n. \]

If \(\mathbf{v}_1, \dots \mathbf{v}_n\) are linearly independent then subtracting both expressions yields

\[ (a_1 - b_1)\mathbf{v}_1 + \dots + (a_n - b_n)\mathbf{v}_n = \mathbf{0}. \]

By the linear independence of \(\mathbf{v}_1, \dots \mathbf{v}_n\), the coefficients must all be 0, hence

\[ a_1 = b_1,\; \dots \;, a_n = b_n \]

therefore the representation of \(\mathbf{v}\) is unique when \(\mathbf{v}_1, \dots \mathbf{v}_n\) are linearly independent.

On the other hand if \(\mathbf{v}_1, \dots \mathbf{v}_n\) are linearly dependent then the coefficients must not all be 0 and \(a_i \neq b_i\) for some \(i \in \{1, \dots, n\}\). Therefore the representation of \(\mathbf{v}\) is not unique when \(\mathbf{v}_1, \dots \mathbf{v}_n\) are linearly dependent.

Basis and dimension

Definition: the vectors \(\mathbf{v}_1,\dots,\mathbf{v}_n \in V\) form a basis if and only if

\(\mathbf{v}_1,\dots,\mathbf{v}_n\) are linearly independent,

\(\mathbf{v}_1,\dots,\mathbf{v}_n\) span \(V\).

Therefore, a basis may define a vector space, but it is not necessarily unique.

Theorem: if \(\{\mathbf{v}_1,\dots,\mathbf{v}_n\}\) is a spanning set for a vector space \(V\), then any collection of \(m\) vectors in \(V\) where \(m>n\), is linearly dependent.

Proof:

Let \(\mathbf{u}_1, \dots, \mathbf{u}_m \in V\), where \(m > n\). Then since \(\{\mathbf{v}_1,\dots,\mathbf{v}_n\}\) span \(V\) we have

\[ \mathbf{u}_i = a_{i1} \mathbf{v}_1 + \dots + a_{in} \mathbf{v}_n, \]

for \(i,j \in \{1, \dots, n\}\) with \(a_{ij} \in \mathbb{R}\).

A linear combination \(c_1 \mathbf{u}_1 + \dots + c_m \mathbf{u}_m\) can be written in the form

\[ c_1 \sum_{j=1}^n a_{1j} \mathbf{v}_j + \dots + c_m \sum_{j=1}^n a_{1j} a_{mj} \mathbf{v}_j, \]

obtaining

\[ c_1 \mathbf{u}_1 + \dots + c_m \mathbf{u}_m = \sum_{i=1}^m \bigg( c_i \sum_{j=1}^n a_{ij} \mathbf{v}_j \bigg) = \sum_{j=1}^n \bigg(\sum_{i=1}^m a_{ij} c_i \bigg) \mathbf{v}_j. \]

Considering the system of equations

\[ \sum_{i=1}^m a_{ij} c_i = 0 \]

for \(j \in \{1, \dots, n\}\), a homogeneous system with more unknowns than equations. Therefore the system must have a nontrivial solution \((\hat c_1, \dots, \hat c_m)^T\), but then

\[ \hat c_1 \mathbf{u}_1 + \dots + \hat c_m \mathbf{u}_m = \sum_{j=1}^n 0 \mathbf{v}_j = \mathbf{0}, \]

hence \(\mathbf{u}_1, \dots, \mathbf{u}_m\) are linearly dependent.

Corollary: if both \(\{\mathbf{v}_1,\dots,\mathbf{v}_n\}\) and \(\{\mathbf{u}_1,\dots,\mathbf{u}_m\}\) are bases for a vector space \(V\), then \(n = m\).

Proof:

Let both \(\{\mathbf{v}_1,\dots,\mathbf{v}_n\}\) and \(\{\mathbf{u}_1,\dots,\mathbf{u}_m\}\) be bases for \(V\). Since \(\mathbf{v}_1,\dots,\mathbf{v}_n\) span \(V\) and \(\mathbf{u}_1,\dots,\mathbf{u}_m\) are linearly independent then it follows that \(m \leq n\), similarly \(\mathbf{u}_1,\dots,\mathbf{u}_m\) span \(V\) and \(\mathbf{v}_1,\dots,\mathbf{v}_n\) are linearly independent so \(n \leq m\). Which must imply \(n=m\).

With this result we may now refer to the number of elements in any basis for a given vector space. Which leads to the following definition.

Definition: let \(V\) be a vector space. If \(V\) has a basis consisting of \(n \in \mathbb{N}\) vectors, then \(V\) has dimension \(n\). The subspace \(\{\mathbf{0}\}\) of \(V\) is said to have dimension \(0\). \(V\) is said to be finite dimensional if there is a finite set of vectors that spans \(V\), otherwise \(V\) is infinite dimensional.

So a single nonzero vector must span one-dimension exactly. For multiple vectors we have the following theorem.

Theorem: if \(V\) is a vector space of dimension \(n \in \mathbb{N}\ \backslash \{\mathbf{0}\}\), then

any set of \(n\) linearly independent vectors spans \(V\),

any \(n\) vectors that span \(V\) are linearly independent,

Proof:

To prove 1, suppose that \(\mathbf{v}_1,\dots,\mathbf{v}_n \in V\) are linearly independent and \(\mathbf{v} \in V\). Since \(V\) has dimension \(n\), it has a basis consisting of \(n\) vectors and these vectors span \(V\). It follows that \(\mathbf{v}_1,\dots,\mathbf{v}_n, \mathbf{v}\) must be linearly dependent. Thus there exist scalars \(c_1, \dots, c_n, c_{n+1}\) not all zero, such that

\[ c_1 \mathbf{v}_1 + \dots + c_n \mathbf{v}_n + c_{n+1} \mathbf{v} = \mathbf{0}. \]

The scalar \(c_{n+1}\) cannot be zero, since that would imply that \(\mathbf{v}_1,\dots,\mathbf{v}_n\) are linearly dependent, hence

\[ \mathbf{v} = a_1 \mathbf{v}_1 + \dots a_n \mathbf{v}_n, \]

with

\[ a_i = - \frac{c_i}{c_{n+1}} \]

for \(i \in \{1, \dots, n\}\). Since \(\mathbf{v}\) was an arbitrary vector in \(V\) it follows that \(\mathbf{v}_1, \dots, \mathbf{v}_n\) span \(V\).

To prove 2, suppose that \(\mathbf{v}_1,\dots,\mathbf{v}_n\) span \(V\). If \(\mathbf{v}_1,\dots,\mathbf{v}_n\) are linearly dependent, then one vector \(\mathbf{v}_i\) can be written as a linear combination of the others, take \(i=n\) without loss of generality. It follows that \(\mathbf{v}_1,\dots,\mathbf{v}_{n-1}\) will still span \(V\), which contradicts with \(\dim V = n\), therefore \(\mathbf{v}_1, \dots, \mathbf{v}_n\) must be linearly independent.

Therefore no set fewer than \(n\) vectors can span \(V\), if \(\dim V = n\).

Change of basis

Definition: let \(V\) be a vector space and let \(E = \{\mathbf{e}_1, \dots \mathbf{e}_n\}\) be an ordered basis for \(V\). If \(\mathbf{v}\) is any element of \(V\), then \(\mathbf{v}\) can be written in the form

\[ \mathbf{v} = v_1 \mathbf{e}_1 + \dots + v_n \mathbf{e}_n, \]

where \(v_1, \dots, v_n \in \mathbb{R}\) are the coordinates of \(\mathbf{v}\) relative to \(E\).

Row space and column space

Definition: if \(A\) is an \(m \times n\) matrix, the subspace of \(\mathbb{R}^{n}\) spanned by the row vectors of \(A\) is called the row space of \(A\). The subspace of \(\mathbb{R}^m\) spanned by the column vectors of \(A\) is called the column space of \(A\).

With the definition of a row space the following theorem may be posed.

Theorem: two row equivalent matrices have the same row space.

Proof:

Let \(A\) and \(B\) be two matrices, if \(B\) is row equivalent to \(A\) then \(B\) can be formed from \(A\) by a finite sequence of row operations. Thus the row vectors of \(B\) must be linear combinations of the row vectors of \(A\). Consequently, the row space of \(B\) must be a subspace of the row space of \(A\). Since \(A\) is row equivalent to \(B\), by the same reasoning, the row space of \(A\) is a subspace of the row space of \(B\).

With the definition of a column space a theorem posed in systems of linear equations may be restated as.

Theorem: a linear system \(A \mathbf{x} = \mathbf{b}\) is consistent if and only if \(\mathbf{b}\) is in the column space of \(A\).

Proof:

For the proof, see the initial proof in systems of linear equations.

With this restatement the following statements may be proposed.

Proposition: let \(A\) be an \(m \times n\) matrix. The linear system \(A \mathbf{x} = \mathbf{b}\) is consistent for every \(\mathbf{b} \in \mathbb{R}^m\) if and only if the column vectors of \(A\) span \(\mathbb{R}^m\).

The system \(A \mathbf{x} = \mathbf{b}\) has at most one solution for every \(\mathbf{b}\) if and only if the column vectors of \(A\) are linearly independent.

Proof:

Let \(A\) be an \(m \times n\) matrix. It follows that \(A \mathbf{x} = \mathbf{b}\) will be consistent for every \(\mathbf{b} \in \mathbb{R}^m\) if and only if the column vectors of \(A\) span \(\mathbb{R}^m\). To prove the second statement, the system \(A \mathbf{x} = \mathbf{0}\) can have only the trivial solution and hence the column vectors of \(A\) must be linearly independent. Conversely, if the column vectors of \(A\) are linearly independent, \(A \mathbf{x} = \mathbf{0}\) has only the trivial solution. If \(\mathbf{x}_1, \mathbf{x}_2\) were both solutions of \(A \mathbf{x} = \mathbf{b}\) then \(\mathbf{x}_1 - \mathbf{x}_2\) would be a solution of \(A \mathbf{x} = \mathbf{0}\)

\[ A(\mathbf{x}_1 - \mathbf{x}_2) = A\mathbf{x}_1 - A\mathbf{x}_2 = \mathbf{b} - \mathbf{b} = \mathbf{0}. \]

It follows that \(\mathbf{x}_1 - \mathbf{x}_2 = \mathbf{0}\) and hence \(\mathbf{x}_1 = \mathbf{x}_2\).

From these propositions the following corollary emerges.

Corollary: an \(n \times n\) matrix \(A\) is nonsingular if and only if the column vectors of \(A\) form a basis for \(\mathbb{R}^n\).

Proof:

Let \(A\) be an \(m \times n\) matrix. If the column vectors of \(A\) span \(\mathbb{R}^m\), then \(n\) must be greater or equal to \(m\), since no set of fewer than \(m\) vectors could span \(\mathbb{R}^m\). If the columns of \(A\) are linearly independent, then \(n\) must be less than or equal to \(m\), since every set of more than \(m\) vectors in \(\mathbb{R}^m\) is linearly dependent. Thus, if the column vectors of \(A\) form a basis for \(\mathbb{R}^m\), then \(n = m\).

Theorem: if \(A\) is an \(m \times n\) matrix, the dimension of the row space of \(A\) equals the dimension of the column space of \(A\).

Proof:

Will be added later.

Rank and nullity

Definition: the rank of a matrix \(A\), denoted as \(\text{rank}(A)\), is the dimension of the row space of \(A\).

The rank of a matrix may be determined by reducing the matrix to row echelon form. The nonzero rows of the row echelon matrix will form a basis for the row space. The rank may be interpreted as a measure for singularity of the matrix.

Definition: the nullity of a matrix \(A\), denoted as \(\text{nullity}(A)\), is the dimension of the null space of \(A\).

The nullity of \(A\) is the number of columns without a pivot in the reduced echelon form.

Theorem: if \(A\) is an \(m \times n\) matrix, then

\[ \text{rank}(A) + \text{nullity}(A) = n. \]

Proof:

Let \(U\) be the reduced echelon form of \(A\). The system \(A \mathbf{x} = \mathbf{0}\) is equivalent to the system \(U \mathbf{x} = \mathbf{0}\). If \(A\) has rank \(r\), then \(U\) will have \(r\) nonzero rows and consequently the system \(U \mathbf{x} = \mathbf{0}\) will involve \(r\) pivots and \(n - r\) free variables. The dimension of the null space will equal the number of free variables.