Notes on Linear Algebra Part 1

R
Linear Algebra
When is a cross product not a cross product? Terminology run amok!
Author

Bryan Hanson

Published

August 14, 2022

If you are already familiar with much of linear algebra, as well as the relevant functions in R, read no further and do something else!

If you are like me, you’ve had no formal training in linear algebra, which means you learn what you need to when you need to use it. Eventually, you cobble together some hard-won knowledge. That’s good, because almost everything in chemometrics involves linear algebra.

This post is essentially a set of personal notes about the dot product and the cross product, two important manipulations in linear algebra. I’ve tried to harmonize things I learned way back in college physics and math courses, and integrate information I’ve found in various sources I have leaned on more recently. Without a doubt, the greatest impediment to really understanding this material is the use of multiple terminologies and notations. I’m going to try really hard to be clear and to the point in my discussion.

The main sources I’ve relied on are:

Let’s get started. For sanity and consistency, let’s define two 3D vectors and two matrices to illustrate our examples. Most of the time I’m going to write vectors with an arrow over the name, as a nod to the treatment usually given in a physics course. This reminds us that we are thinking about a quantity with direction and magnitude in some coordinate system, something geometric. Of course in the R language a vector is simply a list of numbers with the same data type; R doesn’t care if a vector is a vector in the geometric sense or a list of states.

\[ \vec{u} = (u_x, u_y, u_z) \tag{1}\]

\[ \vec{v} = (v_x, v_y, v_z) \tag{2}\]

\[ \mathbf{A} = \begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ \end{bmatrix} \tag{3}\]

\[ \mathbf{B} = \begin{bmatrix} b_{11} & b_{12} & b_{13} \\ b_{21} & b_{22} & b_{23} \\ b_{31} & b_{32} & b_{33} \\ \end{bmatrix} \tag{4}\]

Dot Product

Terminology

The dot product goes by these other names: inner product, scalar product. Typical notations include:1

  • \(\vec{u} \cdot \vec{v}\) (the \(\cdot\) is the origin of the name “dot” product)
  • \(u \cdot v\)
  • \(u^\mathsf{T}v\) (when thinking of the vectors as column vectors)
  • \(\langle u, v \rangle\) (typically used when \(u, v\) are complex)
  • \(\langle u | v \rangle\)

Formulas

There are two main formulas for the dot product with vectors, the algebraic formula (Equation 5) and the geometric formula (Equation 6).

\[ \vec{u} \cdot \vec{v} = \sum{u_i v_i} \tag{5}\]

\[ \vec{u} \cdot \vec{v} = \| \vec{u} \| \| \vec{v} \| \cos \theta \tag{6}\]

\(\| \vec{x} \|\) refers to the \(L_2\) or Euclidean norm, namely the length of the vector:2

\[ \| \vec{x} \| = \sqrt{x^{2}_1 + \ldots + x^{2}_n} \tag{7}\]

The result of the dot product is a scalar. The dot product is also commutative: \(\vec{u} \cdot \vec{v} = \vec{v} \cdot \vec{u}\).
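To make Equations 5 through 7 concrete, here is a minimal sketch in R. The vectors u and v are made-up values of my own choosing, not something from the sources above.

```r
# Hypothetical example vectors (chosen only for illustration)
u <- c(1, 2, 3)
v <- c(4, 5, 6)

# Algebraic formula (Equation 5): sum of the element-wise products
dot_uv <- sum(u * v)

# Euclidean (L2) norm of each vector (Equation 7)
norm_u <- sqrt(sum(u^2))
norm_v <- sqrt(sum(v^2))

# Rearranging the geometric formula (Equation 6) gives the angle in radians
theta <- acos(dot_uv / (norm_u * norm_v))

dot_uv                          # 32
sum(v * u)                      # 32 again: the dot product is commutative
norm_u * norm_v * cos(theta)    # the geometric formula recovers the same value
```

Both formulas describe the same number; the algebraic one is what you compute, the geometric one is what it means.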

From the perspective of matrices, if we think of \(\vec{u}\) and \(\vec{v}\) as column vectors with dimensions 3 x 1, then transposing \(\vec{u}\) gives us conformable matrices and we find the result of matrix multiplication is the dot product (compare to Equation 5):

\[ \vec{u} \cdot \vec{v} = \vec{u}^\mathsf{T} \ \vec{v} = \begin{bmatrix} u_x & u_y & u_z\end{bmatrix} \begin{bmatrix} v_x \\ v_y \\ v_z \end{bmatrix} = u_x v_x + u_y v_y + u_z v_z \tag{8}\]

Even though this is matrix multiplication, the answer is still a scalar.

Now, rather confusingly, if we think of \(\vec{u}\) and \(\vec{v}\) as row vectors, and we transpose \(\vec{v}\), then we also get the dot product:

\[ \vec{u} \cdot \vec{v} = \vec{u} \ \vec{v}^\mathsf{T} = \begin{bmatrix} u_x & u_y & u_z\end{bmatrix} \begin{bmatrix} v_x \\ v_y \\ v_z \end{bmatrix} = u_x v_x + u_y v_y + u_z v_z \tag{9}\]

Equation 8 and Equation 9 can be a source of real confusion at first. They give the impression that the dot product can be either \(\vec{u}^\mathsf{T} \ \vec{v}\) or \(\vec{u} \ \vec{v}^\mathsf{T}\). However, each form is only valid in the limited context defined above. To summarize:

  • Thinking of the vectors as column vectors with dimensions \(n \times 1\) then one can use \(\vec{u}^\mathsf{T} \ \vec{v}\)
  • Thinking of the vectors as row vectors with dimensions \(1 \times n\) then one can use \(\vec{u} \ \vec{v}^\mathsf{T}\)

Unfortunately I think this distinction is not always clearly made by authors, and is a source of great confusion to linear algebra learners. Be careful when working with row and column vectors.
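One way to keep the row/column distinction straight is to build the vectors as explicit one-column or one-row matrices, so that R cannot guess for you. This is only a sketch; the names u_col, v_row and so on are mine.

```r
u <- c(1, 2, 3)
v <- c(4, 5, 6)

# Explicit column vectors (3 x 1) and row vectors (1 x 3)
u_col <- matrix(u, ncol = 1)
v_col <- matrix(v, ncol = 1)
u_row <- matrix(u, nrow = 1)
v_row <- matrix(v, nrow = 1)

# Column vectors: transpose the first one (Equation 8)
col_result <- t(u_col) %*% v_col    # (1 x 3)(3 x 1) gives 1 x 1

# Row vectors: transpose the second one (Equation 9)
row_result <- u_row %*% t(v_row)    # (1 x 3)(3 x 1) gives 1 x 1

dim(col_result)                     # 1 1
dim(row_result)                     # 1 1

# Applying the row-vector recipe to column vectors is NOT the dot product
wrong <- u_col %*% t(v_col)         # (3 x 1)(1 x 3) gives 3 x 3
dim(wrong)                          # 3 3
```

The last line is the trap: the same symbols in the wrong orientation silently give a matrix instead of a scalar.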

Matrix Multiplication

Suppose we wanted to compute \(\mathbf{A}\mathbf{B} = \mathbf{C}\).3 We use the idea of row and column vectors to accomplish this task. In the process, we discover that matrix multiplication is a series of dot products:

\[ \begin{multline} \mathbf{A}\mathbf{B} = \mathbf{C} = \begin{bmatrix} \textcolor{red}{a_{11}} & \textcolor{red}{a_{12}} & \textcolor{red}{a_{13}} \\ a_{21} & a_{22} & a_{23} \\ \end{bmatrix} \begin{bmatrix} \textcolor{red}{b_{11}} & b_{12} & b_{13} \\ \textcolor{red}{b_{21}} & b_{22} & b_{23} \\ \textcolor{red}{b_{31}} & b_{32} & b_{33} \\ \end{bmatrix} = \\ \begin{bmatrix} \textcolor{red}{a_{11}b_{11} + a_{12}b_{21} + a_{13}b_{31}} & a_{11}b_{12} + a_{12}b_{22} + a_{13}b_{32} & a_{11}b_{13} + a_{12}b_{23} + a_{13}b_{33}\\ a_{21}b_{11} + a_{22}b_{21} + a_{23}b_{31} & a_{21}b_{12} + a_{22}b_{22} + a_{23}b_{32} & a_{21}b_{13} + a_{22}b_{23} + a_{23}b_{33}\\ \end{bmatrix} \end{multline} \tag{10}\]

The red color shows how the dot product of the first row of \(\mathbf{A}\) and the first column of \(\mathbf{B}\) gives the first entry in \(\mathbf{C}\). Every entry in \(\mathbf{C}\) results from a dot product. Every entry is a scalar, embedded in a matrix.
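The claim that every entry of \(\mathbf{C}\) is a dot product can be checked directly. In this sketch A and B are filled with arbitrary integers of my choosing, with the same shapes as in Equations 3 and 4.

```r
# Arbitrary example matrices: A is 2 x 3, B is 3 x 3
A <- matrix(1:6, nrow = 2, ncol = 3, byrow = TRUE)
B <- matrix(1:9, nrow = 3, ncol = 3, byrow = TRUE)

# Build C one entry at a time: row i of A dotted with column j of B
C <- matrix(0, nrow = nrow(A), ncol = ncol(B))
for (i in seq_len(nrow(A))) {
  for (j in seq_len(ncol(B))) {
    C[i, j] <- sum(A[i, ] * B[, j])
  }
}

C[1, 1]                   # 30, i.e. 1*1 + 2*4 + 3*7
all.equal(C, A %*% B)     # TRUE: the loop reproduces matrix multiplication
```

The explicit loop is far slower than %*%, so treat it as an explanation rather than something to use.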

What Can We Do With the Dot Product?

  • Determine the angle between two vectors, as in Equation 6.
  • As such, determine if two vectors intersect at a right angle (at least in 2-3D). More generally, two vectors of any dimension are orthogonal if their dot product is zero.
  • Matrix multiplication, when applied repeatedly.
  • Compute the length of a vector, via \(\sqrt{v \cdot v}\)
  • Compute the projection of one vector on another, for instance how much of a force is along the \(x\)-direction? A verbal interpretation of \(\vec{u} \cdot \vec{v}\) is that it gives the amount of \(\vec{v}\) in the direction of \(\vec{u}\), scaled by the length of \(\vec{u}\); when \(\vec{u}\) is a unit vector it is exactly that amount.
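Here is a short sketch of the uses listed above. The helper dot() and the three vectors are my own inventions for illustration. Note that the projection divides by the length of the vector being projected onto.

```r
# Hypothetical helper and example vectors
dot <- function(a, b) sum(a * b)

u <- c(3, 4, 0)
v <- c(-4, 3, 0)
w <- c(2, 1, 5)

# Orthogonality: a zero dot product means a right angle
dot(u, v)                                    # 0

# Length of a vector via the dot product with itself
len_u <- sqrt(dot(u, u))                     # 5

# Angle between u and w, in radians (Equation 6 rearranged)
angle_uw <- acos(dot(u, w) / (len_u * sqrt(dot(w, w))))

# Scalar projection: how much of w lies along the direction of u
scalar_proj <- dot(u, w) / len_u             # 2

# Vector projection: that amount, pointing along u
vector_proj <- (dot(u, w) / dot(u, u)) * u   # 1.2 1.6 0.0
```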

Cross Product

Terminology and Notation

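The cross product is also known as the vector product. The names outer product4 and tensor product are sometimes attached to it as well, but strictly speaking they describe a related yet different operation, one that returns a matrix rather than a vector.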

Formulas

The cross product of two vectors returns a vector rather than a scalar. Vectors are defined in terms of a basis which is a coordinate system. Earlier, when we defined \(\vec{u} = (u_x, u_y, u_z)\) it was intrinsically defined in terms of the standard basis set \(\hat{i}, \hat{j}, \hat{k}\) (in some fields this would be called the unit coordinate system). Thus a fuller definition of \(\vec{u}\) would be:

\[ \vec{u} = u_x\hat{i} + u_y\hat{j} + u_z\hat{k} \tag{11}\]

In terms of vectors, the cross product is defined as: \[ \vec{u} \times \vec{v} = (u_{y}v_{z} - u_{z}v_{y})\hat{i} + (u_{z}v_{x} - u_{x}v_{z})\hat{j} + (u_{x}v_{y} - u_{y}v_{x})\hat{k} \tag{12}\]

In my opinion, this is not exactly intuitive, but there is a pattern to it: notice that the terms for \(\hat{i}\) don’t involve the \(x\) component, and likewise for the other two components. The details of how this result is computed rely on some properties of the basis set; this Wikipedia article has a nice explanation. We need not dwell on it however.

There is also a geometric formula for the cross product:

\[ \vec{u} \times \vec{v} = \| \vec{u} \| \| \vec{v} \| \sin \theta \, \hat{n} \tag{13}\]

where \(\hat{n}\) is the unit vector perpendicular to the plane defined by \(\vec{u}\) and \(\vec{v}\). The direction of \(\hat{n}\) is defined by the right-hand rule. Because of this, the cross product is not commutative, i.e. \(\vec{u} \times \vec{v} \ne \vec{v} \times \vec{u}\). The cross product is however anti-commutative: \(\vec{u} \times \vec{v} = - (\vec{v} \times \vec{u})\)
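Base R does not appear to ship a function for the 3D cross product of Equation 12, so the sketch below writes one by hand. The name cross3 is hypothetical, and the vectors are arbitrary examples.

```r
# Hypothetical helper implementing Equation 12 for 3D vectors
cross3 <- function(a, b) {
  stopifnot(length(a) == 3, length(b) == 3)
  c(a[2] * b[3] - a[3] * b[2],    # i component: no x terms
    a[3] * b[1] - a[1] * b[3],    # j component: no y terms
    a[1] * b[2] - a[2] * b[1])    # k component: no z terms
}

u <- c(1, 2, 3)
v <- c(4, 5, 6)

w <- cross3(u, v)
w                   # -3  6 -3
cross3(v, u)        #  3 -6  3, the negative: anti-commutative

# The result is perpendicular to both inputs (dot products are zero)
sum(w * u)          # 0
sum(w * v)          # 0

# Length agrees with the geometric formula (Equation 13)
theta <- acos(sum(u * v) / (sqrt(sum(u^2)) * sqrt(sum(v^2))))
all.equal(sqrt(sum(w^2)), sqrt(sum(u^2)) * sqrt(sum(v^2)) * sin(theta))
```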

As we did for the dot product, we can also multiply the two column vectors as matrices. Instead of transposing the first vector as we did for the dot product, we transpose the second one. Be aware that the result is a \(3 \times 3\) matrix, not the three-component vector of Equation 12: strictly speaking this is the outer (or tensor) product, written \(\vec{u} \otimes \vec{v}\), although it is often loosely lumped in with the cross product: \[ \vec{u} \otimes \vec{v} = \vec{u} \ \vec{v}^\mathsf{T} = \begin{bmatrix} u_x \\ u_y \\ u_z\end{bmatrix} \begin{bmatrix} v_x & v_y & v_z \end{bmatrix} = \begin{bmatrix} u_{x}v_{x} & u_{x}v_{y} & u_{x}v_{z} \\ u_{y}v_{x} & u_{y}v_{y} & u_{y}v_{z} \\ u_{z}v_{x} & u_{z}v_{y} & u_{z}v_{z}\\ \end{bmatrix} \tag{14}\]

Interestingly, each entry of this matrix is a one-term dot product, so the same row-times-column machinery is at work.
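The \(3 \times 3\) matrix of Equation 14 is easy to reproduce numerically. This is just a sketch with made-up vectors; the names u_col and v_col are mine.

```r
u <- c(1, 2, 3)
v <- c(4, 5, 6)

# Explicit 3 x 1 column vectors
u_col <- matrix(u, ncol = 1)
v_col <- matrix(v, ncol = 1)

# (3 x 1)(1 x 3) gives the 3 x 3 matrix of Equation 14
uvT <- u_col %*% t(v_col)
dim(uvT)                      # 3 3

# Entry [i, j] is simply u[i] * v[j]
uvT[2, 3]                     # 12, i.e. u_y * v_z = 2 * 6

# Swapping the roles of u and v transposes the matrix
all.equal(v_col %*% t(u_col), t(uvT))   # TRUE
```

Compare the shape with Equation 12: there the result has three components, here it has nine.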

The case where we treat \(\vec{u}\) and \(\vec{v}\) as row vectors is left to the reader.5

Finally, there is a matrix definition of the cross product as well. Evaluation of the following determinant gives the cross product:

\[ \vec{u} \times \vec{v} = \begin{vmatrix} \hat{i} & \hat{j} & \hat{k}\\ u_{x} & u_{y} & u_{z} \\ v_{x} & v_{y} & v_{z}\\\end{vmatrix} \]
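The determinant can be expanded along its first row, which turns each component into a small \(2 \times 2\) determinant. The sketch below does this with det(); the helper name minor2 and the example vectors are assumptions of mine.

```r
u <- c(1, 2, 3)
v <- c(4, 5, 6)

# Hypothetical helper: 2 x 2 determinant built from the chosen columns
# of the rows (u) and (v), i.e. the minor left after deleting one column
minor2 <- function(cols) det(rbind(u[cols], v[cols]))

# Cofactor expansion along the first row: signs alternate + - +
w <- c( minor2(c(2, 3)),     # i component: delete the x column
       -minor2(c(1, 3)),     # j component: delete the y column
        minor2(c(1, 2)))     # k component: delete the z column

# Agrees with Equation 12 worked out by hand for these vectors
all.equal(w, c(-3, 6, -3))   # TRUE
```

all.equal() is used rather than == because det() works in floating point.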

What Can We Do With the Cross Product?

  • In 3D, the result of the cross product is perpendicular or normal to the plane defined by the two input vectors.
  • If, however, the two vectors are parallel or anti-parallel, the cross product is the zero vector.
  • The length of the cross product is the area of the parallelogram defined by the two input vectors: \(\| \vec{u} \times \vec{v} \|\)
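These properties can be illustrated with a parallelogram whose area is obvious by inspection. As before, cross3 is a hypothetical hand-written helper for Equation 12, and the vectors are my own examples.

```r
# Hypothetical helper implementing Equation 12
cross3 <- function(a, b) {
  c(a[2] * b[3] - a[3] * b[2],
    a[3] * b[1] - a[1] * b[3],
    a[1] * b[2] - a[2] * b[1])
}

# A parallelogram in the x-y plane with base 2 and height 3
u <- c(2, 0, 0)
v <- c(1, 3, 0)

w <- cross3(u, v)
w                       # 0 0 6: normal to the x-y plane
area <- sqrt(sum(w^2))  # 6 = base * height

# Parallel and anti-parallel vectors give the zero vector
cross3(u, 2 * u)        # 0 0 0
cross3(u, -u)           # 0 0 0
```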

R Functions

%*%

The workhorse for matrix multiplication in R is the %*% function. This function will accept any combination of vectors and matrices as inputs, so it is flexible. It is also smart: given a vector and a matrix, the vector will be treated as a row or column matrix as needed to ensure conformity, if possible. Let’s look at some examples:

# Some data for examples
p <- 1:5
q <- 6:10
M <- matrix(1:15, nrow = 3, ncol = 5)
M
     [,1] [,2] [,3] [,4] [,5]
[1,]    1    4    7   10   13
[2,]    2    5    8   11   14
[3,]    3    6    9   12   15
# A vector times a vector
p %*% q
     [,1]
[1,]  130

Notice that R returns a data type of matrix, but it is a \(1 \times 1\) matrix, and thus effectively a scalar value. That means we just computed the dot product, a decision R made internally. We can verify this by noting that q %*% p gives the same answer. Thus, R handled these vectors as column vectors and computed \(p^\mathsf{T}q\).

# A vector times a matrix
M %*% p
     [,1]
[1,]  135
[2,]  150
[3,]  165

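As M had dimensions \(3 \times 5\), R treated p as a \(5 \times 1\) column vector in order to be conformable. The result is a \(3 \times 1\) matrix in which each entry is the dot product of one row of M with p. This is ordinary matrix-vector multiplication; despite returning a vector, it is not the cross product of Equation 12.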

If we try to compute p %*% M we get an error, because there is nothing R can do to p which will make it conformable to M.

p %*% M
Error in p %*% M: non-conformable arguments

What about multiplying matrices?

M %*% M
Error in M %*% M: non-conformable arguments

As you can see, when dealing with matrices, %*% will not change a thing, and if your matrices are non-conformable then it’s an error. Of course, if we transpose either instance of M we do have conformable matrices, but the answers are different, and this is neither the dot product nor the cross product, just matrix multiplication.

t(M) %*% M
     [,1] [,2] [,3] [,4] [,5]
[1,]   14   32   50   68   86
[2,]   32   77  122  167  212
[3,]   50  122  194  266  338
[4,]   68  167  266  365  464
[5,]   86  212  338  464  590
M %*% t(M)
     [,1] [,2] [,3]
[1,]  335  370  405
[2,]  370  410  450
[3,]  405  450  495

What can we take from these examples?

  • R will give you the dot product if you give it two vectors of the same length. Note that this is a design decision, as it could have returned the matrix of Equation 14 instead.
  • R will promote a vector to a row or column matrix if it can to make it conformable with a matrix you provide. If it cannot, R will give you an error. If it can, the ordinary matrix product is returned.
  • When it comes to two matrices, R will give an error when they are not conformable.
  • One function, %*%, does it all: dot product, the matrix of Equation 14 (if you transpose one vector yourself), or general matrix multiplication, but you need to pay attention.
  • The documentation says as much, but more tersely: “Multiplies two matrices, if they are conformable. If one argument is a vector, it will be promoted to either a row or column matrix to make the two arguments conformable. If both are vectors of the same length, it will return the inner product (as a matrix)”
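A few extra checks along these lines, as a sketch using the same p, q and M as above. The use of drop() and tryCatch() is my own addition, not something from the examples so far.

```r
p <- 1:5
q <- 6:10
M <- matrix(1:15, nrow = 3, ncol = 5)

# The 1 x 1 result can be reduced to a plain number with drop()
drop(p %*% q)              # 130

# M is 3 x 5, so p is promoted to a 5 x 1 column matrix
dim(M %*% p)               # 3 1

# t(M) is 5 x 3, so this time p is promoted to a 1 x 5 row matrix
dim(p %*% t(M))            # 1 3

# Capture the error message instead of halting the script
msg <- tryCatch(p %*% M, error = function(e) conditionMessage(e))
msg
```

Same vector, three different treatments, which is exactly why checking dim() on the result is a good habit.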

Other Functions

There are other R functions that do some of the same work:

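  • crossprod: crossprod(M) is equivalent to t(M) %*% M, and crossprod(x, y) to t(x) %*% y, but faster.
  • tcrossprod: tcrossprod(M) is equivalent to M %*% t(M), and tcrossprod(x, y) to x %*% t(y), but faster.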
  • outer or %o%

The first two functions will accept combinations of vectors and matrices, as does %*%. Let’s try it with two vectors:

crossprod(p, q)
     [,1]
[1,]  130

Huh. crossprod is returning the dot product! So this is the case where “the cross product is not the cross product.” From a clarity perspective, this is not ideal. Let’s try the other function:

tcrossprod(p, q)
     [,1] [,2] [,3] [,4] [,5]
[1,]    6    7    8    9   10
[2,]   12   14   16   18   20
[3,]   18   21   24   27   30
[4,]   24   28   32   36   40
[5,]   30   35   40   45   50

There’s the matrix of Equation 14!

What about outer? Remember that “outer product” appeared above as one of the names attached to the cross product. So is outer the same as tcrossprod? In the case of two vectors, it is:

identical(outer(p, q), tcrossprod(p, q))
[1] TRUE

What about a vector with a matrix?

tst <- outer(p, M)
dim(tst)
[1] 5 3 5

Alright, that clearly is not a cross product. The result is an array with dimensions \(5 \times 3 \times 5\), not a matrix (which would have only two dimensions). outer does reproduce the matrix of Equation 14 in the case of two vectors, but anything with higher dimensions gives a different beast. So perhaps using “outer” as a synonym for cross product is not a good idea.
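The equivalences discussed in this section can be collected into one sketch, using the same p, q and M as before. The result names are mine; all.equal() is used because I only care that the numbers match.

```r
p <- 1:5
q <- 6:10
M <- matrix(1:15, nrow = 3, ncol = 5)

# crossprod and tcrossprod with a single matrix
same_cp  <- all.equal(crossprod(M),  t(M) %*% M)     # 5 x 5 result
same_tcp <- all.equal(tcrossprod(M), M %*% t(M))     # 3 x 3 result

# With two vectors: crossprod is the dot product ...
same_dot <- all.equal(crossprod(p, q), t(p) %*% q)   # 1 x 1 result

# ... and tcrossprod is the 5 x 5 matrix of Equation 14
same_out <- all.equal(tcrossprod(p, q), p %*% t(q))

# %o% is shorthand for outer()
same_o <- identical(p %o% q, outer(p, q))

c(same_cp, same_tcp, same_dot, same_out, same_o)     # all TRUE

# outer() with a matrix returns a 3D array, not a matrix
dim(outer(p, M))                                     # 5 3 5
```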

Advice

Given what we’ve seen above, make your life simple and stick to %*%, and pay close attention to the dimensions of the arguments, especially if row or column vectors are in use. In my experience, thinking about the units and dimensions of whatever it is you are calculating is very helpful. Later, if speed is really important in your work, you can use one of the faster alternatives.
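If you want R to stop guessing about row versus column vectors, one option is a small wrapper that converts everything to explicit matrices and reports the dimensions when they do not match. This is only a sketch of the idea; mult_checked is a hypothetical helper, not a function from base R or any package.

```r
# Hypothetical helper: multiply only when the dimensions really conform.
# as.matrix() turns a bare vector into a one-column matrix, so the caller
# must transpose explicitly to get a row vector.
mult_checked <- function(x, y) {
  x <- as.matrix(x)
  y <- as.matrix(y)
  if (ncol(x) != nrow(y)) {
    stop(sprintf("non-conformable: (%d x %d) times (%d x %d)",
                 nrow(x), ncol(x), nrow(y), ncol(y)))
  }
  x %*% y
}

p <- 1:5
q <- 6:10

dot_pq <- mult_checked(t(p), q)    # (1 x 5)(5 x 1): the dot product
out_pq <- mult_checked(p, t(q))    # (5 x 1)(1 x 5): the 5 x 5 matrix

# Two bare vectors are both columns here, so this is refused
msg <- tryCatch(mult_checked(p, q), error = function(e) conditionMessage(e))
msg
```

The trade-off is that you lose the convenience of automatic promotion, but the error message tells you exactly which shapes collided.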

Footnotes

  1. An extensive discussion of notations can be found here.↩︎

  2. And curiously, the \(L_2\) norm works out to be equal to the square root of the dot product of a vector with itself: \(\| v \| = \sqrt{v \cdot v}\)↩︎

  3. To be multiplied, matrices must be conformable, namely the number of columns of the first matrix must match the number of rows of the second matrix. The reason is so that the dot product terms will match. In the present case we have \(\mathbf{A}^{2\times3}\mathbf{B}^{3\times3} = \mathbf{C}^{2\times3}\).↩︎

  4. Be careful, it turns out that “outer” may not be a great synonym for cross product, as explained later.↩︎

  5. OK fine, here is the answer when treating \(\vec{u}\) and \(\vec{v}\) as row vectors: the same matrix is obtained from \(\vec{u}^\mathsf{T} \ \vec{v}\), which expands exactly as the right-hand side of Equation 14.↩︎

Citation

BibTeX citation:
@online{hanson2022,
  author = {Bryan Hanson},
  editor = {},
  title = {Notes on {Linear} {Algebra} {Part} 1},
  date = {2022-08-14},
  url = {http://chemospec.org/2022-08-14-Linear-Alg-Notes.html},
  langid = {en}
}
For attribution, please cite this work as:
Bryan Hanson. 2022. “Notes on Linear Algebra Part 1.” August 14, 2022. http://chemospec.org/2022-08-14-Linear-Alg-Notes.html.