Notes on Linear Algebra Part 1
If you are already familiar with much of linear algebra, as well as the relevant functions in R, read no further and do something else!
If you are like me, you’ve had no formal training in linear algebra, which means you learn what you need to when you need to use it. Eventually, you cobble together some hard-won knowledge. That’s good, because almost everything in chemometrics involves linear algebra.
This post is essentially a set of personal notes about the dot product and the cross product, two important manipulations in linear algebra. I’ve tried to harmonize things I learned way back in college physics and math courses, and integrate information I’ve found in various sources I have leaned on more recently. Without a doubt, the greatest impediment to really understanding this material is the inconsistent terminology and notation in use. I’m going to try really hard to be clear and to the point in my discussion.
The main sources I’ve relied on are:
- The No Bullshit Guide to Linear Algebra by Ivan Savov. This is by far my favorite treatment of linear algebra. It gets to the point quickly.
- The Wikipedia pages on dot product, cross product and outer product.
Let’s get started. For sanity and consistency, let’s define two 3D vectors and two matrices to illustrate our examples. Most of the time I’m going to write vectors with an arrow over the name, as a nod to the treatment usually given in a physics course. This reminds us that we are thinking about a quantity with direction and magnitude in some coordinate system, something geometric. Of course, in the R language a vector is simply a set of numbers with the same data type; R doesn’t care if a vector is a vector in the geometric sense or a list of states.
\[ \vec{u} = (u_x, u_y, u_z) \tag{1}\]
\[ \vec{v} = (v_x, v_y, v_z) \tag{2}\]
\[ \mathbf{A} = \begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ \end{bmatrix} \tag{3}\]
\[ \mathbf{B} = \begin{bmatrix} b_{11} & b_{12} & b_{13} \\ b_{21} & b_{22} & b_{23} \\ b_{31} & b_{32} & b_{33} \\ \end{bmatrix} \tag{4}\]
Dot Product
Terminology
The dot product goes by these other names: inner product, scalar product. Typical notations include:
- \(\vec{u} \cdot \vec{v}\) (the \(\cdot\) is the origin of the name “dot” product)
- \(u \cdot v\)
- \(u^\mathsf{T}v\) (when thinking of the vectors as column vectors)
- \(\langle u, v \rangle\) (typically used when \(u, v\) are complex)
- \(\langle u | v \rangle\)
Formulas
There are two main formulas for the dot product with vectors, the algebraic formula (Equation 5) and the geometric formula (Equation 6).
\[ \vec{u} \cdot \vec{v} = \sum_i{u_i v_i} \tag{5}\]
\[ \vec{u} \cdot \vec{v} = \| \vec{u} \| \| \vec{v} \| \cos \theta \tag{6}\]
\(\| \vec{x} \|\) refers to the \(L_2\) or Euclidean norm, namely the length of the vector:[^1]
\[ \| \vec{x} \| = \sqrt{x^{2}_1 + \ldots + x^{2}_n} \tag{7}\]
The result of the dot product is a scalar. The dot product is also commutative: \(\vec{u} \cdot \vec{v} = \vec{v} \cdot \vec{u}\).
From the perspective of matrices, if we think of \(\vec{u}\) and \(\vec{v}\) as column vectors with dimensions \(3 \times 1\), then transposing \(\vec{u}\) gives us conformable matrices, and we find the result of matrix multiplication is the dot product (compare to Equation 5):
\[ \vec{u} \cdot \vec{v} = \vec{u}^\mathsf{T} \ \vec{v} = \begin{bmatrix} u_x & u_y & u_z\end{bmatrix} \begin{bmatrix} v_x \\ v_y \\ v_z \end{bmatrix} = u_x v_x + u_y v_y + u_z v_z \tag{8}\]
Even though this is matrix multiplication, the answer is still a scalar.
Now, rather confusingly, if we think of \(\vec{u}\) and \(\vec{v}\) as row vectors, and we transpose \(\vec{v}\), then we get the dot product:
\[ \vec{u} \cdot \vec{v} = \vec{u} \ \vec{v}^\mathsf{T} = \begin{bmatrix} u_x & u_y & u_z\end{bmatrix} \begin{bmatrix} v_x \\ v_y \\ v_z \end{bmatrix} = u_x v_x + u_y v_y + u_z v_z \tag{9}\]
Equation 8 and Equation 9 can be a source of real confusion at first. They give the impression that the dot product can be either \(\vec{u}^\mathsf{T} \ \vec{v}\) or \(\vec{u} \ \vec{v}^\mathsf{T}\). However, this is only true in the limited contexts defined above. To summarize:
- Thinking of the vectors as column vectors with dimensions \(n \times 1\), one can use \(\vec{u}^\mathsf{T} \ \vec{v}\)
- Thinking of the vectors as row vectors with dimensions \(1 \times n\), one can use \(\vec{u} \ \vec{v}^\mathsf{T}\)
Unfortunately I think this distinction is not always clearly made by authors, and is a source of great confusion to linear algebra learners. Be careful when working with row and column vectors.
Matrix Multiplication
Suppose we wanted to compute \(\mathbf{A}\mathbf{B} = \mathbf{C}\).[^2] We use the idea of row and column vectors to accomplish this task. In the process, we discover that matrix multiplication is a series of dot products:
\[ \begin{multline} \mathbf{A}\mathbf{B} = \mathbf{C} = \begin{bmatrix} \textcolor{red}{a_{11}} & \textcolor{red}{a_{12}} & \textcolor{red}{a_{13}} \\ a_{21} & a_{22} & a_{23} \\ \end{bmatrix} \begin{bmatrix} \textcolor{red}{b_{11}} & b_{12} & b_{13} \\ \textcolor{red}{b_{21}} & b_{22} & b_{23} \\ \textcolor{red}{b_{31}} & b_{32} & b_{33} \\ \end{bmatrix} = \\ \begin{bmatrix} \textcolor{red}{a_{11}b_{11} + a_{12}b_{21} + a_{13}b_{31}} & a_{11}b_{12} + a_{12}b_{22} + a_{13}b_{32} & a_{11}b_{13} + a_{12}b_{23} + a_{13}b_{33}\\ a_{21}b_{11} + a_{22}b_{21} + a_{23}b_{31} & a_{21}b_{12} + a_{22}b_{22} + a_{23}b_{32} & a_{21}b_{13} + a_{22}b_{23} + a_{23}b_{33}\\ \end{bmatrix} \end{multline} \tag{10}\]
The red color shows how the dot product of the first row of \(\mathbf{A}\) and the first column of \(\mathbf{B}\) gives the first entry in \(\mathbf{C}\). Every entry in \(\mathbf{C}\) results from a dot product. Every entry is a scalar, embedded in a matrix.
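We can check this entry-by-dot-product view numerically in R (jumping ahead to the `%*%` operator discussed below; the numeric values standing in for \(\mathbf{A}\) and \(\mathbf{B}\) here are made up just for illustration):

```r
A <- matrix(1:6, nrow = 2)   # a 2 x 3 matrix standing in for A
B <- matrix(1:9, nrow = 3)   # a 3 x 3 matrix standing in for B
C <- A %*% B

# the first entry of C is the dot product of row 1 of A with column 1 of B
C[1, 1]
#> [1] 22
sum(A[1, ] * B[, 1])
#> [1] 22
```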
What Can We Do With the Dot Product?
- Determine the angle between two vectors, as in Equation 6 (see the sketch after this list).
- As such, determine if two vectors intersect at a right angle (at least in 2D or 3D). More generally, two vectors of any dimension are orthogonal if their dot product is zero.
- Matrix multiplication, when applied repeatedly.
- Compute the length of a vector, via \(\sqrt{\vec{v} \cdot \vec{v}}\).
- Compute the projection of one vector onto another; for instance, how much of a force lies along the \(x\)-direction? Verbally, when \(\vec{u}\) is a unit vector, \(\vec{u} \cdot \vec{v}\) gives the amount of \(\vec{v}\) in the direction of \(\vec{u}\).
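For instance, here is a minimal sketch of the angle calculation from Equation 6, using a couple of made-up 3D vectors:

```r
u <- c(1, 2, 3)
v <- c(4, 5, 6)

# Equation 6 rearranged: cos(theta) = (u . v) / (||u|| ||v||)
cos_theta <- sum(u * v) / (sqrt(sum(u * u)) * sqrt(sum(v * v)))
round(acos(cos_theta) * 180 / pi, 2)   # the angle in degrees
#> [1] 12.93
```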
Cross Product
Terminology and Notation
The cross product also goes by the name vector product. The names outer product[^3] and tensor product are sometimes seen as well, though as we’ll see below they are better reserved for a related but different operation.
Formulas
The cross product of two vectors returns a vector rather than a scalar. Vectors are defined in terms of a basis, which is a coordinate system. Earlier, when we defined \(\vec{u} = (u_x, u_y, u_z)\), it was intrinsically defined in terms of the standard basis set \(\hat{i}, \hat{j}, \hat{k}\) (in some fields this would be called the unit coordinate system). Thus a fuller definition of \(\vec{u}\) would be:
\[ \vec{u} = u_x\hat{i} + u_y\hat{j} + u_z\hat{k} \tag{11}\]
In terms of vectors, the cross product is defined as: \[ \vec{u} \times \vec{v} = (u_{y}v_{z} - u_{z}v_{y})\hat{i} + (u_{z}v_{x} - u_{x}v_{z})\hat{j} + (u_{x}v_{y} - u_{y}v_{x})\hat{k} \tag{12}\]
In my opinion, this is not exactly intuitive, but there is a pattern to it: notice that the terms for \(\hat{i}\) don’t involve the \(x\) components. The details of how this result is computed rely on some properties of the basis set; this Wikipedia article has a nice explanation. We need not dwell on it, however.
There is also a geometric formula for the cross product:
\[ \vec{u} \times \vec{v} = \| \vec{u} \| \| \vec{v} \| \sin \theta \ \hat{n} \tag{13}\]
where \(\hat{n}\) is the unit vector perpendicular to the plane defined by \(\vec{u}\) and \(\vec{v}\). The direction of \(\hat{n}\) is defined by the right-hand rule. Because of this, the cross product is not commutative, i.e. \(\vec{u} \times \vec{v} \ne \vec{v} \times \vec{u}\). The cross product is however anti-commutative: \(\vec{u} \times \vec{v} = - (\vec{v} \times \vec{u})\).
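Base R has no built-in cross product function, so here is a minimal helper implementing Equation 12; the name `cross3` and the example vectors are inventions for illustration:

```r
cross3 <- function(u, v) {
  # Equation 12, component by component
  c(u[2] * v[3] - u[3] * v[2],
    u[3] * v[1] - u[1] * v[3],
    u[1] * v[2] - u[2] * v[1])
}

u <- c(1, 2, 3)
v <- c(4, 5, 6)
cross3(u, v)
#> [1] -3  6 -3
cross3(v, u)   # anti-commutative: the sign flips
#> [1]  3 -6  3
```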
As we did for the dot product, we can look at this from the perspective of column vectors; instead of transposing the first vector as we did for the dot product, we transpose the second one. Be careful, though: this operation is properly the outer product, and it returns a matrix, not the vector produced by the cross product of Equation 12: \[ \vec{u} \otimes \vec{v} = \vec{u} \ \vec{v}^\mathsf{T} = \begin{bmatrix} u_x \\ u_y \\ u_z\end{bmatrix} \begin{bmatrix} v_x & v_y & v_z \end{bmatrix} = \begin{bmatrix} u_{x}v_{x} & u_{x}v_{y} & u_{x}v_{z} \\ u_{y}v_{x} & u_{y}v_{y} & u_{y}v_{z} \\ u_{z}v_{x} & u_{z}v_{y} & u_{z}v_{z}\\ \end{bmatrix} \tag{14}\]
Interestingly, this is the same matrix multiplication machinery as before; each entry of the result is again a dot product, though a trivial, one-term dot product of single components.
The case where we treat \(\vec{u}\) and \(\vec{v}\) as row vectors is left to the reader.[^4]
Finally, there is a matrix definition of the cross product as well. Evaluation of the following determinant gives the cross product:
\[ \vec{u} \times \vec{v} = \begin{vmatrix} \hat{i} & \hat{j} & \hat{k}\\ u_{x} & u_{y} & u_{z} \\ v_{x} & v_{y} & v_{z}\\\end{vmatrix} \]
What Can We Do With the Cross Product?
- In 3D, the result of the cross product is perpendicular or normal to the plane defined by the two input vectors.
- If, however, the two vectors are parallel or anti-parallel, the cross product is the zero vector.
- The length of the cross product, \(\| \vec{u} \times \vec{v} \|\), is the area of the parallelogram defined by the two input vectors (each of these properties is checked in the sketch after this list).
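These properties are easy to check, reusing `cross3()` and the vectors from the sketch above:

```r
w <- cross3(u, v)

sum(w * u)   # zero: w is perpendicular to u
#> [1] 0
sum(w * v)   # zero: w is perpendicular to v
#> [1] 0
sqrt(sum(w * w))   # the area of the parallelogram defined by u and v
#> [1] 7.348469
```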
R Functions
%*%
The workhorse for matrix multiplication in R is the `%*%` function. This function will accept any combination of vectors and matrices as inputs, so it is flexible. It is also smart: given a vector and a matrix, the vector will be treated as a row or column matrix as needed to ensure conformability, if possible. Let’s look at some examples:
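```r
# define two vectors and a matrix to use throughout the examples
p <- 1:5
q <- 6:10
M <- matrix(1:15, nrow = 3)
M
#>      [,1] [,2] [,3] [,4] [,5]
#> [1,]    1    4    7   10   13
#> [2,]    2    5    8   11   14
#> [3,]    3    6    9   12   15

p %*% q   # two plain vectors
#>      [,1]
#> [1,]  130
```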
Notice that R returns a data type of matrix, but it is a \(1 \times 1\) matrix, and thus a scalar value. That means we just computed the dot product, a decision R made internally. We can verify this by noting that `q %*% p` gives the same answer. Thus, R handled these vectors as column vectors and computed \(p^\mathsf{T}q\):
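```r
q %*% p   # the same answer as p %*% q
#>      [,1]
#> [1,]  130

M %*% p   # a matrix times a vector
#>      [,1]
#> [1,]  135
#> [2,]  150
#> [3,]  165
```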
As `M` had dimensions \(3 \times 5\), R treated `p` as a \(5 \times 1\) column vector in order to be conformable. The result is a \(3 \times 1\) matrix; this is ordinary matrix-vector multiplication, each entry being the dot product of a row of `M` with `p`.
If we try to compute `p %*% M` we get an error, because there is nothing R can do to `p` which will make it conformable with `M`:
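```r
p %*% M   # no promotion of p can make this conformable
#> Error in p %*% M : non-conformable arguments
```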
What about multiplying matrices?
As you can see, when dealing with matrices, `%*%` will not change a thing; if your matrices are non-conformable, then it’s an error:
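```r
M %*% M   # 3 x 5 times 3 x 5: non-conformable
#> Error in M %*% M : non-conformable arguments
```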
Of course, if we transpose either instance of `M` we do have conformable matrices, but the two answers are different, and this is neither the dot product nor the outer product, just matrix multiplication:

```r
t(M) %*% M
#>      [,1] [,2] [,3] [,4] [,5]
#> [1,]   14   32   50   68   86
#> [2,]   32   77  122  167  212
#> [3,]   50  122  194  266  338
#> [4,]   68  167  266  365  464
#> [5,]   86  212  338  464  590

M %*% t(M)
#>      [,1] [,2] [,3]
#> [1,]  335  370  405
#> [2,]  370  410  450
#> [3,]  405  450  495
```
What can we take from these examples?
- R will give you the dot product if you give it two vectors. Note that this is a design decision, as it could have returned the outer product (see Equation 14).
- R will promote a vector to a row or column vector if it can, to make it conformable with a matrix you provide. If it cannot, R will give you an error. If it can, the matrix product is returned.
- When it comes to two matrices, R will give an error when they are not conformable.
- One function, `%*%`, does it all: dot product, outer product, or matrix multiplication, but you need to pay attention.
- The documentation says as much, but more tersely: “Multiplies two matrices, if they are conformable. If one argument is a vector, it will be promoted to either a row or column matrix to make the two arguments conformable. If both are vectors of the same length, it will return the inner product (as a matrix).”
Other Functions
There are other R functions that do some of the same work:
- `crossprod`: equivalent to `t(M) %*% M`, but faster.
- `tcrossprod`: equivalent to `M %*% t(M)`, but faster.
- `outer`, also available as the binary operator `%o%`.
The first two functions will accept combinations of vectors and matrices, as does `%*%`. Let’s try it with two vectors:
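```r
crossprod(p, q)   # equivalent to t(p) %*% q
#>      [,1]
#> [1,]  130
```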
Huh. `crossprod` is returning the dot product! So this is the case where “the cross product is not the cross product.” From a clarity perspective, this is not ideal. Let’s try the other function:
```r
tcrossprod(p, q)   # equivalent to p %*% t(q)
#>      [,1] [,2] [,3] [,4] [,5]
#> [1,]    6    7    8    9   10
#> [2,]   12   14   16   18   20
#> [3,]   18   21   24   27   30
#> [4,]   24   28   32   36   40
#> [5,]   30   35   40   45   50
```
There’s the outer product of Equation 14!
What about `outer`? Remember that the outer product is sometimes offered as another name for the cross product. So is `outer` the same as `tcrossprod`? In the case of two vectors, it is:
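```r
outer(p, q)   # the same entries as tcrossprod(p, q)
#>      [,1] [,2] [,3] [,4] [,5]
#> [1,]    6    7    8    9   10
#> [2,]   12   14   16   18   20
#> [3,]   18   21   24   27   30
#> [4,]   24   28   32   36   40
#> [5,]   30   35   40   45   50
```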
What about a vector with a matrix?
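```r
dim(outer(p, M))   # a length-5 vector "outered" with a 3 x 5 matrix
#> [1] 5 3 5
```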
Alright, that clearly is not a cross product. The result is an array with dimensions \(5 \times 3 \times 5\), not a matrix (which would have only two dimensions). `outer` does correspond to `tcrossprod` in the case of two vectors, but anything with higher dimensions gives a different beast. So perhaps using “outer” as a synonym for cross product is not a good idea.
Advice
Given what we’ve seen above, make your life simple and stick to `%*%`, and pay close attention to the dimensions of the arguments, especially if row or column vectors are in use. In my experience, thinking about the units and dimensions of whatever it is you are calculating is very helpful. Later, if speed is really important in your work, you can use one of the faster alternatives.
Footnotes
[^1]: And curiously, the \(L_2\) norm works out to be equal to the square root of the dot product of a vector with itself: \(\| \vec{v} \| = \sqrt{\vec{v} \cdot \vec{v}}\).

[^2]: To be multiplied, matrices must be conformable, namely the number of columns of the first matrix must match the number of rows of the second matrix. The reason is so that the dot product terms will match. In the present case we have \(\mathbf{A}^{2\times3}\mathbf{B}^{3\times3} = \mathbf{C}^{2\times3}\).

[^3]: Be careful, it turns out that “outer” may not be a great synonym for cross product, as explained later.

[^4]: OK fine, here is the answer when treating \(\vec{u}\) and \(\vec{v}\) as row vectors: \(\vec{u} \otimes \vec{v} = \vec{u}^\mathsf{T} \ \vec{v}\), which expands exactly as the right-hand side of Equation 14.
Citation
```bibtex
@online{hanson2022,
  author = {Hanson, Bryan},
  title = {Notes on {Linear} {Algebra} {Part} 1},
  date = {2022-08-14},
  url = {http://chemospec.org/posts/2022-08-14-Linear-Alg-Notes/2022-08-14-Linear-Alg-Notes.html},
  langid = {en}
}
```