Chemometrics and Spectroscopy Using R

EF-NMR Part 3: Receiver Software

Bryan Hanson — Tue, 16 Apr 2024 07:00:00 GMT

Before We Start…

NMRduino is maturing rapidly! If what I’m doing is at all interesting to you, and you don’t know about the NMRduino Project and their recent publication, be sure to check it out. It’s much more sophisticated than what’s going on here, and will be available soon.

Part 1 Part 2

Capturing an “FID”

In the previous post on bitwise operators in C I detailed some of the machinations needed to control the ADC on an Arduino. After considerable work, that knowledge has been put to use to develop a working receiver system (though more work will be needed to perfect it). In the process, I have fine-tuned the code needed to control the instrument and collect the data in a usable form.

The Bnmr software is available on Github. The hardware used for testing and development is shown in Figure 1.

Figure 1: The hardware used for testing. The blue box on the left is the PicoScope, which is generating a sine wave simulating an FID. The Arduino is in the middle. The bread board on the right has an Adafruit micro SD card breakout board mounted at the bottom (at the top is a voltage divider which I use to generate a constant voltage for quicker testing; it’s not hooked up when using the PicoScope). Once data is collected, the micro SD card is removed from the breakout board and put in the dongle to move the data to the laptop.

Once the proper bits were set so the ADC would collect data, I first used a simple voltage divider to generate a constant ADC signal, adjustable via a potentiometer (if one doesn’t provide some kind of input, the ADC output drifts around). With a signal available, there were many rounds of code revision so that a specified number of data points could be collected and stored somewhere.¹

In terms of storage, there were issues. The Arduino has very little actual memory, so the amount of data that can be “stored” is very small. As a result, this data has to be quickly moved somewhere else with significant memory. The solution to the transient data storage is a ring buffer. I was able to implement the code found on Wikipedia in C without too much trouble. The idea behind a ring or circular buffer is that data is stored in a fixed size buffer, and added and removed in a coordinated manner via indices. However, in the big picture data must be removed from the ring buffer at least as fast as it is put in, otherwise data is overwritten. And, it turns out that the Arduino ADC can really pump out data. In order to keep the ring buffer from filling and overwriting (which is treated as an error), I had to collect data from the ADC at a lower rate than it can produce numbers, for instance by keeping only every 10th reading.²
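As a rough illustration of the idea (this is not the Bnmr code), a ring buffer can be sketched in plain C as below. The names ring_buffer, rb_put(), rb_get(), keep_reading() and the capacity RB_SIZE are all hypothetical, chosen only for this example. A full buffer is reported as an error rather than silently overwritten, and keep_reading() shows one way to keep only every n-th ADC reading.

```c
#include <stdbool.h>
#include <stdint.h>

#define RB_SIZE 8u /* hypothetical capacity, chosen only for illustration */

typedef struct {
  uint16_t data[RB_SIZE]; /* 10-bit ADC readings fit in 16 bits */
  uint8_t head;           /* index of the next slot to write */
  uint8_t tail;           /* index of the next slot to read */
  uint8_t count;          /* number of readings not yet removed */
} ring_buffer;

/* Store one reading; returns false (an overflow error) if the buffer is full. */
bool rb_put(ring_buffer *rb, uint16_t value) {
  if (rb->count == RB_SIZE) {
    return false;
  }
  rb->data[rb->head] = value;
  rb->head = (uint8_t)((rb->head + 1u) % RB_SIZE); /* wrap around at the end */
  rb->count++;
  return true;
}

/* Remove the oldest reading; returns false if the buffer is empty. */
bool rb_get(ring_buffer *rb, uint16_t *value) {
  if (rb->count == 0) {
    return false;
  }
  *value = rb->data[rb->tail];
  rb->tail = (uint8_t)((rb->tail + 1u) % RB_SIZE);
  rb->count--;
  return true;
}

/* Keep only every n-th reading (n >= 1), e.g. n = 10 for every 10th reading. */
bool keep_reading(uint32_t reading_index, uint32_t n) {
  return (reading_index % n) == 0;
}
```

On a real microcontroller the writer is often an interrupt routine and the reader is the main loop, in which case the shared fields would also need to be declared volatile and updated atomically; that detail is left out of this desktop-style sketch.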

The second problem was what to do with the data that was emptied out of the ring buffer. I spent a lot of time trying to send it to the serial port, so I could capture it from there. However, Bnmr also sends a lot of messages about various events to the serial port. These messages inform the user about what is happening and also provide troubleshooting guidance. Ultimately, it was not possible to capture the data this way – the messages invariably introduced problems with the formatting of the data. The solution was to add a micro SD card breakout board to store the data on the fly, effectively separating the message stream from the data stream. Before I settled on that approach, I also tried to use R to both send messages and capture the data.³ In addition, I tried using a shell script and a terminal emulator to do the same. Neither was completely successful when messages and data were mixed. However, the shell script experience proved helpful in developing the final, successful approach. Another problem with having both messages and data in the same serial stream was that macOS has a nasty habit of resetting the high baud rates desirable for data collection back to lower rates. This is discussed in various forums and workarounds exist, but I could not get the overall process to be reliable and robust.

Results

With functioning software and a method to control the overall acquisition process in hand, I used the PicoScope to generate a sine wave (Figure 2). Bnmr was compiled and uploaded via a shell script calling the arduino-cli (included in the repo, see Listing 1). Control was then transferred to picocom, which is a terminal emulation program, and the start signal sent to the Arduino. Once the scans completed, the micro SD card was moved from the Arduino to a dongle connected to the laptop, and the data analyzed using R as shown later.

Listing 1: Shell Script to Send/Receive Messages & Data.

#!/bin/bash
arduino-cli compile -b arduino:avr:uno $1
arduino-cli upload -p $2 -b arduino:avr:uno $1
picocom $2 -b $3 -g $4

# Typical Usage:
# Set working directory to .../Bnmr
ls -1 /dev/cu.* # With Arduino plugged in, get the name of the port
./go.sh Bnmr.ino /dev/cu.usbmodem101 9600 outfile.txt
# $1 is the name of the Arduino code file
# $2 is the port name
# $3 is the baud rate for messages
# $4 is the filename for messages received
# ADC data is stored on a micro SD card with a filename given in user_input.h
# ctrl-A ctrl-X to quit the picocom window

Figure 2: The sine wave generated by the PicoScope which was used to represent an NMR FID signal. Peak-to-peak voltage was 500 mV, with a 1 V offset as the Arduino ADC can only process positive voltage readings. Frequency is 50 Hz.
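To get a feel for what this signal looks like to the ADC, the voltages can be converted to expected readings. The sketch below assumes the Uno's 10-bit ADC with the default 5 V reference (an assumption; the reference setting is not stated here) and uses a hypothetical helper, adc_counts(). Under those assumptions a 500 mV peak-to-peak sine wave centered on 1 V swings between 0.75 V and 1.25 V, or roughly 153 to 256 counts out of 1024.

```c
#include <stdint.h>

/* Ideal 10-bit conversion: counts = Vin * 1024 / Vref, clipped to 0..1023.
   Voltages are given in millivolts so only integer arithmetic is needed. */
uint16_t adc_counts(uint32_t millivolts, uint32_t vref_millivolts) {
  uint32_t counts = (millivolts * 1024u) / vref_millivolts;
  return (uint16_t)(counts > 1023u ? 1023u : counts);
}

/* Example calls, assuming a 5000 mV reference:
     adc_counts(750, 5000)   bottom of the sine wave
     adc_counts(1000, 5000)  the 1 V offset
     adc_counts(1250, 5000)  top of the sine wave */
```

A real ADC adds noise and offset error, so measured values will only approximately match these ideal numbers.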

Message Log File

Whatever is typed in the picocom terminal/window is sent to the serial port and then to the Arduino. All messages sent by the Arduino are echoed in the picocom window and saved to a message log file. A typical output is in Listing 2.

Listing 2: Messages Captured by Picocom.

Looking for SD card...
SD card found & working
SD card directory:
SPOTLI~1/
FID_CSV     0
FSEVEN~5/
TRASHE~9/
 
Bnmr listening...
Enter g or s at any time # note: g is typed (but not echoed)
                         # and sent to the Arduino, which starts the program
=====================
Loading experiment...

Starting scans...
    Scan no: 1
    Scan no: 2
    Scan no: 3
    Scan no: 4
    Scan no: 5
Scans complete!
Experiment complete, stop

Data Log File

The data log file is a comma-separated file with an entire FID/scan on one long line. There is a blank line between each data line. This is stored on the micro SD card in a file whose name is provided by the user in user_input.h (this is where all user modifiable parameters are given). We can read in the first two scans and plot the early points as follows (Figure 3).

dat <- readLines("FID_CSV")
res1 <- as.numeric(unlist(strsplit(dat[1], ", ")))
# skipping dat[2] as it is a blank line
res2 <- as.numeric(unlist(strsplit(dat[3], ", ")))

plot(x = 1:length(res1), y = res1, type = "b", xlim = c(1, 25),
xlab = "Index", ylab = "ADC Reading")
lines(x = seq_along(res2), y = res2, col = "red") # index res2 by its own length

Figure 3: Two typical “scans”.

Since there is no coordination (i.e. no common time base) between the generated signal and the ADC data collection, the two sample scans are offset slightly. A common time base is very important for an NMR, so this will be one of the next items for focus.

Footnotes

1. Right now, a fixed number of data points are collected in whatever time it takes. This needs to be modified so that the data points are collected over a fixed amount of time, specified by the user.↩︎
2. There’s a potential problem here, and that is one must collect enough points to satisfy Nyquist’s criterion in order to faithfully represent a sine wave. Preliminary experiments suggest that there are plenty of data points because the ADC is extremely fast.↩︎
3. I spent considerable time writing an R package which I named UtiliDuino for this purpose, but ultimately it was not the best solution.↩︎

Reuse

CC BY 4.0

Citation

BibTeX citation:

@online{hanson2024,
  author = {Hanson, Bryan},
  title = {EF-NMR {Part} 3: {Receiver} {Software}},
  date = {2024-04-16},
  url = {http://chemospec.org/posts/2024-04-17-EF-NMR-Build-3/EF-NMR-Build-3.html},
  langid = {en}
}

For attribution, please cite this work as:

Hanson, Bryan. 2024. “EF-NMR Part 3: Receiver Software.” April 16, 2024. http://chemospec.org/posts/2024-04-17-EF-NMR-Build-3/EF-NMR-Build-3.html.

Bitwise Operators in C

Bryan Hanson — Tue, 30 Jan 2024 07:00:00 GMT

For the EF-NMR project, I’ve turned my attention to writing the software to capture the FID, which seemed easier than writing a pulse transmitter.¹ This requires the use of the ADC (analog to digital converter) on the Arduino. Configuring, starting, and stopping the ADC is handled by directly setting bits in a particular register on the Arduino. New territory! This post will serve as a set of notes on what I’ve learned about how this is done. In particular, I want to focus on the code people actually write, which is generally more complex than what one sees in the language reference.

The Bitwise Operators in `C`

The definitions of the bitwise C operators can be found in numerous places, stated with various levels of clarity and understandability. Sometimes the definitions are very terse and seemingly quite clear, but after reading, one simply doesn’t know how to use them. The revered text known as “K & R” doesn’t even devote much space to them, though that may be because microcontrollers were a relatively new thing at the time of Kernighan and Ritchie (1988). The Arduino reference documents give quite a bit more detail but don’t have the complexity seen in the wild.

The following gives my own interpretation and understanding of the individual operators. To be clear, these definitions don’t really give a sense of why they might be useful or how one would use them.

OR (operator: |) compares two bits and sets the destination bit to 1 unless both inputs are 0. It sets a bit.
AND (operator: &) compares two bits and sets the destination bit to 0 unless both inputs are 1. It clears a bit.
XOR (operator: ^) compares two bits and if they are the same, returns 0, if different, returns 1. It toggles a bit.
NOT (operator: ~) is a unary operator which flips all 1’s to 0’s and vice versa.
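Left Shift (operator: <<) shifts a series of bits left and fills the vacated positions with 0’s. Equivalent to multiplying by 2^n, as long as no 1 bits are shifted off the end.
Right Shift (operator: >>) shifts a series of bits right; for unsigned values the vacated positions are filled with 0’s. Equivalent to dividing by 2^n and discarding any remainder.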

A key thing to note is that these operators compare two bits (which are either 0 or 1) and return an updated bit. The exceptions are:

The left and right shift operators: These operate on a series of bits. A single bit has nowhere to go; bits shifted past either end of the value are simply discarded.
NOT, the toggle: This flips or toggles a single bit, nothing is compared.

The reality is that one rarely sees these operators used on a single bit, even NOT. More often, one sees them applied to a byte, a set of 8 bits residing contiguously in memory. Those bytes, at least in the current use, turn out to be registers on the Arduino, our next topic.
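To make the definitions above concrete, here is a small sketch in plain C that applies each operator to a whole byte. The function names are hypothetical wrappers written for this example, and the values in the comments are simply worked cases. The casts are there because C promotes a uint8_t to int before applying these operators, so the result has to be narrowed back to 8 bits.

```c
#include <stdint.h>

/* OR: a result bit is 1 if either input bit is 1 (sets bits).
   0000 1100 | 0000 1010 = 0000 1110 */
uint8_t set_bits(uint8_t value, uint8_t mask) { return (uint8_t)(value | mask); }

/* AND: a result bit is 1 only if both input bits are 1 (clears the rest).
   0000 1100 & 0000 1010 = 0000 1000 */
uint8_t keep_bits(uint8_t value, uint8_t mask) { return (uint8_t)(value & mask); }

/* XOR: a result bit is 1 if the input bits differ (toggles bits).
   0000 1100 ^ 0000 1010 = 0000 0110 */
uint8_t toggle_bits(uint8_t value, uint8_t mask) { return (uint8_t)(value ^ mask); }

/* NOT: every bit is flipped.
   ~0000 1100 = 1111 0011 */
uint8_t flip_all(uint8_t value) { return (uint8_t)~value; }

/* Left shift: bits pushed past bit 7 are lost when narrowed back to a byte.
   1000 0001 << 1 = 0000 0010 */
uint8_t shift_left(uint8_t value, uint8_t n) { return (uint8_t)(value << n); }

/* Right shift of an unsigned value: zeros come in from the left.
   1000 0001 >> 1 = 0100 0000 */
uint8_t shift_right(uint8_t value, uint8_t n) { return (uint8_t)(value >> n); }
```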

ADC Registers

The ATmega328P microcontroller used in the Arduino Uno has several registers that control the ADC:²

ADMUX = ADC multiplexer selection register
ADCSRA = ADC control and status register A
ADCSRB = ADC control and status register B
ADCL and ADCH = ADC data registers

We’ll use ADCSRA as our example. ADCSRA is of course an acronym. If you look at the iom328p.h file where these things are defined, you find that ADCSRA is an alias for a specific memory address (Figure 1).³ It is the address of a single byte, composed of 8 bits, numbered 0-7. In the datasheet we can see what is stored in this register. Each of the individual bits has a name, for instance ADEN, which stands for “ADc ENable”, and in the header file, the name ADEN is aliased to bit 7 (Figure 2). So we have an 8 bit register at a named memory address, and each bit has its own name to make remembering their roles easier. These are the bits we need to control with the bitwise operators in order to configure the ADC.

Figure 1: A portion of the iom328p.h header file showing how acronyms are aliased to memory locations.

Figure 2: Documentation of the ADCSRA register from the datasheet.
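The aliasing pattern can be imitated on a desktop machine. In the sketch below, fake_memory stands in for the microcontroller's data memory, and FAKE_ADCSRA, FAKE_ADEN and fake_adc_enable() are hypothetical names made up for this example. The address 0x7A and bit number 7 follow the ATmega328P header and datasheet; everything else is an assumption for illustration, not the real header contents.

```c
#include <stdint.h>

/* Stand-in for the microcontroller's data memory, so this runs on a desktop. */
uint8_t fake_memory[256];

/* A register name is a dereferenced pointer to a fixed address,
   and a bit name is nothing more than a small integer. */
#define FAKE_ADCSRA (*(volatile uint8_t *)&fake_memory[0x7A])
#define FAKE_ADEN 7

/* Set the enable bit, leaving the other seven bits of the byte alone. */
void fake_adc_enable(void) {
  FAKE_ADCSRA |= (uint8_t)(1u << FAKE_ADEN);
}
```

Because the register name expands to an ordinary lvalue, it can appear on either side of an assignment, which is why lines such as ADCSRA = 0; work at all.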

Wild-Type Operator Constructs in Use

As I hinted at earlier, what people actually write is rather different from the simple definitions seen in the reference documents (or my version above). So let’s explore these wild-type examples in detail.

Simple Direct Assignment

One simple example often seen doesn’t even use the bitwise operators.

ADCSRA = 0;

In this case, the right-hand-side (RHS) 0 is interpreted as an 8 bit binary number, 0000 0000 and this sets all 8 bits to zero at once. This incantation is probably most appropriate to reset the entire register, as all zeros is the default setting for this particular register (though not necessarily other registers).

Direct Assignment via Binary Literals

If you know the value for every bit you want to set, and want to set them all at once, you can use a binary literal:

ADCSRA = B00101010; // prefix binary number with B, or
ADCSRA = 0b00101010; // prefix binary number with 0b

The downside here is that future readers of your code have to look up the details of a register’s bit settings every time they look at your code. Other methods discussed here use aliases for particular bits (e.g. ADEN) which provide at least some mnemonic assistance. Note that the B form is a set of constants defined by the Arduino core, while the 0b form is a GCC extension that has only recently been standardized (in C++14 and C23); you are likely to be using a toolchain that accepts both.

Typical Bitwise Operator Use in the Wild

Example 1

ADCSRA |= (1 << ADEN);

In this incantation there are several interesting things going on. Let’s unpack it starting from the RHS. We see this expression: (1 << ADEN), which uses the left shift operator. This means take 1 in binary, so 0000 0001, and shift the 1 left ADEN times. If we look at either Figure 2 or Figure 1, we see that ADEN is 7, so we shift the first bit left 7 places, which gives 1000 0000 in binary. This is a “bit mask”; it’s used in the next step.

The operator |= is a variation on the OR operator. It means take whatever is on the RHS, and OR it against the left-hand-side (LHS), and put the result in the LHS.⁴ What is the current value of ADCSRA in the LHS? We don’t know in this simple example; presumably you would know in a real life example. Whatever it is, when we OR it with the RHS, bit 7, ADEN, gets set to 1, because of how OR is defined. So bit 7 is set to 1, and all other positions are unchanged.

xxxx xxxx // whatever is in ADCSRA
1000 0000 // bitmask from RHS
1xxx xxxx // result of OR (used to overwrite existing ADCSRA)

Example 2

A more involved example using direct assignment as well as bitwise operators is:

ADCSRA = (1 << ADPS2) | (1 << ADPS1) | (1 << ADPS0);

which can be unpacked as three bitmasks, OR’d against each other to get a final result to be put directly into ADCSRA. Using the values of ADPS*, we have:

0000 0001 // 1 << ADPS0 (note ADPS0 = 0 so this is no shift at all)
0000 0010 // 1 << ADPS1
0000 0100 // 1 << ADPS2
0000 0111 // result put directly into ADCSRA overwriting what is there originally

Note that the result overwrites the current value of ADCSRA; the four most significant bits are set to zero, regardless of whatever value was there. The next example shows you how to avoid that.

Example 3

Almost the same action can be accomplished with the following code, except it preserves the current settings in ADCSRA and uses a helper function, bit(), which is specific to Arduino:

ADCSRA |= bit(ADPS0) | bit(ADPS1) | bit(ADPS2);

bit() is an Arduino helper macro, defined as (1UL << (b)), that takes an integer argument and returns a value with 1 in the position given by the argument, and zeros elsewhere.⁵ For the bit positions 0-7 used here the result fits in a single byte. Thus it unpacks to:

0000 0001 // bit(ADPS0)
0000 0010 // bit(ADPS1)
0000 0100 // bit(ADPS2)
// the above 3 lines create the same bitmasks as in Example 2; together they become:
0000 0111 // result of OR the above 3 bitmasks
xxxx xxxx // whatever is in ADCSRA
xxxx x111 // result of OR ADCSRA against 0000 0111

In the previous two examples 1 << ADPS0 or bit(ADPS0) does very little since ADPS0 is 0. However, many coders seem to prefer a little verbosity to make clear what they are trying to achieve.⁶
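The difference between Example 2 and Example 3 is easy to check on a desktop with an ordinary variable standing in for the register. In this sketch PS0, PS1, PS2, PRESCALER_MASK and the two functions are hypothetical names; the bit positions 0, 1 and 2 are the ADPS values used above.

```c
#include <stdint.h>

/* Bit positions of the three prescaler bits, as used in Examples 2 and 3. */
enum { PS0 = 0, PS1 = 1, PS2 = 2 };

/* 0000 0111: the three bitmasks OR'd together. */
#define PRESCALER_MASK ((uint8_t)((1u << PS2) | (1u << PS1) | (1u << PS0)))

/* Example 2 style: plain assignment, the previous contents are discarded. */
void overwrite_with_mask(uint8_t *reg) {
  *reg = PRESCALER_MASK;
}

/* Example 3 style: OR-assignment, bits outside the mask are preserved. */
void or_in_mask(uint8_t *reg) {
  *reg |= PRESCALER_MASK;
}
```

Starting from 1000 1000, the first function leaves 0000 0111 (so bit 7, the enable bit, is lost), while the second leaves 1000 1111.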

Example 4

Let’s say you wanted to turn the ADC on if it was off, and off if it was on. This is a job for the ^ or toggle operator. You can use ADCSRA ^= (1 << ADEN) which unpacks as follows (ADEN is 7):

1xxx xxxx // initial (on state) of the ADC; other bits unknown
1000 0000 // result of (1 << ADEN) 
0xxx xxxx // result of toggling lines 1 and 2; put into ADCSRA; ADC is off
// or, starting with ADC off
0xxx xxxx // ADC is off
1000 0000 // result of (1 << ADEN)
1xxx xxxx // result put into ADCSRA; ADC is now on

Note that the x bits are toggled against 0, which means they are unchanged. See the truth table here.

Functions that are collections of bitwise operators

The macro _BV(bit) expands to (1 << (bit)) and for Arduino you can use bitSet(x, n) or sbi(x, n) to write a 1 to the n-th position of register x. Thus,

ADCSRA |= (1 << ADEN); // seen earlier
ADCSRA |= _BV(ADEN);
bitSet(ADCSRA, ADEN);
sbi(ADCSRA, ADEN);

are equivalent ways to change bit 7 in ADCSRA.

For Arduino, you also have bitClear(x, n) which writes a 0 at the n-th position of register x, essentially the complement of bitSet(x, n). Internally, it is defined as ((x) &= ~(1 << (n))). Alternatively, one can use cbi(x, n), the complement of sbi(x, n). Let’s say you had 0000 0110 in ADCSRA and wanted to clear the 2nd bit (position 1, counting from 0 on the right).

bitClear(ADCSRA, 1); // expands to the following steps:
0000 0110 // initial value in ADCSRA
0000 0010 // value of bit mask (1 << 1)
1111 1101 // value of ~(1 << 1) where all bits have been toggled/flipped
0000 0100 // value after & of line 1 with line 3; a position is 1 only if both inputs are 1

Notice that the 2nd bit has been cleared. The = part of &= assigns the result to the LHS, namely ADCSRA.
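For experimenting away from the Arduino, the same ideas can be written as a handful of macros that work on any byte variable. The SKETCH_ names below are hypothetical and deliberately different from the Arduino ones so they cannot clash; they are written in the same spirit as the definitions described above, not copied from the Arduino source.

```c
#include <stdint.h>

#define SKETCH_BIT_SET(x, n)    ((x) |= (uint8_t)(1u << (n)))  /* write a 1 */
#define SKETCH_BIT_CLEAR(x, n)  ((x) &= (uint8_t)~(1u << (n))) /* write a 0 */
#define SKETCH_BIT_TOGGLE(x, n) ((x) ^= (uint8_t)(1u << (n)))  /* flip it */
#define SKETCH_BIT_READ(x, n)   (((x) >> (n)) & 1u)            /* 0 or 1 */

/* The worked example from the text: clear bit 1 of 0000 0110. */
uint8_t clear_bit_one_demo(void) {
  uint8_t reg = 0x06;       /* 0000 0110 */
  SKETCH_BIT_CLEAR(reg, 1); /* ANDs with 1111 1101 */
  return reg;               /* 0000 0100 */
}
```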

Note that sbi() and cbi() only work for certain registers on Arduino.

This StackOverflow Question has examples of more functions and an interesting discussion of pros, cons and caveats.

Sanity-Preserving Helper Function

I modified the function found here to print register contents (well, bytes generally) in an easy-to-read format.

void print_bin(byte aByte) {
  for (int8_t aBit = 7; aBit >= 0; aBit--) {
    if (aBit == 3) {
      Serial.print(" "); // space between nibbles
    }
    Serial.print(bitRead(aByte, aBit) ? '1' : '0');
  }
  Serial.println(" ");
}

Let’s use it to check a set of operations which blend Example 2 and Example 3 above, and stick to pure C operations. This code chunk

  ADCSRA = B10001000; // arbitrary initial value
  print_bin(ADCSRA);
  ADCSRA |= (1 << ADPS2) | (1 << ADPS1) | (1 << ADPS0);
  print_bin(ADCSRA);

displays the following:

1000 1000 
1000 1111

Use it to check your work!
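If you want to check bit manipulations without an Arduino attached, a desktop-friendly variant can write the same nibble-separated format into a string instead of the serial port. This is a sketch under that assumption; byte_to_bin() is a hypothetical name, and the caller is assumed to supply a buffer of at least 10 characters.

```c
#include <stdint.h>

/* Write aByte as "bbbb bbbb" into out.
   out must hold 10 chars: 8 digits, 1 space, and the terminating NUL. */
void byte_to_bin(uint8_t aByte, char out[10]) {
  int pos = 0;
  for (int aBit = 7; aBit >= 0; aBit--) {
    if (aBit == 3) {
      out[pos++] = ' '; /* space between nibbles */
    }
    out[pos++] = ((aByte >> aBit) & 1u) ? '1' : '0';
  }
  out[pos] = '\0';
}
```

The resulting string can be printed with printf("%s\n", out) or compared against an expected pattern in a test.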

References

Kernighan, Brian W., and Dennis M. Ritchie. 1988. The C Programming Language (2nd Edition). Many reprint publishers exist.

Footnotes

1. Let me state for the record that this is just a first version; additional complexity will almost certainly be needed later.↩︎
2. The details on each of these can be found on the datasheet which can be found via a search engine.↩︎
3. The header file is available many places on the internet.↩︎
4. All the operators can be used the same way: |=, ^=, &=, <<= and >>=. For example C &= 2 should be thought of as C = C & 2. See this SO answer.↩︎
5. It’s essential to be careful with language to be clear. A byte is 8 bits, numbered from the right position as 0, i.e. 76543210. So the first bit is at position 0, etc. Thus bit(0) returns 0000 0001.↩︎
6. These three bits are used as a group to set the clock speed of the ADC, so it makes sense to make it clear you are using all three values together.↩︎

Reuse

CC BY 4.0

Citation

BibTeX citation:

@online{hanson2024,
  author = {Hanson, Bryan},
  title = {Bitwise {Operators} in {C}},
  date = {2024-01-30},
  url = {http://chemospec.org/posts/2024-01-30-Bitwise-Operators/Bitwise-Operators.html},
  langid = {en}
}

For attribution, please cite this work as:

Hanson, Bryan. 2024. “Bitwise Operators in C.” January 30, 2024. http://chemospec.org/posts/2024-01-30-Bitwise-Operators/Bitwise-Operators.html.

Building an EF-NMR Part 2

Bryan Hanson — Mon, 01 Jan 2024 07:00:00 GMT

Part 1

With the polarization coil completed, I decided to take a stab at the software to control the instrument. I felt like I needed to get a feel for how to work with the Arduino so I would understand what kinds of signals I could send to the electronics. In turn that would (ideally) make it easier to understand how the circuits work.

I started by studying Michal’s software (available here). Michal’s software is designed for use with students in a lab course and includes a Python GUI, the actual Arduino control software, and several utilities. One of the utilities is a separate pulse programming module that produces a file accessed by the GUI. At least that appears to be the big picture. Inspection of the Arduino software made it clear that I had, and have, a lot to learn. Arduino code is written in C++, which encompasses the earlier language C, which some have described as “dressed up assembly language”. Oh boy…

After studying some basic Arduino tutorials, I decided the best way to learn was to write my own software, starting with a simple case of an NMR-like interface that would turn Arduino pins on and off to control the various pieces of hardware I will eventually build. Turning pins on and off is really simple on the Arduino, that’s not the challenge. For this instrument, the challenge is that there are several events that occur one after the other on very short time scales. Roughly, one must turn the polarization coil on, then off, then turn on the transmitter and turn it off, and then turn on the receiver and listen. Due to the realities of electronics, there need to be short delays between some of these events so that the electronic signals can “warm up”, or “cool down”. To make this initial version manageable, I decided to not worry about the time scale in detail for now, and focus on building an extensible framework that takes NMR-like inputs to turn things on and off.

Prototype in R

Since R is my computational lingua franca, I decided to think about how I would set up a series of events in R and calculate their on/off times given the duration (or length) of each event. This was quite straightforward; if you know the duration of each event then the on/off times can be computed with a cumulative sum process.

#' 
#' Convert a Named Vector Giving Event Durations to a Data Frame
#' 
#' @param event_lengths Numeric.  A named numeric vector giving the durations (lengths)
#'        of a series of events which occur in the given order.
#' @return A data frame containing the on and off times for each event.
#'
event_length_to_event_on_off <- function(event_lengths) {
  off <- cumsum(event_lengths)
  on <- c(0, head(off, -1)) # each event starts when the previous one ends
  DF <- data.frame(event = names(event_lengths), on = on, off = off)
  DF
}

And then I needed a function to visualize the result, which is basically a sort of Gantt chart where the events never overlap.

library(ggplot2) # provides ggplot(), aes(), geom_segment() used below

#'
#' Create a Gantt Chart of NMR Event Timing
#'
#' @param my_events Data frame.
#' @return `ggplot2` object.
#'
events <- function(my_events = NULL) {
  p <- ggplot(my_events, aes(x = on, xend = off, y = event, yend = event))
  p <- p + geom_segment(linewidth = 8) + theme_bw()
  p <- p + labs(title = "NMR Event Timing", x = "time, microseconds", y = "")
  p <- p + scale_y_discrete(limits = my_events$event)
  p
}

Figure 1 shows these functions in action. So far, so good.

f <- 1e6 # conversion factor, seconds to microseconds
ev <- c(10 * f, 5, 1 * f, 5, 5 * f, 10 * f )
names(ev) <- c("pol_coil", "del_pt", "transmitter", "del_tr", "receiver", "relax_delay")
p1 <- events(event_length_to_event_on_off(ev))
p1

Figure 1: Event timings. The short delays are too short to be visible.

Implementation in C

Next, I decided to write something more or less equivalent in C. This meant learning C. Suffice it to say, C provides none of the niceties of R. There are few atomic types in C, and in particular strings and arrays are not native entities. Instead, one must think in terms of pointers to particular memory addresses that hold the strings or arrays. So the entire paradigm is different, and requires thinking about solving problems in new ways. Overall, this has been a good experience. After a lot of struggle, I managed to write functions that carry out the equivalent of the R functions above, except instead of graphical output there is tabular output (there really is no graphical output in the usual sense for Arduino so we need to have other ways of verifying our results). I won’t give details of this work here, as the next section reviews how it was implemented for Arduino.
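For readers who want something concrete, the cumulative-sum idea from the R prototype might look like the following in C. This is a minimal sketch and not the code from the project repo; the nmr_event struct and compute_on_off() are hypothetical names, and times are assumed to be whole microseconds that fit in 32 bits.

```c
#include <stddef.h>
#include <stdint.h>

typedef struct {
  const char *name;  /* event label, e.g. "pol_coil" */
  uint32_t duration; /* event length, microseconds */
  uint32_t on;       /* start time, filled in by compute_on_off() */
  uint32_t off;      /* end time, filled in by compute_on_off() */
} nmr_event;

/* Each event starts when the previous one ends: a running (cumulative) sum. */
void compute_on_off(nmr_event *events, size_t n) {
  uint32_t t = 0;
  for (size_t i = 0; i < n; i++) {
    events[i].on = t;
    t += events[i].duration;
    events[i].off = t;
  }
}
```

Note how the array of structs plus an explicit length replaces the named vector in R; C has no built-in way to ask an array how long it is once it has been passed to a function.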

Implementation for Arduino

The version of event timing in C was adapted to the Arduino with relatively minor modifications, mostly related to how results are printed to the console (the C and C++ languages for Arduino are specialized versions of the languages). I also wrote a system to control the starting and stopping of the scans, thinking ahead of how the program is actually going to be used. All user inputs are in a single file, including a simple version of a pulse program (tons of work will be needed in the future on this piece). My overall goal is to write an entire NMR control and acquisition program that runs completely on the Arduino IDE. Well, almost completely: some other entity will have to slurp up the data coming from the Arduino, as there is very little memory on the Arduino. Not sure if this can be done but that’s the goal. The code for this project is stored in a public repo here.

The output of a “run” on this “instrument” is shown in Figure 2. The table lists the event name, the Arduino pin that should be activated, and the on/off times for the events. Times are in milliseconds in the example, and are relevant for testing, not an actual NMR scan. A pin value of -1 indicates no pin is active; such an event is just a delay period so the (not yet built) electronics can settle.
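One way to picture how such a table could be used is a lookup that answers "which pin should be on at time t?". The sketch below is only an illustration: timed_event, NO_PIN and active_pin() are hypothetical names, and any pin numbers or times fed to it are made up rather than taken from Figure 2.

```c
#include <stddef.h>
#include <stdint.h>

#define NO_PIN (-1) /* delay-only event: no pin is driven */

typedef struct {
  int8_t pin;   /* Arduino pin to drive, or NO_PIN */
  uint32_t on;  /* start time, milliseconds */
  uint32_t off; /* end time, milliseconds */
} timed_event;

/* Return the pin that should be high at time t, or NO_PIN if none.
   An event covers the half-open interval [on, off). */
int8_t active_pin(const timed_event *events, size_t n, uint32_t t) {
  for (size_t i = 0; i < n; i++) {
    if (t >= events[i].on && t < events[i].off) {
      return events[i].pin;
    }
  }
  return NO_PIN;
}
```

Treating each event as a half-open interval means the instant an event ends belongs to the next event, so two pins are never reported as active at the same time.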

Figure 2: Output of the NMR acquisition program.

This program was further tested by wiring the Arduino to a breadboard with a few LEDs and resistors to limit the current to the LEDs appropriately. The video below shows the program in action, doing two scans with the durations as shown in Figure 2. The pins from left represent polarization coil power, transmit, and receive signal (the latter of course should be listening, not powering something). As a proof of concept I’m pretty happy with this result.

What’s Next?

So much to do, but I’m not in a hurry and can choose to do things in any order that inspires me:

Polarization coil power supply circuit (some work done on this, just needs to be built)
Transmitter circuit
Receiver circuit
Details of T/R on the Arduino; this will require another round of intense learning I’m certain!

Reuse

CC BY 4.0

Citation

BibTeX citation:

@online{hanson2024,
  author = {Hanson, Bryan},
  title = {Building an {EF-NMR} {Part} 2},
  date = {2024-01-01},
  url = {http://chemospec.org/posts/2024-01-01-EF-NMR-Build-2/EF-NMR-Build-2.html},
  langid = {en}
}

For attribution, please cite this work as:

Hanson, Bryan. 2024. “Building an EF-NMR Part 2.” January 1, 2024. http://chemospec.org/posts/2024-01-01-EF-NMR-Build-2/EF-NMR-Build-2.html.

Building an EF-NMR Part 1

Bryan Hanson — Tue, 24 Oct 2023 07:00:00 GMT

Readers may have noticed an Earth Field NMR theme in several recent posts (here, here and here). Behind the scenes, my interest in this topic was growing, fertilized in large part by my desire to learn more about electronics. I may have lost my mind, but I have now embarked on a project to build an EF-NMR!

I was inspired by the really simple EF-NMR instrument developed by Andy Nichol (“Nuclear Magnetic Resonance for Everybody”). Nichol’s work made it clear that one could observe an NMR signal without complex equipment. As I did more reading however, I settled on following the build of Carl Michal (Michal (2010)) as it will allow for more complex experiments, and provides more opportunity to learn electronic circuits.

Michal’s design uses two coils: a polarization coil, and a transmit/receive (T/R) coil. This post will cover the construction of the polarization coil. Michal’s polarization coil is a three-layer solenoid constructed with 18 AWG magnet wire. Each layer is a separate wire but in operation, the three layers are wired in parallel. I scaled the coil dimensions down somewhat so that I could use materials that are readily accessible to me.¹ The plan is to use a 50 mL centrifuge tube as the sample holder. The sample will be placed in a T/R coil wound around a 1.25” schedule 40 PVC pipe. The T/R coil will be located inside the polarization coil, which will be wound on a 2” schedule 40 PVC pipe. The dimensions of these pipes were chosen to allow the sample to nest easily inside the T/R coil which nests inside the polarization coil. Figure 1 shows a cross-section of the design.²

Figure 1: Cross section of the coils and sample. Grey indicates the PVC pipe components. Red indicates windings (dimensions approximate). Blue represents the sample. Dotted lines show the outer extent of the retainer rings. Scale is in mm.

Constructing the Form

The form for the polarization coil was made from a 12 cm length of 2” PVC pipe. Two retaining rings were very carefully cut from a 2” PVC coupling. The retaining rings were 1 cm wide. The parts are shown in Figure 2. The rings were then glued to the ends of the form using a minimal amount of standard PVC glue. The inner edges of the rings correspond to the original end of the coupling which provides a clean and straight edge where it will rest against the magnet wire. The ends of the assembly were lightly sanded. As built, the length available for the windings is 102 mm.

Figure 2: The form and two retaining rings before assembly.

Next, three holes were drilled close to each of the retaining rings, about 1 cm apart. The magnet wire will pass through these holes, which will serve to keep the wire in place as it is wound. Figure 3 shows these holes. A short length of wire was placed in the holes as a “keeper” as the winding was carried out. This ensured that the winding for the first layer did not block the holes for the second and third layers of wire (Figure 4).

Figure 3: View of the holes on each end of the form assembly.

Figure 4: Detail of the wire keepers. The open hole will be used for the first winding layer. The keepers will maintain access to the holes for the subsequent layers.

The Winding Jig

A winding jig was constructed from 1/4” hobby plywood. The base is 6 x 12”. Small nails and glue were used to assemble the sides and back. A 1/4” threaded rod serves as the rotational axis. Nuts and washers secure a simple handle as well as position the rod overall in the jig. Figure 5 and Figure 6 show the jig.

Figure 5: The winding jig.

Figure 6: Another view of the winding jig.

Wire Spool Holder

A holder for the wire spool was constructed with 1/16” x 1” aluminum bar. The bar was bent into a shape that would provide a way to apply friction to the sides of the spool, thus controlling the tension on the wire as it pays out. The spool is mounted on a 1/4” threaded rod and there are wingnuts on each side, which when tightened press the aluminum bar against the spool. The threaded rod does tend to unscrew as the wire is spooled out, but the process is slow enough that one can correct this as needed. If I were going to do this a lot I would replace the wingnut on the side that tends to unwind with two nuts locked against each other. The holder is loosely attached to the work bench so that it can pivot as needed to accommodate the changing angle of the wire as it moves across the form. Figure 7 shows the design.

Figure 7: The wire spool holder. Tightening the wingnuts pushes the aluminum bar against the spool and gives some control over the wire tension as it pays out.

The Winding Process

The form was more or less centered on the threaded rod using a couple of wooden guide pieces. The winding process is shown in Figure 8. The wire for the first layer comes from inside the form and up through one of the holes and is wound on the form. The action of the keepers is apparent. The fingers are used to position the wire correctly. In principle, tension on the wire is provided by tightening the wingnuts on the wire spool holder. However, I did not tighten them enough and I had to wrestle with getting layer one tight enough. This caused problems with the subsequent layers as you will see!

Figure 8: Winding the first layer.

The completed layer one is shown in Figure 9. The winding looks even. Layer two is shown in Figure 10. Because layer one was a little loose, the wire for layer two would sometimes slip in-between the wires of layer one and force them apart. This was exacerbated because I was using more tension on the wire supply for layer two. Clearly the layer is not even. In addition, winding layer two was more difficult because without the white background one cannot see the progress very well.

Figure 9: Completed layer one. It looks nice and even but the winding is loose.

Figure 10: Completed layer two. Technique short-comings are evident!

The problems only worsened with layer 3 (Figure 11). I am not happy with the final result, but the wire is positionally stable and it should carry out its function well enough. What I’ve learned here will help when winding the T/R coil.

Figure 11: Completed layer three. Not nearly as pretty as I had hoped!

Checking Continuity

The polymeric insulation on the leads was sanded off (Figure 12) and the resistance of each coil was measured. Each gave a resistance of about 0.7 Ω and there were no shorts between the layers.

Figure 12: Leads for each layer with insulation sanded off.

What’s Next

The next step will be the construction of the polarization coil power supply, and integration of the Arduino controller. I’m not in a hurry!

References

Michal, Carl T. 2010. “A Low-Cost Spectrometer for NMR Measurements in the Earth’s Magnetic Field.” Meas. Sci. Technol. 21. https://doi.org/10.1088/0957-0233/21/10/105902.

Footnotes

Of course, there will be less sample and therefore a smaller signal.↩︎
The dimensions of schedule 40 PVC products are readily available online, which made planning the overall design much simpler.↩︎

Reuse

CC BY 4.0

Citation

BibTeX citation:

@online{hanson2023,
  author = {Hanson, Bryan},
  title = {Building an {EF-NMR} {Part} 1},
  date = {2023-10-24},
  url = {http://chemospec.org/posts/2023-10-24-EF-NMR-Build-1/EF-NMR-Build-1.html},
  langid = {en}
}

For attribution, please cite this work as:

Hanson, Bryan. 2023. “Building an EF-NMR Part 1.” October 24, 2023. http://chemospec.org/posts/2023-10-24-EF-NMR-Build-1/EF-NMR-Build-1.html.

The n + 1 rule in Earth’s Field NMR

Bryan Hanson — Mon, 18 Sep 2023 07:00:00 GMT

I have been studying Earth’s Field NMR for a bit now. The other day I came across a paper that clued me into some additional interesting features of EF-NMR I was not aware of.

As all organic chemists know, in NMR we use the $n + 1$ rule to determine splitting, and Pascal’s triangle as a mnemonic to remember the relative areas of the peaks within a multiplet. For instance, we expect the $\mathrm{CH_3}$ group in ethanol to be a triplet with areas 1:2:1, because it has two proton neighbors in the $\mathrm{CH_2}$ group. We treat the two protons in $\mathrm{CH_2}$ as magnetically equivalent.

J-Coupled Spectra

The $n + 1$ rule works at typical fields used for structural determination, let’s say 60 MHz and above.¹ At these fields one is working in the so-called “weak coupling” region. However, as one lowers the field to really low values, one encounters the “strong coupling” region, where one observes “J-coupled-spectra” or JCS. Under strong coupling, the protons in ethanol are no longer magnetically equivalent, each of them couples differently to other nuclei, and the $n + 1$ rule breaks down.

The strict requirement for JCS is that there be two or more protons attached to a spin heteroatom and the magnetic field be quite small. For a simple system, let’s say , the strict requirement to see separate lines for the no-longer-equivalent protons is:

If this seems a bit strange, well, 1) it is, and 2) it has always been the case that the “equivalent” protons in for example a group do couple, we just don’t normally see it or worry about it.²

How Small is Small?

How small does the magnetic field have to be for J-coupled spectra to appear? This is covered in detail in Appelt et al. (2010), which gives the range of field strengths over which JCS appear.³ The magnetic field of earth is around 50 µT, right in the sweet spot. The Larmor resonance frequency for $^1$H at this field strength is around 2 kHz.

What Replaces the Rule?

In the case of a system like $\mathrm{XH}_N$, the number of lines that will be observed is

$$2 \sum_{n \in O} (N - n + 1)$$

where $O$ is the set of odd numbers (for odd $N$, one evaluates until $n = N$; for even $N$, evaluate until $n = N - 1$). The leading multiplier of 2 accounts for the doublet due to $J_{HX}$. This formula doesn’t exactly roll off the tongue. We can evaluate it to get the first few terms:

N <- 5L # evaluate 1:N terms
no.lines <- rep(NA_integer_, N) # initialize storage
for (i in 1:N) {
  odd <- (1:i) %% 2
  n <- (1:i)[as.logical(odd)] # get odd n no larger than N
  no.lines[i] <- sum(i - n + 1) # take advantage of R's vectorization
}
names(no.lines) <- paste("N=", 1:N, sep = "") # pretty it up
no.lines <- no.lines * 2 # account for J_HX
no.lines

N=1 N=2 N=3 N=4 N=5 
  2   4   8  12  18

Examples

A couple of examples should clarify the situation. All of these will be from the perspective of observing $^1$H.⁴ These examples are taken from Appelt et al. (2007).

At high field the spectrum of would be a symmetric doublet with a peak separation of .

In earth’s field, the spectrum is first split into a doublet by , but the spacing is not symmetric. Then, each part of the doublet is split further into two peaks, also asymmetrically and with varying linewidths. Figure 1 shows how the splitting changes as a function of field strength. Note that in the strong coupling region there are four peaks, as predicted above.

Figure 1: Figure 4 from Appelt et al. (2007), showing the field dependence of the spectrum.

For the case of methanol in Earth’s field, the spectrum is first asymmetrically split by the with spacing . Then each part of the doublet is further split into four peaks. Figure 2 shows the EF spectrum of methanol. The asymmetry of the line spacing and line widths is apparent.

Figure 2: Figure 5 from Appelt et al. (2007), showing the spectrum of methanol.

References

Appelt, Stephan, F. Wolfgang Häsing, Holger Kühn, and Bernhard Blümich. 2007. “Phenomena in J-Coupled Nuclear Magnetic Resonance Spectroscopy in Low Magnetic Fields.” Phys. Rev. A 76 (August): 023420. https://doi.org/10.1103/PhysRevA.76.023420.

Appelt, Stephan, F. W. Häsing, U. Sieling, A. Gordji-Nejad, S. Glöggler, and B. Blümich. 2010. “Paths from Weak to Strong Coupling in NMR.” Phys. Rev. A 81 (February): 023420. https://doi.org/10.1103/PhysRevA.81.023420.

Kaseman, Derrick C., Per E. Magnelind, Scarlett Widgeon Paisner, Jacob L. Yoder, Marc Alvarez, Algis V. Urbaitis, Michael T. Janicke, Pulak Nath, Michelle A. Espy, and Robert F. Williams. 2020. “Design and implementation of a J-coupled spectrometer for multidimensional structure and relaxation detection at low magnetic fields.” Review of Scientific Instruments 91 (5): 054103. https://doi.org/10.1063/1.5130391.

Footnotes

60 MHz chosen simply because commercial instruments have been available at that field for forever.↩︎
The protons in something like actually do couple to each other. With a little trick, you can measure .↩︎
This is a general trend. The exact boundaries between various coupling regimes depend on the nuclei involved, the coupling constants and the peak separation in Larmor frequency (in Hz).↩︎
Remember, signals are very weak at EF so observing heteronuclei is significantly more challenging. See the previous post for details.↩︎

Reuse

CC BY 4.0

Citation

BibTeX citation:

@online{hanson2023,
  author = {Hanson, Bryan},
  title = {The n + 1 Rule in {Earth’s} {Field} {NMR}},
  date = {2023-09-18},
  url = {http://chemospec.org/posts/2023-09-18-EF-NMR-2/EF-NMR-2.html},
  langid = {en}
}

For attribution, please cite this work as:

Hanson, Bryan. 2023. “The n + 1 Rule in Earth’s Field NMR.” September 18, 2023. http://chemospec.org/posts/2023-09-18-EF-NMR-2/EF-NMR-2.html.

JEOL’s Delta Now Includes ChemoSpec

Bryan Hanson — Wed, 23 Aug 2023 07:00:00 GMT

Over on Twitter I caught news of a new application note from JEOL: Their Delta software for NMR now contains an interface to my R package ChemoSpec. The application note is here and gives a pretty complete overview of what they call the “chemometrics tool”. The JEOL software developers have added a number of short dialog boxes to access the various chemometric methods. The dialog boxes capture the arguments for each underlying function and then the full function call is assembled and passed to Rscript, which is a command line version of R intended for embedded uses such as this one.

This is a good example of Free and Open Source Software (FOSS). ChemoSpec is licensed under GPL-3 which permits any reasonable use as long as there is attribution to the original authors.

Check out the first line of the “About Delta” box:

Reuse

CC BY 4.0

Citation

BibTeX citation:

@online{hanson2023,
  author = {Hanson, Bryan},
  title = {JEOL’s {Delta} {Now} {Includes} {ChemoSpec}},
  date = {2023-08-23},
  url = {http://chemospec.org/posts/2023-08-23-CS-Delta/CS-Delta.html},
  langid = {en}
}

For attribution, please cite this work as:

Hanson, Bryan. 2023. “JEOL’s Delta Now Includes ChemoSpec.” August 23, 2023. http://chemospec.org/posts/2023-08-23-CS-Delta/CS-Delta.html.

FOSS4Spectroscopy Update

Bryan Hanson — Tue, 15 Aug 2023 07:00:00 GMT

Yesterday I pushed a major update to the FOSS for Spectroscopy web site. Remember that this is a lightly curated and imperfect process; I have some scripts that automate the discovery of packages, but there is still a considerable amount of manual inspection and decision making. If you think I’ve missed a package, please let me know.

It’s been nearly a year, and there are a number of new entries. Let’s do a quick comparison of the results from November 2022 versus August 2023. Back in November 2022 there were 246 packages; nearly a year later there are 287. Figure 1 shows a Venn diagram of the changes.

Figure 1: Venn diagram comparing the two sets of packages

Package Language

Software development in spectroscopy is clearly actively occurring in the Python ecosystem; R has stalled (see Table 1). Interpretation of this observation is challenging. A few thoughts:

One could claim that the R ecosystem for spectroscopy is mature and further development is naturally going to be limited.
The growing popularity of the Python language surely contributes significantly.
One motivation for people to write packages is to learn the language and the package delivery system. There’s nothing wrong with these motivations, however this leads to packages that largely overlap in their features.

Table 1: Package language, 2022 vs 2023.

language	Nov 2022	Aug 2023
Python	162	198
R	60	61
C++	4	5
Java	4	4
Julia	4	5
C	2	2
Qt	2	2
C-shell	1	1
C#	1	2
Fortran	1	1
Go	1	1
html	1	1
JavaScript	1	2
TypeScript	1	1
XML	1	1
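As a quick consistency check, the counts in Table 1 can be totaled and compared with the package numbers quoted in the text. This is only a sketch: the data frame and its column names are made up here for illustration, with the values typed in from the table.

```r
# Values transcribed from Table 1; object and column names are illustrative
pkgs <- data.frame(
  language = c("Python", "R", "C++", "Java", "Julia", "C", "Qt", "C-shell",
               "C#", "Fortran", "Go", "html", "JavaScript", "TypeScript", "XML"),
  nov_2022 = c(162, 60, 4, 4, 4, 2, 2, 1, 1, 1, 1, 1, 1, 1, 1),
  aug_2023 = c(198, 61, 5, 4, 5, 2, 2, 1, 2, 1, 1, 1, 2, 1, 1)
)

# Column totals should match the package counts given in the text
totals <- colSums(pkgs[, c("nov_2022", "aug_2023")])

# Net change per language, and the languages that grew
pkgs$change <- pkgs$aug_2023 - pkgs$nov_2022
grew <- pkgs[pkgs$change > 0, c("language", "change")]

# Share of the net growth that comes from Python
python_share <- pkgs$change[pkgs$language == "Python"] / sum(pkgs$change)

totals
grew
python_share
```

The totals reproduce the 246 and 287 packages mentioned earlier, and most of the net growth comes from Python.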

Package Focus

Table 2 shows the change in package focus. Most categories grew modestly.

Table 2: Package focus, 2022 vs 2023.

category	Nov 2022	Aug 2023
Any	32	34
Data Sharing	33	41
EEM	3	3
EPR, ESR	5	7
IR (all flavors)	35	38
Raman	28	34
UV-Vis, UV, Vis	19	20
LIBS	3	5
Muon	1	0
PES	1	2
XRF, XAS	10	15
NMR	87	97
Time Series	3	3

Personal Perspective

I’ve curated this site for several years now. One thing that is clear is that there is a lot of duplication of effort and features. I mentioned above a few reasons for this, but at some point it makes more sense to add to an existing package than to write one from scratch. However, this can only happen if people look around for existing software first. That of course is one purpose of the FOSS for Spectroscopy web site.

As I look at it,

One-dimensional spectroscopic techniques produce collections of x,y data, usually spectra¹, and can thus be stored in a matrix. In terms of organization there’s nothing different between an IR spectrum and a UV-Vis spectrum.
Two-dimensional techniques produce data that can be stored in one of two ways:
- One spectrum (or one wavelength) can be stored as a matrix, so a set of spectra is a stack of matrices (termed an array in some languages). Think of 2D NMR spectra: one element of the stack is a single 2D spectrum.
- Alternatively, individual spectra can be stored in a matrix and an additional data structure provides a key to how each spectrum relates to the others. Think of a Raman image: spectra are collected over a set of x,y locations.
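The storage options above can be sketched in base R. This is a minimal illustration only; the dimensions and object names are hypothetical and not taken from any particular package.

```r
# Hypothetical sizes, chosen only for illustration
n_samples <- 4L
n_points <- 100L

# 1D techniques: one spectrum per row, one column per x value
spectra_1d <- matrix(0, nrow = n_samples, ncol = n_points)

# 2D, first option: a stack of matrices (an array);
# each slice along the third dimension is one complete 2D spectrum
n_f1 <- 16L
n_f2 <- 32L
spectra_2d <- array(0, dim = c(n_f1, n_f2, n_samples))
one_2d_spectrum <- spectra_2d[, , 1]

# 2D, second option: a matrix of ordinary spectra plus a key that says
# where each spectrum was collected (e.g. pixels of a Raman image)
pixel_key <- expand.grid(x = 1:2, y = 1:2)
image_spectra <- matrix(0, nrow = nrow(pixel_key), ncol = n_points)

dim(spectra_1d)
dim(one_2d_spectrum)
nrow(pixel_key) == nrow(image_spectra)
```

In the second option the key and the matrix must stay in the same row order, which is the main bookkeeping cost of that design.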

This design decision is the core of building a package. Once you have decided on a structure:

You need import methods, these are always tedious to write.
- Broadly accepted formats, like JCAMP-DX or plain old csv.
- Manufacturer specific formats, some of which may be poorly documented.
You need processing methods.
- Widely used methods, like normalization and smoothing.
- Technique specific methods, such as zero-filling.
You need analysis methods.
- Common techniques like PCA.
- Analysis unique to a specific technique.
You need visualization methods.

In an ideal world, a data storage structure is chosen and everything else can be built later, quickly at first and then more slowly. The reality however is that people keep reinventing most of the wheel. I suppose this is not too different from people inventing entirely new computer languages…

Footnotes

I say “usually spectra” because for some instruments, depending upon the goal of the package, one may store raw data that must be transformed in a separate step. The best example is raw time-domain NMR data which must be Fourier transformed into frequency-domain spectra before analysis.↩︎

Reuse

CC BY 4.0

Citation

BibTeX citation:

@online{hanson2023,
  author = {Hanson, Bryan},
  title = {FOSS4Spectroscopy {Update}},
  date = {2023-08-15},
  url = {http://chemospec.org/posts/2023-08-15-F4S-Update/F4S-Update.html},
  langid = {en}
}

For attribution, please cite this work as:

Hanson, Bryan. 2023. “FOSS4Spectroscopy Update.” August 15, 2023. http://chemospec.org/posts/2023-08-15-F4S-Update/F4S-Update.html.

Earth’s Field NMR

Bryan Hanson — Wed, 26 Jul 2023 07:00:00 GMT

TL;DR

In EF-NMR the line widths are extremely narrow but there is no chemical shift dispersion.
We can observe heteronuclear couplings in EF-NMR as they are field-invariant.
In EF-NMR the populations of the two energy states are essentially equal, eliminating any signal. We can get around this with pre-polarization.
The resonance frequency of NMR in EF is in the audio range, greatly simplifying the electronics.

Let’s take a closer look, from first principles, at what kinds of information one can glean from EF-NMR. We’ll restrict our discussion to spin-1/2 nuclei with ~100% abundance, like $^1$H, $^{19}$F or $^{31}$P; you’ll see why soon enough. Table 1 gives some relevant physical parameters for these nuclei.

Table 1: Important NMR parameters for ~100% abundant spin-1/2 nuclei. The units of $\gamma$ are $10^7$ rad s$^{-1}$ T$^{-1}$. Larmor frequency is relative to $^1$H at 100 MHz, in MHz.

Nuclei	Gyromagnetic ratio	Larmor Freq.
$^1$H	26.7522	100
$^{19}$F	25.1815	94
$^{31}$P	10.8394	40.5

Excellent general references on NMR theory are Friebolin (Friebolin 2011) and Claridge (Claridge 2016).

Line Widths are Very Narrow

The line width of an NMR signal is primarily dependent on the homogeneity of the field, which in the case of earth’s field is very good. Appelt et al. (2006) state that when observations are made >100 meters from buildings and ferrous structures¹ the homogeneity of the earth’s magnetic field for small sample volumes is in the range of . They further state that when seconds line widths will be less than 0.1 Hz.² This all sounds very promising: narrow lines imply good separation between peaks.

No Chemical Shift Information

One of the characteristics of high-field NMR which makes it so useful is the dispersion of chemical shifts as a function of structure. Unfortunately, EF-NMR has effectively zero chemical shift dispersion. The equation for computing chemical shift, $\delta$, is:

$$\delta = \frac{\nu_{sample} - \nu_{ref}}{\nu_0}$$

where the units are:

$$\mathrm{ppm} = \frac{\mathrm{Hz}}{\mathrm{MHz}}$$

since $\delta$ is a field strength independent quantity. Taking $\nu_{ref}$ to be zero, e.g. TMS added to the sample, we can rearrange the equation to get $\nu_{sample} = \delta \times \nu_0$. Consider a halogenated compound whose methyl group has a chemical shift of 2.63 ppm. Using an earth’s field $^1$H Larmor frequency of about 1.9 kHz (0.0019 MHz, the value at 45 µT), we can compute the shift in Hz as about 0.005 Hz. This is an extremely small value, smaller than the typical line width in earth’s field (so the promise of narrow line widths is not going to save us).

For further comparison, we can do the same calculation for a second compound which has a shift of 4.90 ppm. The result is about 0.009 Hz, so the two signals are separated by only about 0.004 Hz. We can see that these two compounds with differing numbers of halogens, which would be trivial to distinguish with a low field bench-top instrument operating at 80 MHz, are indistinguishable in earth’s field. This is due to the very small value of earth’s magnetic field.
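The conversion from ppm to Hz is simple enough to sketch in R. The gyromagnetic ratio is the $^1$H value from Table 1 (taken as a multiple of $10^7$ rad s$^{-1}$ T$^{-1}$), 45 µT is assumed as a representative value for earth’s field, and the variable names are only illustrative.

```r
# 1H gyromagnetic ratio from Table 1, in rad s^-1 T^-1
gamma_H <- 26.7522e7
B0_earth <- 45e-6                           # tesla; assumed value for earth's field
nu0_earth <- gamma_H * B0_earth / (2 * pi)  # Larmor frequency in Hz
nu0_bench <- 80e6                           # Hz; the bench-top instrument in the text

delta_ppm <- c(2.63, 4.90)                  # the two chemical shifts quoted above

# shift in Hz = delta (ppm) * 1e-6 * Larmor frequency (Hz)
shift_earth <- delta_ppm * 1e-6 * nu0_earth
shift_bench <- delta_ppm * 1e-6 * nu0_bench

diff(shift_earth)  # separation of the two signals in earth's field, Hz
diff(shift_bench)  # separation of the two signals at 80 MHz, Hz
```

The separation at 80 MHz is well over 100 Hz, while in earth’s field it is a few thousandths of a Hz.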

Heteronuclear Couplings

While the chemical shift dispersion in earth’s field is clearly nil, heteronuclear J couplings are readily observed due to their greater magnitude, up to about 200 Hz. Appelt et al. (2006) gives a number of interesting examples involving , and containing compounds.

Populations of Quantum States

Basic NMR theory tells us that the energy difference between the two quantum states for a spin-1/2 nucleus is proportional to the field strength $B_0$:

$$\Delta E = \gamma \hbar B_0$$

where $\hbar$ is $h/2\pi$. A plot for $^1$H is shown in Figure 1; the right-most point corresponds to a 1,000 MHz instrument. Clearly as $B_0$ goes to zero the $\Delta E$ goes to zero in a simple linear fashion.

Figure 1: E as a function of field strength

We can then relate the number of nuclei in the upper energy state, $N_\beta$, to that in the lower energy state, $N_\alpha$, at thermal equilibrium as:

$$\frac{N_\beta}{N_\alpha} = e^{-\Delta E / k_B T}$$

where $k_B$ is the Boltzmann constant and $T$ is the temperature in kelvin. The ratio of population states is nearly equal for any value of $B_0$ but of course gets even worse as $B_0$ decreases. This is the reason for the low overall sensitivity of NMR as an analytical technique. We can compute the ratio for $^1$H at room temperature; we’ll compare the value for earth’s field to those of 100 and 1,000 MHz instruments:

      45 uT (Earth)    2.35 T (100 MHz) 23.49 T (1,000 MHz) 
       0.9999999997             0.99998             0.99984

As you can see, in earth’s field there is basically no difference in the two population states, meaning there is no signal to observe. Clearly a problem!
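A minimal sketch of this calculation in R follows. It assumes CODATA values for the constants, the $^1$H gyromagnetic ratio from Table 1, and a room temperature of 298.15 K; the function and variable names are illustrative. The population excess is computed with `expm1()` because the ratio is so close to 1 that printing it directly hides the differences.

```r
h <- 6.62607015e-34    # Planck constant, J s
k_B <- 1.380649e-23    # Boltzmann constant, J/K
gamma_H <- 26.7522e7   # 1H gyromagnetic ratio, rad s^-1 T^-1
temp_K <- 298.15       # assumed room temperature, K

# Energy gap divided by thermal energy, Delta E / (k_B * T)
gap_over_kT <- function(B0, gamma = gamma_H, temp = temp_K) {
  gamma * (h / (2 * pi)) * B0 / (k_B * temp)
}

B0 <- c(earth = 45e-6, MHz100 = 2.35, MHz1000 = 23.49)  # field strengths, T

ratio <- exp(-gap_over_kT(B0))                # upper / lower state population
excess_ppm <- -expm1(-gap_over_kT(B0)) * 1e6  # (1 - ratio), in parts per million

print(ratio, digits = 12)
excess_ppm
```

Expressed as an excess in parts per million, the earth’s field value is several orders of magnitude smaller than either of the high-field values.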

If all the nuclei were in the lower energy state we could measure the energy required to bump them up to the upper state, or more commonly, bump them up and then watch the energy given off as equilibrium returns. Unfortunately, the signal produced is proportional to the population difference between the two states, which is effectively zero in earth’s field. At the same time however, the more spins we have, the higher the signal will be. More spins total in the detection coil sweet spot will be helpful, but there are other factors militating against making large coils to accommodate large samples. One way around this is to use signal averaging.
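The benefit of signal averaging can be illustrated with a small simulation. This is a sketch under simple assumptions (a made-up decaying sine as the signal, independent Gaussian noise on every scan, hypothetical scan counts); with those assumptions the noise in the average falls roughly as the square root of the number of scans.

```r
set.seed(42)
n_points <- 2048
n_scans <- 64

# Hypothetical weak, decaying signal buried in noise
t <- seq(0, 1, length.out = n_points)
signal <- 0.1 * sin(2 * pi * 50 * t) * exp(-3 * t)

# Each column is one scan: the same signal plus fresh noise
scans <- replicate(n_scans, signal + rnorm(n_points, sd = 1))
averaged <- rowMeans(scans)

noise_single <- sd(scans[, 1] - signal)  # noise level in one scan
noise_avg <- sd(averaged - signal)       # noise level after averaging

noise_single / noise_avg  # compare with sqrt(n_scans)
sqrt(n_scans)
```

Quadrupling the number of scans only halves the noise, so averaging alone gets expensive quickly.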

Pre-Polarization

In the case of earth’s field NMR, the usual way around this problem of very limited signal is to pre-polarize the sample.³ This basically involves subjecting the sample to a fairly high magnetic field for a brief period before measuring any signals. This pre-polarization field forces more of the nuclei to assume the lower energy state, thus increasing the population difference, which means there is a signal to be observed. Mohorič has an excellent but technical discussion of the details of this process (Mohorič and Stepišnick 2009).

EF-NMR Signals are in the Audio Range

What is the Larmor (resonance) frequency in earth’s field? Earth’s magnetic field varies from about 25 to 65 µT; we’ll use an intermediate value of 45 µT for our calculations. The Larmor frequency is given by the equation:

$$\nu_0 = \left| \frac{\gamma}{2\pi} \right| B_0$$

Notice there is a simple linear relation between $\nu_0$ and $B_0$.⁴ If we plug in values for our nuclei we get the following values in Hz:

       1H       19F       31P 
 1915.985  1803.492   776.315

What we have shown here is that for EF-NMR, resonance frequencies are in the audio frequency range (20 - 20,000 Hz). Why is this important? It greatly simplifies signal detection because audio receivers are essentially radios, and the electronics for working in this frequency range are extremely well worked out, and not expensive to buy or build.
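The Larmor frequencies can be computed in a few lines of R. This sketch takes the gyromagnetic ratios from Table 1 as multiples of $10^7$ rad s$^{-1}$ T$^{-1}$ and assumes a field of 45 µT (45e-6 T); the object names are illustrative.

```r
# Gyromagnetic ratios from Table 1, converted to rad s^-1 T^-1
gamma <- c("1H" = 26.7522e7, "19F" = 25.1815e7, "31P" = 10.8394e7)

B0 <- 45e-6  # earth's field in tesla (45 microtesla, assumed)

# Larmor frequency in Hz; abs() because gamma can be negative for some nuclei
larmor_hz <- abs(gamma) * B0 / (2 * pi)
larmor_hz

# Is every frequency inside the nominal audio band?
all(larmor_hz >= 20 & larmor_hz <= 20000)
```

Note the unit handling: the field must be in tesla and the gyromagnetic ratio in rad s$^{-1}$ T$^{-1}$, and a slip of a factor of ten in either moves the answer by the same factor.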

Historical Note

The first earth’s field NMR experiment was apparently conducted by Martin Packard and Russell Varian while at Varian Associates (Packard and Varian 1954). Varian Associates was of course a major instrument player, including NMR, and for a long time marketed their instruments largely toward colleges. ⁵

References

Appelt, Stephan, Holger Kühn, F. Wolfgang Häsing, and Bernhard Blümich. 2006. “Chemical Analysis by Ultrahigh-Resolution Nuclear Magnetic Resonance in the Earth’s Magnetic Field.” Nature Physics 2. https://doi.org/10.1038/nphys211.

Claridge, Timothy D. W. 2016. High-Resolution NMR Techniques in Organic Chemistry. Elsevier.

Friebolin, Horst. 2011. Basic One- and Two-Dimensional NMR Spectroscopy. Wiley-VCH.

Mohorič, Aleš, and Janez Stepišnick. 2009. “NMR in the Earth’s Magnetic Field.” Progress in Nuclear Magnetic Resonance Spectroscopy 54: 166–82. https://doi.org/10.1016/j.pnmrs.2008.07.002.

Packard, Martin, and Russell Varian. 1954. “Free Nuclear Induction in the Earth’s Magnetic Field.” Physical Review 93: 939.

Footnotes

Keep in mind that buried utilities made of iron or carrying electrical current can interfere.↩︎
$T_1$ is the relaxation time for magnetization aligned with the $z$ axis, which corresponds to the $B_0$ axis. This is the relaxation time that affects the ability to pulse quickly. It’s also called the spin-lattice relaxation time. $T_2$ is the relaxation time corresponding to magnetization in the $x,y$ plane, and is also known as the spin-spin relaxation time. The observed $T_2^*$ is largely determined by magnetic field inhomogeneity and the line width at half peak height is $\Delta\nu_{1/2} = 1/(\pi T_2^*)$. See Friebolin chapter 7 for a detailed discussion.↩︎
In fact pre-polarizing or polarizing the sample is now en-vogue for higher field instruments as well, in the form of DNP, SABRE etc.↩︎
The gyromagnetic ratio can be negative, hence the absolute value is taken here.↩︎
Martin Packard is apparently unrelated to David Packard, one of the founders of HP.↩︎

Reuse

CC BY 4.0

Citation

BibTeX citation:

@online{hanson2023,
  author = {Hanson, Bryan},
  title = {Earth’s {Field} {NMR}},
  date = {2023-07-26},
  url = {http://chemospec.org/posts/2023-07-19-EF-NMR-1/EF-NMR-1.html},
  langid = {en}
}

For attribution, please cite this work as:

Hanson, Bryan. 2023. “Earth’s Field NMR.” July 26, 2023. http://chemospec.org/posts/2023-07-19-EF-NMR-1/EF-NMR-1.html.

Home Built Photometer

Bryan Hanson — Sun, 16 Jul 2023 07:00:00 GMT

Way back in 2014, I ordered the parts and started to build a photometer according to the plans laid out by McClain (2014). I didn’t get very far, it was a busy time. Well, I have finally completed the project!

A number of simple designs for photometers and spectrometers have been published. What drew me to McClain’s approach is that his goal is to teach some basic electronics relevant to instrument design, which is something I have wanted to learn for some time (apparently since 2014, though actually I think this goes back to watching my father build a Heath Kit stereo receiver which used tubes). Further, McClain starts with a very simple design, and then adds circuit modules to improve the design. Everything is laid out logically and is easy to follow. At each step there is an opportunity to go further to understand how the circuit actually works in detail.

In this post I’ll describe the project at various stages. All the electronics are McClain’s design, but instead of McClain’s cuvette holder I used the design of Kvittingen (Kvittingen et al. (2017)) which uses LEGO bricks as a sample holder and can accommodate an additional detector for fluorescence measurements.

This design is a photometer, and not a spectrophotometer, because only one wavelength at a time can be measured. The source LED must have an emission spectrum overlapping with the $\lambda_{max}$ of the compound to be measured; LEDs are available which cover pieces of the whole visible spectrum so it’s pretty easy to swap for a different wavelength range. The detector photodiode (a type of LED, working in reverse) responds over a broad wavelength range, though with greatly varying efficiency. If one wants to measure fluorescence, the photodiode is moved to the 90° position.¹

A couple of important notes:

The supplementary material to McClain’s article is where everything is covered in detail.
A membership to CircuitLab was really helpful as it allowed me to simulate circuits and change values of components to get a better sense of how things work in detail (as you will see).
I was hoping to get by without an oscilloscope, but ultimately I needed one for troubleshooting. It turned out to really advance my understanding of the circuits. I purchased a PicoScope 2204A which along with the software turns your computer into a basic oscilloscope. Strongly recommended, it’s a very nice product!

Version 1: DC Power Supply for the LED

In this version a standard “green” LED (maximum emission at 523 nm) is used as the light source and has the simplest possible power supply. As built, the system provides a current of about 26 mA to the LED. The data sheet recommends 30 mA max.

The detector in this version is a photodiode linked to a TIA, a transimpedance amplifier. This is a current to voltage (I to V) converter, and something similar can be used in any instrument where a detector generates a current. Figure 1 shows the circuit.

The main deviation from McClain’s design is that R2 needed to be set to 3M in order to reach about 1V on the output. McClain gives a range of 100K to 1M. As the value of this resistor goes up, the output voltage goes up due to increasing amplification. This change is likely necessary as the photodiode in use here is a bit different than McClain specified. After some experimentation, the current on I1 (which replicates the current produced by the photodiode in the simulation) was set to 1/10,000 of the value of the current of D1, based upon currents observed when isolating D2 from the rest of the circuit.

Monitoring the current and voltage across D2 as built and warmed up, the values were about 0.3 µA and 0.23 V; if the LEGO holding D1 was moved immediately adjacent to that holding D2 these numbers were 0.7 µA and 0.26 V. These readings support the discussion above that the photodiode was generating a relatively small response.
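For an ideal transimpedance amplifier the magnitude of the output voltage is just the photodiode current times the feedback resistance. The sketch below applies that relation to the numbers above, taking the currents as microamps and R2 as 3 MΩ; it ignores the sign of the output and any non-ideal op amp behavior, and the variable names are illustrative.

```r
# Photodiode currents from the text, in amps (taken here as microamps)
i_photo <- c(separated = 0.3e-6, adjacent = 0.7e-6)

r_feedback <- 3e6  # R2 in ohms (3 M)

# Ideal transimpedance amplifier: |V_out| = I * R
v_out <- i_photo * r_feedback
v_out

# Feedback resistance needed for a 1 V output at the lower current
1 / i_photo[["separated"]]
```

With a few tenths of a microamp, a feedback resistor in the megohm range is needed to reach an output near 1 V, in line with the change to R2 described above.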

Figure 1: Version 1 with simple LED power supply and a transimpedance amplifier as the detector.

Figure 2 and Figure 3 show the project from each side.

Figure 2: View of project. The simplicity of the supply to the LED is apparent. The top rail is the negative supply, the lower rail is the positive supply, and the 2nd-from-bottom rail is the ground.

Figure 3: View of project. The op amp for the detector is in the foreground.

Version 2: Relaxation Oscillator as the LED Power Supply

The next step in McClain’s scheme is to change the basic power supply to a more sophisticated “relaxation oscillator” which produces a square wave output with a certain frequency. The idea here is to eliminate stray room light from affecting the output by using a specific AC-like frequency as the source and then modify the detector to only see this frequency. Stray room light may consist of random light causing DC offsets in the circuit, or something more deterministic like 60 Hz flicker from light fixtures.

Simulation

The relaxation oscillator circuit was modeled in CircuitLab before building the circuit. The circuit is in Figure 4 and the simulation results are shown in Figure 5.

Figure 4: The relaxation oscillator circuit.

Figure 5: The relaxation oscillator simulation output. The gold/orange line indicates the charge on C2 building and decaying. When it reaches an extreme positive or negative value, the phase of the output square wave changes. The lower plot is the very small current produced at the output of the op amp.

As Built

Capacitor C2 controls the frequency of the square wave produced by the relaxation oscillator. Figure 6 shows the oscilloscope traces with C2 set to 1 µF which gives a frequency of about 8 Hz, as seen in the video below. This serves as visual “proof of concept”. Figure 7 shows the oscilloscope traces for a value of 4700 pF for C2 which generates a square wave with frequency 1,500 Hz. This is higher than the frequency of any room light flickering and thus will serve as a “carrier” of the absorbance value unaltered by any stray room light, once we add the other modules to the detection side.
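For an op amp relaxation oscillator with fixed resistors, the period is proportional to the timing capacitance, so the frequency should scale as 1/C2. The sketch below uses the roughly 8 Hz observed with 1 µF to predict the frequency with 4700 pF; it assumes ideal components, and the variable names are illustrative.

```r
f_ref <- 8         # Hz, approximate frequency observed with the reference capacitor
c_ref <- 1e-6      # farads (1 uF)
c_new <- 4700e-12  # farads (4700 pF)

# With the resistors unchanged, frequency is inversely proportional to C2
f_pred <- f_ref * c_ref / c_new
f_pred
```

The prediction is about 1.7 kHz, reasonably close to the 1,500 Hz observed given that the 8 Hz figure is approximate and real capacitors have tolerances.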

Note that all oscilloscope traces have two vertical scales, one on the left and one on the right, color coordinated with the trace.

Figure 6: Relaxation oscillator with a 1 µF capacitor for C2. The blue curve is the output of the op amp, the red line is the charging and discharging of C2. Note the box at the bottom which reports the period of the square wave, which corresponds to a frequency of about 8 Hz.

Figure 7: Relaxation oscillator with C2 at 4700 pF. Note the horizontal scale range is much smaller than in the previous figure, as the frequency is much higher.

The built version of the relaxation oscillator corresponds well with the simulation.

Version 3: Almost All the Bells and Whistles

This final version contains all the circuits as described by McClain. I decided to measure voltages directly at the output rather than use an Arduino and display to provide an absorbance value.

Figure 8 shows the final circuit. Note that several test points are labeled and referred to in the discussion below.

Figure 8: Completed project with key modules labeled. Click for full size.

Relaxation Oscillator

The details of the relaxation oscillator are exactly as described above.

Current Amplifier

As the simulation of the relaxation oscillator shows, the current output of the op amp is very small. Consequently a simple transistor is used to bump up the current driving the LED source to an appropriate value.

I to V Converter

The I to V converter circuit is the same as described earlier.

High Pass Filter

A high pass filter takes a signal that is time-varying, in our case a square wave, and filters it so that only high frequency components are kept. This is a key part of the detector design, since we create an approximately 1,500 Hz square wave and any other component, like 60 Hz flicker from room lights, should be eliminated. Figure 9 shows an isolated version of our high pass filter, and Figure 10 shows the frequency dependence of the filtering.

Figure 9: Isolated high pass filter circuit.

Figure 10: Frequency dependence of the high pass filter. Lower values on the vertical axis means greater attenuation.
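To get a feel for the numbers, here is a sketch of the gain of a generic first-order RC high pass filter. The resistor and capacitor values are hypothetical, chosen only to put the cutoff between 60 Hz and 1,500 Hz; they are not the values in the actual circuit, which may also differ in topology.

```r
# Hypothetical component values, NOT taken from the circuit in Figure 9
r_ohm <- 10e3     # 10 kilohm
c_farad <- 0.1e-6 # 0.1 uF

# Cutoff frequency of a first-order RC high pass filter
f_c <- 1 / (2 * pi * r_ohm * c_farad)

# Magnitude of the transfer function at frequency f (Hz)
hp_gain <- function(f, cutoff = f_c) {
  x <- f / cutoff
  x / sqrt(1 + x^2)
}

f_test <- c(flicker = 60, carrier = 1500)
gain <- hp_gain(f_test)
gain_db <- 20 * log10(gain)  # more negative means more attenuation

f_c
gain
gain_db
```

With these values the 1,500 Hz carrier passes almost untouched while 60 Hz is attenuated; a real design would choose the cutoff and filter order to attenuate 60 Hz more strongly.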

Half Wave Rectifier

A half wave rectifier converts an alternating signal, one that swings between positive and negative values, into a positive only form. Essentially, the negative portion of the signal is converted to positive values, and the positive portion is set to zero. Figure 11 shows the action of the rectifier.

Figure 11: Action of the half wave rectifier. The red trace is observed at test point D, and fluctuates positive and negative. The blue trace is the rectified wave observed at test point E. Notice that its voltage is always positive.
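The behavior described above (negative portions flipped to positive, positive portions set to zero) can be mimicked numerically. This is an idealized sketch with a made-up ±1 V square wave; it ignores diode drops and op amp limits.

```r
f_sq <- 1500                              # Hz, square wave frequency
t <- seq(0, 4 / f_sq, length.out = 401)   # four periods

# Idealized +/- 1 V square wave
square_in <- ifelse(sin(2 * pi * f_sq * t) >= 0, 1, -1)

# Inverting half wave rectifier: flip the sign, then clip at zero
rectified <- pmax(-square_in, 0)

range(square_in)
range(rectified)
```

The output is zero whenever the input is positive and equals the magnitude of the input whenever the input is negative.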

Active Low Pass Filter

The final step is an active low pass filter which only passes signals below a certain frequency and amplifies them (that’s the active part). Importantly, in addition to amplifying the signal, the op amp emits a steady DC voltage which is ultimately proportional to the light hitting the photodiode. This is the value we are after when making absorbance measurements. Figure 12 shows the actual output.

Figure 12: Effect of the low pass filter. The red curve is the same as in the previous figure, namely test point D. The blue line is the final DC output at test point F. This is where the final voltages are measured.

If we isolate the low pass filter circuit we can try to understand its operation in greater detail. Figure 13 shows the isolated circuit with simulation inputs configured to match the measured inputs.

Figure 13: Isolated active low pass filter circuit.

If we look at the frequency dependence of this circuit, we see that low frequencies are passed relatively unattenuated (Figure 14), as expected. The combination of the earlier high pass filter and this low pass filter amounts to a band pass filter. This suggests a potential follow up design which uses a band pass filter followed by rectification and conversion to DC by some combination of op amps.

Figure 14: Attenuation of high frequencies by the low pass filter.

In addition to the filtering behavior, we know that the circuit produces a steady DC voltage from the approximately square wave input. Let’s check this using the simulator again, but this time looking at output voltages. Figure 15 shows the results, which should ideally be close to those in Figure 12.

Figure 15: Voltages produced by the active low pass filter. The square wave is the simulated input. The gold/orange line is the output DC voltage, which is higher than observed, probably due to an imperfect simulation configuration. The key point is that a steady DC voltage is produced.
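The chain of rectification followed by low pass filtering can also be mimicked numerically. The following is a rough sketch, not a model of the actual circuit: the sampling rate, the 10 Hz cutoff and the helper names `rectify()`, `lowpass()` and `dc_level()` are all assumptions made for illustration, and the op amp gain is ignored.

```r
fs <- 150000                       # assumed sampling rate, Hz
f0 <- 1500                         # approximate square wave frequency, Hz
tt <- seq(0, 0.2, by = 1 / fs)     # 0.2 s of simulated time

square_wave <- function(amp) amp * sign(sin(2 * pi * f0 * tt))

# rectifier as described in the text: negative part flipped, positive part zeroed
rectify <- function(x) pmax(-x, 0)

# first-order (RC-like) low pass filter; fc is a hypothetical cutoff
lowpass <- function(x, fc = 10) {
  a <- (2 * pi * fc / fs) / (1 + 2 * pi * fc / fs)
  as.numeric(stats::filter(a * x, 1 - a, method = "recursive"))
}

# steady output level for a given input amplitude
dc_level <- function(amp) {
  y <- lowpass(rectify(square_wave(amp)))
  mean(tail(y, fs / 100))          # average over the last 10 ms
}

dc1 <- dc_level(1)
dc2 <- dc_level(2)
c(dc1 = dc1, dc2 = dc2, ratio = dc2 / dc1)
```

Doubling the input amplitude doubles the steady output, which is the proportionality the absorbance measurement relies on.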

Calibration Curve

A calibration curve was prepared using a 10 mL plastic syringe and some small bottles. Two drops of red food coloring were added to 10 mL of water to create the first solution. Three mL of the stock solution was added to seven mL of water. This 2nd solution was then diluted in similar fashion and so forth, to get five total solutions. Tap water was used. The green LED was disconnected and the dark current was measured. Next, tap water was used as a blank. Then the voltage for each sample was recorded (voltage measurements are taken at point F in Figure 8). Listing 1 shows the computational steps. Figure 16 shows the samples from most concentrated to least concentrated.

Figure 16: Calibration samples.

Listing 1: Computation of absorbance values.

dark <- 6.0e-3 # dark voltage
blank <- 0.281 # tap water
voltage <- c(26.2e-3, 26.8e-3, 34.0e-3, 99.9e-3, 196.0e-3) # sample readings
stock <- 1.0 # 2 drops red food coloring in 10 mL tap water
dil <- 3/10 # serial dilution factor
conc <- c(stock, dil^(1:4))
DF <- data.frame(Concentration = conc, Voltage = voltage)
DF$Absorbance <- -log((DF$Voltage - dark)/(blank - dark))
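Written out as an equation, the last line of Listing 1 computes the following. The symbols $V$, $V_{\text{dark}}$ and $V_{\text{blank}}$ are labels introduced here for `voltage`, `dark` and `blank`. Note that `log()` in R is the natural logarithm; the conventional base-10 absorbance, written $A_{10}$ below, differs only by the constant factor $\ln 10$, so linearity is unaffected.

```latex
A = -\ln\!\left(\frac{V - V_{\text{dark}}}{V_{\text{blank}} - V_{\text{dark}}}\right),
\qquad
A_{10} = \frac{A}{\ln 10}

% worked example for the most concentrated sample:
A = -\ln\!\left(\frac{0.0262 - 0.0060}{0.281 - 0.0060}\right)
  = -\ln\!\left(\frac{0.0202}{0.275}\right) \approx 2.611
```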

Table 1 shows the results. A calibration curve is shown in Figure 17. Clearly the most concentrated samples deviate from the linear behavior expected for Beer’s Law (as observed by McClain). If the two most concentrated samples are dropped, the result is a nice linear relationship, as seen in Figure 18 and the summary of the fit in Listing 2.

Table 1: Relative sample concentrations and corresponding voltages and absorbances.

Concentration	Voltage	Absorbance
1.0000	0.0262	2.611089
0.3000	0.0268	2.581818
0.0900	0.0340	2.284567
0.0270	0.0999	1.074541
0.0081	0.1960	0.369747

Figure 17: Calibration curve, hardware version 3.

Listing 2: Results of fitting the three lowest concentration samples.

DF35 <- DF[3:5,]
fit <- lm(DF35$Absorbance ~ DF35$Concentration)
summary(fit)


Call:
lm(formula = DF35$Absorbance ~ DF35$Concentration)

Residuals:
       1        2        3 
-0.03688  0.15983 -0.12294 

Coefficients:
                   Estimate Std. Error t value Pr(>|t|)  
(Intercept)          0.3118     0.1840   1.694   0.3394  
DF35$Concentration  22.3292     3.3801   6.606   0.0956 .
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.205 on 1 degrees of freedom
Multiple R-squared:  0.9776,    Adjusted R-squared:  0.9552 
F-statistic: 43.64 on 1 and 1 DF,  p-value: 0.09564

Figure 18: Calibration curve, hardware version 3, dropping the two most concentrated samples.

Not too bad!
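The same fit can be written with the `data` argument of `lm()`, which makes it easy to turn the calibration around and estimate a concentration from a measured absorbance. This is a sketch only: the absorbance of 1.0 for the unknown is an invented value, and with just three calibration points any estimate should be treated with caution.

```r
# repeat the data from Listing 1 so the example runs on its own
dark <- 6.0e-3
blank <- 0.281
voltage <- c(26.2e-3, 26.8e-3, 34.0e-3, 99.9e-3, 196.0e-3)
conc <- c(1.0, (3/10)^(1:4))
DF <- data.frame(Concentration = conc, Voltage = voltage)
DF$Absorbance <- -log((DF$Voltage - dark) / (blank - dark))

fit2 <- lm(Absorbance ~ Concentration, data = DF[3:5, ])
cf <- coef(fit2)   # intercept and slope, same values as in Listing 2

A_unknown <- 1.0   # hypothetical absorbance reading
conc_unknown <- (A_unknown - cf[["(Intercept)"]]) / cf[["Concentration"]]
conc_unknown       # relative concentration, on the same scale as `conc`
```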

References

Kvittingen, Eivind V, Lise Kvittingen, Thor Bernt Melo, Birte Johanne Sjursnes, and Richard Verley. 2017. “Demonstrating Basic Properties of Spectroscopy Using a Self-Constructed Combined Fluorimeter and UV-Photometer.” Journal of Chemical Education 94: 1486–91.

McClain, Robert L. 2014. “Construction of a Photometer as an Instructional Tool for Electronics and Instrumentation.” Journal of Chemical Education 91: 747–50.

Footnotes

I have not tested the fluorescence measurement as other projects are calling me. In addition to changing the position of the photodiode, a few resistors may need to be changed in order to achieve sufficient signal.↩︎

Reuse

CC BY 4.0

Citation

BibTeX citation:

@online{hanson2023,
  author = {Hanson, Bryan},
  title = {Home {Built} {Photometer}},
  date = {2023-07-16},
  url = {http://chemospec.org/posts/2023-07-16-Photometer/Photometer.html},
  langid = {en}
}

For attribution, please cite this work as:

Hanson, Bryan. 2023. “Home Built Photometer.” July 16, 2023. http://chemospec.org/posts/2023-07-16-Photometer/Photometer.html.

DIY NMR in Earth’s Field

Bryan Hanson — Mon, 12 Jun 2023 07:00:00 GMT

I have always loved every aspect of NMR. My first introduction was as an undergraduate at Cal State Los Angeles, where I was introduced to a Bruker instrument that used a folded punched tape to read in its operating system. Fortunately, that machine was already quite old and there was a Varian EM-360 which was the workhorse for routine spectra (bonus points if you can guess roughly what year this was!). Besides the extremely broad usefulness of NMR instruments, the combination of physics, chemistry, computer science and electronics that undergirds the practical aspects of NMR is endlessly fascinating to me.

The development of simple, home-built NMR instruments over the past two decades is very interesting and appealing. These instruments typically don’t have a magnet, but rather use the earth’s magnetic field and some type of polarization process to improve sensitivity. Most of these instruments use an inexpensive microprocessor like an Arduino or Raspberry Pi to control the instrument, along with some purpose-built electronic circuits. Good examples are the work of Michal (Michal (2010), Michal (2020)), Trevelyan (Manley (2019)) and Bryden (Bryden et al. (2021)). These instruments of course aren’t able to give the same results as higher-field instruments with superconducting magnets or Halbach arrays. What can you do with these instruments? Because earth’s magnetic field is very homogeneous locally, the line widths are very narrow, and thus coupling constants can be measured.¹ However, the chemical shift range is really small, so structural studies are out. Sensitivity is relatively poor as well. Imaging (MRI) is in principle possible. By the way, there are also examples of DIY Nuclear Quadrupole Resonance (NQR) instruments, which require no magnetic field (Hiblot et al. (2008)).

Recently, a simpler DIY NMR instrument was published as a Hackaday project by Andy Nichol. This “Nuclear Magnetic Resonance for Everybody” project is unique due to its use of only off-the-shelf, commercially available hardware components. Because the hydrogen Larmor precession frequency in earth’s magnetic field is in the audio range, the project uses a standard and readily available audio amplifier to simplify the signal detection process. In addition, the complexities of pulse programming are avoided in this project by using a mechanical switch to switch between polarization and detection modes. Finally, a single coil is employed for both polarization and detection. Signal processing is handled by readily available software.
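To see why an audio amplifier is suitable, the proton Larmor frequency can be estimated from $f = (\gamma / 2\pi) B$. The sketch below uses the proton value $\gamma / 2\pi \approx 42.577$ MHz/T; the field strengths are typical round numbers for earth's field at the surface, not measurements from the project.

```r
gamma_bar <- 42.577e6   # proton gamma / (2 * pi), in Hz per tesla
# ASSUMPTION: representative values for earth's field, in tesla
B_earth <- c(low = 25e-6, mid = 50e-6, high = 65e-6)
larmor_hz <- gamma_bar * B_earth
round(larmor_hz)
```

These frequencies, roughly 1 to 3 kHz, sit well inside the range of human hearing (about 20 Hz to 20 kHz).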

This is an interesting project and it is the most basic entry point into DIY NMR that I have encountered. If it whets your appetite, the project can be made progressively more sophisticated by selectively bringing in the more advanced features of some of the other designs.

References

Bryden, Nicolas, Michael Antonacci, Michele Kelley, and Rosa T. Branca. 2021. “An Open-Source, Low-Cost NMR Spectrometer Operating in the mT Field Regime.” Journal of Magnetic Resonance 332. https://doi.org/10.1016/j.jmr.2021.107076.

Hiblot, Nicolas, Benoit Cordier, Maude Ferrari, Alain Retournard, Denis Grandclaude, Jerome Bedet, Sebastien Leclerc, and Daniel Canet. 2008. “A Fully Homemade Quadrupole Resonance Spectrometer.” Comptes Rendus Chimie 11: 568–79. https://doi.org/10.1016/j.crci.2007.08.011.

Manley, S. W. 2019. MRI and NMR Spectroscopy in the Earth’s Field.

Michal, Carl T. 2010. “A Low-Cost Spectrometer for NMR Measurements in the Earth’s Magnetic Field.” Meas. Sci. Technol. 21. https://doi.org/10.1088/0957-0233/21/10/105902.

———. 2020. “Low-Cost Low-Field NMR and MRI: Instrumentation and Applications.” Journal of Magnetic Resonance 319. https://doi.org/10.1016/j.jmr.2020.106800.

Footnotes

Locally homogeneous provided you are away from buildings, electrical transmission lines etc.↩︎

Reuse

CC BY 4.0

Citation

BibTeX citation:

@online{hanson2023,
  author = {Hanson, Bryan},
  title = {DIY {NMR} in {Earth’s} {Field}},
  date = {2023-06-12},
  url = {http://chemospec.org/posts/2023-06-12-DIY-NMR/DIY-NMR.html},
  langid = {en}
}

For attribution, please cite this work as:

Hanson, Bryan. 2023. “DIY NMR in Earth’s Field.” June 12, 2023. http://chemospec.org/posts/2023-06-12-DIY-NMR/DIY-NMR.html.

You Can Now Subscribe

Bryan Hanson — Mon, 07 Nov 2022 07:00:00 GMT

Just a short post to let readers know that you can now subscribe to this blog. Of course, you have always been able to get the RSS feed via the buttons in the navbar, but now you can submit your e-mail and my cheap-ass free level MailChimp account will let you know when there is a new post. You might find this useful if you are leaving Twitter or Twitter collapses completely!

Reuse

CC BY 4.0

Citation

BibTeX citation:

@online{hanson2022,
  author = {Hanson, Bryan},
  title = {You {Can} {Now} {Subscribe}},
  date = {2022-11-07},
  url = {http://chemospec.org/posts/2022-11-07-Announce-Subscribe/Announce-Subscribe.html},
  langid = {en}
}

For attribution, please cite this work as:

Hanson, Bryan. 2022. “You Can Now Subscribe.” November 7, 2022. http://chemospec.org/posts/2022-11-07-Announce-Subscribe/Announce-Subscribe.html.

Bryan Hanson — Wed, 05 Oct 2022 07:00:00 GMT

Notes on Linear Algebra Part 4

Bryan Hanson — Mon, 26 Sep 2022 07:00:00 GMT

Series: Part 1 Part 2 Part 3

Back in Part 2 I mentioned some of the challenges of learning linear algebra. One of those challenges is making sense of all the special types of matrices one encounters. In this post I hope to shed a little light on that topic.

A Taxonomy of Matrices

I am strongly drawn to thinking in terms of categories and relationships. I find visual presentations like phylogenies showing the relationships between species very useful. In the course of my linear algebra journey, I came across an interesting Venn diagram developed by the very creative thinker Kenji Hiranabe. The diagram is discussed at Matrix World, but the latest version is at the Github link. A Venn diagram is a useful format, but I was inspired to recast the information in a different format. Figure 1 shows a taxonomy I created using a portion of the information in Hiranabe’s Venn diagram.¹ The taxonomy is primarily organized around what I am calling the structure of a matrix: what does it look like upon visual inspection? Of course this is most obvious with small matrices. To me at least, structure is one of the most obvious characteristics of a matrix: an upper triangular matrix really stands out for instance. Secondarily, the taxonomy includes a number of queries that one can ask about a matrix: for instance, is the matrix invertible? We’ll need to expand on all of this of course, but first take a look at the figure.²

flowchart TD
A(all matrices<br/>n x m) --> C(row matrices<br/>1 x n)
A --> D(column matrices<br/>n x 1)
A ---> B(square matrices<br/>n x n)
B --> E(upper triangular<br/>matrices)
B --> F(lower triangular<br/>matrices)
B --> G{{either:<br/>is singular?}}
B --> H{{or:<br/>is invertible?}}
H --> I{{is diagonalizable?}}
I --> J{{is normal?}}
J --> K(symmetric)
K --> L(diagonal)
L --> M(identity)
J --> N{{is orthogonal?}}
N --> M
style G fill:#FFF0F5
style H fill:#FFF0F5
style I fill:#FFF0F5
style J fill:#FFF0F5
style N fill:#FFF0F5

Figure 1: Hierarchical relationships between different types of matrices. Blue Rectangles denote matrices with particular, recognizable structures. Pink Hexagons indicate properties that can be queried.

Touring the Taxonomy

Structure Examples

Let’s use R to construct and inspect examples of each type of matrix. We’ll use integer matrices to keep the print output nice and neat, but of course real numbers could be used as well.³ Most of these are pretty straightforward so we’ll keep comments to a minimum for the simple cases.

Rectangular Matrix

A_rect <- matrix(1:12, nrow = 3) # if you give nrow,
A_rect # R will compute ncol from the length of the data

     [,1] [,2] [,3] [,4]
[1,]    1    4    7   10
[2,]    2    5    8   11
[3,]    3    6    9   12

Notice that R is “column major” meaning data fills the first column, then the second column and so forth.
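If row-wise filling is wanted instead, `matrix()` has a `byrow` argument. A quick illustration (the name `A_byrow` is just for this example):

```r
A_byrow <- matrix(1:12, nrow = 3, byrow = TRUE) # fill along rows instead
A_byrow
# the same matrix via column-major filling followed by a transpose
identical(A_byrow, t(matrix(1:12, ncol = 3)))
```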

Row Matrix/Vector

A_row <- matrix(1:4, nrow = 1)
A_row

     [,1] [,2] [,3] [,4]
[1,]    1    2    3    4

Column Matrix/Vector

A_col <- matrix(1:4, ncol = 1)
A_col

     [,1]
[1,]    1
[2,]    2
[3,]    3
[4,]    4

Keep in mind that to save space in a text-dense document one would often write A_col as its transpose.⁴

Square Matrix

A_sq <- matrix(1:9, nrow = 3)
A_sq

     [,1] [,2] [,3]
[1,]    1    4    7
[2,]    2    5    8
[3,]    3    6    9

Upper and Lower Triangular Matrices

Creating an upper triangular matrix requires a few more steps. Function upper.tri() returns a logical matrix which can be used as a mask to select entries. Function lower.tri() can be used similarly. Both functions have an argument diag = TRUE/FALSE indicating whether to include the diagonal.⁵

upper.tri(A_sq, diag = TRUE)

      [,1]  [,2] [,3]
[1,]  TRUE  TRUE TRUE
[2,] FALSE  TRUE TRUE
[3,] FALSE FALSE TRUE

A_upper <- A_sq[upper.tri(A_sq)] # upper.tri() gives a logical matrix, used here to subset
A_upper # notice that a vector is returned, not quite what might have been expected!

[1] 4 7 8

A_upper <- A_sq # instead, create a copy to be modified
A_upper[lower.tri(A_upper)] <- 0L # assign the lower entries to zero
A_upper

     [,1] [,2] [,3]
[1,]    1    4    7
[2,]    0    5    8
[3,]    0    0    9

Notice that to create an upper triangular matrix we use lower.tri() to assign zeros to the lower part of an existing matrix.

Identity Matrix

If you give diag() a single value it defines the dimensions and creates a matrix with ones on the diagonal, in other words, an identity matrix.

A_ident <- diag(4)
A_ident

     [,1] [,2] [,3] [,4]
[1,]    1    0    0    0
[2,]    0    1    0    0
[3,]    0    0    1    0
[4,]    0    0    0    1

Diagonal Matrix

If instead you give diag() a vector of values these go on the diagonal and the length of the vector determines the dimensions.

A_diag <- diag(1:4)
A_diag

     [,1] [,2] [,3] [,4]
[1,]    1    0    0    0
[2,]    0    2    0    0
[3,]    0    0    3    0
[4,]    0    0    0    4

Symmetric Matrices

Matrices created by diag() are symmetric matrices, but any matrix where $\mathbf{A} = \mathbf{A}^{\mathsf{T}}$ is symmetric. There is no general function to create symmetric matrices since there is no way to know what data should be used. However, one can ask if a matrix is symmetric, using the function isSymmetric().

isSymmetric(A_diag)

[1] TRUE
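Although there is no dedicated constructor, a symmetric matrix can be built from any square matrix. Two common tricks are sketched here; the names `S1` and `S2` are only for illustration.

```r
A_sq <- matrix(1:9, nrow = 3)
S1 <- A_sq + t(A_sq)      # a matrix plus its transpose is symmetric
S2 <- crossprod(A_sq)     # t(A_sq) %*% A_sq is symmetric as well
isSymmetric(A_sq)         # the starting matrix is not symmetric
isSymmetric(S1)
isSymmetric(S2)
```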

The Queries

Let’s take the queries in the taxonomy in order, as the hierarchy is everything.

Is the Matrix Singular or Invertible?

A singular matrix is one in which one or more rows are multiples of another row (or, more generally, linear combinations of other rows), or alternatively, the same is true of the columns. Why do we care? Well, it turns out a singular matrix is a bit of a dead end, you can’t do much with it. An invertible matrix, however, is a very useful entity and has many applications. What is an invertible matrix? In simple terms, being invertible means the matrix has an inverse. This is not the same as the algebraic definition of an inverse, which is related to division:

$$x^{-1} = \frac{1}{x} \tag{1}$$

Instead, for matrices, invertibility of $\mathbf{A}$ is defined as the existence of another matrix $\mathbf{B}$ such that

$$\mathbf{A}\mathbf{B} = \mathbf{B}\mathbf{A} = \mathbf{I} \tag{2}$$

Just as $x^{-1}$ cancels out $x$ in $x^{-1}x = 1$, $\mathbf{B}$ cancels out $\mathbf{A}$ to give the identity matrix. In other words, $\mathbf{B}$ is really $\mathbf{A}^{-1}$.

A singular matrix has determinant of zero. On the other hand, an invertible matrix has a non-zero determinant. So to determine which type of matrix we have before us, we can simply compute the determinant.

Let’s look at a few simple examples.

A_singular <- matrix(c(1, -2, -3, 6), nrow = 2, ncol = 2)
A_singular # notice that col 2 is col 1 * -3, they are not independent

     [,1] [,2]
[1,]    1   -3
[2,]   -2    6

det(A_singular)

[1] 0

A_invertible <- matrix(c(2, 2, 7, 8), nrow = 2, ncol = 2)
A_invertible

     [,1] [,2]
[1,]    2    7
[2,]    2    8

det(A_invertible)

[1] 2
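The determinant test can be cross-checked by actually asking for the inverse with `solve()`. This sketch repeats the two matrices from above so that it runs on its own; `inv_or_null()` is a hypothetical helper written for this example.

```r
A_singular <- matrix(c(1, -2, -3, 6), nrow = 2, ncol = 2)
A_invertible <- matrix(c(2, 2, 7, 8), nrow = 2, ncol = 2)

# return the inverse, or NULL if solve() reports a singular matrix
inv_or_null <- function(M) tryCatch(solve(M), error = function(e) NULL)

inv_A <- inv_or_null(A_invertible)
inv_A %*% A_invertible            # the identity matrix, up to rounding
is.null(inv_or_null(A_singular))  # TRUE: solve() fails, no inverse exists
```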

Is the Matrix Diagonalizable?

A matrix $\mathbf{A}$ that is diagonalizable can be expressed as:

$$\mathbf{A} = \mathbf{P}\mathbf{D}\mathbf{P}^{-1} \tag{3}$$

where $\mathbf{D}$ is a diagonal matrix – the diagonalized version of the original matrix $\mathbf{A}$. How do we find out if this is possible, and if possible, what are the values of $\mathbf{P}$ and $\mathbf{D}$? The answer is to decompose $\mathbf{A}$ using the eigendecomposition:

$$\mathbf{A} = \mathbf{V}\mathbf{\Lambda}\mathbf{V}^{-1} \tag{4}$$

Now there is a lot to know about the eigendecomposition, but for now let’s just focus on a few key points:

The columns of $\mathbf{V}$ contain the eigenvectors. Eigenvectors are the most natural basis for describing the data in $\mathbf{A}$.⁶
$\mathbf{\Lambda}$ is a diagonal matrix with the eigenvalues on the diagonal, in descending order. The individual eigenvalues are typically denoted $\lambda_i$.
Eigenvectors and eigenvalues always come in pairs.

We can answer the original question by using the eigen() function in R. Let’s do an example.

A_eigen <- matrix(c(1, 0, 2, 2, 3, -4, 0, 0, 2), ncol = 3)
A_eigen

     [,1] [,2] [,3]
[1,]    1    2    0
[2,]    0    3    0
[3,]    2   -4    2

eA <- eigen(A_eigen)
eA

eigen() decomposition
$values
[1] 3 2 1

$vectors
           [,1] [,2]       [,3]
[1,]  0.4082483    0  0.4472136
[2,]  0.4082483    0  0.0000000
[3,] -0.8164966    1 -0.8944272

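Since eigen(A_eigen) returned three distinct eigenvalues, and therefore three linearly independent eigenvectors, we can conclude that A_eigen is diagonalizable. (Note that eigen() running without error is not by itself proof: it also returns a result for non-diagonalizable matrices.) You can see the eigenvalues and eigenvectors in the returned value. We can reconstruct A_eigen using Equation 4: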

eA$vectors %*% diag(eA$values) %*% solve(eA$vectors)

     [,1] [,2] [,3]
[1,]    1    2    0
[2,]    0    3    0
[3,]    2   -4    2

Remember, diag() creates a matrix with the values along the diagonal, and solve() computes the inverse when it gets only one argument.

The only loose end is which matrices are not diagonalizable? These are covered in this Wikipedia article. Briefly, most non-diagonalizable matrices are fairly exotic and real data sets will likely not be a problem.
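As a small illustration of a non-diagonalizable matrix, consider a 2 x 2 shear matrix. `eigen()` still returns a result, but the two eigenvectors are (numerically) parallel, so the matrix of eigenvectors cannot be inverted and the eigendecomposition cannot be completed. The name `A_defective` and the rank check via `qr()` are choices made for this sketch.

```r
A_defective <- matrix(c(1, 0, 1, 1), nrow = 2) # rows are (1, 1) and (0, 1)
eD <- eigen(A_defective)
eD$values                 # the eigenvalue 1 is repeated
qr(eD$vectors)$rank       # rank below 2: the eigenvectors are not independent
```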

Nuances About the Presentation of “Eigenstuff”

In texts, eigenvalues and eigenvectors are universally introduced as a scaling relationship

$$\mathbf{A}\mathbf{v} = \lambda\mathbf{v} \tag{5}$$

where $\mathbf{v}$ is a column eigenvector and $\lambda$ is a scalar eigenvalue. One says “$\mathbf{A}$ scales $\mathbf{v}$ by a factor of $\lambda$.” A single vector is used as one can readily illustrate how that vector grows or shrinks in length when multiplied by $\mathbf{A}$. Let’s call this the “bottom up” explanation.

Let’s check that Equation 5 is true using our values from above by extracting the first eigenvector and eigenvalue from eA. Notice that we are using regular multiplication on the right-hand-side, i.e. *, rather than %*%, because eA$values[1] is a scalar. Also on the right-hand-side, we have to add drop = FALSE to the subsetting process or the result is no longer a matrix.⁷

isTRUE(all.equal(
  A_eigen %*% eA$vectors[,1],
  eA$values[1] * eA$vectors[,1, drop = FALSE]))

[1] TRUE

If instead we start from Equation 4 and rearrange it to show the relationship between $\mathbf{A}$ and $\mathbf{\Lambda}$ we get:

$$\mathbf{A}\mathbf{V} = \mathbf{V}\mathbf{\Lambda} \tag{6}$$

Let’s call this the “top down” explanation. We can verify this as well, making sure to convert eA$values to a diagonal matrix as the values are stored as a vector to save space.

isTRUE(all.equal(A_eigen %*% eA$vectors, eA$vectors %*% diag(eA$values)))

[1] TRUE

Notice that in Equation 6 $\mathbf{\Lambda}$ is on the right of $\mathbf{V}$, but in Equation 5 the corresponding value, $\lambda$, is to the left of $\mathbf{v}$. This is a bit confusing until one realizes that Equation 5 could have been written

$$\mathbf{A}\mathbf{v} = \mathbf{v}\lambda \tag{7}$$

since $\lambda$ is a scalar. It’s too bad that the usual, bottom up, presentation seems to conflict with the top down approach. Perhaps the choice in Equation 5 is a historical artifact.

Is the Matrix Normal?

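A normal matrix is one where $\mathbf{A}^{\mathsf{T}}\mathbf{A} = \mathbf{A}\mathbf{A}^{\mathsf{T}}$. As far as I know, there is no function in R to check this condition, but we’ll write our own in a moment. One reason being “normal” is interesting is if $\mathbf{A}$ is a normal matrix, then the results of the eigendecomposition change slightly:

$$\mathbf{A} = \mathbf{O}\mathbf{\Lambda}\mathbf{O}^{\mathsf{T}} \tag{8}$$

where $\mathbf{O}$ is an orthogonal matrix, which we’ll talk about next.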

Is the Matrix Orthogonal?

An orthogonal matrix takes the definition of a normal matrix one step further: $\mathbf{A}^{\mathsf{T}}\mathbf{A} = \mathbf{A}\mathbf{A}^{\mathsf{T}} = \mathbf{I}$. If a matrix is orthogonal, then its transpose is equal to its inverse: $\mathbf{A}^{\mathsf{T}} = \mathbf{A}^{-1}$, which of course makes any special computation of the inverse unnecessary. This is a significant advantage in computations.

To aid our learning, let’s write a simple function that will report if a matrix is normal, orthogonal, or neither.⁸

normal_or_orthogonal <- function(M) {
  if (!inherits(M, "matrix")) stop("M must be a matrix")
  norm <- orthog <- FALSE
  tst1 <- M %*% t(M)
  tst2 <- t(M) %*% M
  norm <- isTRUE(all.equal(tst1, tst2))
  if (norm) orthog <- isTRUE(all.equal(tst1, diag(dim(M)[1])))
  if (orthog) message("This matrix is orthogonal\n") else 
    if (norm) message("This matrix is normal\n") else
    message("This matrix is neither orthogonal nor normal\n")
  invisible(NULL)
}

And let’s run a couple of tests.

normal_or_orthogonal(A_singular)

This matrix is neither orthogonal nor normal

Norm <- matrix(c(1, 0, 1, 1, 1, 0, 0, 1, 1), nrow = 3)
normal_or_orthogonal(Norm)

This matrix is normal

normal_or_orthogonal(diag(3)) # the identity matrix is orthogonal

This matrix is orthogonal

Orth <- matrix(c(0, 0, 1, 0, 0, 0, 0, 1, 0, 1, 0, 0, 1, 0, 0, 0), nrow = 4)
normal_or_orthogonal(Orth)

This matrix is orthogonal

Some other properties of an orthogonal matrix

The columns of an orthogonal matrix are orthogonal to each other. We can show this by taking the dot product between any pair of columns. Remember, if the dot product is zero the vectors are orthogonal.

t(Orth[,1]) %*% Orth[,2] # col 1 dot col 2

     [,1]
[1,]    0

t(Orth[,1]) %*% Orth[,3] # col 1 dot col 3

     [,1]
[1,]    0

Finally, not only are the columns orthogonal, but each column vector has length one, making them orthonormal.

sqrt(sum(Orth[,1]^2))

[1] 1
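The claim that the transpose of an orthogonal matrix equals its inverse can be checked directly. The sketch redefines `Orth` so that it is self-contained.

```r
Orth <- matrix(c(0, 0, 1, 0, 0, 0, 0, 1, 0, 1, 0, 0, 1, 0, 0, 0), nrow = 4)
all.equal(t(Orth), solve(Orth))     # transpose and inverse agree
all.equal(crossprod(Orth), diag(4)) # t(Orth) %*% Orth is the identity
```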

Appreciating the Queries

Taking these queries together, we see that symmetric and diagonal matrices are necessarily diagonalizable and normal (they are also invertible, provided none of their eigenvalues is zero). They are not, however, necessarily orthogonal. Identity matrices, however, have all these properties. Let’s double-check these statements.

A_sym <- matrix(
  c(1, 5, 4, 5, 2, 9, 4, 9, 3),
  ncol = 3) # symmetric matrix, not diagonal
A_sym

     [,1] [,2] [,3]
[1,]    1    5    4
[2,]    5    2    9
[3,]    4    9    3

normal_or_orthogonal(A_sym)

This matrix is normal

normal_or_orthogonal(diag(1:3)) # diagonal matrix, symmetric, but not the identity matrix

This matrix is normal

normal_or_orthogonal(diag(3)) # identity matrix (also symmetric, diagonal)

This matrix is orthogonal

So what’s the value of these queries? As mentioned, they help us understand the relationships between different types of matrices, so they help us learn more deeply. On a practical computational level they may not have much value, especially when dealing with real-world data sets. However, there are some other interesting aspects of these queries that deal with decompositions and eigenvalues. We might cover these in the future.

An Emerging Theme?

A more personal thought: In the course of writing these posts, and learning more linear algebra, it increasingly seems to me that a lot of the “effort” that goes into linear algebra is about making tedious operations simpler. Anytime one can have more zeros in a matrix, or have orthogonal vectors, or break a matrix into parts, the simpler things become. However, I haven’t really seen this point driven home in texts or tutorials. I think linear algebra learners would do well to keep this in mind.

Annotated Bibliography

These are the main sources I relied on for this post.

The No Bullshit Guide to Linear Algebra by Ivan Savov.
- Section 6.2 Special types of matrices
- Section 6.6 Eigendecomposition
Linear Algebra: step by step by Kuldeep Singh, Oxford University Press, 2014.
- Section 4.4 Orthogonal Matrices
- Section 7.3.2 Diagonalization
- Section 7.4 Diagonalization of Symmetric Matrices
Wikipedia articles on the types of matrices.

Footnotes

I’m only using a portion because Hiranabe’s original contains a bit too much information for someone trying to get their footing in the field.↩︎
I’m using the term taxonomy a little loosely of course, you can call it whatever you want. The name is not so important really, what is important is the hierarchy of concepts.↩︎
As could complex numbers.↩︎
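Usually in written text a row matrix, sometimes called a row vector, is written as $\mathbf{x} = \begin{bmatrix} x_1 & x_2 & x_3 \end{bmatrix}$. In order to save space in documents, rather than writing $\mathbf{x} = \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix}$, a column matrix/vector can be kept to a single line by writing it as its transpose: $\mathbf{x}^{\mathsf{T}} = \begin{bmatrix} x_1 & x_2 & x_3 \end{bmatrix}$, but this requires a little mental gymnastics to visualize.↩︎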
Upper and lower triangular matrices play a special role in linear algebra. Because of the presence of many zeros, multiplying them and inverting them is relatively easy, because the zeros cause terms to drop out.↩︎
This idea of the “most natural basis” is most easily visualized in two dimensions. If you have some data plotted on $x$ and $y$ axes, determining the line of best fit is one way of finding the most natural basis for describing the data. However, more generally and in more dimensions, principal component analysis (PCA) is the most rigorous way of finding this natural basis, and PCA can be calculated with the eigen() function. Lots more information here.↩︎
The drop argument to subsetting/extracting defaults to TRUE which means that if subsetting reduces the necessary number of dimensions, the unneeded dimension attributes are dropped. Under the default, selecting a single column of a matrix leads to a vector, not a one-column matrix. In this all.equal() expression we need both sides to evaluate to a matrix.↩︎
One might ask why R does not provide a user-facing version of such a function. I think a good argument can be made that the authors of R passed down a robust and lean set of linear algebra functions, geared toward getting work done, and throwing errors as necessary.↩︎

Reuse

CC BY 4.0

Citation

BibTeX citation:

@online{hanson2022,
  author = {Hanson, Bryan},
  title = {Notes on {Linear} {Algebra} {Part} 4},
  date = {2022-09-26},
  url = {http://chemospec.org/posts/2022-11-07-Announce-Subscribe/2022-09-26-Linear-Alg-Notes-Pt4/Linear-Alg-Notes-Pt4.html},
  langid = {en}
}

For attribution, please cite this work as:

Hanson, Bryan. 2022. “Notes on Linear Algebra Part 4.” September 26, 2022. http://chemospec.org/posts/2022-11-07-Announce-Subscribe/2022-09-26-Linear-Alg-Notes-Pt4/Linear-Alg-Notes-Pt4.html.

Notes on Linear Algebra Part 3

Bryan Hanson — Sat, 10 Sep 2022 07:00:00 GMT

Series: Part 1 Part 2

Update 19 September 2022: in “Use of outer() for Matrix Multiplication”, corrected use of “cross” to be “outer” and added example in R. Also added links to work by Hiranabe.

This post is a survey of the linear algebra-related functions from base R. Some of these I’ve discussed in other posts and some I may discuss in the future, but this post is primarily an inventory: these are the key tools we have available. “Notes” in the table are taken from the help files.

Matrices, including row and column vectors, will be shown in bold e.g. $\mathbf{A}$ or $\mathbf{a}$ while scalars and variables will be shown in script, e.g. $n$. R code will appear like x <- y.

In the table, $\mathbf{R}$ or $\mathbf{U}$ is an upper/right triangular matrix. $\mathbf{L}$ is a lower/left triangular matrix (triangular matrices are square). $\mathbf{A}$ is a generic matrix of dimensions $m \times n$. $\mathbf{M}$ is a square matrix of dimensions $n \times n$.

Function	Uses	Notes
operators
`*`	scalar multiplication
`%*%`	matrix multiplication	two vectors give the dot product; vector + matrix gives the cross product (vector will be promoted as needed)¹
basic functions
`t()`	transpose	interchange rows and columns
`crossprod()`	matrix multiplication	faster version of `t(A) %*% A`
`tcrossprod()`	matrix multiplication	faster version of `A %*% t(A)`
`outer()`	outer product & more	see discussion below
`det()`	computes determinant	uses the LU decomposition; determinant is a volume
`isSymmetric()`	name says it all
`Conj()`	computes complex conjugate
decompositions
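`backsolve()`	solves $\mathbf{R}\mathbf{x} = \mathbf{b}$
`forwardsolve()`	solves $\mathbf{L}\mathbf{x} = \mathbf{b}$
`solve()`	solves $\mathbf{M}\mathbf{x} = \mathbf{b}$ and $\mathbf{M}^{-1}$	e.g. linear systems; if given only one matrix returns the inverse
`qr()`	solves $\mathbf{A} = \mathbf{Q}\mathbf{R}$	$\mathbf{Q}$ is an orthogonal matrix; can be used to solve $\mathbf{A}\mathbf{x} = \mathbf{b}$; see `?qr` for several `qr.*` extractor functions
`chol()`	solves $\mathbf{M} = \mathbf{L}\mathbf{L}^{\mathsf{T}} = \mathbf{U}^{\mathsf{T}}\mathbf{U}$	Only applies to positive semi-definite matrices (where $\lambda \ge 0$); related to LU decomposition
`chol2inv()`	computes $\mathbf{M}^{-1}$ from the results of `chol(M)`
`svd()`	singular value decomposition	input $\mathbf{A}$ ($m \times n$); can compute PCA; details
`eigen()`	eigen decomposition	requires $\mathbf{M}$ ($n \times n$); can compute PCA; details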

One thing to notice is that there is no LU decomposition in base R. It is apparently used “under the hood” in solve() and there are versions available in contributed packages.²
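Several entries in the table fit together nicely. The following sketch builds a small positive definite matrix and uses `chol()`, `forwardsolve()`, `backsolve()` and `chol2inv()` on it, checking each result against `solve()`. The matrix and the names `M`, `b`, `U`, `x` and `y` are invented for illustration.

```r
# ASSUMPTION: an arbitrary example matrix; crossprod() makes it positive definite
M <- crossprod(matrix(c(2, 1, 0, 1, 3, 1, 0, 1, 4), nrow = 3))
b <- c(1, 2, 3)

U <- chol(M)                       # upper triangular factor, t(U) %*% U equals M
all.equal(crossprod(U), M)

# solve M x = b in two triangular steps
y <- forwardsolve(t(U), b)         # solves t(U) y = b (lower triangular system)
x <- backsolve(U, y)               # solves U x = y (upper triangular system)
all.equal(x, solve(M, b))

all.equal(chol2inv(U), solve(M))   # inverse recovered from the Cholesky factor
```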

What is the use of outer()?

As seen in Part 1 calling outer() on two vectors does indeed give the cross product (technically corresponding to tcrossprod()). This works because the defaults carry out multiplication.³ However, looking through the R source code for uses of outer(), the function should really be thought of in simple terms as creating all possible combinations of the two inputs. In that way it is similar to expand.grid(). Here are two illustrations of the flexibility of outer():

# generate a grid of x,y values modified by a function
# from ?colorRamp
m <- outer(1:20, 1:20, function(x,y) sin(sqrt(x*y)/3))
str(m)

 num [1:20, 1:20] 0.327 0.454 0.546 0.618 0.678 ...

# generate all combinations of month and year
# modified from ?outer; any function accepting 2 args can be used
outer(month.abb, 2000:2002, FUN = paste)

      [,1]       [,2]       [,3]      
 [1,] "Jan 2000" "Jan 2001" "Jan 2002"
 [2,] "Feb 2000" "Feb 2001" "Feb 2002"
 [3,] "Mar 2000" "Mar 2001" "Mar 2002"
 [4,] "Apr 2000" "Apr 2001" "Apr 2002"
 [5,] "May 2000" "May 2001" "May 2002"
 [6,] "Jun 2000" "Jun 2001" "Jun 2002"
 [7,] "Jul 2000" "Jul 2001" "Jul 2002"
 [8,] "Aug 2000" "Aug 2001" "Aug 2002"
 [9,] "Sep 2000" "Sep 2001" "Sep 2002"
[10,] "Oct 2000" "Oct 2001" "Oct 2002"
[11,] "Nov 2000" "Nov 2001" "Nov 2002"
[12,] "Dec 2000" "Dec 2001" "Dec 2002"

Bottom line: outer() can be used for linear algebra but its main uses lie elsewhere. You don’t need it for linear algebra!

Using outer() for matrix multiplication

Here’s an interesting connection discussed in this Wikipedia entry. In Part 1 we demonstrated how the repeated application of the dot product underpins matrix multiplication. The first row of the first matrix is multiplied element-wise by the first column of the second matrix, shown in red, to give the first element of the answer matrix. This process is then repeated so that every row (first matrix) has been multiplied by every column (second matrix).

If instead, we treat the first column of the first matrix as a column vector and outer multiply it by the first row of the second matrix as a row vector, we get the following matrix:

Now if you repeat this process for the second column of the first matrix and the second row of the second matrix, you get another matrix. And if you do it one more time using the third column/third row, you get a third matrix. If you then add these three matrices together, you get the product matrix seen in Equation 1. Notice how each element of the product in Equation 1 is a sum of three terms? Each of those terms comes from one of the three matrices just described.

To sum up, one can use the dot product on each row (first matrix) by each column (second matrix) to get the answer, or you can use the outer product on the columns sequentially (first matrix) by rows sequentially (second matrix) to get several matrices, which one then sums to get the answer. It’s pretty clear which option is less work and easier to follow, but I think it’s an interesting connection between operations. The first case corresponds to view “MM1” in The Art of Linear Algebra while the second case is view “MM4”. See this work by Kenji Hiranabe.

Here’s a simple demonstration in R.

M1 <- matrix(1:6, nrow = 3, byrow = TRUE)
M1

     [,1] [,2]
[1,]    1    2
[2,]    3    4
[3,]    5    6

M2 <- matrix(7:10, nrow = 2, byrow = TRUE)
M2

     [,1] [,2]
[1,]    7    8
[2,]    9   10

tst1 <- M1 %*% M2 # uses dot product
# next line is sum of sequential outer products:
# 1st col M1 by 1st row M2 + 2nd col M1 by 2nd row M2
tst2 <- outer(M1[,1], M2[1,]) + outer(M1[,2], M2[2,])

all.equal(tst1, tst2)

[1] TRUE
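The same check can be written for matrices of any conformable size. The sketch below uses two made-up matrices (A is 4 x 3, B is 3 x 5; both names are just for illustration) and sums one outer product per column of A and matching row of B.

```r
# Illustrative matrices: A is 4 x 3, B is 3 x 5
A <- matrix(1:12, nrow = 4)
B <- matrix(seq(2, 30, by = 2), nrow = 3)

# One outer product per column of A paired with the matching row of B
terms <- lapply(seq_len(ncol(A)), function(k) outer(A[, k], B[k, ]))

# Adding the pieces together recovers the ordinary matrix product
sum_outer <- Reduce(`+`, terms)
all.equal(sum_outer, A %*% B)
```

Each element of terms is a full 4 x 5 matrix, which shows why this route is more work than the row-by-column view.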

Footnotes

For details see the discussion in Part 1.↩︎
Discussed in this Stackoverflow question, which also has an implementation.↩︎
In fact, for the default outer(), FUN = "*", outer() actually calls tcrossprod().↩︎

Reuse

CC BY 4.0

Citation

BibTeX citation:

@online{hanson2022,
  author = {Hanson, Bryan},
  title = {Notes on {Linear} {Algebra} {Part} 3},
  date = {2022-09-10},
  url = {http://chemospec.org/posts/2022-09-10-Linear-Alg-Notes-Pt3/Linear-Alg-Notes-Pt3.html},
  langid = {en}
}

For attribution, please cite this work as:

Hanson, Bryan. 2022. “Notes on Linear Algebra Part 3.” September 10, 2022. http://chemospec.org/posts/2022-09-10-Linear-Alg-Notes-Pt3/Linear-Alg-Notes-Pt3.html.

Notes on Linear Algebra Part 2

Bryan Hanson — Thu, 01 Sep 2022 07:00:00 GMT

TL;DR

Linear algebra is complex. We need a way to penetrate the thicket. Here’s one.
Linear systems of equations are at the heart, not surprisingly, of linear algebra.
A key application is linear regression, which has a matrix solution.
Solving the needed equations requires inverting a matrix.
Inverting a matrix is more easily done after decomposing the matrix into upper and lower triangular matrices.
The upper and lower triangular matrices are individually easy to invert, giving access to the inverse of the original matrix.
Changes in notations and symbols as you move between presentations add significantly to the cognitive burden in learning this material.

For Part 1 of this series, see here.

If you open a linear algebra text, it’s quickly apparent how complex the field is. There are so many special types of matrices, so many different decompositions of matrices. Why are all these needed? Should I care about null spaces? What’s really important? What are the threads that tie the different concepts together? As someone who is trying to improve their understanding of the field, especially with regard to its applications in chemometrics, it can be a tough slog.

In this post I’m going to try to demonstrate how some simple chemometric tasks can be solved using linear algebra. Though I cover some math here, the math is secondary right now – the conceptual connections are more important. I’m more interested in finding (and sharing) a path through the thicket of linear algebra. We can return as needed to expand the basic math concepts. The cognitive effort to work through the math details is likely a lot lower if we have a sense of the big picture.

In this post, matrices, including row and column vectors, will be shown in bold e.g. $\mathbf{A}$ while scalars and variables will be shown in script, e.g. $n$. Variables used in R code will appear like A.

Systems of Equations

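If you’ve had algebra, you have certainly run into a “system of equations” such as the following:

$$
\begin{aligned}
x + 2y - 3z &= 3 \\
2x - y - z &= 11 \\
3x + 2y + z &= -5
\end{aligned}
\tag{1}
$$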

In algebra, such systems can be solved several ways, for instance by isolating one or more variables and substituting, or geometrically (particularly for 2D systems, by plotting the lines and looking for the intersection). Once there are more than a few variables however, the only manageable way to solve them is with matrix operations, or more explicitly, linear algebra. This sort of problem is the core of linear algebra, and the reason the field is called linear algebra.

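To solve the system above using linear algebra, we have to write it in the form of matrices and column vectors:

$$
\begin{bmatrix} 1 & 2 & -3 \\ 2 & -1 & -1 \\ 3 & 2 & 1 \end{bmatrix}
\begin{bmatrix} x \\ y \\ z \end{bmatrix}
=
\begin{bmatrix} 3 \\ 11 \\ -5 \end{bmatrix}
\tag{2}
$$

or more generally

$$
\mathbf{A}\mathbf{x} = \mathbf{b}
\tag{3}
$$

where $\mathbf{A}$ is the matrix of coefficients, $\mathbf{x}$ is the column vector of variable names¹ and $\mathbf{b}$ is a column vector of constants. Notice that these matrices are conformable:²

$$
\underset{n \times n}{\mathbf{A}} \; \underset{n \times 1}{\mathbf{x}} = \underset{n \times 1}{\mathbf{b}}
\tag{4}
$$

To solve such a system, when we have $n$ unknowns, we need $n$ equations.³ This means that $\mathbf{A}$ has to be a square matrix, and square matrices play a special role in linear algebra. I’m not sure this point is always conveyed clearly when this material is introduced. In fact, many texts on linear algebra seem to bury the lede.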

To find the values of $\mathbf{x} = x, y, z$⁴, we can do a little rearranging following the rules of linear algebra and matrix operations. First we pre-multiply both sides by the inverse of $\mathbf{A}$, which then gives us the identity matrix $\mathbf{I}$, which drops out.⁵

$$
\begin{aligned}
\mathbf{A}^{-1}\mathbf{A}\mathbf{x} &= \mathbf{A}^{-1}\mathbf{b} \\
\mathbf{I}\mathbf{x} &= \mathbf{A}^{-1}\mathbf{b} \\
\mathbf{x} &= \mathbf{A}^{-1}\mathbf{b}
\end{aligned}
\tag{5}
$$

So it’s all sounding pretty simple, right? Ha. This is actually where things potentially break down. For this to work, $\mathbf{A}$ must be invertible, which is not always the case.⁶ If there is no inverse, then the system of equations either has no solution or infinitely many solutions. So finding the inverse of a matrix, or discovering it doesn’t exist, is essential to solving these systems of linear equations.⁷ More on this eventually, but for now, we know $\mathbf{A}$ must be a square matrix and we hope it is invertible.

A Key Application: Linear Regression

We learn in algebra that a line takes the form $y = mx + b$. If one has measurements in the form of $x, y$ pairs that one expects to fit to a line, we need linear regression. Carrying out a linear regression is arguably one of the most important, and certainly a very common, application of the linear systems described above. One can get the values of $m$ and $b$ by hand using algebra, but any computer will solve the system using a matrix approach.⁸ Consider this data:

$$
\begin{array}{c|ccccc}
x & 2.1 & 0.9 & 3.9 & 3.2 & 5.1 \\
\hline
y & 11.8 & 7.2 & 21.5 & 17.2 & 26.8
\end{array}
\tag{6}
$$

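To express this in a matrix form, we recast

$$
y = mx + b
\tag{7}
$$

into

$$
\mathbf{y} = \mathbf{X}\boldsymbol{\beta} + \boldsymbol{\epsilon}
\tag{8}
$$

where:

$\mathbf{y}$ is the column vector of $y$ values. That seems sensible.
$\mathbf{X}$ is a matrix composed of a column of ones plus a column of the $x$ values. This is called a design matrix. At least here $\mathbf{X}$ contains only $x$ values as variables.
$\boldsymbol{\beta}$ is a column vector of coefficients (including, as we will see, the values of $m$ and $b$ if we are thinking back to $y = mx + b$).
$\boldsymbol{\epsilon}$ is new: it is a column vector giving the errors at each point.

With our data above, this looks like:

$$
\begin{bmatrix} 11.8 \\ 7.2 \\ 21.5 \\ 17.2 \\ 26.8 \end{bmatrix}
=
\begin{bmatrix} 1 & 2.1 \\ 1 & 0.9 \\ 1 & 3.9 \\ 1 & 3.2 \\ 1 & 5.1 \end{bmatrix}
\begin{bmatrix} \beta_0 \\ \beta_1 \end{bmatrix}
+
\begin{bmatrix} \epsilon_1 \\ \epsilon_2 \\ \epsilon_3 \\ \epsilon_4 \\ \epsilon_5 \end{bmatrix}
\tag{9}
$$

If we multiply this out, each row works out to be an instance of $y_i = \beta_0 + \beta_1 x_i + \epsilon_i$. Hopefully you can appreciate that $\beta_1$ corresponds to $m$ and $\beta_0$ corresponds to $b$.⁹

This looks similar to $\mathbf{A}\mathbf{x} = \mathbf{b}$ seen in Equation 3, if you set $\mathbf{b}$ to $\mathbf{y}$, $\mathbf{A}$ to $\mathbf{X}$ and $\mathbf{x}$ to $\boldsymbol{\beta}$:

$$
\begin{aligned}
\mathbf{A}\mathbf{x} &= \mathbf{b} \\
\mathbf{X}\boldsymbol{\beta} &\approx \mathbf{y}
\end{aligned}
\tag{10}
$$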

This contortion of symbols is pretty nasty, but honestly not uncommon when moving about in the world of linear algebra.

As it is composed of real data, presumably with measurement errors, there is not an exact solution to $\mathbf{X}\boldsymbol{\beta} = \mathbf{y}$ due to the error term. There is however, an approximate solution, which is what is meant when we say we are looking for the line of best fit. This is how linear regression is carried out on a computer. The relevant equation is:

$$
\hat{\boldsymbol{\beta}} = (\mathbf{X}^\mathsf{T}\mathbf{X})^{-1}\mathbf{X}^\mathsf{T}\mathbf{y}
\tag{11}
$$

The key point here is that once again we need to invert a matrix to solve this. The details of where Equation 11 comes from are covered in a number of places, but I will note here that $\hat{\boldsymbol{\beta}}$ refers to the best estimate of $\boldsymbol{\beta}$.¹⁰

Inverting Matrices

We now have two examples where inverting a matrix is a key step: solving a system of linear equations, and approximating the solution to a system of linear equations (the regression case). These cases are not outliers; the ability to invert a matrix is very important. So how do we do this? The LU decomposition can do it, and it is widely used, so it is worth spending some time on. A decomposition is the process of breaking a matrix into pieces that are easier to handle, or that give us special insight, or both. If you are a chemometrician you have almost certainly carried out Principal Components Analysis (PCA). Under the hood, PCA requires either a singular value decomposition, or an eigen decomposition (more info here).

So, about the LU decomposition: it breaks a matrix $\mathbf{A}$ into two matrices, $\mathbf{L}$, a “lower triangular matrix”, and $\mathbf{U}$, an “upper triangular matrix”. These special matrices contain only zeros except along the diagonal and the entries below it (in the lower case), or along the diagonal and the entries above it (in the upper case). The advantage of triangular matrices is that they are very easy to invert (all those zeros make many terms drop out). So the LU decomposition breaks the tough job of inverting $\mathbf{A}$ into two easier jobs.

$$
\begin{aligned}
\mathbf{A} &= \mathbf{L}\mathbf{U} \\
\mathbf{A}^{-1} &= (\mathbf{L}\mathbf{U})^{-1} \\
\mathbf{A}^{-1} &= \mathbf{U}^{-1}\mathbf{L}^{-1}
\end{aligned}
\tag{12}
$$

When all is done, we only need to figure out $\mathbf{U}^{-1}$ and $\mathbf{L}^{-1}$ which as mentioned is straightforward.¹¹

To summarize, if we want to solve a system of equations we need to carry out matrix inversion, which in turn is much easier to do if one uses the LU decomposition to get two easy-to-invert triangular matrices. I hope you are beginning to see how pieces of linear algebra fit together, and why it might be good to learn more.
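Since base R does not export an LU routine, here is a minimal sketch of the idea using a hand-written Doolittle factorization. The function name lu_nopivot is hypothetical, and the sketch assumes no zero pivot is encountered (real libraries swap rows to avoid that); the test matrix is chosen only for illustration. It checks that the two triangular factors reproduce the matrix, and that the inverse of the matrix is the product of the inverses of the factors in reverse order.

```r
# Hypothetical helper: Doolittle LU factorization WITHOUT pivoting.
# Assumes every pivot U[k, k] is non-zero; real libraries pivot rows.
lu_nopivot <- function(A) {
  n <- nrow(A)
  L <- diag(n)  # unit lower triangular factor, filled in below
  U <- A        # reduced to upper triangular form step by step
  for (k in seq_len(n - 1)) {
    for (i in (k + 1):n) {
      L[i, k] <- U[i, k] / U[k, k]         # multiplier for row i
      U[i, ] <- U[i, ] - L[i, k] * U[k, ]  # eliminate the entry below the pivot
      U[i, k] <- 0                         # remove any rounding residue
    }
  }
  list(L = L, U = U)
}

A <- matrix(c(3, 5, -1, 11, 2, 0, -5, 2, 5), ncol = 3)
f <- lu_nopivot(A)

all.equal(f$L %*% f$U, A)                       # the factors reproduce A
all.equal(solve(f$U) %*% solve(f$L), solve(A))  # inverse of U times inverse of L
```

Printing f$L and f$U shows the blocks of zeros that make triangular matrices easy to work with.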

R Functions

Inverting Matrices

Let’s look at how R does these operations, and check our understanding along the way. R makes this really easy. We’ll start with the issue of invertibility. Let’s create a matrix for testing.

A1 <- matrix(c(3, 5, -1, 11, 2, 0, -5, 2, 5), ncol = 3)
A1

     [,1] [,2] [,3]
[1,]    3   11   -5
[2,]    5    2    2
[3,]   -1    0    5

In the matlib package there is a function inv that inverts matrices. It returns the inverted matrix, which we can verify by multiplying the inverted matrix by the original matrix to give the identity matrix (if inversion was successful). diag(3) creates a 3 x 3 matrix with 1’s on the diagonal, in other words an identity matrix.

library("matlib")
A1_inv <- inv(A1)
all.equal(A1_inv %*% A1, diag(3))

[1] "Mean relative difference: 8.999999e-08"

The difference here is really small, but not zero. Let’s use a different function, solve which is part of base R. If solve is given a single matrix, it returns the inverse of that matrix.

A1_solve <- solve(A1) %*% A1
all.equal(A1_solve, diag(3))

[1] TRUE

That’s a better result. Why are there differences? inv uses a method called Gaussian elimination which is similar to how one would invert a matrix using pencil and paper. On the other hand, solve uses the LU decomposition discussed earlier, and no matrix inversion is necessary. Looks like the LU decomposition gives a somewhat better numerical result.

Now let’s look at a different matrix, created by replacing the third column of A1 with different values.

A2 <- matrix(c(3, 5, -1, 11, 2, 0, 6, 10, -2), ncol = 3)
A2

     [,1] [,2] [,3]
[1,]    3   11    6
[2,]    5    2   10
[3,]   -1    0   -2

And let’s compute its inverse using solve.

solve(A2)

Error in solve.default(A2): system is computationally singular: reciprocal condition number = 6.71337e-19

When R reports that A2 is computationally singular, it is saying that it cannot be inverted. Why not? If you look at A2, notice that column 3 is a multiple of column 1 (it is exactly twice column 1). Anytime one column is a multiple of another, or one row is a multiple of another, the matrix cannot be inverted because the rows or columns are not independent.¹² If this were a matrix of coefficients from an experimental measurement of variables, this would mean that some of your variables are not independent; they must be measuring the same underlying phenomenon.
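A few base R checks make the dependence visible. This is only a sketch; A2 is the matrix from above, repeated so the block runs on its own, and the tolerance used for the determinant is an arbitrary choice.

```r
A2 <- matrix(c(3, 5, -1, 11, 2, 0, 6, 10, -2), ncol = 3)

# Column 3 is exactly twice column 1
all.equal(A2[, 3], 2 * A2[, 1])

# The numerical rank from a QR decomposition is 2, not 3
qr(A2)$rank

# The determinant is zero to within floating-point error
abs(det(A2)) < 1e-10
```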

Solving Systems of Linear Equations

Let’s solve the system from Equation 2. It turns out that the solve function also handles this case, if you give it two arguments. Remember, solve is using the LU decomposition behind the scenes; no matrix inversion is required.

A3 <- matrix(c(1, 2, 3, 2, -1, 2, -3, -1, 1), ncol = 3)
A3

     [,1] [,2] [,3]
[1,]    1    2   -3
[2,]    2   -1   -1
[3,]    3    2    1

colnames(A3) <- c("x", "y", "z") # naming the columns will label the answer
b <- c(3, 11, -5)
solve(A3, b)

 x  y  z 
 2 -4 -3

The answer is the values of $x, y, z$ that make the system of equations true.
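It is worth checking an answer like this. The sketch below (the variable names x_direct and x_via_inverse are mine) plugs the solution back into the system, and also confirms that explicitly inverting the coefficient matrix and multiplying gives the same values.

```r
A3 <- matrix(c(1, 2, 3, 2, -1, 2, -3, -1, 1), ncol = 3)
b <- c(3, 11, -5)

x_direct <- solve(A3, b)          # solves the system without forming the inverse
x_via_inverse <- solve(A3) %*% b  # forms the inverse first, then multiplies

all.equal(as.vector(x_via_inverse), x_direct)  # both routes agree
all.equal(as.vector(A3 %*% x_direct), b)       # the solution reproduces b
```

For a small, well-behaved system the two routes agree; the two-argument form is generally preferred because it does less work.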

How does LU decomposition avoid inversion?

While we’ve emphasized the importance and challenges of inverting matrices, we’ve also pointed out that to solve a linear system there are alternatives to looking at the problem from the perspective of Equation 5. Here’s an approach using the LU decomposition, starting with substituting $\mathbf{A}$ with $\mathbf{L}\mathbf{U}$:

$$
\mathbf{A}\mathbf{x} = \mathbf{L}\mathbf{U}\mathbf{x} = \mathbf{b}
$$

We want to solve for $\mathbf{x}$, the column vector of variables. To do so, define a new vector $\mathbf{y} = \mathbf{U}\mathbf{x}$ and substitute it in:

$$
\mathbf{L}\mathbf{y} = \mathbf{b}
$$

Next we solve for $\mathbf{y}$. One way we could do this is to pre-multiply both sides by $\mathbf{L}^{-1}$ but we are looking for a way to avoid using the inverse. Instead, we evaluate $\mathbf{L}\mathbf{y}$ to give a series of expressions using the dot product (in other words plain matrix multiplication). Because $\mathbf{L}$ is lower triangular, many of the terms we might have gotten actually disappear because of the zero coefficients. What remains is simple enough that we can algebraically find each element of $\mathbf{y}$ starting from the first row (this is called forward substitution). Once we have $\mathbf{y}$, we can find $\mathbf{x}$ by solving $\mathbf{U}\mathbf{x} = \mathbf{y}$ using a similar approach, but working from the last row upward (this is backward substitution). This is a good illustration of the utility of triangular matrices: some operations can move from the linear algebra realm to the algebra realm. Wikipedia has a good illustration of forward and backward substitution.
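Base R has forwardsolve() and backsolve() for exactly these two steps. The sketch below applies them to the system solved earlier. The triangular factors L and U were worked out by hand (without pivoting) for this example, and y_tmp is my name for the intermediate vector.

```r
# Hand-computed LU factors of the coefficient matrix (no pivoting);
# matrix() fills column by column
L <- matrix(c(1, 2, 3,
              0, 1, 0.8,
              0, 0, 1), ncol = 3)
U <- matrix(c(1, 0, 0,
              2, -5, 0,
              -3, 5, 6), ncol = 3)
b <- c(3, 11, -5)

L %*% U                      # recovers the original coefficient matrix

y_tmp <- forwardsolve(L, b)  # forward substitution, first row downward
x <- backsolve(U, y_tmp)     # backward substitution, last row upward
x                            # 2 -4 -3, matching solve(A3, b)
```

No inverse is formed at any point: each unknown falls out of a single row once the rows before (or after) it have been handled.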

Computing Linear Regression

Let’s compute the values for $\hat{\boldsymbol{\beta}}$ in our regression data shown in Equation 6. First, let’s set up the needed matrices and plot the data since visualizing the data is always a good idea.

y = matrix(c(11.8, 7.2, 21.5, 17.2, 26.8), ncol = 1)
X = matrix(c(rep(1, 5), 2.1, 0.9, 3.9, 3.2, 5.1), ncol = 2) # design matrix
X

     [,1] [,2]
[1,]    1  2.1
[2,]    1  0.9
[3,]    1  3.9
[4,]    1  3.2
[5,]    1  5.1

plot(X[,2], y, xlab = "x") # column 2 of X has the x values

The value of $\hat{\boldsymbol{\beta}}$ can be found via Equation 11:

solve((t(X) %*% X)) %*% t(X) %*% y

         [,1]
[1,] 2.399618
[2,] 4.769862

The first value is for $\beta_0$ or $b$ or intercept, the second value is for $\beta_1$ or $m$ or slope.
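The explicit inverse is not the only route to these numbers. As a sketch, here are two alternatives in base R that avoid forming the inverse: solving the normal equations directly with the two-argument form of solve(), and a least-squares solve via the QR decomposition with qr.solve(). The variable names are mine.

```r
y <- matrix(c(11.8, 7.2, 21.5, 17.2, 26.8), ncol = 1)
X <- matrix(c(rep(1, 5), 2.1, 0.9, 3.9, 3.2, 5.1), ncol = 2)  # design matrix

beta_inv <- solve(t(X) %*% X) %*% t(X) %*% y     # explicit inverse
beta_ne <- solve(crossprod(X), crossprod(X, y))  # solve the normal equations directly
beta_qr <- qr.solve(X, y)                        # least squares via QR

# All three columns should agree
cbind(inverse = as.vector(beta_inv),
      normal_eq = as.vector(beta_ne),
      qr = as.vector(beta_qr))
```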

Let’s compare this answer to R’s built-in lm function (for linear model):

fit <- lm(y ~ X[,2])
fit


Call:
lm(formula = y ~ X[, 2])

Coefficients:
(Intercept)       X[, 2]  
       2.40         4.77

We have good agreement! If you care to learn about the goodness of the fit, the residuals etc, then you can look at the help file ?lm and str(fit). lm returns pretty much all one needs to know about the results, but if you wish to calculate all the interesting values yourself you can do so by manipulating Equation 11 and its relatives.
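As one example of calculating the interesting values yourself, here is a sketch that computes the fitted values, residuals and R-squared from the matrix solution and compares them with what lm reports. The names beta_hat, resid_manual and r2_manual are mine.

```r
x <- c(2.1, 0.9, 3.9, 3.2, 5.1)
y <- c(11.8, 7.2, 21.5, 17.2, 26.8)
X <- cbind(1, x)  # design matrix: a column of ones plus the x values

beta_hat <- solve(crossprod(X), crossprod(X, y))  # coefficients from the normal equations
fitted_manual <- X %*% beta_hat                   # predicted y values
resid_manual <- y - fitted_manual                 # errors at each point
r2_manual <- 1 - sum(resid_manual^2) / sum((y - mean(y))^2)

fit <- lm(y ~ x)
all.equal(as.vector(resid_manual), unname(resid(fit)))
all.equal(r2_manual, summary(fit)$r.squared)
```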

Finally, let’s plot the line of best fit found by lm to make sure everything looks reasonable.

plot(X[,2], y, xlab = "x")
abline(coef = coef(fit), col = "red")

That’s all for now, and a lot to digest. I hope you are closer to finding your own path through linear algebra. Remember that investing in learning the fundamentals prepares you for tackling the more complex topics. Thanks for reading!

Annotated Bibliography

These are the main sources I relied on for this post.

The No Bullshit Guide to Linear Algebra by Ivan Savov.
- Section 1.15: Solving systems of linear equations.
- Section 6.6: LU decomposition.
Linear Algebra: step by step by Kuldeep Singh, Oxford University Press, 2014.
- Section 1.8.5: Singular (non-invertible) matrices mean there is no solution or infinitely many solutions to the linear system. For graphical illustration see sections 1.1.3 and 1.7.2.
- Section 1.6.4: Definition of the inverse and conceptual meaning.
- Section 1.8.4: Solving linear systems when $\mathbf{A}$ is invertible.
- Section 6.4: LU decomposition.
- Section 6.4.3: Solving linear systems without using inversion, via the LU decomposition.
Linear Models with R by Julian J. Faraway, Chapman and Hall/CRC, 2005.
- Sections 2.1-2.4: Linear regression from the algebraic and matrix perspectives, derivation of Equation 11.
The vignettes of the matlib package are very helpful.

Footnotes

Here we have the slightly unfortunate circumstance where symbol conventions cannot be completely harmonized. We are saying that $\mathbf{x} = [x, y, z]^\mathsf{T}$, which seems a bit silly since vector $\mathbf{x}$ contains $y$ and $z$ components in addition to $x$. I ask you to accept this for two reasons: First, most linear algebra texts use the symbols in Equation 3 as the general form for this topic, so if you go to study this further that’s what you’ll find. Second, I feel like using $x$, $y$ and $z$ in Equation 1 will be familiar to the most people. If you want to get rid of this infelicity, then you have to write Equation 1 (in part) as $x_1 + 2x_2 - 3x_3 = 3$ which I think clouds the interpretation. Perhaps however you feel my choices are equally bad.↩︎
Conformable means that the number of columns in the first matrix equals the number of rows in the second matrix. This is necessary because of the dot product definition of matrix multiplication. More details here.↩︎
Remember “story problems” where you had to read closely to express what was given in terms of equations, and find enough equations? “If Sally bought 10 pieces of candy and a drink for $1.50…”↩︎
We could also write this as $\mathbf{x} = [x, y, z]^\mathsf{T}$ to emphasize that it is a column vector. One might prefer this because the only vector one can write in a row of text is a row vector, so if we mean a column vector many people would prefer to write it transposed.↩︎
The inverse of a matrix is analogous to dividing a variable by itself, since it leads to that variable canceling out and thus simplifying the equation. However, strictly speaking there is no operation that qualifies as division in the matrix world.↩︎
For a matrix $\mathbf{A}$ to be invertible, there must exist another matrix $\mathbf{B}$ such that $\mathbf{A}\mathbf{B} = \mathbf{B}\mathbf{A} = \mathbf{I}$. However, this definition doesn’t offer any clues about how we might find the inverse.↩︎
In truth, there are other ways to solve $\mathbf{A}\mathbf{x} = \mathbf{b}$ that don’t require inversion of a matrix. However, if a matrix isn’t invertible, these other methods will also break down. We’ll demonstrate this later when we talk about the LU decomposition.↩︎
A very good discussion of the algebraic approach is available here.↩︎
This is another example of an infelicity of symbol conventions. The typical math/statistics text symbols are not the same as the symbols a student in Physics 101 would likely encounter.↩︎
The careful reader will note that the data set shown in Equation 9 is not square: there are more observations (rows) than variables (columns). This is fine and desirable for a linear regression; we don’t want to use just two data points, as that would give a fit with no error that is not necessarily accurate. However, only square matrices have inverses, so what’s going on here? In practice, what’s happening is we are using something called a pseudoinverse. The first part of the right side of Equation 11 is in fact the pseudoinverse: $(\mathbf{X}^\mathsf{T}\mathbf{X})^{-1}\mathbf{X}^\mathsf{T}$. Perhaps we’ll cover this in a future post.↩︎
The switch in the order of matrices on the last line of Equation 12 is one of the properties of the inverse operator.↩︎
This means that the rank of the matrix is less than the number of columns. You can get the rank of a matrix by counting the number of non-zero eigenvalues via eigen(A2)$values, which in this case gives 8.9330344, -5.9330344, $-3.5953271 \times 10^{-16}$. The third value is zero to within floating-point precision, so there are only two non-zero values and the rank is two. Perhaps in another post we’ll discuss this in more detail.↩︎

Reuse

CC BY 4.0

Citation

BibTeX citation:

@online{hanson2022,
  author = {Hanson, Bryan},
  title = {Notes on {Linear} {Algebra} {Part} 2},
  date = {2022-09-01},
  url = {http://chemospec.org/posts/2022-09-01-Linear-Alg-Notes-Pt2/Linear-Alg-Notes-Pt2.html},
  langid = {en}
}

For attribution, please cite this work as:

Hanson, Bryan. 2022. “Notes on Linear Algebra Part 2.” September 1, 2022. http://chemospec.org/posts/2022-09-01-Linear-Alg-Notes-Pt2/Linear-Alg-Notes-Pt2.html.

Notes on Linear Algebra Part 1

Bryan Hanson — Sun, 14 Aug 2022 07:00:00 GMT

If you are already familiar with much of linear algebra, as well as the relevant functions in R, read no further and do something else!

If you are like me, you’ve had no formal training in linear algebra, which means you learn what you need to when you need to use it. Eventually, you cobble together some hard-won knowledge. That’s good, because almost everything in chemometrics involves linear algebra.

This post is essentially a set of personal notes about the dot product and the cross product, two important manipulations in linear algebra. I’ve tried to harmonize things I learned way back in college physics and math courses, and integrate information I’ve found in various sources I have leaned on more recently. Without a doubt, the greatest impediment to really understanding this material is the use of multiple terminologies and notations. I’m going to try really hard to be clear and to the point in my discussion.

The main sources I’ve relied on are:

The No Bullshit Guide to Linear Algebra by Ivan Savov. This is by far my favorite treatment of linear algebra. It gets to the point quickly.
The Wikipedia pages on dot product, cross product and outer product.

Let’s get started. For sanity and consistency, let’s define two 3D vectors and two matrices to illustrate our examples. Most of the time I’m going to write vectors with an arrow over the name, as a nod to the treatment usually given in a physics course. This reminds us that we are thinking about a quantity with direction and magnitude in some coordinate system, something geometric. Of course in the R language a vector is simply a list of numbers with the same data type; R doesn’t care if a vector is a vector in the geometric sense or a list of states.

Dot Product

Terminology

The dot product goes by these other names: inner product, scalar product. Typical notations include:¹

$\vec{u} \cdot \vec{v}$ (the $\cdot$ is the origin of the name “dot” product)
$\mathbf{u}^\mathsf{T}\mathbf{v}$ (when thinking of the vectors as column vectors)
$\langle u, v \rangle$ (typically used when $u, v$ are complex)

Formulas

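There are two main formulas for the dot product with vectors, the algebraic formula (Equation 5) and the geometric formula (Equation 6).

$$
\vec{u} \cdot \vec{v} = \sum_{i} u_i v_i
\tag{5}
$$

$$
\vec{u} \cdot \vec{v} = \lVert \vec{u} \rVert \, \lVert \vec{v} \rVert \cos\theta
\tag{6}
$$

Here $\theta$ is the angle between the two vectors, and $\lVert \vec{u} \rVert$ refers to the $L^2$ or Euclidean norm, namely the length of the vector:²

$$
\lVert \vec{u} \rVert = \sqrt{\sum_{i} u_i^2}
\tag{7}
$$

The result of the dot product is a scalar. The dot product is also commutative: $\vec{u} \cdot \vec{v} = \vec{v} \cdot \vec{u}$.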
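A quick numerical check of the two formulas, as a sketch. The vectors u and v are chosen for illustration so that the angle between them (45 degrees) is known by construction, and vec_len is a helper name of mine.

```r
u <- c(1, 0, 0)
v <- c(1, 1, 0)  # lies 45 degrees away from u in the x-y plane

vec_len <- function(a) sqrt(sum(a^2))  # Euclidean norm (length)

dot_algebraic <- sum(u * v)            # multiply element-wise, then add
theta <- pi / 4                        # angle between u and v, in radians
dot_geometric <- vec_len(u) * vec_len(v) * cos(theta)

all.equal(dot_algebraic, dot_geometric)  # the two formulas agree
sum(u * v) == sum(v * u)                 # the order does not matter
```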

Watch out when using row or column vectors

From the perspective of matrices, if we think of the two vectors as column vectors with dimensions 3 x 1, then transposing the first gives us conformable matrices and we find the result of matrix multiplication is the dot product (compare to Equation 5):

Even though this is matrix multiplication, the answer is still a scalar.

Now, rather confusingly, if we think of the two vectors as row vectors, and we transpose the second, then we get the dot product:

Equation 8 and Equation 9 can be a source of real confusion at first. They give the impression that the dot product can be written with the transpose on either the first or the second vector. However, this is only true in the limited contexts defined above. To summarize:

Thinking of the vectors as column vectors with dimensions $n \times 1$, one can use the transpose of the first vector times the second.
Thinking of the vectors as row vectors with dimensions $1 \times n$, one can use the first vector times the transpose of the second.

Unfortunately I think this distinction is not always clearly made by authors, and is a source of great confusion to linear algebra learners. Be careful when working with row and column vectors.

Matrix Multiplication

Suppose we wanted to compute the product of our two matrices.³ We use the idea of row and column vectors to accomplish this task. In the process, we discover that matrix multiplication is a series of dot products:

The red color shows how the dot product of the first row of the first matrix and the first column of the second matrix gives the first entry in the product. Every entry in the product results from a dot product. Every entry is a scalar, embedded in a matrix.

What Can We Do With the Dot Product?

Determine the angle between two vectors, as in Equation 6.
As such, determine if two vectors intersect at a right angle (at least in 2-3D). More generally, two vectors of any dimension are orthogonal if their dot product is zero.
Matrix multiplication, when applied repeatedly.
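Compute the length of a vector, via the square root of the dot product of the vector with itself.
Compute the projection of one vector on another, for instance how much of a force is along the $x$-direction? A verbal interpretation of the dot product of two vectors is that it gives the amount of one vector in the direction of the other.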
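Several of these uses fit in a few lines of R. This is a sketch with made-up vectors; dot is a small helper defined here, not a base R function.

```r
dot <- function(a, b) sum(a * b)  # helper: algebraic dot product

force <- c(3, 4, 0)
x_hat <- c(1, 0, 0)               # unit vector along the x-direction

dot(force, x_hat)                 # amount of the force along x: 3
sqrt(dot(force, force))           # length of the force vector: 5
dot(c(1, 2), c(-2, 1))            # 0, so these two vectors are orthogonal

# Angle between force and x_hat in degrees, from the geometric formula
cos_theta <- dot(force, x_hat) / (sqrt(dot(force, force)) * sqrt(dot(x_hat, x_hat)))
acos(cos_theta) * 180 / pi
```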

Cross Product

Terminology and Notation

The cross product goes by these other names: outer product⁴, tensor product, vector product.

Formulas

The cross product of two vectors returns a vector rather than a scalar. Vectors are defined in terms of a basis which is a coordinate system. Earlier, when we defined our vectors, they were intrinsically defined in terms of the standard basis set $\hat{i}, \hat{j}, \hat{k}$ (in some fields this would be called the unit coordinate system). Thus a fuller definition of one of our vectors would be:

In terms of vectors, the cross product is defined as:

In my opinion, this is not exactly intuitive, but there is a pattern to it: notice that the terms for $\hat{i}$ don’t involve the $x$ component. The details of how this result is computed rely on some properties of the basis set; this Wikipedia article has a nice explanation. We need not dwell on it however.

There is also a geometric formula for the cross product:

where $\hat{n}$ is the unit vector perpendicular to the plane defined by the two input vectors. The direction of $\hat{n}$ is defined by the right-hand rule. Because of this, the cross product is not commutative, i.e. swapping the order of the two vectors changes the result. The cross product is however anti-commutative:

Cross product using column vectors

As we did for the dot product, we can look at the cross product from the perspective of column vectors. Instead of transposing the first matrix as we did for the dot product, we transpose the second one:

Interestingly, we are using the dot product to compute the cross product.

The case where we treat the two vectors as row vectors is left to the reader.⁵

Finally, there is a matrix definition of the cross product as well. Evaluation of the following determinant gives the cross product:

What Can We Do With the Cross Product?

In 3D, the result of the cross product is perpendicular or normal to the plane defined by the two input vectors.
If however, the two vectors are parallel or anti-parallel, the cross product is zero.
The length of the cross product is the area of the parallelogram defined by the two input vectors, which equals the product of the two vector lengths and the sine of the angle between them.
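These properties are easy to check numerically for the geometric (3D) cross product. As far as I know base R has no function for it, so the sketch below defines a small helper; cross3 is a hypothetical name and the vectors are made up.

```r
# Hypothetical helper: geometric cross product of two 3D vectors
cross3 <- function(a, b) {
  c(a[2] * b[3] - a[3] * b[2],
    a[3] * b[1] - a[1] * b[3],
    a[1] * b[2] - a[2] * b[1])
}

u <- c(1, 2, 3)
v <- c(4, 5, 6)
w <- cross3(u, v)

sum(w * u)            # 0: the result is perpendicular to u
sum(w * v)            # 0: and perpendicular to v
cross3(v, u) + w      # all zeros: swapping the inputs flips the sign
cross3(u, 2 * u)      # all zeros: parallel vectors give a zero cross product

# Length of w equals |u| |v| sin(theta), the area of the parallelogram
area_from_cross <- sqrt(sum(w^2))
area_from_dot <- sqrt(sum(u^2) * sum(v^2) - sum(u * v)^2)
all.equal(area_from_cross, area_from_dot)
```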

R Functions

`%*%`

The workhorse for matrix multiplication in R is the %*% function. This function will accept any combination of vectors and matrices as inputs, so it is flexible. It is also smart: given a vector and a matrix, the vector will be treated as row or column matrix as needed to ensure conformity, if possible. Let’s look at some examples:

# Some data for examples
p <- 1:5
q <- 6:10
M <- matrix(1:15, nrow = 3, ncol = 5)
M

     [,1] [,2] [,3] [,4] [,5]
[1,]    1    4    7   10   13
[2,]    2    5    8   11   14
[3,]    3    6    9   12   15

# A vector times a vector
p %*% q

     [,1]
[1,]  130

Notice that R returns a data type of matrix, but it is a $1 \times 1$ matrix, and thus a scalar value. That means we just computed the dot product, a decision R made internally. We can verify this by noting that q %*% p gives the same answer. Thus, R handled these vectors as column vectors and computed $\mathbf{p}^\mathsf{T}\mathbf{q}$.

# A vector times a matrix
M %*% p

     [,1]
[1,]  135
[2,]  150
[3,]  165

As M had dimensions $3 \times 5$, R treated p as a $5 \times 1$ column vector in order to be conformable. The result is a vector, so this is the cross product.

If we try to compute p %*% M we get an error, because there is nothing R can do to p which will make it conformable to M.

p %*% M

Error in p %*% M: non-conformable arguments

What about multiplying matrices?

M %*% M

Error in M %*% M: non-conformable arguments

As you can see, when dealing with matrices, %*% will not change a thing, and if your matrices are non-conformable then it’s an error. Of course, if we transpose either instance of M we do have conformable matrices, but the answers are different, and this is neither the dot product nor the cross product, just matrix multiplication.

t(M) %*% M

     [,1] [,2] [,3] [,4] [,5]
[1,]   14   32   50   68   86
[2,]   32   77  122  167  212
[3,]   50  122  194  266  338
[4,]   68  167  266  365  464
[5,]   86  212  338  464  590

M %*% t(M)

     [,1] [,2] [,3]
[1,]  335  370  405
[2,]  370  410  450
[3,]  405  450  495

What can we take from these examples?

R will give you the dot product if you give it two vectors. Note that this is a design decision, as it could have returned the cross product (see Equation 14).
R will promote a vector to a row or column vector if it can to make it conformable with a matrix you provide. If it cannot, R will give you an error. If it can, the cross product is returned.
When it comes to two matrices, R will give an error when they are not conformable.
One function, %*%, does it all: dot product, cross product, or matrix multiplication, but you need to pay attention.
The documentation says as much, but more tersely: “Multiplies two matrices, if they are conformable. If one argument is a vector, it will be promoted to either a row or column matrix to make the two arguments conformable. If both are vectors of the same length, it will return the inner product (as a matrix)”
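If you would rather not rely on R’s promotion rules, you can give the vectors explicit dimensions yourself. In this sketch, pc and qc are my names for explicit column-matrix versions of p and q; once dimensions are explicit, %*% does only what conformability allows.

```r
p <- 1:5
q <- 6:10
pc <- matrix(p, ncol = 1)  # explicit 5 x 1 column matrix
qc <- matrix(q, ncol = 1)

t(pc) %*% qc               # 1 x 1 matrix: the dot product
drop(t(pc) %*% qc)         # the same value as a plain number, 130
dim(pc %*% t(qc))          # 5 x 5: what this post calls the cross product

# With explicit dimensions R no longer guesses: 5 x 1 times 5 x 1 is an error
inherits(try(pc %*% qc, silent = TRUE), "try-error")
```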

Other Functions

There are other R functions that do some of the same work:

crossprod equivalent to t(M) %*% M but faster.
tcrossprod equivalent to M %*% t(M) but faster.
outer or %o%

The first two functions will accept combinations of vectors and matrices, as does %*%. Let’s try it with two vectors:

crossprod(p, q)

     [,1]
[1,]  130

Huh. crossprod is returning the dot product! So this is the case where “the cross product is not the cross product.” From a clarity perspective, this is not ideal. Let’s try the other function:

tcrossprod(p, q)

     [,1] [,2] [,3] [,4] [,5]
[1,]    6    7    8    9   10
[2,]   12   14   16   18   20
[3,]   18   21   24   27   30
[4,]   24   28   32   36   40
[5,]   30   35   40   45   50

There’s the cross product!
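The naming makes more sense once you line each function up with its %*% equivalent. A minimal check, reusing the data defined earlier (repeated here so the block runs on its own):

```r
p <- 1:5
q <- 6:10
M <- matrix(1:15, nrow = 3, ncol = 5)

all.equal(crossprod(M), t(M) %*% M)      # one-argument form, 5 x 5 result
all.equal(tcrossprod(M), M %*% t(M))     # one-argument form, 3 x 3 result
all.equal(crossprod(p, q), t(p) %*% q)   # two vectors: the 1 x 1 dot product
all.equal(tcrossprod(p, q), p %*% t(q))  # two vectors: the 5 x 5 matrix
```

So crossprod always transposes its first argument and tcrossprod always transposes its second, whatever name one gives to the result.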

What about outer? Remember that another name for the cross product is the outer product. So is outer the same as tcrossprod? In the case of two vectors, it is:

identical(outer(p, q), tcrossprod(p, q))

[1] TRUE

What about a vector with a matrix?

tst <- outer(p, M)
dim(tst)

[1] 5 3 5

Alright, that clearly is not a cross product. The result is an array with dimensions $5 \times 3 \times 5$, not a matrix (which would have only two dimensions). outer does correspond to the cross product in the case of two vectors, but anything with higher dimensions gives a different beast. So perhaps using “outer” as a synonym for cross product is not a good idea.
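To see what that array actually contains, here is a short sketch: every element of p is multiplied by every element of M, and the dimensions of the two inputs are simply concatenated.

```r
p <- 1:5
M <- matrix(1:15, nrow = 3, ncol = 5)
tst <- outer(p, M)

dim(tst)                         # 5 3 5: dim of p followed by dim of M

# Element [i, j, k] is p[i] * M[j, k]
tst[2, 3, 4] == p[2] * M[3, 4]

# Each slice tst[i, , ] is the whole matrix M scaled by p[i]
all.equal(tst[2, , ], 2 * M)
```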

Advice

Given what we’ve seen above, make your life simple and stick to %*%, and pay close attention to the dimensions of the arguments, especially if row or column vectors are in use. In my experience, thinking about the units and dimensions of whatever it is you are calculating is very helpful. Later, if speed is really important in your work, you can use one of the faster alternatives.

Footnotes

An extensive discussion of notations can be found here.↩︎
And curiously, the norm works out to be equal to the square root of the dot product of a vector with itself: $\lVert \vec{u} \rVert = \sqrt{\vec{u} \cdot \vec{u}}$ ↩︎
To be multiplied, matrices must be conformable, namely the number of columns of the first matrix must match the number of rows of the second matrix. The reason is so that the dot product terms will match. In the present case we have .↩︎
Be careful, it turns out that “outer” may not be a great synonym for cross product, as explained later.↩︎
OK fine, here is the answer when treating and as row vectors: which expands exactly as the right-hand side of Equation 14.↩︎

Reuse

CC BY 4.0

Citation

BibTeX citation:

@online{hanson2022,
  author = {Hanson, Bryan},
  title = {Notes on {Linear} {Algebra} {Part} 1},
  date = {2022-08-14},
  url = {http://chemospec.org/posts/2022-08-14-Linear-Alg-Notes/2022-08-14-Linear-Alg-Notes.html},
  langid = {en}
}

For attribution, please cite this work as:

Hanson, Bryan. 2022. “Notes on Linear Algebra Part 1.” August 14, 2022. http://chemospec.org/posts/2022-08-14-Linear-Alg-Notes/2022-08-14-Linear-Alg-Notes.html.

FOSS4Spectroscopy: R vs Python

Bryan Hanson — Wed, 06 Jul 2022 07:00:00 GMT

If you aren’t familiar with it, the FOSS for Spectroscopy web site lists Free and Open Source Software for spectroscopic applications. The collection is of course never really complete, and your package suggestions are most welcome (how to contribute). My methods for finding packages are improving and at this point the major repositories have been searched reasonably well.

A few days ago I pushed a major update, and at this point Python packages outnumber R packages more than two to one. The update was made possible because I recently had time to figure out how to search the PyPi.org site automatically.

In a previous post I explained the methods I used to find packages related to spectroscopy. These have been updated considerably and the rest of this post will cover the updated methods.

Repos & Topics

There are four places I search for packages related to spectroscopy.¹

CRAN, searched manually using the packagefinder package.²
Github, searched using custom functions and scripts, detailed below.
PyPi.org, searched as for Github.
juliapackages.org, searched manually.

The topics I search are as follows:

NMR
EPR
ESR
UV
VIS
spectrophotometry
NIR (IR search terms overlap a lot, and also generate many false positives dealing with IR communications, e.g. TV remotes)
FT-IR
FTIR
Raman
XRF
XAS
LIBS (on PyPi.org one must use “laser induced breakdown spectroscopy” because LIBS is the name of a popular software and generates hundreds of false positives)

Searching CRAN

I search CRAN using packagefinder; the process is quite straightforward and won’t be covered here. However, it is not an automated process (I should probably work on that).

Searching Github

The broad approach used to search Github is the same as described in the original post. However, the scripts have been refined and updated, and now exist as functions in a new package I created called webu (for “webutilities”, but that name is taken on CRAN). The repo is here. webu is not on CRAN and I don’t currently intend to put it there, but you can install from the repo of course if you wish to try it out.

Searching Github is now carried out by a supervising script called /utilities/run_searches.R (in the FOSS4Spectroscopy repo). The script contains some notes about finicky details, but is pretty simple overall and should be easy enough to follow.

Searching PyPi.org

Unlike Github, it is not necessary to authenticate to use the PyPi.org API. That makes things simpler than the Github case. The needed functions are in webu and include some deliberate delays so as to not overload their servers. As for Github, searches are supervised by /utilities/run_searches.R.
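The idea of a deliberate delay can be sketched as follows. This is not the code in webu; query_politely and fake_query are hypothetical names, and the fake query stands in for a real API call so the sketch runs offline.

```r
# Hypothetical helper: apply a query function to each search term,
# pausing between calls so the server is not hammered
query_politely <- function(terms, query_fun, delay = 1) {
  results <- vector("list", length(terms))
  names(results) <- terms
  for (i in seq_along(terms)) {
    results[[i]] <- query_fun(terms[i])
    if (i < length(terms)) Sys.sleep(delay)  # no need to wait after the last call
  }
  results
}

# Stand-in for a real API call (an assumption, for illustration only)
fake_query <- function(term) paste("results for", term)

res <- query_politely(c("NMR", "Raman", "XRF"), fake_query, delay = 0.1)
```

A fixed pause is the simplest choice; a real client might also want to back off when the server reports errors.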

One thing I observed at PyPi.org is that authors do not always fill out all the fields that PyPi.org can accept, which means some fields are NULL and we have to trap for that possibility. Package information is accessed via a JSON record, for instance the entry for nmrglue can be seen here. This package is pretty typical in that the author_email field is filled out, but the maintainer_email field is not (they are presumably the same). If one considers these JSON files to be analogous to DESCRIPTION in R packages, it looks like there is less oversight on PyPi.org compared to CRAN.
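Trapping the NULL fields might look something like the sketch below. The helper field_or_na is a hypothetical name, not a function from webu, and the info list is a hand-built stand-in for the relevant part of a parsed JSON record, so nothing here touches the network.

```r
# Hypothetical helper: return NA instead of NULL (or "") for a missing field
field_or_na <- function(record, field) {
  value <- record[[field]]
  if (is.null(value) || !nzchar(value)) NA_character_ else value
}

# Hand-built stand-in for a parsed record; the values are made up
info <- list(name = "somepkg",
             author_email = "author@example.org",
             maintainer_email = NULL)

# "license" is absent from the record altogether, which also yields NULL
fields <- c("name", "author_email", "maintainer_email", "license")
out <- vapply(fields, function(f) field_or_na(info, f), character(1))
out
```

Converting to NA early means the results can go straight into a data frame, where every package needs the same set of columns.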

Searching Julia

Julia packages are readily searched manually at juliapackages.org.

Cleaning & Final Vetting

The raw results from the searches described above still need a lot of inspection and cleaning to be usable. The PyPi.org and Github results are saved in an Excel worksheet with the relevant URLs, which can be followed to determine the suitability of each package. In the /Utilities folder there are additional scripts to remove entries that are already in the main database (FOSS4Spec.xlsx) and to check the names of the packages. The name check matters for two reasons: Python authors and/or policies seem to allow different packages whose names differ only by case, and authors are sometimes sloppy when referring to their own packages, using mypkg in one place and myPkg in another for the same package.
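A check for names that differ only by case can be sketched in a few lines of base R. This is not the script in the repo; apart from nmrglue, the package names below are made up for illustration.

```r
# Made-up package names (except nmrglue), some differing only by case
pkgs <- c("mypkg", "myPkg", "nmrglue", "SpecTools", "spectools", "ramanpy")

key <- tolower(pkgs)                  # case-insensitive key
dups <- unique(key[duplicated(key)])  # keys that occur more than once

# Group the original spellings under each colliding key
collisions <- split(pkgs, key)[dups]
collisions
```

Each element of collisions still has to be inspected by hand, since the spellings may refer to one package or to two genuinely different ones.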

Footnotes

Once in a while users submit their own package to the repo, and I also find interesting packages in my literature reading.↩︎
packagefinder has recently been archived, but hopefully will be back soon.↩︎

Reuse

CC BY 4.0

Citation

BibTeX citation:

@online{hanson2022,
  author = {Hanson, Bryan},
  title = {FOSS4Spectroscopy: {R} Vs {Python}},
  date = {2022-07-06},
  url = {http://chemospec.org/posts/2022-07-06-F4S-Update/2022-07-06-F4S-Update.html},
  langid = {en}
}

For attribution, please cite this work as:

Hanson, Bryan. 2022. “FOSS4Spectroscopy: R Vs Python.” July 6, 2022. http://chemospec.org/posts/2022-07-06-F4S-Update/2022-07-06-F4S-Update.html.

Introducing LearnPCA

Bryan Hanson — Tue, 03 May 2022 07:00:00 GMT

PCA, or principal components analysis, is one of the most widespread statistical methods in use. It shows up in many disciplines, from political science and psychology, to chemistry and biology. PCA is also really challenging to understand.

I’m pleased to announce that my colleague David Harvey and I have recently released LearnPCA, an R package to help people with understanding PCA. In LearnPCA we’ve tried to integrate our years of experience teaching the topic, along with the best insights we can find in books, tutorials and the nooks and crannies of the internet. Though our experience is in a chemometrics context, we use examples from different disciplines so that the package will be broadly helpful.

The package contains seven vignettes that proceed from the conceptual basics to advanced topics. As of version 0.2.0, there is also a Shiny app to help visualize the process of finding the principal component axes. The current vignettes are:

A Guide to the LearnPCA Package
A Conceptual Introduction to PCA
Step-by-Step PCA
Understanding Scores and Loadings
Visualizing PCA in 3D
The Math Behind PCA
A Comparison of Functions for PCA

You can access the vignettes at the Github Site; you don’t even have to install the package. For the Shiny app, do the following:

install.packages("LearnPCA") # you'll need version 0.2.0
library("LearnPCA")
PCsearch()

We would really appreciate your feedback on this package. You can do so in the comments below, or open an issue.

Reuse

CC BY 4.0

Citation

BibTeX citation:

@online{hanson2022,
  author = {Hanson, Bryan},
  title = {Introducing {LearnPCA}},
  date = {2022-05-03},
  url = {http://chemospec.org/posts/2022-05-03-LearnPCA-Intro/2022-05-03-LearnPCA-Intro.html},
  langid = {en}
}

For attribution, please cite this work as:

Hanson, Bryan. 2022. “Introducing LearnPCA.” May 3, 2022. http://chemospec.org/posts/2022-05-03-LearnPCA-Intro/2022-05-03-LearnPCA-Intro.html.

Metabolic Phenotyping Protocol Part 3

Bryan Hanson — Sun, 01 May 2022 07:00:00 GMT

Part 1 of this series is here.
Part 2 of this series is here.

If you aren’t familiar with ChemoSpec, you might wish to look at the introductory vignette first.

In this series of posts we are following the protocol as described in the printed publication closely (Blaise et al. 2021). The authors have also provided a Jupyter notebook. This is well worth your time, even if Python is not your preferred language, as there are additional examples and discussion for study.

Read in the Data

Load the Spectra object we created in Part 2 so we can summarize it.

library("ChemoSpec")
load("Worms2.RData")  # restores the 'Worms2' Spectra object
sumSpectra(Worms2)


 C. elegans metabolic phenotyping study (Blaise 2007) 

    There are 133 spectra in this set.
    The y-axis unit is intensity.

    The frequency scale runs from
    8.9995 to 5e-04 ppm
    There are 8600 frequency values.
    The frequency resolution is
    0.001 ppm/point.

    This data set is not continuous
    along the frequency axis.
    Here are the data chunks:

  beg.freq end.freq   size beg.indx end.indx
1   8.9995   5.0005 -3.999        1     4000
2   4.5995   0.0005 -4.599     4001     8600

    The spectra are divided into 4 groups: 

   group no.     color symbol alt.sym
1 Mut_L2  28 #FB0D16FF      0      m2
2 Mut_L4  33 #FFC0CBFF     15      m4
3  WT_L2  32 #511CFCFF      1      w2
4  WT_L4  40 #2E94E9FF     16      w4


*** Note: this is an S3 object
of class 'Spectra'

Exploratory Data Analysis, Cont’d.

If you recall, in Part 2 we removed five samples. Let’s re-run PCA without these samples and show the key plots. We will simply report these here without much discussion; they are pretty much as expected.

c_pca <- c_pcaSpectra(Worms2, choice = "autoscale")

plotScree(c_pca)

Figure 1: Scree plot (recommended style).

p <- plotScores(Worms2, c_pca, pcs = 1:2, ellipse = "rob", tol = 0.02)
p

Figure 2: Score plot for PCs 1 and 2. Compare to protocol figure 7d.

p <- plotScores(Worms2, c_pca, pcs = 2:3, ellipse = "rob", leg.loc = "bottomleft",
    tol = 0.02)
p

Figure 3: Score plot for PCs 2 and 3.

One thing the published protocol does not explicitly discuss is an inspection of the loadings, but it is covered in the Jupyter notebook. The loadings are useful in order to see if any particular frequencies are driving the separation of the samples in the score plot. Let’s plot the loadings (Figure 4). Remember that these data were autoscaled, and hence all frequencies, including noisy frequencies, will contribute to the separation. If we had not scaled the data, these plots would look dramatically different.

p <- plotLoadings(Worms2, c_pca, loads = 1:2)
p

Figure 4: Loadings for PC1 and PC2.
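The effect of autoscaling is easy to see on made-up numbers. The sketch below uses base R's scale on hypothetical data with one intense "peak" variable and one weak "noise" variable; it is not what ChemoSpec does internally, just the same idea.

```r
set.seed(42)

# Hypothetical data: 20 samples, an intense peak and a weak noisy baseline
X <- cbind(peak  = rnorm(20, mean = 100, sd = 10),
           noise = rnorm(20, mean = 0.1, sd = 0.01))

apply(X, 2, sd)        # very different spreads before scaling

# Autoscaling: center each column, then divide by its standard deviation
X_auto <- scale(X, center = TRUE, scale = TRUE)

apply(X_auto, 2, sd)   # both columns now have unit standard deviation
```

After autoscaling the noise variable carries exactly as much variance as the peak, which is why noisy frequencies show up in the loadings.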

The s-plot is another very useful way to find peaks that are important in separating the samples (Figure 5); we can see that the peaks around 1.30-1.32, 1.47-1.48, and 3.03-3.07 are important drivers of the separation in the score plot. Having discovered this, one can investigate the source of those peaks.

p <- sPlotSpectra(Worms2, c_pca, tol = 0.001)
p

Figure 5: s-Plot for PC1.
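For readers who have not met the s-plot before: for each variable it plots the covariance with the scores against the correlation with the scores. Here is a minimal base R sketch on hypothetical data, assuming centered (not autoscaled) data and using prcomp; it is meant to show the idea and is not the ChemoSpec implementation.

```r
set.seed(1)
n <- 30

# Hypothetical data: v1 tracks a two-group difference, the rest is noise
grp <- rep(c(0, 1), each = n / 2)
X <- cbind(v1 = 5 * grp + rnorm(n), v2 = rnorm(n), v3 = rnorm(n), v4 = rnorm(n))

Xc <- scale(X, center = TRUE, scale = FALSE)  # centered only
pca <- prcomp(Xc, center = FALSE)
t1 <- pca$x[, 1]                              # PC1 scores

# One point per variable: covariance (x-axis) and correlation (y-axis) with t1
splot <- data.frame(variable    = colnames(X),
                    covariance  = as.vector(cov(t1, Xc)),
                    correlation = as.vector(cor(t1, Xc)))
splot
```

Variables that are large in magnitude on both axes are both influential and reliable, which is why they land at the ends of the "s".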

Supervised Analysis with PLS-DA

ChemoSpec carries out exploratory data analysis, which is an unsupervised process. The next step in the protocol is PLS-DA (partial least squares - discriminant analysis), a supervised technique that combines data reduction/variable selection with classification. I have written about ChemoSpec + PLS here if you would like more background on plain PLS. We’ll need the mixOmics package (Rohart et al. 2017) for this analysis; note that loading it masks the plotLoadings function from ChemoSpec.

library("mixOmics")

Loading required package: MASS

Loading required package: lattice


Loaded mixOmics 6.20.0
Thank you for using mixOmics!
Tutorials: http://mixomics.org
Bookdown vignette: https://mixomicsteam.github.io/Bookdown
Questions, issues: Follow the prompts at http://mixomics.org/contact-us
Cite us:  citation('mixOmics')


Attaching package: 'mixOmics'

The following object is masked from 'package:ChemoSpec':

    plotLoadings

Figure 6 shows the score plot; the results suggest that classification and modeling may be successful. The splsda function carries out a single sparse computation, and one computation should not be considered the ideal answer. A better approach is to use cross-validation, for instance the bootsPLS function in the bootsPLS package (Rohart, Le Cao, and Wells 2018), which uses splsda under the hood. However, that computation is too time-consuming to demonstrate here.

X <- Worms2$data
Y <- Worms2$groups
splsda <- splsda(X, Y, ncomp = 8)

plotIndiv(splsda,
  col.per.group = c("#FB0D16FF", "#FFC0CBFF", "#511CFCFF", "#2E94E9FF"),
  title = "sPLS-DA Score Plot", legend = TRUE, ellipse = TRUE)

Figure 6: sPLS-DA plot showing classification.

To estimate the number of components needed, the perf function can be used. The results are in Figure 7 and suggest that five components are sufficient to describe the data.

perf.splsda <- perf(splsda, folds = 5, nrepeat = 5)
plot(perf.splsda)

Figure 7: Evaluation of the PLS-DA performance.
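The arguments folds = 5 and nrepeat = 5 ask for 5-fold cross-validation repeated five times. As a rough picture of what that means, the sketch below assigns samples to folds in base R, using the group sizes from the sumSpectra output above. The helper assign_folds is hypothetical, and balancing the folds within each group is my assumption; perf may well do its bookkeeping differently.

```r
set.seed(123)

# Group sizes taken from the sumSpectra output above
groups <- factor(rep(c("Mut_L2", "Mut_L4", "WT_L2", "WT_L4"),
                     times = c(28, 33, 32, 40)))

# Hypothetical helper: assign each sample to one of k folds,
# keeping the folds balanced within each group
assign_folds <- function(groups, k = 5) {
  folds <- integer(length(groups))
  for (g in levels(groups)) {
    idx <- which(groups == g)
    folds[idx] <- sample(rep_len(seq_len(k), length(idx)))
  }
  folds
}

# Five repeats of the 5-fold assignment, one column per repeat
fold_mat <- replicate(5, assign_folds(groups, k = 5))

# How the first repeat spreads each group over the folds
table(groups, fold = fold_mat[, 1])
```

In each repeat, every fold is held out once while the model is fit to the other four, and the error rates are then averaged over folds and repeats.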

At this point, we have several ideas of how to proceed. Going forward, one might choose to focus on accurate classification, or on determining which frequencies should be included in a predictive model. Any model will need to be refined and more details extracted. The reader is referred to the case study from the mixOmics folks which covers these tasks and explains the process.

This post was created using ChemoSpec version 6.1.3 and ChemoSpecUtils version 1.0.0.

References

Blaise, Benjamin J., Gonçalo D. S. Correia, Gordon A. Haggart, Izabella Surowiec, Caroline Sands, Matthew R. Lewis, Jake T. M. Pearce, et al. 2021. “Statistical Analysis in Metabolic Phenotyping.” Nature Protocols 16: 4299–4326. https://doi.org/10.1038/s41596-021-00579-1.

Rohart, Florian, Benoît Gautier, Amrit Singh, and Kim-Anh Le Cao. 2017. “mixOmics: An R Package for ’Omics Feature Selection and Multiple Data Integration.” PLoS Computational Biology 13 (11): e1005752. http://www.mixOmics.org.

Rohart, Florian, Kim-Anh Le Cao, and Christine Wells. 2018. bootsPLS: Bootstrap Subsamplings of Sparse Partial Least Squares - Discriminant Analysis for Classification and Signature Identification. https://CRAN.R-project.org/package=bootsPLS.

Reuse

CC BY 4.0

Citation

BibTeX citation:

@online{hanson2022,
  author = {Hanson, Bryan},
  title = {Metabolic {Phenotyping} {Protocol} {Part} 3},
  date = {2022-05-01},
  url = {http://chemospec.org/posts/2022-05-01-Protocol-Pt3/2022-05-01-Protocol-Pt3.html},
  langid = {en}
}

For attribution, please cite this work as:

Hanson, Bryan. 2022. “Metabolic Phenotyping Protocol Part 3.” May 1, 2022. http://chemospec.org/posts/2022-05-01-Protocol-Pt3/2022-05-01-Protocol-Pt3.html.

The Bitwise Operators in `C`