Scientific Computing in Rust

While getting my degree in Physics, I had to take classes in both MatLab and Python for scientific computing. I preferred python, where we used the SciPy and NumPy packages. In fact, I used those packages again (along with matplotlib) in an undergraduate research project simulating bacteria films. There's a catch: I was also pursuing a degree in Computer Science, and Python just wasn't fast enough for that side of me, so, during my free time in graduate school, I rewrote my biofilm simulation in my new favorite language, Rust.

First problem: I needed to replace jsonpickle for data serialization. This problem was easy, as the serde crate is an amazing replacement (I even transitioned from JSON to Rusty Object Notation, RON). Second problem: Should I try to get matplotlib bindings in Rust or should I use a more Rust-y plotting library? I decided on the latter, finding plotters to be a suitable replacement for matplotlib. Last problem: How do I replace the differential equation solvers of SciPy? I ended up writing a custom predictor-corrector solver which worked well enough, giving identical simulation results while being five hundred times faster than my python code at just the simulation step, rendering not included (plotting directly to an RGB buffer then using libx264 bindings to generate raw H.264 frames that mpv can understand also gave a similar speedup to the rendering step over writing out every frame as a png file in matplotlib and using a shell script to ffmpeg the outputs together). Full disclosure, a large part of this speedup probably comes from my derivative function being not-python, rather than my implementation of a differential equation solver over SciPy's. Furthermore, I only breached the five hundred barrier when switching to using a QuadTree to find the nearest neighbors of a bacteria, before that the speed up was more in the three hundreds. This port of my old project can be found here if you care for some reason.

Introducing bacon-sci

Inspired, I decided to start on a full SciPy replacement in rust as I could not find one I liked on crates.io (I later found the peroxide crate). I even had a clever name: bacon, named after Francis Bacon and the pork based food; however, bacon is already a crate, so I went for the next best thing: bacon-sci.

I started out with the easiest part. SciPy has a set of scientific constants from the NIST. I just took the list of important constants from SciPy and put it in a module. In addition, I took NIST's CODATA and put it into a global map that corresponds a String of the constant's name to a triplet: an f64 of its value, an f64 of its uncertainty, and a String of its units. How do you make a static map? I just used the lazy_static crate.

Next, I returned to the area that inspired me: initial value problems. After that, I tackled root finding and polynomials. Lastly, as of now, I implemented a few special polynomials and numeric differentiation. Armed with a copy of Burden and Faires's "Numerical Analysis (8th ed.)", Wikipedia, and SciPy source code, I went to work.

The first decision I made was to use a linear algebra crate, specifically nalgebra. I have written many vector and matrix implementations in the past, and I didn't feel like doing it this time. Besides, nalgebra is probably faster than anything I can write.

The next decision was using the alga crate to handle generics of complex types and real types. I wanted all algorithms that could work on complex numbers to be able to, so most of the functions in bacon-sci work on N: ComplexField. If an algorithm requires a parameter to be well-ordered (like time), then it is a N: RealField. In cases where a function takes both, the generic parameter is a N: ComplexField with the real parameters being N::RealField, which is the real "backing type" of the possibly complex field. Some functions require complex arithmetic, so they automatically upgrade from N: ComplexField to Complex<N::RealField> from num_complex.

Interestingly, there are cases where I wanted to automatically downgrade from definitely complex back to "maybe complex", which was more difficult than the other way. To upgrade, all I needed was Complex::<N::RealField>::new(z.real(), z.imaginary()), where z is a possibly complex value, with real types giving zero on the imaginary component. To go the other way, I decided to basically take the real component and ignore the imaginary component. To do this from a N: ComplexField I had to check if N::RealField == N to see if N was real or complex. To do this, I used TypeId::of::<N::RealField>() == TypeId::of::<N>(). If N was real, converting from complex was easy: use ComplexField::from_real on the real component, ignoring the imaginary. On the other hand, if N was complex, the conversion needed to preserve the imaginary component. ComplexField has no from_imaginary, so I directly implemented \( a + bi \) with ComplexField::from_real(z.re) + (-ComplexField::one()).sqrt() * ComplexField::from_real(z.im), with z being the definitely complex value.

Euler Method

A differential equation is an equation describing a thing in terms of how that thing changes. For those unfamiliar, this is a good introduction. In order to solve a differential equation, you need some conditions. These come in two flavors: boundary conditions and initial values, with both names being self-describing. My second task on bacon-sci was to implement algorithms that solve initial value differential equations, referred to as initial value problems or ivp in the code.

Abstractly, the system of differential equations can be written, with an arrow representing a vector quantity, as \[ \frac{\mathrm{d}\vec{y}}{\mathrm{d}t} = f(t, \vec{y}) . \] This is a first-order system of differential equations since only the first derivative is present. Generally, many higher-order equations can be reduced to first-order by thinking of each derivative as a new variable. The initial condition is then represented as \[ \vec{y}(t_0) = \vec{y}_0 \]

These equations immediatly lead to the first algorithm for solving initial value problems, Euler's method. Consider this question: given a timestep \( \Delta t \), what is the corresponding \( \Delta \vec{y} \)? We can approximate \( \Delta \vec{y} / \Delta t \approx \mathrm{d}\vec{y}/\mathrm{d}t = f(t, \vec{y}) \), (that is, use the tangent line to estimate the function), leading to \( \Delta \vec{y} = f(t, \vec{y} )\Delta t \). This gives us an iterative algorithm, \( \vec{y}_{i + 1} = \vec{y}_{i} + f(t, \vec{y})\Delta t \). This is Euler's method. From the starting conditions, pick a timestep and iterate until you reach the end.

So how good is Euler's method? Not very good. As an example, imagine solving a mechanics differential equation with positions and velocities. Conservation of energy says that energy should be conserved. Will Euler's method conserve energy? In general, it will not. Energy in Euler's method tends to explode. One solution to this is to use a slightly different version of the algorithm known as Euler-Cromer or the semi-implicit Euler's method. In this method, velocities are updated first, then positions are updated using the new velocities (in Euler's method, you'd use the old velocities). Euler-Cromer is much better, tending to conserve energy, especially in spring systems. According to Gaffer on Games, game physics tend to use the Euler-Cromer as it only requires calculating the derivative once but gives much better results than Euler's.

Runge-Kutta Methods

One problem with Euler's method is that the error in each step is linear in the step size. We can do better. To start with, remember Taylor series from basic calculus. That is, remember you can approximate a function of one variable like \[ f(x) = f(x_0) + f'(x_0) (x - x_0) + \frac{f''(x_0)}{2} {(x - x_0)}^2 + \ldots . \] Similarly, we can approximate our derivative function from the last section with \[ f(t, \vec{y}) = f(t_0, \vec{y}_0) + \left((t - t_0)\frac{\partial f}{\partial t}(t_0, \vec{y}_0) +(\vec{y} - \vec{y}_0)\frac{\partial f}{\partial \vec{y}}(t_0, \vec{y}_0)\right) + \ldots , \] where the powers of our vectors are done element-wise. Additionally, Taylor's Theorem bounds the error via some small parameters \(\xi, \mu\). In this light, we can say Euler's method is a zeroth order approximation, which is why the error is linear. You can use Taylor polynomials for better IVP solvers, but this requries information about the derivatives of \(f\). Instead, Runge-Kutta methods allow for tighter error bounds without using the derivative.

To see an example of this, I will derivethe midpoint method here. Imagine for a moment that instead of taking a full timestep, you take a partial timestep and use \(f\) at that point to approximate the second Taylor polynomial. That is, choose a \(a_1, \alpha_1, \beta_1\) so that \(a_1 f(t + \alpha_1, \vec{y} + \beta_1)\) approximates \[ T^{(2)}(t, \vec{y}) = f(t, \vec{y}) + \frac{h}{2}f'(t, \vec{y}), \] where \(h\) is the timestep, with quadratic error in the timestep. Note that \(f'(t, y) = \partial f/\partial t(t, \vec{y}) + \partial f/\partial \vec{y}(t, \vec{y}) \vec{y}'(t) \). Thus, \[ T^{(2)} = f(t, \vec{y}) + \frac{h}{2}\frac{\partial f}{\partial t}f(t, \vec{y}) + \frac{h}{2}\frac{\partial f}{\partial \vec{y}}(t, \vec{y}) f(t, \vec{y}) . \] Expanding \(a_1 f(t + \alpha_1, \vec{y} + \beta_1)\) via Taylor series gives \[ a_1 f(t, \vec{y}) + a_1 \alpha_1 \frac{\partial f}{\partial t}(t, \vec{y}) + a_1\beta_1 \frac{\partial f}{\partial y}(t, y) . \] Thus, \(a_1 = 1\), \(a_1 \alpha_1 = h/2\), and \(a_1 \beta_1 = f(t, \vec{y}) h / 2\). This uniquely determines the coefficients, giving rise to the midpoint method: \[ \vec{y}_{i + 1} = \vec{y}_i + hf\left(t_i + \frac{h}{2}, \vec{y}_i + \frac{h}{2}f\left(t_i, \vec{y}_i\right)\right) . \]

This process can be repeated for higher order polynomials. The classic Runge-Kutta method is order 4, and is used as follows: \[ \vec{k}_1 = hf(t_i, \vec{y}_i), \] \[ \vec{k}_2 = hf\left(t_i + \frac{h}{2}, \vec{y}_i + \frac{1}{2}\vec{k}_1\right), \] \[ \vec{k}_3 = hf\left(t_i + \frac{h}{2}, \vec{y}_i + \frac{1}{2}\vec{k}_2\right), \] \[ \vec{k}_4 = hf(t_i + h, \vec{y}_i + \vec{k}_3), \] \[ \vec{y}_{i+1} = \vec{y}_i + \frac{1}{6}(\vec{k}_1 + 2\vec{k}_2 + 2\vec{k}_3 + \vec{k}_4). \] This is almost what is in bacon-sci. Note that Runge-Kutta methods can be described by a table of the coefficients of \(h\) when added to \(t_i\) for the various intermediate steps, the coefficients of each previous intermediate step for an intermediate step, and the final weighted average coefficients.

Adam-Bashforth Methods

There's a problem with all the IVP solvers discussed so far. The algorithm may take steps between two solution points, but it never remembers those steps. In other words, we throw away a lot of useful information. Multistep methods solve this problem by retaining information about previous steps. In an equation, you can characterize an \(m\) step method as \[ \vec{y}_{i+1} = \sum_{j=0}^{m-1} a_{m-1-j}\vec{y}_{i-j} + h\sum_{j=0}^{m} b_j f(t_{i+1-m + j}). \] Notice that some methods can have a point depend on the derivative at that point. These are implicit methods.

Adam-Bashforth methods are explicit methods that use Newton's backward difference formula to find the coefficients. bacon-sci almost implements a fifth order Adams-Bashforth method.

Adaptive Step Size Methods

Still, all of these methods have a problem. The step size, \(h\), is the same for complicated interesting parts of the solution and boring flat parts! Ideally, the step size will be small in the complex bits and large in the flat bits to minimize computation time. Enter adaptive step size methods!

The main idea of adaptive step sizes is to have two estimates of the next step, one more accurate and one less accurate. Let the norm of the difference between the two be the error. If the error is small, then increase the step size. If the error is large, try the step again with a smaller step size. A simple algorithm for any of the above solvers is to solve the next step twice, once with the full step size and once with half the step size. Then, if the error is too small, double the step size. If the error is too big, half the step size. This algorithm works, but is computationally expensive, and smaller step sizes may produce worse results due to round-off error. Better methods exist.

A good way to estimate error for Runge-Kutta methods is to use two orders of methods, taking the value from the higher order method. The classic example of this is the Runge-Kutta-Fehlberg method, which uses an order 4 and order 5 method. The best part about this technique is that the fourth and fifth order methods share some intermediate steps, so only 6 intermediate steps need to be found!

For the multistep solvers, a good strategy is to make a prediction of the next step using an explicit method. Then, take the prediction and use it in an implicit method of the same order. These techniques are known as predictor-corrector methods.

Finally, I can talk about the implementation of bacon-sci. In my crate, I have a trait IVPSolver that defines everything I want an IVP solver to be. I have one direct implementation, Euler. On top of that, I have a struct AdamsInfo and struct RKInfo that implement IVPSolver for general predictor-corrector methods and adaptive runge-kutta methods respectively. Then, I have traits RungeKuttaSolver and AdamsSolver that specify which coefficients are needed and how to interact with RKInfo and AdamsInfo. I have specified the Runge-Kutta-Fehlberg method and a fifth order predictor-corrector method within bacon-sci for the end user. They are built using builder functions after a new call.

Why do I need the *Info structs? Why can't I just impl<T: RungeKuttaSolver> IVPSolver for T, so runge-kutta implementors only need to define the coefficients? Well, doing that wildcard implementation conflicts with Euler. Rust isn't smart enough to know that since my crate has control over both Euler and RungeKuttaSolver, downstream crates can not possible implement RungeKuttaSolver for Euler. Furthermore, the wildcard implementations would conflict on possible downstream types implementing RungeKuttaSolver and AdamsSolver.

What else is there for bacon-sci? Well, there exist methods known an backwards differentiation formulas which are very useful for differential equations the aforementioned methods fail at; however, they are implicit methods. How does one solve an implicit equation? Well, stay tuned!

Solving Initial Value Problems with bacon-sci

The theory is all well and good, but how do you solve an initial value problem with bacon-sci? There are currently seven implemented solvers in the library: RK45, RK23, Adams, Adams2, BDF, BDF2, and Euler. As mentioned previously, IVPSolver is a trait, so all of these solvers have a shared interface. For this example, I'll solve a one-dimensional problem with \(y(0) = 0\) using RK45. The derivative function (that is, \(f(t, y)\) is of the form: fn deriv<T>(t: f64, y: &[f64], params: &mut T) -> Result<VectorN<f64, U1>, String> { ... }. Here, I am using f64 for both t and y, but f32 would work as well (you can even have y be a Complex<{float}> type with t being the corresponding real float type). In this case, you'd solve the initial value problem in a manner such as:

fn solve() -> Result<(), String> {
    let mut rk = RK45::new()
        .with_dt_min(0.01)?
        .with_dt_max(0.1)?
        .with_tolerance(1e-3)?
        .with_start(0.0)?
        .with_end(1.0)?
        .with_initial_conditions(&[1.0])?
        .build();
    let path = rk.solve_ivp(deriv, &mut ());

    Ok(())
}

bacon-sci also has a general-purpose function solve_ivp which tries to solve an initial value problem with a fourth-order Adams predictor-corrector, then the Runge-Kutta-Fehlberg method, and finally with adaptive BDF6. This is the function you'll probably want to use.

Root finding

After initial value problems, the next area of numerical analysis I tackled is root finding. What is root finding? It's finding the solutions of \[f(x) = 0, \] for an arbitrary function. In other words, you are finding where a function becomes zero. There's a related problem of finding stationary points, that is the solutions of \[f(x) = x,\] for an arbitrary function. Problems can be easily translated between the two domains; however, in bacon-sci I have only implemented algorithms for the first problem, root finding.

What are the algorithms bacon-sci has? I have currently implemented the bisection method, Newton's method, the secant method, and Müller's method for polynomials.

Bisection Method

The bisection method is the easiest to understand. It requires two starting points, \(a\) and \(b\), such that the sign of \(f(a)\) and the sign of \(f(b)\) are different (if one is zero then you found a root!). Then, the algorithm performs a binary search, dividing the interval \([a,b]\) in half. At the midpoint, the sign of the function is either zero, meaning success, the same as at \(a\) or the same as at \(b\). Thusly, you either get a result or have one new interval with differing signs at the end points. This is done recursively until an answer is found.

What are the problems with this method? Firstly, you need two starting points with differing signs. Next, this only works with single dimensional real numbers. Other algorithms described here work with complex vector functions. On the bright side, the initial two guesses can be any distance from a root of the function without the algorithm failing. I should note here that there's a closely related algorithm called the method of false position. In this algorithm, instead of the midpoint, you choose the dividing line between two subintervals to be where the line from \(f(a)\) to \(f(b)\) crosses the x axis.

The bisection method is implemented as bacon_sci::roots::bisection.

Newton's Method

Newton's method is a more sophisticated method of root finding. To derive it, first assume that \(f\) is continuous and doubly differentiable on \([a,b]\). Now, give a starting guess for the root, \(p_0\). Then, we can expand \(f\) as a Taylor series centered at \(p_0\): \[ f(p) = f(p_0) + (p - p_0)f'(p_0) + \ldots . \] Ignoring the higher order terms, \[ f(p) = f(p_0) + (p - p_0)f'(p_0) . \] Now, let \(p\) be the true root of the function. Thus, \[ 0 = f(p_0) + (p - p_0)f'(p_0) , \] \[ 0 = f(p_0) + pf'(p_0) - p_0f'(p_0), \] \[ pf'(p_0) = p_0 f'(p_0) - f(p_0), \] \[ p = p_0 - \frac{f(p_0)}{f'(p_0)}. \] Since we ignored higher order terms, this is only an approximation of the true root \(p\). However, this is a better approximation than the starting one!

Iterating this equation over and over again until the guess produces a zero of the function is Newton's method. What are the problems with it? Firstly, there's the obvious problem that you need the derivative of the function, which may not be available. Secondly, the procedure fails if the derivative is ever zero. Thirdly, the process fails if the initial guess is far from a root (Taylor series need only be accurate around the point of expansion).

Newton's method is implemented as bacon_sci::roots::newton, with a specialized version for polynomials as bacon_sci::roots::newton_polynomial.

Secant Method

The secant method is an adaptation of Newton's method to remove the need for an explicit derivative. To derive it, imagine calculating \(p_{n+1}\), the \(n+1\)th iteration of Newton's method. You would need the derivative at \(p_n\), but the derivative at that point is just the slope of the tangent line at that point! This can be approximated by a secant line from the point and a nearby point. Luckily, we have a nearby point: \(p_{n-1}\). Thus, \[ f'(p_n) \approx \frac{f(p_n) - f(p_{n-1})}{p_n - p_{n-1}} . \] Putting this into Newton's method, \[ p_{n+1} = p_n + \frac{f(p_n)(p_n - p_{n-1})}{f(p_n) - f(p_{n-1})} . \] This is the secant method.

The secant method has an obvious advantage over Newton's method. It also has an obvious trade-off: you need two starting points instead of one. There is also a trade-off that the secant method shares with Newton's method: the roots aren't bracketed. What does this mean? In the bisection method, we always had an interval within which there was sure to be a root. The root we were approaching always was within this interval. The successive iterations of Newton's method and the secant method do not bracket a root like this. There is a second method of false position which modifies the secant method to be bracketed, but bacon-sci does not implement either version.

The secant method is implemented as bacon_sci::roots::secant.

Müller's Method

Müller's method is related to the secant method. While the secant method uses a line to approximate the derivative, Müller's method uses a parabola. Currently, bacon-sci only implements Müller's method for polynomials.

This concludes a tour of the root finding available in bacon-sci. I think these methods are rather out of date, especially for polynomial root finding. In the case of polynomials, I solve for linear and quadratic roots exactly, but all higher order polynomials of order use something called Laguerre's method. It's a specialized method that pretty much always returns a root of a polynomial, no matter the initial guess, so I start with an initial guess of zero and find a root. Then, I divide out the root from the polynomial and find another root. So on and so forth. This dividing process is known as deflation. Then, going up the recursion tree, I use Newton's method to remove some error introduced by floating points in that the root of the lower order polynomials may not exactly match the roots of the higher polynomials.

Müller's method for polynomials is implemented as bacon_sci::roots::muller_polynomial.

Polynomials

If you've made it this far, I assume you know what polynomials are. bacon-sci provides a struct Polynomial<N: ComplexField> for polynomials. How do I represent them? In the most straightforward way: a list of coefficients. In this form, adding and subtracting polynomials is easy. Furthermore, it makes it simple to do polynomial long division, which is always slow. Finally, it allows for a way to evaluate polynomials at a value that is more numerically stable than the straight forward way. This evaluation method is known as Horner's method or synthetic division. Consider \[ P(x) = a + bx + cx^2 + dx^3 . \] You can evaluate it in this order: \[ P(x) = a + x \cdot (b + x\cdot (c + x\cdot (d))) . \] That is Horner's method, It only requires \(n\) additions and multiplications, where \(n\) is the order of the polynomial. As a bonus, it is numerically more stable. Let \(b_k = a_k + b_{k+1}x_0\) for \(k = n-1,n-2,\ldots, 0\) with \(b_n = a_n\) (\(a_k\) is the \(k\)th coefficient of the polynomial). This is just formalizing Horner's method at \(x_0\): every \(b\) is a step in the process. This means that \(b_0 = P(x_0)\). Now, if you make a polynomial \[ Q(x) = b_n x^{n-1} + b_{n-1} x^{n-2} + \ldots + b_1 , \] then \[ P(x) = (x - x_0)Q(x) + b_0 . \] This is where the name synthetic division comes from: you are dividing a polynomial by a linear factor and getting a quotient and a remainder.

Notice what differentiating does to this expression: \[ P'(x) = Q(x) + (x - x_0)Q'(x), \] so \[ P'(x_0) = Q(x_0). \] This allows an optimization: the derivative and polynomial value at a point can be computed at the same time, first getting the \(b_k\) then performing Horner's method on these to evaluate \(Q\) in the same loop. This makes Newton's method for polynomials especially efficient.

So you've seen the benefits of the coefficient representation. What are the downsides? Well, it can be much bigger than factored representations. Think about \({(x + 2)}^{100}\) in coefficient form. Another downside is that naive multiplication is \(O(n^2)\), assuming we're multiplying two polynomials of order \(n\). We can do better. Enter the point form of polynomials. You may know this, but \(n\) points on a polynomial completely determines the polynomial, assuming the polynomial degree is bounded to \(n\). Thus, you can represent a polynomial of degree bound \(n\) by its evaluation at \(n\) preset points, \(x_k\). To multiply in point form, we have \(C(x) = A(x)B(x) \implies C(x_k) = A(x_k)B(x_k)\). This is \(O(n)\)!

So, a better algorithm for multiplying two order \(n\) polynomials is to change them to point form of order \(2n\) (the maximum order their product can be), do the \(O(n)\) multiplication, then convert back. Sadly, to evaluate \(n\) points with Horner's method, which is \(O(n)\), is \(O(n^2)\). Luckily, one can choose special points to evaluate the polynomial at: the \(nth\) roots of unity. What are those? They are the solutions to the equation \(\omega^n = 1\). For example, the second roots of unity are \(\pm 1\), and the fourth are \(\pm 1,\pm i\). These roots of unity have cyclic properties that makes them useful as points. Using Euler's formula, the roots of unity can easily be calculated. Furthermore, they can be entirely generated in a sequence from a "first" root of unity. Other circle-y properties give rise to the fast Fourier transform algorithm. Technically, this algorithm computes a discrete Fourier transform; however, it can be used to transform the coefficient representation to the point representation in \(O(n \log n)\) time. Thus, we get \(O(n \log n)\) polynomial multiplication! bacon-sci uses the classic fft algorithm which requires the order to be a power of two, so I pad the coefficient representation to the nearest power of two that is larger than \(2n\). Thus, the multiplication requires a tolerance to get rid of leading zeros in the result. I store this tolerance within the polynomial. As a caveat, the roots of unity are complex by nature, so I do the automatic "upgrading" process from a possibly complex type to a complex type I mentioned earlier for the fft, and I do the "downgrading" process mentioned earlier for the reverse fft. Finally, I special cased multiplying by a constant polynomial or a linear polynomial to be \(O(n)\). This is because these operations are common in other algorithms and thus must be fast.

Special Polynomials

I wanted some special functions in bacon-sci. For the first pass, I just have some special polynomials.

To get to why these polynomials are special, we have to talk about the inner product. We can define a inner products between two functions from a set of functions in relation to a weighting function \(w(x)\) on an interval \([a,b]\) with \[ {\langle f, g\rangle}_w = \int_a^b f(x) g(x) w(x) \,\mathrm{d} x . \] \(a\) can be \(-\infty\), \(b\) can be \(\infty\), but the inner product between all functions in your set must be finite. Two functions are orthogonal if their inner product is zero. This concept is a generalization of the dot product. The dot product is \[ \vec{v} \cdot \vec{w} = \sum_n v_n w_n . \] Since functions have a continuum basis, the sum becomes an integral. Furthermore, we can throw in a weight function because the integral will still follow the rules from the dot product with it in there (think about how you can define norms in space not using the regular Pythagorean formula, like the taxi cab norm).

Classic orthogonal polynomials are sets of polynomials defined as orthogonal in this sense. Every polynomial in the set must have a different degree, and every degree must be present. Thus, the polynomials from the set can be labeled with a subscript of their degree. Any two different polynomials from the set must be orthogonal with relation to the weighting function on a specified interval. These orthogonal polynomials are the complete orthogonal basis (but not necessarily orthonormal) for the space of functions on some definite interval.

The first set of orthogonal polynomials are the Legendre polynomials. For this set, the weighting function is \(w(x) = 1\) and the interval is \([-1,1]\). They can be defined with: \[ P_0 = 1, \] \[ P_1 = x, \] \[ (n+1)P_{n+1} = (2n+1) x P_n - n P_{n-1} . \]

The next set of orthogonal polynomials are the Hermite polynomials. For this set, the interval is \((-\infty,\infty)\), for which the most natural weighting function is \(\exp(-x^2)\). There are actually two versions of Hermite polynomials. bacon-sci implements the so-called physicist's Hermite polynomials instead of the statistician's Hermite polynomials. The kind implemented here can be defined with: \[ H_0 = 1, \] \[ H_1 = 2x, \] \[ H_{n+1} = 2 x H_n - 2n H_{n-1} . \]

The third set of orthogonal polynomials are the Laguerre polynomials. These are orthogonal on the interval \([0,\infty)\) with the most natural weighting function \(\exp(-x)\). Unlike the previous two, Laguerre polynomials can be found without a recurrence relation: \[ L_n = \sum_{k=0}^{n} \binom{n}{k} \frac{{(-1)}^k}{k!}x^k . \]

Finally, bacon-sci currently also implements Chebyshev polynomials, which are polynomials relating to trigonometric functions. Consider \(\cos n\theta\). This can be expanded into a polynomial of \(\cos \theta\). Replace \(\cos \theta\) with \(x\) and you have the \(n\)th degree Chebyshev polynomial of the first kind. These polynomials have the recurrence relation \[ T_0 = 1, \] \[ T_1 = x, \] \[ T_{n+1} = 2x T_n - T_{n-1} . \] Likewise, you can expand \(\sin n\theta\) as \(\sin \theta\) times a polynomial term in \(\cos \theta\). Take that polynomial term and replace \(\cos \theta\) with \(x\) to get Chebyshev polynomials of the second kind, also defined by the recurrence relation \[ U_0 = 1, \] \[ U_1 = 2x, \] \[ U_{n+1} = 2x U_n - U_{n-1} . \] Both kinds of Chebyshev polynomials are orthogonal polynomials on \([-1,1]\). The first kind polynomials have weight \(1/\sqrt{1 - x^2}\) and the second kind polynomials have weight \(\sqrt{1 - x^2}\).

All of these special polynomials can be found under bacon-sci::special.

Polynomial Interpolation

Sometimes, you want to define a function that passes through a set of points. A convenient class of functions to use are, of course, polynomials. The first problem of polynomial interpolation is then this: Given a list of points, find a polynomial that passes through all the points. This problem is analogous to approximating a function by taking several point values and interpolating with a polynomial.

The first polynomial that may come to mind is a Taylor series, but these series are actually not very good at this task. Taylor series concentrate their accuracy on a single point, so they can only be relied on for approximating a function at values close to that point. This does not jive with our problem statement. Instead, we use what are known as Lagrange polynomials. To understand these, consider the case where we have 2 points: \((x_0, y_0)\) and \((x_1, y_1)\) of some function \(f\). Then, define \[ L_0(x) = \frac{x - x_1}{x_0 - x_1} \] and \[ L_1(x) = \frac{x - x_0}{x_1 - x_0} . \] Now, we can define a polynomial \[ P(x) = L_0 (x) y_0 + L_(x) y_1 . \] Notice that \(L_0(x_0) = 1\), \(L_0(x_1) = 0\), \(L_1(x_0) = 0\), and \(L_1(x_1) = 1\). Thus, \(P(x_0) = y_0\) and \(P(x_1) = y_1\), exactly as desired. \(P\) is the unique linear interpolation of the function for these points! This is an order two Lagrange interpolation. In general, to interpolate with a polynomial of degree \(n\), \[ L_{n,k} = \frac{(x - x_0) \ldots (x - x_{k-1})(x - x_{k+1})\ldots (x - x_n)} {(x_k - x_0) \ldots (x_k - x_{k - 1})(x_k - x_{k+1}) \ldots (x_k - x_n)}, \] or \[ L_{n,k} = \prod_{i \neq k}^{n} \frac{x - x_i}{x_k - x_i} . \] Then, the Lagrange interpolating polynomial is \[ P(x) = \sum_{i=0}^{n} y_i L_{n,i}(x) . \]

There is a slight problem implementing the above definition in code: a lot of work needs to be redone. An order \(n\) interpolating polynomial can be built up gradually from lower orders by taking different points at a time. This work can be cached in a table to avoid re-doing things. Calculating the polynomial in this way is known as Neville's method and is what bacon-sci does.

Now, what if you have more information about \(f\)? For example, what if you have the derivative at all of your points as well? This is a generalization of Lagrange interpolation. For the case the first derivatives are known, the interpolating polynomial is called the Hermite interpolating polynomial. bacon-sci implements this, but it has stability issues for a large number of points, which isn't surprising. Think about high-order polynomials: they tend to oscillate a lot. This is similar to high-frequency oscillation in Fourier transforms, and it limits the usefulness of polynomial interpolation.

To get around high oscillation polynomials, we can define a piecewise interpolation of the function with a lower-order polynomial. Cubic polynomials are low order and they have four unknowns, which is enough to ensure continuity of the interpolation, smoothness (continuity of the derivatives), and end point behavior. Thus, we call the piecewise cubic interpolation of a function a cubic spline. There are two types of cubic splines, free and clamped. Free cubic splines are subject to the constraint that the endpoints have a second derivative of zero. Clamped cubic splines are subject to the constraint that the derivative at the endpoints match the function derivatives. Generally, a clamped cubic spline is more accurate. bacon-sci provides both types of interpolation.

Polynomial interpolation, including cubic spline interpolation, are implemented under bacon_sci::interp.

Numerical Differentiation

Sometimes, we want to numerically find a derivative of a function. Taking a look at the definition of a derivative, \[ f'(x_0) = \lim_{h \rightarrow 0} \frac{f(x_0 + h) - f(x_0)}{h}, \] there is a simple solution: just pick some small \(h\); however, this approach is not good because subtracting two close values is prone to round off error. Instead, we can estimate the derivative by doing Lagrange interpolation on several points. Using four interpolating points, we can get the so-called five point formula: \[ f'(x_0) \approx \frac{1}{12h} \left(f(x_0 - 2h) -8f(x_0 - h) + 8f(x_0 + h) - f(x_0 + 2h) \right) . \] We can also do higher derivatives this way. For example, \[ f''(x_0) \approx \frac{1}{h^2} \left(f(x_0 - h) - 2f(x_0) + f(x_0 + h)\right) \]

Numerical differentiation is implemented under bacon_sci::differentiate.

Numerical Integration

Sometimes, you want to integrate functions numerically. This is called numeric quadrature due to historical reasons. Technically, you could use the fundamental theorem of calculus and turn integration problems into differential equation problems, but integrals are a special case of differential equation, so you should use specialized algorithms.

To start off with, take a function like so:

function

To find the area under the curve, there are some simple solutions. Two easy ones are the right- and left-rectangle rules:

right-rectangle rule

left-rectangle rule

Mathematically, this becomes \[ \int_a^b f(x)\,\mathrm{d}x \approx (b - a)f(b) \] and \[ \int_a^b f(x)\,\mathrm{d}x \approx (b - a)f(a) . \]

A better approach is to use the trapezoidal rule:

trapezoidal rule

Mathematically, this is \[ \int_a^b f(x)\,\mathrm{d}x \approx \frac{b - a}{2}\left(f(a) + f(b)\right) . \] There is also Simpson's rule, which uses an approximating parabola instead.

An easy way to improve accuracy is to break the interval up into smaller subintervals, turning a rule into a composite rule like so:

composite trapezoidal rule

Mathematically, the composite trapezoidal rule is with step size \(h\) is: \[ \int_a^b f(x)\,\mathrm{d}x \approx \frac{h}{2}\left(f(a) + f(b)\right) + h\sum_1^{n-1} f(a + nh) . \] This takes the form because each intermediate step is part of two trapezoids, the left trapezoid and the right trapezoid, canceling out the half. This rule is easily interleaved to use in adaptive quadrature. To half the step size, you can half the current integral value and then add in points half way between each of the current points multiplied by the halved step size.

Another approach to adaptive quadrature is to use Simpson's rule. The range can be split into two, using Simpson's rule on both the full range and each half. If the desired error is not met, each side of the range can be further subdivided. This concentrates calculations on the interesting bits of the integrand. bacon-sci has a function implementing this adaptive Simpson's rule.

Generally, Simpson's rule is better than the trapezoidal rule; however, it turns out that the trapezoidal rule is particularly good when the integrand decays double exponentially towards the end points (that is, decays as fast as \(\exp(\exp(x))\)). Taking advantage of this, we can do a variable substitution to integrate over the interval \([-1, 1]\) (Any two-sided closed interval can be changed to this one using another simple change of variable). A good change of variable to choose is \[ x = \tanh\left(\frac{\pi}{2}\sinh t\right), \] using hyperbolic trig functions. This changes the interval to \((-\infty, \infty)\), pushing the end points out, so this gives an integration technique not sensitive to end point behavior. In the future, bacon-sci could use this to make integration functions where one or both endpoints have a singularity in the integrand. This change of variable changes the integral to \[ \int_{-1}^1 f(x)\,\mathrm{d}x \approx \sum_{-\infty}^\infty w_k f(x_k), \] where \[ x_k = \tanh\left(\frac{\pi}{2}\sinh kh \right) \] and \[ w_k = \frac{\frac{h}{2}\pi \cosh kh}{\cosh^2\left(\frac{\pi}{2}\sinh kh\right)} . \] This evaluates the transformed integral using the trapezoidal rule. Since the decay is double exponential, the limits on the sum can be made finite without much loss of precision. In bacon-sci, the sum is done from -3 to 3, which is correct within \(1 \times 10^{-12}\). For example, the transformed function from before looks like:

tanh-sinh integration

This method is called Tanh-Sinh integration. The weights and evaluation points are precomputed and stored in a lookup table. The trapezoidal rule is interlaced to compute the integral to a given precision. bacon-sci takes its implementation of Tanh-Sinh integration specifically from the quadrature crate, with attribution (quadrature is BSD licensed).

Tanh-Sinh quadrature is implemented as bacon_sci::integrate::integrate. Adaptive Simpson's rule is bacon_sci::integrate::integrate_simpson. There is a fixed integration technique called Romberg integration implemented as bacon_sci::integrate::integrate_fixed.

Gaussian Integration

In the previous section, all of the integration rules used equally-spaced points. What if you could choose points spaced optimally for the integrand? Well, Gaussian integration does just that: it optimally spaces points so that the highest degree polynomial possible as an integrand is perfectly calculated.

Much like the section on special polynomials, Gaussian integration rules are determined by an interval and a weighting function. In general, the question is, what are the optimal points to evaluate to numerically integrate: \[ \int_a^b f(x) w(x) \,\mathrm{d}x ? \] As it turns out, the optimal \(n\) points are the zeros of the \(nth\) classical orthogonal polynomial on the interval \([a,b]\) with weighting function \(w(x)\) as defined in the Special Polynomials section.

For general purpose integrals, the best interval is, like in Tanh-Sinh quadrature, \([-1,1]\). This becomes Gaussian-Legendre quadrature. You can also integrate over \([0,\infty)\) with Gaussian-Laguerre quadrature and \((-\infty,\infty)\) with Gaussian-Hermite quadrature using the most natural weighting functions. In addition, bacon-sci provides both kinds of Chebyshev-Guassian quadrature. All of these integration functions can be done within a specified tolerance. Furthermore, all of the weighting function and polynomial zero evaluations are done ahead of time and stored in a table at compile-time. This is cool, but the best general purpose integration scheme is still Tanh-Sinh quadrature.

All Gaussian integration techniques discussed here are implemented under bacon_sci::integrate.

What's Next?

There is much more to be added to bacon-sci. For example, many special functions can be added. Integration can be extended to integrate with end point singularities. Furthermore, I want to add statistics to bacon-sci. Distributions are already covered by rand_distr, but PDFs and CDFs can be added, as well as descriptive statistics. The next chapter in my numerical analysis book is optimization, so data fitting to a model using least-squares regression is to come. Lastly, of course, is optimizing what is currently in use. My integration functions are already fast, and my initial value problem solvers have benchmarked to be faster than both peroxide and SciPy. Maybe I'll have a follow up blog comparing performance.

Whole Wheat Rye Bread

I really like making bread, even though I'm not really supposed to be eating bread in my diet. Nevertheless, I persist in eating and making bread. This is a recipe I'm currently working on that came out with good results for a whole wheat rye bread.

Equipment

  • A large bowl
  • A wooden spoon
  • A dutch oven
  • A cooling rack
  • Possibly a whisk
  • Possibly a thermometer

Ingredients

  • 2 teaspoons instant yeast
  • 200 g whole wheat flour
  • 30 g bread flour
  • 150 g dark rye flour
  • 1.5 teaspoons table salt
  • 1 tablespoon sugar
  • 3 tablespoons of sesame seed
  • 80 g old-fashioned rolled oats
  • 60 g butter, melted
  • 300 g warm water

Instructions

  1. Combine everything except the butter and water in a large bowl. Whisk together.
  2. Add the melted butter and whisk in
  3. Add the warm water and stir in with the handle end of a wooden spoon (so the dough doesn't clump on the spoon end).
  4. Once you can't mix anymore, flour your hands and knead the dough right there in the bowl until smooth and supple, about 10 minutes.
  5. Cover loosely with plastic wrap and let the dough rise until doubled in size, about 1.5 to 2 hours.
  6. Punch the dough down and flip over in the bowl. Sprinkle the top with rolled oats and and gently press them into the top. Loosely cover again until doubled in size, 1 to 1.25 hours.
  7. Preheat your oven to 375°F with the dutch oven inside. Once hot, gently lay your dough ball inside. Cut two cuts into the top of the ball with a knife and sprinkle more sesame seeds on top. Throw around two handfuls of water into the dutch oven put on the lid to keep the steam in. Bake for 30 to 40 minutes or until the temperature is 190°F inside.
  8. Let cool completely on a rack before cutting in.

Notes

This recipe yields a fairly dense loaf. I am still testing ways to make it fluffier. The flavor is great. I'd take a photo but my phone camera is fuzzy.

Self-hosted git

Git was made to be decentralized. Why trust only gitlab or github with your code? It's actually extremely easy to set up your own git server, assuming you already have a VPS and domain name.

I set up read/write access to repos on my VPS using the standard git tutorial. Then, I went through my repos and git remote set-url origin <my-url>, adding github and gitlab remotes as well (so my code will be mirrored on my gitlab and my github). To make pushing commits to all my remotes easy, I added alias gua="git remote list | grep -v upstream | xargs -l git push" and alias gum="git remote list | grep -v upstream | xargs -I _ git push _ master" to my .zshrc file, allowing me to push the current branch or the master branch to all my remotes (except for upstream, which is usually not a repo I have write access to). To give people the ability to clone my repos directly from my server, I set up the git daemon with systemd. Now, I had read/write access to my repos, and the entire world can read my repos right from my new subdomain, git.aftix.xyz with the URL git://git.aftix.xyz/<repo>.git.

Now everyone can use my git server as I wanted them to. However, I also wanted to make my repos explorable in a browser over HTTPS. There are many solutions to this, such as cgit and git tea. These solutions are a bit much for me, as git tea has many features I don't need (e.g. user logins and pull requests), and cgit uses cgi. Since my repos are small, I chose to serve static project pages via nginx. This matches well with my blog, which is also serving static pages over nginx. So, I built and installed stagit, which generates static pages from git repos (this is what suckless uses to make their git pages). After creating some git hooks and scripts, I just had to change the style to be consistent with my blog. I'm pleased with the results, but I do wish stagit supported javascript plugins so I could use code syntax highlighting. I could make a shell script that goes through the html pages and uses sed to insert a script tag before the closing body tag, but that would probably be very slow to run. Regardless, my git trees are now browsable at https://git.aftix.xyz. The only thing left to do is to see if I can make the dwm repo show the my_dwm branch instead of master.

First Post

After bug testing housecat and setting a theme, this site is actually up and running! This is just a status post, to get real content out instead of the filler I had before.

The final layout has taken heavy inspiration from mort, but I did add flexboxes to the layout, as well as the little RSS icon next to the home tab when you're actually in the home section of the site. There are RSS feeds for the other sections since I chose to generate per-section feeds in housecat, but I hid those away since I assume no one will ever want to subscribe to anything other than the blog page. The color scheme I ended up going with is unique, but I think it's high enough contrast to be readable and not ugly enough to hurt eyes.

I don't know if I'll ever actually write more posts, but I might. At a later date I may need to use MathJAX and some code syntax highlighting plugin to help with making posts.