Skip to content

Latest commit

 

History

History
973 lines (847 loc) · 35.6 KB

kalman-folding-06-008-non-dimensionalization.org

File metadata and controls

973 lines (847 loc) · 35.6 KB

Kalman Folding 6: Dimensional Analysis (WORKING DRAFT)

1 Abstract

In Kalman Folding, Part 1,klfl we present basic, static Kalman filtering as a functional fold, highlighting the unique advantages of this form for deploying test-hardened code verbatim in harsh, mission-critical environments. In Kalman Folding 2: Tracking,klf2 we reproduce a tracking example from the literature, showing that these advantages extend to time-dependent, linear models. Here, we extend that example further to include aerodynamic drag, making the model nonlinear. We must change the Kalman filter itself to handle such problems. The resulting class of filters are called Extended Kalman Filters or EKFs. Other papers in this series feature applications of EKFs to a variety of problems including navigation and pursuit.

The particular EKF designed here includes integration of non-linear equations of motion. We integrate these equations by folding over a lazy stream that generates, on demand, differential updates to the solution. Folds over lazy streams were introduced in /Kalman Folding 4: Streams and Observables/klf4 where we used them to step a static Kalman filter over observations. They also afford a constant-memory representation for solutions of differential equations, making them suitable for integration components in constant-memory filters.

2 Kalman Folding in the Wolfram Language

In this series of papers, we use the Wolfram languagewolf because it excels at concise expression of mathematical code. All examples in these papers can be directly transcribed to any modern mainstream language that supports closures. For example, it is easy to write them in C++11 and beyond, Python, any modern Lisp, not to mention Haskell, Scala, Erlang, and OCaml. Many can be written without full closures; function pointers will suffice, so they are easy to write in C. It’s also not difficult to add extra arguments to simulate just enough closure-like support in C to write the rest of the examples in that language.

In Kalman Folding 2: Tracking,klf2 we found the following formulation for the accumulator function of a fold that implements the linear dynamic Kalman filter, that is, a filter that can track states that evolve with timetime according to a linear transformation.

\noindent where

\noindent and all quantities are matrices:

  • $\mathbold{z}$ is a ${b}×{1}$ column vector containing one multidimensional observation
  • $\mathbold{x}$ and $\mathbold{x}2$ are ${n}×{1}$ column vectors of model states
  • $\mathbold{Z}$ is a ${b}×{b}$ matrix, the covariance of observation noise
  • $\mathbold{P}$ and $\mathbold{P}_2$ are ${n}×{n}$ matrices, the theoretical covariances of $\mathbold{x}$ and $\mathbold{x}_2$, respectively
  • $\mathbold{A}$ is a ${b}×{n}$ matrix, the observation partials
  • $\mathbold{D}$ is a ${b}×{b}$ matrix, the Kalman denominator
  • $\mathbold{K}$ is an ${n}×{b}$ matrix, the Kalman gain
  • $\mathbold{Ξ}$ is a non-dimensionalized $n×{n}$ integral of process noise $\mathbold{ξ}$, namely \newline \(∫0δ t\mathbold{Φ}(τ)⋅{\left(\begin{array}{c|c}\mathbold{0}&\mathbold{0}\\hline\mathbold{0}&E\left[\mathbold{ξ}\mathbold{ξ}\intercal\right]\end{array}\right)⋅\mathbold{Φ}(τ)^\intercal\,\textrm{d}τ}\)
  • $\mathbold{Φ}$ is a non-dimensionalized $n×{n}$ propagator for $\mathbold{F}$, namely $e\mathbold{F\,{δ t}}$
  • $\mathbold{Γ}$ is an $n×{dim(\mathbold{u})}$ integral of system response, namely \(∫0δ t{\mathbold{Φ}(τ) ⋅ \mathbold{G}\,\textrm{d}τ}\)
  • $\mathbold{u}$ is a vector of external disturbances or control inputs
  • $δ{t}$ is an increment of time (or, more generally, the independent variable of the problem)

\noindent and the time-evolving states satisfy the following differential equation in state-space form:

\noindent $\mathbold{F}$, $\mathbold{G}$, and $\mathbold{u}$ may depend on time, but not on $\mathbold{x}$; that is the meaning of “linear dynamic” in this context. In this paper, we relieve those restrictions by explicitly integrating non-linear equations of motion and by using Taylor-series approximations for the gain $\mathbold{K}$ and denominator $\mathbold{D}$ matrices.

2.1 Dimensional Arguments

In physical or engineering applications, these quantities carry physical dimensions of units of measure in addition to their matrix dimensions as numbers of rows and columns. Both kinds of dimensions are aspects of the type of a quantity. Dimensional arguments, like type-arguments more generally, are invaluable for checking equations.

If the physical and matrix dimensions of $\mathbold{x}$ are $\left[\left[\mathbold{x}\right]\right] \stackrel{\text{\tiny def}}{=} (\mathcal{X}, n×{1})$, of $\mathbold{z}$ are $\left[\left[\mathbold{z}\right]\right] \stackrel{\text{\tiny def}}{=} (\mathcal{Z}, b×{1})$, of $δ{t}$ are $\left[\left[δ{t}\right]\right] \stackrel{\text{\tiny def}}{=} (\mathcal{T}, \textrm{scalar})$, and of $\mathbold{u}$ are $\left[\left[\mathbold{u}\right]\right] \stackrel{\text{\tiny def}}{=} (\mathcal{U}, dim(\mathbold{u})×{1})$, then

The non-dimensionalization of $\mathbold{F}$ and $\mathbold{Ξ}$ requires careful analysis and is the subject of another paper in this series. For now, assume they have been implicitly non-dimensionalized.

\noindent In all examples in this paper, the observations $\mathbold{z}$ are $1×{1}$ matrices, equivalent to scalars, so $b=1$, but the theory and code carry over to multi-dimensional vector observations.

3 Tracking with Drag

Let us reproduce Zarchan and Musoff’szarc example of tracking a falling object in one position dimension, with aerodynamic drag. Throughout this series of papers, we refer to the examples in that reference because they provide solid benchmarks against which to check our contribution. The difference between our approach and typical presentations of Kalman-type filters is functional decomposition: writing code in functional style affords the ability to deploy it verbatim, even and especially at the binary level, in both laboratory and field. This ability can make the difference in a successful application because seemingly insignificant changes, even instruction order, can make qualitative differences in filter behavior due to floating-point issues.

To accommodate nonlinearity, we replace equation \ref{eqn:state-propagation-equation} for time-propagation of the state $\mathbold{x}$ with explicit numerical integration of the nonlinear equations of motion. We use an internal fold over a lazy stream of differential updates to the state, a kind of fold introduced in part 4 of this series.klf4 This form runs in constant memory and allows easy change of the integrator, say from Euler to Runge-Kutta.

3.1 Equations of Motion

To establish a benchmark solution, we solve the differential equations of motion using Wolfram’s built-in numerical integrator. We then introduce our own Euler and Runge-Kutta integrators and show they produce similar results when folded over lazy streams. These integrators do not use special features of the Wolfram language, so they are easy to implement in other languages.

Let $h(t)$ be the height of the falling object, and let the state vector $\mathbold{x}(t)$ contain $h(t)$ and its first derivative, $\dot{h}(t)$, the speed of descent.

Omitting, for clarity’s sake, explicit dependence of $h$ and $\dot{h}$ on time, the equations of motion are elementary:

\noindent where

  • $g$ is the acceleration of Earth’s gravitation, about $32.2\,\textrm{ft}/{\textrm{s}}^2$
  • $ρ(h)$ is the density of airzerr in $\textrm{slug}/{\textrm{ft}}^2$; $ρ\,{{\dot{h}}^2}$ has units of pressure, that is, $\textrm{slug}/(\textrm{ft}⋅{\textrm{sec}^2})$
  • $β = 500\,\textrm{slug}/(\textrm{ft}⋅{\textrm{sec}^2})$ is a constant ballistic coefficient of the object in units of pressure (it is possible to estimate this coefficient in the filter; here, we treat it as a known constant).

The positive direction is up and we are only concerned with negative velocities where the object is approaching the ground. We may provisionally replace the factor $\textrm{sign}({\dot{h}})$ with -1 and keep our eye out for improper positive speeds.

In scalar form, the equations are

\noindent or

\noindent with $k=22,000\,\left[\textrm{ft}\right]$, the e-folding height of the atmosphere, and \(A=0.0034\,[\textrm{slug}/{{\textrm{ft}}^3}]\) for the density of air at $h=0$.

3.2 Built-in Solver

We integrate these equations for thirty seconds with the initial conditions $h(0)=200,000\,[ft]$, ${\dot{h}}=-6,000\,[ft/s^2]$ with Wolfram’s built-in NDSolve, the numerical integrator for differential equations, as follows:

NDSolveFallingWithDrag.png

\noindent producing the results in figure fig:ndsolve-falling-with-drag-results. These results are indistinguishable from those in the reference.

3.3 Stream Solver

We can write the same differential equation as a lazy stream, which uses only constant memory. Thus, it is suitable for the internals of a Kalman filter. We implement the integrator as an accumulator function for a foldStream from paper 3klf3 which produces all intermediate results as a new stream:

The simplest integrator is the Euler integrator, which updates a state with its derivative times a small interval of time:

This is a binary function that takes two compound arguments. The first is an instance of the accumulation type: a pair of a time t and a (usually compound) state x. The second is an element of the input stream, a triple of a time differential dt, the same time t that appears in the first argument, and a function Dx that computes the derivative of the state given the state and the time as Dx[x,t].

Folding this integrator over the streamed differential equation produces a streamed solution. The input stream must produce elements of the form {dt, t, Dx} and, like all streams, contain a thunk that produces the rest of the stream.thnk

This bit contains nothing specific to our example, but just threads around the integration inputs and increments time. It could be much more rich, manipulating dt and Dx for speed or numerics (adaptive integration).

The kernel of our differential equation is the derivative function Dx, which, for our example, is

\noindent Integrating the differential equation for thirty seconds looks like this:

The type of the result, here, is a lazy stream produced by takeUntil from the lazy stream produced by foldStream. Because these streams are lazy, nothing happens until we demand values for, say, plotting. The results are indistinguishable from those in figure fig:ndsolve-falling-with-drag-results.

The arguments of takeUntil are a stream and a predicate. The result is a new stream that pulls values from the original stream, applying the predicate until it produces True. At that point, the rest of the stream returned by takeUntil is empty, represented by invocation of the null thunk, Null[], The implementation of takeUntil is in three overloads:

Given an empty stream and any predicate, produce the empty stream:

Given a stream containing a value v and a tail thunk, return the empty stream if the predicate evaluates to True:

Otherwise, recurse by invoking the thunk in the stream:

3.4 What’s the Point?

The point of this style of integration is that we can change three aspects of the integration independently of one another, leaving the others verbatim, without even recompilation, because we have un-nested and /decomplected/hick these aspects:

  1. the integrator
  2. sophisticated adaptive treatments of the time increment dt and derivative function Dx
  3. the internals of the derivative function Dx

For example, should Euler integration prove inadequate, we can easily substitute second- or fourth-order Runge-Kutta integrators. The only requirement is that an integrator must match the integrator’s functional interface:

Decomplecting these bits also makes them easier to review and verify by hand because dependencies are lexically localized, making expressions smaller, easier to memorize and to find on a page.

3.5 Gain and Covariance Updates

For gains and covariances, we need the best linear approximation of the equations of motion so that we have an expression that structurally resembles equation \ref{eqn:state-space-form}. When there are no disturbances, $\mathbold{G}\,\mathbold{u}=\mathbold{0}$ and the solution of the linear equation $\mathbold{\dot{x}}=\mathbold{F}\,\mathbold{x}$ also satisfies $Δ\mathbold{\dot{x}}=\mathbold{F}\,Δ\mathbold{x}$ for small differences $Δ\mathbold{\dot{x}}$ and $Δ\mathbold{x}$. We seek a similar form for our nonlinear equations of motion because we can linearize them around small differences $Δ{h}$ and $Δ{\dot{h}}$:

\noindent Thus, with $k=22,000\,\left[\textrm{ft}\right]$, the e-folding height of the atmosphere, and \(A=0.0034\,[\textrm{slug}/{{\textrm{ft}}^3}]\) for the density of airzerr at $h=0$, our linearized system-dynamics matrix is

We need $\mathbold{Φ}=e\mathbold{Ft}$ to propagate solutions forward, because, if $\mathbold{\dot{x}}=\mathbold{F}\,\mathbold{x}$, then $e\mathbold{Ft}\,\mathbold{x}$(t) effects a Taylor series. To first order,

\noindent First-order expansions turn out to be enough, so we take $\mathbold{Φ}(δ{t})=\mathbold{1}+\mathbold{F}\,δ{t}$ for our propagator matrix.

We compute the gains and covariances as in equations \ref{eqn:covariance-propagation-equation}, \ref{eqn:kalman-gain-definition}, and \ref{eqn:kalman-denominator-definition}:

\noindent where $Ξ$, integral of the process noise, is

\noindent with matrix element $\mathbold{F}22$ evaluated at the current state $\mathbold{x}$.

4 Concluding Remarks

It’s easy to add system dynamics to a static Kalman filter. Expressed as the accumulator function for a fold, the filter is decoupled from the environment in which it runs. We can run exactly the same code, even and especially the same binary, over arrays in memory, lazy streams, asynchronous observables, any data source that can support a fold operator. Such flexibility of deployment allows us to address the difficult issues of modeling, statistics, and numerics in friendly environments where we have large memories and powerful debugging tools, then to deploy with confidence in unfriendly, real-world environments where we have small memories, asynchronous, real-time data delivery, and seldom more than logging for forensics.

affn https://en.wikipedia.org/wiki/Affine_transformation bars Bar-Shalom, Yaakov, et al. Estimation with applications to tracking and navigation. New York: Wiley, 2001. bier http://tinyurl.com/h3jh4kt bssl https://en.wikipedia.org/wiki/Bessel’s_correction busi https://en.wikipedia.org/wiki/Business_logic cdot We sometimes use the center dot or the $×$ symbols to clarify matrix multiplication. They have no other significance and we can always write matrix multiplication just by juxtaposing the matrices. clos https://en.wikipedia.org/wiki/Closure_(computer_programming) cold This convention only models so-called cold observables, but it’s enough to demonstrate Kalman’s working over them. cons This is quite similar to the standard — not Wolfram’s — definition of a list as a pair of a value and of another list. cova We use the terms covariance for matrices and variance for scalars. csoc https://en.wikipedia.org/wiki/Separation_of_concerns ctsc https://en.wikipedia.org/wiki/Catastrophic_cancellation dstr http://tinyurl.com/ze6qfb3 elib Brookner, Eli. Tracking and Kalman Filtering Made Easy, New York: Wiley, 1998. http://tinyurl.com/h8see8k fldl http://tinyurl.com/jmxsevr fwik https://en.wikipedia.org/wiki/Fold_%28higher-order_function%29 gama https://en.wikipedia.org/wiki/Gauss%E2%80%93Markov_theorem hick “Decomplecting” is a term coined by Rich Hickey for un-braiding and un-nesting bits of software. intr http://introtorx.com/ jplg JPL Geodynamics Program http://www.jpl.nasa.gov/report/1981.pdf just justified by the fact that $\mathbold{D}$ is a diagonal matrix that commutes with all other products, therefore its left and right inverses are equal and can be written as a reciprocal; in fact, $\mathbold{D}$ is a $1×{1}$ matrix — effectively a scalar — in all examples in this paper klde B. Beckman, Kalman Folding 3: Derivations, to appear. klf2 B. Beckman, Kalman Folding 2: Tracking, to appear. klf3 B. Beckman, Kalman Folding 3: Derivations, to appear. klf4 B. Beckman, Kalman Folding 3: Streams and Observable, to appear. klfl B. Beckman, Kalman Folding, Part 1, to appear. layi https://en.wikipedia.org/wiki/Fundamental_theorem_of_software_engineering lmbd Many languages use the keyword lambda for such expressions; Wolfram uses the name Function. lmlf https://en.wikipedia.org/wiki/Lambda_lifting lols Let Over Lambda lssq https://en.wikipedia.org/wiki/Least_squares ltis http://tinyurl.com/hhhcgca matt https://www.cs.kent.ac.uk/people/staff/dat/miranda/whyfp90.pdf mcmc https://en.wikipedia.org/wiki/Particle_filter musc http://www1.cs.dartmouth.edu/~doug/music.ps.gz ndim https://en.wikipedia.org/wiki/Nondimensionalization patt http://tinyurl.com/j5jzy69 pseu http://tinyurl.com/j8gvlug rasp http://www.wolfram.com/raspberry-pi/ rcrn https://en.wikipedia.org/wiki/Recurrence_relation rsfr http://rosettacode.org/wiki/Loops/Foreach rxbk http://www.introtorx.com/content/v1.0.10621.0/07_Aggregation.html scan and of Haskell’s scans and folds, and Rx’s scans and folds, etc. scla http://tinyurl.com/hhdot36 scnd A state-space form containing a position and derivative is commonplace in second-order dynamics like Newton’s Second Law. We usually employ state-space form to reduce \(n\)-th-order differential equations to first-order differential equations by stacking the dependent variable on $n-1$ of its derivatives in the state vector. scnl http://learnyouahaskell.com/higher-order-functions stsp https://en.wikipedia.org/wiki/State-space_representation thnk Wolfram’s ampersand postfix operator can covert its operand into a thunk. time In most applications, the independent variable is physical time, however, it need not be. For convenience, we use the term time to mean the independent variable of the problem simply because it is shorter to write. uncl The initial uncial (lower-case) letter signifies that we wrote this function; it wasn’t supplied by Wolfram. wfld http://reference.wolfram.com/language/ref/FoldList.html?q=FoldList wlf1 http://tinyurl.com/nfz9fyo wlf2 http://rebcabin.github.io/blog/2013/02/04/welfords-better-formula/ wolf http://reference.wolfram.com/language/ zarc Zarchan and Musoff, Fundamentals of Kalman Filtering, A Practical Approach, Fourth Edition, Ch. 4 zerr Zarchan and Musoff, on page 228, report $0.0034$ for the density of air in $\textrm{slug}/\textrm{ft}^3$ at the surface; we believe the correct value is about $0.00242$ but continue with $0.0034$ for comparison’s sake.