Automatically Computing the Derivatives of a Vector Field using Dual Numbers

Dual numbers are 2-dimensional numbers that can be regarded as an extension of the real numbers. Along with the complex and split-complex numbers, they are one of the three 2-dimensional unital associative algebras over the real numbers.

Each dual number can be described as a linear combination of the real unit 1 and the dual unit \varepsilon. Whereas the complex numbers feature an imaginary unit that squares to -1, the dual unit squares to zero, \varepsilon^2=0. This leads to the following algebraic operations.

We will write a dual number x+\varepsilon y as the pair (x,y).

Addition and subtraction are element-wise,
(x,y) \pm (a,b) = (x\pm a,y\pm b),
with identity (0,0).

Multiplication is
(x,y) (a,b) = (xa,ya+xb),
which is commutative and has identity (1,0).

Division (defined whenever a\neq 0) is therefore
\frac{(x,y)}{(a,b)}= \left(\frac{x}{a},\frac{ya-xb}{a^2}\right).
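This follows from the multiplication rule: writing (x,y) = (p,q)(a,b) = (pa,qa+pb) and solving for the quotient gives p = \frac{x}{a} and q = \frac{y-pb}{a} = \frac{ya-xb}{a^2}.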

From the multiplication rule we can also see that, for any positive integer n,
(x,y)^n = (x^n,nx^{n-1} y).
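To make these rules concrete, here is a minimal sketch of a dual-number type in C++; the struct Dual and its operators are illustrative names for this post, not taken from any particular library.

#include <iostream>

// A minimal dual number x + εy, stored as the pair (x, y).
struct Dual {
    double x; // real part
    double y; // dual part
};

// Addition and subtraction are element-wise, with identity (0, 0).
Dual operator+(Dual a, Dual b) { return {a.x + b.x, a.y + b.y}; }
Dual operator-(Dual a, Dual b) { return {a.x - b.x, a.y - b.y}; }

// (x,y)(a,b) = (xa, ya + xb), with identity (1, 0).
Dual operator*(Dual a, Dual b) { return {a.x * b.x, a.y * b.x + a.x * b.y}; }

// (x,y)/(a,b) = (x/a, (ya - xb)/a^2), defined when a != 0.
Dual operator/(Dual a, Dual b) {
    return {a.x / b.x, (a.y * b.x - a.x * b.y) / (b.x * b.x)};
}

int main() {
    Dual p{2.0, 3.0}, q{5.0, 7.0};
    Dual r = p * q;                      // (2*5, 3*5 + 2*7) = (10, 29)
    std::cout << r.x << " " << r.y << "\n";
}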

Extending Real Functions to the Dual Plane

We can use this algebraic structure to extend any real analytic function f:\mathbb R\to\mathbb R to the dual plane, \tilde f:\mathbb R^2\to\mathbb R^2, by taking its Taylor series and substituting a dual argument.

f(x) = \sum_{n=0}^\infty \frac{f^{(n)}(a)}{n!} (x-a)^n

\tilde f(x,y) = \sum_{n=0}^\infty \frac{f^{(n)}(a)}{n!} (x-a,y)^n

\tilde f(x,y) = \sum_{n=0}^\infty \frac{f^{(n)}(a)}{n!} \left((x-a)^n,n(x-a)^{n-1} y\right)

\tilde f(x,y) = \left(\sum_{n=0}^\infty \frac{f^{(n)}(a)}{n!}(x-a)^n,y\sum_{n=0}^\infty \frac{f^{(n)}(a)}{n!}n(x-a)^{n-1}\right)

\tilde f(x,y) = (f(x),yf'(x)).

We can use this rule to uniquely extend any differentiable function to the dual plane, or indeed any function differentiable on a subset of the real line to the corresponding subset of the dual plane. This extension is compatible with the composition of functions, i.e. the extension of a composition is the composition of the extensions,

\widetilde{f\circ g} = \tilde f\circ\tilde g.

This is useful as it means we can extend complicated functions by composing elementary functions.

f(x)      f'(x)       \tilde f(x,y)
x+c       1           (x+c,y)
cx        c           (cx,cy)
x^n       nx^{n-1}    (x^n,nx^{n-1}y)
e^x       e^x         (e^x,e^xy)
\log x    1/x         (\log x,y/x)
\sin x    \cos x      (\sin x,y\cos x)

For example, to extend \sin^2 x we extend \sin x to give (\sin x,y\cos x), then square the result to give (\sin^2 x,2y\sin x\cos x).

In other words, we don’t need to know the derivative of the complicated function in order to extend it; we only need to know the derivatives of the elementary functions of which it is composed.
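As a sketch of how these elementary extensions look in code, each overload below implements \tilde f(x,y) = (f(x),yf'(x)) from the table; the Dual struct from the earlier sketch is repeated so the snippet is self-contained.

#include <cmath>
#include <iostream>

struct Dual { double x, y; };

// (x,y)(a,b) = (xa, ya + xb), needed for squaring.
Dual operator*(Dual a, Dual b) { return {a.x * b.x, a.y * b.x + a.x * b.y}; }

// Elementary extensions from the table: \tilde f(x,y) = (f(x), y f'(x)).
Dual sin(Dual d) { return {std::sin(d.x), d.y * std::cos(d.x)}; }
Dual exp(Dual d) { return {std::exp(d.x), d.y * std::exp(d.x)}; }
Dual log(Dual d) { return {std::log(d.x), d.y / d.x}; }

int main() {
    // Extend sin^2 x by composition: extend sin x, then square the result.
    Dual d{0.5, 3.0};
    Dual s2 = sin(d) * sin(d);  // (sin^2 x, 2y sin x cos x) with x = 0.5, y = 3
    std::cout << s2.x << " " << s2.y << "\n";
}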

Automatic Differentiation

A real function’s derivative at a given point x\in\mathbb R can be obtained by applying its extension to the dual number (x,1) and then taking the dual part of the result,

\tilde f(x,1) = (f(x),f'(x)).

This is called automatic differentiation. It is particularly useful when we don’t need f' as a function in its own right, only its value at a particular point x. If f is a complicated combination of many elementary functions, then computing the derivative on paper can be both time-consuming and error-prone. Instead of finding f' in full and then evaluating it, we can simply evaluate the original function f with a dual argument.
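For example, with an illustrative Dual type like the one sketched above, the value and derivative of f(x) = e^{\sin x} at a point both come out of a single evaluation, and f' is never written down by hand.

#include <cmath>
#include <iostream>

struct Dual { double x, y; };
Dual sin(Dual d) { return {std::sin(d.x), d.y * std::cos(d.x)}; }
Dual exp(Dual d) { return {std::exp(d.x), d.y * std::exp(d.x)}; }

// f(x) = e^{sin x}, expressed purely as a composition of elementary functions.
Dual f(Dual d) { return exp(sin(d)); }

int main() {
    double x = 1.2;
    Dual r = f({x, 1.0});                     // evaluate \tilde f(x, 1)
    std::cout << "f(x)  = " << r.x << "\n";   // e^{sin x}
    std::cout << "f'(x) = " << r.y << "\n";   // automatically equals cos(x) e^{sin x}
}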

This technique can be applied to multivariable functions, such as vector fields, V:\mathbb R^n\to\mathbb R^n, giving

\tilde V(x+\varepsilon y) = V(x) + \varepsilon (y\cdot\nabla) V|_{x}.

The dual part is the directional derivative of the vector field V in the direction y, evaluated at x. By choosing the direction to be the jth basis vector, e_j, and then taking the ith component of the result, we can automatically compute the partial derivatives of the vector field,

\tilde V^i(x+\varepsilon e_j) = V^i(x) + \varepsilon \partial_j V^i|_{x}.

If we write \tilde V(x,y) for \tilde V(x+\varepsilon y) and denote the dual part by \mathfrak D(x,y) = y, then the Jacobian of V at x is given by

\left[\begin{array}{ccc}\mathfrak D\tilde V(x,e_1)&\cdots&\mathfrak D\tilde V(x,e_n)\end{array}\right].

This means we can compute the derivatives of an n-dimensional vector field by evaluating the vector field n times with dual numbers.

In a practical implementation it is more convenient to think of (x,y) as a vector of dual numbers, rather than a pair of real vectors. This allows for a single implementation of the vector field to be evaluated with a vector of either real or dual elements, for example, by templating the scalar type in C++:

template<typename Scalar>
Vector<Scalar> MyVectorField( const Vector<Scalar>& x ) {
    // evaluate the vector field, using only operations that are
    // defined for both real and dual Scalar types
}
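As an end-to-end sketch of this idea (a hand-rolled Dual struct and std::array standing in for the Vector<Scalar> type above, with an arbitrary 2-dimensional field as the example), the same templated function is evaluated once with real numbers for the value and n times with dual numbers for the Jacobian.

#include <array>
#include <cmath>
#include <iostream>

// Illustrative dual-number type x + εy; not from any particular library.
struct Dual { double x, y; };

Dual operator+(Dual a, Dual b) { return {a.x + b.x, a.y + b.y}; }
Dual operator*(Dual a, Dual b) { return {a.x * b.x, a.y * b.x + a.x * b.y}; }
Dual sin(Dual d) { return {std::sin(d.x), d.y * std::cos(d.x)}; }
Dual exp(Dual d) { return {std::exp(d.x), d.y * std::exp(d.x)}; }

// An arbitrary 2-dimensional vector field, written once for real or dual scalars.
template <typename Scalar>
std::array<Scalar, 2> MyVectorField(const std::array<Scalar, 2>& v) {
    using std::sin; using std::exp;  // fall back to the real versions when Scalar is double
    return { sin(v[0]) * v[1], exp(v[0]) + v[1] * v[1] };
}

int main() {
    const std::array<double, 2> x = {0.7, 1.3};

    // The value V(x): one evaluation with real numbers.
    const auto V = MyVectorField(x);

    // The Jacobian: n evaluations with dual numbers, seeding the dual
    // part with the basis vector e_j to obtain the j-th column.
    double J[2][2];
    for (int j = 0; j < 2; ++j) {
        std::array<Dual, 2> xd = { Dual{x[0], j == 0 ? 1.0 : 0.0},
                                   Dual{x[1], j == 1 ? 1.0 : 0.0} };
        const auto Vd = MyVectorField(xd);
        for (int i = 0; i < 2; ++i) J[i][j] = Vd[i].y;  // dual part = ∂_j V^i at x
    }

    std::cout << "V(x) = (" << V[0] << ", " << V[1] << ")\n"
              << "J    = [" << J[0][0] << " " << J[0][1] << "; "
              << J[1][0] << " " << J[1][1] << "]\n";
}

Each dual evaluation yields one column of the Jacobian, so the value and all n columns come from the single templated implementation.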

The version of automatic differentiation discussed here (using dual numbers) is called forward automatic differentiation. There is an alternative version called reverse automatic differentiation, as well as extensions to higher derivatives. Wikipedia has a decent article on the topic.
