Could someone kindly provide me with a good reference on automatic differentiation? I have already looked at the Wikipedia article, but I need a book or article that gives a better theoretical understanding. A step-by-step guide to the concepts would also be helpful.
$\endgroup$
3 Answers
$\begingroup$Try also these simpler papers:
The arithmetic of differentiation by L. B. Rall
A simple automatic derivative evaluation program by R. E. Wengert
Another valuable source, both for the theoretical background and for code, is Sebastian Walter's PhD thesis:
The Python implementation of the ideas in the above thesis is called Algopy and is fairly easy to understand.
Algorithmic differentiation computes derivatives of computer programs numerically and exactly (up to floating-point rounding), which makes it different from symbolic differentiation of expression trees. Reverse-mode AD necessarily works on a graph of the computation. An implementation that scales to real-world code and problems is exceedingly difficult to write, and pretty much all existing systems have significant limitations of one sort or another.
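To make the contrast with symbolic differentiation concrete, here is a minimal sketch of forward-mode AD using dual numbers. It is only an illustration under stated assumptions: the namespace `ad_sketch`, the type `Dual`, and the helpers `constant` and `variable` are hypothetical names chosen for this example, and it is not taken from any of the references above. Each value carries its derivative along with it, so no expression tree is ever built.

```cpp
#include <cassert>
#include <cmath>

namespace ad_sketch {  // hypothetical namespace for this sketch

// A value together with its derivative with respect to the input variable.
struct Dual {
    double val;  // f(z)
    double der;  // f'(z)
};

// Each operation propagates the derivative with the matching calculus rule.
Dual operator+(Dual a, Dual b) { return {a.val + b.val, a.der + b.der}; }
Dual operator*(Dual a, Dual b) { return {a.val * b.val, a.der * b.val + a.val * b.der}; }
Dual sin(Dual a) { return {std::sin(a.val), std::cos(a.val) * a.der}; }
Dual exp(Dual a) { double e = std::exp(a.val); return {e, e * a.der}; }
Dual log(Dual a) { return {std::log(a.val), a.der / a.val}; }

Dual constant(double c) { return {c, 0.0}; }  // derivative of a constant is 0
Dual variable(double z) { return {z, 1.0}; }  // dz/dz = 1

// f(z) = sin(z) + 4 exp(2 log(z)), written as ordinary code.
Dual f(Dual z) {
    return sin(z) + constant(4.0) * exp(constant(2.0) * log(z));
}

}  // namespace ad_sketch
```

Calling `ad_sketch::f(ad_sketch::variable(z0))` returns both the function value and the derivative at `z0` in a single pass. Reverse mode would instead record the operations and traverse them backwards, which this sketch does not attempt.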
$\endgroup$
$\begingroup$You won't find a dedicated reference, because for a programmer the problem is almost trivial. You can, however, find many student projects that implement it in almost any language.
The steps are:
1. Parsing: transform the text $\sin(z)+4 \exp(2\log(z))$ into a binary tree. At first, you can skip this step by constructing the binary tree directly:

       new add(new sin(new var('z')), new mult(new constant(4), new exp(new mult(new constant(2), new log(new var('z'))))))

2. Differentiating the binary tree to return a new binary tree.
3. Simplifying the resulting mathematical expression (this step is of course the most difficult, and is still a research problem).
Once you have your binary tree of mathematical expressions, differentiating is really not complicated.
Each node has one of the forms $u+v$, $u-v$, $uv$, $u/v$, $u^v$, where $u, v$ are two mathematical expressions, or $f(u)$, where $f$ is an elementary function. Each of these cases is easy to differentiate:
$$(u+v)' = u'+v', \ (u-v)' = u'-v', \ (uv)' = u'v+v'u, \ (u/v)' = u'/v - uv'/v^2, \ (u^v)' = v\,u' u^{v-1} + v' \log(u)\, u^v, \ (f(u))' = u' f'(u)$$
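As a worked illustration, applying the sum, product and chain rules mechanically to the example expression $\sin(z)+4\exp(2\log(z))$ (using $\log'(u) = 1/u$, and assuming $z > 0$ so that the logarithm is defined) gives the unsimplified result

$$\bigl(\sin(z)+4\exp(2\log(z))\bigr)' = 1\cdot\cos(z) + 0\cdot\exp(2\log(z)) + 4\Bigl(0\cdot\log(z) + 2\cdot\frac{1}{z}\Bigr)\exp(2\log(z))$$

Only the simplification step turns this into the compact form

$$\cos(z) + \frac{8}{z}\exp(2\log(z)) = \cos(z) + 8z,$$

which shows why the raw output of the differentiation step tends to be much larger than necessary.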
Programmatically, in C++ you'll have a class add and a class sin:

    class add : public expression {
        expression *u, *v;
    public:
        void print() { /* ... */ }
        // (u+v)' = u' + v'
        expression *differentiate() {
            return new add(u->differentiate(), v->differentiate());
        }
    };

    class sin : public expression {
        expression *u;
    public:
        // (sin u)' = u' cos(u)
        expression *differentiate() {
            return new mult(u->differentiate(), new cos(u));
        }
    };

The other operators and elementary functions follow the same idea.
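The classes above can be fleshed out into a compilable sketch. The version below is one possible way to do it, not the answer's exact code: the names (`symdiff`, `Expr`, `P`, `mk`, `Binary`, `Unary`, and the capitalized node classes) are assumptions made for this example, `std::shared_ptr` replaces raw `new` so subtrees can be shared safely, and an `eval` method stands in for `print` so the result can be checked numerically. The simplification step is deliberately left out.

```cpp
#include <cassert>
#include <cmath>
#include <memory>
#include <utility>

namespace symdiff {  // hypothetical namespace for this sketch

struct Expr;
using P = std::shared_ptr<const Expr>;

// Every node can evaluate itself at z and build its derivative tree.
struct Expr {
    virtual ~Expr() = default;
    virtual double eval(double z) const = 0;
    virtual P diff() const = 0;
};

// Helper that allocates a node of type T.
template <class T, class... A>
P mk(A&&... a) { return std::make_shared<T>(std::forward<A>(a)...); }

struct Const : Expr {
    double c;
    explicit Const(double value) : c(value) {}
    double eval(double) const override { return c; }
    P diff() const override { return mk<Const>(0.0); }
};
struct Var : Expr {  // the single variable z
    double eval(double z) const override { return z; }
    P diff() const override { return mk<Const>(1.0); }
};

// Operand storage shared by the binary and unary nodes below.
struct Binary : Expr { P u, v; Binary(P a, P b) : u(std::move(a)), v(std::move(b)) {} };
struct Unary : Expr { P u; explicit Unary(P a) : u(std::move(a)) {} };

struct Add : Binary {  // (u+v)' = u' + v'
    using Binary::Binary;
    double eval(double z) const override { return u->eval(z) + v->eval(z); }
    P diff() const override { return mk<Add>(u->diff(), v->diff()); }
};
struct Mul : Binary {  // (uv)' = u'v + uv'
    using Binary::Binary;
    double eval(double z) const override { return u->eval(z) * v->eval(z); }
    P diff() const override {
        return mk<Add>(mk<Mul>(u->diff(), v), mk<Mul>(u, v->diff()));
    }
};
struct Div : Binary {  // (u/v)' = (u'v - uv') / v^2
    using Binary::Binary;
    double eval(double z) const override { return u->eval(z) / v->eval(z); }
    P diff() const override {
        P minus_uvp = mk<Mul>(mk<Const>(-1.0), mk<Mul>(u, v->diff()));
        return mk<Div>(mk<Add>(mk<Mul>(u->diff(), v), minus_uvp), mk<Mul>(v, v));
    }
};

struct Sin : Unary {  // (sin u)' = u' cos(u)
    using Unary::Unary;
    double eval(double z) const override { return std::sin(u->eval(z)); }
    P diff() const override;  // defined after Cos
};
struct Cos : Unary {  // (cos u)' = -u' sin(u)
    using Unary::Unary;
    double eval(double z) const override { return std::cos(u->eval(z)); }
    P diff() const override {
        return mk<Mul>(mk<Const>(-1.0), mk<Mul>(u->diff(), mk<Sin>(u)));
    }
};
inline P Sin::diff() const { return mk<Mul>(u->diff(), mk<Cos>(u)); }

struct Exp : Unary {  // (exp u)' = u' exp(u)
    using Unary::Unary;
    double eval(double z) const override { return std::exp(u->eval(z)); }
    P diff() const override { return mk<Mul>(u->diff(), mk<Exp>(u)); }
};
struct Log : Unary {  // (log u)' = u' / u
    using Unary::Unary;
    double eval(double z) const override { return std::log(u->eval(z)); }
    P diff() const override { return mk<Div>(u->diff(), u); }
};

// Builds sin(z) + 4 exp(2 log(z)) directly, skipping the parsing step.
inline P example() {
    P z = mk<Var>();
    return mk<Add>(mk<Sin>(z),
                   mk<Mul>(mk<Const>(4.0), mk<Exp>(mk<Mul>(mk<Const>(2.0), mk<Log>(z)))));
}

}  // namespace symdiff
```

Calling `symdiff::example()->diff()` returns a new tree whose `eval(z)` agrees numerically with $\cos(z) + 8z$ for $z > 0$, although the tree itself still contains terms such as multiplications by the constants 0 and 1, since nothing is simplified.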
$\endgroup$