Dual number differentiation - discussion

FMDenaro · November 18, 2018, 13:08

Have you ever read about or used this methodology? I had never come across it, but I recently had a discussion on RG where it was proposed as a method to get an exact derivative (no truncation error) using a somewhat tricky Taylor expansion.
I have several doubts about the theoretical foundation, and I wonder whether any of you has worked on it.
Some papers were linked in the discussion:
https://www.researchgate.net/post/wh...ite_difference

sbaffini · November 19, 2018, 06:25

Dear Filippo,

the way I understand the matter is that dual number differentiation simply is a clever mathematical machinery by which automatic differentiation can be implemented. Thus, the matter is just about automatic differentiation.

Automatic differentiation refers to algorithms that can be used to compute derivatives in cases where symbolic differentiation would be, in theory, an option. It is not directly applicable for the spatial discretization of stuff. Basically, it can be used for Jacobians.

For example, in some codes it is used to fill the matrix of implicit coefficients. Actually (but I'm not sure), I think it is more relevant in those cases where the matrix is never formed and its coefficients are just used on the fly during the linear iterations.

In these cases it is exact. The advantage with respect to symbolic derivation is that it can be applied blindly to any symbol in your source code... it is automatic. Also, for a large number of derivatives, the complexity of symbolic computation (even just copy-pasting into your source code from a symbolic manipulation software) increases.

Note how, as cited in the thread you linked, when the method has to be applied to tabulated aerodynamic data then an interpolation is required. And you know better than me that the truncation error in finite difference derivatives is the same error you get with exact differentiation of the interpolant.

Long story short, nothing new under the sun.

FMDenaro · November 19, 2018, 06:46

Paolo,
Thanks for your idea, but is that really exact just because one introduces a nilpotent matrix??
And what about my counter-example of the Lagrangian form??

sbaffini · November 19, 2018, 07:12

I guess you already looked at these two pages:

https://en.wikipedia.org/wiki/Dual_number

https://en.wikipedia.org/wiki/Automatic_differentiation

The point is that it is exact because IT IS, in a certain sense, a symbolic manipulation of an expression which, it is very important to note, YOU HAVE TO KNOW.

When overloading is cited in that thread, it actually refers to the programming language used for the implementation. This basically means that a new derived type is defined, the dual number, and all the operations are defined for it through dedicated functions, including all the intrinsic functions (sin, cos, exp, log, etc.). In this case, that basically means you also instruct the compiler on the derivatives of those intrinsics.

Once you have this, whatever subroutine or function you write in terms of dual numbers can also keep track of the derivative with respect to each variable stored in the input dual number. It is exact because, for every possible operation you do on the input, you have previously coded the corresponding derivative with respect to the input (otherwise, btw, the code won't compile).
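
The rules being coded can be written down explicitly. As a sketch, assuming the usual definition of a dual number a + b eps with eps^2 = 0 (the symbols a, b, c, d are only illustrative), each overloaded operation carries its derivative rule in the eps part:

```latex
\begin{align*}
(a + b\,\varepsilon) + (c + d\,\varepsilon) &= (a + c) + (b + d)\,\varepsilon \\
(a + b\,\varepsilon)(c + d\,\varepsilon) &= ac + (ad + bc)\,\varepsilon
  \qquad (\varepsilon^2 = 0) \\
\sin(a + b\,\varepsilon) &= \sin a + b\,\cos a\,\varepsilon \\
\exp(a + b\,\varepsilon) &= e^{a} + b\,e^{a}\,\varepsilon
\end{align*}
```

Seeding the input as x + 1 eps then makes the eps part of any result the derivative with respect to x.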

Concerning your example, I think it is not well posed. Note that you actually need to evaluate the function at a point and, if that point is a dual number, the output of the function is both the function value and the function derivative at that point. The demonstration of this, I think, has nothing to do with the Lagrange point that truncates the Taylor series expansion making it exact.

sbaffini · November 19, 2018, 07:35

Let me give you an example. Imagine you have the following function:

function myf(x) result(y)
  real, intent(in) :: x
  real :: y

  y = x**2 + x + sin(x)

endfunction myf

You could easily write its exact derivative with respect to x:

function mydf(x) result(y)
  real, intent(in) :: x
  real :: y

  y = 2*x + 1 + cos(x)

endfunction mydf

But this easily becomes a nightmare if you have to do it for each function you write in your code (especially if they are more complex than this simple example).

Automatic differentiation via dual numbers allows you to do this in the main function, without having to write the derivative as a separate function:

function myf(x) result(y)
  type(dual), intent(in) :: x
  type(dual) :: y

  y = x**2 + x + sin(x)

endfunction myf

and you would probably access the function value and its derivative at x via the components of the dual type (or type-bound procedures) in Fortran. For example, y%real and y%dual.
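
To make this concrete, here is a minimal, self-contained sketch of what such a dual type could look like. It is only an illustration under my own assumptions: the module name dual_mod, the component names val and der (stand-ins for the y%real and y%dual mentioned above) and the procedure names are all hypothetical, and only the three operations needed by myf are overloaded.

```fortran
module dual_mod
  implicit none
  private
  public :: dual, operator(+), operator(**), sin

  ! Hypothetical dual type: val carries f(x), der carries f'(x)
  type :: dual
    real :: val = 0.0
    real :: der = 0.0
  end type dual

  interface operator(+)
    module procedure add_dd
  end interface

  interface operator(**)
    module procedure pow_di
  end interface

  ! Extends the intrinsic generic sin to dual arguments
  interface sin
    module procedure sin_d
  end interface

contains

  ! Sum rule: (u + v)' = u' + v'
  elemental function add_dd(a, b) result(c)
    type(dual), intent(in) :: a, b
    type(dual) :: c
    c%val = a%val + b%val
    c%der = a%der + b%der
  end function add_dd

  ! Power rule for an integer exponent: (u**n)' = n * u**(n-1) * u'
  elemental function pow_di(a, n) result(c)
    type(dual), intent(in) :: a
    integer, intent(in) :: n
    type(dual) :: c
    c%val = a%val**n
    c%der = real(n) * a%val**(n-1) * a%der
  end function pow_di

  ! Chain rule: (sin u)' = cos(u) * u'
  elemental function sin_d(a) result(c)
    type(dual), intent(in) :: a
    type(dual) :: c
    c%val = sin(a%val)
    c%der = cos(a%val) * a%der
  end function sin_d

end module dual_mod

program demo
  use dual_mod
  implicit none
  type(dual) :: x, y

  x = dual(1.0, 1.0)   ! seed the derivative part: dx/dx = 1
  y = myf(x)
  print *, 'f(1)            =', y%val
  print *, 'df/dx from dual =', y%der
  print *, 'df/dx by hand   =', 2.0*1.0 + 1.0 + cos(1.0)

contains

  function myf(x) result(y)
    type(dual), intent(in) :: x
    type(dual) :: y
    y = x**2 + x + sin(x)
  end function myf

end program demo
```

The body of myf is unchanged with respect to the real version; the derivative comes out only because each operation it uses was given its own derivative rule in the module.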

You see how this can be exact. It has nothing to do with the computation of a derivative for a set of discrete points. You can use it in this way if, for example, given a set of discrete points in a vector v and an evaluation point x, you write an interpolation routine that gives you the value of the underlying function for arbitrary x. It is easy to see, say, for a piecewise linear interpolation, that the resulting derivative in the dual is nothing more than the underlying first order approximation.
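
The piecewise linear case can be sketched as follows. This is only an assumed setup for illustration: the module dual_interp, the routine interp_lin, the uniform grid (x0, h) and the sine sample data are all hypothetical choices, and only the operators the interpolant needs are defined.

```fortran
module dual_interp
  implicit none
  private
  public :: dual, interp_lin

  ! Hypothetical dual type: val carries the value, der the derivative
  type :: dual
    real :: val = 0.0
    real :: der = 0.0
  end type dual

  interface operator(+)
    module procedure add_rd
  end interface
  interface operator(-)
    module procedure sub_dr
  end interface
  interface operator(*)
    module procedure mul_rd
  end interface

contains

  pure function add_rd(a, b) result(c)   ! real + dual
    real, intent(in) :: a
    type(dual), intent(in) :: b
    type(dual) :: c
    c = dual(a + b%val, b%der)
  end function add_rd

  pure function sub_dr(a, b) result(c)   ! dual - real
    type(dual), intent(in) :: a
    real, intent(in) :: b
    type(dual) :: c
    c = dual(a%val - b, a%der)
  end function sub_dr

  pure function mul_rd(a, b) result(c)   ! real * dual
    real, intent(in) :: a
    type(dual), intent(in) :: b
    type(dual) :: c
    c = dual(a*b%val, a*b%der)
  end function mul_rd

  ! Piecewise linear interpolant on uniform nodes x0 + (i-1)*h
  function interp_lin(x, x0, h, v) result(y)
    type(dual), intent(in) :: x
    real, intent(in) :: x0, h, v(:)
    type(dual) :: y, t
    integer :: i
    ! Index of the interval containing x, clamped to the table
    i = min(max(int((x%val - x0)/h) + 1, 1), size(v) - 1)
    t = (1.0/h) * (x - (x0 + real(i-1)*h))   ! local coordinate in [0,1]
    y = v(i) + (v(i+1) - v(i)) * t
  end function interp_lin

end module dual_interp

program demo_interp
  use dual_interp
  implicit none
  real, parameter :: x0 = 0.0, h = 0.1
  real :: v(11)
  integer :: i
  type(dual) :: y

  v = [(sin(x0 + real(i-1)*h), i = 1, 11)]   ! tabulated data
  y = interp_lin(dual(0.53, 1.0), x0, h, v)  ! x = 0.53 lies between nodes 6 and 7

  print *, 'derivative from dual :', y%der
  print *, 'forward difference   :', (v(7) - v(6))/h
  print *, 'cos(0.53)            :', cos(0.53)
end program demo_interp
```

The dual part reproduces the slope (v(7) - v(6))/h up to round-off, i.e. the first order finite difference, and not cos(0.53): the differentiation of the interpolant is exact, the interpolant itself is not.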

sbaffini · November 19, 2018, 07:49

In a certain sense, the dual number algebra gives you the rules to use in order to code a program that would give you exact derivatives of the expressions you use in your code. Not really different from implementing imaginary numbers, just different rules.

FMDenaro · November 19, 2018, 08:03

Paolo, I still see some confusion in the theory and in what should instead be the practical use of dual numbers...
I already had a look at the Wikipedia links (it seems to me that they say exactly what was written in the papers and thesis) and things are still not clear.

1) If they introduce a (Lagrangian) polynomial P(x) and define its counterpart in dual space P(a+b*eps), this is by definition a truncated Taylor series. You can compute the analytical derivative of a polynomial approximation, but it has a truncation error.

2) In both polynomials and Taylor series there is a key point to understand: the derivatives are numbers, not functions! You have derivatives evaluated at specific points; they no longer depend on an independent variable. The functional dependence is only in the monomials. Conversely, in symbolic computations you have functions.

sbaffini · November 19, 2018, 08:27

Indeed, I think no actual functions are involved here. Look at the Taylor series example under Differentiation in the Dual number wikipedia page. As the expansion is in the dual (b eps) no actual real function is really involved.

The comparison with symbolic manipulation programs was not meant to be so deep. Dual numbers indeed return the derivative value, not the function. But, constructed following exact rules.
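
For reference, the expansion in question can be written out as below. This is a sketch assuming f is analytic at a and eps^2 = 0; the second line is the standard real-step expansion with Lagrange remainder, added here only for contrast.

```latex
\begin{align*}
f(a + b\,\varepsilon)
  &= \sum_{n=0}^{\infty} \frac{f^{(n)}(a)}{n!}\,(b\,\varepsilon)^n
   = f(a) + b\,f'(a)\,\varepsilon
   \qquad (\varepsilon^n = 0 \ \text{for } n \ge 2) \\
f(a + h)
  &= f(a) + h\,f'(a) + \frac{h^2}{2}\,f''(\xi), \quad \xi \in (a, a+h)
   \;\;\Longrightarrow\;\;
   \frac{f(a+h) - f(a)}{h} = f'(a) + \frac{h}{2}\,f''(\xi)
\end{align*}
```

In the dual case the series terminates algebraically and nothing is discarded, but f(a) and f'(a) must be available from the known expression of f. In the real case the remainder term is the truncation error of the finite difference.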

FMDenaro · November 19, 2018, 08:34

Quote:

Originally Posted by sbaffini

Indeed, I think no actual functions are involved here. Look at the Taylor series example under Differentiation in the Dual number wikipedia page. As the expansion is in the dual (b eps) no actual real function is really involved.

The comparison with symbolic manipulation programs was not meant to be so deep. Dual numbers indeed return the derivative value, not the function. But, constructed following exact rules.

Again, from a practical point of view, you compute an exact derivative of a polynomial that has a truncation error. This fact is not highlighted in the RG discussion and does not appear in the linked papers. They state that, unlike the FD method, there is no truncation or round-off error...

sbaffini · November 19, 2018, 09:10

I think the misunderstanding comes from the different areas where you use finite difference.

Finite differences can indeed be used to compute Jacobians of known functions with multiple evaluations of the function over a small interval. But if you know the function you can, in theory, also provide its derivative analytically.

The point is that there are cases where this is not practically feasible or optimal. In this case AD comes to help. But the way I see it is just a formalization of something you know that is used to actually code the analytic differentiation and perform it automatically.

There is no way it can be used, say, on a vector of equispaced values in space or time to get the exact derivative at any point in the interval. It can be used in this way only if you have an interpolation routine for that vector.

The Taylor series demonstration, I think, only serves the purpose of showing how to apply functions to dual numbers. At some point, in practice, you have to inform the program about the derivative of all the intrinsics. In a certain sense, I think, it is not different from using the Taylor series to compute standard intrinsics. It is just the equivalent way for duals.

FMDenaro · November 19, 2018, 09:21

Well, it is still not clear in my opinion... FD, too, is nothing other than an exact derivative of a polynomial of some degree, exactly as is introduced in dual number differentiation.
The original discussion on RG is about the order of accuracy in the computation of a derivative. The person stated that dual number differentiation has no truncation error, and that is theoretically questionable. The thesis and the papers that are linked show no errors, but that is possible only if the comparison is made with the expression of the derivative of the polynomial....

Sorry, but I do not understand how you get to this point...

sbaffini · November 19, 2018, 09:48

Quote:

Originally Posted by FMDenaro

FD is nothing other than an exact derivative of a polynomial of some degree, exactly as is introduced in dual number differentiation.

Exactly. The confusion, I guess, stems from the fact that they claim exactness in differentiation of something known. Thus, for a given vector of point values, they claim exact differentiation of the underlying interpolator (which needs to be explicitly coded for AD to be applied in this case).

What they fail to recognize is that the error is in the interpolation itself for a set of discrete points.

Yet, if I may, I think you fail to recognize their intended use case, which might be trivial for you: differentiating a known analytical function with respect to a given input. No spatial or temporal intervals are involved, just a known (differentiable) function evaluated at a single point. In this case you have two possible routes: a) write the analytical derivative yourself or b) evaluate the known function at two close points and use finite differences. In theory, as you know the function, you would never follow route b) but, formally, what they state in that thread is that, for this scenario, following route a) gives an exact derivative, while route b) doesn't. AD via dual numbers is just a way to actually implement route a) automatically for arbitrarily complex source codes.
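
The two routes can be compared on the example function used earlier in the thread. The program below is only a sketch: the evaluation point x = 1, the step sizes and the choice of a forward difference are my own assumptions.

```fortran
program fd_vs_exact
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  real(dp) :: x, h, exact, fd
  integer :: k

  x = 1.0_dp
  ! Route a): derivative written by hand from the known expression
  exact = 2.0_dp*x + 1.0_dp + cos(x)

  do k = 1, 6
    h = 10.0_dp**(-k)
    ! Route b): forward difference between two close points
    fd = (myf(x + h) - myf(x)) / h
    print '(a, es8.1, a, es10.3)', 'h =', h, '   |fd - exact| =', abs(fd - exact)
  end do

contains

  function myf(x) result(y)
    real(dp), intent(in) :: x
    real(dp) :: y
    y = x**2 + x + sin(x)
  end function myf

end program fd_vs_exact
```

The printed error of route b) shrinks roughly in proportion to h, as expected for a first order formula, while route a) has no h at all. A dual number version of myf would return the same value as route a) up to round-off.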

Their Taylor series expansion is just a demonstration of how to apply functions to dual numbers.

They don't actually use finite differences on point values promoted to dual numbers. You can't do this because, by definition, you know things for real values only.

Yet, if you know a function and its exact derivative at a set of points, you can construct the dual number version of that function at those points. After that, it is easy to see that if you apply FD to that dual number version (using the Taylor series example of the dual number) you actually get back the exact derivative at the given point. But this is cheating, because you actually already knew that information.

FMDenaro · November 19, 2018, 10:00

Quote:

Originally Posted by sbaffini

Exactly. The confusion, I guess, stems from the fact that they claim exactness in differentiation of something known. Thus, for a given vector of point values, they claim exact differentiation of the underlying interpolator (which needs to be explicitly coded for AD to be applied in this case).

What they fail to recognize is that the error is in the interpolation itself for a set of discrete points.

Yet, if I may, I think you fail to recognize their intended use case, which might be trivial for you: differentiating a known analytical function with respect to a given input. No spatial or temporal intervals are involved, just a known (differentiable) function evaluated at a single point. In this case you have two possible routes: a) write the analytical derivative yourself or b) evaluate the known function at two close points and use finite differences. In theory, as you know the function, you would never follow route b) but, formally, what they state in that thread is that, for this scenario, following route a) gives an exact derivative, while route b) doesn't. AD via dual numbers is just a way to actually implement route a) automatically for arbitrarily complex source codes.

Their Taylor series expansion is just a demonstration of how to apply functions to dual numbers.

They don't actually use finite differences on point values promoted to dual numbers. You can't do this because, by definition, you know things for real values only.

Yet, if you know a function and its exact derivative at a set of points, you can construct the dual number version of that function at those points. After that, it is easy to see that if you apply FD to that dual number version (using the Taylor series example of the dual number) you actually get back the exact derivative at the given point. But this is cheating, because you actually already knew that information.

Paolo,
I explicitly asked the author about the presence of an error (truncation error) in dual number differentiation. He stated "DN-differentiation does not have truncation errors". And the diagrams reported in the thesis show that, apparently, there is no truncation error.

As a counterpart, I can imagine a suite programmed in Maple to automatically compute "exact" FD derivatives if the function is known.

It seems to me that people working in automatic differentiation have no idea of "discretization error" and "local truncation error"...

sbaffini · November 19, 2018, 10:47

I agree that the material linked on RG is of poor quality as regards the comparison made between FD, the complex approach and the dual step approach. They only mention the need for an analytical function, but:

1) In applying non-FD methods to test functions they automatically embed the exact derivative in evaluating the function. Once they build the test data as, say, the sin of a dual number, all the exact derivatives are copied in through the test function overloading.

2) In contrast, FD is applied to the raw data.

FMDenaro · November 19, 2018, 10:55

So, the comparison makes no sense...

sbaffini · November 19, 2018, 11:05

Quote:

Originally Posted by FMDenaro

So, the comparison makes no sense...

That's my understanding
