Ted Ersek
2010-09-20 09:41:32 UTC
In [1] I gave a long response to the thread [FindRoots?]
(******** Why doesn't RootSearch Use FindRoot? ********)
In [2] Andrzej Kozlowski said,
"I still do not see any reason for not using FindRoot in RootSearch or
any way in which RootSearch is superior to FindRoot for finding a
single root (which is the only task FindRoot attempts)."
A few days ago I did some experiments and found that FindRoot is much better
at finding a root in one dimension than it was when I wrote RootSearch
in 2001. In particular FindRoot in Mathematica 7 does a very good job of
converging on a root in the following example.
Clear[f2];
F2[x_]:=If[x<3/2,Sqrt[3/2-x]-Exp[-4 x],2 x Exp[-x]];
FindRoot[f2[x],{x,1.2,0.7,1.7}]
When I first wrote RootSearch, I had to write my own adaptation of the
Secant Method that could handle the previous example.
Evaluate Plot[ f2[x], {x, 0.7, 1.7} ] and you should see why straight
forward use
of the Secant method or Newton method will not work in this example.
My adaptation would first consider a secant step, but take a golden section
find minimum step if the secant step looked like a bad choice.
(***** However, FindRoot doesn't know the difference between a pole and a
root. ********)
In the next example I use Mathematica 7 to demonstrate the main
reason I wrote my own (modified) implementation of Brent's method.
In[1]:= f[x_]:=x+Tan[x];
{soln,{samples}}=***@FindRoot[f[x]==0,{x,1.4,1.7},
EvaluationMonitor:>Sow[x] ];
{soln,Length[samples]}
FindRoot::brmp: The root has been bracketed as closely as
possible with machine precision but the function value
exceeds the absolute tolerance 1.0536712127723497`*^-8. >>
{ {x->1.5708}, 63 }
In this example FindRoot performed 63 iterations and located a pole of
x+Tan[x] to the nearest machine number. As the iterations went on
Abs[f[xn]] increases until we get to {x -> 1.5707963267948968}
where f[x]>10^15. Notice that the message from the previous example
indicates Mathematica still thinks it was converging on a root.
I implemented a modification of Brent's method that will stop searching
if Abs[f[xn]] nearly doubles (increases by a factor of 19/10 or more)
on 5 consecutive iterations. In addition to stopping the search in such
an example, RootSearch posts a message indicating that it appears
RootSearch is converging on a pole near [some number]. If you get the
Message FindRoot shows in the above example, you have to do some
investigating to see what's going on. I think my approach is "superior"
to the way FindRoot handles this example.
(***** Finding Multiple Order Roots ********)
With the default options FindRoot will converge slowly (linear order)
to a multiple root. The documentation says "DampingFactor can be used
to help speed convergence to higher-order roots:" They don't mention
that this is only works when Newton's method can be used, and all you
do is use the setting (DampingFactor->n) where n is the order of the root.
That's pretty simple, but why doesn't the documentation put it
that way? Instead Wolfram makes the user figure out how
"DampingFactor can be used to ...."
It's also unfortunate that very few books on numerical methods or
numerical analysis mention that:
If (1) f[x] has a root at (x=a).
and (2) f[x] can be expressed as a Taylor series about a root (x=a).
Then (x=a) is a first order root of u[x] where
u[x] = Piecewise[{ {0, f[x]==0}, {f[x]/f'[x], True} }]
This is handy because many root finding methods can efficiently
converge on any root of u[x] above. BTW I have yet to find a place
where Wolfram Research mentions that finding a root of u[x] above is
another way to efficiently converge on a multiple root.
Well I just mentioned two ways to efficiently converge to a multiple root,
and both require Mathematica to symbolically determine f'[x]. I think
finding a root of u[x] above has a great advantage over using a non-default
setting for DampingFactor. This is because we typically don't know the order
of the roots we are trying to find, and in that case the DampingFactor
approach can't be used.
(******* How RootSearch Currently Works ********)
RootSearch starts by taking initial samples that are almost equally spaced.
Out of all the initial samples it finds adjacent samples {a,f[a]} and
{b,f[b]}
where f[a], f[b] have opposite sign. Then my implementation of Brent's
method
starts with these samples at (a,b) to find a root of u[x] above
(provided Mathematica can symbolically determine f'[x] ).
RootSearch also picks from the initial samples the adjacent samples
{a,f[a]}, {b,f[b]}, {c,f[c]} that meet the following conditions.
1) a < b < c
2) 0 < f[b]/f[a] < 1
3) 0 < f[b]/f[c] < 1
Then it computes f'[a], f'[b], f'[c] and if f'[a], f'[b] have opposite
sign my
implementation of Brent's method to find a root of u[x] between (a,b).
On the other hand if f'[b], f'[c] have opposite sign Brent's method
is used to find a root of u[x] between (b,c).
The asymptotic rate of convergence for Brent's method at a first order root
is
about 1.618. By using Brent's method to find a root of u[x] rather than f[x]
directly we can ensure the rate of convergence is 1.618 regardless of the
order of
the root, and we don't have to know if it's a first know the order of the
root.
RootSearch also considers other cases that rarely occur in practice.
Of course if Mathematica can't symbolically determine f'[x] RootSearch only
uses
samples of f[x] and uses Brent's method or my modified Secant method to find
roots
of f[x] directly.
(***** Improvement For My Next Version of RootSearch ********)
Brent's method is described in the material available from [3].
When Brent's method is looking for a root of f[x] it keeps track of three
samples {a,f[a]}, {b,f[b]}, {c,f[c]} during each iteration.
During some iterations samples {a,f[a]}, {c,f[c]} are duplicate. On every
iteration we have (b!=c) and f[x] changes sign between (b,c).
During each iteration Brent's method decides on a new approximate root from
either
1) Inverse quadratic interpolation using {a,f[a]}, {b,f[b]}, {c,f[c]}
2) Linear interpolation using {b,f[b]}, {c,f[c]}
3) Bisection with the next approximate root at (b+c)/2.
Brent's method decides which of the three to use based on the values of
(a,b,c,f[a],f[b],f[c]).
In my current version of RootSearch I use Brent's method on u[x], so during
each
iteration the next approximate root depends on the values of
(a,b,c,u[a],u[b],u[c]).
My next version of RootSearch will use a variation of Brent's method where
the next
approximate root depends on the values of
(a,b,c,f[a],f[b],f[c],f'[a],f'[b],f'[c]).
Armed with all this information my new method will decide to use a bisection
step
if f'[b] or f'[c] is (zero, infinite, or Indeterminate). I will also take a
bisection step if f[x] can't possibly be monotonic between (b,c). In
addition there
are other heuristics I will use to decide if a bisection step is best. In
many cases
this new variation on Brent's method will pick exactly the same new
approximation as if
I used Brent's method on u[x] directly. The difference is that on some
iterations my
new variation will know f[x] is not very well behaved, and take a bisection
step instead.
(***** References ******)
[1] http://forums.wolfram.com/mathgroup/archive/2010/Sep/msg00255.html
[2] http://forums.wolfram.com/mathgroup/archive/2010/Sep/msg00262.html
[3] http://www.nr.com/
(******** Why doesn't RootSearch Use FindRoot? ********)
In [2] Andrzej Kozlowski said,
"I still do not see any reason for not using FindRoot in RootSearch or
any way in which RootSearch is superior to FindRoot for finding a
single root (which is the only task FindRoot attempts)."
A few days ago I did some experiments and found that FindRoot is much better
at finding a root in one dimension than it was when I wrote RootSearch
in 2001. In particular FindRoot in Mathematica 7 does a very good job of
converging on a root in the following example.
Clear[f2];
F2[x_]:=If[x<3/2,Sqrt[3/2-x]-Exp[-4 x],2 x Exp[-x]];
FindRoot[f2[x],{x,1.2,0.7,1.7}]
When I first wrote RootSearch, I had to write my own adaptation of the
Secant Method that could handle the previous example.
Evaluate Plot[ f2[x], {x, 0.7, 1.7} ] and you should see why straight
forward use
of the Secant method or Newton method will not work in this example.
My adaptation would first consider a secant step, but take a golden section
find minimum step if the secant step looked like a bad choice.
(***** However, FindRoot doesn't know the difference between a pole and a
root. ********)
In the next example I use Mathematica 7 to demonstrate the main
reason I wrote my own (modified) implementation of Brent's method.
In[1]:= f[x_]:=x+Tan[x];
{soln,{samples}}=***@FindRoot[f[x]==0,{x,1.4,1.7},
EvaluationMonitor:>Sow[x] ];
{soln,Length[samples]}
FindRoot::brmp: The root has been bracketed as closely as
possible with machine precision but the function value
exceeds the absolute tolerance 1.0536712127723497`*^-8. >>
{ {x->1.5708}, 63 }
In this example FindRoot performed 63 iterations and located a pole of
x+Tan[x] to the nearest machine number. As the iterations went on
Abs[f[xn]] increases until we get to {x -> 1.5707963267948968}
where f[x]>10^15. Notice that the message from the previous example
indicates Mathematica still thinks it was converging on a root.
I implemented a modification of Brent's method that will stop searching
if Abs[f[xn]] nearly doubles (increases by a factor of 19/10 or more)
on 5 consecutive iterations. In addition to stopping the search in such
an example, RootSearch posts a message indicating that it appears
RootSearch is converging on a pole near [some number]. If you get the
Message FindRoot shows in the above example, you have to do some
investigating to see what's going on. I think my approach is "superior"
to the way FindRoot handles this example.
(***** Finding Multiple Order Roots ********)
With the default options FindRoot will converge slowly (linear order)
to a multiple root. The documentation says "DampingFactor can be used
to help speed convergence to higher-order roots:" They don't mention
that this is only works when Newton's method can be used, and all you
do is use the setting (DampingFactor->n) where n is the order of the root.
That's pretty simple, but why doesn't the documentation put it
that way? Instead Wolfram makes the user figure out how
"DampingFactor can be used to ...."
It's also unfortunate that very few books on numerical methods or
numerical analysis mention that:
If (1) f[x] has a root at (x=a).
and (2) f[x] can be expressed as a Taylor series about a root (x=a).
Then (x=a) is a first order root of u[x] where
u[x] = Piecewise[{ {0, f[x]==0}, {f[x]/f'[x], True} }]
This is handy because many root finding methods can efficiently
converge on any root of u[x] above. BTW I have yet to find a place
where Wolfram Research mentions that finding a root of u[x] above is
another way to efficiently converge on a multiple root.
Well I just mentioned two ways to efficiently converge to a multiple root,
and both require Mathematica to symbolically determine f'[x]. I think
finding a root of u[x] above has a great advantage over using a non-default
setting for DampingFactor. This is because we typically don't know the order
of the roots we are trying to find, and in that case the DampingFactor
approach can't be used.
(******* How RootSearch Currently Works ********)
RootSearch starts by taking initial samples that are almost equally spaced.
Out of all the initial samples it finds adjacent samples {a,f[a]} and
{b,f[b]}
where f[a], f[b] have opposite sign. Then my implementation of Brent's
method
starts with these samples at (a,b) to find a root of u[x] above
(provided Mathematica can symbolically determine f'[x] ).
RootSearch also picks from the initial samples the adjacent samples
{a,f[a]}, {b,f[b]}, {c,f[c]} that meet the following conditions.
1) a < b < c
2) 0 < f[b]/f[a] < 1
3) 0 < f[b]/f[c] < 1
Then it computes f'[a], f'[b], f'[c] and if f'[a], f'[b] have opposite
sign my
implementation of Brent's method to find a root of u[x] between (a,b).
On the other hand if f'[b], f'[c] have opposite sign Brent's method
is used to find a root of u[x] between (b,c).
The asymptotic rate of convergence for Brent's method at a first order root
is
about 1.618. By using Brent's method to find a root of u[x] rather than f[x]
directly we can ensure the rate of convergence is 1.618 regardless of the
order of
the root, and we don't have to know if it's a first know the order of the
root.
RootSearch also considers other cases that rarely occur in practice.
Of course if Mathematica can't symbolically determine f'[x] RootSearch only
uses
samples of f[x] and uses Brent's method or my modified Secant method to find
roots
of f[x] directly.
(***** Improvement For My Next Version of RootSearch ********)
Brent's method is described in the material available from [3].
When Brent's method is looking for a root of f[x] it keeps track of three
samples {a,f[a]}, {b,f[b]}, {c,f[c]} during each iteration.
During some iterations samples {a,f[a]}, {c,f[c]} are duplicate. On every
iteration we have (b!=c) and f[x] changes sign between (b,c).
During each iteration Brent's method decides on a new approximate root from
either
1) Inverse quadratic interpolation using {a,f[a]}, {b,f[b]}, {c,f[c]}
2) Linear interpolation using {b,f[b]}, {c,f[c]}
3) Bisection with the next approximate root at (b+c)/2.
Brent's method decides which of the three to use based on the values of
(a,b,c,f[a],f[b],f[c]).
In my current version of RootSearch I use Brent's method on u[x], so during
each
iteration the next approximate root depends on the values of
(a,b,c,u[a],u[b],u[c]).
My next version of RootSearch will use a variation of Brent's method where
the next
approximate root depends on the values of
(a,b,c,f[a],f[b],f[c],f'[a],f'[b],f'[c]).
Armed with all this information my new method will decide to use a bisection
step
if f'[b] or f'[c] is (zero, infinite, or Indeterminate). I will also take a
bisection step if f[x] can't possibly be monotonic between (b,c). In
addition there
are other heuristics I will use to decide if a bisection step is best. In
many cases
this new variation on Brent's method will pick exactly the same new
approximation as if
I used Brent's method on u[x] directly. The difference is that on some
iterations my
new variation will know f[x] is not very well behaved, and take a bisection
step instead.
(***** References ******)
[1] http://forums.wolfram.com/mathgroup/archive/2010/Sep/msg00255.html
[2] http://forums.wolfram.com/mathgroup/archive/2010/Sep/msg00262.html
[3] http://www.nr.com/