Stability and Conditioning in Numerical Analysis

Stability and Conditioning
in Numerical Analysis
D. Trigiante
University of Florence
[email protected]
September, 2005.
Rodi – Stability and Conditioning. – p.1/73
"... It is by looking into the same
problem from different points of
view that one arrives to a
complete insight of it. "
Euler
Rodi – Stability and Conditioning. – p.2/73
Stability vs. Conditioning
The terms stability and conditioning are used with a variety of
meanings in Numerical Analysis.
They have in common the general concept of the response of
a set of computations to perturbations arising from
the data,
the specific arithmetic used on computers.
They are not synonymous.
Rodi – Stability and Conditioning. – p.3/73
Stability
In Mathematics the notion of stability derives from the
homonymous notion in mechanics.
It regards the behavior of the motion of a system when it is
moved away from the equilibrium.
Three ingredients enters in the definition, i.e.
the existence of a reference solution, i.e. the equilibrium;
the perturbation of the initial status (the initial conditions);
the duration of the motion, which is supposed to be
infinite.
Rodi – Stability and Conditioning. – p.4/73
Stability
The behaviors around the equilibrium may be many:
stability (marginal stability), uniform stability,
asymptotic stability (AS), uniform AS,
contractivity,
orbital stability,
instability,
Rodi – Stability and Conditioning. – p.5/73
General Perturbations
More general perturbations are also considered in the
qualitative theory of dynamical systems.
The concept of stability under perturbation of the whole
system, often called total stability, is also considered.
This kind of perturbation is frequent in Numerical Analysis
where the source of errors, due to the computer arithmetic,
can be seen as perturbation of the whole set of computations.
In other words, a numerical algorithm is not only perturbed by
the errors in the data, but also with respect to the errors arising
in the process of computations.
Rodi – Stability and Conditioning. – p.6/73
Conditioning
Many problems, however, do not last for a long (in principle,
infinite) time, and/or do not have an equilibrium.
The above concept of stability do not apply, as it stands.
The non numerical analysis distinguishes such problems in
well posed and ill-posed, according whether the solution
depends continuously on data or not.
This is not enough for the Numerical Analysis purposes,
where a more refined distinction is needed.
The numerical analysts would like to know if such
dependence, although continuous, may result disastrous for
the error growth.
This requires the notion of Conditioning.
Rodi – Stability and Conditioning. – p.7/73
Stability and Numerical Analysis
Often, in the N.A. literature, two or more concepts are
superimposed.
For example, concepts such as discretization (in the case
where the original problem is continuous), the stability of the
algorithms and the ability of the arithmetic system
implemented on the computers to perform operations with
acceptable relative errors are often strictly tight together.
We almost completely agree with the following statements
taken from Lax and Richmyer (written in 1956!).
Rodi – Stability and Conditioning. – p.8/73
Lax and Richmyer, 1956
We shall not be concerned with rounding errors..., but it will be
evident to the reader that there is an intimate connection between stability and practicality of the equations from the point
of view of growth and the amplification of rounding errors...
(Some authors) define stability in terms of the growth of rounding errors. However we have a slight preference for the ( our)
definition, (i.e. independent on the rounding errors) because it
emphasizes that stability still has to be considered, even if the
rounding errors are negligible, unless, of course, the initial data
are chosen with diabolic care so as to be exactly free of those
components that would be amplified if they were present.
Rodi – Stability and Conditioning. – p.9/73
Dahlquist
A similar concept was expressed in the same years by
Dahlquist.
"The most fundamental is the distinction between instability in
the underlying mathematical problem and instability in an
algorithm for the (exact or approximate) treatment of the
problem".
We shall then avoid, when unnecessarily, to consider rounding
errors.
Rodi – Stability and Conditioning. – p.10/73
Classes of Problems
Asymptotically stable Problems (AS),
Marginally Stable Problems,
Unstable Problems,
Boundary Value Problems,
Rodi – Stability and Conditioning. – p.11/73
Algorithms and Asymptotic Stability
More often than it can be thought, numerical algorithms can
be considered as discrete dynamical systems around critical
points. (equilibria).
to the space of
real or
or
siderably, ranging from
The spaces where such dynamics are placed may vary con-
complex matrices.
Rodi – Stability and Conditioning. – p.12/73
Example 1, trivial
or
where
is real or complex.
unstable for
stable for
AS for
The equilibrium is at the origin and it is
Rodi – Stability and Conditioning. – p.13/73
and
..
.
stable for
and unstable for
and it is AS for
The equilibrium is the origin in
.
.
..
.
..
.
..
..
.
..
where
Example 2, less trivial
Not uniformly with respect to the dimension!
In the case of PDE, this makes the difference.
Rodi – Stability and Conditioning. – p.14/73
Example 3, time scales
The origin is A.S., but a generic solution approaches the
equilibrium with two different modes, one slow and the other
fast.
Rodi – Stability and Conditioning. – p.15/73
Example 4, splittings
Iterative methods for the solution of linear systems
fit very well in the framework of AS dynamical systems.
The splitting techniques are nothing but strategies to
transform the solution
into an AS equilibrium point of
an appropriate dynamical system.
What is usually less emphasized is that, even in the so called
direct methods, the AS plays a central role, especially when
dealing with structured matrices.
Rodi – Stability and Conditioning. – p.16/73
What’s wrong with marginal stability?
We have said that AS appears often in Numerical Analysis,
although problems which are naturally marginally stable or
even unstable appear as well.
Naturally means that they are unstable by themselves and not
because of previous wrong choices, for example a wrong
discretization.
In principle, small perturbations may lead them to be unstable.
In the real world, however, there are very important marginally
stable systems which are far to be unstable and only
perturbations of improbable nature (!) would turn them from
stability to instability. In other words, many marginally stable
problems need to be safely computed.
Rodi – Stability and Conditioning. – p.17/73
What’s wrong with unstable problems?
Unstable solution may be computed within the limit imposed
by the computer arithmetic.
Computer arithmetic is able to represent relatively well
numbers in a certain fixed range. When the numbers become
large such representation become poorer and poorer until it
may not have a single digit in common with the represented
number.
The use of large numbers is then unsafe on the computers.
Rodi – Stability and Conditioning. – p.18/73
Examples of unstable problems
The power method to find the maximum eigenvalue of a
matrix is an unstable problem.
In this case the required information is only partial, i.e. the
grow rate of the norm of a vector, instead of the norm of
the vector itself.
The algorithm is designed in order to avoid large numbers.
The Miller’s problem provides an example of unstable
problems.
It arose in the fifties in computing recursively the values of
the Bessel functions. It is worth to mention it because of
the peculiar aspect of the devised stable algorithm.
Rodi – Stability and Conditioning. – p.19/73
Miller’s problem
In the simplest form the problem is
Approximate solution.
Let
Rodi – Stability and Conditioning. – p.20/73
Conditioning
The notion of stability is not enough. A more general concept
is needed.
Let the discrete solution be
(1)
We shall consider the following two parameters:
They may provide two different types of information, especially
in the case of AS.
Rodi – Stability and Conditioning. – p.21/73
Qualitative vs Quantitative Information
Consider for example
If we are interested in knowing the maximum value reached
by the solution, we immediately obtain
with
in case of asymptotic stability and growing
exponentially with in the case of instability.
We may be interested in knowing how fast the solution returns
to the equilibrium in case of A.S. Such information is provided
by
which, in this case becomes:
in case of AS.
Rodi – Stability and Conditioning. – p.22/73
Information measure
is a sort of information measure in the sense that its
i.e.
smallness informs us that we are using large values of
we are computing terms already too small (smaller, for
example, than the machine precision).
In the case of AS, by posing
we have
which, if
is large, becomes
The quantity is already known in NA and is usually called
asymptotic rate of convergence. It is useful because its
inverse measures approximatively the number of iterations
needed to approximate the equilibrium within a precision of
Rodi – Stability and Conditioning. – p.23/73
and machine precision
It follows that
should be of the order of
It is useless to use
If the machine precision is
then the number of iteration
necessary to reach the machine precision is
One has then
will be called stiffness ratio. In the example
The ratio
Rodi – Stability and Conditioning. – p.24/73
Stiffness
and
are small and of the same
is large (in the example >
stiff if
well conditioned if
order;
Both and measure the sensitivity of the solution with
respect to change of the initial condition, although such
measure may change considerably.
The problems will be classified as follows
);
ill conditioned if both parameters are large.
The definition of stiffness is unusual in this context. We will
show later that it fits with the usual definition in the context of
numerical methods for ODE.
Rodi – Stability and Conditioning. – p.25/73
Example of stiff problem
α=0.1
10
9
8
7
6
5
4
3
2
γ = 1/5
d
0
0
5
10
γd = 1/40
15
20
25
30
35
40
45
Figure 1: There is no need to use smaller value of
50
1
Rodi – Stability and Conditioning. – p.26/73
Entropy and information content
Let associate to
the frequencies
This can be obtained as usual by dividing the
interval
in subintervals and counting the numbers of
points in each of them. The entropy is given by
The maximum value is attained at
measures the expected values of information of the
distribution.
It is maximum when the distribution is uniform, i.e. when in
each subinterval fall the same number of points, for example
one.
Rodi – Stability and Conditioning. – p.27/73
1.5
Entropy
1
0.5
0
0
N*+1
50
100
Rodi – Stability and Conditioning. – p.28/73
Optimal mesh
1
0.9
0.8
0.7
0.6
0.5
0.4
0.3
0.2
0.1
h
1
0
0
h2
2
h4
h3
4
6
10
12
14
16
18
20
axis, it will correspond a mesh
To the distribution on the
on the axis.
8
.
Rodi – Stability and Conditioning. – p.29/73
Reduction of stiffness
It is then enough to choose such that
have comparable stiffness ratio for the two modes.
Stiffness may be reduced by reducing . Consider the
and
multiscale example. The two modes are
After the fastest mode has become less then the machine
precision (say for
), on may compute the slower mode
not at each step but every steps, i.e. one may define a new
variable
In the new variable the slower mode
equation becomes
to
Rodi – Stability and Conditioning. – p.30/73
Multiscale case
stiff
non stiff
10
10
9
9
8
8
7
7
6
6
5
5
4
4
3
3
2
2
1
1
0
0
10
20
30
40
50
0
0
10
20
30
40
50
Time scale change a stiff problem to a non stiff one
Rodi – Stability and Conditioning. – p.31/73
Ill conditioned problem.
As example of ill conditioned problem consider the case of
the polynomial associated to
Note that
In
is the root of
This is not the case if
Even if
the solution may become large for large
this case both parameters and are large.
where
and
The solution is
Rodi – Stability and Conditioning. – p.32/73
Ill conditioned
4
4.5
||y||
x 10
4
3.5
3
2.5
2
1.5
1
0.5
40
50
60
30
20
10
0
0
Rodi – Stability and Conditioning. – p.33/73
More general problems
is any norm in
where
The same approach of using two conditioning parameters can
be used for more general problems.
In the case of boundary value problems, the parameters
and can be defined as well. In other words, the notion of
conditioning continues to hold, as well as the notion of
stiffness.
is a solution of a discrete BVP depending on some
If
we can define
boundary condition
Rodi – Stability and Conditioning. – p.34/73
LDU factorization
matrix. Consider the following
Let be a real
sequence,
where
Rodi – Stability and Conditioning. – p.35/73
and
appropriately defined vectors. The final result is
with
This example differs from the others for the following reasons:
matrices;
1. the underlying space is the space of real
2. the motion is not autonomous, since the factor matrices
and
change at each step.
The problem here is in the possibility that the entries of
product matrices
and
may become large and then not
well represented in the computer arithmetic.
Rodi – Stability and Conditioning. – p.36/73
Grow factors
The dynamic is not uniquely defined in the sense that there is
and
One is able
a certain freedom in the choice of both
to maintain small the entries of
by the pivoting strategy, but
still the entries of
and consequently
may grow.
Parameters which monitor such grow, called grow factors,
have been defined by Wilkinson and more recently by Amodio
and Mazzia. When such parameters grow exponentially with
the problem is said unstable. The parameter now
becomes:
which coincides with the grow factor used by Amodio and
Mazzia.
Rodi – Stability and Conditioning. – p.37/73
Tridiagonal Toeplitz matrices
Let
with
real.
is a tridiagonal
Toeplitz matrix.
Consider the problem
where
is the first unit vector in
It is equivalent to solve the
following discrete BVP, where are the entries of the vector
Rodi – Stability and Conditioning. – p.38/73
Tridiagonal Toeplitz matrix 2
After
Let
be such roots, the solution is
imposing the boundary condition, we obtain,
The solution of the above problem can be expressed by
means of the roots of the polynomial
Rodi – Stability and Conditioning. – p.39/73
Tridiagonal Toeplitz matrix 3
1. in order to the solution being bounded, the two roots need
to be distinct. Actually they should have distinct moduli in
order to prevent the denominator becoming too small;
the solution is bounded with respect
and
and not as the faster mode
perturbation growing as
small perturbations of initial data, will cause
Even when
3. if
to
2. the solution is essentially generate by the root of minimal
modulus
Rodi – Stability and Conditioning. – p.40/73
behaves as
and
Both
Tridiagonal Toeplitz matrix 4
They are bounded with respect to when
Considering that the vector is the first unit vector in
this
implies that the vector is the first column of
By using sequentially the vectors
and applying similar
arguments, we arrive to the conclusion that
is a necessary and sufficient condition to have
bounded
uniformly with respect to
Let and be the corresponding conditioning parameters,
we have
Rodi – Stability and Conditioning. – p.41/73
Tridiagonal Toeplitz matrix 5
When each is bounded with respect to
the matrix is said
well conditioned (
).
or
may
When either
We say that the matrix is weakly well
grow linearly with
conditioned.
Let
What kind of information may provide this parameter?
Let us summarize with a specific example.
Rodi – Stability and Conditioning. – p.42/73
..
.
.
..
..
.
..
.
Stiffness and Toeplitz matrices
1
c(α)
0.5
0
−0.5
−1
−1.5
−10
−8
−6
−4
−2
0
2
4
6
8
10
α
Rodi – Stability and Conditioning. – p.43/73
10
|z1|
|z |
9
2
8
7
6
5
4
3
2
1
0
−10 −9 −8 −7 −6 −5 −4 −3 −2 −1
0
1
2
3
4
5
6
7
8
9 10
Rodi – Stability and Conditioning. – p.44/73
15
10
k
γ
k/γ
cond(A)
10
10
5
10
0
10
−5
10
−10
−5
0
5
10
Rodi – Stability and Conditioning. – p.45/73
0
α=−10
0
50
−0.5
−1
0
20
40
60
80
100
α=−3
α=−1.5
100
50
100
50
100
50
−0.5
20
40
60
80
100
100
0
5
0
0
50
−5
0
50
0
0
−1
0
100
0
20
40
60
80
100
100
0
Rodi – Stability and Conditioning. – p.46/73
6
α=1
2
x 10
0
0
−2
0
50
20
40
60
80
100
α=4
0.5
100
0
50
100
50
100
50
100
0
0
50
−0.5
α=10
−1
0
14
x 10
5
20
40
60
80
100
0
0
−5
0
100
0
50
20
40
60
80
100
100
0
Rodi – Stability and Conditioning. – p.47/73
More general Toeplitz matrices
be a generic banded Toeplitz matrix.
Let
The matrix is well conditioned if the roots of the polynomial
associated to the matrix are such that of them are inside
and are outside the unit disk.
Further generalization to define well conditioning in a region
can be done as follows.
Rodi – Stability and Conditioning. – p.48/73
Conditioning in a region
such that
In correspondence of the values of
Now the roots of the polynomial
depend on the complex parameter
Consider the matrix
in the previous example, and let
with real and complex.
By posing
the problem becomes
the
solution does not exist
Rodi – Stability and Conditioning. – p.49/73
Boundary locus
Suppose now that we ask to the system to be well conditioned
for all value of in a prefixed region, for example
This means that we ask that for all
the two roots need
to be one outside and one inside the unit circle.
The best way to check if the two roots remain in the same
position with respect the unit disk is to monitor if they cross
the unit circle for
This leads to consider the map
and the curve (boundary locus)
Rodi – Stability and Conditioning. – p.50/73
is the locus of value of where one or both the
cross the unit circle.
The curve
roots
then the matrix
will be well conditioned in
.
If such curve completely lies in
Rodi – Stability and Conditioning. – p.51/73
BC with eigenvalues
2
1.5
1
imag(λ)
0.5
0
−0.5
−1
−1.5
−2
0
0.5
1
1.5
real(λ)
2
2.5
3
Typical boundary locus
Rodi – Stability and Conditioning. – p.52/73
Let
Generalization
two banded Toeplitz matrices;
and
the number of
is the number of lower diagonals and
upper diagonals of
the two polynomial of degree
and
the maximum value of the corresponding
bandwidths.
Rodi – Stability and Conditioning. – p.53/73
b)
of the roots of the polynomial
disk and are outside.
are inside the unit
a)
Theorem.
is well conditioned in a region of the
iff one of the two conditions hold
complex plane, for all
true:
If
have common points and if is a Jordan curve, then
the matrix is weakly well conditioned.
Rodi – Stability and Conditioning. – p.54/73
Conditioning of continuous problems
Moreover,
the parameters
to the discrete case.
The definitions of critical points, stability, asymptotic stability
and instability apply to this case as well, only considering that
now the variable is continuous.
can be defined in a way very similar
Rodi – Stability and Conditioning. – p.55/73
In the non scalar case,
Stiffness of continuous problems
is the ratio of largest and smallest
eigenvalues.
Rodi – Stability and Conditioning. – p.56/73
Discrete representation
be the mesh.
depend on h
and
Let h
Both
Definition. A continuous problem will be said
well-represented by a discrete problem if
The request may lead to define the optimal mesh.
This has been discussed by Francesca Mazzia in her talk.
Rodi – Stability and Conditioning. – p.57/73
be the unknown second component,
Let
Not well represeted problem
Rodi – Stability and Conditioning. – p.58/73
The solution of problem is
which corresponds to
. The problem is very well
but very ill conditioned with
conditioned as BVP
respect to (shooting method).
Let
Rodi – Stability and Conditioning. – p.59/73
x10 43
8
7
6
e
5
4
3
2
1
0
-2
-1.5
-1
-0.5
0
0.5
1
1.5
2
mu
Rodi – Stability and Conditioning. – p.60/73
Test equations
In the process of constructing numerical methods the notions
of stability and asymptotic stability have played a central role.
In fact the numerical methods have been essentially modeled
on the two test equations
Rodi – Stability and Conditioning. – p.61/73
Perron Theorem
and
where
In the first case the origin is a (marginally) stable equilibrium,
while it is AS in the second case. There is a fundamental
difference between the two cases.
The second case may be considered as a model
representative of more difficult (non linear) equations. A
general theorem of dynamical systems (Perron) support such
model.
Theorem. Let
If the is A.S. for the the linear part, then it is AS for the
complete equation.
Rodi – Stability and Conditioning. – p.62/73
Perron & Ostrowsky
It is not by chance that a theorem essentially similar to the
above one was established independently by Ostrowski in the
context of iterative procedures to find zeros of nonlinear
equations (see Ortega ). The linear test equation is then more
representative.
No similar general results are available in the case of marginal
stability.
Test equation
has, however, played an important role
essentially in proving the convergence of the methods.
Rodi – Stability and Conditioning. – p.63/73
LMM for ODEs
zero diagonals and
are Toeplitz matrices having
and
where
the the initial data and
be the final data.
A LMM applied to the test equation with the above boundary
conditions leads to the following discrete problem,
Let
lower non
upper non zero diagonals. The case
corresponds to the classical choice of using discrete
IVPs.
Rodi – Stability and Conditioning. – p.64/73
roots will be inside and
will be outside the unit disk. The convergence is ensured if
is a Jordan curve (this will prevent having double roots on the
unit circle) and
will be well conditioned if
The
matrix
The representative polynomial is now
The non zero entries of the matrix
are the coefficients of
are the coefficients of the
polynomial while those of
polynomial To the matrix
we may apply
either the generalized root condition or the boundary locus
condition.
Rodi – Stability and Conditioning. – p.65/73
Example: The midpoint method
..
.
..
.
.
..
.
..
.
.
..
..
The midpoint rule may generate two different discrete
methods, the classical one where the additional condition is
placed at the origin, and the BVM method where the
additional condition is placed at the end. In the first case the
matrix
is
Rodi – Stability and Conditioning. – p.66/73
..
..
.
.
.
..
while in the second case it is
The representing polynomial
is the same
has two lower diagonals, while
in the two cases but
has only one.
Except for the values of on the segment
the roots
of
are always one outside and one inside the unit disk.
The matrix
is well conditioned for
while
is
not.
Rodi – Stability and Conditioning. – p.67/73
Classes of BVMs
The result in the previous example is not isolated.
The BVM approach permits to obtain classes of
well-conditioned numerical methods ( -stable methods)
and also classes of perfect stable methods of any order.
Rodi – Stability and Conditioning. – p.68/73
Conservative Problems
is a constant of the motion.
The energy
Consider the non linear pendulum.
When the trapezoidal method is applied we get
Rodi – Stability and Conditioning. – p.69/73
BREATHING EFFECT
where
will be called volatile hamiltonian.
Rodi – Stability and Conditioning. – p.70/73
1.5
h=2.09
1
0.5
h=0.8
0
−0.5
−1
−1.5
−1
−0.8
−0.6
−0.4
−0.2
0
0.2
0.4
Rodi – Stability 0.8
and Conditioning. –1
p.71/73
0.6
Volatile Hamiltonian
0.2
0
−0.2
−0.4
−0.6
−0.8
−1
−1.2
0
50
100
150
200
250
300
350
Rodi – Stability
and Conditioning.
– p.72/73
400
450
500
Further conditions
Simplecticity
Conservation
Ergodicity
.....
Rodi – Stability and Conditioning. – p.73/73
fixed
Definitions
is stable if
for
Definition. The critical solution
such that
The critical solutions (equilibria) are the
where
solutions of
Definition. The critical solution
If can be chosen independent on
stability is global.
Definition. The critical solution is asymptotically stable
(AS) if it is stable and, moreover,
the stability or the AS
is unstable if it is not stable.
Rodi – Stability and Conditioning. – p.74/73