Numerical Linear Algebra
Chap. 3: Eigenvalue Problems

Heinrich Voss ([email protected])
Hamburg University of Technology, Institute of Numerical Simulation, 2006


Eigenvalues

λ ∈ C is an eigenvalue of A ∈ C^{n×n} if the homogeneous linear system of equations Ax = λx has a nontrivial solution x ∈ C^n \ {0}. Then x is called an eigenvector of A corresponding to λ. The set of all eigenvalues of A is called the spectrum of A and is denoted by σ(A).

λ is an eigenvalue of A if and only if det(A − λI) = 0. χ(λ) := det(A − λI) is a polynomial of degree n, the characteristic polynomial of A.


Eigenvalues ct.

If λ̃ is a root of χ of multiplicity k (i.e. the polynomial χ(λ) is divisible by (λ − λ̃)^k but not by (λ − λ̃)^{k+1}), then k is called the algebraic multiplicity of λ̃, denoted by α(λ̃). For A ∈ C^{n×n} the characteristic polynomial χ has degree n; hence the sum of the algebraic multiplicities of all eigenvalues equals n.

If λ is an eigenvalue of A, then E_λ := {x ∈ C^n : (A − λI)x = 0} is a subspace of C^n, which is called the eigenspace of A corresponding to λ. γ(λ) := dim E_λ is the geometric multiplicity of the eigenvalue λ. It can be shown that γ(λ) ≤ α(λ) for every eigenvalue λ.


Similar matrices

Let X ∈ C^{n×n} be nonsingular. Then A and B := X^{−1}AX are called similar matrices, and A ↦ X^{−1}AX is called a similarity transformation. Since

    det(B − λI) = det(X^{−1}(A − λI)X) = det(X^{−1}) det(A − λI) det(X) = det(A − λI),

similar matrices have the same eigenvalues, including their algebraic multiplicities. It can be shown that the geometric multiplicities coincide as well.


Diagonalizable matrix

Let A x^j = λ_j x^j, j = 1, …, k, where λ_i ≠ λ_j for i ≠ j. Then the set {x^1, …, x^k} is linearly independent.

To see this, let x = Σ_{j=1}^k α_j x^j = 0. For j ∈ {1, …, k} it follows that

    (A − λ_1 I) ⋯ (A − λ_{j−1} I)(A − λ_{j+1} I) ⋯ (A − λ_k I) x = α_j ∏_{i=1, i≠j}^k (λ_j − λ_i) x^j = 0,

and therefore α_j = 0.

In particular, if A has n distinct eigenvalues λ_j with eigenvectors x^j, then X := (x^1, …, x^n) is nonsingular, and it holds that

    AX = (Ax^1, …, Ax^n) = (λ_1 x^1, …, λ_n x^n) = XΛ   ⟺   X^{−1}AX = Λ,

where Λ := diag(λ_1, …, λ_n) denotes the diagonal matrix with entries λ_1, …, λ_n. Hence A is diagonalizable, i.e. similar to a diagonal matrix.


Diagonalizable matrix ct.

More generally, if for all eigenvalues λ_j, j = 1, …, k of A the algebraic and geometric multiplicities coincide (α(λ_j) = γ(λ_j)), then choosing in each of the eigenspaces E_{λ_j} a basis x^{j,1}, …, x^{j,α(λ_j)}, the matrix

    X = (x^{1,1}, …, x^{1,α(λ_1)}, x^{2,1}, …, x^{k,α(λ_k)})

is nonsingular, and it diagonalizes A. It can be shown that A is diagonalizable if and only if α(λ_j) = γ(λ_j) for every eigenvalue λ_j of A.

For

    A = [ 0  1
          0  0 ]

it holds that α(0) = 2 ≠ 1 = γ(0), and therefore not every matrix is diagonalizable.
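The relation X^{−1}AX = Λ is easy to verify numerically. The following is a minimal NumPy sketch (the 3×3 matrix is a made-up example with distinct eigenvalues, not one from the lecture); np.linalg.eig returns the eigenvalues and a matrix whose columns are eigenvectors.

```python
import numpy as np

# Hypothetical example matrix, chosen only so that its eigenvalues are distinct.
A = np.array([[2.0, 1.0, 0.0],
              [0.0, 3.0, 1.0],
              [0.0, 0.0, 5.0]])

lam, X = np.linalg.eig(A)            # A x^j = lam_j x^j, the columns of X are the x^j
Lambda = np.linalg.inv(X) @ A @ X    # similarity transformation X^{-1} A X

print(np.allclose(Lambda, np.diag(lam)))   # True: A is diagonalizable
```

For the non-diagonalizable 2×2 example above, the same call would return two nearly parallel eigenvector columns, reflecting γ(0) = 1 < α(0) = 2.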
Jordan's canonical form

Let A ∈ C^{n×n} with distinct eigenvalues λ_1, …, λ_k. Then there exists a nonsingular matrix X such that

    X^{−1}AX = diag(J_1, …, J_k)

is a block diagonal matrix. Each of the diagonal blocks J_j = diag(J_{j,1}, …, J_{j,γ(λ_j)}) is itself a block diagonal matrix of dimension α(λ_j) with γ(λ_j) blocks, where each Jordan block J_{j,i} has λ_j on the diagonal, ones on the superdiagonal and zeros elsewhere:

    J_{j,i} = [ λ_j  1
                     λ_j  ⋱
                          ⋱   1
                              λ_j ]


Hermitian matrices

A ∈ R^{n×n} is symmetric if A = A^T. More generally, A ∈ C^{n×n} is a Hermitian matrix if A^H := Ā^T = A, where Ā denotes the matrix obtained from A by replacing each of its entries by its complex conjugate.

All eigenvalues of a Hermitian matrix are real: for Ax = λx it holds that

    x^H A x = x^H (λx) = λ x^H x   and   x^H A x = (A^H x)^H x = (Ax)^H x = (λx)^H x = λ̄ x^H x,

from which we get λ = λ̄, i.e. λ ∈ R.

Eigenvectors of a Hermitian matrix corresponding to distinct eigenvalues are orthogonal: for Ax = λx, Ay = μy and λ ≠ μ it holds that

    y^H A x = λ y^H x   and   y^H A x = (A^H y)^H x = (Ay)^H x = μ y^H x.

Hence (λ − μ) y^H x = 0, and λ ≠ μ implies y^H x = 0.


Invariant subspace

A subspace V of C^n is an invariant subspace of A if Ax ∈ V for every x ∈ V.

Every invariant subspace V ≠ {0} of A contains an eigenvector of A: let x^1, …, x^k ∈ C^n be a basis of V. Then for j = 1, …, k there exist b_{ij} ∈ C such that A x^j = Σ_{i=1}^k b_{ij} x^i. Let λ be an eigenvalue of B = (b_{ij}) ∈ C^{k×k} with eigenvector ξ = (ξ_1, …, ξ_k)^T, and let x := Σ_{i=1}^k ξ_i x^i ≠ 0. Then

    Ax = Σ_{j=1}^k ξ_j A x^j = Σ_{j=1}^k Σ_{i=1}^k ξ_j b_{ij} x^i = Σ_{i=1}^k (Σ_{j=1}^k b_{ij} ξ_j) x^i = Σ_{i=1}^k λ ξ_i x^i = λx.


Hermitian matrices are diagonalizable

Let A be a Hermitian matrix. Then there exists a unitary matrix U ∈ C^{n×n} (i.e. U^H U = I) such that U^H A U = diag(λ_1, …, λ_n).

Let x^1 be an eigenvector of A such that Ax^1 = λ_1 x^1 and (x^1)^H x^1 = 1. Then for x ∈ C^n with x^H x^1 = 0 it holds that

    (Ax)^H x^1 = x^H A^H x^1 = x^H (Ax^1) = λ_1 x^H x^1 = 0.

Hence V_1 := {x ∈ C^n : x^H x^1 = 0} is an invariant subspace of A, and therefore it contains an eigenvector x^2, which can be normalized such that (x^2)^H x^2 = 1.

If x^1, …, x^j are j orthonormal eigenvectors of A, then in the same way as before

    V_j := {x^1, …, x^j}^⊥ = {x ∈ C^n : x^H x^i = 0, i = 1, …, j}

is an invariant subspace of A, and hence there exists an eigenvector x^{j+1} which is orthogonal to x^1, …, x^j. The matrix U = (x^1, …, x^n) has the desired property.


Rayleigh's principle

Let A ∈ C^{n×n} be a Hermitian matrix. Then for x ≠ 0

    R_A(x) := (x^H A x) / (x^H x)

is called the Rayleigh quotient of A at x.

Let λ_1 ≤ λ_2 ≤ ⋯ ≤ λ_n be the eigenvalues of A, and let x^1, …, x^n be a set of corresponding orthonormal eigenvectors. Then it holds that

    λ_1 = min_{x≠0} R_A(x)   and   λ_n = max_{x≠0} R_A(x),

and for i = 1, 2, …, n,

    λ_i = min{ R_A(x) : x ∈ C^n, x ≠ 0, x^H x^j = 0, j = 1, …, i−1 }
        = max{ R_A(x) : x ∈ C^n, x ≠ 0, x^H x^j = 0, j = i+1, …, n }.
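Rayleigh's principle can be illustrated with a quick experiment (a minimal NumPy sketch on a random Hermitian test matrix, not part of the lecture): every Rayleigh quotient lies between λ_1 and λ_n, and the bounds are attained at the corresponding eigenvectors.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 6
M = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
A = (M + M.conj().T) / 2                      # Hermitian test matrix

def rayleigh(A, x):
    """Rayleigh quotient R_A(x) = x^H A x / x^H x (real for Hermitian A)."""
    return (x.conj() @ A @ x).real / (x.conj() @ x).real

lam = np.linalg.eigvalsh(A)                   # eigenvalues in ascending order
samples = [rayleigh(A, rng.standard_normal(n) + 1j * rng.standard_normal(n))
           for _ in range(1000)]

print(lam[0] <= min(samples), max(samples) <= lam[-1])              # True True
print(np.isclose(rayleigh(A, np.linalg.eigh(A)[1][:, 0]), lam[0]))  # R_A(x^1) = λ_1
```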
Proof of Rayleigh's principle

Let x^1, …, x^n be an orthonormal system of eigenvectors of A ∈ C^{n×n}, where A x^j = λ_j x^j. For x ∈ C^n, x ≠ 0, write x = Σ_{j=1}^n ξ_j x^j. Then

    x^H x = Σ_{j,k=1}^n ξ̄_j ξ_k (x^j)^H x^k = Σ_{j=1}^n |ξ_j|²,

    x^H A x = Σ_{j,k=1}^n ξ̄_j ξ_k λ_k (x^j)^H x^k = Σ_{j=1}^n λ_j |ξ_j|².

Hence

    R_A(x) = Σ_{j=1}^n α_j λ_j   with   α_j = |ξ_j|² / Σ_{k=1}^n |ξ_k|².


Proof of Rayleigh's principle ct.

From 0 ≤ α_j ≤ 1 and Σ_{j=1}^n α_j = 1 one obtains

    λ_1 = Σ_{j=1}^n α_j λ_1 ≤ Σ_{j=1}^n α_j λ_j = R_A(x) ≤ Σ_{j=1}^n α_j λ_n = λ_n,

with λ_1 = R_A(x^1) and λ_n = R_A(x^n). The characterizations

    λ_i = min{ R_A(x) : x ∈ C^n, x^H x^j = 0, j = 1, …, i−1 }
        = max{ R_A(x) : x ∈ C^n, x^H x^j = 0, j = i+1, …, n }

follow in a similar way, since ξ_1 = ⋯ = ξ_{i−1} = 0 if x^H x^j = 0 for j = 1, …, i−1.


Numerical methods

Linear systems of equations Ax = b can be solved by a finite algorithm (i.e. a finite number of operations) such as Gaussian elimination.

Determining an eigenvalue of a matrix A ∈ R^{n×n} is equivalent to finding a root of the characteristic polynomial χ(λ) := det(A − λI) = 0. It is known (Abel's theorem) that for n ≥ 5 there is no closed formula for solving det(A − λI) = 0 for λ. Hence, the eigenvalue problem Ax = λx can usually be solved only by iterative methods.


Example

    A = [ 0.2  0.3  0.4
          0.6  0.2  0.5
          0.2  0.5  0.1 ]

Choose any vector x^0 ∈ R^3 and compute the sequence x^k := A x^{k−1}, k = 1, 2, 3, … After a small number of steps (≈ 10) we obtain

    x^k ≈ (0.5122, 0.6974, 0.5013)^T   with   ‖A x^k − x^k‖ small.

x^k seems to be an eigenvector corresponding to the eigenvalue λ = 1. Is this a miracle?


A is stochastic

All elements of A are nonnegative, and every column of A adds up to 1. Matrices with these properties are called stochastic. They describe the behavior of Markov chains.

If A is stochastic, then every row of A^T adds up to 1, and therefore (1, 1, …, 1)^T is an eigenvector of A^T corresponding to the eigenvalue 1. det(A − λI) = det(A^T − λI) implies that the eigenvalues of A and A^T coincide. Hence, every stochastic matrix has the eigenvalue λ = 1.
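The experiment above is easy to reproduce. The following minimal NumPy sketch (with an arbitrary starting vector) applies A repeatedly and normalizes the result only for the final comparison:

```python
import numpy as np

A = np.array([[0.2, 0.3, 0.4],
              [0.6, 0.2, 0.5],
              [0.2, 0.5, 0.1]])     # column-stochastic: A.sum(axis=0) == [1, 1, 1]

x = np.array([1.0, 0.0, 0.0])       # arbitrary starting vector x^0
for k in range(10):
    x = A @ x                       # x^k = A x^{k-1}

print(np.linalg.norm(A @ x - x))    # small: x is (nearly) an eigenvector for λ = 1
print(x / np.linalg.norm(x))        # ≈ (0.5122, 0.6974, 0.5013), as quoted above
```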
Power method

Assume that A is diagonalizable, i.e. there exist n linearly independent eigenvectors u^1, …, u^n of A, and assume that λ_1 is a dominant eigenvalue,

    |λ_1| > |λ_j|   for j = 2, …, n.

The initial vector x^0 can be represented as

    x^0 = Σ_{j=1}^n α_j u^j,

and therefore

    A x^0 = A Σ_{j=1}^n α_j u^j = Σ_{j=1}^n α_j A u^j = Σ_{j=1}^n α_j λ_j u^j.


Power method ct.

    A² x^0 = A Σ_{j=1}^n α_j λ_j u^j = Σ_{j=1}^n α_j λ_j A u^j = Σ_{j=1}^n α_j λ_j² u^j.

By induction it follows that

    A^m x^0 = Σ_{j=1}^n α_j λ_j^m u^j = λ_1^m ( α_1 u^1 + Σ_{j=2}^n α_j (λ_j/λ_1)^m u^j ).

From |λ_j|/|λ_1| < 1 it follows that (λ_j/λ_1)^m → 0. Hence, if α_1 ≠ 0, then the sequence

    λ_1^{−m} A^m x^0 = α_1 u^1 + Σ_{j=2}^n α_j (λ_j/λ_1)^m u^j

converges to an eigenvector corresponding to λ_1.


Power method ct.

If |λ_1| ≠ 1, then for increasing m one obtains overflow or underflow. Apply the method to

    B = [ 0.2   0.3  0.4
          0.6  −0.1  0.5
          0.2   0.5  0.1 ]

The sequence x^m converges to the null vector. The largest eigenvalue of B in modulus seems to be smaller than 1.


Power method

Normalize x^m in each step to avoid underflow or overflow.

1: Given initial vector x^0
2: for m = 0, 1, 2, … until convergence do
3:     y^{m+1} = A x^m
4:     k_{m+1} = ‖y^{m+1}‖
5:     x^{m+1} = y^{m+1} / k_{m+1}
6: end for

With this modification the power method converges in a reasonable number of steps to an eigenvector of B corresponding to the dominant eigenvalue λ_1 = 0.9304.


Observations

The representation

    λ_1^{−m} A^m x^0 = α_1 u^1 + Σ_{j=2}^n α_j (λ_j/λ_1)^m u^j

demonstrates that the speed of convergence depends on

    q := max_{j=2,…,n} |λ_j| / |λ_1|.

The smaller q is, the faster the power method converges.

If the initial vector x^0 has no component of the eigenvector corresponding to the dominant eigenvalue (i.e. α_1 = 0), then in the course of the algorithm rounding errors usually produce a component of u^1, which is amplified in further iterations until convergence. Starting the power method for A with a linear combination of eigenvectors corresponding to λ_2 and λ_3, one obtains a reasonable approximation to an eigenvector corresponding to λ_1 after 40 iterations.


Observations ct.

If λ_1 is a multiple dominant eigenvalue of A,

    λ_1 = λ_2 = ⋯ = λ_p,   |λ_1| > |λ_j| for j = p+1, …, n,

and A is diagonalizable, then all considerations above stay true. For |λ_1| = |λ_2| > |λ_j|, j = 3, …, n, with λ_1 ≠ λ_2 one does not obtain convergence of the power method.

In steps 4 and 5 of the power method the normalization can be replaced by a scaling k_{m+1} = ℓ^T y^{m+1}, where ℓ ∈ R^n is a vector which is not orthogonal to the eigenvector u^1 corresponding to the dominant eigenvalue.


Inverse iteration

Applying the power method to the inverse matrix A^{−1} one can determine the smallest eigenvalue in modulus.

Inverse iteration
Given initial vector x^0
for m = 0, 1, 2, … until convergence do
    Solve A y^{m+1} = x^m for y^{m+1}
    k_{m+1} = ‖y^{m+1}‖
    x^{m+1} = y^{m+1} / k_{m+1}
end for

Applying inverse iteration to the matrix B one gets fast convergence to an eigenvector corresponding to the smallest eigenvalue λ_3 = −0.2111. For A the convergence is very slow. What is the difference?


Inverse iteration ct.

The shifted matrix A − λ̃I has eigenvalues λ_j − λ̃ if λ_j are the eigenvalues of A. If λ̃ is not an eigenvalue of A, then (A − λ̃I)^{−1} has eigenvalues 1/(λ_j − λ̃). If |λ_p − λ̃| < |λ_j − λ̃| for j = 1, …, n, j ≠ p, then

Inverse iteration with fixed shift
Given initial vector x^0
for m = 0, 1, 2, … until convergence do
    Solve (A − λ̃I) y^{m+1} = x^m for y^{m+1}
    k_{m+1} = ℓ^T y^{m+1}
    x^{m+1} = y^{m+1} / k_{m+1}
end for

converges to an eigenvector corresponding to λ_p. The rate of convergence is

    q = max_{j≠p} |λ_p − λ̃| / |λ_j − λ̃|.
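Both iterations translate almost line by line into code. The following is a minimal NumPy sketch (a fixed iteration count instead of a real convergence test, normalization by the Euclidean norm, and the Rayleigh quotient as eigenvalue estimate); B is the example matrix from the slides and the function names are just illustrative:

```python
import numpy as np

def power_method(A, x0, steps=50):
    """Normalized power method: approximates the dominant eigenvalue
    (via the Rayleigh quotient) and a corresponding eigenvector."""
    x = x0 / np.linalg.norm(x0)
    for _ in range(steps):
        y = A @ x                        # y^{m+1} = A x^m
        x = y / np.linalg.norm(y)        # x^{m+1} = y^{m+1} / k_{m+1}
    return x @ A @ x, x

def inverse_iteration(A, x0, shift=0.0, steps=50):
    """Inverse iteration with a fixed shift: approximates the eigenvalue
    of A closest to `shift` and a corresponding eigenvector."""
    n = A.shape[0]
    x = x0 / np.linalg.norm(x0)
    for _ in range(steps):
        y = np.linalg.solve(A - shift * np.eye(n), x)   # (A - shift*I) y^{m+1} = x^m
        x = y / np.linalg.norm(y)
    return x @ A @ x, x

B = np.array([[0.2,  0.3, 0.4],
              [0.6, -0.1, 0.5],
              [0.2,  0.5, 0.1]])
x0 = np.ones(3)

print(power_method(B, x0)[0])        # dominant eigenvalue of B (largest in modulus)
print(inverse_iteration(B, x0)[0])   # eigenvalue of B closest to the shift 0
print(np.linalg.eigvals(B))          # reference values
```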
Inverse iteration with variable shifts

For large m it holds that x^m is an approximate eigenvector corresponding to λ_p and ℓ^T x^m = 1. Hence

    k_{m+1} = ℓ^T y^{m+1} = ℓ^T (A − λ̃I)^{−1} x^m ≈ (1/(λ_p − λ̃)) ℓ^T x^m = 1/(λ_p − λ̃).

This observation suggests iterating the shift as well: with the current shift λ_m,

    k_{m+1} ≈ 1/(λ_p − λ_m)   ⟹   λ_{m+1} := λ_m + 1/k_{m+1}.

Inverse iteration with variable shifts
Given initial vector x^0 and initial approximation λ_0
for m = 0, 1, 2, … until convergence do
    Solve (A − λ_m I) y^{m+1} = x^m for y^{m+1}
    k_{m+1} = ℓ^T y^{m+1}
    x^{m+1} = y^{m+1} / k_{m+1}
    λ_{m+1} = λ_m + 1/k_{m+1}
end for


Quadratic convergence

Let λ̃ be an algebraically simple eigenvalue of A (i.e. λ̃ is a simple root of det(A − λI) = 0), and let ũ be a corresponding eigenvector such that ℓ^T ũ = 1. Then inverse iteration with variable shifts converges locally and quadratically to (λ̃, ũ): there exists a constant C > 0 such that, if λ_0 is sufficiently close to λ̃ and x^0 is sufficiently close to ũ, then

    |λ̃ − λ_{m+1}| ≤ C |λ̃ − λ_m|²   and   ‖ũ − x^{m+1}‖ ≤ C ‖ũ − x^m‖².


Deflation

Assume that we have already obtained the largest (smallest, closest to a given shift) eigenvalue λ̃ and a corresponding eigenvector ũ. How can we compute further eigenpairs by the power method?

Let ỹ be a left eigenvector of A corresponding to some eigenvalue μ̃ ≠ λ̃, i.e. ỹ^T A = μ̃ ỹ^T. Then it holds that

    μ̃ ỹ^T ũ = (ỹ^T A) ũ = ỹ^T (A ũ) = λ̃ ỹ^T ũ   ⟹   ỹ^T ũ = 0.


Deflation ct.

Let B := A − ũ w^T, where w ∈ R^n satisfies w^T ũ ≠ 0. Then

    B ũ = A ũ − ũ w^T ũ = (λ̃ − w^T ũ) ũ,

i.e. ũ is an eigenvector of B corresponding to the eigenvalue λ̃ − w^T ũ. For an eigenvalue μ̃ ≠ λ̃ of A and its corresponding left eigenvector y it holds that

    y^T B = y^T A − y^T ũ w^T = μ̃ y^T.

Hence, all eigenvalues of A other than λ̃ are kept (only the right eigenvectors can change), whereas λ̃ is replaced by λ̃ − w^T ũ, which can be moved anywhere by the choice of w (for instance to 0, in order to compute the second largest eigenvalue of A in modulus).


Symmetric matrices

Let A = A^T ∈ R^{n×n} be a symmetric matrix, λ̃ an eigenvalue of A, and ũ a corresponding eigenvector such that ‖ũ‖ = 1. Let

    B = A − λ̃ ũ ũ^T.

If v ∈ R^n is an eigenvector of A (Av = μv) such that v^T ũ = 0, then

    B v = A v − λ̃ ũ ũ^T v = A v = μ v.

Hence, all eigenvalues of A which are different from λ̃ are eigenvalues of B as well, and 0 is an eigenvalue of B replacing λ̃. If λ̃ is a multiple eigenvalue of A, then λ̃ remains an eigenvalue of B, but its multiplicity is reduced by 1.
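A small numerical check of the symmetric deflation step B = A − λ̃ ũ ũ^T (a minimal NumPy sketch on a random symmetric matrix, not from the lecture): the spectrum of B agrees with that of A except that λ̃ has been replaced by 0, so the power method applied to B would deliver the second largest eigenvalue of A in modulus.

```python
import numpy as np

rng = np.random.default_rng(2)
M = rng.standard_normal((5, 5))
A = (M + M.T) / 2                          # symmetric test matrix

lam, V = np.linalg.eigh(A)                 # ascending eigenvalues, orthonormal eigenvectors
i = np.argmax(np.abs(lam))                 # index of the dominant eigenvalue
lam_t, u_t = lam[i], V[:, i]               # pair (λ~, u~) with ||u~|| = 1

B = A - lam_t * np.outer(u_t, u_t)         # deflated matrix

print(np.round(np.sort(lam), 4))
print(np.round(np.sort(np.linalg.eigvalsh(B)), 4))
# same eigenvalues, except that λ~ has been replaced by 0
```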
QR Algorithm

QR algorithm
A_0 := A
for m = 0, 1, 2, … until convergence do
    Factorize A_m = Q_m R_m
    A_{m+1} = R_m Q_m
end for

Since

    A_{m+1} = R_m Q_m = Q_m^T (Q_m R_m) Q_m = Q_m^T A_m Q_m,

all A_m are (orthogonally) similar, and therefore they have the same eigenvalues.


QR algorithm ct.

If the eigenvalues of A are pairwise different from each other in modulus,

    |λ_1| > |λ_2| > ⋯ > |λ_n|,

and if a further technical condition is satisfied, then the QR algorithm converges in the following sense: if (A_m)_{jk} = a_{jk}^{(m)}, then

    lim_{m→∞} a_{jk}^{(m)} = 0   for j > k,    lim_{m→∞} a_{jj}^{(m)} = λ_j   for j = 1, …, n.


QR algorithm and power method

With U_m = Q_1 Q_2 ⋯ Q_m and S_m = R_m R_{m−1} ⋯ R_1 it holds that

    A^m = U_m S_m.   (∗)

For m = 1 the statement is trivial: A = Q_1 R_1 = U_1 S_1. Moreover, A_{m+1} = R_m Q_m = Q_m^T A_m Q_m yields by induction A_{m+1} = U_m^T A U_m.


QR algorithm and power method ct.

If (∗) is valid for some m − 1, then it follows from the definition of A_{m+1} that

    R_m = A_{m+1} Q_m^T = U_m^T A U_m Q_m^T = U_m^T A U_{m−1}.

Multiplying by S_{m−1} from the right and by U_m from the left we obtain

    U_m S_m = A U_{m−1} S_{m−1} = A^m,

which is the proposition for m.

From (∗) we obtain for the first unit vector e^1, since S_m is upper triangular with σ := (S_m)_{1,1},

    A^m e^1 = U_m S_m e^1 = σ U_m e^1.

Hence, the first column of U_m has the same direction as the m-th iterate of the power method with initial vector e^1, and it is not surprising that the (1,1) element of A_{m+1} = U_m^T A U_m converges to the largest eigenvalue of A in modulus and the first column of U_m to a corresponding eigenvector.


Examples

For

    A = [  1  −1  −1
           4   6   3
          −4  −4  −1 ]

the upper triangular form appears after approximately 10 steps, and the diagonal elements are in the right order.

For

    B = [  1   0   1
           2   3  −1
          −2  −2   2 ]

the upper triangular form is reached after approximately 20 steps, but the diagonal elements are not ordered by magnitude (so the technical condition of the last theorem is not satisfied). After a further 50 steps the diagonal elements are ordered by magnitude.


QR algorithm with shifts

QR algorithm with shifts
A_0 := A
for m = 0, 1, 2, … until convergence do
    Choose a suitable shift κ_m
    Factorize A_m − κ_m I = Q_m R_m
    A_{m+1} = R_m Q_m + κ_m I
end for

Again all matrices A_m are similar:

    A_{m+1} = R_m Q_m + κ_m I = Q_m^T (Q_m R_m) Q_m + κ_m I = Q_m^T (A_m − κ_m I) Q_m + κ_m I = Q_m^T A_m Q_m,

and therefore all eigenvalues of the matrices A_m coincide.


Choice of shifts

Let Q_j and R_j be the orthogonal and upper triangular matrices obtained in the QR algorithm with shifts κ_j, and let U_m = Q_1 Q_2 ⋯ Q_m, S_m = R_m R_{m−1} ⋯ R_1. Then

    U_m S_m = (A − κ_m I)(A − κ_{m−1} I) ⋯ (A − κ_1 I).   (+)

From A_{m+1} = Q_m^T A_m Q_m it follows immediately by induction that A_{m+1} = U_m^T A U_m. For m = 1 equation (+) reads U_1 S_1 = Q_1 R_1 = A − κ_1 I, which is the decomposition in the first step of the QR algorithm with shifts.


Choice of shifts ct.

Assume that (+) holds for some m − 1. From the definition of A_{m+1} it follows that

    R_m = (A_{m+1} − κ_m I) Q_m^T = U_m^T (A − κ_m I) U_m Q_m^T = U_m^T (A − κ_m I) U_{m−1}.

Multiplying by S_{m−1} from the right and by U_m from the left one obtains

    U_m S_m = (A − κ_m I) U_{m−1} S_{m−1} = (A − κ_m I)(A − κ_{m−1} I) ⋯ (A − κ_1 I).


Choice of shifts ct.

From (+) one gets for the last unit vector e^n

    (A^T − κ_m I)^{−1} ⋯ (A^T − κ_1 I)^{−1} e^n = U_m (S_m^T)^{−1} e^n.

Since S_m^T and (S_m^T)^{−1} are lower triangular matrices, it holds that U_m (S_m^T)^{−1} e^n = σ U_m e^n for some σ. Hence

    (A^T − κ_m I)^{−1} ⋯ (A^T − κ_1 I)^{−1} e^n = σ U_m e^n,

and the last column of U_m can be interpreted as the result of m steps of inverse iteration (applied to A^T) with shifts κ_1, …, κ_m and initial vector e^n.

This suggests choosing κ_m = a_{n,n}^{(m)}, which is then expected to converge to λ_n.
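The shifted iteration with κ_m = a_{n,n}^{(m)} can be sketched in a few lines (minimal NumPy code with a fixed step count, no deflation and no Hessenberg reduction, so it only shows the mechanics and is far from a production implementation; the test matrix is the first example from the slides):

```python
import numpy as np

def qr_algorithm(A, steps=100, shifted=True):
    """Basic (optionally shifted) QR iteration: A_{m+1} = R_m Q_m (+ shift)."""
    Am = A.astype(float).copy()
    n = Am.shape[0]
    for _ in range(steps):
        kappa = Am[-1, -1] if shifted else 0.0       # shift kappa_m = a_nn^(m)
        Q, R = np.linalg.qr(Am - kappa * np.eye(n))  # A_m - kappa_m I = Q_m R_m
        Am = R @ Q + kappa * np.eye(n)               # A_{m+1} = R_m Q_m + kappa_m I
    return Am                                        # nearly upper triangular

A = np.array([[ 1.0, -1.0, -1.0],
              [ 4.0,  6.0,  3.0],
              [-4.0, -4.0, -1.0]])

print(np.round(np.diag(qr_algorithm(A)), 4))       # eigenvalue approximations (in some order)
print(np.round(np.sort(np.linalg.eigvals(A)), 4))  # reference values
```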
Reducing the cost

The most expensive part of the QR algorithm (shifted or not) is the computation of the QR factorization in every step. This cost can be reduced considerably if the matrix is first transformed to upper Hessenberg form:

    A = [ a_11  a_12  a_13  …  a_1,n−1  a_1n
          a_21  a_22  a_23  …  a_2,n−1  a_2n
          0     a_32  a_33  …  a_3,n−1  a_3n
          ⋮           ⋱     ⋱   ⋮        ⋮
          0     0     …        a_n,n−1  a_nn ]

A has upper Hessenberg form if a_jk = 0 for j > k + 1.


Reducing the cost ct.

Assume that A_m has upper Hessenberg form. Then a QR decomposition can be obtained in the following way: multiply A_m from the left by a rotation in the plane spanned by the first two unit vectors e^1 and e^2, i.e. by a matrix

    U_12 = [  cos θ  sin θ  0  0  …  0
             −sin θ  cos θ  0  0  …  0
              0      0      1  0  …  0
              0      0      0  1  …  0
              ⋮      ⋮      ⋮  ⋮  ⋱  ⋮
              0      0      0  0  …  1 ]

Then U_12 A_m contains in its first two rows linear combinations of the first two rows of A_m, and the rows 3, …, n are the same as in A_m. The rotation angle can be chosen such that the element in position (2, 1) is annihilated.


Reducing the cost ct.

Multiplying U_12 A_m from the left by a rotation matrix U_23 corresponding to rows 2 and 3, we annihilate the element in position (3, 2), which does not change the element 0 in position (2, 1). Continuing in this way we annihilate the element in position (i + 1, i) by a rotation U_{i,i+1} in the plane spanned by e^i and e^{i+1}. We finally arrive at

    U_{n−1,n} ⋯ U_23 U_12 A_m = R,   i.e.   A_m = QR   with   Q = U_12^T ⋯ U_{n−1,n}^T.


Reducing the cost ct.

    A_{m+1} = RQ = R U_12^T ⋯ U_{n−1,n}^T.

Multiplying R by U_12^T combines the first two columns of R and leaves the other columns unchanged. Multiplying by U_23^T combines columns 2 and 3 and leaves the other ones unchanged, etc. Obviously, A_{m+1} = R U_12^T ⋯ U_{n−1,n}^T is again an upper Hessenberg matrix.


Reduction to Hessenberg form

A given matrix can be transformed to upper Hessenberg form using Householder matrices. For

    A = [ a_11  c^T
          b     B ],   B ∈ R^{(n−1)×(n−1)},   b, c ∈ R^{n−1},

let w ∈ R^{n−1}, ‖w‖ = 1, be such that the Householder matrix Q_1 = I − 2ww^T maps b to a multiple k e^1 of the first unit vector in R^{n−1}. Then with

    P_1 = [ 1  0
            0  Q_1 ]

we get

    A_1 := P_1 A P_1 = [ a_11   c^T Q_1
                         k e^1  Q_1 B Q_1 ],

so the first column of A_1 is (a_11, k, 0, …, 0)^T and already has the desired form. The following columns can be transformed in a similar way.
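The reduction described above can be sketched as follows (a minimal NumPy version using Householder reflections; in practice one would rather call a library routine such as scipy.linalg.hessenberg):

```python
import numpy as np

def hessenberg(A):
    """Reduce A to upper Hessenberg form H = P^T A P by Householder reflections."""
    H = A.astype(float).copy()
    n = H.shape[0]
    for j in range(n - 2):
        b = H[j+1:, j].copy()                  # part of column j below the subdiagonal entry
        alpha = -np.sign(b[0]) * np.linalg.norm(b) if b[0] != 0 else -np.linalg.norm(b)
        v = b
        v[0] -= alpha                          # v = b - alpha e^1
        if np.linalg.norm(v) < 1e-15:          # column already has the desired form
            continue
        w = v / np.linalg.norm(v)              # Householder vector, Q = I - 2 w w^T
        # apply Q from the left to the trailing rows ...
        H[j+1:, j:] -= 2.0 * np.outer(w, w @ H[j+1:, j:])
        # ... and from the right to the trailing columns (similarity transformation)
        H[:, j+1:] -= 2.0 * np.outer(H[:, j+1:] @ w, w)
    return H

A = np.random.default_rng(1).standard_normal((5, 5))
H = hessenberg(A)
print(np.round(H, 3))                          # zeros below the first subdiagonal
print(np.allclose(np.sort(np.linalg.eigvals(H)), np.sort(np.linalg.eigvals(A))))
```

Since only similarity transformations are applied, the eigenvalues of H and A coincide, and the subsequent QR steps on H cost O(n²) operations per iteration instead of O(n³).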