A Globally and $R$-Linearly Convergent Hybrid HS and PRP Method and its Inexact Version with Applications

We present a hybrid HS- and PRP-type conjugate gradient method for smooth optimization that converges globally and $R$-linearly for general functions. We also introduce its inexact version for problems of this kind in which gradients or values of the functions are unknown or difficult to compute. Mo...

Full description

Saved in:
Bibliographic Details
Date:2015
Main Authors: Zhou, Weijun, Жу, Веійун
Format: Article
Language:English
Published: Institute of Mathematics, NAS of Ukraine 2015
Online Access:https://umj.imath.kiev.ua/index.php/umj/article/view/2019
Tags: Add Tag
No Tags, Be the first to tag this record!
Journal Title:Ukrains’kyi Matematychnyi Zhurnal
Download file: Pdf

Institution

Ukrains’kyi Matematychnyi Zhurnal
_version_ 1860507930700283904
author Zhou, Weijun
Жу, Веійун
author_facet Zhou, Weijun
Жу, Веійун
author_sort Zhou, Weijun
baseUrl_str https://umj.imath.kiev.ua/index.php/umj/oai
collection OJS
datestamp_date 2019-12-05T09:48:59Z
description We present a hybrid HS- and PRP-type conjugate gradient method for smooth optimization that converges globally and $R$-linearly for general functions. We also introduce its inexact version for problems of this kind in which gradients or values of the functions are unknown or difficult to compute. Moreover, we apply the inexact method to solve a nonsmooth convex optimization problem by converting it into a one-time continuously differentiable function by the method of Moreau–Yosida regularization.
first_indexed 2026-03-24T02:17:08Z
format Article
fulltext UDC 517.9 Weijun Zhou (Changsha Univ. Sci. and Technology, China) A GLOBALLY AND R-LINEARLY CONVERGENT HYBRID HS AND PRP METHOD AND ITS INEXACT VERSION WITH APPLICATIONS* ГЛОБАЛЬНО ТА R-ЛIНIЙНО ЗБIЖНИЙ ГIБРИДНИЙ HS ТА PRP МЕТОД ТА ЙОГО НЕТОЧНА ВЕРСIЯ З ЗАСТОСУВАННЯМИ We present a hybrid HS- and PRP-type conjugate gradient method for smooth optimization that converges globally and R-linearly for general functions. We also introduce its inexact version for problems of this sort whose gradients or function values are unavailable or difficult to compute. Moreover, we apply the inexact method to solve a nonsmooth convex optimization problem by converting it into a one-time continuously differentiable function by the method of Moreau – Yosida regularization. Наведено гiбридний HS та PRP метод спряженого аргументу, глобально та R-лiнiйно збiжний для загальних функцiй. Також введено неточний метод для таких проблем, в яких градiєнти або значення функцiй невiдомi або важко визначаються. Крiм того, неточний метод застосовано до негладкої опуклої проблеми оптимiзацiї, що перетворює її в однократно неперервно диференцiйовну функцiю за допомогою регуляризацiї Моро – Йосiди. 1. Introduction. Conjugate gradient methods are a class of important methods for solving the large-scale unconstrained optimization problem min f(x), x ∈ Rn, (1.1) where f : Rn → R is continuously differentiable and its gradient is denoted by g(x). A general scheme of conjugate gradient methods is xk+1 = xk + αkdk, where αk > 0 is a stepsize, and the search direction dk is given by dk =    −gk, if k = 0, −gk + βkdk−1, if k ≥ 1, where βk ia a parameter and gk = g(xk). The Fletcher – Reeves (FR) method [9], the Polak – Ribiére – Polyak (PRP) method [17, 18], the Hestenes – Stiefel (HS) method [12] and the Dai – Yuan (DY) method [7] are several well-known nonlinear conjugate gradient algorithms [16]. They are specified by βFR k = ‖gk‖2 ‖gk−1‖2 , βDY k = ‖gk‖2 dTk−1 yk−1 , βHS k = gTk yk−1 dTk−1yk−1 , βPRP k = gTk yk−1 ‖gk−1‖2 , * This work was supported by the NSF (11371073 and 11461015) of China, the Project of the Scientific Research Fund (12A004 and 13B137) of the Hunan Provincial Education Department, and the NSF (13JJ4062) of Hunan Province. c© WEIJUN ZHOU, 2015 752 ISSN 1027-3190. Укр. мат. журн., 2015, т. 67, № 6 A GLOBALLY AND R-LINEARLY CONVERGENT HYBRID HS AND PRP METHOD AND ITS. . . 753 where yk−1 = gk − gk−1 and ‖ · ‖ stands for the Euclidean norm. If exact line search is used, they are equivalent in the sense that all yield the same search directions and converge globally and R-linearly for strongly convex functions [20]. However, for a general nonlinear function with inexact line search, their behavior is markedly different. Since 1985, many efforts have been devoted to study the global convergence properties of various conjugate gradient methods with inexact line searches for general functions. Al-Baali [1] showed that the FR method can produce sufficient descent directions and converges for nonconvex functions with the strong Wolfe line search. Dai and Yuan [7] proved that the DY method is a descent method and globally convergent in the case of the standard Wolfe line search. However, the HS method and the PRP method may generate ascent directions even with the strong Wolfe line search [4], which prevent them from global convergence. To guarantee global convergence of the PRP method, some line searches which force it to generate descent direction were proposed [4, 14]. Recently, by the use of an approximate descent backtracking line search, Zhou [25] showed that the original PRP method converges globally even for nonconvex functions whether the search direction is descent or not. A simple way for ensuring global convergence is that of using the steepest descent direction when the sufficient descent condition is violated. However, it is not guaranteed that the resulting algorithm will differ significantly from the steepest descent method. Some other globalization techniques for conjugate gradient methods also have been proposed when solving nonconvex optimization. The most famous one is the PRP+ globalization technique [13], namely, βPRP+ k = max{βPRP k , 0}. After this, almost all existing PRP type or HS type methods have adopted the PRP+ technique to obtain global convergence for nonconvex functions such as [6, 11, 21]. But these modifed methods can not reduce to the original PRP method when the exact line search is used and they are not the standard conjugate gradient methods any more in this sense. To improve practical computation efficiency and convergence properties of conjugate gradient methods, many hybrid methods have been proposed, please see the recent survey [4] and references therein. These hybrid methods can be divided into two classes, one is the hybrid FR and PRP type methods such as the hybrid method [13] where βk = max { −βFR k ,min{βPRP k , βFR k } } , another is the DY and HS type methods such as that of [5] where βk = max { 0,min{βHS k , βDY k } } . To our knowledge, there is little study on hybrid HS and PRP type conjugate gradient methods. One purpose of the paper is to investigate this problem. In fact, we propose a sufficient descent hybrid HS and PRP method (1.4) below. Our motivation is based on the following two methods. One is the three-term PRP method proposed by Zhang, Zhou and Li [22], whose search direction is defined by dk =    −gk, if k = 0, −gk + βPRP k dk−1 − θPRP k yk−1, if k ≥ 1, (1.2) where θPRP k = gTk dk−1 ‖gk−1‖2 . Another is the three-term HS method proposed by Zhang, Zhou and Li [24], which generates the search direction dk =    −gk, if k = 0, −gk + βHS k dk−1 − θHS k yk−1, if k ≥ 1, (1.3) ISSN 1027-3190. Укр. мат. журн., 2015, т. 67, № 6 754 WEIJUN ZHOU where θHS k = gTk dk−1 dTk−1yk−1 . It is clear that if the line search is exact, both methods reduce to the standard PRP method. Extensive numerical results [22, 24] show that both methods are very efficient. The three-term PRP method (1.2) converges globally for nonconvex functions [22]. The three-term HS method (1.3) converges globally and R-linearly for strongly convex functions [24], but it has not been proved to be globally convergent for general nonconvex functions. In order to utilize advantages of both methods sufficiently, based on (1.2) and (1.3), we propose a hybrid HS and PRP method as follows, namely, dk =    −gk, if k = 0, −gk + βhybrid k dk−1 − θhybridk yk−1, if k ≥ 1, (1.4) where βhybrid k = gTk yk−1 max { dTk−1yk−1, ‖gk−1‖2 } , θhybridk = gTk dk−1 max { dTk−1yk−1, ‖gk−1‖2 } . (1.5) From (1.4), (1.5) and by direct computation, it is easy to get gTk dk = −‖gk‖2, (1.6) which is independent of convexity of the objective function and the line search used. It is clear that the proposed method reduces to the standard HS or PRP method when exact line search is used since gTk dk−1 = 0 in this case. In general, conjugate gradient methods use the exact gradient and function values in their conver- gence analysis. However, in many practical problems, the exact function value or exact gradient value can not be obtained or may be very difficult to compute [3]. In these cases, the inexact algorithms are often required. Another purpose of the paper is to present an inexact conjugate gradient method only using approximate gradient or/and function values. In fact, we extend the above exact hybrid HS and PRP method to inexact case. The paper is organized as follows. In next section, we prove the global and R-linear convergence of the proposed method with a descent backtracking line search for nonconvex optimization. In Section 3, we present the inexact algorithm in detail and show its global convergence by the use of some approximate function value descent line search. In Section 4, we apply the inexact method to solve a nonsmooth convex optimization problem by converting it into a once continuously differentiable function by way of the Moreau – Yosida regularization technique. 2. Exact algorithm and its convergence properties. In this section, based on the above discussion, we first describe the complete hybrid HS and PRP algorithm as follows. Algorithm 2.1 (Exact version). Step 0. Given an initial point x0 ∈ Rn. Choose some constants δ > 0 and ρ ∈ (0, 1). Let k := 0. Step 1. Compute dk by (1.4), (1.5). Step 2. Compute the stepsize αk = max{γkρj, j = 0, 1, 2, . . .} satisfying f(xk + αkdk) ≤ f(xk)− δ‖αkdk‖2, (2.1) where γk = |gTk dk| ‖dk‖2 . ISSN 1027-3190. Укр. мат. журн., 2015, т. 67, № 6 A GLOBALLY AND R-LINEARLY CONVERGENT HYBRID HS AND PRP METHOD AND ITS. . . 755 Step 3. Let xk+1 = xk + αkdk. Step 4. Let k := k + 1 and go to Step 1. To ensure global convergence of Algorithm 2.1, we make the following standard assumption. Assumption 2.1. (i) The level set Ω = { x ∈ Rn| f(x) ≤ f(x0) } is bounded. (ii) In some neighborhood N of Ω, f is continuously differentiable and its gradient is Lipschitz continuous, namely, there is a constant L > 0 such that ∥ ∥g(x) − g(y) ∥ ∥ ≤ L‖x− y‖ ∀x, y ∈ N. (2.2) From Assumption 2.1 and the line search (2.1), we have ∞ ∑ k=0 ‖αkdk‖2 < ∞, which implies lim k→∞ αk‖dk‖ = 0. (2.3) Lemma 2.1. Let Assumption 2.1 hold and {xk} be generated by Algorithm 2.1. Then there exists a constant M1 > 0 such that ‖gk‖ ≤ ‖dk‖ ≤ M1‖gk‖. (2.4) Proof. By (1.6), we get ‖gk‖2 = |gTk dk| ≤ ‖gk‖‖dk‖, which shows that ‖gk‖ ≤ ‖dk‖. From (1.4), (1.5), (2.2) and the line search (2.1), we obtain ‖dk‖ ≤ ‖gk‖+ 2L‖gk‖αk−1 ‖dk−1‖2 ‖gk−1‖2 ≤ (1 + 2L)‖gk‖ = M1‖gk‖, where the last inequality follows from the fact αk ≤ γk = |gTk dk| ‖dk‖2 and (1.6). Lemma 2.1 is proved. The following lemma gives a bound of the stepsize αk from below. Lemma 2.2. Let Assumption 2.1 hold and {xk} be generated by Algorithm 2.1. Then there exists two positive constant m̄1 and m1 such that αk ≥ m̄1 −gTk dk ‖dk‖2 ≥ m1. (2.5) Proof. The proof of the first inequality in (2.5) is standard, for example, see Lemma 3.1 in [23]. The second inequality in (2.5) follows from (1.6) and (2.4) directly. Theorem 2.1. Let Assumption 2.1 hold and {xk} be generated by Algorithm 2.1. Then lim k→∞ ‖gk‖ = 0. (2.6) ISSN 1027-3190. Укр. мат. журн., 2015, т. 67, № 6 756 WEIJUN ZHOU Proof. From (2.3) and (2.5), we get lim k→∞ ‖dk‖ = 0, which together with (2.4) yields (2.6). The above theorem shows the global convergence property of Algorithm 2.1 without convexity assumption on f. It only relies on the assumption that f has Lipschitz continuous gradients. Now we turn to establishing the R-linear convergence property of Algorithm 2.1. To do this, we need the following assumption. Assumption 2.2. (i) f is twice continuously differentiable near x∗. (ii) The sequence {xk} converges to x∗ where g(x∗) = 0 and the Hessian matrix ∇2f(x∗) is positive definite. Assumption 2.2 implies that f is strongly convex in some neighborhood N(x∗) of x∗, that is, there are two positive constants m and M such that m‖d‖2 ≤ dT∇2f(x)d ≤ M‖d‖2 ∀x ∈ N(x∗) ∀d ∈ Rn. (2.7) From (2.7), it is easy to obtain (can see [2], Theorem 3.1) m 2 ‖x− x∗‖2 ≤ f(x)− f(x∗) ≤ 1 m ∥ ∥g(x) ∥ ∥ 2 ∀x ∈ N(x∗). (2.8) By (2.4), (2.5) and (2.1), there is a positive constant m2 such that f(xk+1) ≤ f(xk)− δm2 1 ‖gk‖2 ‖dk‖2 ‖gk‖2 ≤ f(xk)−m2‖gk‖2. (2.9) Without loss of generality, we assume {xk} ⊂ N(x∗). From (2.9) and (2.8), we get f(xk+1)− f(x∗) ≤ (1−mm2) ( f(xk)− f(x∗) ) ≤ . . . ≤ (1−mm2) k ( f(x0)− f(x∗) ) . (2.10) The following theorem shows the R-linear convergence property of Algorithm 2.1. Theorem 2.2. Let Assumption 2.2 hold and the sequence {xk} be generated by Algorithm 2.1. Then there exist three positive constants r ∈ (0, 1), C2 and C3 such that f(xk+1)− f(x∗) ≤ C2r k, ‖xk+1 − x∗‖ ≤ C3 √ r k . Proof. The first inequality follows from (2.10) with r = 1 − mm2 and C2 = f(x0) − f(x∗) directly. Inequalities (2.10) and (2.8) yield the second inequality with C3 = √ 2(f(x0)− f(x∗)) m . 3. Inexact algorithm and its global convergence. In this section, we consider the inexact version of Algorithm 2.1 with approximate gradient or/and function values. For simplicity, we denote fa(x, ǫ) and ga(x, ǫ) as the approximations of f(x) and g(x) with the possible error ǫ, respectively. More accurately, we assume that, for each x ∈ Rn, the approximations fa(x, ǫ) and ga(x, ǫ) can be made arbitrarily close to the exact values f(x) and g(x) by choosing the parameter ǫ small enough, namely, |fa(x, ǫ)− f(x)| ≤ ǫ, (3.1) ISSN 1027-3190. Укр. мат. журн., 2015, т. 67, № 6 A GLOBALLY AND R-LINEARLY CONVERGENT HYBRID HS AND PRP METHOD AND ITS. . . 757 ‖ga(x, ǫ)− g(x)‖ ≤ ǫ. (3.2) With these approximations, we define the inexact method of Algorithm 2.1 as follows: dk =    −ga(xk, ǫk), if k = 0, −ga(xk, ǫk) + βkdk−1 − θky a k−1, if k ≥ 1, (3.3) where yak−1 = ga(xk, ǫk)− ga(xk−1, ǫk−1), βk = ga(xk, ǫk) T yak−1 max { dTk−1y a k−1, ‖ga(xk−1, ǫk−1)‖2 } , (3.4) θk = ga(xk, ǫk) Tdk−1 max { dTk−1y a k−1, ‖ga(xk−1, ǫk−1)‖2 } . (3.5) It is clear that dTk g a(xk, ǫk) = − ∥ ∥ga(xk, ǫk) ∥ ∥ 2 . (3.6) However, the direction dk defined by (3.3) – (3.5) with inexact gradient ga(xk, ǫk) is not necessarily a descent direction of the objective function f at xk. Then some line search procedures such as the Wolfe (or strong Wolfe) line search and the line search given by (2.1) can not be used any more. In this case, we have to modify the line search (2.1). Let {ǫk} and η be a given positive sequence and a positive constant satisfying ∞ ∑ k=0 ǫk ≤ η < ∞. (3.7) Set γk = |ga(xk, ǫk)T dk| ‖dk‖2 , we determine the stepsize by the following approximate descent line search, that is, compute αk = max{γkρj , j = 0, 1, 2, . . .} satisfying fa(xk + αkdk, ρ1ǫk) ≤ fa(xk, ǫk)− δ‖αkdk‖2 + 2ǫk, (3.8) where ρ, ρ1 ∈ (0, 1) are two constants. The following result shows that the line search (3.8) terminates finitely. Proposition 3.1. The line search (3.8) is well-defined. Proof. Suppose it is not true. Then for all j ≥ 0, (3.8) does not hold, namely, fa(xk + γkρ jdk, ρ1ǫk) > fa(xk, ǫk)− δ‖γkρjdk‖2 + 2ǫk, (3.9) which together with (3.1) yields f(xk + γkρ jdk) + ρ1ǫk > f(xk)− ǫk − δ‖γkρjdk‖2 + 2ǫk. This implies f(xk + γkρ jdk)− f(xk) > −δ‖γkρjdk‖2 + (1− ρ1)ǫk. Let j → ∞ in the above inequality, we have ISSN 1027-3190. Укр. мат. журн., 2015, т. 67, № 6 758 WEIJUN ZHOU 0 ≥ (1− ρ1)ǫk, which is a contradiction since ρ1 ∈ (0, 1) and ǫk > 0. Proposition 3.1 is proved. For clarity, we give the complete inexact algorithm as follows. Algorithm 3.1 (Exact version). Step 0. Given an initial point x0 ∈ Rn. Choose some constants δ > 0 and ρ, ρ1 ∈ (0, 1). Let k := 0. Step 1. Compute the search direction dk by (3.3) – (3.5). Step 2. Compute the stepsize αk by (3.8). Step 3. Let the next iterate be xk+1 = xk + αkdk. Step 4. Let k := k + 1. Go to Step 1. We suppose that the following assumption is satisfied. Assumption 3.1. (i) The level set Ω = {x ∈ Rn| f(x) ≤ f(x0) + (3 + ρ1)η} is bounded. (ii) In some neighborhood N of Ω, f is continuously differentiable and its gradient is Lipschitz continuous, namely, (2.2) holds. It is clear that the sequence {xk} ⊂ Ω. In fact, from (3.1), (3.8) and (3.7), we have f(xk+1) ≤ fa(xk+1, ρ1ǫk) + ρ1ǫk ≤ f(xk) + ǫk − δ‖αkdk‖2 + 2ǫk + ρ1ǫk = = f(xk)− δ‖αkdk‖2 + (3 + ρ1)ǫk ≤ f(xk) + (3 + ρ1)ǫk ≤ f(x0) + (3 + ρ1)η. (3.10) Moreover, (3.10) and (3.7) imply that ∑∞ k=0 ‖αkdk‖2 < ∞, which shows lim k→∞ αk‖dk‖ = 0. (3.11) Lemma 3.1. Let Assumption 3.1 hold and the sequence {xk} be generated by Algorithm 3.1. If ∥ ∥ga(xk, ǫk) ∥ ∥ ≥ τ1 with some positive constant τ1 for all k, then there exists a positive constant M3 such that ‖dk‖ ≤ M3. (3.12) Proof. From (2.2), (3.2), (3.7) and (3.11), we have ‖yak−1‖ = ‖ga(xk, ǫk)− ga(xk−1, ǫk−1)‖ ≤ ≤ ‖ga(xk, ǫk)− gk‖+ ‖gk − gk−1‖+ ‖ga(xk−1, ǫk−1)− gk−1‖ ≤ ≤ ǫk + ǫk−1 + L‖αk−1dk−1‖ → 0. (3.13) This together with the assumption yield (3.12) by the same argument as that of Lemma 3.1 in [22]. Lemma 3.1 is proved. Lemma 3.2. Let Assumption 3.1 hold and the sequence {xk} be generated by Algorithm 3.1. Then there exist a constant m3 > 0 such that αk ≥ ‖ga(xk, ǫk)‖2 ‖dk‖2 or αk ≥ m3 ‖ga(xk, ǫk)‖2 ‖dk‖2 − m3ǫk ‖dk‖ . (3.14) ISSN 1027-3190. Укр. мат. журн., 2015, т. 67, № 6 A GLOBALLY AND R-LINEARLY CONVERGENT HYBRID HS AND PRP METHOD AND ITS. . . 759 Proof. (i) If αk = γk, by (3.6), then the first inequality holds. (ii) If αk 6= γk, then α′ k = αk/ρ can not satisfy the line search (3.8). This together with (3.1) yields f(xk + α′ kdk) > f(xk)− δ‖α′ kdk‖2 + (1− ρ1)ǫk > f(xk)− δ‖α′ kdk‖2. By the mean value theorem and (2.2), it is easy to obtain that f(xk + α′ kdk)− f(xk) ≤ α′ kg T k dk + L‖α′ kdk‖2. Then the above two inequalities and (3.6) imply α′ k ≥ −gTk dk (L+ δ)‖dk‖2 = −ga(xk, ǫk) T dk + (ga(xk, ǫk)− gk) Tdk (L+ δ)‖dk‖2 ≥ ≥ −ga(xk, ǫk) Tdk − ‖ga(xk, ǫk)− gk‖‖dk‖ (L+ δ)‖dk‖2 ≥ ‖ga(xk, ǫk)‖2 − ǫk‖dk‖ (L+ δ)‖dk‖2 , where the last inequality uses (3.2), which implies (3.14) with m3 = ρ L+ δ . Lemma 3.2 is proved. Theorem 3.1. Let Assumption 3.1 hold. Then the sequence {xk} be generated by Algorithm 3.1 converges globally in the sense that lim inf k→∞ ∥ ∥∇f(xk) ∥ ∥ = 0. (3.15) Proof. Suppose it is not true. Then there exists a constant τ1 > 0 such that ‖gk‖ ≥ 2τ1 ∀k ≥ 0, which together with (3.2) yields ∥ ∥ga(xk, ǫk) ∥ ∥ ≥ τ1 (3.16) for sufficiently large k. Then from Lemma 3.1 and (3.14), we have that for sufficiently large k, αk ≥ m3 M3 ( τ21 M3 − ǫk ) ≥ m3τ 2 1 2M2 3 , which together with (3.11) means ‖dk‖ → 0. Then from the definition of the search direction (3.3) and (3.16), we have ∥ ∥ga(xk, ǫk) ∥ ∥ ≤ ‖dk‖+ 2 ‖ga(xk, ǫk)‖‖yak−1‖‖dk−1‖ ‖ga(xk−1, ǫk−1)‖2 ≤ ‖dk‖+ 2 ‖ga(xk, ǫk)‖‖yak−1‖‖dk−1‖ τ21 → 0, which contradicts (3.16). Theorem 3.1 is proved. Remark 3.1. We can obtain the strong convergence of Algorithm 3.1, that is limk→∞ ‖gk‖ = 0, by the same argument as that of Algorithm 2.1 if we suitably choose the positive sequence {ǫk}. For example, let ǫ−1 = 1, ǫk = min { ǫk−1 2 , 1 ‖dk‖ } for k ≥ 0, then the sequence {ǫk} satisfies ǫk < ǫk−1 ≤ 1 ‖dk−1‖ , and ∑∞ k=0 ǫk < ∞. ISSN 1027-3190. Укр. мат. журн., 2015, т. 67, № 6 760 WEIJUN ZHOU Lemma 3.3 ([8], Lemma 3.3). Let {ak} and {rk} be positive sequences satisfying ak+1 ≤ (1 + + rk)ak + rk and ∑∞ k=0 rk < ∞. Then {ak} converges. Corollary 3.1. Let Assumption 3.1 hold and the sequence {xk} be generated by Algorithm 3.1. If the function f is convex, then the sequence {f(xk)} converges to the minimum of (1.1). Proof. From Assumption 3.1 and Theorem 3.1, there exists a subsequence {xki}∞i=0 which converges to some minimizer x∗ satisfying g(x∗) = 0. From (3.10), we have f(xk+1)− f(x∗) ≤ f(xk)− f(x∗) + (3 + ρ1)ǫk, (3.17) which together with Lemma 3.3 and (3.7) shows that the sequence {f(xk)−f(x∗)} converges. Since the subsequence {f(xki)} converges to f(x∗), therefore {f(xk)} converges to f(x∗). Corollary 3.1 is proved. 4. Application to nonsmooth convex optimization. In this section, we consider the following convex optimization problem: minF (x), x ∈ Rn, (4.1) where F : Rn → R is a possibly nondifferentiable convex function. Then general methods such as conjugate gradient methods for smooth optimization can not be used to solve (4.1) directly. An efficient way is to convert the nonsmooth problem (4.1) into an equivalent smooth problem by the Moreau – Yosida regularization such as [10, 15, 19], that is, min f(x), x ∈ Rn, (4.2) where f is defined by f(x) = min z∈Rn { F (z) + 1 2λ ‖z − x‖2 } (4.3) and λ is a positive parameter. It is well-known that problems (4.1) and (4.2) are equivalent in the sense that the solution sets of the two problems coincide with each other. Moreover, the function f is convex and differentiable with Lipschitz continuous gradient [10, 19] given by g(x) = 1 λ ( x− p(x) ) , which satisfies ‖g(x) − g(y)‖ ≤ 1 λ ‖x− y‖ ∀x, y ∈ Rn, (4.4) where g(x) = ∇f(x) and p(x) is the unique minimizer in (4.3), i.e., p(x) = arg min z∈Rn { F (z) + 1 2λ ‖z − x‖2 } since this is a strongly convex minimization problem. It is clear that it is impossible in general to compute exactly the function f defined by (4.3) and its gradient g at an arbitrary point x. But for each x ∈ Rn, we may obtain approximate values of the gradient and the function by some existing methods such as [10, 19]. Therefore we can suppose that, for each x ∈ Rn, we can evaluate f(x) and g(x) approximately but with any desired accuracy, that is, for each x ∈ Rn and any ǫ > 0, we can find a vector pa(x, ǫ) ∈ Rn such that F ( pa(x, ǫ) ) + 1 2λ ‖pa(x, ǫ)− x‖2 ≤ f(x) + ǫ. (4.5) With this pa(x, ǫ), we define the approximations to f(x) and g(x) by ISSN 1027-3190. Укр. мат. журн., 2015, т. 67, № 6 A GLOBALLY AND R-LINEARLY CONVERGENT HYBRID HS AND PRP METHOD AND ITS. . . 761 fa(x, ǫ) = F ( pa(x, ǫ) ) + 1 2λ ‖pa(x, ǫ)− x‖2 (4.6) and ga(x, ǫ) = 1 λ ( x− pa(x, ǫ) ) , (4.7) respectively. The following lemma shows that the approximations fa(x, ǫ) and ga(x, ǫ) satisfy (3.1) and (3.2), respectively. Lemma 4.1 ([10], Lemma 3.1). Let pa(x, ǫ) be a vector satisfying (4.5), fa(x, ǫ) and ga(x, ǫ) be defined by (4.6) and (4.7), respectively. Then f(x) ≤ fa(x, ǫ) ≤ f(x) + ǫ and ‖ga(x, ǫ)− g(x)‖ ≤ √ 2ǫ λ . From (4.4), Lemma 4.1 and Corollary 3.1, we have the following result. Corollary 4.1. Let the problem (4.2) be solved by Algorithm 3.1. If the condition (i) in Assump- tion 3.1 holds, then the sequence {f(xk)} converges to the minimum of (4.2). 5. Conclusions. We have proposed a hybrid HS and PRP method which converges globally and R-linearly for general optimization problems. It is also extended to inexact case which admits approximate function and gradient values. Hence this inexact method is very suitable for solving such problems whose exact gradient and function values are not available or difficult to compute. We have applied this inexact algorithm to solve nonsmooth convex problems by way of Moreau – Yosida regularization. We believe that the basic idea of this paper can be applied to other conjugate gradient methods. How to extend the proposed methods or linear conjugate gradient methods to fully derivative-free ones for solving large-scale nonlinear equations will be our further study. 1. Al-Baali M. Descent property and global convergence of the Fletcher – Reeves method with inexact line search // IMA J. Numer. Anal. – 1985. – 5. – P. 121 – 124. 2. Byrd R. H., Nocedal J. A tool for the analysis of quasi-Newton methods with application to unconstrained minimiza- tion // SIAM J. Numer. Anal. – 1989. – 26. – P. 727 – 739. 3. Conn A. R., Scheinberg K., Toint Ph. L. Recent progress in unconstrained nonlinear optimization without derivatives // Math. Program. – 1997. – 79. – P. 397 – 414. 4. Dai Y. Nonlinear conjugate gradient methods // http://lsec.cc.ac.cn/˜ dyh/worklist.html. 5. Dai Y., Yuan Y. An efficient hybrid conjugate gradient method for unconstrained optimization // Ann. Oper. Res. – 2001. – 103. – P. 33 – 47. 6. Dai Y., Liao L. Z. New conjugacy conditions and related nonlinear conjugate gradient methods // Appl. Math. and Optim. – 2001. – 43. – P. 87 – 101. 7. Dai Y., Yuan Y. A nonlinear conjugate gradient method with a strong global convergence property // SIAM J. Optim. – 2000. – 10. – P. 177 – 182. 8. Dennis J. E., More J. J. A characterization of superlinear convergence and its applications to quasi-Newton methods // Math. Comput. – 1974. – 28. – P. 549 – 560. 9. Fletcher R., Reeves C. Function minimization by conjugate gradients // Comput. J. – 1964. – 7. – P. 149 – 154. 10. Fukushima M., Qi L. A globally and superlinearly convergent algorithm for nonsmooth convex minimization // SIAM J. Optim. – 1996. – 6. – P. 1106 – 1120. 11. Hager W. W., Zhang H. A new conjugate gradient method with guaranteed descent and an efficient line search // SIAM J. Optim. – 2005. – 16. – P. 170 – 192. 12. Hestenes M. R., Stiefel E. L. Method of conjugate gradient for solving linear systems // J. Res. Nat. Bur. Stand. – 1952. – 49. – P. 409 – 432. 13. Gilbert J. C., Nocedal J. Global convergence properties of conjugate gradient methods for optimization // SIAM J. Optim. – 1992. – 2. – P. 21 – 42. ISSN 1027-3190. Укр. мат. журн., 2015, т. 67, № 6 762 WEIJUN ZHOU 14. Grippo L., Lucidi S. A globally convergent version of the Polak – Ribière conjugate gradient method // Math. Program. – 1997. – 78. – P. 375 – 391. 15. Mifflin R., Sun D., Qi L. Quasi-Newton bundle-type methods for nondifferentiable convex optimization // SIAM J. Optim. – 1998. – 8. – P. 583 – 603. 16. Nazareth L. Conjugate-gradient methods // Encycl. Optim. / Eds C. Floudas and P. Pardalos. – Boston, USA; Dordrecht, The Netherlands: Kluwer Acad. Publ., 2001. – P. 319 – 323. 17. Polak E., Ribière G. Note surla convergence de méthodes de directions conjuguées // Rev. franc. inform. rech. opér. – 1969. – 16. – P. 35 – 43. 18. Polyak B. T. The conjugate gradient method in extreme problems // USSR Comput. Math. and Math. Phys. – 1969. – 9. – P. 94 – 112. 19. Rauf A. I., Fukushima M. Globally convergent BFGS method for nonsmooth convex optimization // J. Optim. Theory and Appl. – 2000. – 104. – P. 539 – 558. 20. Sun W., Yuan Y. Optimization theory and methods. – New York: Springer Sci. and Business Media, LLC, 2006. 21. Yabe H., Takano M. Global convergence properties of nonlinear conjugate gradient methods with modified secant condition // Comput. Optim. and Appl. – 2004. – 28. – P. 203 – 225. 22. Zhang L., Zhou W., Li D. A descent modified Polak – Ribiére – Polyak conjugate gradient method and its global convergence // IMA J. Numer. Anal. – 2006. – 26. – P. 629 – 640. 23. Zhang L., Zhou W., Li D. Global convergence of a modified Fletcher – Reeves conjugate gradient method with Armijo-type line search // Numer. Math. – 2006. – 104. – P. 561 – 572. 24. Zhang L., Zhou W., Li D. Some descent three-term conjugate gradient methods and their global convergence // Optim. Methods and Software. – 2007. – 22. – P. 697 – 711. 25. Zhou W. A short note on the global convergence of the unmodified PRP method // Optim. Lett. – 2013. – 7. – P. 1367 – 1372. Received 16.03.12, after revision — 07.12.14 ISSN 1027-3190. Укр. мат. журн., 2015, т. 67, № 6
id umjimathkievua-article-2019
institution Ukrains’kyi Matematychnyi Zhurnal
keywords_txt_mv keywords
language English
last_indexed 2026-03-24T02:17:08Z
publishDate 2015
publisher Institute of Mathematics, NAS of Ukraine
record_format ojs
resource_txt_mv umjimathkievua/08/aece9ddcfdc1ac0ccd815aca867c8d08.pdf
spelling umjimathkievua-article-20192019-12-05T09:48:59Z A Globally and $R$-Linearly Convergent Hybrid HS and PRP Method and its Inexact Version with Applications Глобально та $R$-лінійно збіжний гібридний HS та PRP метод та його неточна версія з застосуваннями Zhou, Weijun Жу, Веійун We present a hybrid HS- and PRP-type conjugate gradient method for smooth optimization that converges globally and $R$-linearly for general functions. We also introduce its inexact version for problems of this kind in which gradients or values of the functions are unknown or difficult to compute. Moreover, we apply the inexact method to solve a nonsmooth convex optimization problem by converting it into a one-time continuously differentiable function by the method of Moreau–Yosida regularization. Наведено гібридний HS та PRP метод спряженого аргументу, глобально та $R$-лінійно з6іжний для загальних Функцій. Також введено неточний метод для таких проблем, в яких градієнти або значення функцій невідомі або важко визначаються. Крім того, неточний метод застосовано до негладкої опуклої проблеми оптимізації, що перетворює її в однократно неперервно диференційовну функцію за допомогою регуляризації Моро-Йосіди. Institute of Mathematics, NAS of Ukraine 2015-06-25 Article Article application/pdf https://umj.imath.kiev.ua/index.php/umj/article/view/2019 Ukrains’kyi Matematychnyi Zhurnal; Vol. 67 No. 6 (2015); 752–762 Український математичний журнал; Том 67 № 6 (2015); 752–762 1027-3190 en https://umj.imath.kiev.ua/index.php/umj/article/view/2019/1056 https://umj.imath.kiev.ua/index.php/umj/article/view/2019/1057 Copyright (c) 2015 Zhou Weijun
spellingShingle Zhou, Weijun
Жу, Веійун
A Globally and $R$-Linearly Convergent Hybrid HS and PRP Method and its Inexact Version with Applications
title A Globally and $R$-Linearly Convergent Hybrid HS and PRP Method and its Inexact Version with Applications
title_alt Глобально та $R$-лінійно збіжний гібридний HS та PRP метод та його неточна версія з застосуваннями
title_full A Globally and $R$-Linearly Convergent Hybrid HS and PRP Method and its Inexact Version with Applications
title_fullStr A Globally and $R$-Linearly Convergent Hybrid HS and PRP Method and its Inexact Version with Applications
title_full_unstemmed A Globally and $R$-Linearly Convergent Hybrid HS and PRP Method and its Inexact Version with Applications
title_short A Globally and $R$-Linearly Convergent Hybrid HS and PRP Method and its Inexact Version with Applications
title_sort globally and $r$-linearly convergent hybrid hs and prp method and its inexact version with applications
url https://umj.imath.kiev.ua/index.php/umj/article/view/2019
work_keys_str_mv AT zhouweijun agloballyandrlinearlyconvergenthybridhsandprpmethodanditsinexactversionwithapplications
AT žuveíjun agloballyandrlinearlyconvergenthybridhsandprpmethodanditsinexactversionwithapplications
AT zhouweijun globalʹnotarlíníjnozbížnijgíbridnijhstaprpmetodtajogonetočnaversíâzzastosuvannâmi
AT žuveíjun globalʹnotarlíníjnozbížnijgíbridnijhstaprpmetodtajogonetočnaversíâzzastosuvannâmi
AT zhouweijun globallyandrlinearlyconvergenthybridhsandprpmethodanditsinexactversionwithapplications
AT žuveíjun globallyandrlinearlyconvergenthybridhsandprpmethodanditsinexactversionwithapplications