TAILIEUCHUNG - Bài giảng Tối ưu hóa nâng cao: Chương 7 - Hoàng Nam Dũng
Bài giảng "Tối ưu hóa nâng cao - Chương 7: Subgradient method" cung cấp cho người học các kiến thức: Last last time - gradient descent, subgradient method, step size choices, convergence analysis, lipschitz continuity, convergence analysis - Proof,. . | Bài giảng Tối ưu hóa nâng cao: Chương 7 - Hoàng Nam Dũng Subgradient Method Hoàng Nam Dũng Khoa Toán - Cơ - Tin học, Đại học Khoa học Tự nhiên, Đại học Quốc gia Hà Nội Last last time: gradient descent Consider the problem min f (x) x for f convex and differentiable, dom(f ) = Rn . Gradient descent: choose initial x (0) ∈ Rn , repeat: x (k) = x (k−1) − tk · ∇f (x (k−1) ), k = 1, 2, 3, . . . Step sizes tk chosen to be fixed and small, or by backtracking line search. If ∇f Lipschitz, gradient descent has convergence rate O(1/ε). Downsides: I Requires f differentiable — addressed this lecture. I Can be slow to converge — addressed next lecture. 1 Subgradient method Now consider f convex, having dom(f ) = Rn , but not necessarily differentiable. Subgradient method: like gradient descent, but replacing gradients with subgradients, ., initialize x (0) , repeat: x (k) = x (k−1) − tk · g (k−1) , k = 1, 2, 3, . . . where g (k−1) ∈ ∂f (x (k−1) ) any subgradient of f at x (k−1) . 2 Subgradient method Now consider f convex, having dom(f ) = Rn , but not necessarily differentiable. Subgradient method: like gradient descent, but replacing gradients with subgradients, ., initialize x (0) , repeat: x (k) = x (k−1) − tk · g (k−1) , k = 1, 2, 3, . . . where g (k−1) ∈ ∂f (x (k−1) ) any subgradient of f at x (k−1) . Subgradient method is not necessarily a descent method, so we (k) keep track of best iterate xbest among x (0) , .x (k) so far, ., (k) f (xbest ) = min f (x (i) ). i=0,.,k 2 Outline Today: I How to choose step sizes I Convergence analysis I Intersection of sets I Projected subgradient method 3 Step size choices I Fixed step sizes: tk = t all k = 1, 2, 3, . . . I Fixed step length, ., tk = s/kg (k−1) k2 , and hence ktk g (k−1) k2 = s. I Diminishing step sizes: choose to meet conditions X∞ X∞ tk2 < ∞, tk = ∞, k=1 k=1 ., square summable but
đang nạp các trang xem trước