Paper Summary4: Projection-free Online Learning

0 Paper

Hazan, Elad, and Satyen Kale. "Projection-free online learning." arXiv preprint arXiv:1206.4657 (2012).

1 Key contribution

Providing a projection free algorithm for online learning that has the following advantages:
(1) computationally efficient
(2) better regret bounds for cases such as stochastic online smooth convex optimization
(3) parameter-free in the stochastic case and produce sparse decisions

2 Motivation

Assuming it is possible to do
linear optimization over the convex domain efficiently, an algorithm with sublinear for online convex optimization can be achieved via utilizing an online version of the classic Frank-Wolfe algorithm


Frank-Wolfe algorithm from wiki

3 The proposed method

(1) Preliminaries
Diameter of set \kappa is D and f is L-Lipchitz

Beta-smooth and Sigma strongly convex

Smoothed function

\kappa -Sparsity

kappa sparsity

(2) main claim


main claim

(3) Algorithm


OFW

4 Some thoughts

When we propose an algorithm, we may consider the following aspects:
(1) Effective
(2) Parameter-less or parameter-free
(3) Computationally feasible
(4) Sample efficient

First of all, an algorithm should be effective in solving the target problem. Effectiveness is the fundamental requirement. Otherwise, the proposed algorithm is nothing but useless. Secondly, it is advantageous if the algorithm is parameter-less or parameter-free. An algorithm with many parameters may be good in terms of being flexible with solving different questions. However, tuning the parameter usually requires expert knowledge, which is not user-friendly. A less experienced practitioner may end up choosing the wrong parameter, which will significantly affect the performance of the algorithm. Therefore, an algorithm with parameters should be equipped with the guidance of parameter selection. Thirdly, an algorithm should be computationally feasible. Otherwise, one cannot even use such an algorithm. However, feasible computation is just the basic requirement. A faster algorithm even with just a slightly worse performance is favorable. This paper mainly focuses on the computation aspect and the proposed algorithm not only faster but also has a good performance in solving the problem. Finally, it is favorable if the algorithm is sample efficient.

最后编辑于
©著作权归作者所有,转载或内容合作请联系作者
平台声明:文章内容(如有图片或视频亦包括在内)由作者上传并发布,文章内容仅代表作者本人观点,简书系信息发布平台,仅提供信息存储服务。

推荐阅读更多精彩内容

  • 夜莺2517阅读 127,807评论 1 9
  • 版本:ios 1.2.1 亮点: 1.app角标可以实时更新天气温度或选择空气质量,建议处女座就不要选了,不然老想...
    我就是沉沉阅读 11,843评论 1 6
  • 我是黑夜里大雨纷飞的人啊 1 “又到一年六月,有人笑有人哭,有人欢乐有人忧愁,有人惊喜有人失落,有的觉得收获满满有...
    陌忘宇阅读 12,732评论 28 53
  • 兔子虽然是枚小硕 但学校的硕士四人寝不够 就被分到了博士楼里 两人一间 在学校的最西边 靠山 兔子的室友身体不好 ...
    待业的兔子阅读 7,512评论 2 9