「Deep Learning」Note on AMSGrad (an optimization algorithm better than Adam)

Posted: 2018-05-25 02:59:56

Sina Weibo: 小锋子Shawn
Tencent E-mail: [email protected]
http://blog.csdn.net/dgyuanshaofeng/article/details/80370826

Today, SGD is the go-to tool for training deep networks. Many variants have since been proposed, such as ADAGRAD, RMSPROP, ADAM, ADADELTA, and NADAM.

# Adaptive methods based on exponential moving averages

1. ADAGrad, based on a simple averaging function
The averaging functions are $\phi_{t}(g_1,...,g_t)=g_t$ and $\psi_{t}(g_1,...,g_t)=\frac{diag(\sum_{i=1}^{t}g_{i}^{2})}{t}$.
2. Methods based on exponential moving averages
These include RMSprop, Adam, NAdam, and ADADELTA.
For Adam, the averaging functions are $\phi_{t}(g_1,...,g_t)=(1-\beta_1)\sum^{t}_{i=1}\beta_1^{t-i}g_i$ and $\psi_{t}(g_1,...,g_t)=(1-\beta_2)diag(\sum_{i=1}^{t} \beta_{2}^{t-i}g_{i}^{2})$. Here $\beta_1$ and $\beta_2$ are the two familiar hyperparameters, typically set to 0.9 and 0.999 respectively.
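
The two pairs of averaging functions above can be sketched in plain Python. This is only an illustration under stated assumptions, not code from the paper or from any library: the function names (`adagrad_average`, `adam_average`, `adam_average_recursive`) are hypothetical, the diagonal matrix is stored as a vector of its diagonal entries, and Adam's bias correction is left out because the formulas above do not include it. The recursive version is the familiar running-average update started from zero, which should agree with the explicit sums.

```python
from typing import List, Sequence, Tuple

Vector = List[float]


def adagrad_average(grads: Sequence[Vector]) -> Tuple[Vector, Vector]:
    """phi_t = g_t and psi_t = diag(sum_i g_i^2) / t (diagonal kept as a vector)."""
    t = len(grads)
    dim = len(grads[0])
    phi = list(grads[-1])
    psi = [sum(g[j] ** 2 for g in grads) / t for j in range(dim)]
    return phi, psi


def adam_average(grads: Sequence[Vector],
                 beta1: float = 0.9,
                 beta2: float = 0.999) -> Tuple[Vector, Vector]:
    """Explicit-sum form of Adam's averaging functions (no bias correction)."""
    t = len(grads)
    dim = len(grads[0])
    phi = [
        (1 - beta1) * sum(beta1 ** (t - i) * g[j]
                          for i, g in enumerate(grads, start=1))
        for j in range(dim)
    ]
    psi = [
        (1 - beta2) * sum(beta2 ** (t - i) * g[j] ** 2
                          for i, g in enumerate(grads, start=1))
        for j in range(dim)
    ]
    return phi, psi


def adam_average_recursive(grads: Sequence[Vector],
                           beta1: float = 0.9,
                           beta2: float = 0.999) -> Tuple[Vector, Vector]:
    """Same quantities via m_t = beta1*m_{t-1} + (1-beta1)*g_t, starting from zero."""
    dim = len(grads[0])
    m = [0.0] * dim
    v = [0.0] * dim
    for g in grads:
        m = [beta1 * mj + (1 - beta1) * gj for mj, gj in zip(m, g)]
        v = [beta2 * vj + (1 - beta2) * gj ** 2 for vj, gj in zip(v, g)]
    return m, v


# Made-up gradient history g_1, g_2 for a two-parameter model.
history = [[1.0, -2.0], [3.0, 0.5]]
print(adagrad_average(history))
print(adam_average(history))
print(adam_average_recursive(history))
```

With the typical $\beta_2=0.999$, the second-moment average forgets old gradients far more slowly than the first-moment average with $\beta_1=0.9$, which is visible if the history is extended with a few near-zero gradients.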

[1] On the Convergence of Adam and Beyond, 2018.


Reposted from blog.csdn.net/dgyuanshaofeng/article/details/80370826
