AutoML: Model Evaluation and Search Acceleration (Network Morphism, One-Shot, Parameter Sharing)

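In NAS, the most time-consuming step is training the candidate models. Early NAS methods trained every candidate model from scratch, which made the search very expensive. A natural question is whether already-trained networks can be reused as much as possible. There are currently two main approaches: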

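One approach uses network morphisms to start from a small network and add to it. A network morphism transforms a network while keeping the function it computes unchanged. The benefit is that the transformed network can reuse the previously trained weights instead of being trained from scratch. For example, the paper "Reinforcement Learning for Architecture Search by Network Transformation" from Shanghai Jiao Tong University and University College London combines network morphisms with neural architecture search. The paper "Simple And Efficient Architecture Search for Convolutional Neural Networks" also uses network morphisms to reuse weights, but with hill climbing as the search strategy.
The other approach starts from a large network and subtracts from it. One-Shot Architecture Search, for example, removes parts of a large, all-encompassing network.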

Network Morphism: Weight Inheritance

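Reference post: "Four Different Techniques for AutoML System Design (3): Weight Inheritance and Morphism". Morphism is a method for generating network architectures; the evaluation-acceleration technique that goes with it is weight inheritance.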

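A morphism is a modification of a neural network. Take two networks, the original network a and the new network b. If a and b are functionally equivalent even though their structures differ, then a and b are morphisms of each other. In short, networks with the same function but different structures are morphisms of one another.
Auto-Keras performs neural architecture search with morphisms and weight inheritance: Auto-Keras: An Efficient Neural Architecture Search System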
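
The definition of a morphism above can be made concrete with a small NumPy sketch. It is only an illustration under stated assumptions, not the implementation used by Auto-Keras or by the papers cited earlier: the network is a plain ReLU MLP, and the helper names `forward`, `deepen`, and `widen` are hypothetical. `deepen` inserts an identity-initialised layer, and `widen` duplicates a hidden unit and halves its outgoing weights. Both change the structure while leaving the computed function unchanged, so the new network can inherit the trained weights.

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def forward(x, layers):
    """Run an MLP given as [(W, b), ...]; ReLU follows every layer except the last."""
    for i, (W, b) in enumerate(layers):
        x = x @ W + b
        if i < len(layers) - 1:
            x = relu(x)
    return x

def deepen(layers, index):
    """Insert an identity layer after hidden layer `index` (hypothetical helper).

    The input of the new layer is a ReLU output h >= 0, so relu(h @ I + 0) == h
    and the deeper network computes the same function.
    """
    if not 0 <= index < len(layers) - 1:
        raise ValueError("index must refer to a hidden layer")
    width = layers[index][0].shape[1]
    identity_layer = (np.eye(width), np.zeros(width))
    return layers[:index + 1] + [identity_layer] + layers[index + 1:]

def widen(layers, index, unit):
    """Duplicate hidden unit `unit` of layer `index` (hypothetical helper).

    The copy receives the same incoming weights; the outgoing weights of the
    original and the copy are halved, so their summed contribution is unchanged.
    """
    if not 0 <= index < len(layers) - 1:
        raise ValueError("index must refer to a hidden layer")
    W_in, b_in = layers[index]
    W_out, b_out = layers[index + 1]
    W_in_new = np.concatenate([W_in, W_in[:, [unit]]], axis=1)
    b_in_new = np.concatenate([b_in, b_in[[unit]]])
    W_out_new = np.concatenate([W_out, W_out[[unit], :]], axis=0)
    W_out_new[unit, :] /= 2.0
    W_out_new[-1, :] /= 2.0
    return (layers[:index] + [(W_in_new, b_in_new), (W_out_new, b_out)]
            + layers[index + 2:])

# Toy "trained" parent network with random weights (illustrative values only).
rng = np.random.default_rng(0)
layers = [(rng.normal(size=(4, 8)), rng.normal(size=8)),
          (rng.normal(size=(8, 3)), rng.normal(size=3))]
x = rng.normal(size=(5, 4))

deeper = deepen(layers, index=0)
wider = widen(layers, index=0, unit=2)
print(np.allclose(forward(x, layers), forward(x, deeper)))
print(np.allclose(forward(x, layers), forward(x, wider)))
```

After such a transformation the child network starts from the parent's accuracy and only needs fine-tuning, which is where the saving over training from scratch comes from.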

Parameter Sharing (One-Shot)

Key papers:

Understanding and Simplifying One-Shot Architecture Search: blog walkthrough 1
Single path one-shot neural architecture search with uniform sampling: blog walkthrough 1
Official source code, source code (block search only)

Blog posts:
AutoDL Paper Notes (7): One-Shot-Based NAS
Four Different Techniques for AutoML System Design (4): Parameter Sharing
Efficient Neural Architecture Search via Parameter Sharing
One-Shot-Based NAS (Part 1)

The basic idea of one-shot:

Subtract from a large, all-encompassing network.

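The basic recipe for a one-shot model:

Step 1: train a one-shot model (also called a supergraph) to obtain the supergraph's weights.
Step 2: generate many subgraphs from the supergraph; each subgraph is one network architecture. The subgraphs inherit the supergraph's weights (with no further training) and are evaluated directly with those inherited weights. The resulting scores are used to rank the subgraphs. (A simple way to generate subgraphs: the supergraph contains all candidate edges, so deleting some of its edges yields many different subgraphs.)
Step 3: select the best subgraph, train it from scratch on the training set to obtain its own weights, and evaluate it on the test set with those weights.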
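
The three steps above can be sketched on a toy regression problem. This is a minimal illustration under assumptions of my own, not code from any of the cited papers: the "supergraph" is a sum of four candidate edges (`OPS`), a subgraph is a 0/1 mask over those edges, the supergraph is trained with a randomly sampled mask at each step, and the names `featurize`, `mse`, and `train` are hypothetical.

```python
import itertools
import numpy as np

# Candidate edges of the toy supergraph (an assumption made for illustration).
OPS = [lambda x: x, lambda x: x ** 2, np.sin, np.abs]

def featurize(x):
    """Apply every candidate op; column k holds the output of edge k."""
    return np.stack([op(x) for op in OPS], axis=1)

def mse(feats, y, w, mask):
    """Mean squared error of the subgraph selected by `mask` using weights `w`."""
    pred = (feats * mask) @ w
    return float(np.mean((pred - y) ** 2))

def train(feats, y, sample_mask, steps=2000, lr=0.05):
    """Gradient descent on MSE; `sample_mask` chooses the active edges per step."""
    w = np.zeros(feats.shape[1])
    for _ in range(steps):
        active = feats * sample_mask()
        grad = 2.0 * active.T @ (active @ w - y) / len(y)
        w -= lr * grad
    return w

# Synthetic data (illustrative only): the target uses two of the four edges.
rng = np.random.default_rng(0)
x = rng.uniform(-1.0, 1.0, size=300)
y = 1.5 * x ** 2 + 0.5 * np.sin(x) + rng.normal(scale=0.01, size=300)
feats = featurize(x)
F_train, F_val, F_test = feats[:100], feats[100:200], feats[200:]
y_train, y_val, y_test = y[:100], y[100:200], y[200:]

# Every non-empty subset of edges is one subgraph (15 in total).
all_masks = [np.array(m, dtype=float)
             for m in itertools.product([0, 1], repeat=len(OPS)) if any(m)]

# Step 1: train the supergraph once, sampling a random subgraph at each step.
w_shared = train(F_train, y_train,
                 lambda: all_masks[rng.integers(len(all_masks))])

# Step 2: evaluate every subgraph with the inherited weights and rank them.
ranking = sorted(all_masks, key=lambda m: mse(F_val, y_val, w_shared, m))
val_losses = [mse(F_val, y_val, w_shared, m) for m in ranking]

# Step 3: retrain the best subgraph from scratch and evaluate it on the test set.
best = ranking[0]
w_best = train(F_train, y_train, lambda: best)
test_loss = mse(F_test, y_test, w_best, best)
print("best subgraph mask:", best.astype(int), "test MSE:", test_loss)
```

The point of the sketch is the cost structure: only one training run (step 1) is shared by all 15 candidates, step 2 is inference only, and a second full training run is spent on the single selected architecture.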

From "Understanding and Simplifying One-Shot Architecture Search":
The proposed approach for one-shot architecture search consists of four steps:
(1) Design a search space that allows us to represent a wide variety of architectures using a single one-shot model.
(2) Train the one-shot model to make it predictive of the validation accuracies of the architectures.
(3) Evaluate candidate architectures on the validation set using the pre-trained one shot model.
(4) Re-train the most promising architectures from scratch and evaluate their performance on the test set.
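
Step (2) asks for a one-shot model that is "predictive" of stand-alone validation accuracy. One common way to check this, which is an assumption here rather than something stated in the quoted text, is a rank correlation between the accuracies obtained with shared weights and the accuracies obtained after training each architecture from scratch. The sketch below implements Kendall's tau in plain Python; the accuracy values are made-up placeholders, not results from any paper.

```python
from itertools import combinations

def kendall_tau(a, b):
    """Kendall rank correlation (tau-a) of two equal-length score lists without ties."""
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("need two lists of equal length >= 2")
    concordant = discordant = 0
    for i, j in combinations(range(len(a)), 2):
        sign = (a[i] - a[j]) * (b[i] - b[j])
        if sign > 0:
            concordant += 1      # both scores order this pair the same way
        elif sign < 0:
            discordant += 1      # the two scores disagree on this pair
    pairs = len(a) * (len(a) - 1) // 2
    return (concordant - discordant) / pairs

# Hypothetical accuracies for five architectures (placeholder numbers).
one_shot_acc = [0.31, 0.45, 0.52, 0.60, 0.72]      # evaluated with shared weights
stand_alone_acc = [0.91, 0.92, 0.935, 0.93, 0.95]  # after training from scratch

tau = kendall_tau(one_shot_acc, stand_alone_acc)
print("Kendall tau:", tau)
```

A tau close to 1 means the ranking produced by the one-shot model agrees with the stand-alone ranking, which is what matters for selecting architectures in step (4); the absolute one-shot accuracies can be much lower without harming the search.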

xys430381_1
