Caffe2 (Part 1): Basic Concepts

1 Blobs, Workspace, and Tensors

Caffe2 organizes data as blobs. A blob is a named chunk of data in memory. Most blobs contain a tensor (think of it as a multidimensional array), which is converted to a numpy.array in Python.

The workspace stores all blobs.
The code below shows how to feed a blob into the workspace with FeedBlob and how to read it back with FetchBlob. The workspace initializes itself the first time it is used.

from caffe2.python import workspace, model_helper
import numpy as np
# Create random tensor of three dimensions
x = np.random.rand(4, 3, 2)
print(x)
print(x.shape)
 
workspace.FeedBlob("my_x", x)
 
x2 = workspace.FetchBlob("my_x")
print(x2)

2 Nets and Operators

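The basic model abstraction in Caffe2 is the net. A net is a graph of operators, and each operator takes a set of input blobs and produces one or more output blobs.

The FC op in Caffe2 takes an input blob, weights, and a bias. The weights and bias can be generated with XavierFill or ConstantFill; both take an empty input list [], a name, and a shape.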
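To make the "graph of operators over named blobs" idea concrete, here is a minimal pure-Python sketch. It is not Caffe2 code: ToyNet, add_op, run and the toy fc function are hypothetical names made up for this illustration. The only thing taken from Caffe2 is the convention that FC computes y = x * W^T + b with W shaped [dim_out, dim_in].

```python
# Toy illustration only: ToyNet and fc are hypothetical, not part of Caffe2.

class ToyNet:
    """A net as an ordered list of operators that read and write named blobs."""

    def __init__(self, name):
        self.name = name
        self.ops = []  # each entry: (function, input blob names, output blob names)

    def add_op(self, fn, inputs, outputs):
        self.ops.append((fn, list(inputs), list(outputs)))

    def run(self, blobs):
        """Run every op in order, looking up inputs and storing outputs by name."""
        for fn, inputs, outputs in self.ops:
            results = fn(*[blobs[name] for name in inputs])
            for name, value in zip(outputs, results):
                blobs[name] = value
        return blobs


def fc(x, w, b):
    """Fully connected op: y = x * w^T + b, with w shaped [dim_out, dim_in]."""
    y = [[sum(xi * wi for xi, wi in zip(row, w_row)) + b_j
          for w_row, b_j in zip(w, b)]
         for row in x]
    return (y,)  # every toy op returns a tuple of output blobs


# A plain dict stands in for the workspace here.
toy_blobs = {
    "data": [[1.0, 2.0]],                           # batch of 1, dim_in = 2
    "fc_w": [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]],   # dim_out = 3
    "fc_b": [0.5, 0.5, 0.5],
}
toy_net = ToyNet("toy")
toy_net.add_op(fc, ["data", "fc_w", "fc_b"], ["fc1"])
toy_net.run(toy_blobs)
print(toy_blobs["fc1"])  # [[1.5, 2.5, 3.5]]
```

The op only refers to blobs by name, which is why the same net can be run again after new data has been fed in under the same names.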

3 Example

The code block below builds a simple model with the following components:

One fully-connected layer (FC)
A Sigmoid activation with a Softmax
A CrossEntropy loss

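Building nets directly is tedious, so it is better to use the model helper classes in Python. We name the model "my first net", and ModelHelper creates two interrelated nets:

one that initializes the parameters (ref. init_net)
one that runs the actual training (ref. exec_net)

In other words, model_helper creates two nets: m.param_init_net, which is run only once and initializes all parameter blobs such as the weights and bias, and m.net, which carries out the actual training when it is run.

Here is an example that builds such a network: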

from caffe2.python import workspace, model_helper
import numpy as np

# Create the input data: 16 samples with 100 float32 features each
data = np.random.rand(16, 100).astype(np.float32)

# Create labels for the data as integers [0, 9]
label = (np.random.rand(16) * 10).astype(np.int32)

workspace.FeedBlob("data", data)  # feed the arrays into the workspace as blobs
workspace.FeedBlob("label", label)

# Create model using a model helper. ModelHelper creates two nets:
# param_init_net (parameter initialization) and net (the actual network).
m = model_helper.ModelHelper(name="my first net")

# Use the parameter initialization net to create the weight and bias blobs
weight = m.param_init_net.XavierFill([], 'fc_w', shape=[10, 100])
bias = m.param_init_net.ConstantFill([], 'fc_b', shape=[10, ])

fc_1 = m.net.FC(["data", "fc_w", "fc_b"], "fc1")  # fully connected layer
pred = m.net.Sigmoid(fc_1, "pred")  # Sigmoid layer
softmax, loss = m.net.SoftmaxWithLoss([pred, "label"], ["softmax", "loss"])  # softmax output and loss

print(m.net.Proto())  # print the structure of the main net
print(m.param_init_net.Proto())  # print the structure of the initialization net

workspace.RunNetOnce(m.param_init_net)  # run parameter initialization exactly once
workspace.CreateNet(m.net)  # create the net once, then run it many times
# Run 100 x 10 iterations
for _ in range(100):
    data = np.random.rand(16, 100).astype(np.float32)
    label = (np.random.rand(16) * 10).astype(np.int32)
    workspace.FeedBlob("data", data)
    workspace.FeedBlob("label", label)
    workspace.RunNet(m.name, 10)   # run for 10 times
print(workspace.FetchBlob("softmax"))  # print the outputs after the runs
print(workspace.FetchBlob("loss"))
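To see what the three operators compute, the forward pass can be written out by hand. This is a pure-Python sketch, not Caffe2 code: the function names are made up for the illustration, and it assumes that the loss is the mean negative log-likelihood over the batch. The shapes are shrunk to 2 samples and 4 features, with 10 classes as in the example above.

```python
import math


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def softmax(row):
    """Numerically stable softmax over one row of scores."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def forward(data, w, b, labels):
    """FC -> Sigmoid -> softmax with mean cross-entropy loss (assumed)."""
    fc1 = [[sum(x * wi for x, wi in zip(row, w_row)) + b_j
            for w_row, b_j in zip(w, b)]
           for row in data]
    pred = [[sigmoid(v) for v in row] for row in fc1]
    probs = [softmax(row) for row in pred]
    loss = -sum(math.log(p[y]) for p, y in zip(probs, labels)) / len(labels)
    return probs, loss


# Hypothetical tiny inputs: 2 samples, 4 features, 10 classes, all-zero parameters.
toy_data = [[0.1, 0.2, 0.3, 0.4], [0.5, 0.6, 0.7, 0.8]]
toy_w = [[0.0] * 4 for _ in range(10)]   # shape [dim_out, dim_in]
toy_b = [0.0] * 10
toy_labels = [3, 7]

toy_probs, toy_loss = forward(toy_data, toy_w, toy_b, toy_labels)
print(toy_probs[0])  # every class gets the same probability
print(toy_loss)      # close to ln(10) for 10 equally likely classes
```

With all-zero parameters every class is equally likely, so the loss equals ln(10), about 2.3. That value is a handy reference point when reading the loss blob of a 10-class model that has not learned anything yet.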

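Summary

First, the input data and label blobs are created in memory; in a real application they would be loaded from a database.
The first dimension of the data and label blobs is 16, i.e. the batch size is 16. Many Caffe2 operators can be used through ModelHelper; see ModelHelper's Operator List for details.
Next, the model is built by defining several operators: FC, Sigmoid, and SoftmaxWithLoss. At this point the operators and the model have only been defined. ModelHelper created two nets:
m.param_init_net, which runs only once and initializes the parameter blobs;
m.net, which runs the training.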

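Backpropagation

The net above contains only the forward pass, so it does not learn anything. The backward pass is generated by adding a gradient operator for each operator in the forward pass.

Add the following before the RunNetOnce() call: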

m.AddGradientOperators([loss])
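A gradient operator computes the derivative of the loss with respect to the inputs of its forward operator. As a hand-worked illustration (this is not the code Caffe2 generates, and all function names are made up), the sketch below takes the softmax cross-entropy loss of a single sample and checks the well-known analytic gradient, softmax(z) minus the one-hot label, against a finite-difference estimate.

```python
import math


def softmax(z):
    m = max(z)
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(z, label):
    """Loss of one sample: negative log-probability of the true class."""
    return -math.log(softmax(z)[label])


def analytic_grad(z, label):
    """d(loss)/dz = softmax(z) - one_hot(label)."""
    p = softmax(z)
    return [p_i - (1.0 if i == label else 0.0) for i, p_i in enumerate(p)]


def numeric_grad(z, label, eps=1e-6):
    """Central finite differences, used only to check the analytic formula."""
    grad = []
    for i in range(len(z)):
        hi = list(z)
        lo = list(z)
        hi[i] += eps
        lo[i] -= eps
        grad.append((cross_entropy(hi, label) - cross_entropy(lo, label)) / (2 * eps))
    return grad


logits = [0.2, -1.0, 0.7]   # hypothetical scores for 3 classes
true_label = 2
grad_a = analytic_grad(logits, true_label)
grad_n = numeric_grad(logits, true_label)
max_diff = max(abs(a - n) for a, n in zip(grad_a, grad_n))
print(grad_a)
print(max_diff)  # should be tiny if the formula is right
```

Chaining such per-operator derivatives from the loss back to the parameters is what the backward pass does; a separate update step then has to use those gradients to change the weights.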

The basic idea of the Caffe2 API:

Use Python to compose the nets that train the model quickly and conveniently;
Pass those nets to the C++ code as serialized protobuffers;
Then let the C++ code run the nets.

1 Caffe2 Study Guide (Part 1): https://blog.csdn.net/gitprx/article/details/81389171
Caffe2 (Part 3): https://blog.csdn.net/zziahgf/article/details/78929017
Caffe2 Intro Tutorial (official reference): https://caffe2.ai/docs/intro-tutorial.html

4 Brewing Models

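brew is a new Caffe2 API that helps us build models.

Concepts: Ops vs Helper Functions

Before learning brew, we should review some Caffe2 conventions and how the layers of a neural network are represented. A deep learning network in Caffe2 is built from operators. These ops are implemented in C++ for best performance, and Caffe2 also provides a Python API. In Caffe2, op names are always CamelCase, while the Python helper functions with the same names are lowercase. Some examples follow.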

4.1 Ops

Create an FC op:

model.net.FC([blob_in, weight, bias], blob_out)

Create a Copy op:

model.net.Copy(blob_in, blob_out)

Note that we can also create an op directly on the model, like this:

model.Copy(blob_in, blob_out)

4.2 Helper Functions

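Building a model or network from bare ops is laborious, because you have to take care of parameter initialization and device/engine selection yourself. For example, to create an FC layer you need several lines of code to prepare the weights and bias that are fed to the op: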

from caffe2.python import model_helper

model = model_helper.ModelHelper(name='train')  # create a model named "train"
# initialize the weight
weight = model.param_init_net.XavierFill(
    [],
    blob_out + '_w',
    shape=[dim_out, dim_in],
    **kwargs, # maybe indicating weight should be on GPU here
)
# initialize the bias
bias = model.param_init_net.ConstantFill(
    [],
    blob_out + '_b',
    shape=[dim_out, ],
    **kwargs,
)
# finally build the FC op from the input blob, the weight, and the bias
model.net.FC([blob_in, weight, bias], blob_out, **kwargs)

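This is where Caffe2's helper functions come in. A helper function wraps code like the above and takes care of parameter initialization, op definition, and engine selection. Caffe2's default helper functions are named according to the Python PEP 8 function naming convention. For example:

fcLayer = fc(model, blob_in, blob_out, **kwargs)  # returns a blob reference

Some helper functions construct more than one op. More helper functions are available here:

Helper function reference: https://github.com/pytorch/pytorch/tree/master/caffe2/python/helpers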
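The wrapping idea can be sketched without Caffe2 at all. In the toy below, ToyModel and this fc function are hypothetical stand-ins, not the real helper: the model just records which ops would be added to the initialization net and to the main net, and the helper returns the name of the output blob.

```python
# Toy illustration only: ToyModel and fc are hypothetical, not Caffe2 classes.

class ToyModel:
    def __init__(self, name):
        self.name = name
        self.param_init_ops = []  # stands in for param_init_net
        self.net_ops = []         # stands in for net


def fc(model, blob_in, blob_out, dim_in, dim_out):
    """Record the two fill ops and the FC op, then return the output blob name."""
    weight = blob_out + "_w"
    bias = blob_out + "_b"
    model.param_init_ops.append(("XavierFill", [], weight, [dim_out, dim_in]))
    model.param_init_ops.append(("ConstantFill", [], bias, [dim_out]))
    model.net_ops.append(("FC", [blob_in, weight, bias], blob_out))
    return blob_out


toy_model = ToyModel("train")
fc_out = fc(toy_model, "data", "fc1", dim_in=100, dim_out=10)
print(fc_out)                    # fc1
print(toy_model.param_init_ops)  # the two parameter fill ops
print(toy_model.net_ops)         # the single FC op
```

One call replaces three op definitions, and the caller never has to name the weight and bias blobs by hand.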

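4.3 Brew

brew makes building models easier. It is a smart collection of helper functions: by importing the brew module, we can use all of the Caffe2 helper functions. The code below adds a fully connected layer to a model: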

from caffe2.python import brew
brew.fc(model, blob_in, blob_out, ...)

This does not look much simpler, but a single import of brew is enough to build more complex networks:

from caffe2.python import brew

def AddLeNetModel(model, data):
    # two conv + max-pool stages, then two fully connected layers and a softmax
    conv1 = brew.conv(model, data, 'conv1', 1, 20, 5)
    pool1 = brew.max_pool(model, conv1, 'pool1', kernel=2, stride=2)
    conv2 = brew.conv(model, pool1, 'conv2', 20, 50, 5)
    pool2 = brew.max_pool(model, conv2, 'pool2', kernel=2, stride=2)
    fc3 = brew.fc(model, pool2, 'fc3', 50 * 4 * 4, 500)
    fc3 = brew.relu(model, fc3, fc3)
    pred = brew.fc(model, fc3, 'pred', 500, 10)
    softmax = brew.softmax(model, pred, 'softmax')
    return softmax

Each layer is created by brew calling the instantiation code of the corresponding op.

5 arg_scope

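arg_scope is syntactic sugar for setting default argument values of helper functions within a context. For example, suppose we want to try different weight initialization methods in a ResNet-150 script. We could either write: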

# change all weight_init here
brew.conv(model, ..., weight_init=('XavierFill', {}),...)
...
# repeat 150 times
...
brew.conv(model, ..., weight_init=('XavierFill', {}),...)

or use arg_scope:

with brew.arg_scope([brew.conv], weight_init=('XavierFill', {})):
    brew.conv(model, ...) # no weight_init needed here!
    brew.conv(model, ...)
    ...

6 Custom Helper Functions

You can register your own helper function with the following code:

from caffe2.python import brew

def my_super_layer(model, blob_in, blob_out, **kwargs):
    """
    100x faster, awesome code that you'll share one day.
    """

brew.Register(my_super_layer)  # make the function available as brew.my_super_layer
brew.my_super_layer(model, blob_in, blob_out)
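A registry of this kind can be imitated in a few lines of plain Python. The sketch below is only an assumption about the general pattern, not brew's source: HelperRegistry, register and toy_brew are hypothetical names. Registered functions are stored by name and exposed as attributes of the registry object.

```python
class HelperRegistry:
    """Toy registry that exposes registered functions as attributes."""

    def __init__(self):
        self._helpers = {}

    def register(self, helper):
        name = helper.__name__
        if name in self._helpers:
            raise AttributeError("Helper {} already registered".format(name))
        self._helpers[name] = helper
        return helper

    def __getattr__(self, name):
        # Only called when normal attribute lookup fails.
        try:
            return self.__dict__["_helpers"][name]
        except KeyError:
            raise AttributeError("No helper named {}".format(name))


toy_brew = HelperRegistry()


def my_super_layer(model, blob_in, blob_out, **kwargs):
    """Placeholder layer: a real helper would add ops to the model here."""
    return blob_out


toy_brew.register(my_super_layer)
result = toy_brew.my_super_layer(None, "data", "out")
print(result)  # out
```

Looking helpers up by name is what lets user-defined layers be called in exactly the same way as the built-in ones.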

2 Caffe2 Study Guide (Part 2): https://blog.csdn.net/gitprx/article/details/81394904