Caffe Tutorial (Data: how to caffeinate data for model input)

Data

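Data flows through Caffe as Blobs. Data layers load input and save output by converting between Blobs and other formats. Common transformations such as mean subtraction and feature scaling are configured on the data layer. To support a new input type, develop a new data layer; the rest of the Net is unaffected, thanks to the modularity of the Caffe layer catalogue.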

A data layer is defined as follows:

layer {
  name: "mnist"
  # Data layer loads leveldb or lmdb storage DBs for high-throughput.
  type: "Data"
  # the 1st top is the data itself: the name is only convention
  top: "data"
  # the 2nd top is the ground truth: the name is only convention
  top: "label"
  # the Data layer configuration
  data_param {
    # path to the DB
    source: "examples/mnist/mnist_train_lmdb"
    # type of DB: LEVELDB or LMDB (LMDB supports concurrent reads)
    backend: LMDB
    # batch processing improves efficiency.
    batch_size: 64
  }
  # common data transformations
  transform_param {
    # feature scaling coefficient (1/256): this maps the [0, 255] MNIST data to [0, 1)
    scale: 0.00390625
  }
}

This definition loads the MNIST digits.

Tops and Bottoms: a data layer produces top blobs that feed data to the model. It has no bottom blobs because it takes no input.

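Data and Label: a data layer has at least one top, conventionally named data. For ground truth, a second top can be defined, conventionally named label. Both tops simply produce blobs, and there is nothing inherently special about these names. The (data, label) pairing is a convenience for classification models.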

Transformations: data preprocessing is parameterized by the transformation message (transform_param) inside the data layer definition.

layer {
  name: "data"
  type: "Data"
  [...]
  transform_param {
    scale: 0.1
    mean_file: "mean.binaryproto"
    # for images in particular horizontal mirroring and random cropping
    # can be done as simple data augmentations.
    mirror: 1  # 1 = on, 0 = off
    # crop a `crop_size` x `crop_size` patch:
    # - at random during training
    # - from the center during testing
    crop_size: 227
  }
}
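
To make the effect of these settings concrete, here is a minimal pure-Python sketch of the transformation applied to a single image channel. It is an illustration only, not Caffe's code: the function name `transform`, its arguments, and the list-of-rows image layout are hypothetical. It assumes the order used by Caffe's data transformer, namely crop, then optional mirror, then (pixel - mean) * scale.

```python
import random


def transform(image, mean, scale=1.0, crop_size=0, mirror=False,
              train=True, rng=random):
    """Crop, optionally mirror, subtract the mean and scale one channel.

    `image` and `mean` are lists of rows with identical shape.
    All names here are hypothetical, chosen for this sketch.
    """
    height, width = len(image), len(image[0])
    out_h = crop_size if crop_size else height
    out_w = crop_size if crop_size else width

    if crop_size and train:
        # Training: pick the crop window at random (randint is inclusive).
        h_off = rng.randint(0, height - crop_size)
        w_off = rng.randint(0, width - crop_size)
    else:
        # Testing (or no cropping): take the window from the center.
        h_off = (height - out_h) // 2
        w_off = (width - out_w) // 2

    # With mirroring enabled, flip horizontally about half of the time.
    flip = mirror and rng.random() < 0.5

    out = []
    for h in range(out_h):
        row = []
        for w in range(out_w):
            src_w = w_off + (out_w - 1 - w if flip else w)
            pixel = image[h_off + h][src_w]
            row.append((pixel - mean[h_off + h][src_w]) * scale)
        out.append(row)
    return out


# A 4x4 toy image with a constant mean of 1, center-cropped to 2x2.
image = [[float(4 * r + c) for c in range(4)] for r in range(4)]
mean = [[1.0] * 4 for _ in range(4)]
patch = transform(image, mean, scale=0.5, crop_size=2, train=False)
```

With the MNIST setting `scale: 0.00390625` (1/256) and no mean, a pixel value of 255 becomes 255/256, so the output range is [0, 1).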

Prefetching: for throughput, data layers fetch the next batch of data and prepare it in the background while the Net computes the current batch.
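
The prefetching idea can be sketched with a background thread and a bounded queue. This is a simplified illustration, not how Caffe implements it internally; `prefetching_batches`, `load_batch`, and `depth` are hypothetical names introduced for this example.

```python
import queue
import threading


def prefetching_batches(load_batch, num_batches, depth=2):
    """Yield batches while a background thread loads the next ones.

    `load_batch(index)` is a hypothetical callable returning one batch.
    `depth` bounds how many prepared batches may wait in the buffer.
    """
    buffer = queue.Queue(maxsize=depth)
    sentinel = object()  # marks the end of the stream

    def worker():
        try:
            for index in range(num_batches):
                # put() blocks while the buffer is full, so the loader
                # never runs more than `depth` batches ahead.
                buffer.put(load_batch(index))
        finally:
            # Always signal the end, even if load_batch raised.
            buffer.put(sentinel)

    threading.Thread(target=worker, daemon=True).start()

    while True:
        batch = buffer.get()
        if batch is sentinel:
            return
        yield batch


# The consumer processes batch i while the worker prepares batch i + 1.
batches = list(prefetching_batches(lambda i: [i, i + 1], num_batches=3))
```

The bounded queue is the key design choice: it lets loading overlap with computation while capping the memory held by prepared batches.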

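Multiple Inputs: a Net can have any number and type of inputs. Define as many data layers as needed, giving each a unique name and top. Multiple inputs are useful for non-trivial ground truth: one data layer loads the actual data and another loads the ground truth in lock-step. In this arrangement both data and label can be any 4D array. Further applications of multiple inputs are found in multi-modal and sequence models. In these cases you may need to implement your own data preparation routines or a special data layer.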
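
Here "lock-step" means that both sources are read in the same order, so item i of the data always meets item i of the ground truth. A minimal sketch of that pairing, assuming both sources are in-memory sequences (the function `lockstep_batches` and the sample names are hypothetical, and Caffe's own layers read from databases rather than lists):

```python
def lockstep_batches(data_source, truth_source, batch_size):
    """Yield (data, truth) batches read from two sources in the same order."""
    if len(data_source) != len(truth_source):
        raise ValueError("both sources must hold the same number of items")
    for start in range(0, len(data_source), batch_size):
        stop = start + batch_size
        # The same slice is taken from both sources, which keeps them aligned.
        yield data_source[start:stop], truth_source[start:stop]


# Example: images paired with per-pixel masks instead of scalar labels.
images = ["img0", "img1", "img2"]
masks = ["mask0", "mask1", "mask2"]
pairs = list(lockstep_batches(images, masks, batch_size=2))
```

If the two sources were shuffled independently, the pairing would be lost, so any shuffling has to be applied identically to both.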

Formats

For details on each type of data Caffe understands, refer to the data layers section of the layer catalogue.

Deployment Input

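For on-the-fly computation, deployment Nets define their inputs with input fields: these Nets accept direct assignment of data for online or interactive computation.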
