1. First, let's run the Caffe examples.
```sh
cd caffe                            # enter the caffe directory first
./data/cifar10/get_cifar10.sh       # CIFAR-10: image classification
./data/mnist/get_mnist.sh           # MNIST: handwritten digits
./data/ilsvrc12/get_ilsvrc_aux.sh   # auxiliary data for the ILSVRC image recognition competition
```

These scripts download the datasets used for training.
What is a dataset? A dataset is a body of pre-existing, labeled information, for example car images numbered 1-100, cat images numbered 1-100, and so on.

What is a test set? A test set consists of images drawn at random from the dataset, or of images the model was never trained on, for example car #32, cat #102, dog #201. Note that the test set may even contain categories the training data does not cover at all, such as the dog photos here.
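To make the split concrete, here is a minimal sketch (plain Python, independent of Caffe; the file names and the 80/20 ratio are invented for the example) of separating labeled images into a training set and a held-out test set:

```python
import random

# Hypothetical labeled dataset: (filename, label) pairs.
dataset = [("car_%03d.jpg" % i, "car") for i in range(1, 101)] \
        + [("cat_%03d.jpg" % i, "cat") for i in range(1, 101)]

random.shuffle(dataset)            # randomize the order before splitting
split = int(len(dataset) * 0.8)    # e.g. 80% for training, 20% for testing
train_set, test_set = dataset[:split], dataset[split:]

print(len(train_set), "training images,", len(test_set), "test images")
```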
What is a .sh file? A .sh file is a shell script: a collection of commands gathered in one file. Running it with ./script.sh executes the commands in order, so there is no need to type them one by one; the point is efficiency, bundling the commands an application commonly needs. For example, here are the contents of get_cifar10.sh:
```sh
#!/usr/bin/env sh  # shebang: tells the system which interpreter runs this script
# This script downloads the CIFAR10 (binary version) data and unzips it.

DIR="$( cd "$(dirname "$0")" ; pwd -P )"  # absolute path of the directory this script lives in
cd "$DIR"

echo "Downloading..."
# Download the dataset archive; these are ordinary Linux commands that
# could just as well be typed directly into a terminal.
wget --no-check-certificate http://www.cs.toronto.edu/~kriz/cifar-10-binary.tar.gz

echo "Unzipping..."
tar -xf cifar-10-binary.tar.gz && rm -f cifar-10-binary.tar.gz  # unpack, then delete the archive
mv cifar-10-batches-bin/* . && rm -rf cifar-10-batches-bin

# Creation is split out because leveldb sometimes causes segfault
# and needs to be re-created.
echo "Done."
```

With the data downloaded, convert it and launch training:

```sh
./examples/cifar10/create_cifar10.sh  # convert the raw data into LMDB, a format Caffe can read
./examples/mnist/create_mnist.sh
./examples/cifar10/train_quick.sh
```

train_quick.sh itself contains:

```sh
#!/usr/bin/env sh
set -e

TOOLS=./build/tools

$TOOLS/caffe train \
  --solver=examples/cifar10/cifar10_quick_solver.prototxt $@

# reduce learning rate by factor of 10 after 8 epochs
$TOOLS/caffe train \
  --solver=examples/cifar10/cifar10_quick_solver_lr1.prototxt \
  --snapshot=examples/cifar10/cifar10_quick_iter_4000.solverstate.h5 $@
```

The two files below are the solver configuration and the network definition that train_quick.sh depends on.
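As an aside, the same training can also be driven from Python instead of the shell script. A minimal sketch, assuming Caffe was built with its Python interface (pycaffe) and that it is run from the caffe root directory:

```python
import caffe

caffe.set_mode_cpu()  # matches solver_mode: CPU in the solver file below

# Load the first-stage solver; this also parses the network prototxt it references.
solver = caffe.SGDSolver('examples/cifar10/cifar10_quick_solver.prototxt')

# Run the full max_iter iterations, equivalent to the first `caffe train` call above.
solver.solve()

# solver.step(n) would instead advance training n iterations at a time,
# which is convenient for inspecting loss or weights in between.
```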
```
# reduce the learning rate after 8 epochs (4000 iters) by a factor of 10
# The train/test net protocol buffer definition
net: "examples/cifar10/cifar10_quick_train_test.prototxt"
# test_iter specifies how many forward passes the test should carry out.
# In the case of MNIST, we have test batch size 100 and 100 test iterations,
# covering the full 10,000 testing images.
test_iter: 100        # number of test batches per test pass, e.g. 100 batches of 100 images each
# Carry out testing every 500 training iterations.
test_interval: 500    # every 500 iterations, report test accuracy (score 0) and test loss (score 1)
# The base learning rate, momentum and the weight decay of the network.
base_lr: 0.001        # base learning rate
momentum: 0.9         # optimization hyperparameter
weight_decay: 0.004   # optimization hyperparameter
# The learning rate policy
lr_policy: "fixed"    # how the learning rate changes over the course of training
# Display every 100 iterations
display: 100          # print lr (learning rate) and loss (training loss) every 100 iterations
# The maximum number of iterations
max_iter: 4000        # maximum number of training iterations
# snapshot intermediate results
snapshot: 4000        # save a snapshot of the model state every 4000 iterations
snapshot_format: HDF5 # snapshots are written in HDF5 format
snapshot_prefix: "examples/cifar10/cifar10_quick"
# solver mode: CPU or GPU
solver_mode: CPU      # run the solver on the CPU
```

How these numbers fit together: suppose the training set has 50,000 images and the training-stage batch_size is 100. Then 50,000 / 100 = 500 iterations complete one full pass over the training data (one epoch), which is a natural point to run a test, hence test_interval: 500. To train for 80 epochs, set max_iter: 80 * 500 = 40,000.

Suppose the test set has 10,000 images and the testing-stage batch_size is 100. Then 10,000 / 100 = 100 test batches are needed to cover the whole test set once, hence test_iter: 100.

The learning rate normally decreases as iterations accumulate. To keep it constant over all 40,000 iterations, use lr_policy: "fixed". To lower it four times instead, set lr_policy: "step" with stepsize: 40,000 / 4 = 10,000, i.e., the learning rate drops every 10,000 iterations.

How each setting is used concretely will be explained in detail as we read through the code.
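The arithmetic behind those choices is easy to check; a minimal sketch (plain Python, using the hypothetical sample sizes from the text):

```python
train_images = 50000  # hypothetical training-set size
test_images  = 10000  # hypothetical test-set size
batch_size   = 100    # batch_size in both the TRAIN and TEST stages
epochs       = 80     # desired number of full passes over the training set
lr_drops     = 4      # how many times the learning rate should be lowered

iters_per_epoch = train_images // batch_size  # 500   -> test_interval: 500
max_iter        = epochs * iters_per_epoch    # 40000 -> max_iter: 40000
test_iter       = test_images // batch_size   # 100   -> test_iter: 100
stepsize        = max_iter // lr_drops        # 10000 -> stepsize: 10000 (lr_policy: "step")

print(iters_per_epoch, max_iter, test_iter, stepsize)
```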
```
name: "CIFAR10_quick"
layer {
  name: "cifar"
  type: "Data"
  top: "data"
  top: "label"
  include {
    phase: TRAIN  # this layer is only active during the training phase
  }
  transform_param {
    # Preprocessing: normalization maps the data into a defined range.
    # (Setting scale to 0.00390625, i.e. 1/255, would map inputs from 0-255 to 0-1.)
    mean_file: "examples/cifar10/mean.binaryproto"
  }
  data_param {
    source: "examples/cifar10/cifar10_train_lmdb"  # where the data comes from
    batch_size: 100  # number of images processed per iteration
    backend: LMDB    # storage format of the data
  }
}
layer {
  # The test-phase layer is declared the same way as the training-phase one.
  name: "cifar"
  type: "Data"
  top: "data"
  top: "label"
  include {
    phase: TEST
  }
  transform_param {
    mean_file: "examples/cifar10/mean.binaryproto"
  }
  data_param {
    source: "examples/cifar10/cifar10_test_lmdb"
    batch_size: 100
    backend: LMDB
  }
}
layer {
  name: "conv1"
  type: "Convolution"
  bottom: "data"
  top: "conv1"
  param { lr_mult: 1 }
  param { lr_mult: 2 }
  convolution_param {
    num_output: 32  # number of output feature maps
    pad: 2          # padding added around the border of the input image
    kernel_size: 5  # 5x5 convolution kernel
    stride: 1       # how far the kernel moves across the image at each step
    weight_filler {
      type: "gaussian"  # initialize the kernel weights from a Gaussian
      std: 0.0001
    }
    bias_filler {
      type: "constant"  # like the b in y = ax + b; here b starts as a constant
    }
  }
}
layer {
  name: "pool1"
  type: "Pooling"
  bottom: "conv1"
  top: "pool1"
  pooling_param {
    pool: MAX       # keep the maximum inside each window, e.g. split a 4x4 matrix into four 2x2 cells and take each cell's maximum
    kernel_size: 3  # window size, here 3x3
    stride: 2       # move the window two positions to reach the next region
  }
}
layer {
  name: "relu1"
  type: "ReLU"  # rectified-linear activation (the nonlinearity)
  bottom: "pool1"
  top: "pool1"
}
layer {
  name: "conv2"
  type: "Convolution"
  bottom: "pool1"
  top: "conv2"
  param { lr_mult: 1 }
  param { lr_mult: 2 }
  convolution_param {
    num_output: 32
    pad: 2
    kernel_size: 5
    stride: 1
    weight_filler { type: "gaussian" std: 0.01 }
    bias_filler { type: "constant" }
  }
}
layer {
  name: "relu2"
  type: "ReLU"
  bottom: "conv2"
  top: "conv2"
}
layer {
  name: "pool2"
  type: "Pooling"
  bottom: "conv2"
  top: "pool2"
  pooling_param {
    pool: AVE  # average pooling this time
    kernel_size: 3
    stride: 2
  }
}
layer {
  name: "conv3"
  type: "Convolution"
  bottom: "pool2"
  top: "conv3"
  param { lr_mult: 1 }
  param { lr_mult: 2 }
  convolution_param {
    num_output: 64
    pad: 2
    kernel_size: 5
    stride: 1
    weight_filler { type: "gaussian" std: 0.01 }
    bias_filler { type: "constant" }
  }
}
layer {
  name: "relu3"
  type: "ReLU"
  bottom: "conv3"
  top: "conv3"
}
layer {
  name: "pool3"
  type: "Pooling"
  bottom: "conv3"
  top: "pool3"
  pooling_param {
    pool: AVE
    kernel_size: 3
    stride: 2
  }
}
layer {
  name: "ip1"
  type: "InnerProduct"
  bottom: "pool3"
  top: "ip1"
  param { lr_mult: 1 }
  param { lr_mult: 2 }
  inner_product_param {
    num_output: 64
    weight_filler { type: "gaussian" std: 0.1 }
    bias_filler { type: "constant" }
  }
}
layer {
  name: "ip2"
  type: "InnerProduct"
  bottom: "ip1"
  top: "ip2"
  param { lr_mult: 1 }
  param { lr_mult: 2 }
  inner_product_param {
    num_output: 10
    weight_filler { type: "gaussian" std: 0.1 }
    bias_filler { type: "constant" }
  }
}
layer {
  name: "accuracy"
  type: "Accuracy"
  bottom: "ip2"
  bottom: "label"
  top: "accuracy"
  include { phase: TEST }
}
layer {
  name: "loss"
  type: "SoftmaxWithLoss"
  bottom: "ip2"
  bottom: "label"
  top: "loss"
}
```
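To see numerically what pad, kernel_size and stride do to the data shape, here is a small sketch tracing a 32x32 CIFAR-10 image through the layers above (plain Python; the formulas are the usual output-size rules, with convolution rounding down and Caffe's pooling rounding up):

```python
import math

def conv_out(size, kernel, pad, stride):
    # Convolution output size: floor((size + 2*pad - kernel) / stride) + 1
    return (size + 2 * pad - kernel) // stride + 1

def pool_out(size, kernel, stride):
    # Caffe pooling rounds up: ceil((size - kernel) / stride) + 1
    return math.ceil((size - kernel) / stride) + 1

h = 32                                      # CIFAR-10 images are 32x32
h = conv_out(h, kernel=5, pad=2, stride=1)  # conv1 -> 32
h = pool_out(h, kernel=3, stride=2)         # pool1 -> 16
h = conv_out(h, kernel=5, pad=2, stride=1)  # conv2 -> 16
h = pool_out(h, kernel=3, stride=2)         # pool2 -> 8
h = conv_out(h, kernel=5, pad=2, stride=1)  # conv3 -> 8
h = pool_out(h, kernel=3, stride=2)         # pool3 -> 4
print("ip1 sees", 64 * h * h, "inputs")     # 64 feature maps of 4x4 -> 1024
```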
The figure below shows the layer diagram that the code above defines; the detailed principles, read side by side with the code, will be covered later:

So the first step is simply to define the entire hierarchy of the network in this description language.