The first year of my master's program is already half over, and I still feel I have accomplished nothing. For most of the past half year I have only been absorbing knowledge; I have long wanted to give something back, but kept worrying that my level would lead beginners astray.
I have been learning and using Caffe for several months now and have picked up some of the related knowledge, so I am writing this post to study and discuss with fellow beginners, and I welcome guidance and criticism from the experts.
As my first post this one is organized rather simply; the main goal is to walk through the essential steps for running your own image data in Caffe. As my studies progress I will write about the related topics in more depth.
1. First, generate a train.txt listing from your image dataset; it is the index file used to convert the images into LMDB format.
```sh
#!/usr/bin/env sh
# Run this inside the dataset root: each class sits in its own subdirectory,
# and every image gets the integer label of its subdirectory.
echo "Create train.txt"
rm -rf train.txt

j=0
for dirname in `ls -F | grep /$`
do
    for filename in `find ${dirname} -type f -name '*.png'`
    do
        # paths are kept relative to the dataset root, as convert_imageset expects
        echo "${filename} ${j}" >> train.txt
    done
    j=`expr $j + 1`
done
```

2. Next, this script converts the image dataset into LMDB format:

```sh
#!/usr/bin/env sh
# Step 2: generate the LMDB file with Caffe's convert_imageset tool.
OUTDIR=/caffe/examples/mymnist
TOOLS=/caffe/build/tools
# trailing slash matters: entries in train.txt are relative to this root
TRAIN_DATA_ROOT=/caffe/examples/mymnist/training/

IND_NAME=train.txt
OUT_NAME=mean_hack_lmdb
rm -rf $OUTDIR/$OUT_NAME

# Resize all images to 28x28, the input size LeNet expects.
RESIZE=true
if $RESIZE; then
    RESIZE_HEIGHT=28
    RESIZE_WIDTH=28
else
    RESIZE_HEIGHT=0
    RESIZE_WIDTH=0
fi

if [ ! -d "$TRAIN_DATA_ROOT" ]; then
    echo "Error: TRAIN_DATA_ROOT is not a path to a directory: $TRAIN_DATA_ROOT"
    echo "Set the TRAIN_DATA_ROOT variable to the path where the training data is stored."
    exit 1
fi

echo "Creating train lmdb..."
GLOG_logtostderr=1 $TOOLS/convert_imageset \
    --resize_height=$RESIZE_HEIGHT \
    --resize_width=$RESIZE_WIDTH \
    --shuffle \
    --gray \
    $TRAIN_DATA_ROOT \
    $IND_NAME \
    $OUTDIR/$OUT_NAME
```
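If you prefer Python for step 1, here is a rough equivalent of the listing script above. `make_listfile` is my own helper name, not part of Caffe: it assigns one integer label per subdirectory (sorted, so labels are stable across runs) and writes paths relative to the dataset root, which is what `convert_imageset`'s root-folder argument expects.

```python
import os

def make_listfile(root, out_path, ext=".png"):
    """Write 'relative_path label' lines for convert_imageset."""
    # one label per class subdirectory, sorted for deterministic labels
    classes = sorted(d for d in os.listdir(root)
                     if os.path.isdir(os.path.join(root, d)))
    with open(out_path, "w") as out:
        for label, cls in enumerate(classes):
            for dirpath, _, files in os.walk(os.path.join(root, cls)):
                for f in sorted(files):
                    if f.endswith(ext):
                        rel = os.path.relpath(os.path.join(dirpath, f), root)
                        out.write("%s %d\n" % (rel, label))
```

Call it as `make_listfile("/caffe/examples/mymnist/training", "train.txt")` and feed the result to `convert_imageset` exactly as in the shell version.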
3. Write the solver file lenet_solver.prototxt; a template for it can be found in Caffe's examples folder.
```
# The train/test net protocol buffer definition
net: "caffe/examples/mymnist/lenet_train_test.prototxt"
# test_iter specifies how many forward passes the test should carry out.
# In the case of MNIST, we have test batch size 100 and 100 test iterations,
# covering the full 10,000 testing images.
test_iter: 100
# Carry out testing every 500 training iterations.
test_interval: 500
# The base learning rate, momentum and the weight decay of the network.
base_lr: 0.01
momentum: 0.9
weight_decay: 0.0005
# The learning rate policy
lr_policy: "inv"
gamma: 0.0001
power: 0.75
# Display every 100 iterations
display: 100
# The maximum number of iterations
max_iter: 10000
# snapshot intermediate results
snapshot: 5000
snapshot_prefix: "examples/mymnist/lenet"
# solver mode: CPU or GPU
solver_mode: CPU
```

4. Design the network in lenet_train_test.prototxt:

```
name: "LeNet"
layer {
  name: "mnist"
  type: "Data"
  top: "data"
  top: "label"
  include { phase: TRAIN }
  transform_param { scale: 0.00390625 }
  data_param {
    source: "caffe/examples/mymnist/mean_hack_lmdb"
    batch_size: 64
    backend: LMDB
  }
}
layer {
  name: "mnist"
  type: "Data"
  top: "data"
  top: "label"
  include { phase: TEST }
  transform_param { scale: 0.00390625 }
  data_param {
    source: "caffe/examples/mymnist/mean_hack_lmdb"
    batch_size: 100
    backend: LMDB
  }
}
layer {
  name: "conv1"
  type: "Convolution"
  bottom: "data"
  top: "conv1"
  param { lr_mult: 1 }
  param { lr_mult: 2 }
  convolution_param {
    num_output: 20
    kernel_size: 5
    stride: 1
    weight_filler { type: "xavier" }
    bias_filler { type: "constant" }
  }
}
layer {
  name: "pool1"
  type: "Pooling"
  bottom: "conv1"
  top: "pool1"
  pooling_param { pool: MAX kernel_size: 2 stride: 2 }
}
layer {
  name: "conv2"
  type: "Convolution"
  bottom: "pool1"
  top: "conv2"
  param { lr_mult: 1 }
  param { lr_mult: 2 }
  convolution_param {
    num_output: 50
    kernel_size: 5
    stride: 1
    weight_filler { type: "xavier" }
    bias_filler { type: "constant" }
  }
}
layer {
  name: "pool2"
  type: "Pooling"
  bottom: "conv2"
  top: "pool2"
  pooling_param { pool: MAX kernel_size: 2 stride: 2 }
}
layer {
  name: "ip1"
  type: "InnerProduct"
  bottom: "pool2"
  top: "ip1"
  param { lr_mult: 1 }
  param { lr_mult: 2 }
  inner_product_param {
    num_output: 500
    weight_filler { type: "xavier" }
    bias_filler { type: "constant" }
  }
}
layer {
  name: "relu1"
  type: "ReLU"
  bottom: "ip1"
  top: "ip1"
}
layer {
  name: "ip2"
  type: "InnerProduct"
  bottom: "ip1"
  top: "ip2"
  param { lr_mult: 1 }
  param { lr_mult: 2 }
  inner_product_param {
    num_output: 10
    weight_filler { type: "xavier" }
    bias_filler { type: "constant" }
  }
}
layer {
  name: "accuracy"
  type: "Accuracy"
  bottom: "ip2"
  bottom: "label"
  top: "accuracy"
  include { phase: TEST }
}
layer {
  name: "loss"
  type: "SoftmaxWithLoss"
  bottom: "ip2"
  bottom: "label"
  top: "loss"
}
```
Finally, run the training from the Caffe root directory:

```sh
./build/tools/caffe train --solver=examples/mymnist/lenet_solver.prototxt
```
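While training runs, the learning rate printed in the log follows the solver's "inv" policy, lr = base_lr * (1 + gamma * iter)^(-power). A small sketch to preview the schedule with the values from the solver above:

```python
def inv_lr(base_lr, gamma, power, it):
    # Caffe "inv" policy: base_lr * (1 + gamma * iter)^(-power)
    return base_lr * (1.0 + gamma * it) ** (-power)

# values from lenet_solver.prototxt: base_lr=0.01, gamma=0.0001, power=0.75
for it in (0, 1000, 5000, 10000):
    print("iter %5d  lr %.6f" % (it, inv_lr(0.01, 0.0001, 0.75, it)))
```

The rate starts at 0.01 and decays smoothly, so a slow drop in the displayed lr is expected behavior, not a bug.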
Lastly, pay close attention to the path settings in all of the files above; they must be consistent with where your data and prototxt files actually live.