Rounding the weights in Caffe's convolution layer

xiaoxiao2021-12-14 19

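Preface

I am only a beginner myself. For my advisor's project I had to implement a feature in Caffe and spent a long time reading the code; thanks also to my senior labmate for his guidance. I am writing the process down here so that I don't forget it later.

1.

As we all know, convolution is generally implemented in the conv layer. Here we only discuss the forward pass of the convolution.

Below is the code in conv_layer.cpp: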

    template <typename Dtype>
    void Convolution1Layer<Dtype>::Forward_cpu(const vector<Blob<Dtype>*>& bottom,
        const vector<Blob<Dtype>*>& top) {
      const Dtype* weight = this->blobs_[0]->cpu_data();  // read-only pointer to the weights
      for (int i = 0; i < bottom.size(); ++i) {
        const Dtype* bottom_data = bottom[i]->cpu_data();  // read-only pointer to the input data
        Dtype* top_data = top[i]->mutable_cpu_data();  // writable pointer to the output data
        // num_ is the batch size
        for (int n = 0; n < this->num_; ++n) {
          this->forward_cpu_gemm(bottom_data + n * this->bottom_dim_, weight,
              top_data + n * this->top_dim_);
          if (this->bias_term_) {  // only when the bias term is enabled
            const Dtype* bias = this->blobs_[1]->cpu_data();
            this->forward_cpu_bias(top_data + n * this->top_dim_, bias);
          }
        }
      }
    }

Here the weights are read as a pointer and passed to forward_cpu_gemm. To find out what form weight actually takes, we need to look at the code of forward_cpu_gemm, which lives in base_conv_layer.cpp.

base_conv_layer.cpp

We locate forward_cpu_gemm in base_conv_layer.cpp:

    template <typename Dtype>
    void BaseConvolutionLayer<Dtype>::forward_cpu_gemm(const Dtype* input,
        const Dtype* weights, Dtype* output, bool skip_im2col) {
      const Dtype* col_buff = input;
      if (!is_1x1_) {
        if (!skip_im2col) {
          conv_im2col_cpu(input, col_buffer_.mutable_cpu_data());
        }
        col_buff = col_buffer_.cpu_data();
      }
      for (int g = 0; g < group_; ++g) {
        caffe_cpu_gemm<Dtype>(CblasNoTrans, CblasNoTrans, conv_out_channels_ /
            group_, conv_out_spatial_dim_, kernel_dim_,
            (Dtype)1., weights + weight_offset_ * g, col_buff + col_offset_ * g,
            (Dtype)0., output + output_offset_ * g);
      }
    }

As we can see, weight shows up here as an argument to caffe_cpu_gemm, which takes 10 arguments. caffe_cpu_gemm lives in math_functions.cpp; I list it here first and then explain it.

    template<>
    void caffe_cpu_gemm<float>(const CBLAS_TRANSPOSE TransA,
        const CBLAS_TRANSPOSE TransB, const int M, const int N, const int K,
        const float alpha, const float* A, const float* B, const float beta,
        float* C) {
      int lda = (TransA == CblasNoTrans) ? K : M;
      int ldb = (TransB == CblasNoTrans) ? N : K;
      cblas_sgemm(CblasRowMajor, TransA, TransB, M, N, K, alpha, A, lda, B,
          ldb, beta, C, N);
    }

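This function implements the formula C = alpha*A*B + beta*C.

Its arguments are TransA, TransB, M, N, K, alpha, A, B, beta, C. In the convolution layer, A holds the weights and B holds the input data (the column buffer). In general alpha=1 and beta=0, so C = A*B.

M: number of rows of A
K: number of columns of A, which is also the number of rows of B
N: number of columns of B

PS: rows and columns here refer to A and B viewed as matrices. In the code, A and B are one-dimensional arrays stored row by row. For example:

    A = [1, 2, 3;
         4, 5, 6;
         7, 8, 9;
         0, 1, 2]

A is a 4*3 matrix, so M=4 and K=3, but in memory A = {1,2,3,4,5,6,7,8,9,0,1,2}.

Now that we know M is the number of rows of the weight matrix, let's look at how M is assigned:

    M = conv_out_channels_ / group_;

That is, the number of output channels divided by group. It follows that each row of A holds the weights of the output channel that row represents.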
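To make the row-major indexing concrete, here is a minimal sketch of the same formula written as plain loops. The names naive_gemm and example_product are made up for this illustration and are not part of Caffe, which hands the work to cblas_sgemm. It only covers the no-transpose case used by the convolution layer.

```cpp
#include <cassert>
#include <vector>

// Naive reference for C = alpha * A * B + beta * C with row-major storage.
// A is M x K, B is K x N, C is M x N, all stored as flat arrays.
// Hypothetical helper for illustration; Caffe itself calls cblas_sgemm.
void naive_gemm(int M, int N, int K, float alpha,
                const std::vector<float>& A, const std::vector<float>& B,
                float beta, std::vector<float>& C) {
  assert(static_cast<int>(A.size()) == M * K);
  assert(static_cast<int>(B.size()) == K * N);
  assert(static_cast<int>(C.size()) == M * N);
  for (int m = 0; m < M; ++m) {
    for (int n = 0; n < N; ++n) {
      float acc = 0.0f;
      for (int k = 0; k < K; ++k) {
        // Element (m, k) of A sits at m * K + k; element (k, n) of B at k * N + n.
        acc += A[m * K + k] * B[k * N + n];
      }
      C[m * N + n] = alpha * acc + beta * C[m * N + n];
    }
  }
}

// The 4 x 3 matrix from the text, multiplied by a 3 x 2 matrix of ones,
// with alpha = 1 and beta = 0 as in the convolution layer.
std::vector<float> example_product() {
  const std::vector<float> A = {1, 2, 3, 4, 5, 6, 7, 8, 9, 0, 1, 2};
  const std::vector<float> B(3 * 2, 1.0f);
  std::vector<float> C(4 * 2, 0.0f);
  naive_gemm(4, 2, 3, 1.0f, A, B, 0.0f, C);
  return C;  // every entry in row m of C is the sum of row m of A
}
```

Because B is all ones, each row of the result just repeats the sum of the corresponding row of A, which makes it easy to check that A[m * K + k] really walks along row m.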

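At this point we need to mention the concept of group.

group is the number of filter groups. It was introduced mainly to connect the input and output channels of a convolution layer selectively; otherwise there would be too many parameters. For example, suppose the conv1 layer has 4 kernels, i.e. 4 output channels. With group=1, every output channel is connected to every input channel and all four channels are computed together. With group=2, channels 1-2 form one group and channels 3-4 form another: the output of channels 1-2 is computed first, then the output of channels 3-4. Each group is connected only to its own share of the input channels, which reduces the number of parameters.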
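The saving can be checked with a little arithmetic. The sketch below assumes the usual Caffe weight layout of out_channels x (in_channels / group) x kernel_h x kernel_w; the function name conv_weight_count is invented for this example, and the bias is not counted.

```cpp
#include <cassert>

// Number of weights in a 2-D convolution layer, bias excluded.
// Assumes a weight blob of shape
//   out_channels x (in_channels / group) x kernel_h x kernel_w.
// Hypothetical helper for illustration only.
long conv_weight_count(int out_channels, int in_channels,
                       int kernel_h, int kernel_w, int group) {
  assert(group > 0);
  // Both channel counts must split evenly into the groups.
  assert(in_channels % group == 0 && out_channels % group == 0);
  return static_cast<long>(out_channels) * (in_channels / group) *
         kernel_h * kernel_w;
}
```

For 4 input channels, 4 output channels and a 3x3 kernel, conv_weight_count(4, 4, 3, 3, 1) gives 144 weights, while conv_weight_count(4, 4, 3, 3, 2) gives 72: doubling group halves the parameter count.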

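2. Rounding

The goal of rounding is to replace each weight with alpha*round(weight/alpha).

Here alpha = max(|weight|), taken over each row of the weight matrix, i.e. over one output channel. This alpha is the rounding scale, not the alpha argument of caffe_cpu_gemm.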
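Before touching Caffe, it may help to try the formula on a single row of weights. The sketch below is a standalone illustration: round_to_scale is a made-up name, and std::round (which rounds halves away from zero) is used as the rounding function. Since |weight/alpha| is at most 1, round(weight/alpha) can only be -1, 0 or 1, so every weight ends up as -alpha, 0 or alpha.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

// Applies w -> alpha * round(w / alpha) to one row of weights,
// with alpha = max(|w|) over that row.
// Hypothetical helper for illustration; not part of Caffe.
std::vector<float> round_to_scale(const std::vector<float>& w) {
  float scale = 0.0f;
  for (float v : w) {
    scale = std::max(scale, std::fabs(v));
  }
  std::vector<float> out(w.size(), 0.0f);
  if (scale == 0.0f) {
    return out;  // all-zero row: avoid dividing by zero
  }
  for (std::size_t i = 0; i < w.size(); ++i) {
    // w[i] / scale lies in [-1, 1], so the rounded value is -1, 0 or 1.
    out[i] = scale * std::round(w[i] / scale);
  }
  return out;
}
```

For the row {0.8, -0.3, 0.5, -0.8, 0.1} the scale is 0.8, and the result is {0.8, 0, 0.8, -0.8, 0}: weights whose magnitude is at least half the scale snap to plus or minus the scale, the rest become zero.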

Without further ado, here is the code.

base_conv_layer.cpp

      for (int g = 0; g < group_; ++g) {
        caffe_cpu_gemm_Round<Dtype>(CblasNoTrans, CblasNoTrans, conv_out_channels_ /
            group_, conv_out_spatial_dim_, kernel_dim_,
            (Dtype)1., weights + weight_offset_ * g, col_buff + col_offset_ * g,
            (Dtype)0., output + output_offset_ * g);
      }
    }

math_functions.cpp

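    // Additional headers needed by the code below.
    #include <algorithm>
    #include <cmath>
    #include <vector>

    template<>
    int round<float>(float number) {
      // Round half away from zero; the cast truncates toward zero.
      return static_cast<int>((number > 0.0f) ? (number + 0.5f) : (number - 0.5f));
    }

    // caffe_cpu_gemm_Round
    template<>
    void caffe_cpu_gemm_Round<float>(const CBLAS_TRANSPOSE TransA,
        const CBLAS_TRANSPOSE TransB, const int M, const int N, const int K,
        const float alpha, const float* A, const float* B, const float beta,
        float* C) {
      // A is the M x K weight matrix in row-major order (TransA == CblasNoTrans,
      // as in the call from base_conv_layer.cpp). A is const, so the rounded
      // weights go into a local copy rather than being written through A.
      std::vector<float> rounded(A, A + M * K);
      for (int m = 0; m < M; ++m) {
        // Per-row scale: the largest absolute weight of this output channel.
        float max_abs = 0.0f;
        for (int k = 0; k < K; ++k) {
          max_abs = std::max(max_abs, std::fabs(A[m * K + k]));
        }
        if (max_abs == 0.0f) continue;  // all-zero row: nothing to round
        for (int k = 0; k < K; ++k) {
          rounded[m * K + k] = max_abs * round<float>(A[m * K + k] / max_abs);
        }
      }
      // Use the rounded weights in place of the original ones.
      int lda = (TransA == CblasNoTrans) ? K : M;
      int ldb = (TransB == CblasNoTrans) ? N : K;
      cblas_sgemm(CblasRowMajor, TransA, TransB, M, N, K, alpha, rounded.data(),
          lda, B, ldb, beta, C, N);
    }

A double version of the function also has to be added; it is not pasted here.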

math_functions.hpp: declare the functions defined in the cpp file

    template <typename Dtype>
    int round(Dtype number);

    template <typename Dtype>
    void caffe_cpu_gemm_Round(const CBLAS_TRANSPOSE TransA,
        const CBLAS_TRANSPOSE TransB, const int M, const int N, const int K,
        const Dtype alpha, const Dtype* A, const Dtype* B, const Dtype beta,
        Dtype* C);

When reposting, please cite the original URL: https://ju.6miu.com/read-964025.html
