Torch: Learning Lua Programming




    Lua is pretty close to JavaScript: variables are global by default unless declared with the local keyword, and it has only one data structure, the table {}, which serves as both a hash table and an array. Indexing starts from 1. foo:bar() is syntactic sugar for foo.bar(foo).
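    A tiny illustration of these points (my own example, not from the original notes):

    local t = {}                 -- the single data structure: a table
    t[1] = 'array style'         -- used as an array (indexing starts at 1)
    t.key = 'hash style'         -- used as a hash table
    print(t[1], t.key)

    obj = { name = 'torch' }
    function obj.greet(self) print('hi ' .. self.name) end
    obj:greet()                  -- syntactic sugar for obj.greet(obj)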

    Start

    Strings: as in Python, both single and double quotes work. print: an output function; the parentheses can follow the function name directly or after a space. table: declare a table with a = {}, then a[1] = 'hello', a[2] = 30. Length: #b gives the length of b. In practice, # works on strings and tables, but not on numbers.
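    A few quick checks of the points above (my own illustration):

    print('hello')       -- single quotes
    print("world")       -- double quotes work too
    print ('spaced')     -- a space before the parentheses is also fine
    a = {}
    a[1] = 'hello'
    a[2] = 30
    print(#a)            -- 2: length of the table
    print(#'hello')      -- 5: length of a string
    -- print(#10)        -- error: the length operator does not work on numbers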

    for loop:

    Print every element of table b:

    for i=1,#b do print(b[i]) end

    Tensors: matrices (and higher-dimensional arrays) are all represented as tensors.

    a = torch.Tensor(5,3)      -- construct a 5x3 matrix, uninitialized
    b = torch.rand(3,3)        -- construct a 3x3 matrix with uniform random entries
    a*b                        -- matrix multiplication
    torch.mm(a,b)              -- same result as a*b
    c = torch.Tensor(5,3)
    c:mm(a,b)                  -- store the result of a*b in c
    -- Addition does not require identical shapes, only the same number of elements;
    -- for example a 2x2 matrix and a 1x4 matrix can be added.
    y = torch.add(a, b)        -- returns a new Tensor
    torch.add(y, a, b)         -- puts a + b in y
    a:add(b)                   -- accumulates all elements of b into a
    y:add(a, b)                -- puts a + b in y
    x:add(2, y)                -- x = x + 2*y
    [res] torch.mul([res,] tensor1, value)  -- same pattern as add: the result is tensor1 * value

    Comments: two dashes (--) comment out the rest of a line.

    CUDA tensors: tensors can be moved onto the GPU using the :cuda() function (provided by the cutorch package).

    require 'cutorch';
    a = a:cuda()
    b = b:cuda()
    c = c:cuda()
    c:mm(a,b)   -- done on the GPU

    Now for the main part: networks. require 'nn' loads the nn package, which provides the building blocks for network models.

    LeNet: a simple network structure. Its container is nn.Sequential, which feeds the input forward through its layers one after another.

    net = nn.Sequential()                      -- the container for the network; the following lines add layers to it
    net:add(nn.SpatialConvolution(1, 6, 5, 5)) -- 1 input image channel, 6 output channels, 5x5 convolution kernel
    net:add(nn.ReLU())                         -- non-linearity
    net:add(nn.SpatialMaxPooling(2,2,2,2))     -- a max-pooling operation that looks at 2x2 windows and finds the max
    net:add(nn.SpatialConvolution(6, 16, 5, 5))
    net:add(nn.ReLU())                         -- non-linearity
    net:add(nn.SpatialMaxPooling(2,2,2,2))
    net:add(nn.View(16*5*5))                   -- reshapes from a 3D tensor of 16x5x5 into a 1D tensor of 16*5*5
    net:add(nn.Linear(16*5*5, 120))            -- fully connected layer (matrix multiplication between input and weights)
    net:add(nn.ReLU())                         -- non-linearity
    net:add(nn.Linear(120, 84))
    net:add(nn.ReLU())                         -- non-linearity
    net:add(nn.Linear(84, 10))                 -- 10 is the number of outputs of the network (in this case, 10 digits)
    net:add(nn.LogSoftMax())                   -- converts the output to a log-probability; useful for classification problems
    print('Lenet5\n' .. net:__tostring());     -- print the network structure

    Output:

    Lenet5
    nn.Sequential {
      [input -> (1) -> (2) -> (3) -> (4) -> (5) -> (6) -> (7) -> (8) -> (9) -> (10) -> (11) -> (12) -> (13) -> output]
      (1): nn.SpatialConvolution(1 -> 6, 5x5)
      (2): nn.ReLU
      (3): nn.SpatialMaxPooling(2x2, 2,2)
      (4): nn.SpatialConvolution(6 -> 16, 5x5)
      (5): nn.ReLU
      (6): nn.SpatialMaxPooling(2x2, 2,2)
      (7): nn.View(400)
      (8): nn.Linear(400 -> 120)
      (9): nn.ReLU
      (10): nn.Linear(120 -> 84)
      (11): nn.ReLU
      (12): nn.Linear(84 -> 10)
      (13): nn.LogSoftMax
    }

    The example above is simple, just a sequential network. Besides the Sequential container, nn also provides other containers (such as Concat and Parallel) that can be used later to build more complex network topologies.
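    As a rough sketch of what a non-sequential container looks like (my own example with made-up layer sizes, not part of the original tutorial), nn.Concat runs each branch on the same input and concatenates the outputs:

    branches = nn.Concat(1)              -- concatenate branch outputs along dimension 1
    branches:add(nn.Linear(10, 5))       -- first branch
    branches:add(nn.Linear(10, 3))       -- second branch
    net2 = nn.Sequential()
    net2:add(branches)
    net2:add(nn.ReLU())
    print(net2:forward(torch.rand(10)))  -- an 8-dimensional output (5 + 3)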

    Passing data through the network: :forward(input) and :backward(input, gradient).

    input = torch.rand(1,32,32)                      -- an arbitrary random input
    output = net:forward(input)                      -- forward pass only
    print(output)                                    -- prints the LogSoftMax output
    net:zeroGradParameters()                         -- reset the network's gradient parameters to zero
    gradInput = net:backward(input, torch.rand(10))  -- backward pass with input `input` and a random output gradient torch.rand(10)
    print(#gradInput)                                -- backward propagates gradients back to the input, so gradInput has the same size as input;
                                                     -- the gradient here is random, so the values are not meaningful, this only shows the mechanics

    Loss functions (criteria): in Torch, loss functions work just like network modules and are automatically differentiable. They have two functions: forward(input, target) and backward(input, target).

    criterion = nn.ClassNLLCriterion()          -- a negative log-likelihood criterion for multi-class classification
    criterion:forward(output, 3)                -- suppose 3 is the correct class; call forward first
    gradients = criterion:backward(output, 3)   -- then backward to get the gradients; note that forward and backward are called on the criterion
    gradInput = net:backward(input, gradients)

    How to wire these pieces together into a complete training step was still unclear to me at this point; a minimal sketch is given below.
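    Here is a minimal sketch of one manual training step that chains the calls above together (my own reconstruction, assuming net, criterion, an input tensor, and an integer class target already exist):

    net:zeroGradParameters()                         -- clear the accumulated gradients
    output = net:forward(input)                      -- forward pass through the network
    loss = criterion:forward(output, target)         -- evaluate the loss
    gradOutput = criterion:backward(output, target)  -- gradient of the loss w.r.t. the network output
    net:backward(input, gradOutput)                  -- backpropagate; accumulates gradWeight/gradBias
    net:updateParameters(0.01)                       -- weight = weight - 0.01 * gradWeight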

    Learnable parameters: convolution layers learn weights and biases; pooling layers have no learnable parameters. In addition, each learnable layer stores gradWeight and gradBias, the gradients accumulated for those parameters during the backward pass (see the small example below).
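    A quick way to inspect these tensors on the LeNet defined earlier (my own illustration, not from the original notes):

    conv = net:get(1)         -- the first module of the Sequential container (the first convolution layer)
    print(#conv.weight)       -- the learnable weights
    print(#conv.bias)         -- the learnable biases
    print(#conv.gradWeight)   -- gradient accumulated for the weights by :backward()
    print(#conv.gradBias)     -- gradient accumulated for the biases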

    Training the network: the simplest update rule is gradient descent, weight = weight - learningRate * gradWeight [equation 1]. Below, stochastic gradient descent is demonstrated on 2D data; image data comes later.

    dataset = {};
    function dataset:size() return 100 end  -- 100 examples
    for i=1,dataset:size() do
      local input = torch.randn(2);     -- normally distributed example in 2d
      local output = torch.Tensor(1);
      if input[1]*input[2] > 0 then     -- calculate label for XOR function
        output[1] = -1;
      else
        output[1] = 1
      end
      dataset[i] = {input, output}
    end

    [neural network]: here we create a multi-layer perceptron with one hidden layer.

    require "nn" mlp = nn.Sequential(); -- make a multi-layer perceptron inputs = 2; outputs = 1; HUs = 20; -- parameters mlp:add(nn.Linear(inputs, HUs)) mlp:add(nn.Tanh()) mlp:add(nn.Linear(HUs, outputs))

    [training]: we use mean squared error as the loss to minimize and train with stochastic gradient descent.

    criterion = nn.MSECriterion()                    -- define the criterion
    trainer = nn.StochasticGradient(mlp, criterion)  -- the trainer takes the network and the criterion
    trainer.learningRate = 0.01                      -- learning rate
    trainer:train(dataset)                           -- start training

    Testing the network:

    x = torch.Tensor(2)
    x[1] =  0.5; x[2] =  0.5; print(mlp:forward(x))
    x[1] =  0.5; x[2] = -0.5; print(mlp:forward(x))
    x[1] = -0.5; x[2] =  0.5; print(mlp:forward(x))
    x[1] = -0.5; x[2] = -0.5; print(mlp:forward(x))

    The raw outputs look a bit messy; I think this is because they are not passed through a softmax.

    x[1] = 0.5; x[2] = 0.5; print(mlp:forward(x))
    -0.3490
    [torch.Tensor of dimension 1]

    The complete network workflow

    Load and normalize the data

    Using the CIFAR-10 dataset.

    require 'paths'
    if (not paths.filep("cifar10torchsmall.zip")) then
        os.execute('wget -c https://s3.amazonaws.com/torch7/data/cifar10torchsmall.zip')  -- download
        os.execute('unzip cifar10torchsmall.zip')                                         -- unzip
    end
    trainset = torch.load('cifar10-train.t7')  -- load the training data
    testset = torch.load('cifar10-test.t7')    -- load the test data
    classes = {'airplane', 'automobile', 'bird', 'cat', 'deer', 'dog', 'frog', 'horse', 'ship', 'truck'}
    itorch.image(trainset.data[100])           -- display the 100-th image in the dataset
    itorch.image(trainset[100][1])             -- same result as the line above (only works after the setmetatable call below)
    print(classes[trainset.label[100]])        -- print the class label
    -- ignore setmetatable for now, it is a feature beyond the scope of this tutorial. It sets the index operator.
    setmetatable(trainset,
        {__index = function(t, i) return {t.data[i], t.label[i]} end}
    );
    trainset.data = trainset.data:double()     -- convert the data from a ByteTensor to a DoubleTensor
    function trainset:size()
        return self.data:size(1)
    end
    itorch.image(trainset[33][1])

    Normalize the data: compute the per-channel mean and standard deviation.

    mean = {} -- store the mean, to normalize the test set in the future
    stdv = {} -- store the standard-deviation for the future
    for i=1,3 do -- over each image channel
        mean[i] = trainset.data[{ {}, {i}, {}, {} }]:mean()  -- mean estimation
        print('Channel ' .. i .. ', Mean: ' .. mean[i])
        trainset.data[{ {}, {i}, {}, {} }]:add(-mean[i])     -- mean subtraction
        stdv[i] = trainset.data[{ {}, {i}, {}, {} }]:std()   -- std estimation
        print('Channel ' .. i .. ', Standard Deviation: ' .. stdv[i])
        trainset.data[{ {}, {i}, {}, {} }]:div(stdv[i])      -- std scaling
    end

    That completes the data preprocessing.

    Define the neural network:

    net = nn.Sequential()
    net:add(nn.SpatialConvolution(3, 6, 5, 5)) -- 3 input image channels (the input data has 3 channels), 6 output channels, 5x5 convolution kernel
    net:add(nn.ReLU())                         -- non-linearity
    net:add(nn.SpatialMaxPooling(2,2,2,2))     -- a max-pooling operation that looks at 2x2 windows and finds the max
    net:add(nn.SpatialConvolution(6, 16, 5, 5))
    net:add(nn.ReLU())                         -- non-linearity
    net:add(nn.SpatialMaxPooling(2,2,2,2))
    net:add(nn.View(16*5*5))                   -- reshapes from a 3D tensor of 16x5x5 into a 1D tensor of 16*5*5
    net:add(nn.Linear(16*5*5, 120))            -- fully connected layer (matrix multiplication between input and weights)
    net:add(nn.ReLU())                         -- non-linearity
    net:add(nn.Linear(120, 84))
    net:add(nn.ReLU())                         -- non-linearity
    net:add(nn.Linear(84, 10))                 -- 10 is the number of outputs of the network (here, the 10 classes)
    net:add(nn.LogSoftMax())                   -- converts the output to a log-probability; useful for classification problems

    Define the loss function:

    criterion = nn.ClassNLLCriterion()         -- negative log-likelihood

    Train the network on the training data:

    trainer = nn.StochasticGradient(net, criterion)  -- stochastic gradient descent trainer
    trainer.learningRate = 0.001
    trainer.maxIteration = 5                   -- just do 5 epochs of training
    trainer:train(trainset)

    Test the network on the test data and print the accuracy.

    First, preprocess the test images:

    testset.data = testset.data:double()   -- convert from a ByteTensor to a DoubleTensor
    for i=1,3 do -- over each image channel
        testset.data[{ {}, {i}, {}, {} }]:add(-mean[i]) -- mean subtraction
        testset.data[{ {}, {i}, {}, {} }]:div(stdv[i])  -- std scaling
    end

    print(classes[testset.label[100]])
    itorch.image(testset.data[100])
    predicted = net:forward(testset.data[100])
    -- the output of the network is log-probabilities; to convert them to probabilities, take e^x
    print(predicted:exp())

    -- predict the whole test set
    correct = 0
    for i=1,10000 do
        local groundtruth = testset.label[i]
        local prediction = net:forward(testset.data[i])
        local confidences, indices = torch.sort(prediction, true)  -- true means sort in descending order
        if groundtruth == indices[1] then
            correct = correct + 1
        end
    end
    print(correct, 100*correct/10000 .. ' % ')  -- overall accuracy

    -- what are the classes that performed well, and the classes that did not perform well:
    class_performance = {0, 0, 0, 0, 0, 0, 0, 0, 0, 0}
    for i=1,10000 do
        local groundtruth = testset.label[i]
        local prediction = net:forward(testset.data[i])
        local confidences, indices = torch.sort(prediction, true)  -- true means sort in descending order
        if groundtruth == indices[1] then
            class_performance[groundtruth] = class_performance[groundtruth] + 1
        end
    end
    for i=1,#classes do
        print(classes[i], 100*class_performance[i]/1000 .. ' %')
    end

    How to use the GPU: convert the data, the model, and the criterion to CUDA mode, and everything else stays the same.

    require 'cunn';
    net = net:cuda()
    criterion = criterion:cuda()
    trainset.data = trainset.data:cuda()
    trainset.label = trainset.label:cuda()
    trainer = nn.StochasticGradient(net, criterion)
    trainer.learningRate = 0.001
    trainer.maxIteration = 5  -- just do 5 epochs of training
    trainer:train(trainset)

    Reference 1. https://github.com/soumith/cvpr2015/blob/master/Deep%20Learning%20with%20Torch.ipynb


    EMMA SIAT 2017.03.13

    Please credit the original source when reposting: https://ju.6miu.com/read-36361.html
