Torch Study Notes

xiaoxiao 2025-06-18 38

Torch Notes (4): DNN Training Methods

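Training a neural network in Torch follows a fairly fixed pattern, and each step is simple. Torch is like a computer hardware vendor that has already built the CPU, graphics card, memory, hard drive, and other core parts; using Torch is like assembling a computer to suit a particular need. There is no need to reinvent the wheel, so assembly goes much faster, the barrier to entry drops, and everyone can try their hand at deep learning, which is wildly popular right now. Back to the topic: there are two common ways to train. One is to train manually with a for loop; the other is to train with the optim package. A simple XOR problem serves as the example here. The example is trivial, but the point is the training workflow.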

    -- Build the network
    require "nn"
    mlp = nn.Sequential()
    inputs = 2; outputs = 1; hiddens = 20
    mlp:add(nn.Linear(inputs, hiddens))
    mlp:add(nn.Tanh())
    mlp:add(nn.Linear(hiddens, outputs))

    -- Define the loss function
    criterion = nn.MSECriterion()

    -- Train
    for i = 1, 2500 do
      -- Random sample: the label is -1 when both inputs have the same sign, 1 otherwise
      local input = torch.randn(2)
      local output = torch.Tensor(1)
      if input[1] * input[2] > 0 then
        output[1] = -1
      else
        output[1] = 1
      end

      -- Forward pass
      criterion:forward(mlp:forward(input), output)

      -- The next three steps are the usual fixed pattern
      -- (1) Zero the gradients; otherwise they accumulate
      mlp:zeroGradParameters()
      -- (2) Compute the gradients
      mlp:backward(input, criterion:backward(mlp.output, output))
      -- (3) Update the parameters with a learning rate of 0.01
      mlp:updateParameters(0.01)
    end

    -- Quick test
    x = torch.Tensor(2)
    x[1] = 0.5; x[2] = 0.5; print(mlp:forward(x))
    x[1] = 0.5; x[2] = -0.5; print(mlp:forward(x))
    x[1] = -0.5; x[2] = 0.5; print(mlp:forward(x))
    x[1] = -0.5; x[2] = -0.5; print(mlp:forward(x))

    -- Results
    > x = torch.Tensor(2)
    > x[1] = 0.5; x[2] = 0.5; print(mlp:forward(x))
    -0.6140
    [torch.Tensor of dimension 1]
    > x[1] = 0.5; x[2] = -0.5; print(mlp:forward(x))
     0.8878
    [torch.Tensor of dimension 1]
    > x[1] = -0.5; x[2] = 0.5; print(mlp:forward(x))
     0.8548
    [torch.Tensor of dimension 1]
    > x[1] = -0.5; x[2] = -0.5; print(mlp:forward(x))
    -0.5498
    [torch.Tensor of dimension 1]
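To make the three fixed steps in the loop above more concrete, here is a minimal pure-Lua sketch of what they amount to for a single linear unit y = w * x + b trained with squared error. It does not use Torch at all. The names `unit`, `forward`, `zeroGrad`, `backward`, and `update` are made up for illustration and only mimic the roles of `zeroGradParameters`, `backward`, and `updateParameters`; the toy data set is also an assumption.

```lua
-- Hypothetical single-unit "network": y = w * x + b (illustrative names, not Torch API)
local unit = { w = 0.0, b = 0.0, gradW = 0.0, gradB = 0.0 }

local function forward(u, x)
  return u.w * x + u.b
end

-- (1) Reset the gradient buffers
local function zeroGrad(u)
  u.gradW, u.gradB = 0.0, 0.0
end

-- (2) Add the gradient of loss = (y - target)^2 into the buffers
local function backward(u, x, target)
  local dloss = 2 * (forward(u, x) - target)
  u.gradW = u.gradW + dloss * x
  u.gradB = u.gradB + dloss
end

-- (3) Gradient-descent step with learning rate lr
local function update(u, lr)
  u.w = u.w - lr * u.gradW
  u.b = u.b - lr * u.gradB
end

-- Toy data drawn from y = 2x + 1 (an assumption for this sketch)
local data = { { -1, -1 }, { 0, 1 }, { 1, 3 }, { 2, 5 } }

for epoch = 1, 2000 do
  for _, sample in ipairs(data) do
    zeroGrad(unit)
    backward(unit, sample[1], sample[2])
    update(unit, 0.05)
  end
end
print(string.format("w = %.4f, b = %.4f", unit.w, unit.b))

-- Why step (1) matters: two backward calls without a reset add up
zeroGrad(unit)
backward(unit, 1, 0)
local once = unit.gradW
backward(unit, 1, 0)
local twice = unit.gradW
print(once, twice)
```

Because `backward` adds into the buffers instead of overwriting them, skipping the reset in step (1) would make every update use the sum of all gradients seen so far. That is the accumulation the comment in the training loop warns about.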

The other approach trains with the optim package.

    -- Build the network
    require "nn"
    require "optim"
    mlp = nn.Sequential()
    inputs = 2; outputs = 1; hiddens = 20
    mlp:add(nn.Linear(inputs, hiddens))
    mlp:add(nn.Tanh())
    mlp:add(nn.Linear(hiddens, outputs))

    -- Define the loss function
    criterion = nn.MSECriterion()

    -- Prepare the data
    -- When the data set is large, it has to be split into mini-batches of batchSize samples
    local batchSize = 128
    local batchInputs = torch.Tensor(batchSize, inputs)
    local batchLabels = torch.DoubleTensor(batchSize)
    for i = 1, batchSize do
      local input = torch.randn(2)
      local label = 1
      if input[1] * input[2] > 0 then
        label = -1
      end
      batchInputs[i]:copy(input)
      batchLabels[i] = label
    end

    -- Training
    -- SGD is used here; lbfgs, adadelta, adagrad, adam, adamax, FistaLS, rprop, cmaes
    -- and many other optimizers are also available, and they are all called in much the same way.
    -- Taking SGD as the example: sgd(opfunc, x[, config][, state])
    --   opfunc: a function of a single argument x that returns f(x) and df/dx
    --   config: a table of algorithm settings:
    --     config.learningRate       learning rate
    --     config.learningRateDecay  learning rate decay
    --     config.weightDecay        weight decay
    --     config.weightDecays       per-weight decay values
    --     config.momentum           momentum
    --     config.dampening          dampening applied to the momentum
    --     config.nesterov           Nesterov momentum
    --   state: a table holding the algorithm's state
    -- Not every setting is needed; choose the ones that fit the problem.

    -- Flatten all weights and gradients into two tensors that optim can work on
    local params, gradParams = mlp:getParameters()

    -- Algorithm settings
    local optimState = {learningRate = 0.01}

    for epoch = 1, 50 do
      local function feval(params)
        gradParams:zero()
        local outputs = mlp:forward(batchInputs)
        local loss = criterion:forward(outputs, batchLabels)
        local dloss_doutput = criterion:backward(outputs, batchLabels)
        mlp:backward(batchInputs, dloss_doutput)
        return loss, gradParams
      end
      optim.sgd(feval, params, optimState)
    end

    -- Test
    x = torch.Tensor(2)
    x[1] = 0.5; x[2] = 0.5; print(mlp:forward(x))
    x[1] = 0.5; x[2] = -0.5; print(mlp:forward(x))
    x[1] = -0.5; x[2] = 0.5; print(mlp:forward(x))
    x[1] = -0.5; x[2] = -0.5; print(mlp:forward(x))

    -- Results
    > x = torch.Tensor(2)
    > x[1] = 0.5; x[2] = 0.5; print(mlp:forward(x))
    -0.3490
    [torch.Tensor of dimension 1]
    > x[1] = 0.5; x[2] = -0.5; print(mlp:forward(x))
     1.0561
    [torch.Tensor of dimension 1]
    > x[1] = -0.5; x[2] = 0.5; print(mlp:forward(x))
     0.8640
    [torch.Tensor of dimension 1]
    > x[1] = -0.5; x[2] = -0.5; print(mlp:forward(x))
    -0.2941
    [torch.Tensor of dimension 1]
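The part of the optim approach that tends to confuse at first is the contract of the function passed to the optimizer: it takes the current parameters and returns the loss together with the gradient. The pure-Lua sketch below imitates that calling convention on a tiny quadratic so it can be run without Torch. `plainSgd` is a hypothetical stand-in written for this note and is not the real `optim.sgd`: it works on plain Lua tables and ignores momentum, weight decay, and the other settings. The objective and its minimum at (3, -1) are also assumptions chosen for illustration.

```lua
-- Hypothetical stand-in for the optim calling convention (NOT the real optim.sgd):
-- opfunc(x) must return f(x) and df/dx; x is updated in place.
local function plainSgd(opfunc, x, config)
  local lr = config.learningRate or 1e-3
  local fx, dfdx = opfunc(x)
  for i = 1, #x do
    x[i] = x[i] - lr * dfdx[i]
  end
  -- Return the parameters and the loss measured before the update
  return x, { fx }
end

-- Toy objective f(x) = (x1 - 3)^2 + (x2 + 1)^2, with its gradient
local function feval(x)
  local loss = (x[1] - 3) ^ 2 + (x[2] + 1) ^ 2
  local grad = { 2 * (x[1] - 3), 2 * (x[2] + 1) }
  return loss, grad
end

local x = { 0, 0 }
local config = { learningRate = 0.1 }
local lastLoss
for step = 1, 100 do
  local _, fs = plainSgd(feval, x, config)
  lastLoss = fs[1]
end
print(string.format("x = (%.6f, %.6f), loss = %.3e", x[1], x[2], lastLoss))
```

In the Torch version, `feval` plays the same role: it zeroes `gradParams`, runs the forward and backward passes, and returns `loss, gradParams`, while the optimizer decides how to turn that gradient into a parameter update. Swapping SGD for another optimizer then only changes the optimizer call and its settings table, not `feval`.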

The next note will cover multi-class classification with a DNN.

When reposting, please credit the original URL: https://ju.6miu.com/read-1300079.html
