How do I compute the gradient of the loss function with respect to an arbitrary layer/weight in Torch?

I am transitioning from Theano to Torch, so please bear with me. In Theano, computing the gradient of the loss function with respect to a specific weight is fairly straightforward. I'm wondering, how can one do this in Torch?

Assume we have the following code which generates some data/labels and defines a model:

t = require 'torch'
require 'nn'
require 'cunn'
require 'cutorch'

-- Generate random labels
function randLabels(nExamples, nClasses)
    -- nClasses: number of classes
    -- nExamples: number of examples
    label = {}
    for i=1, nExamples do
        label[i] = t.random(1, nClasses)
    end
    return t.FloatTensor(label)
end

inputs = t.rand(1000, 3, 32, 32) -- 1000 examples, 3 color channels, 32x32 images
inputs = inputs:cuda()
labels = randLabels(inputs:size()[1],10)
labels = labels:cuda()

net = nn.Sequential()
net:add(nn.SpatialConvolution(3, 6, 5, 5))
net:add(nn.ReLU())
net:add(nn.SpatialMaxPooling(2, 2, 2, 2))
net:add(nn.View(6*14*14))
net:add(nn.Linear(6*14*14, 300))
net:add(nn.ReLU())
net:add(nn.Linear(300, 10))
net = net:cuda()

-- Loss
criterion = nn.CrossEntropyCriterion()
criterion = criterion:cuda()
forwardPass = net:forward(inputs)
net:zeroGradParameters()

-- How do I compute this?
dEd_WeightsOfLayer1

forwardPass = nil
net = nil
criterion = nil
inputs = nil
labels = nil

collectgarbage()

How can I compute the gradient with respect to the weights of the convolutional layer?

user2838606

OK, I found the answer (thanks to alban desmaison on the Torch7 Google Group). The code in the question has a bug and doesn't work, so I rewrote it. Here is how you can get the gradients per node/parameter:

t=require 'torch'
require 'cunn'
require 'nn'
require 'cutorch'

-- A function to generate some random labels
function randLabels(nExamples, nClasses)
    -- nClasses: number of classes
    -- nExamples: number of examples
    label={}
    for i=1,nExamples do
        label[i] = t.random(1, nClasses)
    end
    return t.FloatTensor(label)
end

-- Declare some variables
nClass=10
kernelSize=5
stride=2
poolKernelSize=2
nData=100
nChannel=3
imageSize=32

-- Generate some [random] data
data=t.rand(nData, nChannel, imageSize, imageSize) -- 100 Random images with 3 channels
data=data:cuda() -- Transfer to the GPU (remove this line if you're not using GPU)
label=randLabels(data:size()[1], nClass)
label=label:cuda() -- Transfer to the GPU (remove this line if you're not using GPU)

-- Define model
net=nn.Sequential()
net:add(nn.SpatialConvolution(3, 6, 5, 5))
net:add(nn.ReLU())
net:add(nn.SpatialMaxPooling(poolKernelSize, poolKernelSize, stride, stride))
net:add(nn.View(6*14*14))
net:add(nn.Linear(6*14*14, 350))
net:add(nn.ReLU())
net:add(nn.Linear(350, 10))
net=net:cuda() -- Transfer to the GPU (remove this line if you're not using GPU)

criterion=nn.CrossEntropyCriterion()
criterion=criterion:cuda() -- Transfer to the GPU (remove this line if you're not using GPU)

-- Do forward pass and get the gradient for each node/parameter:

net:forward(data) -- Do the forward propagation
criterion:forward(net.output, label) -- Compute the overall negative log-likelihood error
criterion:backward(net.output, label); -- Don't forget to put ';'. Otherwise you'll get everything printed on the screen
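-- Note: this line is an addition, not part of the original answer. Gradient
-- buffers in nn accumulate across backward passes, so if you run this code
-- more than once, zero them before calling net:backward():
net:zeroGradParameters()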
net:backward(data, criterion.gradInput); -- Don't forget to put ';'. Otherwise you'll get everything printed on the screen

-- Now you can access the gradient values

layer1InputGrad = net:get(1).gradInput
layer1WeightGrads = net:get(1).gradWeight
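-- Addition, not in the original answer: the bias gradients of a
-- parameterized layer are stored alongside the weight gradients:
layer1BiasGrads = net:get(1).gradBias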

net = nil
data = nil
label = nil
criterion = nil

Copy and paste the code and it works like a charm :)
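If you'd rather have all the gradients in one flat tensor instead of walking the layers one by one (e.g. to feed them into optim), nn's `getParameters` returns flattened views that share storage with the per-layer tensors such as `net:get(1).gradWeight`. A minimal sketch along the lines of the code above (call `getParameters` only once per network, since it re-flattens the storages):

```lua
-- Flattened views over all parameters and their gradients
params, gradParams = net:getParameters()

net:zeroGradParameters()            -- clear accumulated gradients
net:forward(data)
criterion:forward(net.output, label)
criterion:backward(net.output, label)
net:backward(data, criterion.gradInput)

-- gradParams now holds dE/dW for every parameter in the network,
-- in the same storage that each layer's gradWeight/gradBias points into
print(gradParams:norm())
```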

2016-04-07 23:25:54