1 Introduction
In deep learning, simply stacking more layers to make a network deeper does not necessarily improve its performance. In addition, deep networks are prone to the vanishing gradient problem during training: as the gradient is back-propagated to earlier layers, repeated multiplication can make it vanishingly small.
ResNet introduces residual learning to address this degradation problem. Consider a stack of a few layers: when the input is x, denote the features learned by the stack as F(x). Now add a branch that skips directly to the output of the stack, so the final output becomes H(x) = F(x) + x, as shown in the figure below.
This kind of skip connection is called a shortcut connection (analogous to a short circuit in an electrical circuit). The two-layer structure above is called a BasicBlock and is typically used in ResNet18 and ResNet34, while ResNet50 and deeper networks use the three-layer residual structure below, called a Bottleneck.
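In code, the residual idea boils down to adding the block's input back onto its output before the final activation. Below is a minimal sketch of that idea (the layer shapes are placeholders, not taken from any particular ResNet variant):
import torch
from torch import nn

class TinyResidualBlock(nn.Module):
    def __init__(self, channels):
        super().__init__()
        # F(x): two 3x3 convolutions that keep the spatial size unchanged
        self.conv1 = nn.Conv2d(channels, channels, kernel_size=3, padding=1)
        self.conv2 = nn.Conv2d(channels, channels, kernel_size=3, padding=1)

    def forward(self, x):
        out = torch.relu(self.conv1(x))
        out = self.conv2(out)
        return torch.relu(out + x)  # H(x) = F(x) + x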
Before reading on, it is assumed that you already know what image convolution and fully connected layers are, and that you are comfortable with Python and PyTorch.
2 Convolution in PyTorch
2.1 torch.nn.Conv2d
See the official documentation for details: https://pytorch.org/docs/1.8.0/generated/torch.nn.Conv2d.html?highlight=nn%20conv2d#torch.nn.Conv2d
1. Function signature
torch.nn.Conv2d(in_channels, out_channels, kernel_size, stride=1, padding=0, dilation=1, groups=1, bias=True, padding_mode='zeros')
2. Parameters
- in_channels: number of input channels
- out_channels: number of output channels
- kernel_size: size of the convolution kernel
- stride: stride of the convolution
- padding: amount of padding added to each side of the input
2.2 Convolution output size
In PyTorch, a batch of images is usually stored in a tensor of shape (b, c, h, w),
where
- b is the batch size
- c is the number of channels
- h is the image height
- w is the image width
After the convolution, the output size of an image is given by the following formula:
output = \left\lfloor \frac{input + 2 \times padding - kernel}{stride} \right\rfloor + 1
For example, if the input image size is (64, 64) with padding=1, kernel=3 and stride=2, the formula above gives an output size of (32, 32) after the convolution.
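The formula is easy to verify directly in PyTorch. The snippet below (the channel counts are arbitrary, chosen only for illustration) applies such a convolution to a 64 \times 64 input and prints the resulting shape:
import torch
from torch import nn

conv = nn.Conv2d(in_channels=3, out_channels=16, kernel_size=3, stride=2, padding=1)
x = torch.randn(1, 3, 64, 64)   # (b, c, h, w)
print(conv(x).shape)            # torch.Size([1, 16, 32, 32])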
3 ResNet
ResNet comes in several variants, such as ResNet18, ResNet34 and ResNet50, as shown in the table below.
In the red box highlighted in the figure above, 3 \times 3 is the kernel size, 64 is the number of filters, and \times 2 is the number of times the block is repeated.
ResNet is built from three main parts:
- the input layer, conv1 + max pooling, usually counted as layer 0
- the ResBlocks, layers 1 through 4
- average pooling + a fully connected layer, as the final layer
3.1 ResNet18
Let us start with the simplest variant, ResNet18. Assume the input image is 224 \times 224 and the output has 1000 classes.
The ResNet18 architecture is shown below,
or, equivalently,
3.1.1 ResBlock
From the two ResNet18 architecture diagrams above, a ResBlock comes in two variants: one downsamples by using stride=2 in its convolution, the other keeps a stride of 1. Both cases need to be handled in code. The ResBlock implementation is as follows:
class ResBlock(nn.Module):
    def __init__(self, in_channels, out_channels, downsample):
        super().__init__()
        if downsample:
            # Downsampling block: the first conv uses stride 2, and the shortcut needs a
            # 1x1 conv so its output matches the new spatial size and channel count.
            self.conv1 = nn.Conv2d(in_channels, out_channels, kernel_size=3, stride=2, padding=1)
            self.shortcut = nn.Sequential(
                nn.Conv2d(in_channels, out_channels, kernel_size=1, stride=2),
                nn.BatchNorm2d(out_channels)
            )
        else:
            # Plain block: stride 1 and an identity shortcut.
            self.conv1 = nn.Conv2d(in_channels, out_channels, kernel_size=3, stride=1, padding=1)
            self.shortcut = nn.Sequential()
        self.conv2 = nn.Conv2d(out_channels, out_channels, kernel_size=3, stride=1, padding=1)
        self.bn1 = nn.BatchNorm2d(out_channels)
        self.bn2 = nn.BatchNorm2d(out_channels)

    def forward(self, input):
        shortcut = self.shortcut(input)
        input = nn.ReLU()(self.bn1(self.conv1(input)))
        input = nn.ReLU()(self.bn2(self.conv2(input)))
        input = input + shortcut  # the residual connection: F(x) + x
        return nn.ReLU()(input)
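A quick way to confirm that both variants behave as intended is to push a dummy tensor through them; the shapes below are only an illustration:
import torch

block = ResBlock(64, 64, downsample=False)
down = ResBlock(64, 128, downsample=True)
x = torch.randn(1, 64, 56, 56)
print(block(x).shape)  # torch.Size([1, 64, 56, 56])  -- size unchanged
print(down(x).shape)   # torch.Size([1, 128, 28, 28]) -- channels doubled, spatial size halved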
3.1.2 Input layer (layer 0)
The input layer, i.e. layer 0, consists of a 7 \times 7 convolution followed by a 3 \times 3 max pooling layer; the code is as follows:
self.layer0 = nn.Sequential(
nn.Conv2d(in_channels, 64, kernel_size=7, stride=2, padding=3),
nn.MaxPool2d(kernel_size=3, stride=2, padding=1),
nn.BatchNorm2d(64),
nn.ReLU()
)
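With a 224 \times 224 input, the stride-2 convolution and the stride-2 max pooling each halve the spatial size, so layer 0 outputs 64 feature maps of size 56 \times 56. A minimal check, using a standalone copy of the layer above with a 3-channel input assumed:
import torch
from torch import nn

layer0 = nn.Sequential(
    nn.Conv2d(3, 64, kernel_size=7, stride=2, padding=3),
    nn.MaxPool2d(kernel_size=3, stride=2, padding=1),
    nn.BatchNorm2d(64),
    nn.ReLU()
)
x = torch.randn(1, 3, 224, 224)
print(layer0(x).shape)  # torch.Size([1, 64, 56, 56])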
3.1.3 Final layer
The final layer consists of global average pooling followed by a fully connected layer; the code is as follows:
        self.gap = torch.nn.AdaptiveAvgPool2d(1)
        self.fc = torch.nn.Linear(512, outputs)  # ResNet18: 512 channels come out of the last ResBlock
Global average pooling is illustrated in the figure below:
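Put differently, AdaptiveAvgPool2d(1) collapses each feature map to a single value, so the classifier receives one number per channel regardless of the spatial resolution. A small illustration (the shapes are arbitrary):
import torch
from torch import nn

gap = nn.AdaptiveAvgPool2d(1)
x = torch.randn(1, 512, 7, 7)            # feature maps from the last ResBlock
y = torch.flatten(gap(x), start_dim=1)   # global average pooling, then flatten per sample
print(y.shape)                           # torch.Size([1, 512])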
3.1.4 ResNet18 code
Combining the pieces above gives the full ResNet18:
class ResNet18(nn.Module):
def __init__(self, in_channels, resblock, outputs=1000):
super().__init__()
self.layer0 = nn.Sequential(
nn.Conv2d(in_channels, 64, kernel_size=7, stride=2, padding=3),
nn.MaxPool2d(kernel_size=3, stride=2, padding=1),
nn.BatchNorm2d(64),
nn.ReLU()
)
self.layer1 = nn.Sequential(
resblock(64, 64, downsample=False),
resblock(64, 64, downsample=False)
)
self.layer2 = nn.Sequential(
resblock(64, 128, downsample=True),
resblock(128, 128, downsample=False)
)
self.layer3 = nn.Sequential(
resblock(128, 256, downsample=True),
resblock(256, 256, downsample=False)
)
self.layer4 = nn.Sequential(
resblock(256, 512, downsample=True),
resblock(512, 512, downsample=False)
)
self.gap = torch.nn.AdaptiveAvgPool2d(1)
self.fc = torch.nn.Linear(512, outputs)
def forward(self, input):
input = self.layer0(input)
input = self.layer1(input)
input = self.layer2(input)
input = self.layer3(input)
input = self.layer4(input)
input = self.gap(input)
        input = torch.flatten(input, start_dim=1)  # keep the batch dimension
input = self.fc(input)
return input
We can use torchsummary to print a summary of the model above:
from torchsummary import summary
resnet18 = ResNet18(3, ResBlock, outputs=1000)
resnet18.to(torch.device("cuda:0" if torch.cuda.is_available() else "cpu"))
summary(resnet18, (3, 224, 224))
3.2 ResNet34
ResNet34 shares the input layer and the final layer with ResNet18; only the number of ResBlocks repeated in layer1 through layer4 changes. The ResNet34 code is as follows:
class ResNet34(nn.Module):
def __init__(self, in_channels, resblock, outputs=1000):
super().__init__()
self.layer0 = nn.Sequential(
nn.Conv2d(in_channels, 64, kernel_size=7, stride=2, padding=3),
nn.MaxPool2d(kernel_size=3, stride=2, padding=1),
nn.BatchNorm2d(64),
nn.ReLU()
)
self.layer1 = nn.Sequential(
resblock(64, 64, downsample=False),
resblock(64, 64, downsample=False),
resblock(64, 64, downsample=False)
)
self.layer2 = nn.Sequential(
resblock(64, 128, downsample=True),
resblock(128, 128, downsample=False),
resblock(128, 128, downsample=False),
resblock(128, 128, downsample=False)
)
self.layer3 = nn.Sequential(
resblock(128, 256, downsample=True),
resblock(256, 256, downsample=False),
resblock(256, 256, downsample=False),
resblock(256, 256, downsample=False),
resblock(256, 256, downsample=False),
resblock(256, 256, downsample=False)
)
self.layer4 = nn.Sequential(
resblock(256, 512, downsample=True),
resblock(512, 512, downsample=False),
resblock(512, 512, downsample=False),
)
self.gap = torch.nn.AdaptiveAvgPool2d(1)
self.fc = torch.nn.Linear(512, outputs)
def forward(self, input):
input = self.layer0(input)
input = self.layer1(input)
input = self.layer2(input)
input = self.layer3(input)
input = self.layer4(input)
input = self.gap(input)
        input = torch.flatten(input, start_dim=1)  # keep the batch dimension
input = self.fc(input)
return input
We can again use torchsummary to inspect the network:
resnet34 = ResNet34(3, ResBlock, outputs=1000)
resnet34.to(torch.device("cuda:0" if torch.cuda.is_available() else "cpu"))
summary(resnet34, (3, 224, 224))
3.3 ResNet50, ResNet101, ResNet152
As shown in the figure above, the left side is the ResBlock structure used in ResNet18 and ResNet34, and the right side is the bottleneck structure used in ResNet50, ResNet101 and ResNet152.
3.3.1 ResBottleneckBlock
The most obvious difference between ResNet34 and ResNet50 is the residual block, as shown in the figure above. We rewrite this component as a new module called ResBottleneckBlock.
class ResBottleneckBlock(nn.Module):
    def __init__(self, in_channels, out_channels, downsample):
        super().__init__()
        self.downsample = downsample
        # Bottleneck: 1x1 conv to reduce the channels, 3x3 conv, then 1x1 conv to expand again.
        self.conv1 = nn.Conv2d(in_channels, out_channels//4, kernel_size=1, stride=1)
        self.conv2 = nn.Conv2d(out_channels//4, out_channels//4, kernel_size=3, stride=2 if downsample else 1, padding=1)
        self.conv3 = nn.Conv2d(out_channels//4, out_channels, kernel_size=1, stride=1)
        self.shortcut = nn.Sequential()
        # A projection shortcut is needed whenever the spatial size or the channel count changes.
        if self.downsample or in_channels != out_channels:
            self.shortcut = nn.Sequential(
                nn.Conv2d(in_channels, out_channels, kernel_size=1, stride=2 if self.downsample else 1),
                nn.BatchNorm2d(out_channels)
            )
        self.bn1 = nn.BatchNorm2d(out_channels//4)
        self.bn2 = nn.BatchNorm2d(out_channels//4)
        self.bn3 = nn.BatchNorm2d(out_channels)

    def forward(self, input):
        shortcut = self.shortcut(input)
        input = nn.ReLU()(self.bn1(self.conv1(input)))
        input = nn.ReLU()(self.bn2(self.conv2(input)))
        input = nn.ReLU()(self.bn3(self.conv3(input)))
        input = input + shortcut
        return nn.ReLU()(input)
Combining the code above with the figure, we can create three different kinds of ResBottleneckBlock (a quick shape check follows the list):
- Left: ResBottleneckBlock(64, 256, downsample=False)
- Middle: ResBottleneckBlock(256, 256, downsample=False)
- Right: ResBottleneckBlock(256, 512, downsample=True)
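As a sanity check, a dummy tensor can be run through each of the three configurations, assuming the ResBottleneckBlock class above; the expected shapes are noted in the comments:
import torch

left = ResBottleneckBlock(64, 256, downsample=False)
middle = ResBottleneckBlock(256, 256, downsample=False)
right = ResBottleneckBlock(256, 512, downsample=True)
x = torch.randn(1, 64, 56, 56)
y = left(x)
print(y.shape)          # torch.Size([1, 256, 56, 56])  -- channels expanded by the projection shortcut
print(middle(y).shape)  # torch.Size([1, 256, 56, 56])  -- identity shortcut, nothing changes
print(right(y).shape)   # torch.Size([1, 512, 28, 28])  -- channels doubled, spatial size halved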
3.3.2 Putting it all together
To make one class cover all of the ResNet variants, a new boolean argument useBottleneck is added to the ResNet18/34 code above; it specifies whether the bottleneck block is used. Set useBottleneck to False to build ResNet18/34, and to True to build ResNet50/101/152.
The final code is as follows:
class ResNet(nn.Module):
def __init__(self, in_channels, resblock, repeat, useBottleneck=False, outputs=1000):
super().__init__()
self.layer0 = nn.Sequential(
nn.Conv2d(in_channels, 64, kernel_size=7, stride=2, padding=3),
nn.MaxPool2d(kernel_size=3, stride=2, padding=1),
nn.BatchNorm2d(64),
nn.ReLU()
)
if useBottleneck:
filters = [64, 256, 512, 1024, 2048]
else:
filters = [64, 64, 128, 256, 512]
self.layer1 = nn.Sequential()
self.layer1.add_module('conv2_1', resblock(filters[0], filters[1], downsample=False))
for i in range(1, repeat[0]):
self.layer1.add_module('conv2_%d'%(i+1,), resblock(filters[1], filters[1], downsample=False))
self.layer2 = nn.Sequential()
self.layer2.add_module('conv3_1', resblock(filters[1], filters[2], downsample=True))
for i in range(1, repeat[1]):
self.layer2.add_module('conv3_%d' % (i+1,), resblock(filters[2], filters[2], downsample=False))
self.layer3 = nn.Sequential()
self.layer3.add_module('conv4_1', resblock(filters[2], filters[3], downsample=True))
for i in range(1, repeat[2]):
            self.layer3.add_module('conv4_%d' % (i+1,), resblock(filters[3], filters[3], downsample=False))
self.layer4 = nn.Sequential()
self.layer4.add_module('conv5_1', resblock(filters[3], filters[4], downsample=True))
for i in range(1, repeat[3]):
            self.layer4.add_module('conv5_%d' % (i+1,), resblock(filters[4], filters[4], downsample=False))
self.gap = torch.nn.AdaptiveAvgPool2d(1)
self.fc = torch.nn.Linear(filters[4], outputs)
def forward(self, input):
input = self.layer0(input)
input = self.layer1(input)
input = self.layer2(input)
input = self.layer3(input)
input = self.layer4(input)
input = self.gap(input)
input = torch.flatten(input, start_dim=1)
input = self.fc(input)
return input
We can then build the different ResNet variants as follows:
# resnet18
resnet18 = ResNet(3, ResBlock, [2, 2, 2, 2], useBottleneck=False, outputs=1000)
resnet18.to(torch.device("cuda:0" if torch.cuda.is_available() else "cpu"))
summary(resnet18, (3, 224, 224))
# resnet34
resnet34 = ResNet(3, ResBlock, [3, 4, 6, 3], useBottleneck=False, outputs=1000)
resnet34.to(torch.device("cuda:0" if torch.cuda.is_available() else "cpu"))
summary(resnet34, (3, 224, 224))
# resnet50
resnet50 = ResNet(3, ResBottleneckBlock, [3, 4, 6, 3], useBottleneck=True, outputs=1000)
resnet50.to(torch.device("cuda:0" if torch.cuda.is_available() else "cpu"))
summary(resnet50, (3, 224, 224))
# resnet101
resnet101 = ResNet(3, ResBottleneckBlock, [3, 4, 23, 3], useBottleneck=True, outputs=1000)
resnet101.to(torch.device("cuda:0" if torch.cuda.is_available() else "cpu"))
summary(resnet101, (3, 224, 224))
# resnet152
resnet152 = ResNet(3, ResBottleneckBlock, [3, 8, 36, 3], useBottleneck=True, outputs=1000)
resnet152.to(torch.device("cuda:0" if torch.cuda.is_available() else "cpu"))
summary(resnet152, (3, 224, 224))
4 Complete ResNet code
The complete ResNet code is as follows:
import torch
from torchsummary import summary
from torch import nn
class ResBlock(nn.Module):
def __init__(self, in_channels, out_channels, downsample):
super().__init__()
if downsample:
self.conv1 = nn.Conv2d(
in_channels, out_channels, kernel_size=3, stride=2, padding=1)
self.shortcut = nn.Sequential(
nn.Conv2d(in_channels, out_channels, kernel_size=1, stride=2),
nn.BatchNorm2d(out_channels)
)
else:
self.conv1 = nn.Conv2d(
in_channels, out_channels, kernel_size=3, stride=1, padding=1)
self.shortcut = nn.Sequential()
self.conv2 = nn.Conv2d(out_channels, out_channels,
kernel_size=3, stride=1, padding=1)
self.bn1 = nn.BatchNorm2d(out_channels)
self.bn2 = nn.BatchNorm2d(out_channels)
def forward(self, input):
shortcut = self.shortcut(input)
input = nn.ReLU()(self.bn1(self.conv1(input)))
input = nn.ReLU()(self.bn2(self.conv2(input)))
input = input + shortcut
return nn.ReLU()(input)
class ResBottleneckBlock(nn.Module):
def __init__(self, in_channels, out_channels, downsample):
super().__init__()
self.downsample = downsample
self.conv1 = nn.Conv2d(in_channels, out_channels//4,
kernel_size=1, stride=1)
self.conv2 = nn.Conv2d(
out_channels//4, out_channels//4, kernel_size=3, stride=2 if downsample else 1, padding=1)
self.conv3 = nn.Conv2d(out_channels//4, out_channels, kernel_size=1, stride=1)
if self.downsample or in_channels != out_channels:
self.shortcut = nn.Sequential(
nn.Conv2d(in_channels, out_channels, kernel_size=1,
stride=2 if self.downsample else 1),
nn.BatchNorm2d(out_channels)
)
else:
self.shortcut = nn.Sequential()
self.bn1 = nn.BatchNorm2d(out_channels//4)
self.bn2 = nn.BatchNorm2d(out_channels//4)
self.bn3 = nn.BatchNorm2d(out_channels)
def forward(self, input):
shortcut = self.shortcut(input)
input = nn.ReLU()(self.bn1(self.conv1(input)))
input = nn.ReLU()(self.bn2(self.conv2(input)))
input = nn.ReLU()(self.bn3(self.conv3(input)))
input = input + shortcut
return nn.ReLU()(input)
class ResNet(nn.Module):
def __init__(self, in_channels, resblock, repeat, useBottleneck=False, outputs=1000):
super().__init__()
self.layer0 = nn.Sequential(
nn.Conv2d(in_channels, 64, kernel_size=7, stride=2, padding=3),
nn.MaxPool2d(kernel_size=3, stride=2, padding=1),
nn.BatchNorm2d(64),
nn.ReLU()
)
if useBottleneck:
filters = [64, 256, 512, 1024, 2048]
else:
filters = [64, 64, 128, 256, 512]
self.layer1 = nn.Sequential()
self.layer1.add_module('conv2_1', resblock(filters[0], filters[1], downsample=False))
for i in range(1, repeat[0]):
self.layer1.add_module('conv2_%d'%(i+1,), resblock(filters[1], filters[1], downsample=False))
self.layer2 = nn.Sequential()
self.layer2.add_module('conv3_1', resblock(filters[1], filters[2], downsample=True))
for i in range(1, repeat[1]):
self.layer2.add_module('conv3_%d' % (
i+1,), resblock(filters[2], filters[2], downsample=False))
self.layer3 = nn.Sequential()
self.layer3.add_module('conv4_1', resblock(filters[2], filters[3], downsample=True))
for i in range(1, repeat[2]):
            self.layer3.add_module('conv4_%d' % (i+1,), resblock(filters[3], filters[3], downsample=False))
self.layer4 = nn.Sequential()
self.layer4.add_module('conv5_1', resblock(filters[3], filters[4], downsample=True))
for i in range(1, repeat[3]):
            self.layer4.add_module('conv5_%d' % (i+1,), resblock(filters[4], filters[4], downsample=False))
self.gap = torch.nn.AdaptiveAvgPool2d(1)
self.fc = torch.nn.Linear(filters[4], outputs)
def forward(self, input):
input = self.layer0(input)
input = self.layer1(input)
input = self.layer2(input)
input = self.layer3(input)
input = self.layer4(input)
input = self.gap(input)
# torch.flatten()
# https://stackoverflow.com/questions/60115633/pytorch-flatten-doesnt-maintain-batch-size
input = torch.flatten(input, start_dim=1)
input = self.fc(input)
return input
1. ResNet18
resnet18 = ResNet(3, ResBlock, [2, 2, 2, 2], useBottleneck=False, outputs=1000)
resnet18.to(torch.device("cuda:0" if torch.cuda.is_available() else "cpu"))
summary(resnet18, (3, 224, 224))
2. ResNet34
resnet34 = ResNet(3, ResBlock, [3, 4, 6, 3], useBottleneck=False, outputs=1000)
resnet34.to(torch.device("cuda:0" if torch.cuda.is_available() else "cpu"))
summary(resnet34, (3, 224, 224))
3. ResNet50
resnet50 = ResNet(3, ResBottleneckBlock, [3, 4, 6, 3], useBottleneck=True, outputs=1000)
resnet50.to(torch.device("cuda:0" if torch.cuda.is_available() else "cpu"))
summary(resnet50, (3, 224, 224))
4. ResNet101
resnet101 = ResNet(3, ResBottleneckBlock, [3, 4, 23, 3], useBottleneck=True, outputs=1000)
resnet101.to(torch.device("cuda:0" if torch.cuda.is_available() else "cpu"))
summary(resnet101, (3, 224, 224))
5. ResNet152
resnet152 = ResNet(3, ResBottleneckBlock, [3, 8, 36, 3], useBottleneck=True, outputs=1000)
resnet152.to(torch.device("cuda:0" if torch.cuda.is_available() else "cpu"))
summary(resnet152, (3, 224, 224))
See also: https://github.com/weiaicunzai/pytorch-cifar100/blob/master/models/resnet.py