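FCOS Paper and Source Code Explained (Part 2)
Part 1 of this series excerpted and roughly translated the sections of the paper that describe the structure of the FCOS algorithm. This part walks through the FCOS source code.
The FCOS Project
The README of the FCOS project describes model training as follows: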
Training
The following command line will train FCOS_imprv_R_50_FPN_1x on 8 GPUs with Synchronous Stochastic Gradient Descent (SGD):
python -m torch.distributed.launch \
    --nproc_per_node=8 \
    --master_port=$((RANDOM + 10000)) \
    tools/train_net.py \
    --config-file configs/fcos/fcos_imprv_R_50_FPN_1x.yaml \
    DATALOADER.NUM_WORKERS 2 \
    OUTPUT_DIR training_dir/fcos_imprv_R_50_FPN_1x
The key part is the call to tools/train_net.py.
Locate this file in the repository; reading the code starts from train_net.py.
FCOS Code
tools/train_net.py
The key line in main() hands off to train():
model = train(cfg, args.local_rank, args.distributed)
train() begins by calling build_detection_model():
model = build_detection_model(cfg)
build_detection_model() is imported from fcos_core.modeling.detector; tracing it leads to fcos_core.modeling.detector.generalized_rcnn.GeneralizedRCNN. This class inherits from torch.nn.Module and sets three attributes in its constructor:
self.backbone = build_backbone(cfg)
self.rpn = build_rpn(cfg, self.backbone.out_channels)
self.roi_heads = build_roi_heads(cfg, self.backbone.out_channels)
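build_backbone() is imported from fcos_core.modeling.backbone
build_rpn() is imported from fcos_core.modeling.rpn.rpn
build_roi_heads() is imported from fcos_core.modeling.roi_heads.roi_heads
build_backbone()
Start with build_backbone().
The key line is
return registry.BACKBONES[cfg.MODEL.BACKBONE.CONV_BODY](cfg)
The default value can be found in fcos_core/config/defaults.py:
cfg.MODEL.BACKBONE.CONV_BODY→_C.MODEL.BACKBONE.CONV_BODY = "R-50-C4"
This is only the default; the YAML file passed through --config-file can override it, in which case a different registered builder is selected. The trace below follows the default.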
registry is imported from fcos_core.modeling and traces to fcos_core.utils.registry.Registry
The Registry class carries the following docstring:
A helper class for managing registering modules, it extends a dictionary
and provides a register function.
Eg. creating a registry:
    some_registry = Registry({"default": default_module})
There're two ways of registering new modules:
1): normal way is just calling register function:
    def foo():
        ...
    some_registry.register("foo_module", foo)
2): used as decorator when declaring the module:
    @some_registry.register("foo_module")
    @some_registry.register("foo_module_nickname")
    def foo():
        ...
Access of module is just like using a dictionary, eg:
    f = some_registry["foo_module"]
Just above build_backbone() in the same file we find:
@registry.BACKBONES.register("R-50-C4")
@registry.BACKBONES.register("R-50-C5")
@registry.BACKBONES.register("R-101-C4")
@registry.BACKBONES.register("R-101-C5")
def build_resnet_backbone(cfg):
    body = resnet.ResNet(cfg)
    model = nn.Sequential(OrderedDict([("body", body)]))
    model.out_channels = cfg.MODEL.RESNETS.BACKBONE_OUT_CHANNELS
    return model
Hence, for "R-50-C4", build_backbone()→build_resnet_backbone()
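The lookup-then-call dispatch traced above can be reproduced in a few lines. The following is a minimal sketch, not the fcos_core implementation: MiniRegistry, build_fake_resnet, and the plain-dict cfg are hypothetical stand-ins, and the real builder returns an nn.Sequential rather than a string.

```python
class MiniRegistry(dict):
    """Simplified stand-in for fcos_core.utils.registry.Registry (assumption)."""

    def register(self, name, module=None):
        # Direct call: registry.register("key", fn)
        if module is not None:
            self[name] = module
            return module

        # Decorator use: @registry.register("key")
        def decorator(fn):
            self[name] = fn
            return fn  # returning fn lets several decorators be stacked

        return decorator


BACKBONES = MiniRegistry()


@BACKBONES.register("R-50-C4")
@BACKBONES.register("R-50-C5")
def build_fake_resnet(cfg):
    # Hypothetical builder: the real one constructs resnet.ResNet(cfg).
    return "resnet body for " + cfg["CONV_BODY"]


def build_backbone(cfg):
    # Same lookup-then-call pattern as registry.BACKBONES[...](cfg).
    return BACKBONES[cfg["CONV_BODY"]](cfg)


cfg = {"CONV_BODY": "R-50-C4"}  # plain dict standing in for the real config
result = build_backbone(cfg)
print(result)
```

Because each decorator returns the function unchanged, stacking them simply stores the same builder under several keys, which is why one build_resnet_backbone can serve four CONV_BODY names.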
resnet.ResNet()
i.e., fcos_core.modeling.backbone.resnet.ResNet()
class ResNet(nn.Module):
    def __init__(self, cfg):
        super(ResNet, self).__init__()
        # If we want to use the cfg in forward(), then we should make a copy
        # of it and store it for later use:
        # self.cfg = cfg.clone()
        # Translate string names to implementations
        stem_module = _STEM_MODULES[cfg.MODEL.RESNETS.STEM_FUNC]
        stage_specs = _STAGE_SPECS[cfg.MODEL.BACKBONE.CONV_BODY]
        transformation_module = _TRANSFORMATION_MODULES[cfg.MODEL.RESNETS.TRANS_FUNC]
The constructor first resolves stem_module, stage_specs, and transformation_module from the config.
stem_module
→_STEM_MODULES[cfg.MODEL.RESNETS.STEM_FUNC]
→StemWithFixedBatchNorm(BaseStem), norm_func=FrozenBatchNorm2d
→BaseStem
self.conv1 = Conv2d(
    3, out_channels, kernel_size=7, stride=2, padding=3, bias=False
)
self.bn1 = norm_func(out_channels)
Conv2d traces to fcos_core.layers.misc.Conv2d(), a thin wrapper that defers to torch.nn.Conv2d for non-empty input and otherwise only computes the output shape for an empty tensor:
class Conv2d(torch.nn.Conv2d):
    def forward(self, x):
        if x.numel() > 0:
            return super(Conv2d, self).forward(x)
        # get output shape
        output_shape = [
            (i + 2 * p - (di * (k - 1) + 1)) // d + 1
            for i, p, di, k, d in zip(
                x.shape[-2:], self.padding, self.dilation, self.kernel_size, self.stride
            )
        ]
        output_shape = [x.shape[0], self.weight.shape[0]] + output_shape
        return _NewEmptyTensorOp.apply(x, output_shape)
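The list comprehension in the empty-input branch is the standard convolution output-size formula. A small sketch, assuming a hypothetical helper name conv_out_size and an 800-pixel input side chosen only for illustration:

```python
def conv_out_size(i, p, di, k, d):
    """Output length along one spatial axis.

    Argument names mirror the loop variables in Conv2d.forward:
    i = input size, p = padding, di = dilation, k = kernel size, d = stride.
    """
    return (i + 2 * p - (di * (k - 1) + 1)) // d + 1


# Stem convolution settings from BaseStem: kernel_size=7, stride=2, padding=3;
# dilation is left at the torch.nn.Conv2d default of 1.
stem_out = conv_out_size(800, p=3, di=1, k=7, d=2)
print(stem_out)  # 400
```

With these settings the stem convolution halves each spatial dimension.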
FrozenBatchNorm2d traces to fcos_core.layers.batch_norm.FrozenBatchNorm2d():
class FrozenBatchNorm2d(nn.Module):
    def __init__(self, n):
        super(FrozenBatchNorm2d, self).__init__()
        self.register_buffer("weight", torch.ones(n))
        self.register_buffer("bias", torch.zeros(n))
        self.register_buffer("running_mean", torch.zeros(n))
        self.register_buffer("running_var", torch.ones(n))
    def forward(self, x):
        scale = self.weight * self.running_var.rsqrt()
        bias = self.bias - self.running_mean * scale
        scale = scale.reshape(1, -1, 1, 1)
        bias = bias.reshape(1, -1, 1, 1)
        return x * scale + bias
register_buffer is a method of torch.nn.Module:
This is typically used to register a buffer that should not be considered a model parameter.
rsqrt(): Returns a new tensor with the reciprocal of the square-root of each of the elements of input.
Because all four tensors are buffers rather than parameters, the optimizer never updates them. forward() folds them into one per-channel scale and one per-channel bias, so the layer computes weight * (x - running_mean) / sqrt(running_var) + bias, which is batch normalization with fixed statistics and no epsilon term.
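Written out per channel c, the two arithmetic lines of forward() expand as follows. The symbols w, b, μ, and σ² are introduced here for illustration and stand for weight, bias, running_mean, and running_var:

```latex
\begin{aligned}
\mathrm{scale}_c &= \frac{w_c}{\sqrt{\sigma_c^2}}, \qquad
\mathrm{bias}_c = b_c - \mu_c \,\mathrm{scale}_c, \\
y &= x \cdot \mathrm{scale}_c + \mathrm{bias}_c
   = w_c \,\frac{x - \mu_c}{\sqrt{\sigma_c^2}} + b_c .
\end{aligned}
```

The reshape to (1, -1, 1, 1) only lets the per-channel vectors broadcast over the batch and spatial dimensions.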
stage_specs
→_STAGE_SPECS[cfg.MODEL.BACKBONE.CONV_BODY]
→ResNet50StagesTo4
ResNet50StagesTo4 = tuple(
    StageSpec(index=i, block_count=c, return_features=r)
    for (i, c, r) in ((1, 3, False), (2, 4, False), (3, 6, True))
)
StageSpec is defined just above it, so stage_specs sets the parameters of each stage (index: stage number; block_count: number of residual blocks in the stage; return_features: whether the stage's last feature map is returned).
StageSpec = namedtuple(
    "StageSpec",
    [
        "index",  # Index of the stage, eg 1, 2, ..,. 5
        "block_count",  # Number of residual blocks in the stage
        "return_features",  # True => return the last feature map from this stage
    ],
)
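To see how these specs might drive network construction, here is a sketch that walks the tuple and derives each stage's channel widths. The base widths (64 bottleneck channels and 256 output channels for the first stage), the doubling per stage, and the layer<N> naming are assumptions about the rest of ResNet.__init__, which is not quoted in this article.

```python
from collections import namedtuple

StageSpec = namedtuple("StageSpec", ["index", "block_count", "return_features"])

ResNet50StagesTo4 = tuple(
    StageSpec(index=i, block_count=c, return_features=r)
    for (i, c, r) in ((1, 3, False), (2, 4, False), (3, 6, True))
)

BASE_BOTTLENECK_CHANNELS = 64  # assumed bottleneck width of the first stage
BASE_OUT_CHANNELS = 256        # assumed output width of the first stage

plan = []
for spec in ResNet50StagesTo4:
    factor = 2 ** (spec.index - 1)  # assumed: widths double at each stage
    plan.append(
        {
            "name": "layer" + str(spec.index),  # assumed naming scheme
            "blocks": spec.block_count,
            "bottleneck_channels": BASE_BOTTLENECK_CHANNELS * factor,
            "out_channels": BASE_OUT_CHANNELS * factor,
            "returned": spec.return_features,
        }
    )

for stage in plan:
    print(stage)
```

Only the last of the three stages has return_features=True, so under these assumptions the "R-50-C4" body would expose a single feature map.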
transformation_module
→_TRANSFORMATION_MODULES[cfg.MODEL.RESNETS.TRANS_FUNC]
→BottleneckWithFixedBatchNorm
with default arguments num_groups=1, stride_in_1x1=True, stride=1, dilation=1, dcn_config=None
→Bottleneck, norm_func=FrozenBatchNorm2d
This class has two methods, __init__ and forward. forward() resembles that of other models and is skipped here.
class Bottleneck(nn.Module):
    def __init__(
        self,
        # omit
    ):
        super(Bottleneck, self).__init__()
        self.downsample = None
        if in_channels != out_channels:
            down_stride = stride if dilation == 1 else 1
            self.downsample = nn.Sequential(
                Conv2d(
                    in_channels, out_channels,
                    kernel_size=1, stride=down_stride, bias=False
                ),
                norm_func(out_channels),
            )
            for modules in [self.downsample,]:
                for l in modules.modules():
                    if isinstance(l, Conv2d):
                        nn.init.kaiming_uniform_(l.weight, a=1)
downsample: when the input and output channel counts differ, a 1x1 convolution followed by normalization projects the shortcut branch to out_channels so that it can be added to the main branch.
        if dilation > 1:
            stride = 1 # reset to be 1
        stride_1x1, stride_3x3 = (stride, 1) if stride_in_1x1 else (1, stride)
With the default stride=1, stride_1x1 and stride_3x3 are both 1.
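The default case is only one of the possibilities. A sketch of the same logic as a standalone function (split_stride is a hypothetical name, and the stride-2 inputs are illustrative values rather than settings read from the FCOS config):

```python
def split_stride(stride, dilation=1, stride_in_1x1=True):
    """Return (stride_1x1, stride_3x3) using the same two statements as Bottleneck."""
    if dilation > 1:
        stride = 1  # a dilated block does not downsample
    return (stride, 1) if stride_in_1x1 else (1, stride)


default_case = split_stride(1)
strided_in_1x1 = split_stride(2, stride_in_1x1=True)
strided_in_3x3 = split_stride(2, stride_in_1x1=False)
dilated = split_stride(2, dilation=2)
print(default_case, strided_in_1x1, strided_in_3x3, dilated)
```

The flag stride_in_1x1 therefore only decides which of the first two convolutions carries the downsampling stride; the total stride of the block is the same either way.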
        self.conv1 = Conv2d(
            in_channels,
            bottleneck_channels,
            kernel_size=1,
            stride=stride_1x1,
            bias=False,
        )
        self.bn1 = norm_func(bottleneck_channels)
This defines the first convolution layer.
        with_dcn = dcn_config.get("stage_with_dcn", False)
        if with_dcn:
            # omit
        else:
            self.conv2 = Conv2d(
                bottleneck_channels,
                bottleneck_channels,
                kernel_size=3,
                stride=stride_3x3,
                padding=dilation,
                bias=False,
                groups=num_groups,
                dilation=dilation
            )
            nn.init.kaiming_uniform_(self.conv2.weight, a=1)
        self.bn2 = norm_func(bottleneck_channels)
        self.conv3 = Conv2d(
            bottleneck_channels, out_channels, kernel_size=1, bias=False
        )
        self.bn3 = norm_func(out_channels)
        for l in [self.conv1, self.conv3,]:
            nn.init.kaiming_uniform_(l.weight, a=1)
This defines the second and third convolution layers.
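Putting the three layers together, the following sketch traces channel counts and one spatial dimension through conv1, conv2, and conv3 without running PyTorch. The helper names conv_out and trace_bottleneck and the sample sizes are assumptions for illustration; the normalization layers and the residual addition are left out because they do not change shapes.

```python
def conv_out(size, kernel, stride, padding, dilation=1):
    # Standard convolution output-size formula for one spatial axis.
    return (size + 2 * padding - (dilation * (kernel - 1) + 1)) // stride + 1


def trace_bottleneck(in_channels, bottleneck_channels, out_channels, size,
                     stride=1, dilation=1, stride_in_1x1=True):
    """Return [(layer, output_channels, spatial_size), ...] for the three convolutions.

    in_channels is accepted only to mirror the Bottleneck signature; the
    output shapes do not depend on it.
    """
    if dilation > 1:
        stride = 1
    stride_1x1, stride_3x3 = (stride, 1) if stride_in_1x1 else (1, stride)

    trace = []
    # conv1: 1x1 kernel, no padding, maps to bottleneck_channels
    size = conv_out(size, kernel=1, stride=stride_1x1, padding=0)
    trace.append(("conv1", bottleneck_channels, size))
    # conv2: 3x3 kernel, padding=dilation, keeps bottleneck_channels
    size = conv_out(size, kernel=3, stride=stride_3x3, padding=dilation,
                    dilation=dilation)
    trace.append(("conv2", bottleneck_channels, size))
    # conv3: 1x1 kernel, stride 1, maps to out_channels
    size = conv_out(size, kernel=1, stride=1, padding=0)
    trace.append(("conv3", out_channels, size))
    return trace


# Illustrative values (assumptions): 64 -> 64 -> 256 channels on a 200-pixel side
trace = trace_bottleneck(64, 64, 256, size=200)
for row in trace:
    print(row)
```

Setting padding equal to dilation in conv2 is what keeps the 3x3 convolution size-preserving at stride 1, so only the stride passed into the block changes the spatial resolution.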