ShuffleNet

1. ShuffleNet

https://arxiv.org/pdf/1707.01083.pdf 2017/12, Face++

ShuffleNet is also a model aimed at mobile devices. Like MobileNetV2, it uses the resnet bottleneck and depthwise conv. The difference is that it considers the 1x1 conv too expensive, so it replaces the 1x1 conv with a 1x1 group conv + channel shuffle.

1.1. 1x1 conv

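A depthwise conv needs a 1x1 conv because a depthwise conv computes every channel independently. With no interaction between channels, the results would be poor. Adding a 1x1 conv before and after the depthwise conv fixes this, because a 1x1 conv is in effect a fusion across channels.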
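The difference between the two operations can be checked on a toy input. The sketch below is only an illustration, not code from the paper: it uses 1D signals instead of 2D feature maps to stay short, and the helper names `depthwise_conv1d` and `pointwise_conv` as well as all the numbers are made up for this example.

```python
def depthwise_conv1d(x, kernels):
    """Filter each channel with its own kernel ('valid' cross-correlation).

    x: list of channels, each a list of samples.
    kernels: one kernel (list of weights) per channel.
    """
    out = []
    for channel, k in zip(x, kernels):
        n = len(channel) - len(k) + 1
        out.append([sum(channel[i + j] * k[j] for j in range(len(k)))
                    for i in range(n)])
    return out


def pointwise_conv(x, weight):
    """1x1 conv: weight[o][c] mixes input channel c into output channel o."""
    length = len(x[0])
    return [[sum(weight[o][c] * x[c][i] for c in range(len(x)))
             for i in range(length)]
            for o in range(len(weight))]


# Two channels, four samples each (illustrative values).
x = [[1, 2, 3, 4], [10, 20, 30, 40]]
kernels = [[1, 0, -1], [1, 1, 1]]
weight = [[1, 1], [1, -1]]

dw = depthwise_conv1d(x, kernels)
pw = pointwise_conv(dw, weight)

# Zero out channel 1 and recompute.
x_changed = [[1, 2, 3, 4], [0, 0, 0, 0]]
dw_changed = depthwise_conv1d(x_changed, kernels)
pw_changed = pointwise_conv(dw_changed, weight)

# Depthwise output of channel 0 ignores channel 1 entirely ...
print(dw[0] == dw_changed[0])   # True
# ... while every pointwise output channel depends on both inputs.
print(pw[0] == pw_changed[0])   # False
```

Changing channel 1 leaves the depthwise output of channel 0 untouched, while the 1x1 conv output of channel 0 changes, which is the channel fusion described above.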

1.2. 1x1 group conv

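A 1x1 conv is expensive in both computation and parameters. Taking mobilenet_v1 as an example, the MobileNet paper reports that the 1x1 convs account for roughly 95% of the multiply-adds and roughly 75% of the parameters of the network.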
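To see why the 1x1 conv dominates, one can count multiply-accumulates (MACs) for a single depthwise-separable layer. This is a rough sketch: the layer shape (14x14 feature map, 512 input and output channels, 3x3 depthwise kernel) is chosen only for illustration, and the function names are hypothetical.

```python
def depthwise_macs(h, w, channels, k=3):
    """MACs of a k x k depthwise conv: one k*k filter per channel."""
    return h * w * channels * k * k


def pointwise_macs(h, w, c_in, c_out):
    """MACs of a 1x1 conv: every output channel reads every input channel."""
    return h * w * c_in * c_out


# Illustrative layer shape (an assumption, not taken from these notes).
h, w, c = 14, 14, 512

dw_cost = depthwise_macs(h, w, c)        # 14 * 14 * 512 * 9
pw_cost = pointwise_macs(h, w, c, c)     # 14 * 14 * 512 * 512
share = pw_cost / (dw_cost + pw_cost)    # fraction spent in the 1x1 conv

print(dw_cost, pw_cost, round(share, 3))
```

For this shape the 1x1 conv costs c / 9 times as much as the 3x3 depthwise conv, so the wider the layer, the more the 1x1 conv dominates.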

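ShuffleNet therefore uses a 1x1 group conv2d to reduce the parameters and computation of the 1x1 conv. With g groups, each output channel only reads the input channels of its own group, so both drop by a factor of g.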
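The saving from grouping can be sketched with a small cost function. The shape used here (28x28 feature map, 240 channels, 3 groups) and the name `conv1x1_cost` are assumptions for illustration, and bias terms are ignored.

```python
def conv1x1_cost(h, w, c_in, c_out, groups=1):
    """Return (params, macs) of a bias-free 1x1 conv split into `groups` groups."""
    if c_in % groups or c_out % groups:
        raise ValueError("channels must be divisible by groups")
    # Each group maps c_in/groups input channels to c_out/groups output channels.
    params = groups * (c_in // groups) * (c_out // groups)
    macs = h * w * params
    return params, macs


dense = conv1x1_cost(28, 28, 240, 240)              # ordinary 1x1 conv
grouped = conv1x1_cost(28, 28, 240, 240, groups=3)  # 1x1 group conv, g = 3

print(dense, grouped)
```

Both the parameter count and the MAC count of the grouped version are exactly 1/3 of the dense version here, matching the 1/g reduction.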

1.3. channel shuffle

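Group conv reduces the computation, but each output channel then depends only on the input channels of its own group. A channel shuffle is still needed to `fuse` the output channels of the group conv across groups.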

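For example:

Suppose there are 4 groups with 3 channels each:

| 0 1 2 | 3 4 5 | 6 7 8 | 9 10 11 |

After the shuffle:

| 0 3 6 | 9 1 4 | 7 10 2 | 5 8 11 |

The shuffle here is not a random shuffle. It is a fixed permutation: arrange the channels as a (groups, channels per group) matrix, transpose it, and flatten it again.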
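The example above can be reproduced with a few lines of plain Python. This is a minimal sketch operating on a list of channel indices rather than on real tensors, and `channel_shuffle` is a name chosen for this example.

```python
def channel_shuffle(channels, groups):
    """Deterministic channel shuffle: reshape to (groups, n), transpose, flatten."""
    if len(channels) % groups:
        raise ValueError("number of channels must be divisible by groups")
    n = len(channels) // groups
    # Rows of the (groups, n) matrix: one row per group.
    rows = [channels[g * n:(g + 1) * n] for g in range(groups)]
    # Reading the matrix column by column is the transpose followed by flatten.
    return [rows[g][i] for i in range(n) for g in range(groups)]


shuffled = channel_shuffle(list(range(12)), groups=4)
print(shuffled)
```

Splitting the result back into 4 groups of 3 gives the layout shown above, so the next group conv sees channels that originate from several different groups.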

1.4. Network

1.4.1. stage

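These notes call each bottleneck block in ShuffleNet a stage. (The paper itself calls the block a ShuffleNet unit and uses "stage" for a group of stacked units.)

In the paper's figure of the units:

a is an ordinary resnet bottleneck, with a 3x3 depthwise conv as the middle layer.
b is the shufflenet stage with stride = 1
c is the shufflenet stage with stride = 2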
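The cost difference between a and b can be estimated by counting MACs per block. This is a rough sketch under stated assumptions: stride 1, a block of the form 1x1 conv (c to m channels), 3x3 depthwise conv, 1x1 conv (m to c channels), with biases, batch norm and the shuffle itself ignored. The shape (28x28, c = 240, m = 60, g = 3) and the function name are illustrative only.

```python
def bottleneck_macs(h, w, c, m, groups=1):
    """MACs of a stride-1 bottleneck: 1x1 (c->m), 3x3 depthwise, 1x1 (m->c).

    groups=1 corresponds to block a; groups>1 replaces both 1x1 convs with
    1x1 group convs, as in block b.
    """
    pointwise = 2 * h * w * c * m // groups   # the two 1x1 (group) convs
    depthwise = h * w * 9 * m                 # the 3x3 depthwise conv
    return pointwise + depthwise


# Illustrative shape (an assumption): bottleneck width m = c / 4.
h, w, c, m = 28, 28, 240, 60

cost_a = bottleneck_macs(h, w, c, m)             # plain 1x1 convs
cost_b = bottleneck_macs(h, w, c, m, groups=3)   # 1x1 group convs, g = 3

print(cost_a, cost_b, round(cost_a / cost_b, 2))
```

The count for b has the form hw(2cm/g + 9m). Because the depthwise term is small, almost the whole cost of the block scales with 1/g.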

1.4.2. network
